Web development links for January 2012

WebM-Enabled Browser Usage Share Exceeds H.264-Enabled Browser Usage Share on Desktop (in StatCounter Numbers)

Looking at StatCounter stats, it occurred to me that they might not match the common narrative about H.264 market share, so I decided to run some numbers. It turns out that during the first two weeks of 2012, on desktop, the usage share of browsers that support WebM in HTML5 video exceeded the usage share of browsers that support H.264 in HTML5 video—except in North America and Oceania.

Methodology

I downloaded StatCounter data as CSV and fed it to a Python script I wrote. The script categorizes the StatCounter-reported usage share into six buckets:

  1. Total desktop (this excludes Safari on iPad, the Android browser, the PlayStation 3 browser, etc.)
  2. Theora
  3. WebM
  4. H.264
  5. No support for HTML5 video
  6. Unknown HTML5 video support

The total desktop bucket adds up to 97.7% globally. In other words, the non-desktop browsers that StatCounter reports in its desktop stats account for less than 3% of usage. I divided the numbers in the other buckets by the total desktop bucket to scale the other numbers to be percentages of desktop only.

The script (which you can audit) has baked-in knowledge of which browser versions support what. Only out-of-the-box support is considered, so IE9 is counted as supporting H.264 but not WebM, even though WebM support can be added as a separate installation. As an exception, Safari is counted as if Safari for Windows always had QuickTime installed (because the data is not granular enough to do otherwise).
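
The two core steps can be sketched as follows. This is only an illustration, not Sivonen's actual script: the real support table is more complete, and the version cutoffs shown here are simplified assumptions.

```javascript
// Map a browser + version to the codec buckets it supports out of the box.
// (Simplified illustration; cutoffs are assumptions, not the script's table.)
function codecBuckets(browser, version) {
  switch (browser) {
    case 'Firefox': return version >= 4 ? ['theora', 'webm']
                         : version >= 3.5 ? ['theora'] : ['none'];
    case 'Chrome':  return version >= 6 ? ['theora', 'webm', 'h264']
                         : ['theora', 'h264'];
    case 'IE':      return version >= 9 ? ['h264'] : ['none'];
    case 'Safari':  return ['h264']; // counted as if QuickTime is present
    default:        return ['unknown'];
  }
}

// Rescale a raw StatCounter share to a percentage of desktop-only usage.
function ofDesktop(share, desktopTotal) {
  return (share / desktopTotal) * 100;
}

codecBuckets('Firefox', 9);  // ['theora', 'webm']
ofDesktop(48.85, 97.7);      // ≈ 50, i.e. 50% of desktop usage
```

The rescaling is what makes the buckets comparable across regions: every number is divided by the same region's total-desktop share.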

Results for Week 1 and Week 2 of 2012

The percentage means the Web usage share (as measured by StatCounter) of desktop browsers that support a given video format in HTML5 video. The percentages do not add up to 100%, because some browsers support multiple video formats. The numbers for Theora are always higher than the numbers for WebM, because all browsers that support WebM also support Theora, but some browsers support Theora and not WebM.

World

Theora: 55%
WebM: 51%
H.264: 45%
No HTML5 video: 28%
Unknown: 0%

Africa

Theora: 64%
WebM: 55%
H.264: 38%
No HTML5 video: 27%
Unknown: 0%

Asia

Theora: 58%
WebM: 52%
H.264: 42%
No HTML5 video: 32%
Unknown: 0%

Europe

Theora: 62%
WebM: 57%
H.264: 44%
No HTML5 video: 21%
Unknown: 0%

North America

Theora: 42%
WebM: 38%
H.264: 47%
No HTML5 video: 32%
Unknown: 0%

Oceania

Theora: 49%
WebM: 45%
H.264: 50%
No HTML5 video: 25%
Unknown: 0%

South America

Theora: 66%
WebM: 62%
H.264: 53%
No HTML5 video: 25%
Unknown: 0%

Profiling CSS for fun and profit. Optimization notes.

I’ve recently been working on optimizing the performance of a so-called one-page web app. The application was highly dynamic and interactive, and heavily stuffed with new CSS3 goodness. I’m not talking just border-radius and gradients; it was a full stack of shadows, gradients, transforms, sprinkled with transitions, smooth half-transparent colors, clever pseudo-element-based CSS tricks, and experimental CSS features.

Aside from looking into bottlenecks on the JavaScript/DOM side, I decided to step into CSS land. I wanted to see what kind of impact all these nice UI elements had on performance. The old version of the app — the one without all the fluff — was much snappier, even though the JS logic behind it hadn’t changed all that drastically. I could tell from scrolling and animations that things were just not as quick as they should be.

Was styling to blame?

Fortunately, just a few days before, the Opera folks had come out with an experimental “style profiler” (followed shortly after by WebKit’s ticket+patch). The profiler was meant to reveal the performance of CSS selector matching, document reflow, repaint, and even document and CSS parsing times.

Perfect!

I wasn’t thrilled about profiling in one environment and optimizing according to one engine (especially an engine used in only one browser), but decided to give it a try. After all, the offending styles/rules would probably be similar in all engines/browsers. And this was pretty much the only thing out there.

The only other somewhat similar tool was WebKit’s “Timeline” tab in Developer Tools. But the timeline wasn’t very friendly to work with. It wouldn’t show the total time of reflow/repaint/selector matching, and the only way to extract that information was by exporting the data as JSON and parsing it manually (I’ll get to that later).

Below are some of my observations from profiling using both the WebKit and Opera tools. A TL;DR version is at the end.

Before we start, I’d like to mention that most (if not all) of these notes apply best to large, complex applications. Documents that have thousands of elements and that are highly interactive will benefit the most. In my case, I reduced page load time by ~650ms (~500ms (!) on style recalculation alone, ~100ms on repaint, and ~50ms on reflow). The application became noticeably snappier, especially in older browsers like IE7.

For simpler pages/apps, there are plenty of other optimizations that should be looked into first.

Notes

  1. The fastest rule is the one that doesn’t exist. There’s a common strategy to combine stylesheet “modules” into one file for production. This makes for one big collection of rules, where some (often lots) of them are likely not used by a particular part of the site/application. Getting rid of unused rules is one of the best things you can do to optimize CSS performance, as there’s less matching to be done in the first place. There are certain benefits to having one big file, of course, such as a reduced number of requests. But it should be possible to optimize at least the critical parts of the app by including only the relevant styles.

    This isn’t a new discovery by any means; Page Speed has always warned against it. However, I was really surprised to see just how much this could affect rendering time. In my case, I shaved ~200-300ms off selector matching — according to the Opera profiler — just by getting rid of unused CSS rules. Layout and paint times went down as well.
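
    A rough sketch of hunting for unused selectors (my assumption, not the author’s tooling). The matcher is injected, so in a browser you could pass a `document.querySelector`-based function; here a lookup table stands in for it:

```javascript
// Flag selectors that match nothing, given a "does anything match?" function.
// In a browser: findUnusedSelectors(sels, function (s) { return !!document.querySelector(s); })
function findUnusedSelectors(selectors, matchesSomething) {
  return selectors.filter(function (sel) {
    try {
      return !matchesSomething(sel);
    } catch (e) {
      return false; // can't evaluate the selector: don't flag it
    }
  });
}

// Example with a lookup table standing in for document.querySelector:
var present = { 'button > span': true, '.steps li': true };
var unused = findUnusedSelectors(
  ['button > span', '.ie7 .steps li', '.steps li'],
  function (sel) { return !!present[sel]; }
); // ['.ie7 .steps li']
```

    Note that a selector flagged this way is only unused in the current page state; dynamically added classes can make it match later, so the output needs manual review.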

  2. Reducing reflows — another well-known optimization — plays a big role here as well. Expensive styles are not so expensive when the browser needs to perform fewer reflows/repaints. And even simple styles can slow things down if they’re applied a lot. Reducing reflows AND reducing the complexity of CSS go hand in hand.
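
    The classic read/write batching fix can be sketched with a stub in place of the real DOM (my illustration, not from the article; in a real browser, reading offsetWidth after a style write forces a synchronous reflow):

```javascript
// Stub standing in for the DOM: reading offsetWidth "reflows" only when
// layout is dirty, and a style write dirties it again.
var reflows = 0;
var container = {
  dirty: true,
  get offsetWidth() {
    if (this.dirty) { reflows++; this.dirty = false; }
    return 800;
  }
};
function makeItems() { return [{ style: {} }, { style: {} }, { style: {} }]; }

// Interleaved read/write: each iteration reads layout after a write,
// so each iteration forces a reflow.
makeItems().forEach(function (el) {
  el.style.width = container.offsetWidth + 'px';
  container.dirty = true; // the style write invalidates layout
});
var interleaved = reflows; // 3: one forced reflow per item

// Batched: read once, then do all the writes. One reflow total.
reflows = 0;
container.dirty = true;
var width = container.offsetWidth;
makeItems().forEach(function (el) { el.style.width = width + 'px'; });
var batched = reflows; // 1
```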

  3. The most expensive selectors tend to be universal ones ("*") and those with multiple classes (".foo.bar", ".foo .bar.baz .qux", etc.). We already knew this, but it’s nice to get confirmation from the profilers.

  4. Watch out for universal selectors ("*") that are used for “no reason”. I found selectors like "button > *", even though throughout the site/app buttons only ever had <span>s in them. Replacing "button > *" with "button > span" made for some amazing improvements in selector performance. The browser no longer needs to match every element (due to right-to-left matching). It only needs to walk over <span>s — the number of which could be significantly smaller — and check whether the parent element is a <button>.
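
    The substitution looks like this (the vertical-align declaration is just a placeholder of mine, not from the app):

```css
/* Before: with right-to-left matching, every element is a candidate */
button > * { vertical-align: middle; }

/* After: only <span> elements are candidates, then the parent check runs */
button > span { vertical-align: middle; }
```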

  5. I used this snippet to quickly find which elements to substitute "*" with.

    
    $$(selector).pluck('tagName').uniq(); // ["SPAN"]

    This relies on the Array#pluck and Array#uniq extensions from Prototype.js. For a plain version (relying on ES5 and the Selectors API), perhaps something like this would do:

    
    Object.keys([].slice.call(
      document.querySelectorAll('button > *'))
        .reduce(function(memo, el){ memo[el.tagName] = 1; return memo; }, {}));
    
  6. In both Opera and WebKit, [type="..."] selectors seem to be more expensive than input[type="..."]. This is probably due to browsers limiting the attribute check to elements of the specified tag (after all, [type="..."] really is a universal selector).
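
    For illustration (the cursor declaration is a placeholder of mine):

```css
/* Slower: effectively universal; the attribute check runs on every element */
[type="submit"] { cursor: pointer; }

/* Faster: the attribute check is limited to <input> elements */
input[type="submit"] { cursor: pointer; }
```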

  7. In Opera, the pseudo-selectors "::selection" and ":active" are also among the more expensive ones, according to the profiler. I can understand ":active" being expensive, but I’m not sure why "::selection" is. Perhaps it’s a “bug” in Opera’s profiler/matcher, or just the way the engine works.

  8. In both Opera and WebKit, "border-radius" is among the most expensive CSS properties affecting rendering time — even more so than shadows and gradients. Note that it doesn’t affect layout time, as one might think, but mainly repaint.

    To test this, I created a document with 400 buttons.

    [Screenshot: the 400-button test page]

    I started checking how various styles affect rendering performance (“repaint time” in the profiler). The basic version of the button had only these styles:

    
    background: #F6F6F6;
    border: 1px solid rgba(0, 0, 0, 0.3);
    font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif;
    font-size: 14px;
    height: 32px;
    vertical-align: middle;
    padding: 7px 10px;
    float: left;
    margin: 5px;
    

    [Screenshot: a button with the basic styles]

    The total repaint time of the 400 buttons with only these basic styles was 6ms (in Opera). I then gradually added more styles and recorded the change in repaint time. The final version had these additional styles, and took 177ms to repaint — a 30x increase!

    
    text-shadow: rgba(255, 255, 255, 0.796875) 0px 1px 0px;
    box-shadow: rgb(255, 255, 255) 0px 1px 1px 0px inset, rgba(0, 0, 0, 0.0976563) 0px 2px 3px 0px;
    border-radius: 13px;
    background: -o-linear-gradient(bottom, #E0E0E0 50%, #FAFAFA 100%);
    opacity: 0.9;
    color: rgba(0,0,0,0.5);
    

    [Screenshot: a button with the final styles]

    The exact breakdown of each one of those properties was as follows:

    The text-shadow and linear-gradient were among the least expensive. Opacity and a transparent rgba() color were a little more expensive. Then came box-shadow, with the inset one (0 1px 1px 0) slightly faster than the regular one (0 2px 3px 0). Finally, there was the unexpectedly expensive border-radius.

    I also tried transform with a rotate parameter (just 1deg) and got really high numbers. Scrolling the page — with 400 slightly rotated buttons on it — was also noticeably jerky. I’m sure it’s not easy to arbitrarily transform an element on a page. Or maybe this is a case of lack of optimization? Out of curiosity, I checked different degrees of rotation and got this:

    Note how even rotating an element by 0.01 degree is very expensive. And as the angle increases, performance seems to drop, although not linearly but apparently in a wavy fashion (peaking at 45deg, then falling at 90deg).

    There’s room for so many tests here — I’d be curious to see performance characteristics of various transform options (translate, scale, skew, etc.) in various browsers.

  9. In Opera, the page zoom level affects layout performance. Decreasing zoom increases rendering time. This is quite understandable, as more stuff has to be rendered in the same area. It might seem like an insignificant detail, but to keep tests consistent, it’s important to make sure the zoom level doesn’t mess up your calculations. I had to redo all my tests after discovering this, just to make sure I wasn’t comparing oranges to grapefruits.

    Speaking of zoom, it could make sense to test a decreased font size and see how it affects the overall performance of an app — is it still usable?

  10. In Opera, resizing the browser window doesn’t affect rendering performance. It looks like layout/paint/style calculations are not affected by window size.

  11. In Chrome, resizing the browser window does affect performance. Perhaps Chrome is smarter than Opera and only renders visible areas.

  12. In Opera, page reloads negatively affect performance, and the progression is visibly linear. You can see from the graph how rendering time slowly increases over 40 page reloads (each of the red rectangles at the bottom corresponds to a page load followed by a few seconds’ wait). Paint time becomes almost 3 times slower by the end; it looks almost as if the page is leaking. To err on the side of caution, I always used the average of the first ~5 results to get “fresh” numbers.

    [Screenshot: profiler graph over repeated page reloads]

    Script used for testing (reloading the page):

    
    window.onload = function() {
      setTimeout(function() {
        // The current reload count is carried in the query string (e.g. "?3").
        var match = location.href.match(/\?(\d+)$/);
        var index = match ? parseInt(match[1], 10) : 0;
        var numReloads = 10;
        index++;
        if (index < numReloads) {
          location.href = location.href.replace(/\?\d+$/, '') + '?' + index;
        }
      }, 5000); // give the profiler a few seconds between reloads
    };
    

    I haven’t checked whether page reloads affect performance in WebKit/Chrome.

  13. An interesting offending pattern I came across was a SASS chunk like this:

    
    a.remove > * {
      /* some styles */
      .ie7 & {
        margin-right: 0.25em;
      }
    }
    

    ..which would generate CSS like this:

    
    a.remove > * { /* some styles */ }
    .ie7 a.remove > * { margin-right: 0.25em }
    

    Notice the additional IE7 selector, and how it has a universal rule. We know that universal rules are slow due to right-to-left matching, so all browsers except IE7 (which the .ie7 class — probably on the <html> element — is supposed to target) take an unnecessary performance hit. This is obviously the worst case of an IE7-targeted selector.

    Other ones were more innocent:

    
    .steps {
      li {
        /* some styles */
        .ie7 & {
          zoom: 1;
        }
      }
    }
    

    ..which produces CSS like:

    
    .steps li { /* some styles */ }
    .ie7 .steps li { zoom: 1 }
    

    But even in this case the engine needs to check each <li> element (that’s within an element with class “steps”) until it “realizes” that there’s no element with class “ie7” further up the tree.

  14. In my case, there were close to a hundred such .ie7- and .ie8-based selectors in the final stylesheet. Some of them were universal. The fix was simple — move all IE-related styles to a separate stylesheet, included via conditional comments. As a result, there were that many fewer selectors to parse, match, and apply.

    Unfortunately, this kind of optimization comes at a price. I find that putting IE-related styles next to the original ones is actually a more maintainable solution. When changing/adding/removing something in the future, there’s only one place to change, so there’s less chance of forgetting IE-related fixes. Perhaps in the future, tools like SASS could optimize declarations like these out of the main file and into conditionally included ones.
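
    The conditional-comments setup can be sketched as markup (the filenames here are hypothetical; the conditional-comment syntax itself is standard IE behavior):

```html
<!-- Main stylesheet: no .ie7/.ie8 selectors, so other browsers skip that cost -->
<link rel="stylesheet" href="main.css">
<!-- IE-only fixes, downloaded and parsed only by IE 8 and below -->
<!--[if lte IE 8]><link rel="stylesheet" href="ie.css"><![endif]-->
```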

  15. In Chrome (and WebKit), you can use the “Timeline” tab in Developer Tools to get similar information about repaint/reflow/style recalculation performance. The Timeline tab allows you to export data as JSON. The first time I saw this done was by Marcel Duran in this year’s Performance Calendar. Marcel used node.js and a script to parse and extract the data.

    Unfortunately, his script included “Recalculate styles” time in the “layout” time — something I wanted to avoid. I also wanted to avoid page reloads (and averaging the results). So I tweaked it into a much simpler version. It walks over the entire data set, filtering entries related to Repaint, Layout, and Style Recalculation, then sums up the total time for each of those:

    
    var LOGS = './logs/',
        fs = require('fs'),
        files =  fs.readdirSync(LOGS);
    
    files.forEach(function (file, index) {
      var content = fs.readFileSync(LOGS + file),
          log,
          times = {
            Layout: 0,
            RecalculateStyles: 0,
            Paint: 0
          };
    
      try {
        log = JSON.parse(content);
      }
      catch(err) {
        console.log('Error parsing', file, ' ', err.message);
      }
      if (!log || !log.length) return;
    
      log.forEach(function (item) {
        if (item.type in times) {
          times[item.type] += item.endTime - item.startTime;
        }
      });
    
      console.log('\nStats for', file);
      console.log('\n  Layout\t\t', times.Layout.toFixed(2), 'ms');
      console.log('  Recalculate Styles\t', times.RecalculateStyles.toFixed(2), 'ms');
      console.log('  Paint\t\t\t', times.Paint.toFixed(2), 'ms\n');
      console.log('  Total\t\t\t', (times.Layout + times.RecalculateStyles + times.Paint).toFixed(2), 'ms\n');
    });
    

    After saving the timeline data and running the script, you get information like this:

    
    Layout                      6.64 ms
    Recalculate Styles          0.00 ms
    Paint                       114.69 ms
    
    Total                       121.33 ms
    

    Using Chrome’s “Timeline” and this script, I re-ran the button test I had earlier run in Opera and got this:

    Similarly to Opera, border-radius was among the least performant properties. However, linear-gradient was comparatively more expensive than in Opera, and box-shadow was much more expensive than text-shadow.

    One thing to note about the Timeline is that it only provides “Layout” information, whereas Opera’s profiler has both “Reflow” AND “Layout”. I’m not sure if reflow data analogous to Opera’s is included in WebKit’s “Layout” or if it’s discarded. Something to find out in the future, in order to have correct test results.

  16. When I was almost done with my findings, WebKit added a selector profiler similar to Opera’s.

    I wasn’t able to do many tests with it, but I noticed one interesting thing: selector matching in WebKit was dramatically faster than in Opera. The same document — the one-page app I was working on (before optimizations) — took 1,144ms of selector matching in Opera, and only 18ms in WebKit. That’s a ~65x difference. Either something is off in the calculations of one of the engines, or WebKit really is much, much faster at selector matching. For what it’s worth, Chrome’s Timeline was showing ~37ms for total style recalculation (much closer to WebKit) and ~52ms for repaint (compare to Opera’s 225ms “Paint” total; different, but much closer). I wasn’t able to save “Timeline” data in WebKit, so I couldn’t check reflow and repaint numbers there.

Summary

Questions

As I end these notes, I have tons of other questions related to CSS performance:

Future

As our pages/apps become more interactive, the complexity of CSS increases, and browsers start to support more and more “advanced” CSS features, CSS performance will probably become even more important. The existing tools are only scratching the surface. We need tools for mobile testing, and tools in more browsers (IE, Firefox). I created a ticket for Mozilla, so perhaps we’ll see something come out of it soon. I would love to see CSS performance data exposed via scripting, so that we could utilize it in tools like jsperf.com (cssperf.com?). Meanwhile, there are plenty of tests to be done with existing profilers. So what are you waiting for? ;)
