This article is in need of a technical review.
Performance means efficiency. In the context of Open Web Apps, this document explains in general what performance is, how the browser platform helps improve it, and what tools and processes you can use to test and improve it.
What is performance?
Ultimately, user-perceived performance is the only performance that matters. Users provide inputs to the system through through touch, movement, and speech. In return, they perceive outputs through sight, touch, and hearing. Performance is the quality of system outputs in response to user inputs.
All other things being equal, code optimized for some target besides user-perceived performance (hereafter UPP) loses when competing against code optimized for UPP. Users prefer, say, a responsive, smooth app that only processes 1,000 database transactions per second, over a choppy, unresponsive app that processes 100,000,000 per second. Of course, it's by no means pointless to optimize other metrics, but real UPP targets come first.
The next few subsections point out and discuss essential performance metrics.
Responsiveness simply means how fast the system provides outputs (possibly multiple ones) in response to user inputs. For example, when a user taps the screen, they expect the pixels to change in a certain way. For this interaction, the responsiveness metric is the time elapsed between the tap and the pixel change.
Responsiveness sometimes involves multiple stages of feedback. Application launch is one particularly important case discussed in more detail below.
Responsiveness is important simply because people get frustrated and angry when they're ignored. Your app is ignoring the user every second that it fails to respond to the user's input.
Framerate is the rate at which the system changes pixels displayed to the user. This is a familiar concept: everyone prefers, say, games that display 60 frames per second over ones that display 10 frames per second, even if they can't explain why.
Framerate is important as a "quality of service" metric. Computer displays are designed to "fool" user's eyes, by delivering photons to them that mimic reality. For example, paper covered with printed text reflects photons to the user's eyes in some pattern. By manipulating pixels, a reader app emits photons in a similar pattern to "fool" the user's eyes.
As your brain infers, motion is not jerky and discrete, but rather "updates" smoothly and continuously. (Strobe lights are fun because they turn that upside down, starving your brain of inputs to create the illusion of discrete reality). On a computer display, a higher framerate simply makes for a more faithful imitation of reality.
Note: Humans usually cannot perceive differences in framerate above 60Hz. That's why most modern electronic displays are designed to refresh at that rate. A television probably looks choppy and unrealistic to a hummingbird, for example.
Memory usage is another key metric. Unlike responsiveness and framerate, users don't directly perceive memory usage, but memory usage closely approximates "user state". An ideal system would maintain 100% of user state at all times: all applications in the system would run simultaneously, and all applications would retain the state created by the user the last time the user interacted with the application (application state is stored in computer memory, which is why the approximation is close).
From this comes an important but counter-intuitive corollary: a well-designed system does not maximize the amount of free memory. Memory is a resource, and free memory is a unused resource. Rather, a well-designed system has been optimized to use as much memory as possible to maintain user state, while meeting other UPP goals.
That doesn't mean the system should waste memory. When a system uses more memory than necessary to maintain some particular user state, the system is wasting a resource it could use to retain some other user state. In practice, no system can maintain all user states. Intelligently allocating memory to user state is an important concern that we go over in more detail below.
The final metric discussed here is power usage. Like memory usage, users perceive power usage only indirectly, by how long their devices can maintain all other UPP goals. In service of meeting UPP goals, the system must use only the minimum power required.
The remainder of this document will discuss performance in terms of these metrics.
Platform performance optimizations
This section provides a brief overview of how Firefox/Gecko contributes to performance generally, below the level of all applications. From a developer's or user's perspective, this answers the question, "What does the platform do for you?"
<canvas> element (which includes WebGL). Somewhere "in between" HTML/CSS and Canvas is SVG, which offers some benefits of both.
HTML and CSS greatly increase productivity, sometimes at the expense of framerate or pixel-level control over rendering. Text and images reflow automatically, UI elements automatically receive the system theme, and the system provides "built-in" support for some use cases developers may not think of initially, like different-resolution displays or right-to-left languages.
canvas element offers a pixel buffer directly for developers to draw on. This gives developers pixel-level control over rendering and precise control of framerate, but now the developers need to deal with multiple resolutions and orientations, right-to-left languages, and so forth. Developers draw to canvases using either a familiar 2D drawing API, or WebGL, a "close to the metal" binding that mostly follows OpenGL ES 2.0.
The graphics pipeline in Gecko that underpins HTML, CSS, and Canvas is optimized in several ways. The HTML/CSS layout and graphics code in Gecko reduces invalidation and repainting for common cases like scrolling; developers get this support "for free". Pixel buffers painted by both Gecko "automatically" and applications to
canvas "manually" minimize copies when being drawn to the display framebuffer. This is done by avoiding intermediate surfaces where they would create overhead (such as per-application "back buffers" in many other operating systems), and by using special memory for graphics buffers that can be directly accessed by the compositor hardware. Complex scenes are rendered using the device's GPU for maximum performance. To improve power usage, simple scenes are rendered using special dedicated composition hardware, while the GPU idles or turns off.
Fully static content is the exception rather than the rule for rich applications. Rich applications use dynamic content with
transition effects. Transitions and animations are particularly important to applications: developers can use CSS to declare complicated behaviour with a simple, high-level syntax. In turn, Gecko's graphics pipeline is highly optimized to render common animations efficiently. Common-case animations are "offloaded" to the system compositor, which can render them in a performant, power-efficient fashion.
An app's startup performance matters just as much as its runtime performance. Gecko is optimized to load a wide variety of content efficiently: the entire Web! Many years of improvements targeting this content, like parallel HTML parsing, intelligent scheduling of reflows and image decoding, clever layout algorithms, etc., translate just as well to improving Web applications on Firefox.
Note: See Firefox OS performance testing for more information about Firefox OS specifics that help to further improve startup performance.
This section is intended for developers asking the question: "How can I make my app fast"?
Application startup is punctuated by three user-perceived events, generally speaking:
- The first is the application first paint — the point at which sufficient application resources have been loaded to paint an initial frame
- The second is when the application becomes interactive — for example, users are able to tap a button and the application responds
- The final event is full load — for example when all the user's albums have been listed in a music player
The key to fast startup is to keep two things in mind: UPP is all that matters, and there's a "critical path" to each user-perceived event above. The critical path is exactly and only the code that must run to produce the event.
For example, to paint an application's first frame that comprises visually some HTML and CSS to style that HTML:
- The HTML must be parsed
- The DOM for that HTML must be constructed
- Resources like images in that part of the DOM have to be loaded and decoded
- The CSS styles must be applied to that DOM
- The styled document has to be reflowed
Nowhere in that list is "load the JS file needed for an uncommon menu"; "fetch and decode the image for the High Scores list", etc. Those work items are not on the critical path to painting the first frame.
It seems obvious, but to reach a user-perceived startup event more quickly, the main "trick" is run only the code on the critical path. Shorten the critical path by simplifying the scene.
Another problem that can delay startup is idle time, caused by waiting for responses to requests (like database loads). To avoid this problem, applications should issue requests as early as possible in startup (this is called "front-loading"). Then when the data is needed later, hopefully it's already available and the application doesn't have to wait.
Note: For much more information on improving startup performance, read Optimizing startup performance.
On the same note, notice that locally-cached, static resources can be loaded much faster than dynamic data fetched over high-latency, low-bandwidth mobile networks. Network requests should never be on the critical path to early application startup. Caching resources locally is also the only way applications can be used offline, and for standard Open Web Apps, at the moment this requires use of HTML5 AppCache.
Note: Firefox OS allows applications to cache resources by being installed as applications, either being "packaged" in a compressed ZIP file or "hosted" through HTML5 AppCache. How to choose between these options for a particular type of application is beyond the scope of this document, but in general application packages provide optimal load performance; AppCache is slower. Installable apps will hopefully be coming to other platforms soon!
The first important thing for high framerate is to choose the right tool. Use HTML and CSS to implement content that's mostly static, scrolled, and infrequently animated. Use Canvas to implement highly dynamic content, like games that need tight control over rendering and don't need theming.
For content drawn using Canvas, it's up to the developer to hit framerate targets: they have direct control over what's drawn.
For HTML and CSS content, the path to high framerate is to use the right primitives. Firefox is highly optimized to scroll arbitrary content; this is usually not a concern. But often trading some generality and quality for speed, such as using a static rendering instead of a CSS radial gradient, can push scrolling framerate over a target. CSS media queries allow these compromises to be restricted only to devices that need them.
Many applications use transitions or animations through "pages", or "panels". For example, the user taps a "Settings" button to transition into an application configuration screen, or a settings menu "pops up". Firefox is highly optimized to transition and animate scenes that:
- use pages/panels approximately the size of the device screen or smaller
- transition/animate the CSS
Transitions and animations that adhere to these guidelines can be offloaded to the system compositor and run maximally efficiently.
Memory and power usage
Improving memory and power usage is a similar problem to speeding up startup: don't do unneeded work or lazily load uncommonly-used UI resources. Do use efficient data structures and ensure resources like images are optimized well.
Modern CPUs can enter a lower-power mode when mostly idle. Applications that constantly fire timers or keep unnecessary animations running prevent CPUs from entering low-power mode. Power-efficient applications shouldn't do that.
When applications are sent to the background, a
visibilitychange event is fired on their documents. This event is a developer's friend; applications should listen for it. Applications that drop as many loaded resources as possible when sent to the background use less memory and are less likely discarded, in the case of Firefox OS (see the note below). This in turn means they "start up" faster (since they are already running) and have better UPP.
Note: As mentioned above, Firefox OS tries to keep as many applications running simultaneously as it can, but does have to discard applications sometimes, usually when the device runs out of memory. To find out more about how Firefox OS manages memory usage and kills apps when out of memory issues are encountered, read Debugging out of memory errors on Firefox OS.
Specific coding tips for application performance
The following practical tips will help improve one or more of the Application performance factors discussed above.
Use CSS animations and transitions
Instead of using some library’s
animate() function, which probably currently uses many badly performing technologies (
left positioning, for example) use CSS animations. In many cases, you can actually use CSS Transitions to get the job done. This works well because the browser is designed to optimize these effects and use the GPU to handle them smoothly with minimal impact on processor performance. Another benefit is that you can define these effects in CSS along with the rest of your app's look-and-feel, using a standardized syntax.
CSS animations give you very granular control over your effects using keyframes, and you can even watch events fired during the animation process in order to handle other tasks that need to be performed at set points in the animation process. You can easily trigger these animations with the
:target, or by dynamically adding and removing classes on parent elements.
Use CSS transforms
Instead of tweaking absolute positioning and fiddling with all that math yourself, use the
transform CSS property to adjust the position, scale, and so forth of your content. The reason is, once again, hardware acceleration. The browser can do these tasks on your GPU, letting the CPU handle other things.
In addition, transforms give you capabilities you might not otherwise have. Not only can you translate elements in 2D space, but you can transform in three dimensions, skew and rotate, and so forth. Paul Irish has an in-depth analysis of the benefits of
translate() from a performance point of view. In general, however, you have the same benefits you get from using CSS animations: you use the right tool for the job and leave the optimization to the browser. You also use an easily extensible way of positioning elements — something that needs a lot of extra code if you simulate translation with
left positioning. Another bonus is that this is just like working in a
Note: You may need to attach a
translateZ(0.1) transform if you wish to get hardware acceleration on your CSS animations, depending on platform. As noted above, this can improve performance, but can also have memory consumption issues. What you do in this regard is up to you — do some testing and find out what's best for your particular app.
requestAnimationFrame() instead of
window.setInterval() run code at a presumed frame rate that may or may not be possible under current circumstances. It tells the browser to render results even while the browser isn't actually drawing; that is, while the video hardware hasn't reached the next display cycle. This wastes processor time and can even lead to reduced battery life on the user's device.
Instead, you should try to use
window.requestAnimationFrame(). This waits until the browser is actually ready to start building the next frame of your animation, and won't bother if the hardware isn't going to actually draw anything. Another benefit to this API is that animations won't run while your app isn't visible on the screen (such as if it's in the background and some other task is operating). This will save battery life and prevent users from cursing your name into the night sky.
Make events immediate
As old-school, accessibility-aware Web developers we love click events since they also support keyboard input. On mobile devices, these are too slow. You should use
touchend instead. The reason is that these don’t have a delay that makes the interaction with the app appear sluggish. If you test for touch support first, you don’t sacrifice accessibility, either. For example, the Financial Times uses a library called fastclick for that purpose, which is available for you to use.
Keep your interface simple
One big performance issue we found in HTML5 apps was that moving lots of DOM elements around makes everything sluggish — especially when they feature lots of gradients and drop shadows. It helps a lot to simplify your look-and-feel and move a proxy element around when you drag and drop.
General application performance analysis
Firefox, Chrome, and other browsers include built-in tools that can help you diagnose slow page rendering. In particular, Firefox's Network Monitor will display a precise timeline of when each network request on your page happens, how large it is, and how long it takes.
The Built-in Gecko Profiler is a very useful tool that provides even more detailed information about which parts of the browser code are running slowly while the profiler runs. This is a bit more complex to use, but provides a lot of useful details.
Note: You can use these tools with the Android browser by running Firefox and enabling remote debugging.
Using YSlow (which requires Firebug) provides extremely helpful recommendations for improving performance. Many of the identified problems and suggested solutions are especially useful for mobile browsers. You should definitely run YSlow and follow its recommendations.
In particular, making dozens or hundreds of network requests takes longer in mobile browsers. Rendering large images and CSS gradients can also take longer. Simply downloading large files can take longer, even over a fast network, because mobile hardware is sometimes too slow to take advantage of all the available bandwidth. For useful general tips on mobile Web performance, have a look at Maximiliano Firtman's Mobile Web High Performance talk.
Testcases and submitting bugs
If the Firefox and Chrome developer tools don't help you find a problem, or if they seem to indicate that the Web browser has caused the problem, try to provide a reduced test case that maximally isolates the problem. That often helps in diagnosing problems.
See if you can reproduce the problem by saving and loading a static copy of an HTML page (including any images/stylesheets/scripts it embeds). If so, edit the static files to remove any private information, then send them to others for help (submit a Bugzilla report, for example, or host it on a server and share the URL). You should also share any profiling information you've collected using the tools listed above.