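Performance is a broad topic. This document gives a high-level overview of how Firefox OS is designed and optimized for performance, and introduces the tools and processes developers can use to improve the performance of their own code.
Performance is perceived entirely by users. Users provide input to the system through touch, movement, and speech. In return, they receive output as visual, tactile, and auditory feedback. Performance is the quality of that output in response to input.
Code optimized for some target other than user-perceived performance (hereafter UPP) will lose out to code optimized for UPP. Users prefer responsive, smooth applications even when their raw processing performance is lower. For example, users will prefer a responsive, smooth app that processes only 1,000 database transactions per second over a choppy, unresponsive app that processes 100,000,000 per second.
Of course, this does not mean that metrics like database transactions per second are meaningless; they certainly matter. The point is that they are secondary targets, and the primary target should be improving UPP.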
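There are several key performance metrics. The first is "responsiveness". Responsiveness is simply how quickly the system returns output (possibly more than one) in response to user input. For example, when users tap the screen, they expect the pixels to change in some way. In this case, the time elapsed between the "tap" gesture and the pixel change is the responsiveness metric.
Responsiveness often involves multiple stages of feedback. Application launch is one particularly important case, and is discussed in more detail later.
Responsiveness is important for the simple reason that no one wants to be ignored. The time between a user's input and the system's response is time during which the user is being ignored. Being ignored causes frustration and anger.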
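The next important metric is "framerate": the rate at which the system changes the pixels displayed to the user. This is a familiar concept, and everyone prefers higher framerates. For example, everyone would prefer a game that displays 60 frames per second over one that displays 10 frames per second, even if they can't explain why.
Framerate is important as a "quality of service" metric. Computer displays are designed to fool users' eyes by moving electrons around to imitate reality. For example, a document reader displays text by lighting display pixels designed to produce light that strikes the user's retinas in the same pattern as light reflected off crisp text on real paper.
In reality, motion is "continuous" (as far as our brains tell us, anyway); it's not jerky and discrete, but rather "updates" smoothly. (Strobe lights are fun because they turn that upside down, starving our brains of inputs to create the illusion of discrete reality.) On a computer display, a higher framerate simply allows the display to imitate reality more faithfully.
(Interestingly, humans usually cannot distinguish between framerates above 60Hz, which is why most modern electronic displays are designed to refresh at that rate. A television screen would probably look unnatural and choppy to a hummingbird, for example.)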
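To make the framerate figures above concrete, a framerate can be converted into the time available to produce each frame. The following is a minimal sketch; the function name `frameBudgetMs` is only an illustrative choice, not part of any Firefox OS API.

```javascript
// Time available to produce one frame at a given framerate, in milliseconds.
// `frameBudgetMs` is a hypothetical helper name used for illustration.
function frameBudgetMs(framesPerSecond) {
  if (!(framesPerSecond > 0)) {
    throw new RangeError("framesPerSecond must be a positive number");
  }
  return 1000 / framesPerSecond;
}

console.log(frameBudgetMs(60).toFixed(1)); // "16.7"
console.log(frameBudgetMs(10).toFixed(1)); // "100.0"
```

At 60 frames per second, all the work for a frame has to fit in roughly 16.7 milliseconds.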
Memory usage is another key metric. Unlike responsiveness and framerate, users don't directly perceive memory usage. However, memory usage is a close approximation to "user state". An ideal system would maintain 100% of user state at all times: all applications in the system would run simultaneously, and all applications would retain the state created by the user the last time the user interacted with the application. (Application state is stored in computer memory, which is why the approximation is close.)
An important corollary of this is contrary to popular belief: a well-designed system should not be optimized to maximize the amount of free memory. Memory is a resource, and free memory is an unused resource. Rather, a well-designed system should be optimized to use as much memory as possible in service of maintaining user state, while meeting other UPP goals.
Optimizing a system to use memory doesn't mean it should waste memory. Using more memory than is required to maintain some particular user state is wasting a resource that could be used to retain some other user state.
In reality, no system can maintain all user state. Intelligently allocating memory to user state is an important concern that's discussed in more detail below.
The final metric discussed here is power usage. Like memory usage, users don't directly perceive power usage. Users perceive power usage indirectly by their devices being able to maintain all other UPP goals for a longer duration. In service of meeting UPP goals, the system must use only the minimum amount of power required.
The remainder of this document will discuss performance in terms of these metrics.
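This section is a brief overview of how Firefox OS contributes to performance in general, underneath applications. From the point of view of an application developer or user, it answers the question: "What does the platform do for me?"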
This section assumes the reader is familiar with the basic conceptual design of Firefox OS.
Because the core operating system is built with the same web technologies that applications are built with, the performance of those technologies is critical. There's no "escape hatch". This greatly benefits developers because all the optimizations that enable a performant OS window manager, for example, are available to third-party applications as well. There's no "magic performance sauce" available only to preinstalled code.
HTML and CSS greatly increase productivity, at the expense of pixel-level control over rendering or a few frames per second. Text and images are reflowed automatically, the system theme is applied to UI elements by default, and "built-in" support is provided for some use cases developers may not think about initially, like different-resolution displays or right-to-left languages.
The canvas element offers a pixel buffer directly to developers to draw on. This gives pixel-level control over rendering and precise control of framerate to developers. But it comes at the expense of extra work needed to deal with multiple resolutions and orientations, right-to-left languages, and so forth. Developers draw to canvases using either a familiar 2D drawing API, or WebGL, a "close to the metal" binding that mostly follows OpenGL ES 2.0.
(Somewhere "in between" HTML/CSS and canvas is SVG, which is beyond the scope of this document.)
The graphics pipeline in Gecko underlying HTML/CSS and canvas is optimized in several ways. The HTML/CSS layout and graphics code in Gecko minimizes invalidation and repainting for common cases like scrolling; developers get this support "for free". Pixel buffers painted by both Gecko "automatically" and applications to canvas "manually" minimize copies when being drawn to the display framebuffer. This is done by avoiding intermediate surfaces where they would create overhead (such as per-application "back buffers" in many other operating systems), and by using special memory for graphics buffers that can be directly accessed by the compositor hardware. Complex scenes are rendered using the device's GPU for maximum performance. To improve power usage, simple scenes are rendered using special dedicated composition hardware, while the GPU idles or turns off.
Fully static content is the exception rather than the rule for rich applications. Rich applications use dynamic content with animations, transitions, and other effects. Transitions and animations are particularly important to applications. Developers can use CSS to declare even complicated transitions and animations with a simple, high-level syntax. In turn, Gecko's graphics pipeline is highly optimized to render common animations efficiently. Common-case animations are "offloaded" to the system compositor, which can render them both performantly and power efficiently.
The runtime performance of applications is important, but just as important is their startup performance. Firefox OS improves startup experience in several ways.
Gecko is optimized to load a wide variety of content efficiently: the entire Web! Many years of improvements targeting this content, like parallel HTML parsing, intelligent scheduling of reflows and image decoding, clever layout algorithms, etc, translate just as well to improving web applications on Firefox OS. The content is written using the same technologies.
Each web application has its own instance of the Gecko rendering engine. Starting up this large, complicated engine is not free, and because of that, Firefox OS keeps around a preallocated copy of the engine in memory. When an app starts up, it takes over this preallocated copy and can immediately begin loading its application resources.
Applications "start" most quickly when they're already running. To this end, Firefox OS tries to keep as many applications running in the background as possible, while not regressing the user experience in foreground applications. This is implemented by intelligently prioritizing applications, and discarding background applications according to their priorities when memory is low. For example, it's more disruptive to a user if their currently-playing music player is discarded in the background, while their background calculator application keeps running. So, the music player is prioritized above the calculator automatically by Firefox OS and the calculator is discarded first when memory is low.
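The idea of discarding background applications by priority can be illustrated with a toy model. This is only a sketch of the concept and not Firefox OS's actual implementation; the `priority` and `memoryMb` fields, the numbers, and the function name are all hypothetical.

```javascript
// Toy model only: a higher `priority` means "more disruptive to discard".
// Field names and values are hypothetical, chosen for illustration.
function chooseAppsToDiscard(backgroundApps, memoryNeededMb) {
  // Consider the least important applications first.
  const byPriority = [...backgroundApps].sort((a, b) => a.priority - b.priority);
  const discarded = [];
  let freedMb = 0;
  for (const app of byPriority) {
    if (freedMb >= memoryNeededMb) break;
    discarded.push(app.name);
    freedMb += app.memoryMb;
  }
  return discarded;
}

const apps = [
  { name: "music player (playing)", priority: 10, memoryMb: 30 },
  { name: "calculator", priority: 1, memoryMb: 12 },
];
// Freeing 10 MB only requires discarding the calculator.
console.log(chooseAppsToDiscard(apps, 10));
```

In this model the music player is discarded only if discarding the calculator does not free enough memory.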
Firefox OS prevents applications that are running in the background from impacting the user experience of foreground applications through two mechanisms. First, timers created by background apps are "throttled" to run at a low frequency. Second, background applications are given a low CPU priority, so that foreground applications can get CPU time when they need it.
In addition to the above, Firefox OS includes several features designed to improve power usage that are common to mobile operating systems. The Firefox OS kernel will eagerly suspend the device for minimal power usage when the device is idle. Relatedly, ICs like the GPU, cellular radio, and Wifi radio are powered down when not being actively used. Firefox OS also takes advantage of hardware support for media decoding.
Application performance
This section is intended for developers asking the question: "How can I make my app fast?"
Startup performance
Application startup is punctuated by three user-perceived events, generally speaking. The first is the application "first paint": the point at which sufficient application resources have been loaded to paint an initial frame. Second is when the application becomes interactive; for example, users are able to tap a button and the application responds. The final event is "full load", for example when all the user's albums have been listed in a music player.
The key to fast startup is to keep two things in mind: UPP is all that matters, and there's a "critical path" to each user-perceived event above. The critical path is exactly and only the code that must run to produce the event.
For example, to paint an application's first frame that comprises visually some HTML and CSS to style that HTML, (i) the HTML must be parsed; (ii) the DOM for that HTML must be constructed; (iii) resources like images in that part of the DOM have to be loaded and decoded; (iv) the CSS styles must be applied to that DOM; (v) the styled document has to be reflowed. Nowhere in that list is "load the JS file needed for an uncommon menu"; "fetch and decode the image for the High Scores list"; etc. Those work items are not on the critical path to painting the first frame.
It seems obvious, but to reach a user-perceived startup event more quickly, the main "trick" is to just not run code that's off the critical path. Alternatively, shorten the critical path by simplifying the scene.
Another problem that can delay startup is idle time, caused by waiting on responses to requests like database loads. To avoid this problem, applications can "front load" the work by issuing requests as early as possible in startup. Then when the data is needed later, it's hopefully already been fetched and the application doesn't need to wait.
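The "front loading" pattern described above might look like the following sketch. Here `loadSettings` is a hypothetical stand-in for any slow asynchronous request (a database read, for example) and is simulated with a timer; the point is only the ordering of the request and the `await`.

```javascript
// `loadSettings` is a hypothetical stand-in for a slow asynchronous request.
// It is simulated here with a 50ms timer.
function loadSettings() {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ theme: "dark" }), 50);
  });
}

async function startApp() {
  // Issue the request as early as possible, without awaiting it yet.
  const settingsPromise = loadSettings();

  // Do other startup work while the request is in flight.
  const firstFrame = "painted"; // placeholder for building the initial UI

  // Await only at the point where the data is actually needed.
  const settings = await settingsPromise;
  return { firstFrame, settings };
}
```

Awaiting the request at the top of `startApp` instead would leave the application idle for the full duration of the request before any other startup work begins.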
Relatedly, it's important to separate network requests for dynamic data from static content that can be cached locally. Locally-cached resources can be loaded much more quickly than they can be fetched over high-latency and lower-bandwidth mobile networks. Network requests should never be on the critical path to early application startup. Caching resources locally is also the only way applications can be used when "offline". Firefox OS allows applications to cache resources by either being "packaged" in a compressed ZIP file or "hosted" through HTML5 appcache. How to choose between these options for a particular type of application is beyond the scope of this document, but in general application packages provide optimal load performance; appcache is slower.
A few other hints are listed below:
- Don't include scripts or stylesheets that don't participate in the critical path in your startup HTML file. Load them when needed.
- Use the "defer" or "async" attribute on script tags needed at startup. This allows HTML parsers to process documents more efficiently.
- Don't force the web engine to construct more DOM than is needed. A "hack" to do this simply is to leave your HTML in the document, but commented out:
  <div id="foo"><!-- <div> ... --></div>
  When that part of the document needs to be rendered, load the commented HTML:
  var foo = document.getElementById("foo");
  foo.innerHTML = foo.firstChild.nodeValue; // the comment's text becomes live markup
- Use Web Workers for background processing. Only the application "main thread" can process user events and render primary UI. But a common use case is to fetch some data, process it, then update the UI. Use workers for this work and keep the main thread free for interacting with the user.
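As an illustration of the first hint, a script that is off the critical path can be loaded on demand by creating a script element when it is first needed. The following is a minimal sketch under stated assumptions: the helper name `loadScriptOnce` and the file name in the usage comment are hypothetical, and the `doc` parameter exists only so the sketch can be exercised outside a browser.

```javascript
// Loads a script at most once per URL and returns a promise for it.
// `doc` defaults to the page's document; passing it in is an assumption
// made so the helper can also be run outside a browser.
const loadedScripts = new Map();

function loadScriptOnce(src, doc = globalThis.document) {
  if (!loadedScripts.has(src)) {
    loadedScripts.set(src, new Promise((resolve, reject) => {
      const script = doc.createElement("script");
      script.src = src;
      script.async = true;
      script.onload = () => resolve(script);
      script.onerror = () => {
        loadedScripts.delete(src); // allow a later retry
        reject(new Error("Failed to load " + src));
      };
      doc.head.appendChild(script);
    }));
  }
  return loadedScripts.get(src);
}

// Hypothetical usage: fetch the menu code only when the menu is first opened.
// menuButton.addEventListener("click", () => loadScriptOnce("js/rare-menu.js"));
```

Caching the promise means repeated requests for the same script share one load instead of appending duplicate script elements.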
The first important consideration for achieving high framerate is to select the right tool for the job. Mostly static content that's scrolled and infrequently animated is usually best implemented with HTML/CSS. Highly dynamic content like games that need tight control over rendering, and don't need theming, is often best implemented with canvas.
For content drawn using canvas, it's up to the developer to hit framerate targets: they have direct control over what's drawn.
For HTML/CSS content, the path to high framerate is to use the right primitives. Firefox OS is highly optimized to scroll arbitrary content; this is usually not a concern. But often trading some generality and quality for speed, such as using a static rendering instead of a CSS radial gradient, can push scrolling framerate over a target. CSS media queries allow these compromises to be restricted only to devices that need them.
Many applications use transitions or animations through "pages", or "panels". For example, the user taps a "Settings" button to transition into an application configuration screen, or a settings menu "pops up". Firefox OS is highly optimized to transition and animate scenes that:
- Use pages/panels that are approximately the size of the device screen or smaller
- Transition/animate the CSS transform and opacity properties
Transitions and animations that adhere to these guidelines can be offloaded to the system compositor and run maximally efficiently.
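A panel transition that follows these guidelines could be set up along the lines of the sketch below. The function names and the 300ms duration are arbitrary choices for illustration, and `panel` is assumed to be any element-like object with a `style` property.

```javascript
// Slides a panel in by transitioning only `transform` and `opacity`,
// rather than left/top/right/bottom. The 300ms duration is arbitrary.
function slideIn(panel) {
  panel.style.transition = "transform 300ms ease-out, opacity 300ms ease-out";
  panel.style.transform = "translateX(0)";
  panel.style.opacity = "1";
}

// Slides the panel off to the right and fades it out.
function slideOut(panel) {
  panel.style.transition = "transform 300ms ease-in, opacity 300ms ease-in";
  panel.style.transform = "translateX(100%)";
  panel.style.opacity = "0";
}

// Hypothetical usage in a page:
// slideIn(document.getElementById("settings-panel"));
```

The same effect can be declared entirely in a stylesheet by toggling a class; the script form is used here only to keep the example self-contained.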
To help diagnose low framerates, see the section below.
Memory and power usage
Improving memory and power usage is a similar problem to speeding up startup: don't do unnecessary work; use efficient data structures; lazily load uncommonly-used UI resources; ensure resources like images are optimized well.
Modern CPUs can enter a lower-power mode when mostly idle. Applications that constantly fire timers or keep unnecessary animations running prevent CPUs from entering low-power mode. Power-efficient applications don't do that.
When applications are sent to the background, a visibilitychange event is fired on their documents. This event is a developer's friend; applications should listen for it. As mentioned above, Firefox OS tries to keep as many applications running simultaneously as it can, but does have to discard applications sometimes. Applications that drop as many loaded resources as possible when sent to the background will use less memory and be less likely to be discarded. This in turn means they will "start up" faster (by virtue of already being running) and have better UPP.
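One way an application might react to visibilitychange is sketched below: periodic work is stopped and caches are dropped while the document is hidden, and the work resumes when it becomes visible again. The names `manageBackgroundWork`, `tick`, and `releaseCaches` are hypothetical, and `doc` is passed in as a parameter so the sketch can run outside a browser.

```javascript
// Starts and stops a periodic task as the document becomes visible or hidden.
// `tick` and `releaseCaches` are hypothetical application callbacks.
function manageBackgroundWork(doc, { tick, releaseCaches, intervalMs = 1000 }) {
  let timerId = null;

  function update() {
    if (doc.hidden) {
      // In the background: stop firing timers and drop what can be reloaded.
      if (timerId !== null) {
        clearInterval(timerId);
        timerId = null;
      }
      releaseCaches();
    } else if (timerId === null) {
      // In the foreground: resume the periodic work.
      timerId = setInterval(tick, intervalMs);
    }
  }

  doc.addEventListener("visibilitychange", update);
  update(); // apply the current visibility state immediately
  return { isRunning: () => timerId !== null };
}

// Hypothetical usage in a page:
// manageBackgroundWork(document, { tick: refreshClock, releaseCaches: dropThumbnails });
```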
Similarly, applications should prepare for the case when they are discarded. To improve user-perceived memory usage, that is to say, making the user feel that more of their state is being preserved, applications should save the state needed to return the current view if discarded. If the user is editing an entry, for example, the state of the edit should be saved when entering the background.
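Saving edit state before a possible discard can be as simple as the sketch below. It assumes a storage object with the Web Storage `getItem`/`setItem` methods (such as `localStorage` in a page); the `"draft"` key and the shape of the saved object are hypothetical.

```javascript
// Saves and restores a draft so an edit survives the app being discarded.
// `storage` is any object with Web Storage style getItem/setItem methods;
// the "draft" key is a hypothetical name.
function saveDraft(storage, draft) {
  storage.setItem("draft", JSON.stringify(draft));
}

function restoreDraft(storage) {
  const raw = storage.getItem("draft");
  return raw === null ? null : JSON.parse(raw);
}

// Hypothetical wiring in a page:
// document.addEventListener("visibilitychange", () => {
//   if (document.hidden) saveDraft(localStorage, { entryId: 7, text: field.value });
// });
```

On the next launch, the application can call `restoreDraft` and return the user to the view they left.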
Measuring performance and diagnosing problems
Before measuring performance and diagnosing problems, remember this:
Always. Test. On. Device.
A great strength of the web platform is that the same code written for "desktop" web browsers runs on Firefox OS on mobile devices. Developers should use this to improve productivity: develop on "desktops", in comfortable and productive environments, as much as possible.
But when it comes time to test performance, mobile devices must be used. Modern desktops can be more than 100x more powerful than mobile hardware. The lower-end the mobile hardware tested on, the better.
With that caveat, the sections below describe tools and processes for measuring and diagnosing performance issues.
Firefox OS comes built-in with some convenient and easy-to-use tools that, when used properly, can be used to quickly measure performance. The first tool is the "framerate monitor". This can be enabled in the Firefox OS Settings application.
The framerate monitor continuously reports two numbers. The values reported are an average of recent results within a sliding window, meant to be "instantaneous" but fairly accurate. As such, both numbers are "guesses". The left number is the "composition rate": the estimated number of times per second Firefox OS is drawing frames to the hardware framebuffer. This is an estimate of the user-perceived framerate, and only an estimate. For example, the counter may report 60 compositions per second even if the screen is not changing. In that case the user-perceived framerate would be 0. However, when used with this caveat in mind and corroborated with other measurements, the monitor can be a useful and simple tool.
The rightmost number is the "layer transaction rate", the estimated number of times per second processes are repainting and notifying the compositor. This number is mostly useful for Gecko platform engineers, but it should be less than or equal to the composition rate number on the left.
Firefox OS also has a tool that can help measure startup time, specifically the "first paint" time described above. This "time-to-load" tool can be enabled using the Settings application. The value shown by the tool is the elapsed time between when the most recent application was launched, and an estimate of the first time that application painted its UI. This number only approximates the real "first paint" time, and in particular underestimates it. However, lowering this number almost always correlates to improvements in real startup time, so it can be useful to quickly measure optimization ideas.
For accurately measuring both startup times and responsiveness, a high-speed camera is indispensable. "High-speed" means that the camera can record 120 frames per second or above. The higher the capture rate, the more accurate the measurements that can be made. This may sound like exotic technology, but consumer models can be purchased for a few hundred US dollars.
The measuring process with these cameras is simple: record the action to be studied, and then play back the capture and count the number of frames that elapse between the input (say, a tap gesture) and the desired output (pixels changing in some way). Divide the number of counted frames by the capture rate, and the resulting number is the measured duration.
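The arithmetic described above is shown in the short sketch below. The function names are illustrative; the second helper reflects that a frame count can be off by about one frame, so the capture rate bounds the resolution of the measurement.

```javascript
// Converts a counted number of captured frames into a duration in milliseconds.
function durationMs(frameCount, captureFps) {
  return (frameCount * 1000) / captureFps;
}

// The time covered by a single captured frame, in milliseconds.
function resolutionMs(captureFps) {
  return 1000 / captureFps;
}

// 18 frames counted in a 120 frames-per-second recording:
console.log(durationMs(18, 120)); // 150
console.log(resolutionMs(120).toFixed(1)); // "8.3"
```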
Mozilla built an automated tool called Eideticker which operates on the same principle as described above. The difference is that Eideticker uses synthetic user input events and HDMI capture to measure durations. The code is available and can be used with any device with an HDMI output.
Measuring power can be more difficult. It's possible to jury-rig measurement apparatus with a soldering iron, but a good approximation of power usage can be gathered by observing CPU load. Simple command-line tools like |top| allow monitoring CPU usage continuously.
In general, when measuring performance, don't be proud! "Primitive technology" like a stopwatch or logging, when used effectively, can provide eminently usable data.
Diagnosing performance problems
If performance measurements show an application is below its targets, how can the underlying problem be diagnosed?
The first step of any performance work is to create a reproducible workload and reproducible measurement steps. Then gather baseline measurements, before any code changes are made. It seems obvious, but this is required to determine whether code changes actually improve performance! The measurement process selected isn't too important; what's important is that the process be (i) reproducible; (ii) realistic, in that it measures what users will perceive as closely as possible; (iii) as precise as possible; (iv) as accurate as possible. Even stopwatch timings can fit this spec.
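A baseline can be gathered with something as simple as the sketch below, which runs the same workload several times and reports the median. The names `measure` and `median` are illustrative, and the `now` parameter is an assumption added so the clock can be replaced; timing code inside the application is less realistic than an external measurement, so it complements rather than replaces the methods described earlier.

```javascript
// Returns the median of a non-empty list of numbers.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Runs the same workload several times and summarizes the timings.
// `now` defaults to Date.now; it is a parameter so the clock can be swapped.
function measure(workload, runs = 5, now = Date.now) {
  const timingsMs = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    workload();
    timingsMs.push(now() - start);
  }
  return { timingsMs, medianMs: median(timingsMs) };
}

// Hypothetical usage: record this before changing any code, then compare.
// console.log(measure(() => renderAlbumList(albums)).medianMs);
```

The median is used here because a single slow run, for example one interrupted by garbage collection, distorts it less than it would distort the mean.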
Firefox OS includes two built-in tools for quickly diagnosing some performance issues. The first is a render mode called "paint flashing". In this mode, every time a region of the screen is painted by Gecko, Gecko blits a random translucent color over the painted region. Ideally, only parts of the screen that visually change between frames will "flash" with a new color. But sometimes more area than is needed is repainted, causing large areas to "flash". This symptom may indicate that application code is forcing too much of its scene to update. It may also indicate bugs in Gecko itself.
The second tool is called "animation logging", and can also be enabled in Settings. This tool tries to help developers understand why animations are not offloaded to the compositor to be run as efficiently as possible. It reports "bugs" like trying to animate elements that are too large, or trying to animate CSS properties that can't be offloaded.
I/Gecko ( 5644): Performance warning: Async animation disabled because frame size (1280, 410) is bigger than the viewport (360, 518) [div with id 'views']
A common pitfall is to animate left/top/right/bottom properties instead of using CSS transforms to achieve the same effect. For a variety of reasons, the semantics of transforms make them easier to offload, but left/top/right/bottom are much more difficult. Animation logging will report this.
Advanced users may also wish to use a whole-system profiler like the Linux |perf| tool. This is mostly useful for platform engineers, though.
As with measuring performance, don't be proud about tools used to diagnose it! A few well-placed Date.now() calls with logging can often provide a quick and accurate answer.
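A small wrapper makes this kind of Date.now() logging easy to add and remove. The sketch below is one possible shape; the name `timed` and the log message format are arbitrary choices.

```javascript
// Wraps a function so that each call logs how long it took.
// `log` defaults to console.log; `label` is any descriptive string.
function timed(label, fn, log = console.log) {
  return function (...args) {
    const start = Date.now();
    try {
      return fn.apply(this, args);
    } finally {
      // Runs whether `fn` returns or throws.
      log(label + " took " + (Date.now() - start) + "ms");
    }
  };
}

// Hypothetical usage:
// const parseContacts = timed("parseContacts", rawParseContacts);
```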
Finally, the only way to keep improving performance is to not regress it. The only way to not regress performance is to test it, preferably with automated tests. A full discussion of that topic is beyond the scope of this document, though.
Firefox OS Performance & Optimization Workshop, Paris, March 4 - 8, 2013
To illustrate these concepts, here are some videos and slides from the Paris workshop dedicated to performance and optimization.
Part 1: Technical basics and more (Gabriele & Thomas)
Part 2: Performances in a UX point of view (Josh)
Part 3: Performances measurement & automation (Julien & Anthony)