Firefox OS/Performance/App Performance Validation
Note: This page has now been edited and published on MDN as part of the App Center performance section; see https://developer.mozilla.org/en-US/Apps/Build/Performance/App_performance_validation. Any further changes should be made there. Talk to Chris Mills for more details.
This page outlines a simple process for manually auditing and improving the performance of an app.
Sections listed first have a higher priority. For example, fixing over-invalidation and responsiveness will vastly reduce checkerboarding, so those steps should be performed first when possible.
1. Over-invalidation
Platform Contact: Layout team
What does it mean: Over-invalidation means that we're repainting content that isn't changing. This leads to higher CPU usage & bandwidth.
Verification Steps:
- In the Developer menu turn on 'Flash repainted area'.
- Perform all common interactions with the app. Look for areas of the page that are flashing but where the content hasn't changed.
- Common issues: Please add any common issues to this list. Currently empty.
Debugging:
- Attach with the App Manager and tweak the structure and styles of the page to trigger the desired behavior. Positioning, z-index, and opacity styles are common culprits here.
- Use dump-painting: https://wiki.mozilla.org/Gecko:DisplayListBasedInvalidation#Debugging_Invalidations_Problems (Example Displaylist: https://bug979026.bugzilla.mozilla.org/attachment.cgi?id=8396816)
Report:
- Report as completed once all excessive painting is resolved.
2. Reflows/Restyles
Platform Contact: Layout team
What does it mean:
NOTE: All of the following information also applies to restyles. Consider minimizing restyles, avoiding sync style queries and batching all style changes in the same frame (or the same event, if possible).
Reflowing the page is the process of deciding where all the DOM elements will be positioned on the page. Any change to the page that can affect the flow of the page requires the page to be reflowed again. The larger the DOM, the longer this can take.
Reflowing is expected as the page/app is being loaded. Once the page is ready, app authors should carefully consider which interactions should make changes that can affect the flow of the document. If changes to the page are required, they should happen in the same frame/refresh tick, as all changes will then be validated at the same time. Reflows should not happen on scrolling unless using a virtual list approach.
Sync reflow happens when the position information (like scrollLeft/scrollTop/clientWidth/clientHeight) of a DOM node is queried while there are pending changes to the flow of the document. The reflow can then no longer be delayed to the next refresh tick: all current execution is blocked while the document is reflowed.
Sometimes reflows can be avoided by using features like CSS transforms that don't affect the flow of the page.
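One way to keep layout queries from interleaving with pending style changes is to queue them separately and run all reads before all writes in one tick. The following is a minimal sketch, not a Gecko or Gaia API: createBatcher, measure, mutate and the schedule parameter are hypothetical names introduced for illustration. In an app, schedule would typically wrap window.requestAnimationFrame.

```javascript
// Minimal read/write batcher (illustrative sketch; all names are hypothetical).
// Layout reads and style writes are queued separately, then every read runs
// before any write in a single flush, so no read sees a pending change.
function createBatcher(schedule) {
  // `schedule` is any function that runs a callback later. In an app this
  // would typically be (cb) => window.requestAnimationFrame(cb).
  const reads = [];
  const writes = [];
  let scheduled = false;

  function flush() {
    scheduled = false;
    // Snapshot the queues so jobs added during this flush wait for the next one.
    const readJobs = reads.splice(0);
    const writeJobs = writes.splice(0);
    readJobs.forEach((job) => job());  // all measurements first
    writeJobs.forEach((job) => job()); // then all style/DOM changes
  }

  function request() {
    if (!scheduled) {
      scheduled = true;
      schedule(flush);
    }
  }

  return {
    measure(job) { reads.push(job); request(); },
    mutate(job) { writes.push(job); request(); },
  };
}

// Possible usage in a page (assumes `list` and `item` are DOM elements):
//   const batcher = createBatcher((cb) => window.requestAnimationFrame(cb));
//   let width;
//   batcher.measure(() => { width = list.clientWidth; });
//   batcher.mutate(() => { item.style.width = width + 'px'; });
```

Because the flush is scheduled only once per tick, any number of measure and mutate calls made during one event end up in the same frame.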
Verification Steps:
- In the Developer menu -> Developer HUD, turn on the 'Reflow' counter.
- Perform all common interactions with the app. Reflows should occur only when necessary, i.e. when the structure of the app must change.
More information: http://paulrouget.com/e/fxoshud/
3. Event loop responsiveness
Platform Contact: Varies, use profilers to narrow down the issue or contact the Performance team.
What does it mean: See http://paulrouget.com/e/fxoshud/ for an excellent description. Keeping the event loop delay low is important to having a smooth and responsive app.
This step is very important. It's only listed 3rd to make sure the simplest problems have been resolved first.
Verification Steps:
- In the Developer menu -> Developer HUD, turn on the 'Jank' counter.
- Perform all common interactions with the app. Track the 'Jank' number for each interaction. When that number is above 10ms, your app's scrolling is affected. Once that number reaches 100ms, painting will fall behind and input will be delayed. 500ms+ indicates that the interaction will be noticeably jerky.
- Understand the worst-case responsiveness for each interaction of your app. Consider various workloads for your app (like 5000 pictures in the Gallery App).
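When recording 'Jank' readings by hand, it can help to reduce them to a worst case per interaction. The sketch below applies the thresholds from the steps above to readings you have noted down yourself. The function names and the shape of the samples object are assumptions made for illustration, and the readings shown in the usage comment are made up.

```javascript
// Map a jank reading (in ms) to the impact described in the verification steps.
// The thresholds come from this page; the labels are illustrative wording.
function classifyJank(delayMs) {
  if (delayMs >= 500) return 'noticeably jerky';
  if (delayMs >= 100) return 'painting falls behind, input delayed';
  if (delayMs > 10) return 'scrolling affected';
  return 'ok';
}

// Reduce hand-recorded readings to the worst case per interaction.
// `samples` maps an interaction name to an array of readings in ms
// (a hypothetical format, not something the Developer HUD exports).
function worstCaseReport(samples) {
  const report = {};
  for (const name of Object.keys(samples)) {
    const readings = samples[name];
    if (readings.length === 0) continue; // nothing recorded for this interaction
    const worstMs = Math.max(...readings);
    report[name] = { worstMs: worstMs, impact: classifyJank(worstMs) };
  }
  return report;
}

// Possible usage with made-up readings:
//   worstCaseReport({ 'scroll gallery': [4, 12, 130], 'open photo': [8, 9] });
```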
Debugging:
- Profile the app's main thread. This is useful if you can't make an educated guess: https://developer.mozilla.org/en-US/docs/Performance/Profiling_with_the_Built-in_Profiler#Profiling_Boot_to_Gecko_%28with_a_real_device%29
- For JavaScript-related slowdowns: add function duration logging throughout the code.
- Disable some functionality of your app to see if that reduces the event loop lag (jank).
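Function duration logging can be added without editing each function body by wrapping the functions you suspect. This is a minimal sketch: withDurationLogging and its log and now parameters are hypothetical names, and it only measures synchronous time spent inside the call, not work the function schedules for later.

```javascript
// Wrap a function so that every call logs how long it ran (synchronously).
// `log` and `now` are injectable for testing; in a browser, `now` could be
// () => performance.now() for sub-millisecond resolution.
function withDurationLogging(label, fn, log = console.log, now = Date.now) {
  return function (...args) {
    const start = now();
    try {
      return fn.apply(this, args);
    } finally {
      // Runs even if fn throws, so slow failing calls are logged too.
      log(label + ' took ' + (now() - start) + 'ms');
    }
  };
}

// Possible usage (renderThumbnails is a hypothetical app function):
//   renderThumbnails = withDurationLogging('renderThumbnails', renderThumbnails);
```

Wrapping at the call site keeps the logging easy to remove once the slow function has been found.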
Report:
- Report the numbers for the various interactions and the workloads used.
4. Load Time
Platform Contact:
Varies, use profilers to narrow down the issue or contact the Performance team.
What does it mean:
The duration, in milliseconds, required to load the app. This is usually pretty difficult to measure because the definition of "load the app" varies from app to app. Generally though, we're talking about the time between the user launching the app and the visible UI being fully drawn and responsive. I put an emphasis on the visible UI because many apps require lots of computation and/or I/O to initialize their state enough to draw the full UI for the initial app screen. If every app waited until it was fully initialized before drawing the UI, they would all appear to be very slow and unresponsive.
The trick to getting a fast load time is to do as little as possible before putting up the visible part of the UI. All long-running calculations and I/O should be deferred to an idle timer callback if possible. Apps that load to list UIs (e.g. contacts, sms, email) should only load enough data to display the UI that is initially visible to the user. The rest of the list can then be loaded in an idle timer callback.
The real key here is to make sure that the UI is drawn as early as possible. Preferably, show a UI that isn't going to have a lot of box size re-adjustments due to reflows. Having a bunch of box resizes during load makes the app feel like a web page rather than a native app. The goal here is to provide a native app experience using web technologies.
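The staged loading described above for list UIs can be sketched as follows. This is an illustration under stated assumptions, not how any particular Gaia app does it: loadListInStages, render and defer are hypothetical names, and defer stands in for whatever idle or timer callback the app uses, for example (cb) => setTimeout(cb, 0).

```javascript
// Render the initially visible items immediately, then load the rest in
// small chunks from deferred callbacks so the first paint is not delayed.
// `chunkSize` is assumed to be a positive integer.
function loadListInStages(items, visibleCount, chunkSize, render, defer) {
  // Stage 1: only what the user can see right away.
  render(items.slice(0, visibleCount));

  // Stage 2: the remainder, one chunk per deferred callback, so that no
  // single callback blocks the event loop for long.
  let next = visibleCount;
  function loadNextChunk() {
    if (next >= items.length) return; // everything has been rendered
    render(items.slice(next, next + chunkSize));
    next += chunkSize;
    defer(loadNextChunk);
  }
  if (next < items.length) defer(loadNextChunk);
}

// Possible usage (contacts and appendRows are hypothetical):
//   loadListInStages(contacts, 10, 50, appendRows, (cb) => setTimeout(cb, 0));
```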
Verification Steps:
The easiest way to verify this is to use the existing profiler. Since there are no obvious markers in a profile showing where your app's load ends, it is up to you to either output data to the log or call a dummy function at the point where loading should be done, and then find that function call in the profile. The latter method is tricky because of the asynchronous nature of the callbacks and the rendering code. It all comes down to knowing your app's code well and then spending time carefully examining the profiles in the Cleopatra tool.
If you are working on a B2G app, the datazilla tool measures cold_load_time for each of those. That's a good place to start when checking whether there has been a regression.
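Both marking techniques mentioned above (logging, and calling a recognizable function) can be combined in one helper. The sketch below is an assumption-laden illustration: createLoadMarker and appVisuallyLoaded are hypothetical names, and where to call the marker depends entirely on your app's own definition of "loaded".

```javascript
// Create a marker to call once the visible UI is drawn and responsive.
// It logs the elapsed time since creation and has a distinctive name so
// the call is easier to search for in a profile.
function createLoadMarker(log = console.log, now = Date.now) {
  const createdAt = now(); // create the marker as early as possible at startup
  let reported = false;

  return function appVisuallyLoaded() {
    if (reported) return null; // only the first call counts
    reported = true;
    const elapsedMs = now() - createdAt;
    log('app load: visible UI ready after ' + elapsedMs + 'ms');
    return elapsedMs;
  };
}

// Possible usage:
//   const markLoaded = createLoadMarker();   // first line of startup script
//   ...                                      // draw the initial screen
//   markLoaded();                            // where you consider loading done
```

Note that this measures from script start, not from the moment the user launched the app, so it understates the load time the user perceives.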
Report:
- Capture profiles using the built-in profiler.
- Upload them to Cleopatra and put the Cleopatra URL in the bug.
- Contact the performance team if you're stuck.
5. Layer Tree
Platform Contact: Layout & Graphics
What does it mean:
Having a good layer tree means we can perform most changes without having to repaint the page. Think of a cartoon on television like The Simpsons: most cartoons divide their scene into a few layers and move the layers themselves instead of redrawing the scene for every frame of the cartoon. Using layers to perform animations means we can avoid redrawing the page every frame and can simply move these layers around.
Verification Steps:
NOTE: Verifying an optimal layer tree is a complex task that requires a lot of internal knowledge of Gecko. Be ready to ask for help.
- In the Developer menu, turn on 'Layer border' and/or 'Flash repainted area'.
- Perform all common interactions with the app.
- Look for borders around the content that is animating.
- Verify that paint flashing is kept to a minimum as done in the 'Over-invalidation' section.
and
- Enable 'layers.dump'. Each frame the exact layer structure will be printed to 'adb logcat'.
- Perform all common interactions with the app.
Look for the following (Note that they -may- not be bugs):
- Layers that are created around content that isn't animating.
- Layers that are created and destroyed because something changes once.
- Large layers that are created at the start or during an important user interaction.
- Excessive number of layers.
- RGBA layers in the layers.dump when the animated content is opaque
Debugging:
- Attach with the App Manager and tweak the structure and styles of the page to trigger the desired behavior.
and
- Get a display list dump. Build b2g with 'export B2G_DUMP_PAINTING=1' in your .userconfig (note: if you have a debug build, you don't need this). Set the preference 'user_pref("layout.display-list.dump", true);'.
- Look for the display list dump for the particular process at the moment of interest. It will contain a list of display items and their mappings between layers and their original frames.
Creating/Destroying layers: Continuously creating and destroying layers can hurt performance. Setting the pref "layers.flash-borders" when layer borders are activated will add a fading animation to layer borders when layers are created. Immediately after a layer is created, its border is black, and it then fades into its usual color (green for most layers) over a second or two. This can highlight counter-intuitive behaviors, such as elements that are already "layerized" having their layer destroyed and recreated after a style change (for example, an animated element that moves around and then fades to transparent currently gets its layer recreated at the start of its opacity transition). Beware that this triggers full-tilt compositing (the compositor will try to composite continuously at 60fps regardless of whether anything has changed on the screen). So while this can be useful for understanding the layerization behavior in some cases, it should not be used while doing performance measurements.
6. Checkerboarding
Platform Contact: Graphics, verify previous steps first
What does it mean:
NOTE: Ideally, at this point this section is just a verification. If over-invalidation, reflows and event loop responsiveness are good, then checkerboarding should be minimal to none.
Checkerboarding is when the app fails to keep up with the panning happening in the main process. When this happens, a solid background color will be shown temporarily until the page catches up.
Verification Steps:
- Perform all common interactions with the app including heavy scrolling.
- Look for the solid background color.
- Consider first repeating some of the previous steps to rule out over-invalidation, reflows and an unresponsive event loop.
7. Compositor FPS
Platform Contact: Graphics
What does it mean:
Check that the generated layer tree can be composited by the GPU every frame.
Verification Steps:
- Turn on 'Frames per second' from the Developer HUD.
- Watch the leftmost FPS counter during animations and scrolling. This number should remain very close to the refresh rate of the display (~60 FPS for most devices). Large layer transformations (check with layer border/layers.dump) will cause a dip in this counter.
and
- Turn on 'Frames per second' from the Developer HUD.
- Set layers.offmainthreadcomposition.frame-rate to 0.
- Watch the leftmost FPS counter while idle at various stages of your app. This number should remain very close to the refresh rate of the display (~60 FPS for most devices). There should be little to no dips below 60 FPS here.
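To put the ~60 FPS target in concrete terms: at 60 Hz each frame has a budget of 1000 / 60, or roughly 16.7ms. The sketch below shows that arithmetic applied to a list of frame timestamps. It is not how the Developer HUD computes its counter. summarizeFrames is a hypothetical helper, the timestamps are assumed to come from whatever source you have, and the 1.5x budget cut-off for counting a frame as missed is an arbitrary choice made for this example.

```javascript
// Summarize a series of frame timestamps (in ms, ascending).
// Returns the per-frame budget, the average FPS over the whole series,
// and how many frame intervals clearly exceeded the budget.
function summarizeFrames(timestampsMs, refreshRateHz = 60) {
  const budgetMs = 1000 / refreshRateHz; // about 16.7ms at 60 Hz
  let missed = 0;
  for (let i = 1; i < timestampsMs.length; i++) {
    const intervalMs = timestampsMs[i] - timestampsMs[i - 1];
    // Assumption: an interval over 1.5x the budget counts as a missed frame.
    if (intervalMs > budgetMs * 1.5) missed++;
  }
  const frames = Math.max(timestampsMs.length - 1, 0);
  const elapsedMs = frames > 0 ? timestampsMs[timestampsMs.length - 1] - timestampsMs[0] : 0;
  const fps = elapsedMs > 0 ? (frames * 1000) / elapsedMs : 0;
  return { budgetMs: budgetMs, fps: fps, missed: missed };
}

// Possible usage with made-up timestamps:
//   summarizeFrames([0, 16, 33, 50, 100]);
```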
8. Memory usage
Platform Contact: Memshrink
What does it mean: Check that memory consumption is not excessive.
Verification Steps:
- Use tools/get_about_memory.py to get a memory report dump while the app is running, both shortly after start-up and after using the app for a while.
- Load the dumps in about:memory in desktop Firefox. Check for any measurements that seem to be excessive, particularly "heap-unclassified".
- If "heap-unclassified" is high, re-run tools/get_about_memory.py with a DMD-enabled build, which will produce data that can be used to understand where additional memory reporters need to be added.
9. Profiling
Platform Contact: Performance Team & Graphics
What does it mean:
At this point most common issues should have been fixed. It's important to check the other things first because bugs like over-invalidation will make a profile incorrectly suggest that the page is expensive to paint, when in fact we're simply painting too much.
With sampling profiling we can get a good approximation of where the remaining CPU time is spent, and look for stages in the pipeline that are exceeding their budget or for things that are running that simply shouldn't be.
Verification Steps:
- Start by profiling only the main thread at a high resolution.
- Profile the app and the compositor: ./profile.sh start b2g -t Compositor && ./profile.sh start APP_NAME_OR_PID
10. Power usage
Platform Contact: Performance Team
What does it mean:
Power usage for a given change is difficult to measure. Power usage is directly correlated with CPU usage, and also with the use of various hardware features of the phone.
Verification Steps:
If you have a power harness available to you, you can do before and after power measurements by following the instructions on this page. If you don't have a power harness, please contact the performance team (#fxos-perf on IRC) for help.