Gecko:PortlandRendering
From MozillaWiki
What Gecko does
- On every paint, on the main thread:
- Build fresh display list (tree) for entire window (nsFrame::BuildDisplayList)
- Analyze display list (FrameLayerBuilder)
- Build layers for some items
- Complex heuristics to choose resolution
- Accumulate other items into PaintedLayers (buffers)
- Simple PaintedLayers (solid colors, single image) optimized to ColorLayer/ImageLayer
- Items assigned to PaintedLayers based on "animated geometry roots" (items are placed in the same PaintedLayer when they move together via scrolling, etc.)
- Choose resolutions for PaintedLayers
- Recycle existing layers when possible
- Deem some layers as "inactive" in which case we create them, but rasterize and composite them into PaintedLayers using the CPU
- Track changes to items in PaintedLayers to compute precise invalid areas (DLBI)
- Paint invalid areas of PaintedLayers (rasterizing on main thread)
- Send all layer changes to compositor
- Pro: Very flexible layer assignment (full CSS compliance)
- Pro: Highly optimized layer tree (memory usage and component-alpha avoidance)
- Pro: Very precise invalidation; minimal invalidation cost during reflow
- Con: high overhead for small paints (MazeSolver), scaling up as displayports grow
- Con: can trigger invalidation of scrolled content
- Con: complex
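The assignment of items to PaintedLayers by animated geometry root can be illustrated with a deliberately simplified sketch. This is not FrameLayerBuilder's algorithm: it ignores overlap testing, active/inactive layers, resolution choice and layer recycling, and the names `DisplayItem`, `PaintedLayer` and `assign_to_painted_layers` are hypothetical. It only shows the core idea that consecutive items, in paint order, share a buffer when they move together.

```python
from dataclasses import dataclass, field


@dataclass
class DisplayItem:
    name: str
    agr: str  # identifier of the item's animated geometry root (hypothetical)


@dataclass
class PaintedLayer:
    agr: str
    items: list = field(default_factory=list)


def assign_to_painted_layers(items):
    """Group items, in paint order, into runs that share an animated geometry root.

    A new layer is started whenever the root changes, so paint order is
    preserved. Real code would also try to merge items into lower layers
    when nothing overlaps; that optimization is omitted here.
    """
    layers = []
    for item in items:
        if not layers or layers[-1].agr != item.agr:
            layers.append(PaintedLayer(item.agr))
        layers[-1].items.append(item.name)
    return layers


paint_order = [
    DisplayItem("page-background", "root"),
    DisplayItem("article-text", "scrollbox"),
    DisplayItem("article-image", "scrollbox"),
    DisplayItem("fixed-header", "root"),
]
layers = assign_to_painted_layers(paint_order)
```

In this example the two scrolled items share one layer, while the fixed header needs a second "root" layer because it paints on top of the scrolled content.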
What Blink does (in Gecko terms)
- At frame construction time, assign layers to frames
- During reflow and at other times, accumulate invalid regions via explicit invalidation
- On every paint, on the main thread:
- Repaint invalid areas using Skia
- On another Skia thread:
- Rasterize paint commands
- Pro: low overhead for small paints
- Pro: no invalidation of scrolled content
- Pro: simpler
- Con: unfixable CSS rendering bugs
- Con: less efficient layer trees generated
- Con: less precise invalidation in some situations
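Explicit invalidation, as described above, can be sketched as follows. This is an illustration only, not Blink's code: the `Layer` class is hypothetical, and it assumes for simplicity that each layer keeps its invalid region as a single bounding rectangle, which is one way invalidation can end up less precise than DLBI.

```python
class Layer:
    """A layer that accumulates explicitly invalidated rectangles."""

    def __init__(self):
        self.invalid = None  # (x0, y0, x1, y1), or None when nothing is dirty

    def invalidate(self, rect):
        """Called during reflow and at other times when content changes."""
        if self.invalid is None:
            self.invalid = rect
        else:
            a, b = self.invalid, rect
            # Union into one bounding box: cheap, but it can over-invalidate.
            self.invalid = (min(a[0], b[0]), min(a[1], b[1]),
                            max(a[2], b[2]), max(a[3], b[3]))

    def take_invalid(self):
        """Called at paint time: return the area to repaint and reset it."""
        rect, self.invalid = self.invalid, None
        return rect


layer = Layer()
layer.invalidate((0, 0, 10, 10))
layer.invalidate((90, 90, 100, 100))
to_repaint = layer.take_invalid()
```

Two small changes in opposite corners cause the whole box between them to be repainted, but the per-paint bookkeeping stays very small.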
Where Blink is going: "Slimming Paint"
- At paint time, generate display list and send it to the compositor
- Actually a flat list with start/end markers for container items, in paint order
- Generic "ContentDisplayItem" contains an SkPicture with drawing commands
- Lists split by type with indices into the ContentDisplayItem list to track scope of effects
- Incremental list updates will be supported
- List includes hints from layout
- In compositor
- Group display items into layers
- Render
- Invalidation calculation; still crude compared to DLBI
- Since compositor makes layerization decisions, it handles inactive layers
- Optimizing SkPicture for small lists of drawing commands
- Each SkPicture has relatively high overhead (88/136 bytes)
- May end up not using SkPicture
- Pro: may have low overhead for small paints
- Incremental update of display lists is a struggle.
- You can skip processing frame subtrees that are stacking contexts and have no invalid frames
- You can skip processing of display items that don't intersect invalid frames
- Pro: more efficient layer trees
- Pro: CSS compliance
- Con: moving towards Gecko's current scheme in complexity
- Con: moves work from the layout thread to the compositor thread. This could be bad, especially in a multiprocess setup
http://dev.chromium.org/blink/slimming-paint
https://docs.google.com/document/d/1L6vb9JEPFoyt6eNjVla2AbzSUTGyQT93tQKgE3f1EMc/edit#
https://docs.google.com/presentation/d/1zpGlx75eTNILTGf3s_F6cQP03OGaN2-HACsZwEobMqY/view#slide=id.g40cfae859_045
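The flat list with start/end markers for container items can be pictured with a small sketch. The tuple encoding and the function name `effect_scopes` are assumptions made for illustration, not Blink's data structures. The sketch shows how a consumer such as the compositor could recover, for each content item, the container effects it sits inside by walking the list once with a stack.

```python
def effect_scopes(flat_list):
    """Return (content_name, enclosing_containers) pairs in paint order.

    flat_list holds (kind, name) tuples where kind is "begin", "end" or
    "content". Containers must be properly nested.
    """
    stack = []
    scopes = []
    for kind, name in flat_list:
        if kind == "begin":
            stack.append(name)
        elif kind == "end":
            if not stack or stack[-1] != name:
                raise ValueError(f"unbalanced end marker for {name!r}")
            stack.pop()
        elif kind == "content":
            scopes.append((name, tuple(stack)))
        else:
            raise ValueError(f"unknown item kind {kind!r}")
    if stack:
        raise ValueError(f"unclosed containers: {stack}")
    return scopes


display_list = [
    ("begin", "opacity"),
    ("content", "a"),
    ("begin", "clip"),
    ("content", "b"),
    ("end", "clip"),
    ("end", "opacity"),
    ("content", "c"),
]
scopes = effect_scopes(display_list)
```

Because the list stays flat and in paint order, grouping items into layers becomes a matter of choosing ranges of the list rather than walking a tree.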
Where Gecko should go (speculative proposal)
- Recap: our main problems are:
- Performance, especially for small incremental updates
- Complexity
- Invalidation of scrolled layers
- Identify container layers during frame construction and store this as frame state (like old Blink)
- Potential problem with merging display items for continuation frames of the same element
- Not too difficult: associate the state with the first-in-flow
- I don't think making container-layerization decisions in the compositor makes sense
- Track layer activity status in frame tree too
- Computing layer activity in the compositor makes more sense, but I still think it's OK not to do it there
- will-change anti-abuse measures require some work
- Need to eliminate BasicLayers component-alpha-avoidance flattening pass
- With this change, painting and remaining layerization problems can be handled separately per active container layer
- Set invalid bits on frames when repainting is needed (can be coarse)
- At paint time, on the main thread, identify active container layers whose contents need to be updated. For each one:
- Its child active container layers can be obtained from the frame tree, so we only care about descendant content without its own active container layer
- For each relevant animated geometry root, split its coordinate space into tiles!
- Create an invalid region
- For every invalid frame, the tiles of its animated geometry root that it intersects are marked for update
- The invalid region is the union of all invalid tiles across all animated geometry roots
- Walk the frame tree to build a display list for the region
- Only the container layer's frame subtree
- Skipping frames with their own container layers or that are entirely outside the dirty region (and will stay outside regardless of any async geometry changes!)
- Maintain invalidation state for each PaintedLayer x tile combination (PaintedTile)
- Walk the display list to assign each display item to a PaintedLayer
- And run DLBI at the same time
- But limit invalidation to the tiles marked for update
- Repaint the invalid area of each PaintedLayer
- Pro: Preserves flexible, optimized layer assignment and precise invalidation
- Pro: Layer assignment on the content thread, which seems simpler and more scalable
- Pro: Can be implemented by evolving current code incrementally
- Pro: More modular design than the current code
- Pro: Painting overhead proportional to the number of invalid tiles
- Con: compared to Slimming Paint, less parallelism (busier main thread)
- How much work to offload from the main thread is a key issue. Offloading frees up the main thread through parallelism but adds overhead (both in performance and in code).
- Currently I feel that keeping the compositor thread lean is very important, and a three-thread design would add even more overhead and complexity.
- I'm comfortable with carrying on doing DL and Layer building on the main thread.
- Maybe offload rasterization to another thread, but that's compatible with this.
- Needs a cool name
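The tile-marking step of the proposal can be sketched as follows. The tile size of 256, the half-open `(x0, y0, x1, y1)` rectangle convention and the function names are all assumptions made for illustration. Each invalid frame is given as a pair of its animated geometry root and its rectangle in that root's coordinate space.

```python
TILE_SIZE = 256  # hypothetical tile size


def tiles_for_rect(rect, tile=TILE_SIZE):
    """Return the set of (tx, ty) tile indices that a half-open rect intersects."""
    x0, y0, x1, y1 = rect
    if x1 <= x0 or y1 <= y0:
        return set()  # an empty rect touches no tiles
    # Floor division also gives correct tile indices for negative coordinates.
    return {
        (tx, ty)
        for tx in range(x0 // tile, (x1 - 1) // tile + 1)
        for ty in range(y0 // tile, (y1 - 1) // tile + 1)
    }


def mark_invalid_tiles(invalid_frames):
    """Map each animated geometry root to the tiles marked for update.

    The invalid region is then the union of these tiles across all roots.
    """
    marked = {}
    for agr, rect in invalid_frames:
        marked.setdefault(agr, set()).update(tiles_for_rect(rect))
    return marked


invalid_frames = [
    ("root", (10, 10, 20, 20)),
    ("scrollbox", (250, 0, 260, 10)),  # straddles a tile boundary
    ("root", (0, 300, 5, 305)),
]
marked = mark_invalid_tiles(invalid_frames)
tile_count = sum(len(tiles) for tiles in marked.values())
```

The number of marked tiles, not the size of the window or displayport, bounds how much display list building, DLBI and repainting is needed, which is the point of the "overhead proportional to the number of invalid tiles" claim.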