Accessibility/CacheTheWorld
What?
Firefox's current architecture for multi-process accessibility suffers from severe performance issues and is costly and difficult to maintain due to the massively different and specialised approaches necessary on different operating systems. In addition, it is currently impossible to support built-in Windows accessibility tools such as Narrator and Windows Speech Recognition. This project aims to re-architect our multi-process accessibility support to cache the entire accessibility trees for all content processes within the parent process.
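As a rough illustration of the idea only (this is not Gecko's actual implementation), the sketch below models a parent process that keeps a cached copy of each content process's accessibility tree. Content processes push updates and never wait for a reply; queries from assistive technology are answered from the cache with no round trip to content. All names here (`ParentCache`, `CachedNode`, `push_update`, `child_names`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CachedNode:
    """Parent-side copy of one accessibility node (hypothetical shape)."""
    node_id: int
    role: str
    name: str = ""
    children: list[int] = field(default_factory=list)


class ParentCache:
    """Toy model of a parent-process cache keyed by (process id, node id)."""

    def __init__(self) -> None:
        self._nodes: dict[tuple[int, int], CachedNode] = {}

    def push_update(self, process_id: int, node: CachedNode) -> None:
        # Stands in for an asynchronous message from a content process.
        # The sender does not block waiting for a response.
        self._nodes[(process_id, node.node_id)] = node

    def get_name(self, process_id: int, node_id: int) -> str:
        # Answered entirely from the cache: no synchronous call to content.
        return self._nodes[(process_id, node_id)].name

    def child_names(self, process_id: int, node_id: int) -> list[str]:
        # Walking the tree also stays inside the parent process.
        parent = self._nodes[(process_id, node_id)]
        return [self._nodes[(process_id, cid)].name for cid in parent.children]


cache = ParentCache()
cache.push_update(1, CachedNode(1, "document", "Example", [2]))
cache.push_update(1, CachedNode(2, "heading", "Welcome"))
print(cache.child_names(1, 1))
```

The trade-off in a design like this is that the parent holds more memory and must be kept up to date, in exchange for queries that never block on another process.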
Why?
This will allow us to address several problems that are difficult or impossible to fix with the current architecture:
- Performance, especially on Windows. While performance is acceptable for daily usage in most cases, there are many use cases which are far from delightful and some which are still completely unusable. Performance with the JAWS screen reader, which is used heavily in enterprise, is sluggish at best. A great deal of work has been done over the past few years to improve this, but we're approaching a point where we will not be able to improve this any further with the current architecture. Because software other than assistive technology uses accessibility APIs (e.g. Windows touch, East Asian input methods, enterprise SSO tools), this can impact even users without disabilities. The proposed new architecture will allow us to significantly improve performance across all operating systems. See bug 1737192 for a list of performance bugs which we expect will be fixed (or at least improved) by Cache the World.
- Stability. Because the current architecture makes heavy use of synchronous IPC, there is a high risk of deadlocks between accessibility and other Firefox components. While all known cases have been addressed as they have been discovered, the underlying cause remains and future instances of this problem are very likely. In addition, our use of obscure COM features on Windows has resulted in stability problems which are extremely difficult to diagnose and fix. One of these remains a problem today, despite months of investigation, and forces some NVDA screen reader users to kill Firefox (or worse, power off their computers) every few hours. The proposed new architecture will not suffer from these inherent stability risks and known problems. See bug 1737193 for a list of stability/reliability bugs which we expect will be fixed (or at least improved) by Cache the World.
- Complexity and cost. Our existing architecture is necessarily very different on each operating system, making it extremely complex and difficult to maintain. For example, the IPC layer on Windows (~8000 lines of code) has an entirely different architecture to other platforms. This also means that maintenance is very costly, especially when implementing support for new operating systems (e.g. Android, Mac) or major Gecko architectural initiatives (e.g. Fission). Furthermore, our use of esoteric operating system specific features, especially on Windows (where we depend on a whole separate ~11000 line module), makes it very difficult for this work to be distributed across the team because of the highly specific expertise required. In the proposed new architecture, most of the "heavy lifting" will be done in cross-platform code, decreasing complexity and maintenance cost.
- Support for other built-in accessibility tools. It is impossible to support Windows Narrator and Windows Speech Recognition with our current architecture. The proposed new architecture will allow for this. As these built-in tools rise in popularity, we do not want Firefox to be left behind.
Meeting Notes
Jamie, Morgan, and Eitan meet weekly to discuss this project. You can find the meeting notes in this Google Doc.
Roadmap
Given the large scope of the project, we are breaking the project down into quarterly milestones. Each milestone will aim to support a set of user scenarios. This roadmap is in the early stages and subject to significant change. It will be updated as milestones become clearer, with future milestones being less well defined than earlier ones.
Milestone 0: December 2021
Initial proof of concept.
All testing will be performed with the NVDA screen reader, for two reasons:
- Cache the World is all or nothing on Windows. That makes it easy to determine where we're at regarding real world usage.
- The performance benefits are most necessary and noticeable on Windows.
In milestone 0, the following capabilities will be provided:
- Reading and navigating a page with text, links and headings.
- Plain text editing: reading the line of text when focused; backspacing; moving the caret by line, word and character.
- Access to formatting information: font, bold/italic, etc.
- Access to screen coordinates on simple pages.
- Loading very large pages will be at least 10x faster with the cache than without.
Test Scenarios
- Do a Google search, navigate the results using heading navigation and follow a result link.
- Fill out the form for a Google Advanced Search.
- Go to https://www.reaper.fm. Check the formatting of the “This is REAPER.” text (which isn’t a heading even though it should be) and confirm that the font size is reported as bigger than the paragraph of text below it.
- Build Gecko in the background. Do a Google search. Verify that the browser does not become unresponsive.
- Load https://searchfox.org/mozilla-central/source/layout/base/nsCSSFrameConstructor.cpp. Page should take < 10 sec to be usable.
- Open Gmail, find a message in the inbox, open it, read it, return to the inbox.
- Compose a message in Gmail containing text, a link, a bulleted list and a block quote. Read back through the message.
- Open Slack, use the quick switcher to switch to a channel, read some messages, write a message.
Bugzilla
Note that the roadmap wasn't created until late in milestone 0, so many bugs are missing below.
ID | Summary | Assigned to | Status |
---|---|---|---|
1735955 | Cached bounds all 0s for many (most?) elements on Google search | James Teh [:Jamie] | RESOLVED |
1739050 | If the focused Accessible is moved, the RemoteAccessible is recreated but focus isn't fired on it (AKA broken Google Search box on Windows + CTW) | James Teh [:Jamie] | RESOLVED |
1741792 | Cache the caret | James Teh [:Jamie] | RESOLVED |
1742902 | Fix window emulation when the cache is enabled | James Teh [:Jamie] | RESOLVED |
1742915 | Cache tag object attribute | James Teh [:Jamie] | RESOLVED |
1742917 | Implement StartOffset for RemoteAccessible and LinkIndexAtOffset for HyperTextAccessibleBase | James Teh [:Jamie] | RESOLVED |
1746827 | Crash in [@ PLDHashTable::Search \| mozilla::a11y::RemoteAccessibleBase<T>::MinValue] | James Teh [:Jamie] | RESOLVED |
7 Total; 0 Open (0%); 7 Resolved (100%); 0 Verified (0%);
Milestone 1: March 2022
Android.
The primary focus of this milestone is getting the cache working for Android. Mozilla aims to implement Fission for Android in the first half of 2022. Modifying the existing multi-process architecture to support Fission on Android would require significant engineering effort. Rather than investing in a solution which we would throw away once the cache is implemented, we will instead switch Android to use the cache and extend the cache to include functionality required by Android.
In milestone 1, the following capabilities will be provided:
- The cache will support GroupPosition.
- TextLeafRange will support word end and line end boundaries, which are needed for Android text navigation.
- Pivot will support navigating text using TextLeafRange, which is needed for Android text navigation.
- Cached screen bounds will be updated appropriately when scrolling.
- The cache will support tables. Only a very small subset of table functionality is needed for Android, but full table functionality will be implemented regardless.
- Android will use the cache for all functionality except hit testing.
- As an interim solution, Android will use the existing async IPDL mechanism for hit testing, updated to target the call at the correct document to handle OOP iframes. (Synchronous hit testing in the core cache will take longer to implement and will be done in a future milestone.)
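The roadmap does not say how cached screen bounds are kept correct while scrolling. One possible approach, sketched below purely as an assumption, is to cache each node's bounds relative to its document and cache the document's scroll offset separately, so that a scroll only changes one cached value rather than the bounds of every node. The names `Rect`, `CachedDocument` and `screen_bounds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


@dataclass
class CachedDocument:
    """Hypothetical cached document: screen origin plus scroll offset."""
    screen_x: int
    screen_y: int
    scroll_x: int = 0
    scroll_y: int = 0


def screen_bounds(doc: CachedDocument, doc_relative: Rect) -> Rect:
    """Convert document-relative cached bounds to screen coordinates."""
    return Rect(
        doc.screen_x + doc_relative.x - doc.scroll_x,
        doc.screen_y + doc_relative.y - doc.scroll_y,
        doc_relative.width,
        doc_relative.height,
    )


doc = CachedDocument(screen_x=100, screen_y=50)
link = Rect(10, 500, 80, 20)  # cached once, relative to the document
print(screen_bounds(doc, link))

doc.scroll_y = 400  # a scroll updates only the document's cached offset
print(screen_bounds(doc, link))
```

This sketch ignores nested scrollable containers, transforms and zoom, all of which a real implementation would need to account for.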
Test Scenarios
- Do a Google search, navigate the results using heading navigation and follow a result link.
- Sign up for Facebook, enter information including alternative gender. Don’t submit password (that is when it gets real).
- Go to en.wikipedia.org and navigate by char/word/line.
- Navigate en.wikipedia.org with explore by touch.
- With Talkback, load https://www.nvaccess.org/. Navigate to the embedded video in four different ways: item navigation, explore by touch, controls navigation and Talkback search. Activate the Play button to play the video.
- On Windows with NVDA, load https://developer.mozilla.org/en-US/docs/Web/API/Window/getSelection.
  - Press control+home to ensure you're at the top of the document, then t twice to move to the Browser Compatibility table.
  - Press control+alt+rightArrow twice to move to the Mobile column. Ensure NVDA says "column 8 through 13".
  - Press control+alt+downArrow once. Ensure NVDA says "WebView Android".
  - Press control+alt+downArrow again. Ensure NVDA says "getSelection" and "Full support".
  - Press control+alt+leftArrow. Ensure NVDA says "Safari", "Desktop" and "Full support".
Bugzilla
40 Total; 0 Open (0%); 40 Resolved (100%); 0 Verified (0%);
Milestone 2: August 2022
Windows opt-in user preview.
This milestone will focus on the work required to get most daily usage working well on Windows. Before the end of this milestone, we will invite interested users to enable the cache in Firefox Nightly, test it and provide any feedback.
In milestone 2, the following capabilities will be provided:
- Caching of additional properties, including document URL, access key/keyboard shortcut and ARIA attributes such as aria-current.
- Cache updates that are currently missing; e.g. coalesced selection events, name change events for selection changes in Gmail message lists.
- Caching of spelling errors.
- Caching of relations (labelledBy/labelFor, etc.).
- Support for live regions.
- Caching of text screen coordinates.
- Support for interaction triggered by clients, including scrolling.
- Hit testing.
- Any fixes required to get JAWS working correctly.
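To illustrate what hit testing against a cache might involve, here is a minimal sketch that finds the deepest cached node whose screen bounds contain a point. It is a simplification under stated assumptions, not the approach used in Gecko: it assumes children lie within their parent's bounds, treats later children as being on top, and ignores clipping and scrolling. The names `Node` and `hit_test` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical cached node with screen bounds and child nodes."""
    name: str
    x: int
    y: int
    width: int
    height: int
    children: list["Node"] = field(default_factory=list)

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def hit_test(node: Node, px: int, py: int) -> Optional[Node]:
    """Return the deepest node containing (px, py), or None if outside."""
    if not node.contains(px, py):
        return None
    # Assumption: later siblings are painted on top, so check them first.
    for child in reversed(node.children):
        hit = hit_test(child, px, py)
        if hit is not None:
            return hit
    return node


link = Node("link", 10, 10, 50, 20)
para = Node("paragraph", 0, 0, 200, 100, [link])
doc = Node("document", 0, 0, 800, 600, [para])
print(hit_test(doc, 20, 15).name)
```

Because every rectangle is already in the parent process, a lookup like this needs no call into a content process.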
Test Scenarios
- Relevant tests from milestone 0 (which was NVDA only), but with JAWS:
  - Do a Google search, navigate the results using heading navigation and follow a result link.
  - Fill out the form for a Google Advanced Search.
  - Go to https://www.reaper.fm. Check the formatting of the “This is REAPER.” text (which isn’t a heading even though it should be) and confirm that the font size is reported as bigger than the paragraph of text below it.
  - Open Gmail, find a message in the inbox, open it, read it, return to the inbox.
  - Compose a message in Gmail containing text, a link, a bulleted list and a block quote. Read back through the message.
  - Open Slack, use the quick switcher to switch to a channel, read some messages, write a message.
- NVDA: Open https://www.mozilla.org/en-US/about/manifesto/. Press h several times to move to the "Principle 2" heading. In the same tab, open https://google.com/. Press alt+leftArrow to go back. Ensure the screen reader cursor is still near "Principle 2". (Tests document URL caching.) (This currently doesn't work with JAWS in any browser.)
- NVDA, JAWS: Open https://wiki.mozilla.org/. Move to the navigation landmark. Ensure that Edit and View history report access keys (on Windows, alt+shift+e and alt+shift+h, respectively).
- NVDA, JAWS: Open https://www.nvaccess.org/. Confirm that the Home link is reported as the "current page". Press the Download link. Once it loads, confirm that the Download link is now reported as the "current page".
- NVDA, JAWS: Open the Gmail inbox. Focus a message in the list. Press x to select it. Press down arrow to move away from that message, then up arrow to move back to it. Ensure the screen reader reports that the message is selected.
- NVDA, JAWS: Open Slack. Focus the message box. Type "tset" and press space. Press left arrow to move back to the word you just typed. Ensure the screen reader reports a spelling error.
- NVDA: Open Google Advanced Search. Confirm that "all these words:" only appears once in NVDA browse mode. (This is broken with JAWS in Firefox regardless of the cache.)
- NVDA, JAWS: Open a document in Google Docs. Press control+alt+a, control+alt+c. Ensure the screen reader says "Not on a comment".
- NVDA, JAWS: Open https://google.com/. In the search box, type "asdf". Move the screen reader's review cursor to "s", route the mouse and simulate a left click. Verify that the caret is now on "s".
- NVDA, JAWS: Open https://www.nvaccess.org/. Move the screen reader's cursor to the bottom of the page. Verify that the bottom of the page is scrolled into view and that the top of the page is scrolled out of view.
- NVDA: Open https://google.com/. Move the mouse to Gmail, Images, Privacy, Terms and Settings. Ensure that the screen reader reports these as you move the mouse to them.
- NVDA, JAWS: Open a blank document in Google Docs. Add a heading, some text wrapping across two lines, a bulleted list and a link. Read back through the text and verify that it is read correctly. Go to the start of the first line of wrapped text. Type two additional words so that the line wrapping changes. Read the first line and verify that it is reported correctly, accounting for the changed wrapping.
Bugzilla
45 Total; 0 Open (0%); 43 Resolved (95.56%); 2 Verified (4.44%);
Milestone 3: November 2022
Enable by default on Nightly 109.
Bugzilla
56 Total; 0 Open (0%); 54 Resolved (96.43%); 2 Verified (3.57%);
Milestone 4: January 2023
Enable by default for Windows and Linux users in beta 110.
Bugzilla
26 Total; 0 Open (0%); 15 Resolved (57.69%); 11 Verified (42.31%);
Milestone 5: March 2023
50% experiment for Windows and Linux users in release 111.
Bugzilla
19 Total; 0 Open (0%); 18 Resolved (94.74%); 1 Verified (5.26%);
Milestone 6: April 2023
Enable for all Windows users on release.
Bugzilla
ID | Summary | Assigned to | Status |
---|---|---|---|
1818726 | Content of a table missing in NVDA's virtual buffer with CTW enabled | James Teh [:Jamie] | RESOLVED |
1819741 | [CTW] Hit testing reports Accessibles hidden by another element with overflow: auto/scroll | Nathan LaPré | VERIFIED |
1819802 | [CTW] HyperTextAccessibleBase::OffsetAtPoint is very slow with large text containers | James Teh [:Jamie] | VERIFIED |
1821223 | [CTW] Poor performance querying paragraph boundaries in large tables | James Teh [:Jamie] | RESOLVED |
1822340 | Using Firefox Nightly with GitHub Blog I can no longer read the links on the page with NVDA using the mouse | James Teh [:Jamie] | VERIFIED |
1823294 | [CTW] Name not updated when text of text leaf changes | James Teh [:Jamie] | VERIFIED |
1824293 | Enable CtW on release for Windows and Linux | James Teh [:Jamie] | RESOLVED |
7 Total; 0 Open (0%); 3 Resolved (42.86%); 4 Verified (57.14%);
Milestone 7
Enable for all Mac and Linux users on beta.
Bugzilla
18 Total; 0 Open (0%); 18 Resolved (100%); 0 Verified (0%);
Milestone 8
Enable for all Mac and Linux users on release.
Bugzilla
ID | Summary | Assigned to | Status |
---|---|---|---|
1827557 | [CTW][Mac][New Text Impl] Regressions in text navigation announcements | James Teh [:Jamie] | VERIFIED |
1829167 | Crash in [@ mozilla::a11y::Accessible::IsOuterDoc] | James Teh [:Jamie] | VERIFIED |
1830208 | Disable CtW on Mac for 113 release | James Teh [:Jamie] | RESOLVED |
3 Total; 0 Open (0%); 1 Resolved (33.33%); 2 Verified (66.67%);
Backlog
22 Total; 22 Open (100%); 0 Resolved (0%); 0 Verified (0%);