Necko/Cache/Plans
New Cache Plans
We have decided to rewrite our HTTP disk cache.
People
The design team will be responsible for coming up with a design for the new disk cache. The design should be thorough and well-documented. Once the design team is satisfied with an initial design document, the implementation team will start implementing.
Design team:
- Michal Novotny
- Taras Glek
- Steve Workman
- Honza Bambas
- Nick Hurley
- Brian Bondy
- Doug Turner
- Patrick McManus
Implementation team:
- Honza Bambas
- Michal Novotny
Primary Design Goals
This section documents issues that need to be addressed in the new cache's design.
- Version API for the cache so we can update easily.
- All APIs should be async. No main-thread locking or i/o at all.
- A crash or abnormal program termination should not invalidate the entire cache.
- Support gzip compression. Metadata should record whether a file is gzip'd, and the cache should be able to choose at runtime, on a per-file basis, whether to write compressed or uncompressed data. Files that arrive gzip'd from the network should be passed through.
- Make use of fallocate.
- Minimize API surface, especially for APIs exposed to JS/extensions. All exposed APIs should have a clear, safe use case.
- Consider eliminating memory cache.
- Competing ideas:
  - Temporal layout so that sub-resources are together.
  - Don't over-optimize on-disk storage, use one file per entry and let OS optimize.
- Layered design should include XPCOM API, C++ API exposed to Gecko, middle layer with general cache logic, and back-end allowing for alternative on-disk formats.
- Separate services for HTTP and offline cache? Find a way to make these use cases work well without over-complicating code.
- Browser should behave properly with disk cache entirely disabled.
- Allow for effectively racing cache against network, so as to not wait serially.
- Use this very same cache for more general meta-like data, e.g. caching hosts for DNS prewarms, appcache namespaces plus their other data and versioning, and any useful host-specific data we now gather in memory and throw away after restart (SPDY preference, TLS tolerance, successful pipelining test, etc.).
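Several of the goals above (per-file gzip metadata, easy updates, and surviving a crash without losing the whole cache) could be served by a small self-describing header on each entry. The following is only a sketch under that assumption, not a proposed design: `EntryHeader`, `ContentCoding`, `SerializeHeader`, `ParseHeader`, the magic value, and the 15-byte little-endian layout are all hypothetical names and choices that do not appear elsewhere in this plan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// All names below are hypothetical; this is not an existing Necko API.
constexpr uint32_t kEntryMagic = 0x4E434845u;  // arbitrary marker value
constexpr uint16_t kFormatVersion = 1;         // bumped when the layout changes
constexpr size_t kHeaderSize = 4 + 2 + 1 + 8;  // magic, version, coding, size

enum class ContentCoding : uint8_t { Identity = 0, Gzip = 1 };

struct EntryHeader {
  uint16_t version;
  ContentCoding coding;  // whether the stored body is gzip'd
  uint64_t dataSize;     // length of the stored body in bytes
};

// Append the low `width` bytes of `value` in little-endian order.
static void PutLE(std::vector<uint8_t>& out, uint64_t value, size_t width) {
  for (size_t i = 0; i < width; ++i) {
    out.push_back(static_cast<uint8_t>(value >> (8 * i)));
  }
}

// Read `width` little-endian bytes starting at `offset`.
static uint64_t GetLE(const std::vector<uint8_t>& in, size_t offset,
                      size_t width) {
  assert(offset + width <= in.size());
  uint64_t value = 0;
  for (size_t i = 0; i < width; ++i) {
    value |= static_cast<uint64_t>(in[offset + i]) << (8 * i);
  }
  return value;
}

std::vector<uint8_t> SerializeHeader(const EntryHeader& header) {
  std::vector<uint8_t> out;
  out.reserve(kHeaderSize);
  PutLE(out, kEntryMagic, 4);
  PutLE(out, header.version, 2);
  PutLE(out, static_cast<uint8_t>(header.coding), 1);
  PutLE(out, header.dataSize, 8);
  return out;
}

// Returns false when the bytes are truncated, carry the wrong magic or
// version, or name an unknown coding. A caller would then discard only
// this entry rather than the whole cache.
bool ParseHeader(const std::vector<uint8_t>& bytes, EntryHeader* out) {
  if (bytes.size() < kHeaderSize) return false;
  if (GetLE(bytes, 0, 4) != kEntryMagic) return false;
  const uint64_t version = GetLE(bytes, 4, 2);
  if (version != kFormatVersion) return false;
  const uint64_t coding = GetLE(bytes, 6, 1);
  if (coding > static_cast<uint64_t>(ContentCoding::Gzip)) return false;
  out->version = static_cast<uint16_t>(version);
  out->coding = static_cast<ContentCoding>(coding);
  out->dataSize = GetLE(bytes, 7, 8);
  return true;
}
```

Because validation is per entry, a header left half-written by a crash fails `ParseHeader` on its own and does not affect neighbouring entries. A real design would likely also need a checksum and the HTTP metadata itself, both of which this sketch omits.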
Success Metrics
This section documents the ways in which we'll determine whether or not the new cache design is a success.
- Should not be possible to trigger main-thread i/o.
- Create telemetry comparing with-cache and without-cache performance. The cache should be faster than no cache for both the top 50% and the low 50%.
API
This section documents the APIs for interacting with the new disk cache.
XPCOM APIs (exposed to JS)
C++ APIs (exposed to Necko)
Locking
This section describes how locking will work in the new disk cache. Ideally this should document every lock that will be necessary in the new cache.
On-Disk Layout
This section describes the on-disk layout of the disk cache. It may describe a default on-disk layout and any number of alternatives required for the first revision of the new disk cache.