Services/Sync/WEP/111
Hibernation
Summary
If a user has not saved anything to their Weave account in a certain period (a month?), they should be considered hibernating, and their sync data should be moved to a lower-speed, larger-capacity disk to leave more space for the active users. If they return, their data is immediately restored to the active database. The process should be transparent to the user, aside from a longer-than-expected first request.
Storage
Each shard has a large-capacity, slow disk that the frontends have access to. When the daily monitoring script finds an account that has had no activity for the full hibernation period, it does the following:
- Mysqldumps the user's sync data into a file on the slow disk
- Appends a mysqldump of the category data onto that file
- Deletes the user's sync and category data from mysql
- Sets the user's last_active column in usersummary to 0 to indicate > 1 month of inactivity (so that these users can be ignored on further runs of the monitoring process)
- Expires their memcache entries (not strictly necessary, provided there's a memcache expiration shorter than the period being used to detect hibernation)
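The selection rule and the dump step above can be sketched as follows. This is only an illustration under stated assumptions: the 30-day period, the database name `weave`, the table names `sync_data` and `category_data`, and the `userid` column are all hypothetical, and the helper only builds the mysqldump command lines rather than running them.

```python
INACTIVE_PERIOD = 30 * 24 * 3600  # assumption: "a month", in seconds


def is_hibernation_candidate(last_active, now, period=INACTIVE_PERIOD):
    """Return True if the account should be moved to cold storage.

    last_active is a Unix timestamp. A value of 0 marks an account that
    is already hibernated, so the monitoring script skips it.
    """
    return last_active != 0 and now - last_active >= period


def dump_commands(user_id, database="weave",
                  tables=("sync_data", "category_data")):
    """Build one mysqldump argv per table, restricted to a single user.

    Database, table, and column names are hypothetical. The caller is
    expected to run the commands in order and append each one's output
    to the user's file on the slow disk.
    """
    # int() keeps anything but a numeric id out of the WHERE clause.
    where = "--where=userid=%d" % int(user_id)
    # --no-create-info dumps rows only, without CREATE TABLE statements.
    return [["mysqldump", "--no-create-info", where, database, table]
            for table in tables]
```

Dumping rows only means the undump appends to the live tables instead of recreating them, which is what a per-user restore needs.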
Detection
Any user who fires up their Weave client again will first issue a category timestamp request. A category request will begin with the following:
- Check memcache for the presence of category timestamps. If they are there, return them; the user is not hibernating
- Issue a category timestamp query.
- If no data comes back from the category timestamp query, check for the existence of the mysqldump file for that user. If it exists, follow the Retrieval process below, then issue the category timestamp query again.
- Save the category timestamps to memcache
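A minimal sketch of the detection steps above, with every external dependency passed in so the flow can be read on its own. The function name, the cache key format, and the four parameters are assumptions for illustration: `cache` is a dict-like stand-in for memcache, `query` runs the category timestamp query, `dump_exists` checks for the user's mysqldump file, and `restore` performs the Retrieval process.

```python
def get_category_timestamps(user_id, cache, query, dump_exists, restore):
    """Return the user's category timestamps, restoring them if needed."""
    key = "timestamps:%s" % user_id  # hypothetical cache key format

    # Step 1: a cache hit means the user is active, not hibernating.
    cached = cache.get(key)
    if cached is not None:
        return cached

    # Step 2: ask the active database.
    timestamps = query(user_id)

    # Step 3: no rows plus a dump file means the user is hibernating.
    if not timestamps and dump_exists(user_id):
        restore(user_id)
        timestamps = query(user_id)

    # Step 4: remember the result so later requests skip the checks.
    cache[key] = timestamps
    return timestamps
```

The file-existence check only runs when the database returns nothing, so active users pay no extra cost on this path.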
Retrieval
Upon a request from a user determined to be in cold storage, the code should do the following:
- Make sure no lock exists on the mysqldump file. If it does, issue a backoff.
- Lock the mysqldump file. This could be as simple as touching a file in /tmp/userid
- Undump the mysqldump file.
- Update some known element with the new timestamp (items in the clients collection?). This prevents the user from being hibernated again if they don't do a write.
- Remove the lock
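The locking steps above could be sketched with a lock file, as suggested. Note one deliberate substitution: instead of checking for the lock and then touching the file as two separate steps, this sketch creates the file with `O_CREAT | O_EXCL`, which fails if the file already exists and so performs the check and the lock in a single step. The names `restore_lock` and `RestoreInProgress` and the `lock_dir` parameter are hypothetical.

```python
import os
from contextlib import contextmanager


class RestoreInProgress(Exception):
    """Another request holds the lock; the caller should issue a backoff."""


@contextmanager
def restore_lock(user_id, lock_dir="/tmp"):
    """Hold a per-user lock file for the duration of the undump."""
    path = os.path.join(lock_dir, str(user_id))
    try:
        # O_EXCL makes creation fail if the lock file is already there.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RestoreInProgress(path)
    os.close(fd)
    try:
        yield path  # undump and timestamp update happen here
    finally:
        os.remove(path)  # always release, even if the undump fails
```

A caller would wrap the undump and the timestamp update in `with restore_lock(user_id):` and translate `RestoreInProgress` into a backoff response. A crashed process would leave a stale lock file behind, which this sketch does not handle.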
Issues
The process will produce unusual data if a hibernated user doesn't first issue a category timestamp query. Issuing that query first is expected to be the default client behavior, but a client that does not will, for all intents and purposes, be operating on an empty account. It is possible that a later undump would then introduce some duplicate data, possibly including keys. The only way to avoid this is to check cold storage before every operation. While the check itself would be cheap (existence of a file), we'd be doing it an awful lot.
There are some small race conditions. They are highly unlikely, and the steps above are designed to minimize the likelihood of data disruption.
We may also see issues if we change the mysql schema. That is the limitation of dump/undump that we trade off for some good speed improvements. In theory, we'll be able to rewrite the files before undumping, but that will introduce some overhead at that point. This is acceptable, given how infrequently we expect to undump a user. If it proves to be unworkable, then we can serialize the data as WBOs and store them that way.
We might also be sneaky and prime the system by having the 'forgot password' mechanism quietly issue a category count in the background.