Sheriffing/TBPL/DeveloperDocs
About the service
Overview / basic idea
In one sentence, TBPL shows Firefox developers what is happening in their repositories.
Every checkin to a Firefox repository triggers several automated jobs for the new revision. For each push, the new source is compiled into a fresh binary on all supported platforms, and the resulting binaries are then exercised by a large number of automated unit and performance tests.
TBPL shows a repository's state by correlating pushes (checkins) with the results of these automated jobs. For each push, the corresponding changesets (code changes) are listed in the left half of the page, and the triggered jobs are listed in the right half of the page. Every job (also called "build" or "run") is represented by one colored letter. The color of each letter depends on the success status of the job.
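To make that concrete, the result-to-color mapping works roughly along these lines. The exact result names and colors are defined in the TBPL source; the values below are an approximation for illustration, not an authoritative list.

```python
# Approximate mapping from a job's result to the color of its letter on TBPL.
# The authoritative set of result states and colors lives in the TBPL source;
# this sketch is illustrative only.
RESULT_COLORS = {
    "success": "green",      # the job finished and every test passed
    "testfailed": "orange",  # the job ran, but one or more tests failed
    "busted": "red",         # the build itself broke (compile/packaging error)
    "exception": "purple",   # an infrastructure problem, not a code problem
    "retry": "blue",         # the job was restarted
    "running": "grey",       # still in progress, no result yet
}

def letter_color(result):
    """Return the display color for a job result, defaulting to grey."""
    return RESULT_COLORS.get(result, "grey")
```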
Developers need to make sure that their changes don't "break the tree", i.e. that they don't cause any tests to fail, so they'll usually monitor TBPL for a while after they've checked in.
Further functionality
On top of the basic functionality, TBPL has these features:
- Clicking on a job shows details of the job result in a pane at the bottom. Job details include:
  - Exact numbers of passed / failed / skipped tests, or performance numbers for performance tests
  - Links to the job's log (in abridged and full form)
  - For unsuccessful jobs, an excerpt of the log that only lists failures ("blue box", "summary")
  - Comments on the job ("yellow box", "stars")
- From the job details panel, the user can perform actions on the job:
  - Comment on the job ("star it")
  - Restart the job (for example to get more performance data, or to see if a test failure is intermittent)
  - Cancel the job, if it hasn't finished yet
- more
Development documentation
There is no design doc for development. Development happens in the repository at https://hg.mozilla.org/webtools/tbpl/ and in the Bugzilla component Webtools : Tinderboxpushlog.
Installation instructions are given in the README file.
TBPL doesn't have versions or milestones. Changes are usually implemented in response to a bug in Bugzilla (see the list of open bugs) and deployed as they come in. The motivation behind a specific piece of code can be figured out by looking at the file's hg annotate output and tracing the change back to the bug in which it landed.
Implementation overview
(I'm going to start a design doc right here, though it should probably be broken out to a different wiki page at some point.)
TBPL is mostly about useful presentation of data that's already available elsewhere. For example, the list of pushes to a repository is at hg.mozilla.org/repository/pushloghtml (e.g. mozilla-central pushlog). Another example is the self-serve UI that allows one to cancel and restart builds. So it's TBPL's job to reassemble that data in a way that's as accessible to developers as possible.
For some of that source data, TBPL is the only human-readable presentation. Run result data, for example, is only available as JSON (from Buildbot).
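The pushlog, for example, also has a machine-readable endpoint next to pushloghtml. Here is a minimal sketch of fetching recent pushes, assuming the json-pushes endpoint and its default JSON shape (push id mapped to changesets, date and user); check the pushlog documentation for the authoritative format.

```python
# Minimal sketch: fetch recent pushes for a repository from the hg pushlog.
# Assumes the json-pushes endpoint that sits next to pushloghtml and its
# default JSON shape (push id -> {"changesets": [...], "date": ..., "user": ...}).
import json
import urllib.request

PUSHLOG_URL = "https://hg.mozilla.org/mozilla-central/json-pushes"

def print_recent_pushes():
    with urllib.request.urlopen(PUSHLOG_URL) as response:
        pushes = json.load(response)
    # List pushes newest-first by push id, the way TBPL displays them.
    for push_id in sorted(pushes, key=int, reverse=True):
        push = pushes[push_id]
        print(push_id, push["user"], len(push["changesets"]), "changeset(s)")

if __name__ == "__main__":
    print_recent_pushes()
```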
There are two exceptions to the "data comes from outside" principle: job comments (also called "build stars") and the list of hidden builders will be stored on the TBPL server itself, in a MongoDB database. In Tinderbox mode, both of these were stored on Tinderbox.
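As a rough sketch of what that storage looks like, saving a job comment ("build star") in MongoDB might be done like this. The database, collection and field names are assumptions for illustration; the actual schema is defined by the TBPL server-side scripts (e.g. submitBuildStar.php).

```python
# Rough sketch of storing a job comment ("build star") in MongoDB.
# Database, collection and field names are assumptions for illustration;
# the real schema is defined by the TBPL server-side scripts.
import time
from pymongo import MongoClient

db = MongoClient("localhost", 27017)["tbpl"]  # database name is an assumption

def add_build_star(run_id, author, note):
    """Attach a comment to a job, identified by its Buildbot run id."""
    db.notes.insert_one({
        "run_id": run_id,               # which job the comment belongs to
        "who": author,                  # who starred the job
        "note": note,                   # free-form text, e.g. a bug reference
        "timestamp": int(time.time()),  # when the star was added
    })
```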
Architecture
TBPL started out as a pure client-side JavaScript webapp. All data used to be gathered and assembled in the user's browser. Over time, TBPL has grown a small collection of PHP scripts that make up its server-side component.
We want to do as much as possible on the client side because that's easier to develop and play with, especially for outsiders who haven't been involved in TBPL development before. If something is handled on the server side, that's for one of these reasons (see the sketch after this list):
- It acts as an interface to the database (e.g. getRevisionBuilds.php or submitBuildStar.php).
- It makes use of "secret" data (submitBugzillaComment.php which needs a bugzilla user password).
- It performs a very expensive operation whose results should be cached (all other PHP scripts).
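As a minimal sketch, talking to one of the database-interface scripts from a client could look like this, using getRevisionBuilds.php as the example. The base URL, the parameter names and the response shape are assumptions for illustration; the TBPL JavaScript is the authoritative caller.

```python
# Minimal sketch of calling one of the server-side interface scripts.
# The base URL, parameter names and response shape are assumptions for
# illustration; the TBPL JavaScript is the authoritative caller.
import json
import urllib.parse
import urllib.request

TBPL_BASE = "https://tbpl.mozilla.org/php"  # assumed location of the PHP scripts

def get_revision_builds(branch, revision):
    """Fetch the stored build results for one revision as JSON."""
    query = urllib.parse.urlencode({"branch": branch, "rev": revision})
    url = "%s/getRevisionBuilds.php?%s" % (TBPL_BASE, query)
    with urllib.request.urlopen(url) as response:
        return json.load(response)
```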
File descriptions
I'm building a table here; I'll import it into the Wiki when I'm finished.
(to be continued...)
Hardware
(what is going to be used here, physically?)
The hardware design will be worked on by IT once the design doc here is complete; see Bug 677346.
OS
(self explanatory, note any exceptions)
Any form of Unix on which Apache, PHP, Python and MongoDB run should work. IT will decide what will actually be used.
Network flows
(firewall needs?)
See Tinderboxpushlog/ArchitectureAndDependencies.
Load Balancing / Caching
Load balancing
(round robin? VIP? GLB?)
Undecided. It should be possible to have one shared MongoDB server and multiple servers serving the PHP / HTML / JS / CSS. All state is stored in MongoDB; local data on the PHP servers is only used for caching. (Cached files live in cache/ and in summaries/.)
Health checks
(how will the app be checked for validity from the lb?)
Since the web servers don't store state, they can't be invalid. (Does that make sense?)
Front end caching
(http caching)
No caching.
Back end caching
(memcache etc)
No idea.
Database
(what database server(s) - rw & ro, db name(s), db username, other requirements)
TBPL MySQL databases:
- Production lives on generic1.db.phx1.mozilla.com at the db tbpl_mozilla_org
- Dev and stage both live on dev1.db.phx1.mozilla.com:
  - Stage: tbpl_allizom_org
  - Dev: tbpl_dev_allizom_org
File storage
(internally or externally mounted filesystems.. where will static data for this service live?)
Results from the more expensive PHP scripts (getParsedLog.php, getTinderboxSummary.php, getLogExcerpt.php) are stored in directories called "cache" and "summaries" in the TBPL root directory. The "summaries" directory will go away once we get rid of Tinderbox mode.
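The caching pattern those scripts follow is essentially "compute once, write the result to a file keyed by the request, serve the file from then on". A minimal sketch of the idea (the real scripts are PHP, and the key scheme here is an assumption):

```python
# Minimal sketch of the file-based caching idea used by the expensive scripts:
# a result is computed once, written under cache/, and reused afterwards.
# The real implementation is PHP; the key scheme here is an assumption.
import hashlib
import os

CACHE_DIR = "cache"

def cached(key, compute):
    """Return compute()'s result, caching it on disk under a hash of key."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, hashlib.sha1(key.encode()).hexdigest())
    if os.path.exists(path):
        with open(path) as f:
            return f.read()
    result = compute()
    with open(path, "w") as f:
        f.write(result)
    return result
```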
Automation
Cron jobs
(if the cron jobs run from an admin machine, please specify where they will run)
(what's an admin machine?)
There needs to be a cron job that periodically runs tbpl/dataimport/import-buildbot-data.py in order to import Buildbot data into the MongoDB. The import frequency hasn't been fixed yet, but it will probably be between one and five minutes. (The Buildbot source data is regenerated every minute.) The importer is idempotent; it never destroys data and it doesn't insert duplicates.
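Idempotency of that kind is usually achieved by keying every record on a stable identifier and upserting instead of blindly inserting. A rough sketch of the idea (the database and field names are assumptions, not the actual importer's schema):

```python
# Rough sketch of an idempotent import step: each Buildbot run is keyed on a
# stable id and upserted, so re-running the importer neither destroys data
# nor creates duplicates. Names are assumptions; the real logic is in
# tbpl/dataimport/import-buildbot-data.py.
from pymongo import MongoClient

db = MongoClient("localhost", 27017)["tbpl"]  # database name is an assumption

def import_runs(runs):
    """Insert or update one document per Buildbot run."""
    for run in runs:
        db.runs.update_one(
            {"_id": run["id"]},   # stable key: the Buildbot run id
            {"$set": run},        # overwrite with the latest data
            upsert=True,          # insert if the run hasn't been seen before
        )
```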
Other links
See the main TBPL page: Sheriffing/TBPL