EngineeringProductivity/Projects/Stockwell/Meetings/2016-10-25
Previous Action Items
Status
- [gbrown] - doing triage with orange factor data, determining what additional data makes triage and bug fixing easier
- [jmaher] - fixing a few infra intermittent bugs
- [jmaher] - starting to experiment with [quarantine] (thanks to [:malayaleecoder])
- [ekyle] - Neglected Oranges (new) https://people.mozilla.org/~klahnakoski/NeglectedOranges/index.html
- [ekyle] - OFv2 (not verified, on hold) https://people.mozilla.org/~klahnakoski/testfailures/
- [jgraham] - Added upstream pre-commit stability testing in web-platform-tests
Discussion Topics
- [etherpad]
- Classification Tasks (aka Quarantine) ideas:
- Purgatory (frequent intermittent tests) - What is the point of running these if the job will always be orange? Could we still detect a newly introduced regression? Would we disable tests that stay here too long (say 30 days)?
- example [try push]
- Quarantine (new/edited tests) - All tests which are edited/added would run here for 2 weeks to ensure they aren't intermittent. If a test is intermittent, would we move it to Purgatory or back out the original patch?
- Sanity (quick, very safe - 100% green) vs longer, more complex/slightly intermittent. How would we define these safe tests? Would we back out if any test here was intermittent? What are the criteria for moving to main test suites (<10 intermittents total)?
- Making these changes will help keep our test jobs very green, but without a clear purpose and workflow they will add more confusion.
- Are there concerns if something meets the criteria on one platform, but not another?
- Would our criteria change if a test was disabled on one or more buildtypes/configs?
- Communication Format proposals:
- weekly communication to dev.platform, like Memshrink, tracking orange factor, new bugs, bugs fixed, major and upcoming milestones, and tools
- monthly open format meetings for 2017 to discuss data from stockwell weekly updates, ideas, concerns
- should we announce top intermittent bug fixers?
- should we track by module (webrtc, dom, layout, infra, etc.)? announce big module changes in intermittents?
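To make the Classification Tasks ideas above easier to argue about, here is a minimal sketch of how the buckets could be assigned. Only the numbers (2 weeks, 30 days, <10 intermittents total) come from the notes; the `FailureHistory` record, the `classify` function, the order of the checks, and the "disable" outcome are assumptions for illustration, since the notes leave those questions open.

```python
from dataclasses import dataclass

# Numbers taken from the discussion; how they combine is an assumption.
QUARANTINE_DAYS = 14               # new/edited tests run separately for 2 weeks
PURGATORY_MAX_DAYS = 30            # "say 30 days" before disabling is considered
MAIN_SUITE_MAX_INTERMITTENTS = 10  # "<10 intermittents total" to stay in main suites


@dataclass
class FailureHistory:
    """Hypothetical per-test record; not an existing OrangeFactor structure."""
    name: str
    days_since_edit: int          # days since the test was added or last edited
    intermittent_failures: int    # intermittent failures seen in total
    days_in_purgatory: int = 0    # how long the test has already sat in Purgatory


def classify(history: FailureHistory) -> str:
    """Return the bucket a test would land in under the assumed rules."""
    # Frequent intermittents go to Purgatory, or get disabled if stuck there.
    if history.intermittent_failures >= MAIN_SUITE_MAX_INTERMITTENTS:
        if history.days_in_purgatory > PURGATORY_MAX_DAYS:
            return "disable"
        return "purgatory"
    # Recently added or edited tests stay in Quarantine for the full window.
    if history.days_since_edit < QUARANTINE_DAYS:
        return "quarantine"
    # Tests with no intermittent failures at all qualify as Sanity tests.
    if history.intermittent_failures == 0:
        return "sanity"
    return "main"


if __name__ == "__main__":
    for record in (
        FailureHistory("new_test", days_since_edit=3, intermittent_failures=0),
        FailureHistory("flaky_test", days_since_edit=40, intermittent_failures=25),
    ):
        print(record.name, classify(record))
```

The per-platform and per-buildtype questions raised above are not modelled here; one option would be to keep one such record per test and platform pair.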
New Ideas to investigate
- [jmaher] running each test in a fresh process/profile (measure runtime, failures)
- [jmaher] test case review: which criteria are useful for determining a potential intermittent?
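As a rough illustration of the first idea above (a fresh process/profile per test, measuring runtime and failures), the sketch below runs each test in a new interpreter with its own temporary directory. It is a stand-in only: it does not use any Mozilla test harness, the `run_isolated` helper is hypothetical, and the temporary directory merely plays the role of a fresh profile.

```python
import subprocess
import sys
import tempfile
import time


def run_isolated(test_code: str, timeout: float = 60.0) -> dict:
    """Run one test snippet in a new process with its own scratch directory.

    Returns whether the process exited cleanly and how long the run took.
    """
    # The temporary directory stands in for a fresh browser profile (assumption).
    with tempfile.TemporaryDirectory(prefix="profile-") as profile_dir:
        start = time.monotonic()
        try:
            proc = subprocess.run(
                [sys.executable, "-c", test_code],
                cwd=profile_dir,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            passed = proc.returncode == 0
        except subprocess.TimeoutExpired:
            # A hang counts as a failure for this measurement.
            passed = False
        runtime = time.monotonic() - start
    return {"passed": passed, "runtime": runtime}


if __name__ == "__main__":
    tests = {
        "passes": "assert 1 + 1 == 2",
        "fails": "assert 1 + 1 == 3",
    }
    for name, code in tests.items():
        result = run_isolated(code)
        print(f"{name}: passed={result['passed']} runtime={result['runtime']:.2f}s")
```

Comparing these per-test runtimes and failure counts against a shared-process run would show what the isolation costs and whether it removes any intermittents.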
Action Items
- Discussion topics for next time
- [wlach] reproduce top oranges with 'rr' (possibly 2 weeks out)
- [gbrown] summarize findings of triaging and propose changes to tools