Breakpad/Status Meetings/2015-10-14
Meeting Info
Breakpad status meetings occur on Wed at 11:00am Pacific Time.
Conference numbers:
Vidyo: Stability
650-903-0800 x92 conf 98200#
800-707-2533 (pin 369) conf 98200#
IRC backchannel: #breakpad
Mountain View: Dancing Baby (3rd floor)
Operations Updates
- stage has had some challenges and opportunities
- deploy failed earlier this week, but exited 0
- systemd was running stuff as the wrong user
- race condition in our infra that we hadn't hit in the prior six months
- crash mover happened to start sooner this time
- monitoring failure. only discovered because we were looking at a change on stage in detail and saw nothing was running
- alerts were firing, but dev team was not seeing them
- looking for a way to connect them to irc
- stage admin node had not been running crontabber, but it was all green
- divergence between our consul config and the code
- crontabber has in the past taken its config from code
- consul was overriding with a different set of jobs in config
- now have monitoring that periodically checks that crontabber is running
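The two crontabber issues above (a job list overridden by a second config source, and a scheduler that silently stopped running) can be sketched as two small checks. This is a minimal illustration only, not the monitoring that was actually deployed: the function names, the job names, and the one-hour threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def diff_job_sets(code_jobs, override_jobs):
    """Return the jobs that appear in only one of the two configurations."""
    code, override = set(code_jobs), set(override_jobs)
    return {
        "only_in_code": sorted(code - override),
        "only_in_override": sorted(override - code),
    }


def is_stale(last_success, now, max_age=timedelta(hours=1)):
    """True if no successful run has been recorded within max_age.

    A missing timestamp (None) counts as stale, so a scheduler that
    never ran is reported instead of looking green.
    """
    return last_success is None or now - last_success > max_age


if __name__ == "__main__":
    # Hypothetical job names, for illustration only.
    jobs_from_code = ["job-a", "job-b", "job-c"]
    jobs_from_override = ["job-a", "job-d"]
    print(diff_job_sets(jobs_from_code, jobs_from_override))

    # Hypothetical timestamps: last success three hours before "now".
    now = datetime(2015, 10, 14, 18, 0, tzinfo=timezone.utc)
    print(is_stale(now - timedelta(hours=3), now))
```

Treating a missing timestamp as stale is the relevant design choice here: a check that only compares timestamps would stay quiet on a node where the scheduler never started.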
Project Updates
- (peterbe) Wanna monitor QA saucelabs tests?
- do we want to tighten the loop?
- there's a long delay with the QA tests being used by monitoring
- currently, tests failing on stage don't block a prod release
- we had a bunch of flaky tests, so we removed the alerting in irc
- now we only see them as bugs filed by QA after the fact, easy to miss
- schedule something after this meeting
- (peterbe) Graphics update
- lars has a giant pull request that is blocked by an external dashboard run by the graphics team
- peter wrote an alternate way for them to get the data on-demand instead of through cron + warehousing
- PR is open for them, waiting on them to switch
- (adrian) shipped links from the postgres-powered reports to the es-powered reports
- email coming
- looking for feedback, will mail stability.
- (peterbe) featured versions maintenance: they need to be set on both prod and stage
- button on stage to match prod is up in a PR
- baby step towards full automation
- (lars) signature shortening change is coming
- at 5PM, if JP is available, we will cut over to the new abbreviated signature method
- if he is not, we will do it tomorrow instead
- a few minutes of instability, insignificant
Deployment Triage
PR Triage
Other Business
Travel, etc
- mbrandt - PTO 2015-10-15 - 2015-10-16