Breakpad/Status Meetings/2017-02-15
Meeting Info
Breakpad status meetings occur on Wed at 10:30am Pacific Time.
NOTE: Meeting will start 30 minutes later than normal due to MoCo meeting.
Conference numbers:
Vidyo: Stability
650-903-0800 x92 conf 98200#
800-707-2533 (pin 369) conf 98200#
IRC backchannel: #breakpad
Mountain View: Dancing Baby (3rd floor)
Operations Updates
- antenna week
- New Relic working on stage
- same load balancer, one node with New Relic and one node without (for profiling)
- antenna should do intelligent deploys shortly
- scale down old stack after new stack is up and healthy
- socorro alerts!!!!
- pre-warning alerts went off before things broke for users
- 2 GB per host memory jump in all ES nodes (nearly all of it is fielddata)
- still not sure what the cause was
- dropped most of the small indices, down to 24 weeks of retention
- developed a reliable, repeatable process for adding data nodes on stage
- unless otherwise directed, going ahead on prod today
- secreeeet ansible playbooks
- miles has been keeping ansible files around
- mostly for dealing with Elasticsearch
- wrote scripts instead of doing anything manually
- been kept privately until now
- but now we know
- what dark magicks are contained within?
Project Updates
Deployment Triage
PR Triage
Major Projects
Splitting out collector (Antenna)
- (willkg) set up a "to load test" GitHub project board to keep track of current status: https://github.com/mozilla/antenna/projects/2
- why? because we've got some things in bugs and a lot of things not in bugs and it was hard to see where things were at.
- how does it get updated? you update it! and Will will periodically ask people where things are at and update it.
- (willkg) fixed a bug in Antenna's healthcheck endpoint: if the IAM credentials weren't set up right, the error-handling path itself raised an error
- (willkg) updated antenna-loadtests to use molotov master tip
- (miles) set up New Relic on a single -stage node so we can trace execution
- (miles) started setting up a -prod environment
- (mbrandt) we have ailoads (the old way) and Molotov (the new way). Molotov is ready, waiting on a final r? from Tarek.
- (mbrandt) there are some open questions about Molotov features: its reporting doesn't provide an easy way to line up server metrics with the Molotov reports
- (mbrandt) at rpopa's recommendation, going to test a single node and increase molotov resources until it breaks, and then do some math
- (mbrandt) ailoads is ready anytime, needs some responses before molotov goes off
- (miles) will spin up a single node for mbrandt to go after lunchtime
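The "break a single node, then do some math" step mentioned above could look roughly like the sketch below. Nothing here comes from the meeting: the function name, the headroom factor, the minimum node count and every number are hypothetical placeholders, and it assumes throughput scales linearly with node count, which the load test itself would need to confirm.

```python
import math


def nodes_needed(per_node_max_rps, target_rps, headroom=0.5, min_nodes=2):
    """Estimate how many collector nodes a target load would need.

    per_node_max_rps: requests/second at which the single test node broke
                      (hypothetical input, taken from the load test).
    target_rps:       peak requests/second the cluster should handle.
    headroom:         fraction of the breaking point we are willing to run
                      at in normal operation (assumed value, not measured).
    min_nodes:        floor for redundancy (assumed value).
    """
    if per_node_max_rps <= 0 or not 0 < headroom <= 1:
        raise ValueError("need a positive breaking point and 0 < headroom <= 1")
    # Only plan for a fraction of the observed breaking point per node.
    usable_rps = per_node_max_rps * headroom
    # Round up, and never go below the redundancy floor.
    return max(min_nodes, math.ceil(target_rps / usable_rps))


# Made-up numbers: a node that broke at 400 rps, a 1000 rps target.
print(nodes_needed(400, 1000))
```

The headroom factor is the knob to argue about: it is only a stand-in for whatever safety margin the team settles on after seeing how the node actually fails.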
Deprecation rampage
- no updates
Processor rewrite
- NO UPDATE
Upgrading Elasticsearch
- (Adrian) still blocked by the most annoying bug (but Will gave me some tips to solve it, working on it)
- I suspect it might be a bug in an underlying library
- (Adrian) mapping issue is solved, just need to import prod's Super Search Fields data locally and mess with it
- ... once I have a Socorro that works with ES 5.1
Other Business
- outage looks like background quantum AWS weirdness and re-balancing
Travel, etc
- Adrian switching Monday (out) and Friday (working) next week