CIDuty/Meetings:2013-08-27


Release Engineering Buildduty Meeting

Status of buildduty period

Bugs filed

  • https://bugzil.la/908354 Need a way to obtain production keys for win64 machines (or fix original approach)
  • https://bugzil.la/908359 Machine without reboot history even after not taking jobs for almost two days
  • https://bugzil.la/908670 We cannot file new bugs from slavealloc (using old components)

Previous action items

Agenda

  • (Callek) put foopies in slave_health and slavealloc [as slaves]?
      • Defer to when coop is here, is probably best.
    • (coop) should go in slavealloc, but how? ideas:

      1. new category (and associated db table) for foopies, where tegra/panda maps to foopy maps to master
      2. foopies as masters
      3. foopies as slaves

    • for 2 and 3, still need a join table to do the one-to-many mapping
  • (armenzg) Amazon issues last week
  • (armenzg) nagios checks
    • mozpool pandas (no nagios) VS regular pandas
    • desktop machines (only complain after kittenherder has had its chance)
      • Do you know if we report nagios issues for *desktop* and *tegra* hosts earlier than 6 hours after an incident? I ask because I believe we should not report anything before 6 hours, since we hope briar patch will do something about it first.
        • hwine says: Agreed -- if you are seeing that, please reopen https://bugzil.la/886637 which was supposed to have done that already.
      • I assume the pandas that use mozpool should not report on nagios at all.
  • (armenzg) IPMI tends to hang, and we should take that into consideration in our code
    • do not PING regularly or it will go down
    • my convo with arr
  • (bhearsum) IT escalation for slaveapi reboots - reboot vs. hardware diags
  • (bhearsum) difficulty testing new tools due to default deny
  • (bhearsum) ssh config changes on r3 w7 machines
    • windows auto-reboot status
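
On the foopy item above: options 2 and 3 both need a join table for the one-to-many mapping from a foopy to its tegras/pandas. A minimal sketch of what that could look like, using an in-memory SQLite database. The table names, column names, and host names are all hypothetical and are not taken from the actual slavealloc schema:

```python
import sqlite3


def build_demo_db():
    """Create an in-memory DB with a hypothetical foopy-to-device join table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE hosts (
            id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        -- Join table: one foopy row in hosts maps to many device rows.
        -- UNIQUE on device_id keeps each device on at most one foopy.
        CREATE TABLE foopy_devices (
            foopy_id INTEGER NOT NULL REFERENCES hosts(id),
            device_id INTEGER NOT NULL UNIQUE REFERENCES hosts(id)
        );
    """)
    return conn


def add_host(conn, name):
    """Insert a host (foopy or device) and return its row id."""
    cur = conn.execute("INSERT INTO hosts (name) VALUES (?)", (name,))
    return cur.lastrowid


def attach(conn, foopy_id, device_id):
    """Record that a device is managed by a foopy."""
    conn.execute(
        "INSERT INTO foopy_devices (foopy_id, device_id) VALUES (?, ?)",
        (foopy_id, device_id),
    )


def devices_for(conn, foopy_name):
    """Return the sorted names of all devices attached to the named foopy."""
    rows = conn.execute(
        """
        SELECT d.name
        FROM foopy_devices fd
        JOIN hosts f ON f.id = fd.foopy_id
        JOIN hosts d ON d.id = fd.device_id
        WHERE f.name = ?
        ORDER BY d.name
        """,
        (foopy_name,),
    )
    return [row[0] for row in rows]
```

In this sketch foopies and devices share one hosts table, which is closest to options 2 and 3. Option 1 would instead put foopies in their own table and point foopy_id at that.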

List of current projects

Action items

  • armenzg - to reach out to IT wrt nagios URL links documentation
  • jhopkins to look at bug with no reboot history - https://bugzil.la/908359