Releases:Release Post Mortem:2016-02-10
Meeting Details
- 3:30pm ET
- Vidyo - Release Engineering room
- Dial-in:
- 650-903-0800 or 650-215-1282 x92 Conf# 98225 (US/INTL)
- 1-800-707-2533 (pin 369) Conf# 98225 (US)
- #releaseduty on irc.mozilla.org
- https://trello.com/b/MXHaVRcP/release-promotion-meeting
Release Duty
- FF 45 cycle: mtabara
Misc
Shipped
Firefox/Fennec 45.0b4 (mtabara/nick/rail)
- flawless victory!
- building despite bug 1246854, should not impact the release
- Firefox intermittent errors:
- failed at repack_4/10 on linux: failed because no space was left on device while downloading update, automatic retry
- failed at repack_3/10 on win32: Lower Sorbian locale download failed, retriggered
- regular GTK3 known-issue errors for linux/linux64 update_verify_beta steps
- Fennec - shipped a day later, all good!
TODO: awaiting push to Google Play Store and post-release step
Firefox 44.0.1 (jlund/rail/mtabara/nick)
- This dot release is desktop only and has fixes for bug 1244505, bug 1242176, bug 1222171 and bug 1244069.
- once shipped, the update rates were set to 100% at RelMan's instructions. However, for 44.0.1 we reduced the updates to 0% yet again because of a new security issue, bug 1245724.
A chemspill/dot release is expected soon: 44.0.2 is underway for both desktop and mobile
- build1
- some intermittent errors:
- failed at repack_4/10 on win32, retriggered - It seems to have been some trouble in handling/processing the firefox-42.0-44.0.1.partial.mar while repacking for Scottish Gaelic locale
- failed at firefox_antivirus, retriggered - seems to have been a network downloading issue with German language locale for one of the mac partials MARs.
- failed at firefox_antivirus, retriggered - seems to have been a network downloading issue with Songhay language locale for one of the mac partials MARs.
- failed at update_verify_release_1/6 on win32 - timeout, retriggered
- abandoned - the Media Playback team has received reports of A/V sync problems (multiple seconds) with some YouTube content (bug 1245696). This is a regression in FF 44 from bug 1229605. Turning off "media.mediasource.webm.audio.enabled" will revert from Opus audio to AAC audio, which is a well-tested code path. Opus has slightly better sound quality and/or lower bandwidth requirements than AAC. Building build2 with that pref turned off
- build2
- some intermittent errors:
- failed at repack_10/10 on win64: similar to "Thunderbird 45.0b1 build1: failed at repack_2/10 on win32", xh locale failed while running make_incremental_update.sh, possibly from a balrog submission step? At any rate, this seems intermittent.
- we were a bit impatient and didn't wait for the 'ready for release' email before pushing updates. It turns out the bouncer check had stalled:
2016-02-08 14:25:43-0800 [HTTPPageGetter,client] TriggerBouncerCheck: uptake is 0
which was from the request
https://bounceradmin.mozilla.com/api/uptake/?product=Firefox-44.0.1-Partial-41.0.2&os=win64
not having any uptake. Win64 started at 42.0, so a partial from 41.0.2 makes no sense. Would not be surprised if this is another case of 'ship-it makes bad suggestions for partial updates' (bug 1146863)
- to fix this up nthomas dropped a dummy text file at pub/firefox/releases/44.0.1/update/win64/zh-TW/firefox-41.0.2-44.0.1.partial.mar, but buildbot failed to continue once it had uptake for all requests:
2016-02-08 14:30:43-0800 [HTTPPageGetter,client] TriggerBouncerCheck: uptake is 2000100
2016-02-08 14:30:43-0800 [HTTPPageGetter,client] TriggerBouncerCheck: Stopping uptake monitoring: Reached required uptake: 2000100
2016-02-08 14:30:43-0800 [HTTPPageGetter,client] TriggerBouncerCheck failed:
Traceback (most recent call last):
  File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/defer.py", line 441, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/defer.py", line 664, in _cbDeferred
    self.callback(self.resultList)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/defer.py", line 318, in callback
    self._startRunCallbacks(result)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/defer.py", line 424, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/defer.py", line 441, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbotcustom/scheduler.py", line 327, in checkUptake
    Triggerable.trigger(self, self.ss, self.set_props)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbot-0.8.2_hg_41f0fbee10f4_production_0.8-py2.7.egg/buildbot/schedulers/triggerable.py", line 66, in trigger
    d = self.parent.db.runInteraction(self._trigger, ss, props)
exceptions.AttributeError: 'NoneType' object has no attribute 'db'
- there was a reconfig at 11:00 PST (i.e. after the initial start of TriggerBouncerCheck and it succeeding), which is known to cause failures.
- to work around this the release-mozilla-release-firefox_release_start_uptake_monitoring builder was forced (setting script_repo_revision and release_config properties), which fired off jobs and emails as expected, culminating in the 'ready to release' email.
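The stalled bouncer check above came from asking for uptake of a partial that could never exist (41.0.2 on win64, which started at 42.0). A minimal sketch of filtering such partials out before building the uptake requests is shown here. It is not the buildbotcustom or ship-it implementation: the FIRST_VERSION table and the helper names are hypothetical, only the win64 entry comes from these notes, and the URL pattern is copied from the request quoted above.

```python
from urllib.parse import urlencode

# Hypothetical table of the first release shipped per platform.
# Only the win64 entry (42.0) is taken from the notes; extend as needed.
FIRST_VERSION = {"win64": "42.0"}

# URL pattern copied from the uptake request quoted in the notes.
UPTAKE_URL = "https://bounceradmin.mozilla.com/api/uptake/"


def version_key(version):
    """Turn '41.0.2' into (41, 0, 2) so versions compare numerically.

    Assumes plain release versions; beta strings such as '45.0b4'
    are not handled by this sketch.
    """
    return tuple(int(part) for part in version.split("."))


def partial_is_valid(platform, from_version):
    """A partial only makes sense if from_version shipped on platform."""
    first = FIRST_VERSION.get(platform)
    if first is None:
        return True  # no known lower bound recorded for this platform
    return version_key(from_version) >= version_key(first)


def uptake_urls(product, version, partials, platforms):
    """Yield (platform, from_version, url) for every valid partial."""
    for platform in platforms:
        for from_version in partials:
            if not partial_is_valid(platform, from_version):
                continue  # e.g. skip 41.0.2 on win64
            query = urlencode({
                "product": f"{product}-{version}-Partial-{from_version}",
                "os": platform,
            })
            yield platform, from_version, f"{UPTAKE_URL}?{query}"


if __name__ == "__main__":
    for row in uptake_urls("Firefox", "44.0.1", ["41.0.2", "42.0"], ["win32", "win64"]):
        print(row)
```

With the partials 41.0.2 and 42.0, this sketch would request both on win32 but only 42.0 on win64, so the check would not wait on a file that was never generated.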
Firefox/Fennec_45.0b3 (jlund/rail/callek/nick/mtabara)
- yet-another victory!
- GTK3 status update on this release - excerpt from bug 1245476:
As discussed in 1227024, bug 1205199 is critical enough to disable gtk 3 in 45:
* 45 is an ESR release, it would be safer to introduce gtk 3 in 46
* We only had two beta with gtk3, this is not enough for such important changes
* We need time to make sure gtk2 is still in a good shape
- Some intermittent errors:
- tl;dr retriggered - build step failed on win32: known intermittent - bug 1224886 - Intermittent Win PGO build LINK : fatal error LNK1000: Internal error during IMAGE::BuildImage after workerprivate.cpp(3075) : fatal error C1001: An internal error has occurred in the compiler
- from the IRC channel
02:37:28 <nthomas> this sort of thing can lead to compiler upgrades to pick up fixes
02:38:09 <nthomas> acksully, https://bugzilla.mozilla.org/show_bug.cgi?id=1224886
02:39:05 <nthomas> they do fix these kind of bugs, eg https://connect.microsoft.com/VisualStudio/feedback/details/819439/fatal-error-c1001-an-internal-error-has-occurred-in-the-compiler
- failed at repack_7/10 on linux64 - random network issue while setting up for the job, reran it
- failed at firefox_antivirus - network downloading issue with Marathi language locale for one of the win32 partials MARs, retriggered.
- regular GTK3 errors for linux/linux64
Ongoing
Firefox 44.0.2 (mtabara/rail/nthomas)
- new security issue, bug 1245724. 44.0.2 is underway for both desktop and mobile
- build1: stopped in order to add one more critical fennec issue and start a build 2
- build2:
- intermittent errors for Firefox:
- failed at firefox_antivirus, retriggered - intermittent download error for locale/partial
Fennec 44.0.2 (mtabara/rail/nthomas)
- new security issue, bug 1245724. 44.0.2 is underway for both desktop and mobile
- build1: stopped in order to add one more critical fennec issue and start a build 2
- build2: abandoned here for yet-another build to follow with a hotfix
- intermittent errors:
- Fennec 44.0.2 build2: build step failed on android-api-11 - failure to clone build/tools when the fingerprint didn't match. gps suspects AWS are rolling out new certs
- build3: abandoned here as buildbot-master73 froze our builds and was really slow today. Given that there was too much room for human error to interfere, we'll follow up with a fourth build.
- build4:
- intermittent errors:
- [release-runner] WARNING: Reconfig exceeded 900m then 1800 seconds - looks like buildbot-master73 is naughty today and really slow hence it delayed the whole reconfig step
- at least three builders have been grabbed by the same bm73 yet-again. We might end up in the same scenario as build3.
Thunderbird_45.0b1 (jlund/rail/callek/nick/mtabara)
- TODO: awaiting a decision, as Balrog lacks a TB equivalent of the Firefox beta gtk3 watershed rule; please see the email on the tb-drivers mailing list
- build1: we had two win32 repacks failing
- failed at repack_6/10 on win32 - retriggered, intermittent timeout
- failed at repack_2/10 on win32 - retriggered
- retriggered after the 'da' locale failed while submitting to balrog, specifically around the make_incremental_update.sh script
- retriggered upon losing the slave instance
- retriggered upon timeout
- from tb-drivers mailing list: "We'll likely abandon build1 and go for build2 after getting some fixes"
- build2: "Same changesets as before, but buildbot changes merged to production."
- gave up on build 2 because of a build error
- build3:
- intermittent errors:
- failed at repack_7/10 on win32, automatic retry
- failed at repack_3/10 on win32, automatic retry
- failed at update_verify_beta_2/6 on linux64 - GTK3 known issue error
- failed at update_verify_beta_2/6 on linux - GTK3 known issue error