Sheriffing/How To/Bisecting
From the Wikipedia article on bisection:
"Bisection is a method used in software development to identify change sets that result in a specific behavior change. It is mostly employed for finding the patch that introduced a bug."
Sometimes sheriffs need to perform a bisection to find out which changeset caused a failure when this can't be determined by code inspection. This might happen for intermittent bugs or because tests were skipped due to SETA.
Here's a little scenario to demonstrate the process.
Example Bisection Scenario
After a merge from integration to m-c, the xpcshell tests on linux-asan are busted and it's not clear which changeset caused it: inbound and autoland were fine.
Examine the merge details
Changes of the merge (fictitious):
2c497462f25e Merge inbound to m-c a=merge
02851079c451 Bug 1359458 - Increase assertion count range for test_bug437844.xul. r=jmaher
B37b46c7f38f Bug 1358241 - [1.1] Add mutex locking around the library handles cache. r=jchen
751455b663d0 Bug 1358241 - [1.2] Make direct library reference counter atomic to avoid mutex locking issues. r=jchen
841fa5fb06a8 Bug 1355676 - Check for nulls when decoding icons. r=sebastian
Ad9d525e6db7 Bug 1356243 - Enable Screenshots by default. r=Mossop
2e44294b9f5c Bug 1359273 - Split up DevTools' sort-arrows.svg to improve performance. r=jryans
Preparation for bisecting
- Clone and/or update your mozilla-central repo. You should already have this as part of your unified repo.
- Add the try server settings (see https://wiki.mozilla.org/ReleaseEngineering/TryServer#Configuration )
- Find the try syntax you need to run the specific test(s) you need. The Trychooser can help. In this scenario it is:
try: -b o -p linux64-asan -u xpcshell -t none --rebuild 10
- The --rebuild 10 means that the xpcshell tests run 10 times instead of only once. This is really important if the bug is intermittent.
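Since the same try syntax is reused for every push during a bisection, it can help to build the commit message in one place. The following is a minimal sketch: the function name try_message and its two parameters are hypothetical, and only the try syntax itself comes from this scenario.

```shell
# Build a try commit message from a label and an optional rebuild count.
# try_message is a hypothetical helper; the try syntax is the one used
# in this scenario.
try_message() {
    label="$1"           # free text, e.g. "central tip"
    rebuild="${2:-10}"   # number of runs per job, defaults to 10
    printf '%s try: -b o -p linux64-asan -u xpcshell -t none --rebuild %s\n' \
        "$label" "$rebuild"
}

try_message "central tip"
```

Calling try_message "central tip" prints the message used in the next section; a second argument changes the rebuild count.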
Verify the failure
Confirm that you set everything up correctly and that you can reproduce the problem on try, i.e. do a try push with mozilla-central tip as topmost revision:
- Make a dummy change, e.g. touch the CLOBBER file
- Use a commit message like the following:
hg commit -m "central tip try: -b o -p linux64-asan -u xpcshell -t none --rebuild 10"
- Push to try
- Check that you get the same failure.
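The steps above can be sketched as a small script. This is only an illustration under stated assumptions: it assumes a push path named "try" is configured in your hgrc, that appending a comment line to CLOBBER is an acceptable dummy change, and that a forced push is what your try setup expects. The helper names run and push_to_try are hypothetical. By default the script only prints the hg commands instead of running them.

```shell
# Sketch of one try push from the current working directory.
# With DRY_RUN=1 (the default here) nothing is committed or pushed;
# the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# push_to_try MESSAGE: make a dummy change, commit it, push to try.
# Assumption: "try" is a push path defined in your hgrc.
push_to_try() {
    message="$1"
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ append a comment line to CLOBBER\n'
    else
        printf '# dummy change for try bisection\n' >> CLOBBER
    fi
    run hg commit -m "$message"
    run hg push -f try
}

push_to_try "central tip try: -b o -p linux64-asan -u xpcshell -t none --rebuild 10"
```

Set DRY_RUN=0 only once the printed commands match what you intend to run in your own checkout.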
Bisection begins in earnest
Bisection means cutting things in half, so split up the merge and check whether the test failure already starts in the middle of the merge. In this case, let's say you decide to use 841fa5fb06a8 as your new revision:
- Update to the older topmost rev:
hg up -r 841fa5fb06a8
- Confirm this by running hg summary, which should show something like:
parent: 354839:841fa5fb06a8 Bug 1355676 - Check for nulls when decoding icons. r=sebastian
- Commit with a message like the following:
hg commit -m "central rev 841fa5fb06a8 try: -b o -p linux64-asan -u xpcshell -t none --rebuild 10"
- Push to try
- Check the results
Let's assume the results for 841fa5fb06a8 are green. Now repeat the same steps for the later changes (02851079c451 and B37b46c7f38f / 751455b663d0) to check which of these 2 bugs caused the issue.
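Choosing which revision to test next can be sketched as picking the middle entry of the changesets that are still suspect. The helper below is hypothetical (it is not a Mercurial command) and assumes the changesets are passed oldest first, which is the reverse of the order in which the merge is listed above.

```shell
# Print the middle changeset of a list ordered oldest to newest.
# midpoint is a hypothetical helper, not part of Mercurial.
midpoint() {
    count=$#
    [ "$count" -gt 0 ] || return 1
    index=$(( (count - 1) / 2 + 1 ))   # 1-based position of the middle item
    shift $(( index - 1 ))
    printf '%s\n' "$1"
}

# The fictitious merge from this scenario, oldest first.
midpoint 2e44294b9f5c Ad9d525e6db7 841fa5fb06a8 751455b663d0 B37b46c7f38f 02851079c451

# 841fa5fb06a8 was green, so only the newer changesets remain suspect.
midpoint 751455b663d0 B37b46c7f38f 02851079c451
```

The first call prints 841fa5fb06a8, the revision chosen above. The second prints B37b46c7f38f, which would be the next revision to push to try.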
Backing out and follow-up
Once you've found the bad changeset, follow the instructions to back it out. In most cases, you will be bisecting after the problem code has been merged around to different branches, so you will need to back it out from more than one branch. For this reason, you shouldn't offer developers the chance for a follow-up fix.
Caveats
- When bisecting, you can have up to 6 pushes to Try running at the same time to get results ASAP.
- Be considerate though. Running any Try jobs with the --rebuild parameter set will tie up more resources than normal and will impact your fellow developers if the trees are not closed.