Socorro:PRD Interviews
dbaron
- Use cases
- Top crashers -> details of particular crash
- Admin UI to add versions
- Constrain search by signature
- CSV files
- Rate of increase of a crash? (What URIs?)
- Top URLs for a given crash
- View user comments for a particular crash
- Wishlist
- Constrain top crash lists
- When did a crash start? (search on both time and buildid)
- Ad hoc queries
- What percentage of crashes are caused by Flash? By extension X?
- Faceted search
- Map/Reduce
- Explosive crashes - post to dev.tree-management (notification)
- More correlations
- Data in the minidump that is not available in the UI (some for privacy reasons)
- Stackwalking code could generate better stacks if it had copies of the DLLs? (ask ted)
- Better symbol coverage
damon
- Use cases
- Look at top crashers for new bugs (ones without bugzilla ids), fast rising crashes
- Feedback/wishlist
- Front page should always show current shipped version
- Sudden crash patterns - email dev.tree-management when a shipped release shows a sudden spike
- subscribe to explosive bugs (rss feed?)
- Definition of explosive/critical bugs:
- (initial) growth of more than 25 positions in the ranking
- upwards change in rank and no related bugzilla id
- time since startup < 1 minute
- highlight these crashes in red or something
- View all crashes - needs filter by hang/crash
- Search:
- Search by time since startup
- Search for bugs with no bugzilla bug
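damon's definition of explosive/critical bugs can be sketched as a small rule check. This is only an illustration: the `SignatureStats` record, its field names, and the use of a median uptime per signature are assumptions, and the notes do not say whether the three criteria should be combined, so each one is reported as a separate flag.

```python
from dataclasses import dataclass
from typing import List, Optional

RANK_JUMP_THRESHOLD = 25   # "growth of more than 25 positions in the ranking"
STARTUP_SECONDS = 60       # "time since startup < 1 minute"


@dataclass
class SignatureStats:
    """Hypothetical per-signature summary; not a Socorro data structure."""
    signature: str
    previous_rank: int              # 1 = top crasher in the earlier period
    current_rank: int               # 1 = top crasher in the current period
    bug_id: Optional[int]           # related bugzilla id, if any
    median_uptime_seconds: float    # assumption: uptime summarized as a median


def explosive_reasons(stats: SignatureStats) -> List[str]:
    """Return the list of criteria this signature meets (empty if none)."""
    reasons = []
    positions_gained = stats.previous_rank - stats.current_rank
    if positions_gained > RANK_JUMP_THRESHOLD:
        reasons.append("rank jump")
    if positions_gained > 0 and stats.bug_id is None:
        reasons.append("rising without bug")
    if stats.median_uptime_seconds < STARTUP_SECONDS:
        reasons.append("startup crash")
    return reasons
```

A signature with a non-empty result could then be highlighted in the top crasher list or pushed to a feed, as requested above.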
jonas
- Use cases
- Top crashers: for each bug, what cause?
- pull down minidump
- open in debugger
- classification
- file bugs
- Graphs (build time more useful than clock time)
- Does a new version improve crashiness?
- For each crash/bug, how are we doing?
- Is the bug assigned?
- Last commit date / last active bz date
- Which component is it assigned to?
- Which group? (third party/Mozilla)
- Other scripts
- Uses dbaron's correlation reports
- Uses jst's script to pull down minidumps
- Feedback/Wishlist
- Find all crashes for a given signature where there is an email supplied
- No correlation reports for startup crashes (no addon info for these) - need to google DLLs. Internal mapping of DLLs to addons would be useful.
- Whiteboard notations in bz that socorro could pull in
- Search all 3rdparty crashes / search mozilla only ("find all the bugs I can act on")
- Group crashes by DLLs (one DLL may be responsible for several crash sigs)
- Go from crashes to DLLs, DLLs to crashes, filter by DLL
- Find new crashes easily
- 3 categories of new crash
- trunk/mozilla-central - broken checkins (low #s)
- new dot release (early life crashes)
- existing release starts crashing, typically a 3rd party problem, or sometimes a jump in rank from low volume to high
chofmann
- Use cases
- release.next
- look at top 300 crash reports
- look at sigs without bugs
- what do we need to do to get a good bug on file?
- CSV
- Identify correlations where we don't have a correlation report e.g. OS where we might most easily reproduce
- What OS
- What other versions does this crash appear in
- If pre-existing, has frequency changed? ("volume regressions")
- dbaron's correlations to plugins and addons
- sanitize URLs and add to bug (if on public website)
- look at time since startup - not many URIs, URIs not useful
- within 30s is a startup bug, but anything in first few minutes is interesting
- Wishlist
- breakdown by:
- OS
- Fx version
- Time from startup
- look at individual crash reports
- what other reports are like this?
- correlations:
- integrate addons and plugins
- cron job to grab interesting data
- search UI
- search by each individual field in crash report - "like bugzilla search" - use a crash report template
- search by DLL (also check out /CrashKill/dll-dictionary - can we crowdsource this?)
- interested in map/reduce UI
- DLL check, similar to plugincheck?
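chofmann's "time from startup" breakdown could look roughly like the sketch below. The 30 second startup cutoff comes from the notes; the 180 second bound for "first few minutes" and the bucket names are assumptions chosen only for illustration.

```python
from collections import Counter
from typing import Iterable

STARTUP_SECONDS = 30    # from the notes: "within 30s is a startup bug"
EARLY_SECONDS = 180     # assumption: stands in for "first few minutes"


def uptime_bucket(seconds_since_startup: float) -> str:
    """Classify one crash by how long the application had been running."""
    if seconds_since_startup <= STARTUP_SECONDS:
        return "startup"
    if seconds_since_startup <= EARLY_SECONDS:
        return "early"
    return "later"


def uptime_breakdown(uptimes: Iterable[float]) -> Counter:
    """Count crashes per uptime bucket for one signature."""
    return Counter(uptime_bucket(u) for u in uptimes)
```

The same counting pattern would apply to the OS and Fx version breakdowns, with the field value itself used as the bucket.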
jesse
- Use cases
- bugs with stack traces
- search for specific sigs, trying to find
- frequency
- correlations
- Uses dbaron's text files
- Always uses advanced search. Always uses "contains" not "is exactly" - this should be the default.
- Feedback/wishlist
- List of reports (for one signature)
- Data not presented well. Most columns have identical data. Display could be more compact
- When many results are returned (more than one page), does sorting apply to all results, or just those on the current page?
- Everything seems to be 0 or 100%?
- Broken in Safari
- Sort crashes by component
- Suggest ways to narrow search (lt: faceted search)
- Restrict search by caller (next thing on stack, nth thing on stack)
- Personal adhoc skiplists
- Write own Map/Reduce jobs - share them, elevate them
- Change search length
- List of reports for a sig is not useful
- Should summarize the reports
- Should have stats: 90% OS X, correlations, etc, how common?
- Need to click through to an individual crash to get anything useful
- Sort by stack trace, sort by caller
- Highlight unexpected correlations
- Developer discussions/comments per crash signature - this would be more useful than bugzilla, sometimes crashes do not map to an actionable bug report
- Subscribe to a feed of new crashes in my component
- Add names of extensions (AMO integration)
- Give feedback to users, how to avoid crashes (bug 411425, bug 336872)
- Query for exploitable crashes - jumps to random memory addresses (lt: not sure of what we would be matching against here) - instruction pointer and what it tried to access - would require changes to client and MDSW
- Wanted bugs
- bug 512910 anything in a correlation should be searchable
- bug 562411 clever weighting
- bug 524847
- bug 524851
- Faceted search
- Crashes by caller
- Sort by Flash version
jst
- Uses
- Runs a script to extract data
- Find new crashes in 3.6 not seen before, put in flat file
- sigs, # crashes per release
- ignore crashes with no sig, ones that happen in plugins
- configurable thresholds
- runs this nightly off desktop in the office
- Does not use web ui now, did during crashkill push
- Investigate reported crashes
- Correlation data is critical
- All correlations: addon versions, plugins, DLLs
- Top crashers -> correlations
- Loves graphs of frequency of a crash by date/build
- Feedback/wishlist
- Collect DLLs - if something is in 95% of windows 7 installs, call it official
- Frequency count these
- Then in correlations, mark as "Windows 7 DLL" (also for other windows versions)
- Wrote script to pull 5 minidumps per sig, eval stacks - what does the rest of the stack look like? (callers)
- Let anybody write their own queries
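jst's idea of frequency counting DLLs and calling the near-universal ones "official" might be sketched as follows. This assumes the module lists come from crash reports for a single Windows version, which is only a proxy for "installs"; the function name and input shape are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set

OFFICIAL_SHARE = 0.95   # from the notes: present in 95% of Windows 7 installs


def official_modules(reports: Iterable[Iterable[str]],
                     share: float = OFFICIAL_SHARE) -> Set[str]:
    """Return module names that appear in at least `share` of the reports.

    `reports` holds one collection of loaded DLL names per crash report,
    all taken from the same OS version (e.g. Windows 7).
    """
    counts = Counter()
    total = 0
    for modules in reports:
        total += 1
        counts.update(set(modules))   # count each DLL once per report
    if total == 0:
        return set()
    return {name for name, n in counts.items() if n / total >= share}
```

A correlation report could then label any DLL in the returned set as "Windows 7 DLL" and concentrate attention on the remaining third party modules.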
dmandelin
- Uses
- Not to *find* top/new crashes, but to fix them
- Interested in top JS crashes, can mostly id these by signature (JS in there somewhere)
- chofmann would find and file bugs, dmandelin would look at crashes to fix
- 99% of use is trying to figure out cause of crashes
- First thing: correlations: "it's Flash"
- Another use: look at various reports with same sig, look for patterns:
- Do they all have the same address? If yes, this is a strong hint.
- Operating systems?
- Uptime?
- Next: look at individual stack traces
- what method?
- what LOC?
- Then:
- download a raw dump or two
- can be tricky: dumps can be loaded into Visual Studio, but a disassembler is really needed to make sense of them
- Next: crashes by build (graph)
- What is the first build where this crash appeared? (or spiked in volume)
- Combine and summarize all this data
- Pre-graphs, ran Python scripts to pull down dumps and summarize
- A special case: "hard bugs"
- Add a patch (debug code) to make build crash in a different way
- Search - advanced
- Based on sig + Fx version + 1-2 weeks back
- Feedback/wishlist
- Graph by build, all builds side by side
- When did it start (build time), when did it change freq (build time)
- Crash spikes must be reported relative to ADUs otherwise meaningless
- Search results page
- Group by /sort by (tag)
- Categorize - yes, faceted search would solve this
- Aggregation is key to understanding the data
- Easy switch between views without having to go back to front page/dropdowns
- Full text search
- Run own queries (yes, would like access to Map/Reduce UI)
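dmandelin's point that crash spikes are meaningless unless reported relative to ADUs can be illustrated with a small normalization sketch. The function names, the per-100-ADU scale, and the factor of 2 used to call something a spike are all assumptions, not values from the interviews.

```python
def crashes_per_100_adu(crashes: int, adu: int) -> float:
    """Normalize a raw crash count by active daily users (ADU)."""
    if adu <= 0:
        raise ValueError("ADU must be positive")
    return 100.0 * crashes / adu


def is_spike(prev_crashes: int, prev_adu: int,
             cur_crashes: int, cur_adu: int,
             factor: float = 2.0) -> bool:
    """Flag a spike only when the ADU-normalized rate grows by `factor`.

    `factor` is a hypothetical threshold chosen for illustration.
    """
    previous_rate = crashes_per_100_adu(prev_crashes, prev_adu)
    current_rate = crashes_per_100_adu(cur_crashes, cur_adu)
    return current_rate > factor * previous_rate
```

For example, raw crashes doubling from 100 to 200 while ADUs also double from 10000 to 20000 is not flagged, because the normalized rate is unchanged.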
tomcat
- Uses
- File bugs, crashes for each build, STR (steps to reproduce)
- Rarely uses search
- Try to develop test cases for each crash
- Correlation reports - what addons are involved?
- Feedback/wishlist
- Which plugins? What versions are they? Are those versions up to date? (Tie in to plugincheck)
- Explosive crashes
- Are there bugs for the top 25 crashes?
- Expose comments field
- Current way of showing trends is good
- Emailing users (bug 411425) - should do this, but how to do it right (talk to beltzner)
- DLLs are interesting but may be too hard
marcia
- Uses
- dbaron's correlations
- investigate new crashes - top crashes, mostly on trunk
- file bugs for new crashes
- dig for problematic addons
- try to drill down by addon
- would love search by addon
- look for crashes in JS stack
- find correlated URIs - these are usually easy to repro
- find by URI
- find by component
- Individual stacks
- On one OS: crashes, URLs, try to repro
- Feedback/wishlist
- Search is hard to use with signatures
- Advanced search should be for longer window
- Identify plugin names and versions
- Pattern matching in stack and signature search
- Match by component
juan
- Uses
- New to Socorro, works on OOPP
- Feedback
- Homepage releases should be per-user (as with other prefs)
- Current throttle percentage should be shown (wontfix)
- Need to be able to distinguish between builds as we approach GA - show buildid
- Top changers - what do the numbers mean?
- Compare top crashers for two most recent builds
- Need visibility of extensions and plugins in human readable format
- Aggregate crashes per ADU should be prominently displayed since it's a KPI
wsmwk
- Uses
- STR
- Stack
- Extension
- Contact reporter by email, since comments are usually not enough; response rate is 10-20%. Sends 1-15 emails per signature, selecting reporters whose comments are useful and written in English.
- JS: lots of TB problems - JS state info is sometimes missing (ask ted for more info?)
- Time from startup is interesting
- Probability of it being an extension?
- File bugs on crashes without bugs
- Why are there hardly any Mac crashes for TB?
- With TB, little correlation between dev/beta crash rate and shipped crash rate
- Different type of users, low volume
- Feedback/wishlist
- Correlations (and between releases)
- Explosive crashes
- Compare side by side stacks between two different crash types
- Search - keyword + sig
- should be contains by default
- longer search range
- longer "contains" in stack
ludovic
- Uses
- After a new release - what bugs have we missed?
- New crashes without bugs
- Crashes with increased volume
- Feedback/wishlist
- Remember TB cannot be throttled (too low volume)
- When did crash first appear/spike?
- What crashes are related to 3rdparty extensions/etc
- Wants bug 411425