SocorroRoadmap2010
DRAFT
The content of this page is a work in progress intended for review.
Please help improve the draft!
Ask questions or make suggestions in the discussion
or add your suggestions directly to this page.
Dates
- Dates for the following goals can be found in Google Docs
Related Quarterly Goals
- Metrics Q2 goals (for background):
- Replace NFS in production
- Have cluster doing background processing of 100% of crash reports
- Provide replacement for Postgres big table
- [stretch] Developer API, likely to slide to Q3
- Client team goal: Gather more information from crashes (bug 528657)
Milestones
1.7: HBase, part I
- Get individual crash reports into HBase
- Begin rewriting the Python middleware to support UI -> HBase (transparent to the UI at this stage)
- OOPP hang reports supported
- End of NFS
1.8: HBase, part II
- Daemonize processor/MDSW and run on HBase worker nodes (architecture diagram coming)
1.9: Middleware API
- Create an API to HBase/SOLR to replace most PostgreSQL queries in the webapp
2.0: new UI
- Rewrite webUI to use new middleware
- (stretch goal, may slip to 2.1) Implement general-purpose full-text search. It should be possible to search on any data associated with a crash, e.g. any part of the stack trace and/or module list, or any combination of field values
- [Existing search bugs]
- PRD needed (implicit in bugs, make explicit)
- UX work needed here [chowse]
2.01: Cleanup
- Post 2.0, do a cleanup release to take care of housekeeping
- Perform a team survey of the unit testing landscape
- Define unit testing needs
- Hadoop
- Python
- PHP
- Define integration testing needs
- Define acceptance testing needs
- Define unit testing strategy
- Assign people to champion each area of testing
- Perform a team survey of the documentation landscape
- Define documentation needs
- http://code.google.com/p/socorro
- Python
- PHP
- Assign people to champion each documentation area
- Better app monitors / business level monitoring
- Subversion
- Decide whether to change to branch release system - https://bugzilla.mozilla.org/show_bug.cgi?id=481479
2.(x+1) Trend Reports, part 1: Explosive bugs
- Explosive Bugs Analysis
- Automated detection of explosive bugs
- First stage is bug 519423
- PRD is needed here [chofmann/laura]
- UX is needed here
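As a starting point for the PRD discussion, automated detection could be sketched as a comparison of each signature's count for the current day against its average over earlier days. This is only an illustration: the roadmap does not define "explosive", so the function name find_explosive, the factor and min_count thresholds, and the input shapes below are all assumptions, not part of Socorro.

```python
def find_explosive(history, today, factor=5.0, min_count=20):
    """Return crash signatures whose count today looks like a spike.

    history: dict mapping signature -> list of daily counts for earlier days
    today:   dict mapping signature -> count for the current day
    factor and min_count are hypothetical thresholds chosen for illustration.
    """
    flagged = []
    for signature, count in today.items():
        # Ignore low-volume signatures so small numbers do not trigger alerts.
        if count < min_count:
            continue
        past = history.get(signature, [])
        baseline = sum(past) / float(len(past)) if past else 0.0
        # A signature with no history, or one that grew by `factor`, is flagged.
        if baseline == 0 or count / baseline >= factor:
            flagged.append(signature)
    return sorted(flagged)
```

A new signature with no history is flagged as soon as it passes min_count. Whether that is the desired behavior is one of the questions the PRD would need to answer.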
2.(x+2) Trend reports, part 2: better correlations
- Other cloud based correlation reports:
- Between one report and other related reports: what are the logical correlations? (PRD needed)
- Correlation between any single piece of data and another (e.g. plugins, time, etc.)
- Replace the current correlations hack with a cloud version (bug 554373)
2.(x+3)
- Draft goal: smarter analysis
Process
These improvements will be made over the course of Q2, likely continuing into Q3.
Better release process
- See DevProcess
Testing and QA
- Better code review practices: commits to mailing list
- Add QA to release cycle
- See Test Plan for UI testing
- More unit tests, more integration tests
- Validate data sources against each other (e.g. bug 552539, bug 553144) - also look back at similar fixed bugs for test cases
- Run tests automatically on checkin (Hudson?)
Monitoring
- Write scripts for app-level monitoring for IT to hook up to Nagios
- Implement "business logic" monitors: check things like hourly volume via webapp, db, etc.
- Expand application health [dashboard]
- Some existing bugs on this. What granularity? What is "normal"?
- [deinspanjer] HBase monitoring to be expanded
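A "business logic" monitor of the kind listed above could be a small script that follows the standard Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). The sketch below shows only the classification step. The function name check_hourly_volume and the thresholds are hypothetical, and the way the hourly count is obtained (webapp, db) is left out because the roadmap does not specify it.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_hourly_volume(count, warn_below, crit_below):
    """Classify an hourly crash-report count against two lower thresholds.

    count:      reports seen in the last hour, or None if it could not be read
    warn_below: hypothetical threshold; fewer reports than this is a warning
    crit_below: hypothetical threshold; fewer reports than this is critical
    Returns a (exit_code, message) pair suitable for a Nagios plugin.
    """
    if count is None or count < 0:
        return UNKNOWN, "UNKNOWN - could not determine hourly volume"
    if count < crit_below:
        return CRITICAL, "CRITICAL - %d reports in the last hour" % count
    if count < warn_below:
        return WARNING, "WARNING - %d reports in the last hour" % count
    return OK, "OK - %d reports in the last hour" % count
```

A wrapper script would print the message and exit with the returned code. What counts as "normal" volume, and therefore what the thresholds should be, is still an open question.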
Staging
- Staging closer to production/more realistic
- Perf/load test before deployment
- Better access to staging for testing
- Best:
- database write access
- ability to run scripts
- Acceptable:
- log viewing
- database browsing
- view config files
- view automated test output (Hudson?)
- Install/write some admin tools to accomplish this (may also be useful in production)