Places/Stats

Context

See Places-Stats.mozilla


Analysis

Goals:

  • Insights into usage of bookmarks and history
  • Define characteristics for test places databases
  • Use open source tools to create and iterate on reproducible analysis of the places stats data set
  • [Andyed] Investigate potential to gather updated stats for metrics tracked in historical research (% usage of bookmarks, % new urls visited, etc.)

Toolset

Code

See the Etherpad Page for the scratchpad

Load Data (save https://places-stats.mozilla.com/stats?format=csv locally)

places <- read.csv("...places.csv")

Compute age metrics

places$oldest_stamp = as.POSIXct(strptime(as.character(places$visit_date_oldest),format="%m/%d/%y %H:%M"))
places$newest_stamp = as.POSIXct(strptime(as.character(places$visit_date_newest),format="%m/%d/%y %H:%M"))
places$time_delta = difftime(places$newest_stamp,places$oldest_stamp, units="days")

Tags & Bookmark Metrics

places$bookmark_tagged_pct = (places$bookmark_cnt - places$bookmark_nontag_cnt )/ places$bookmark_cnt
places$folder_cnt_crrctd = places$folder_cnt - places$bookmark_cnt
places$user_of_tags <- ifelse(places$tag_cnt > 0, "1", "0")

Other Derived Values

places$percent_visits_new = places$places_visited_unique_cnt / places$moz_historyvisits_cnt
places$pages_per_day = places$moz_historyvisits_cnt / as.numeric(places$time_delta)

Subsets

taggers <- places[places$tag_cnt > 0,]
livemarkers <- places[places$livemark_container_cnt > 0,]
bookmarkers <- places[places$bookmark_cnt>20,]
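Once the subsets exist, one way to compare them is to tabulate the size and median of a single metric per subset. The following is a minimal sketch, not part of the original scratchpad: `toy_places` is a made-up stand-in for the real CSV, `subset_summary` is a hypothetical helper, and only the column names and filter conditions come from the code above.

```r
# Toy stand-in for the places data frame; all values are invented for illustration
toy_places <- data.frame(
  tag_cnt                = c(0, 3, 0, 10),
  livemark_container_cnt = c(0, 0, 2, 1),
  bookmark_cnt           = c(5, 40, 12, 200),
  pages_per_day          = c(10.5, 42.0, 7.25, 88.0)
)

# Same filters as the subsets above, applied to the toy data
toy_taggers     <- toy_places[toy_places$tag_cnt > 0, ]
toy_livemarkers <- toy_places[toy_places$livemark_container_cnt > 0, ]
toy_bookmarkers <- toy_places[toy_places$bookmark_cnt > 20, ]

# Hypothetical helper: row count and median of one metric, ignoring NA values
subset_summary <- function(df, metric) {
  c(n = nrow(df), median = median(df[[metric]], na.rm = TRUE))
}

# One row per subset, so the groups can be read side by side
comparison <- rbind(
  all         = subset_summary(toy_places, "pages_per_day"),
  taggers     = subset_summary(toy_taggers, "pages_per_day"),
  livemarkers = subset_summary(toy_livemarkers, "pages_per_day"),
  bookmarkers = subset_summary(toy_bookmarkers, "pages_per_day")
)
print(comparison)
```

The median is used here rather than the mean on the assumption that usage metrics are heavily skewed; swap in another statistic if that does not hold for the real data.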

Improvements to Data Collection

  • Add visit_type summation to capture usage of bookmarks and overall visitation patterns (link click, bookmark, etc)
    • select count(*) as N, visit_type from moz_historyvisits group by visit_type
  • Compute distribution metrics on session length to inform the design of trail-style history visualizations
    • sums of squares, cubes, and fourth powers characterize a distribution better than min, max, mean, and median alone
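To illustrate the point about power sums: given only the count and the sums of x, x^2, x^3, and x^4, the mean, variance, skewness, and kurtosis can be recovered without the raw values. The sketch below assumes those five numbers are what would be collected; `moments_from_power_sums` and the sample session lengths are hypothetical and not part of the existing data set.

```r
# Recover distribution moments from power sums alone.
# n is the number of observations; s1..s4 are the sums of x, x^2, x^3, x^4.
moments_from_power_sums <- function(n, s1, s2, s3, s4) {
  # Raw moments about zero
  m1 <- s1 / n
  m2 <- s2 / n
  m3 <- s3 / n
  m4 <- s4 / n
  # Central moments, from expanding (x - mean)^k
  c2 <- m2 - m1^2
  c3 <- m3 - 3 * m1 * m2 + 2 * m1^3
  c4 <- m4 - 4 * m1 * m3 + 6 * m1^2 * m2 - 3 * m1^4
  list(
    mean     = m1,
    variance = c2,            # population variance (divides by n, not n - 1)
    skewness = c3 / c2^1.5,
    kurtosis = c4 / c2^2      # equals 3 for a normal distribution
  )
}

# Hypothetical session lengths in minutes, for illustration only
session_len <- c(1, 2, 2, 3, 12)

result <- moments_from_power_sums(
  n  = length(session_len),
  s1 = sum(session_len),
  s2 = sum(session_len^2),
  s3 = sum(session_len^3),
  s4 = sum(session_len^4)
)
print(result)
```

Because each power sum is a plain total, sums reported by separate profiles can be added together before the moments are computed, which is not possible with per-profile medians.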

Problems with Places

The following problems in the database schema, APIs, or Places in general are preventing really interesting experimentation, analysis, or user experience:

  • Tab spawns... (fill me out!)

Related Research

  • Hartmut Obendorf, Harald Weinreich, Eelco Herder, Matthias Mayer. Web Page Revisitation Revisited: Implications of a Long-term Click-stream Study of Browser Usage. In: CHI 2007.