Thursday 13 June 2013

Big in Japan

Inspired by this post on R-bloggers, I decided to check how BCEA was doing. Unfortunately, it does not feature in the top 100 most downloaded R packages. However, I think it's doing well, considering that the book (which is the main means of advertising the package) has been out for only a few months (since October last year) and that it's fairly specialised software, which you basically only need if you do health economic evaluations...

I've used some simple R code to download the log files containing all hits to http://cran.rstudio.com/ since October 2012. Once the files (in .csv format, compressed as .gz, one per day) are downloaded, I have R decompress each one and build a table containing only the records for BCEA.

The resulting dataset contains the date and time at which the package was downloaded from CRAN, some information about the R version and architecture used by whoever downloaded it, and their country.

Overall, BCEA has been officially downloaded 862 times (I suppose I should have a big celebration as soon as I hit 1000); most often, the download was from a user in the US (185). Surprisingly, BCEA is big in Japan (135 downloads). I did not see this coming, I have to say, but 日本ありがとう (that's "thank you Japan", for those of you who can't speak Japanese or can't use Google Translate).
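Counts like these come from tabulating the country column of the filtered log. Here is a minimal sketch of that step, using a small made-up data frame: the name `logs` and its six rows are invented for illustration and are not real download records.

```r
# Made-up records standing in for the filtered BCEA log (assumption:
# the real table has at least a 'package' and a 'country' column).
logs <- data.frame(
  date = as.Date(c("2012-10-03", "2012-10-03", "2012-11-15",
                   "2013-01-20", "2013-02-02", "2013-03-08")),
  package = "BCEA",
  country = c("US", "JP", "US", "GB", "JP", "US"),
  stringsAsFactors = FALSE
)

# Total number of downloads: one row per download
total <- nrow(logs)

# Downloads per country, largest first
by_country <- sort(table(logs$country), decreasing = TRUE)

print(total)
print(by_country)
```

On the real table, the first entries of `by_country` would be the figures quoted for the US and Japan.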

Here's the (quickly prepared and hence not particularly elegant, nor necessarily super-efficient) code to download and format the data:
# Dates covered by the logs
start <- as.Date('2012-10-01')
today <- as.Date('2013-06-12')
all_days <- seq(start, today, by = 'day')
year <- as.POSIXlt(all_days)$year + 1900
# One compressed log file per day, stored under the year
urls <- paste0('http://cran-logs.rstudio.com/', year, '/', all_days, '.csv.gz')
file <- basename(urls)
# Read the first day and keep only the records for BCEA
download.file(urls[1], file[1])
data <- read.table(gzfile(file[1]), sep = ",", header = TRUE)
data <- data[data$package == "BCEA", ]
# Do the same for the remaining days, appending to the table
for (i in 2:length(urls)) {
  download.file(urls[i], file[i])
  tmp <- read.table(gzfile(file[i]), sep = ",", header = TRUE)
  tmp <- tmp[tmp$package == "BCEA", ]
  data <- rbind(data, tmp)
}
# Drop records with missing values in any column
data <- na.omit(data)
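The loop grows `data` with `rbind` one day at a time, which is part of why the code is not super-efficient. One possible alternative, sketched here under assumptions, is to read each file with a small helper and bind everything once at the end. The helper `read_pkg_log`, the function `write_fake_log` and the two tiny log files are hypothetical, made up so that the example runs without downloading anything. Unlike the script above, the sketch does not call `na.omit`.

```r
# Hypothetical helper (not from the post): read one daily log file and
# keep only the rows for a given package.
read_pkg_log <- function(path, pkg = "BCEA") {
  day <- read.csv(gzfile(path), stringsAsFactors = FALSE)
  day[which(day$package == pkg), , drop = FALSE]
}

# Two tiny made-up log files, so the sketch runs without any downloads.
log_dir <- tempfile("cran-logs-")
dir.create(log_dir)
write_fake_log <- function(day, packages, countries) {
  path <- file.path(log_dir, paste0(day, ".csv.gz"))
  con <- gzfile(path, "w")
  write.csv(data.frame(date = day, package = packages, country = countries),
            con, row.names = FALSE)
  close(con)
  path
}
paths <- c(
  write_fake_log("2012-10-01", c("BCEA", "ggplot2"), c("US", "GB")),
  write_fake_log("2012-10-02", c("BCEA", "BCEA", "Stem"), c("JP", "US", "IT"))
)

# Read every file, then bind the per-day tables once at the end.
bcea <- do.call(rbind, lapply(paths, read_pkg_log))
print(nrow(bcea))
```

With the real logs, `paths` would simply be the vector of downloaded file names.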

3 comments:

  1. Japan is also in second place in the ranking for my package (Stem) ...

    US  JP  DE  CN  AT  ES  IT  IN  BR  CA  GB  KR  CO  NL  AU  ...
    314 129 55  42  40  40  33  32  29  27  26  25  22  21  19  ...

    :-)

  2. I have just re-run the script... well over 1000!

  3. Damn! We missed the celebration... This means we *need* to make sure we have a proper party for download no. 2000! ;-)
