Tuesday, 16 January 2018

Bayes 2018/Bayesian Biostatistics

This year, our annual Bayes 20XX conference has been jointly organised with the MRC Biostatistics Unit Cambridge and is also a satellite event before the main ISBA conference in Edinburgh.

I believe that the call for abstracts is now officially open (or will be very, very shortly). We have a very good lineup of speakers, so, as usual, it should be very good and I already look forward to it!

Monday, 15 January 2018


I've been asked to post about the EuroCIM (European Causal Inference Meeting), which will be held later this year in Florence. I very happily oblige, because: a) this is usually a very good conference; b) it is organised by nice and obviously very good people (well $-$ at least I like them!); c) at a time when everything UK seems to move away from anything Euro, it's actually very nice to see a conference formerly known as UKCIM going fully Euro!

EuroCIM: the European Causal Inference Meeting, April 2018, Florence
We are pleased to announce that after five successful editions of the UK-CIM, the first European Causal Inference Meeting (EuroCIM) will take place in Florence, Italy, in April 2018. The meeting will be focused on “Causal Inference in Health, Economic and Social Sciences”. EuroCIM 2018 is organized by the Department of Statistics, Computer Science, Applications (DiSIA) and ARCO of the University of Florence, Italy.

Conference dates are Wednesday April 11 to Friday April 13 2018; the early bird deadline is January 17 and the deadline for submission of abstracts is February 1.

The conference will include keynote addresses. Moreover, on April 10 2018 four workshops will be offered by Rhian Daniel (Cardiff University), Johannes Textor (Radboud University Medical Center), Fabrizia Mealli (University of Florence) and Guido Imbens (Stanford Graduate School of Business). The conference will also feature presentations and a poster section that will give researchers and practitioners the opportunity to show their work.

For more info on the meeting, the fees, how to register and submit an abstract please visit: http://eurocim2018.arcolab.org/

Wednesday, 10 January 2018

MSc studentships @ UCL

Two National Institute for Health Research (NIHR) studentships in Medical Statistics are available for the 2018/19 academic year. The studentships cover tuition fees at the UK/EU rate and a maintenance stipend of £17,050 per annum (based on the standard UK Research Council rate with London weighting). All eligible applicants for the MSc Medical Statistics Course will automatically be considered. And: you don't have to be a Capricorn...

For further information please contact Dr Russell Evans (russell.evans@ucl.ac.uk)

Friday, 22 December 2017

A Bayesian analysis of polls in the Catalan elections

(Invited post by Virgilio Gómez-Rubio, UCLM, Albacete, Spain. Thanks Gianluca for the invitation!!)

I have been involved in the planning and analysis of survey polls almost since I came back to Albacete 9 years ago. The last few months in Spanish politics have been dominated by the 'Catalan referendum' and the call for new elections from the national government via article 155 of the Spanish Constitution (which had never been enforced before). These elections have been different for many reasons, so I decided to do a (last minute) analysis of the available polls to try to predict the allocation of seats in the elections.

The Catalan parliament has 135 seats, split into four electoral districts which correspond to the four provinces in the region, with a different number of seats depending on their population: Barcelona (85 seats), Gerona (17 seats), Lérida (15 seats) and Tarragona (18 seats). Seats are allocated according to the D'Hondt method.
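For readers who have not met the D'Hondt method before, here is a minimal sketch of it in Python. It is only an illustration, not the code used for the analysis. The function name `dhondt`, the party labels and the vote counts are all made up, and the sketch ignores any minimum-vote threshold and breaks ties by the order in which parties are listed.

```python
def dhondt(votes, seats):
    """Allocate `seats` among parties with the D'Hondt highest-averages rule.

    votes: dict mapping party label -> number (or proportion) of votes.
    Returns a dict mapping party label -> number of seats won.
    Assumptions of this sketch: no electoral threshold; ties go to the
    party listed first in `votes`.
    """
    allocation = {party: 0 for party in votes}
    for _ in range(seats):
        # A party's current quotient is its votes divided by (seats won + 1);
        # the next seat goes to the party with the largest quotient.
        winner = max(votes, key=lambda p: votes[p] / (allocation[p] + 1))
        allocation[winner] += 1
    return allocation


# Hypothetical vote counts for four made-up parties competing for 8 seats.
example = dhondt({"A": 100000, "B": 80000, "C": 30000, "D": 20000}, 8)
print(example)
```

Since only the ratios between parties matter, the same function works on vote proportions as well as on raw counts.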

Several polls have been published in the mass media, and the proportions of votes to parties (as well as sample size, etc.) are reported either at the regional level (which is useless for allocating seats per province) or at the province level. Given that most polls are aggregated at the regional level, it makes sense to combine both types of polls into a single model to provide some insight on the voters' preferences at the province level, which is what is needed to allocate the seats.

Bayesian hierarchical models are great at combining information from different sources. The model that I have considered now is very simple. The number of votes (reported in the poll) to each party at the regional level is assumed to follow a multinomial distribution with probabilities $P_i,\ i=1,\ldots,p$, where $p$ is the number of political parties. In this case, we have 7 main parties plus another group for 'other parties'. The probabilities $P_i$ are assigned a vague Dirichlet prior. The number of votes at the province level is assumed to follow a multinomial distribution as well, with probabilities $p_{i,j},\ i=1,\ldots,p,\ j=1,\ldots,4$. Both sets of probabilities are linked by assuming that $\log(p_{i,j})$ is proportional to $\log(P_i)$ plus a province-party specific random effect $u_{i,j}$. I have used this model before with good results.
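Written out, one possible reading of the model is the following. The symbols $Y_i$ and $N$ (votes and sample size in a regional poll), $y_{i,j}$ and $n_j$ (the same for a poll in province $j$), $\alpha$ and $\sigma$ are introduced here only for illustration. The link is read as "$p_{i,j}$ is proportional to $P_i\exp(u_{i,j})$, normalised over parties", and the normal prior on $u_{i,j}$ is an assumption, since the post does not say which prior was used.

$$
\begin{aligned}
(Y_1,\ldots,Y_p) &\sim \mbox{Multinomial}(N;\ P_1,\ldots,P_p) \\
(P_1,\ldots,P_p) &\sim \mbox{Dirichlet}(\alpha,\ldots,\alpha) \\
(y_{1,j},\ldots,y_{p,j}) &\sim \mbox{Multinomial}(n_j;\ p_{1,j},\ldots,p_{p,j}), \qquad j=1,\ldots,4 \\
p_{i,j} &= \frac{P_i\exp(u_{i,j})}{\sum_{k=1}^{p} P_k\exp(u_{k,j})} \\
u_{i,j} &\sim \mbox{Normal}(0,\sigma^2) \quad \mbox{(assumed)}
\end{aligned}
$$

Under this reading, $\log(p_{i,j}) = \log(P_i) + u_{i,j}$ minus a normalising constant that depends only on the province $j$, so the regional polls inform the $P_i$ directly while the province-level polls inform both the $P_i$ and the $u_{i,j}$.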

As simple as it is, this model allows the combination of polls at different aggregation levels. I have used JAGS to fit the model and to allocate the number of seats: the probabilities from the MCMC output are used to obtain 10,000 draws of the allocation of seats, by applying the D'Hondt rule to the proportion of votes to each party at the province level.
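The step from MCMC output to seats can be sketched roughly as below, in Python rather than in the JAGS/R code used for the analysis. Everything here is illustrative. The names `dhondt`, `total_seats` and `seat_distribution` are made up, a draw is assumed to be stored as a dictionary of province-level vote proportions, and the D'Hondt helper ignores thresholds and breaks ties by listing order.

```python
# Seats per province, as given in the post.
SEATS = {"Barcelona": 85, "Gerona": 17, "Lérida": 15, "Tarragona": 18}


def dhondt(votes, seats):
    """D'Hondt allocation (sketch: no threshold, ties go to the first party)."""
    allocation = {party: 0 for party in votes}
    for _ in range(seats):
        winner = max(votes, key=lambda p: votes[p] / (allocation[p] + 1))
        allocation[winner] += 1
    return allocation


def total_seats(shares_by_province, seats_by_province=SEATS):
    """Turn ONE posterior draw into a region-wide allocation of seats.

    shares_by_province: dict province -> {party: simulated vote proportion}.
    """
    totals = {}
    for province, shares in shares_by_province.items():
        won_here = dhondt(shares, seats_by_province[province])
        for party, won in won_here.items():
            totals[party] = totals.get(party, 0) + won
    return totals


def seat_distribution(draws):
    """Apply the allocation to every draw: one dict of seat totals per draw."""
    return [total_seats(draw) for draw in draws]
```

Applied to the 10,000 draws, `seat_distribution` would give 10,000 simulated parliaments, from which the distribution of seats for each party can be summarised.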

The next plot shows the distribution of seats obtained from the model against the actual distribution of seats:

I'd say the coverage is good for most parties. Polls did not show the loss of voters for CUP and Partido Popular (PP).

Another nice thing about being Bayesian (and using MCMC) is that other probabilities can be computed. For example, the next plot shows the posterior distribution of the number of seats allocated to pro-independence parties, so that the probability of them having a majority can be computed (59.86%):
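A probability like this is just the proportion of MCMC draws in which the event happens. A minimal Python sketch is below, assuming the simulated allocations are stored as a list of dictionaries, one per draw. The function name, the party labels and the four toy draws are all made up and have nothing to do with the actual output.

```python
def probability_of_majority(seat_draws, bloc, total_seats=135):
    """Proportion of draws in which the parties in `bloc` jointly hold
    an absolute majority of the seats."""
    needed = total_seats // 2 + 1  # 68 out of 135 seats
    hits = sum(
        1
        for draw in seat_draws
        if sum(draw.get(party, 0) for party in bloc) >= needed
    )
    return hits / len(seat_draws)


# Four toy draws for three hypothetical parties; X and Y form the bloc.
toy_draws = [
    {"X": 40, "Y": 30, "Z": 65},
    {"X": 35, "Y": 30, "Z": 70},
    {"X": 38, "Y": 30, "Z": 67},
    {"X": 30, "Y": 30, "Z": 75},
]
print(probability_of_majority(toy_draws, {"X", "Y"}))
```

With 10,000 draws the estimate is resolved to steps of 0.0001, which is consistent with a figure such as 59.86% being reported to two decimals.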

As I promised to have a shot for each seat allocated correctly, I've got some work left to do until the end of the Christmas break... Merry Christmas and Happy New Year!!!

Tuesday, 19 December 2017

Does Peppa Pig encourage inappropriate use of primary care resources?

This is a very important contribution to the medical literature, recently published in the BMJ.

I think the sample size is probably not large enough to warrant robust inference. And perhaps it would have been helpful to consider alternative settings, say to consider the wide diversity in the target population of Ben and Holly's Little Kingdom, just to give an example.

But I do applaud the effort of the author!

Monday, 18 December 2017


Last week, Kristian Lum wrote a blog post to report her experience of inappropriate behaviour by some senior male colleagues at statistical conferences (ISBA and JSM, in particular).

I don't personally know Kristian, although I think I did have lunch with her, a mutual friend and a bunch of other people, at JSM in Montreal in 2013. Anyway, even if I were completely agnostic about the whole thing (and I don't think I am...), it seems to me that her account has been corroborated by some hard facts as well as discussion with other friends/colleagues who actually know her rather well. So while it's important to avoid "courts martial", I think the discussion here isn't really about whether these things happened or not (which at this point I'm pretty sure they did $-$ just to clarify).

I've been left with mixed feelings and a sense of kind-of-having lost my bearings, since I found this out last week. Firstly, I am not surprised to hear that such things can happen at a conference or in academia, in general. What has kind of surprised me is the fact that while I do move more or less in those circles, I wasn't aware of the reputation of the two people who have been named. Some people (for example here) have made the point that these stories were well known, and Kristian said so herself in her blog post. As somebody who's involved in ISBA, this is troubling and I kind of feel like we've buried our collective head in the sand, possibly for a very long time. To be fair, ISBA is now coming up with a task-force to create protocols and prevent issues such as these arising again in the future. Still, doesn't feel particularly good...

Secondly, this may be some sort of self-preservation (or maybe denial?) instinct and maybe there is indeed a much more rooted problem in statistics and in fact in Bayesian statistics, which I struggle to see because it hurts to think that the environment in which I work is actually flawed in bad ways. But what I mean is that perhaps it's not like there's a couple of areas in which bad guys operate and if only we could get rid of those bad guys in those areas, then society would be idyllic. I think that, unfortunately, there's plenty of examples where people with/in power (statistically more likely to be white men) do behave badly and abuse their power in many ways, including sexually. Maybe our field does represent men disproportionately $-$ and it may well be that this is even truer for Bayesian statistics than for other branches of statistical science. And so, as painful as it is to realise quite clearly that the grass ain't so green after all, it is what it is. But the problem is (much) bigger than that...

Finally, I've particularly liked my friend Julien's Facebook post (I actually see now that he was in fact linking to somebody else's tweet):
Retweeted Carlos Scheidegger (@scheidegger):
We should all read and acknowledge @KLdivergence's and other women's harrowing stories. But I want to try something different here. Do you all know of her amazing work at @hrdag? This, on predictive policing, is so good https://t.co/YDsijFsiT2 https://t.co/GbwgKzSgMb

Dan's post has some lengthy discussion about the use of the term "mediocre" to characterise the two offenders. I think that neither mediocrity (= how poor one is at their work) nor excellence (= how good one is at their work) should be excuses $-$ but I see how this may matter because, arguably, the better and more respected you are in your field, the more power you wield over junior colleagues... But I think it feels right to point out the quality of Kristian's work. Somehow, it seems to put things in a better perspective, I think.

Tuesday, 21 November 2017


Recently, I've been doing a lot of work on the beta version of BCEA (I was after all born in Agrigento $-$ in the picture to the left $-$, which is a Greek city, so a beta version sounds about right...). 

The new version is only available as a beta-release from our GitHub repository $-$ the usual way to install it is through the devtools package.

There aren't very many changes from the current CRAN version, although the one thing I did change is kind of big. In fact, I've embedded the web-app functionalities within the package. So, it is now possible to launch the web-app from the current R session using the new function BCEAweb. This takes three inputs as arguments: a matrix e containing $S$ simulations for the measures of effectiveness computed for the $T$ interventions; a matrix c containing the simulations for the measures of costs; and a data frame or matrix containing simulations for the model parameters.

In fact, none of the inputs is required and the user can actually launch an empty web-app, in which the inputs can be uploaded, say, from a spreadsheet (there are in fact other formats available).

I think the web-app facility is not necessary when you've gone through the trouble of actually installing the R package and you're obviously using it from R. But it's helpful, nonetheless, for example in terms of producing some standard output (perhaps even more than the actual package $-$ which I think is more flexible) and of reporting, with the cool facility based on pandoc.

This means there are a few more packages "suggested" on installation and potentially a longer compilation time for the package $-$ but nothing major. The new version is under testing but I may be able to release it on CRAN soon-ish... And there are other cool things we're playing around with (the links here give all the details!).

Monday, 20 November 2017

La lotteria dei rigori

Seems like my own country has kind of run out of luck... First we fail to qualify for the World Cup, then lose the right to host the relocated headquarters of the European Medicines Agency, post Brexit. If I were a cynical expat, I'd probably think that the former will be felt as the worse defeat across Italy. Maybe it will.

As I've mentioned here, I'd been talking to Politico about how the whole process looked like the Eurovision. I think the actual thing did have some elements of it $-$ earlier today, on the eve of the vote, it appeared like Bratislava was the hot favourite. This kind of reminded me of the days before the final of the Eurovision, when one of the acts is often touted as the sure-thing, often over and above its musical quality. And I do believe that there's an element of "letting people know that we're up for hosting the next one" going on to pimp up the experts' opinions. Although sometimes, as it turns out, the favourites are not so keen in reality $-$ cue their poor performance come the actual thing...

In the event, Bratislava was eliminated in the first round. The contest went all the way to extra time, with Copenhagen dropping out at the semifinals and Amsterdam and Milan contesting the final head-to-head. As the two finalists got the same number of votes (with I think one abstaining), the decision was made on luck $-$ basically on penalties, or as we say in Italian, la lotteria dei rigori.

I guess there must have been some thinking behind the set-up of the voting system that, in case it came down to a tie at the final round, both remaining candidates would be "acceptable" (if not to everybody, at least to the main players) and so they'd be happy for this to go 50:50. And so Amsterdam it is!