Earlier this week I was at the Bayes 2016 meeting, in lovely Leuven. Although I've been to Belgium quite a few times before, this was my first trip to Leuven $-$ somebody who used to work at UCL once told me that she didn't really like the place, which she said was boring and not-so-nice. So, when I got there I was expecting something totally different from the gorgeous cobbled streets with lots of restaurants, buzzing with students. Of course, as I came to realise only later, she was talking about a different Leuven, which I didn't really know about... As it turns out, I didn't really know much about Belgium in general $-$ so over and above the scientific merits of the workshop, this has been a nice formative experience!
As for the more statsy bits of the conference, as usual I enjoyed them very much $-$ we always try to make a point of getting talks of a high methodological level from both academia and industry, which in my opinion makes for a very nice three days!
In addition, the social part of being at Bayes 20XX is also usually very attractive $-$ and this year has not disappointed. At the end of the first day of the conference, we had a beer-tasting tour $-$ as Emmanuel put it, that was really a study to find the maximum tolerated dose: we were there to determine at what point in the escalation of the beer alcohol percentage we would seriously need to stop (we passed 5% without problems, then worked our way to an 8% and then had to go for dinner after a 10%)...
Anyway, I think the programme was packed with very interesting talks and hopefully we'll soon be able to upload the presentations! Next year we'll go to Albacete, where Virgilio will play host. In keeping with our grand tradition, several people have been ambushed and as a consequence we do have quite a few candidates to host the following edition in 2018...
Monday, 23 May 2016
Sunday, 22 May 2016
BCEA 2.2-3 is out
The newest release of BCEA, our R package to standardise and post-process the output of a health economic model, is now available from CRAN $-$ in fact, the source code is also available here. The package is rather stable, so the changes aren't many, but the few there are are quite substantial, I think. In particular, we've now modified the function evppi, which is used to perform the analysis of the expected value of partial perfect information, or EVPPI (incidentally, that's also related to our upcoming short course).
In the last few years this has been a very interesting and fertile area of research within the health economics community, with several new methods being proposed $-$ this is a nice editorial by Nicky Welton and Howard Thom, while this is (an arXived version of) our own technical review.
BCEA implements all the most recent methods, with particular focus on the method by Strong et al, based on Gaussian Process regression, and our own work (just published in Statistics in Medicine), which, building on their work, uses INLA to speed up the computation even further. In addition, we have also included a graphical tool that can be used to describe, at least as a first order approximation, the individual impact of each parameter on the overall uncertainty in the decision-making process. We have called this the info-rank plot, which is basically a generalisation of the commonly used Tornado plots (especially common when economic evaluations are performed under a frequentist approach). The info-rank is based on the single-parameter EVPPI and can be used to roughly determine the contribution of each single parameter to the overall value of partial information (of course, because the EVPPI is a highly non-linear function, the values for combinations of parameters are not additive, so some caution is needed here).
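For reference, the non-additivity can be read off a standard way of writing the EVPPI. The notation here is an assumption made for illustration and is not taken from the package: the model parameters are split as $\theta=(\phi,\psi)$, where $\phi$ are the parameters of interest, and $\mbox{NB}_t(\theta)$ is the net benefit of intervention $t$.

$$\mbox{EVPPI}(\phi) = \mbox{E}_{\phi}\left[\max_t \mbox{E}_{\psi\mid\phi}\left[\mbox{NB}_t(\theta)\right]\right] - \max_t \mbox{E}_{\theta}\left[\mbox{NB}_t(\theta)\right]$$

Because the maximisation sits inside the outer expectation, in general $\mbox{EVPPI}(\phi_1,\phi_2) \neq \mbox{EVPPI}(\phi_1) + \mbox{EVPPI}(\phi_2)$, which is why a ranking based on single-parameter values is only a rough guide.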
Monday, 9 May 2016
How to be Bayesian and spare yourself a dreadful afternoon with your stupid football team losing the derby
Yesterday was the second-last game of the Italian Serie A; I've been a Sampdoria supporter since I was 12 $-$ at that time, they were starting to become one of the best clubs in Serie A (and that was back in the 80's when Serie A was arguably the best league in the world), although they hadn't won anything and didn't have prospects for that season either. But they were a young, good side, playing nicely and so I kind of fell in love with them (and their shirt). Then they did become a very good side, winning the league and a few more trophies $-$ so good timing on my part! But then they also reverted to some relative mediocrity $-$ of course, once you've decided you support a team, you're stuck with them no matter what.
Anyway, this season has been rather crappy and yesterday it was a crucial game: we were playing the derby against local rivals Genoa, entering the game with 40 points and two games left in the campaign. Two teams couldn't reach us any more (as they were trailing by over 6 points). But at least one of Carpi and Palermo could still overtake us if we lost our two remaining games and they won all of theirs. Also, Udinese were just one point behind us so they too could overtake us, technically. With three teams being relegated, we weren't mathematically safe yet.
So, that was kind of nerve-racking and earlier last week I thought about this a bit. I had a bad feeling about our game, because we've not been great lately (in the previous game we were beaten by Palermo) and, clearly, Genoa would try really hard to mess it up for us... But, irrespective of the outcome of the derby, if at least one of Carpi, Palermo and Udinese failed to win their match we would be safe (as there wouldn't be enough points left for all of them to catch us). Carpi played at home against Lazio, whose season hasn't been great either, but who were already safe and with not much else to fight for, except a strong finish; Palermo were away at Fiorentina, who theoretically were still fighting for a Europa League qualification and so should have something to play for; Udinese were away at Atalanta, who much like Lazio were mathematically safe and with not much to play for.
Although one can make a much more complex model, I reasoned that instead of the actual results, the only thing that mattered was the chance that each of the three teams behind us would win and so I set up a model with $ y_{\rm{Car}} \sim \mbox{Bernoulli}(\theta_{\rm{Car}})$, $y_{\rm{Pal}} \sim \mbox{Bernoulli}(\theta_{\rm{Pal}})$ and $y_{\rm{Udi}} \sim \mbox{Bernoulli}(\theta_{\rm{Udi}})$ where the "success" would in fact be the worst possible outcome, ie a win for them.
Then I set up some priors: I reasoned that because they were playing at home, Carpi may have a slightly higher chance of winning the game $-$ I figured something about 35%. Also, I thought (hoped) that Lazio wouldn't be a walkover and so I assumed that 90% of the mass for the chance of Carpi winning their game was below 45%. These can be turned into an informative Beta(15.80107,28.4877) prior $-$ it's fairly easy to work out the parameters of a Beta distribution given the mode (0.35, in this case) and some percentile (0.45 as the 90th percentile, in this case); Christensen et al (page 100) show some theory, while this is some relevant R code.
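As a rough illustration of that last step, here is a minimal sketch of how a Beta prior could be found from a mode and a percentile. This is not the R code linked above: the helper name beta_from_mode and its arguments are made up for the example, and it simply uses the fact that, for $a,b>1$, the mode of a Beta$(a,b)$ is $(a-1)/(a+b-2)$, then searches numerically for the value of $a$ that puts the required mass below the chosen percentile.

```r
# Hypothetical helper (an assumption for illustration, not the linked code):
# find Beta(a, b) with a given mode and with P(theta <= q) = p
beta_from_mode <- function(mode, q, p = 0.9, upper = 1000) {
  # For a, b > 1 the mode is (a - 1) / (a + b - 2); solve that for b
  b_of_a <- function(a) (a * (1 - mode) + 2 * mode - 1) / mode
  # Distance between the mass below q and the target p, as a function of a
  f <- function(a) pbeta(q, a, b_of_a(a)) - p
  # Root search over a > 1 (needs mode < q and q < p for a sign change)
  a <- uniroot(f, interval = c(1 + 1e-8, upper), tol = 1e-10)$root
  list(a = a, b = b_of_a(a))
}

# Home team: mode 0.35, 90% of the mass below 0.45
prior_car <- beta_from_mode(mode = 0.35, q = 0.45, p = 0.9)
# Away teams: mode 0.20, 90% of the mass below 0.40
prior_away <- beta_from_mode(mode = 0.20, q = 0.40, p = 0.9)
print(unlist(prior_car))
print(unlist(prior_away))
```

The root search is a convenience choice; any one-dimensional solver would do, since once the mode is fixed there is only one free parameter left.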
This is effectively the prior I was assuming:
and I thought it was just about reasonable (the dotted vertical lines indicate a rough estimate of the 95% prior credible interval). Then I did something similar to derive the priors for a Palermo and Udinese win $-$ because they were playing away, I figured their most likely chance of winning would be around 20%, with 90% of the mass below 40%, which can be turned into a Beta(3.279775,10.1191) prior, looking like this:
Again, I was relatively happy with this and so used these priors in my model, which one could code in R as something like
p.car <- rbeta(10000,15.80107,28.4877) # P(win): mode .35, with 90% of the mass <= .45
p.pal <- rbeta(10000,3.279775,10.1191) # P(win): mode .2, with 90% of the mass <= .4
p.udi <- rbeta(10000,3.279775,10.1191) # P(win): mode .2, with 90% of the mass <= .4
p.safe <- 1-(p.car*p.pal*p.udi)        # 1 - P(all three win), assuming independence
The most important variable in the model is the probability of Sampdoria being mathematically certain of avoiding relegation, p.safe, which is 1 minus the probability of the worst happening $-$ this assumes independence in the three games for Palermo, Carpi and Udinese; in general that's probably not the best assumption, but in this case they kind of had to win to have a good shot at safety themselves and so I think it's OK to assume independence. The results were kind of reassuring:
$-$ I got an estimated posterior average of 97.8% with a 95% credible interval of 93.8 to 99.7%.
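For anyone wanting to reproduce this kind of summary, the following is a minimal sketch: it simulates from the three Beta distributions given above, computes the average and a 95% interval for p.safe, and compares the simulated average with the value implied by independence, using the fact that the mean of a Beta$(a,b)$ is $a/(a+b)$. The seed and the names n.sims, mc.mean, mc.int, beta.mean and exact.mean are choices made for the example, not part of the original analysis.

```r
set.seed(2016)   # arbitrary seed, only to make the sketch reproducible
n.sims <- 10000

# Simulate the chance of a win for each of the three teams
p.car <- rbeta(n.sims, 15.80107, 28.4877)
p.pal <- rbeta(n.sims, 3.279775, 10.1191)
p.udi <- rbeta(n.sims, 3.279775, 10.1191)

# Chance that at least one of them fails to win (assuming independence)
p.safe <- 1 - (p.car * p.pal * p.udi)

# Monte Carlo summaries: average and 95% interval
mc.mean <- mean(p.safe)
mc.int <- quantile(p.safe, c(0.025, 0.975))

# Under independence the expectation factorises: E[p.safe] = 1 - product of means
beta.mean <- function(a, b) a / (a + b)
exact.mean <- 1 - beta.mean(15.80107, 28.4877) * beta.mean(3.279775, 10.1191)^2

print(c(simulated = mc.mean, exact = exact.mean))
print(mc.int)
```

The comparison with exact.mean is only a sanity check on the simulation; the interval has no such closed form and has to come from the simulated values.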
I am not really one to stay at home on a Sunday just to watch the football game (so perhaps I'm not really a football fan?) and we'd planned to see some friends, but this reassured me that we shouldn't be in too much trouble, even if we lost the derby. In the event, Kobi wasn't great (possibly as a result of venturing an outing at the seaside on Saturday) and so we stayed at home $-$ but I decided not to bother with watching the game (again: a) a bold move for a real football fan, confident about his team; b) a cowardly move from a real football fan scared of what the outcome may be; c) not a real football fan).
We did lose the derby very badly, but Carpi, Palermo and Udinese all failed to win their games, which means we are safe. I'm glad I didn't watch the game...