Gianluca Baio's blog
Bayesian statistics, health economics and random stuff

Wednesday, 27 August 2014

Next week I'm off to the RSS conference in Sheffield, where I'll present our work on the Eurovision contest. I'm quite excited about going back to Sheffield, where, some time in the last century, I spent a semester as an Erasmus exchange student. In fact, this will be the first time I'm back since then; I've spoken with a few friends who tell me that the city has changed so much in the last few years, so I'm curious to see what I'll make of it.

I have fond memories of my experience and I'm very glad I took that opportunity (although at the time the exchange rate between Italian Lire and the Pound was 2 million to 1...). In particular, I'm glad I got to be at one of the last editions of the Pyjama Jump!
Wednesday, 20 August 2014
Workshop on Efficient Methods for Value of Information
Nicky Welton has invited me to talk at a very interesting workshop, which she has organised at the University of Bristol (here's a flyer). The day will be about the recent (and current) development of methods to perform calculations of the expected value of information, particularly for specific parameters (EVPPI).
This is something that is extremely relevant in health economic evaluation and I've already implemented a couple of possible methods in BCEA (in fact, we're writing more on this in the BCEA book; we're running a little late on the original timeline, but it's looking good and I'll post more on this later).
At the workshop, Chris Jackson and I will give a joint talk describing how the different methods can be integrated into a single, general framework. We'll also have a new PhD student who will start working in September on a fully Bayesian extension of Gaussian Process approximations to the net benefit. Pretty exciting stuff (if you're easily excited by these kinds of things, that is...).
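Incidentally, the core calculation underlying all of these methods is simple to sketch. Here is a minimal Monte Carlo illustration of the (simpler) overall EVPI — this is not BCEA's actual code, and the net-benefit distributions and all numbers are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(42)
n_sims = 10_000

# Simulated net-benefit samples for two treatments; in a real analysis these
# would come from the probabilistic sensitivity analysis of a decision model
nb = np.column_stack([
    rng.normal(loc=100, scale=40, size=n_sims),   # treatment 0
    rng.normal(loc=110, scale=60, size=n_sims),   # treatment 1
])

# Decide now, under current information: pick the treatment with highest mean NB
max_expected_nb = nb.mean(axis=0).max()

# Decide with perfect information: pick the best treatment in each simulation
expected_max_nb = nb.max(axis=1).mean()

evpi = expected_max_nb - max_expected_nb   # always non-negative
print(f"EVPI per decision: {evpi:.2f}")
```

The EVPPI methods discussed at the workshop refine this by isolating the value of learning specific parameters, which is computationally much harder.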
Thursday, 14 August 2014
(Some) Spaces available
Requests for registration to our short course on Bayesian methods in Health Economics are coming in steadily; in fact, we started advertising quite in advance (the course is in November), but we're nearly booked up.
We've set a cap of 30 participants, so hurry up if you're interested!
Monday, 11 August 2014
Two weights and two measures?
This is an interesting story about the Meningitis B vaccine (some additional background here and here). In a nutshell, the main issue is that vaccines are subject to slightly different regulation than other "normal" drugs. For example, patents do not really apply to vaccines (I believe the argument is that the composition is so difficult to set up that in effect there is no point in patenting it, although there may be more to this...).
More to the point, unlike "normal" drugs or health interventions, the economic evaluation of vaccines in the UK is within the remit of a special body, the Joint Committee on Vaccination and Immunisation (JCVI), rather than NICE.
On the one hand, this is perfectly reasonable, as, arguably, vaccines do have some specific characteristics that make modelling and evaluation slightly more complex; for example, vaccination is usually associated with phenomena such as herd immunity (the more people are vaccinated, the more people are directly or indirectly protected). While it is essential to include these dynamic aspects in modelling, it also makes for more complicated mathematical/statistical structures.
On the other hand, however, this raises the question as to whether it makes sense at all to try and evaluate these very special interventions using the same yardstick used for the others (eg cost-utility/effectiveness analysis). Or whether the thresholds for cost-effectiveness should be the same; after all, infectious diseases may impose an incredible burden during epidemics and so, arguably, effective interventions may be worth more than the usual £20-30,000 per QALY.
There are all sorts of related issues (some of which are perhaps more of a political nature, for example in terms of the overall evaluation process, in direct comparison to what NICE do); I think I'll discuss them some more at a later stage. But this is interesting nonetheless, also from a technical point of view.
In my opinion, the point is that, all the more for infectious diseases, continuously reassessing the evidence and its implications for modelling is absolutely fundamental. Techniques such as the value of information (some discussed and available in BCEA) should be used more widely. And both regulators and industry should be open to this sort of stepwise approach to marketing.
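To see why the dynamic (herd immunity) aspects mentioned above matter so much for the economics, here's a toy discrete-time SIR model with vaccination; all parameter values are hypothetical and purely illustrative, not taken from any actual appraisal:

```python
def final_attack_rate(coverage, r0=1.8, days=365):
    """Final fraction of the unvaccinated population infected, from a
    simple discrete-time SIR simulation (daily steps)."""
    gamma = 1 / 7                 # recovery rate: 1-week infectious period
    beta = r0 * gamma             # transmission rate
    i = 1e-4                      # small infectious seed
    s = 1 - coverage - i          # vaccinated are removed from the susceptible pool
    r = 0.0
    for _ in range(days):
        new_inf = beta * s * i
        s, i, r = s - new_inf, i + new_inf - gamma * i, r + gamma * i
    return r / (1 - coverage)     # attack rate among the unvaccinated

print(final_attack_rate(0.0))    # no vaccination: a large epidemic
print(final_attack_rate(0.5))    # 50% coverage pushes effective R below 1
```

The point is the non-linearity: at 50% coverage the epidemic essentially dies out, so the unvaccinated benefit too. A static "cost per vaccinated case averted" calculation would miss this entirely, which is exactly why transmission-dynamic models are needed.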
Friday, 25 July 2014
Pat pat
This is probably akin to an exercise in self-congratulation, but I'll indulge anyway, to celebrate the fact that our paper on the bias in the Eurovision song contest voting (the last in a relatively long series of posts on this is here) now has over 4,000 "article views".
The Journal of Applied Statistics website defines these as: "Article usage statistics combine cumulative total PDF downloads and full-text HTML views from publication date [23 Apr 2014, in our case] to 23 Jul 2014. Article views are only counted from this site."
In case you're wondering, neither Marta nor I have actually downloaded the paper to boost the numbers!
Monday, 7 July 2014
The Oracle (8): let's go all the way!
This is (maybe) the final post in the series dedicated to the prediction of the World Cup results; I'll try and write another to wrap things up and summarise a few comments, but that will probably come a bit later on. Finally, we've decided to use our model, which so far has been applied incrementally, ie stage-by-stage, to predict the results of both the semi-finals and the final.
The first part is relatively straightforward: the quarter-finals have been played and we know the results. Thus, we can reiterate the procedure (which we described here) and i) update the data with the observed results; ii) update the "current form" variable and the offset; iii) re-run the model to estimate each team's propensity to score; iv) predict the results of the unobserved games; in this case, the two semi-finals (Brazil-Germany and Argentina-Netherlands).
However, to give the model a nice twist, I thought we should include a piece of extra information that is available right now, ie the fact that Brazil will, for certain, play their semi-final without their suspended captain Thiago Silva and their injured "star player" Neymar (who will also miss the final, due to the severity of his injury). Thus, we ran the model after modifying the offset variable (see a more detailed description here) for Brazil, to slightly decrease their "short-term" quality. [NB: if this were a "serious" model, we would probably try to embed these changes in a more formal way, rather than as "ad hoc" modifications to the general set-up. Nevertheless, I believe that the possibility of dealing with additional information, possibly in the form of subjective/expert knowledge, is actually a strength of the modelling framework. Of course, you could say that the choice of the offset distribution is arbitrary and that other options were possible; that's of course true, and a "serious" model would certainly require more extensive sensitivity analysis at this stage!]
Using this formulation of the model, we get the following results, in terms of the overall probability of going through to the final (ie accounting for potential draws within the 90 minutes, followed by extra time and possibly penalties, as discussed here):
Brazil 0.605 - Germany 0.395
Argentina 0.510 - Netherlands 0.490
So, the second semi-final is predicted to be much tighter (nearly 50:50), while Brazil are still favourites to reach the final, according to the model prediction.
As I said earlier, however, this time we've gone beyond the simple one-step prediction and have used these results to also re-run the model before the actual results of the semi-finals are known, and thus predict the overall outcome, ie who wins the World Cup.
Overall, our estimation gives the following probabilities of winning the championship (these may not sum to 1 because of rounding):
Brazil: 0.372
Germany: 0.174
Argentina: 0.245
Netherlands: 0.206
Of course, these probabilities encode extra uncertainty, because we're going one extra step forward into the future; we don't know which of the potential futures will occur in the semi-finals. Leaving the model aside, I think I would probably like the Netherlands to win, if only for the fact that, that way, Italy would still be the 2nd most frequent World Cup winners, only one title behind Brazil, and one and two above Germany and Argentina, respectively.
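For the record, the probability bookkeeping behind this final step can be sketched quite simply. The semi-final probabilities below are the model's own (reported above), but the head-to-head probabilities for each possible final are illustrative placeholders, not the model's output (which simulates full score lines instead):

```python
# Semi-final win probabilities from the model (reported above)
semis = [(("Brazil", "Germany"), 0.605),
         (("Argentina", "Netherlands"), 0.510)]

# Hypothetical P(row team beats column team) in the final -- placeholders only
p_final = {("Brazil", "Argentina"): 0.55, ("Brazil", "Netherlands"): 0.58,
           ("Germany", "Argentina"): 0.48, ("Germany", "Netherlands"): 0.50}

def reach_final(team):
    """P(team wins its semi-final)."""
    for (a, b), p in semis:
        if team == a:
            return p
        if team == b:
            return 1 - p
    raise ValueError(team)

def beats(a, b):
    """P(a beats b in the final), symmetrised from the table above."""
    return p_final[(a, b)] if (a, b) in p_final else 1 - p_final[(b, a)]

# Total probability: reach the final, then beat whichever opponent shows up
win = {}
for t, opponents in [("Brazil", ["Argentina", "Netherlands"]),
                     ("Germany", ["Argentina", "Netherlands"]),
                     ("Argentina", ["Brazil", "Germany"]),
                     ("Netherlands", ["Brazil", "Germany"])]:
    win[t] = reach_final(t) * sum(reach_final(o) * beats(t, o) for o in opponents)

for t, p in win.items():
    print(f"{t}: {p:.3f}")
```

By construction the four probabilities sum to 1, and the extra uncertainty about which final will actually be played is averaged over via the law of total probability.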
Thursday, 3 July 2014
The Oracle (7)
We're now down to the 8 teams left in the World Cup. Interestingly, despite a pretty disappointing display by some of the (more or less rightly so) highly rated teams, such as Spain, Italy, Portugal or England, European sides make up exactly 50% of the lot. Given the quarter-final game between France and Germany, at least one European team is certain to reach the semi-finals. Also, it is worth noticing that the 8 remaining teams are all group winners, which kind of confirms Michael Wallace's point.
We've now re-updated the data and the "form" and "offset" variables (as briefly explained here) using the results of the round of 16. The model had predicted (as shown in the graphs here) wide uncertainty for the potential outcomes of the games (also, we had not included the added complication of extra time & penalties; more on this later). I believe this has been confirmed by the actual games. In many cases (in fact, probably all but the Colombia-Uruguay game, which was kind of dominated by the former), the games have been very close. As a result, we've observed a slightly higher than usual proportion of games going to extra time.
So, we've also complicated (further!) our model by including extra time and penalties in the estimation of the result. In a nutshell, when the game is predicted to be a draw (ie the predicted number of goals scored by the two teams is the same), we additionally simulate the outcome of extra time.
In doing this, we've used the same basic structure as for regular time, but we've added a decremental factor to the linear predictor (describing the "propensity" of team A to score when playing against team B). This makes sense, since the duration of extra time is 1/3 of the normal game; also, there is added pressure and teams normally tend to be more conservative. Thus, in this prediction, we've increased the chance of observing 0 goals and accounted for the shorter time played. If the prediction is still a draw, we've determined the winner by assuming that penalty shoot-outs are essentially a randomising device: each team has a 50% chance of winning them.
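A minimal sketch of this simulation step, using generic Poisson scoring rates in place of the model's fitted linear predictor (the rates and the exact 1/3 scaling for extra time are illustrative assumptions, not the model's estimates):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_match(rate_a, rate_b, n=100_000, extra_scale=1/3):
    """Overall win probabilities for teams A and B, allowing for extra
    time (reduced scoring rate) and penalties (a fair coin)."""
    goals_a = rng.poisson(rate_a, n)
    goals_b = rng.poisson(rate_b, n)
    drawn = goals_a == goals_b
    # Extra time: same structure, decremented scoring rate (30 min vs 90)
    goals_a = goals_a + drawn * rng.poisson(rate_a * extra_scale, n)
    goals_b = goals_b + drawn * rng.poisson(rate_b * extra_scale, n)
    a_wins = goals_a > goals_b
    # Penalties as a pure randomising device: 50-50
    still_drawn = goals_a == goals_b
    coin = rng.random(n) < 0.5
    p_a = (a_wins | (still_drawn & coin)).mean()
    return p_a, 1 - p_a

p_brazil, p_colombia = simulate_match(1.5, 1.1)   # hypothetical rates
print(round(p_brazil, 3), round(p_colombia, 3))
```

Note that the two probabilities sum to 1 by construction: every simulated game produces exactly one winner once penalties are included.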
These are the contour plots for the posterior predictive distribution of the goals scored in the quarter finals, based on our revised model.
Basically, all games are again quite tight, perhaps with the (reasonable?) exception of Netherlands-Costa Rica, in which the Dutch are favourites and predicted to have a higher chance of scoring more goals (and therefore winning the game).
As shown in the above graph, draws are quite likely in almost all the games; the European derby is probably the closest game (and this seems to make sense given both the short- and long-term standing of the two teams). Brazil and Argentina both face tough opponents (based on the model; but again, in line with what we've seen so far).
Using the result of the model in terms of prediction of the results at extra time & penalties, we estimate the overall probability of winning the game (ie either within 90 minutes or beyond) as:
Brazil       Colombia     0.657   0.343
Netherlands  Costa Rica   0.776   0.224
France       Germany      0.497   0.503
Argentina    Belgium      0.607   0.393
(In the above table, the third and fourth columns indicate the predicted chance that the teams in columns one and two, respectively, win the game and progress to the semi-finals.)
One final remark, which I think is generally interesting, is that by the time we've reached the quarter-finals, the value of the "current form" variable for Brazil (who started as hot favourites, based on the evidence synthesis of the published odds that we used to define it at the beginning of the tournament) is lower than that of their opponents. But again, Colombia have sort of breezed through all of their games so far, while Brazil have kind of stuttered and have not won games that they probably should have (taking their "strength" at face value). This doesn't seem enough to make Colombia favourites in their game against the hosts; but beware of surprises! After all, the distribution of the possible results is not so clear cut...
Wednesday, 2 July 2014
Short course: Bayesian methods in health economics
Chris, Richard and I tested this last March in Canada (see also here) and things seem to have gone quite well. So we have decided to replicate the experiment (so that we can get a bigger sample size!) and run the short course this coming November (3rd-5th), at UCL.
Full details (including links for registration) are available here. As we formally say in an advert we've circulated on a couple of relevant mailing lists:
"This course is intended to provide an introduction to Bayesian analysis and MCMC methods using R and MCMC sampling software (such as OpenBUGS and JAGS), as applied to cost-effectiveness analysis and typical models used in health economic evaluations.
The course is intended for health economists, statisticians, and decision modellers interested in the practice of Bayesian modelling and will be based on a mixture of lectures and computer practicals, although the emphasis will be on examples of applied analysis: software and code to carry out the analyses will be provided. Participants are encouraged to bring their own laptops for the practicals.
We shall assume a basic knowledge of standard methods in health economics and some familiarity with a range of probability distributions, regression analysis, Markov models and random-effects meta-analysis. However, statistical concepts are reviewed in the context of applied health economic evaluations in the lectures."
The timetable and additional info are here.