Coronavirus modelling work reported by the BBC

This article by the BBC’s Rachel Schraer explores the modelling for the progression of the Coronavirus Covid-19. In the article we see some graphs showing epidemic growth rates, and in particular this one showing infection rate dependency on how many one individual infects in a given period.

The BBC chart showing infection rate dependency on how many an individual infects

This chart led me to look into more sophisticated modelling tools than just the spreadsheet I already mentioned in my previous article on Coronavirus modelling; this is a very specialist area, and I’m working hard to model it more fully.

My spreadsheet model was a simple power law model; it allows you to enter a couple of your own parameters (the number of days out, and the doubling period for cases in days) to see the outcome; see it at:

It lists, as a table, case outcomes after a given number of days (up to 30 – but you can enter your own forecast number of days, and doubling period) since 100 cases, given how many days it is assumed it takes for cases to double. It’s just a simple application of a power law, and is only an analysis of output rate numbers, not a full model. It explains potential growth, on various doubling assumptions. It appears, for example, in the following Johns Hopkins chart (this time for deaths, but it’s a similar model for cases), which presents the UK and Italy prognoses lying between doubling every two days and three days since the index day at 10 deaths:

An example of a log scale y-axis chart for deaths on two doubling rate assumptions

Any predicted outcomes are VERY dependent on that doubling rate assumption, as my spreadsheet showed – in terms of cases, after 30 days since 100 cases, a doubling every 2 days would lead to about 3 million cases, but a doubling every 3 days leads to 100 thousand cases. This is an example of the non-linearity of the modelling – a 50% improvement in the case doubling period leads to a 30-fold improvement in prediction for case numbers after 100 days.

To reproduce the infection rate growth numbers in the BBC article above, relating the resultant number of cases after 30 days (say) to the average number of people an individual infects (the so-called R0 number, the Basic Reproduction Number) requires a deeper modelling technique. For an R0 explanation, see

I was interested, seeing the BBC infection rate chart, and its implications, to understand how precisely the number of people an individual is assumed to infect (on average) is related to the “doubling” rate assumptions we can make in the spreadsheet analysis.

I’ve been looking at SIR modelling – Susceptibility-Infected-Recovered modelling – in a simple form, to get the idea of how it works. There are quite a few references to the topic, going back a long way. A very useful paper I have been consulting is from Stanford University in 2007 (, and some of the basis for the shape of that basic modelling goes back to Kermack and McKendrick in 1927).

Usefully I have found some Python code in the Gillespie reference below that codifies a basic model, using a solution technique to the basic equations (which although somewhat simple first-order differential equations, are non-linear and therefore difficult to solve analytically) employing this Gillespie algorithm, which derives from work done in 1976, and is basically a Monte-Carlo probabilistic iterative time-stepping model well suited to computers (of the type I used to play with (for a quite different purpose) for the MoD in the 1970s).

My trial model (to become familiar with the way that the model behaves) is based on the Python code, and I found that with the small total population (N) of 350, with generic parameters for infection rate (α) and recovery rate (β), there is slow growth in cases for a long time (relatively) and then a sharp increase (at about t=0.1 time units in the chart below) leading to a peak at about t=0.3, when recovery starts to happen; the population returns to health at about t=10. The very slow initial growth (from ONE index case) is why I show the x-axis with a log scale.

This very slow growth from ONE case is, I guess, why most charts begin with the first 100 cases (or, in the case of deaths, 10) so that the chart saves horizontal axis space by suppressing the long lead-in period.

My next task is to put some real numbers into a model like this, and to work it though for a LARGE population, and for comparison, to run it from time zero at 100 cases (which might avoid the long lead time in this current generic model).

I expect to find that I could then use a linear x-axis time scale, but that I would have to present the chart with a log y-scale for cases, as the model would need to represent the exponential growth we have seen for Coronavirus.

Charting my initial model, showing a test SIR model output, using generic input parameters

More sophisticated models also include birth-death adjustments (a demographic model) in the work, but as the life-cycle being assessed for the Covid-19 virus is much shorter (hopefully!) than the demographic cycle, this is ignored to start with.

Another parameter that might be included for some important infections, where there is a significant incubation period during which individuals have been infected but are not yet infectious themselves, is the “Exposed” parameter. During this period the individual is in a compartment E (for Exposed), prior to entering compartment I (for Infected), turning the SIR model into a SEIR model.

Another version of the model might take into consideration the exposed or latent period of the disease, and where an infection does not leave any immunity after recovery, so that individuals that have recovered return to being susceptible again, moving them back into the S(t) (Susceptible as a function of time) compartment of the model; this model is therefore called the SEIS model. For a description of these models, and more, see

So we see that this is a far more complicated issue than at first sight. It is why, I think, Sir Patrick Vallance, the Chief Scientific Adviser, today began to talk about the R0 figure (a dimensionless number (a ratio)) relating to the average number of people that one individual might infect.

My feeling was that we are far from a value for R0 that would lead to the end of the epidemic being in sight, since, if we in the UK are tracking a doubling of cases every 3 days (as we have been), then this might be nearer to an R0 of 2.5, rather than anywhere near ONE. If R0 drops below 1, then the epidemic would eventually die out, which he mentioned. Above 1 and it continues to grow. As I said, I think we are far from an R0<1 situation.

The amount by which R0 exceeds the value 1 might not seem to have such a great effect on the numbers of cases we are seeing, at these early stages of an epidemic, as a but as the days wear on, the effects are VERY (i.e. exponentially) noticeable, and this is why the charts often have y-axis scales that are logarithmic, because otherwise they couldn’t easily be displayed.

In a linear y-axis chart, we run out of y-axis space quite quickly for exponential functions; to see all the data, at the later time values, we have to compress the chart vertically so much that it is then hard to see the earlier, lower numbers; we see this in such a chart below, that has a lot of “growing” to do. Note the dotted line which is the predicted line for doubling of cases every 3 days (which we in the UK have been tracking):

Data presented with linear x and y axes

It has therefore become more usual to present the data differently, with a log scale for the y-axis, where, for example, the sample dotted “doubling” lines are straight lines, not steeply growing exponential curves (in the chart below, two dotted guidelines are shown for deaths, one for 2 day doubling, and one for 3 days); the shorter the doubling period, the steeper the straight line on such a chart:

Data presented with a log y-axis – each y-gridline is set at a factor of 10x the previous one

In the 30th March Vallance presentation on TV, the growth curves on the last couple of log charts shown (cases and deaths, respectively) had a SLOPE that was DEcreasing slowly, not INcreasing (exponentially) rapidly (as the raw numbers actually are) although for a mathematician (or a data visualiser) this is a valid way to present such data.

The visual effect of choosing a such a log scale for the y-axis would have been explained in more detail in an academic lecture theatre (as I have tried to do here), and I think it is useful to point this out, and would be a clarification in the Government updates.

A final point, made in both the 29th and 30th March daily TV presentations, is that actions taken today will not have a tangible effect until a few days (maybe a week to two weeks) later; the outputs lag the inputs because of the lead times involved in infection rates, and in the effect of counter-measures on their reduction. What we see tomorrow doesn’t relate to today’s actions, it depends more on actions taken a week or more ago.

From recent charts, shown in Government updates, it does seem that what was done a week ago (self-isolation, social distancing and reduction in opportunities for people to meet other than in their household units) is beginning to have a visible effect on travel patterns, but any moves in the infection charts, if at all, are rather small so far.


Coronavirus – possible trajectories

I guess the UK line in the Johns Hopkins chart, reported earlier, might well flatten at some point soon, as some other countries’ lines have.

But if we continue at 3 days for doubling of cases, according to my spreadsheet experiment, we will see over 1m cases after 40 days. See:
and the example outputs attached for 3, 5 and 7 day doubling.

A million cases by 40 days if we continue on 3 day doubling of cases

If we had experienced (through the social distancing and other precautionary measures) and continue to experience a doubling period of 5 days (not on the chart but a possible input to my spreadsheet), it would lead to 25,000 cases after 40 days.

25600 cases at 5 day doubling since case 100

If we had managed to experience 7 days for doubling of cases (as Japan and Singapore seem to have done), then we would have seen 5000 cases at 40 days (but that’s where we are already, so too late for that outcome).

Not a feasible outcome for the UK, as we are already at 5000 cases or more

So the outcomes are VERY sensitively dependent on the doubling period, which in turn is VERY dependent on the average number of people each carrier infects.

I haven’t modelled that part yet, but, again, assumptions apart, the doubling period would be an outcome of that number, together with how long cases last (before death or recovery) and whether re-infection is possible, likely or frequent. It all gets a bit more difficult to be predictive, rather than mathematically expressing known data.

On a more positive note, there is a report today of the statistical work of Michael Levitt (a proper data scientist!), who predicted on February 21st, with uncanny accuracy, the March 23rd situation in China (improvements compared with the then gloomy other forecasts). See the article attached.

Michael Levitt article from The Times 24th March 2020

Coronavirus – forecasting numbers

A few people might have see the Johns Hopkins University Medical School chart on Covid-19 infection rates in different countries. This particular chart (they have produced many different outputs, some of them interactive world incidence models – see for more) usefully compares some various national growth rates with straight lines representing different periods over which the number of cases might double – 1 day, 2 days, 3 days and 7 days. It’s a kind of log chart to base 2.

Johns Hopkins University national trends, log base 2 chart

I’ve been beginning to simulate the outcomes for 2 input data items:

your chosen number of days (x) since the outbreak (defined at 100 cases on day zero to give a base of calculation);

your chosen rate of growth of cases, expressed by an assumed number of days for doubling the cases number (z), and then;

the output, the number of cases (y) on day x.

This spreadsheet allows you , in the last columns, to enter x and z in order to see the outcome, y.

Of course this is only an output model, it knows nothing about the veracity of assumptions – but the numbers (y) get VERY large for small doubling periods (z).

Try it. Only change the x and z numbers, please.


Coronavirus modelling – GLEAMviz15

Here’s the kind of stuff that the Covid-19 modellers will be doing. I have downloaded GleamViz,, and it is quite complicated to set up (I used to have a little Windows app called Wildfire that just needed a few numbers to get a pictorial progression of life/recovery/death from the disease, depending on infectivity, time taken to kill, etc). GLEAMviz15 is a proper tool* that needs a lot of base data to be defined. I’m thinking about it! PS – Some of the GleamViz team seem to be based in Turin.
*see my comment to this post.

Such modelling reminds me of event-based simulation models I used to develop for the MoD back in the 70s, using purpose-designed programming languages such as Simscript (Fortran-like) and Simula (Algol-like), all based around what today might be called object oriented programs (OOP), where small modules of code represent micro-events that occur, having inputs that trigger them from previous (or influencing) events, and outputs that trigger subsequent (or influenced) events, all inputs and outputs coded with a probability distribution of occurrence. In the MoD case this was about – erm, what am I allowed to say! – reliability and failure rates in aircraft, refuelling tankers and cruise missiles (even then). I recall that in my opening program statement, I named it “PlayGod”. I did ask for a copy of it years later (it’s all in very large dusty decks of Hollerith cards somewhere (I did overlap with paper tape, but I was a forward looking person!)) but they refused. Obviously far too valuable to the nation (even if I wasn’t, judging by my salary).

With regard to the publication of the modelling tools, I’ll leave aside the data part of it…there is a lot, and much of it will be fuzzy, and I’m sure is very different for every country’s population, depending where they are with the disease, and what measures they have taken over a period and whether, for example, they had “super-spreaders” early on, as some countries did.

Back in the early seventies, a European macro-economic project led to the publication of a book, The Limits to Growth. The context is described in
Not long after, our Government released the macro-economic model software they were using to explore these aspects of the performance and the future of our economy in a worldwide scenario.
I worked at London University’s commercial unit for a while, and we mounted this freely available Government model, and offered it to all-comers.

The Economist publication was one of our clients for this, and their chief economics journalist, Peter Riddell, was someone we met several times in connection with it. He was very interested in modelling different approaches (to matters such as exchange rates, money availability (monetary and fiscal policy) and their impact on economic development in a constrained environment) and report on them, as a comparison with the Government’s own modelling, policies and strategies.

This was all at a time when the Economy was perceived to be at risk (as it is now from this pandemic), and inflation and exchange rates, monetary and fiscal policy (for example) were very much seen as determinants of our welfare as a nation, and for all other nations, including the emerging European Community, who originally sponsored this work.

We seem to be in a very similar situation with regard to the modelling of this Covid-19 pandemic, and I don’t see why the software being used (at least by the UK Government’s analysts (probably academics, I think Sir Patrick Vallance said in his answers to the Select Committee this morning)) could not be released so that we might get a better understanding of the links assumed between actions and outcomes, impact of assumptions made, and the impact of actual and potential measures (and their feedback loops, positive or negative) taken in response to the outbreak.
Apparently the Government IS going to release its modelling, and I would certainly be interested to see it – its parameters, assumptions, its logic and its variety of outcomes and dependencies on assumptions. The possible outcomes are probably EXACTLY why the Government hasn’t released it yet.


New Year skiing at Breckenridge 2003/4/5

This video is about skiing in Breckenridge during our vacation at New Year 2003-4.

We had been skiing in Breckenridge since the early 90s. Once our boys Harry & Tom discovered firstly Whistler, and then Breckenridge, we couldn’t persuade them to go back to skiing in Europe as we had for many years!

On our first trip there we met Dutch van Andel, our ski instructor, and spent many happy days out with him, a most effective way of taking ski lessons over a day as a family instead of individual or on large groups – a happy medium!

Dutch and his sons, Hayden and Gerhard, became firm friends of ours, and more recently, when Dutch had moved back to the Netherlands and we had resumed skiing in Europe, we had a great ski trip together in Bormio (a feature video yet to come!)

Meanwhile, here is a video taken by the ski school on one of our days out with Dutch, when he took us to the Nastar slalom course as part of our  – erm – training!

Here is another video taken by the ski school on one of our days out with Dutch, at New Year 2005, with a colleague, Mike, from the ski school, following us and catching the magic moments!


Courmayeur in January 2008

In January 2008, we spent a week skiing in Courmayeur, a smaller Italian resort, following Zermatt the year before and St Anton the year after. Courmayeur lost nothing by comparison, and as you will see below, we enjoyed the wonderful snow conditions and varied terrain, including off-piste and tree skiing, enormously.

The first video is about a day at Courmayeur, 15th Jan 2008, when fresh overnight snow left several runs in beautiful condition for deep snow skiing. Here are Harry and Tom, and their poor old cameraman, trying their hand at several of them.

Lunch at Maison Vieille was a real treat, the best of the mountain restaurants, and we went there a few times during the week. Amazingly there was no queuing for lunch, although it was busy enough, and we couldn’t wait to go back.

This is not the largest resort we have been to by any means, but in our time there, in January, there were so many excellent runs in wonderful snow conditions, and good weather nearly all of the time. We would go back there like a shot – but there are so many European resorts we want to revisit after so may years skiing in Breckenridge, Colorado.

While the boys were young, they wouldn’t let us ski anywhere else owing to the familiar food and language in the US, and of course we had wonderful times with our instructor, Dutch van Andel, and his two boys, Hayden and Gerhard, who all became firm friends, and hosted us for Christmases as well! Dutch joined us in Bormio, in 2014, on one of our more recent European expeditions. More of that in another post…

Early off-piste in fresh snow in Courmayeur

A second day of equally lovely snow conditions! Harry and I were the early starters, but we soon called Tom up the mountain to enjoy the wonderful snow conditions.

The next day, Harry and I get out early, and call Tom up to the spectacular conditions

The next day, 17th January, we went off-piste skiing again in Courmayeur, when more fresh overnight snow left several runs in beautiful condition for deep snow skiing. We decided to use the Mont Blanc guides, to help us find some new runs and some tree skiing. It was quite difficult here and there, as you will see!

All of us, Ann, Harry and Tom and I spent the day with Enrico Bonino, of Mont Blanc Guides (and an experienced climber) and it was quite an education!

Some more difficult off-piste skiing, with Enrico Bonino, including in the trees