Re-modelling after changes to UK Coronavirus data collection and reporting

Change of reporting basis

The UK Government yesterday changed the reporting basis for Coronavirus numbers, retrospectively (since 6th March 2020) adding in deaths in the Care Home and and other settings, and also modifying the “Active Cases” to match, and so I have adjusted my model to match.

This historic information is more easily found on the Worldometer site; apart from current day numbers, it is harder to find the tabular data on the UK.gov site, and I guess Worldometers have a reliable web services feed from most national reporting web pages.

The increase in daily and cumulative deaths over the period contrasts with a slight reduction in daily active case numbers over the period.

Understanding the variations in epidemic parameters

With more resources, it would make sense to model different settings separately, and then combine them. If (as it is) the reproduction number R0<1 for the community, the population at large (although varying by location, environment etc), but higher in hospitals, and even higher in Care Homes, then these scenarios would have different transmission rates in the model, different effectiveness of counter-measures, and differences in several other parameters of the model(s). Today the CSA (Sir Patrick Vallance) stated that indeed, there is to be a randomised survey of people in different places (geographically) and situations (travel, work etc) to work out where the R-value is in different parts of the population.

But I have continued with the means at my disposal (the excellent basis for modelling in Alex Visscher’s paper that I have been using for some time).

Ultimately, as I said I my article at https://www.briansutton.uk/?p=1595, a multi-phase model will be needed (as per Imperial College and Harvard models illustrated here:-

Repeated peaks with no pharmaceutical intervention

and I am sure that it is the Imperial College version of this (by Neil Ferguson and his team) that will be to the forefront in that advice. The models looks at variations in policy regarding different aspects of the lockdown interventions, and the response of the epidemic to them. This leads to the cyclicity illustrated above.

Model adjustments

In my model, the rate of deaths is the most accurately available data, (even though the basis for reporting it has just changed) and the model fit is based on that. I have incorporated that reporting update into the model.

Up to lockdown (March 23rd in the UK, day 51), an infection transmission rate k11 (rate of infection of previously uninfected people by those in the infected compartment) and a correction factor are used to get this fit for the model as close as possible prior to the intervention date. For example, k11 can be adjusted, as part of a combination of infection rates; k12 from sick (S) people, k13 from seriously sick (SS) people and k14 from recovering (B, better) people to the uninfected community (U). All of those sub-rates could be adjusted in the model, and taken together define the overall rate of transition of people from from Unifected to Infected.

After lockdown, the various interventions – social distancing, school and large event closures, restaurant and pub closures and all the rest – are represented by an intervention effectiveness percentage, and this is modified (as an average across all those settings I mentioned before) to get the fit of the model after the lockdown measures as close as possible to the reported data, up to the current date.

I had been using an intervention effectiveness of 90% latterly, as the UK community response to the Government’s advice has been pretty good.

But with the UK Government move to include data from other settings (particularly the Care Home setting) I have had to reduce that overall percentage to 85% (having modelled several options from 80% upwards) to match the increased reported historic death rate.

It is, of course, more realistic to include all settings in the reported numbers, and in fact my model was predicting on that basis at the start. Now we have a few more weeks of data, and all the reported data, not just some of it, I am more confident that my original forecast for 39,000 deaths in the UK (for this single phase outlook) is currently a better estimate than the update I made a week or so ago (with 90% intervention effectiveness) to 29,000 deaths in the Model Refinement article referred to above, when I was trying to fit just hospital deaths (having no other reference point at that time).

Here are the charts for 85% intervention effectiveness; two for the long term outlook, into 2021, and two up until today’s date (with yesterday’s data):

Another output would be for UK cases, and I’ll just summarise with these charts for all cases up until June 2020 (where the modelled case numbers just begin to level off in the model):-

As we can see, the fit here isn’t as good, but this also reflects the fact that the data is less certain than for deaths, and is collected in many different ways across the UK, in the four home countries, and in the conurbations, counties and councils that input to the figures. I will probably have to adjust the model again within a few days, but the outlook, long term, of the model is for 2.6 million cases of all types. We’ll see…

Outlook beyond the Lockdown – again

I’m modest about my forecasts, but the methodology shows me the kind of advice the Government will be getting. The behavioural “science” part of the advice (not in the model) taking the public “tiredness” into account, was the reason for starting partial lockdown later, wasn’t it?

It would be more of the same if we pause the wrong aspects of lockdown too early for these reasons. Somehow the public have to “get” the rate of infection point into their heads, and that you can be infecting people before you have symptoms yourself. The presentation of the R number in today’s Government update might help that awareness. My article on R0 refers

Neil Ferguson of Imperial College was publishing papers at least as far back as 2006 on the mitigation of flu epidemics by such lockdown means, modelling very similar non-pharmaceutical methods of controlling infection rates – social distancing, school closures, no public meetings and all the rest.  Here is the 2006 paper, just one of 188 publications over the years by Ferguson and his team.  https://www.nature.com/articles/nature04795 

The following material is very recent, and, of course, focused on the current pandemic. https://www.imperial.ac.uk/…/Imperial-College-COVID19…

All countries would have been aware of this from the thinking around MERS, SARS and other outbreaks. We have a LOT of prepared models to fall back on.

As other commentators have said, Neil Ferguson has HUGE influence with the UK Government. I’m not sure how quickly UK scientists themselves were off the mark (as well as Government). We have moved from “herd immunity” and “flattening the curve” as mantras, to controlling the rate of infection by the measures we currently have in place, the type of lockdown that other countries were already using (in Europe, Italy did that two weeks before we did, although Government is saying that we did it earlier in the life of the epidemic here in the UK).

One or two advisory scientists have broken ranks (John Edmunds reported at https://www.reuters.com/…/special-report-johnson… ) on this to say that the various Committees should have been faster with their firm advice to Govenment. Who knows?

But what is clear from the public pronouncements is that Governments are now VERY aware of the issue of further peaks in the epidemic, and I would be very surprised to see rapid or significant change in the lockdown (it already allows some freedoms here in the UK, not there in some other countries, for example exercise outings once a day). I wouldn’t welcome being even more socially distanced than others, as a fit 70+ year-old person, through the policy of shielding, but if it has to be done, so be it.

Cambridge Conversations and model refinement

Cambridge Conversations video

In my post a few days ago at https://www.briansutton.uk/?p=1536, I mentioned that the webinar I reported there was to be released on YouTube this week, and it is now available.

It is VERY much worth 40 minutes of your time to take a look.

The potential cyclical behaviour described there is one that, as we see, the Imperial group under Neil Ferguson have been looking at; and the Harvard outlook seems similar, well into 2021, and as far as 2022 in the Harvard forecasting period.

Cyclical pandemic behaviour
Cyclical pandemic behaviour

It seems likely, (as Imperial have been advising the UK Government) from the recent daily updates, that Government is preparing us for this, and have been setting the conversation about any reduction in the lockdown clearly in the context of avoiding a further peak (or more peaks) in infection rates.

Alex de Visscher’s model I have been working with (I described it at https://www.briansutton.uk/?p=1455 ), and other similar “single-phase” models, would need some kind of iterative loop to model such a situation, something entirely feasible, although requiring many assumptions about re-infection rates, the nature and timings of any interventions Government might choose to relax and/or add, and so on. It’s a much more uncertain modelling task than even the work I’ve been doing so far.

Model refinement

Inasmuch as I have been looking at the fit of my current UK model to the daily data (see https://www.briansutton.uk/?p=1571), it was clear that the model was approaching a critical period over the next couple of weeks: UK death data, and to some extent the daily number of cases was levelling out, such that I could see that my model (predicting c. 39,000 deaths) was ahead of what might be likely to happen.

The modelled pandemic growth rates depend on many data, and two parameters – the time it takes for a person to infect another (i.e. the initial rate of infection per day, k11=0.39 in the current model), and also the intervention effectiveness parameter (how well the social distancing is working at reducing that infection rate per day), interv_success, set at 85% in the current model – are pertinent to the shape of the growth from the intervention date.

Since the lockdown measures are critical to that infection rate outcome as we go forward, yesterday I decided to run the model with interv_success set at 90%, and indeed it did bring down the predicted death outcome to c. 29,000. Here are those new charts, one with linear axes, and the other with logarithmic (base 10) y-axis:

These charts match the reported deaths data so far a little better; but for case numbers (a harder and more uncertain number to measure, and to model) with 90% intervention effectiveness, the reported data begins to exceed the model prediction, as we see below. This effect was enhanced by the weekend’s lower UK reported daily figures (April 18th-20th), but the April 21st figure showed another increase. It’s probably still a little to early to be axiomatic about this parameter, and the later section on model sensitivity explains why.

The two original charts for cases, with the 85% effectiveness, had looked likely to be a better match, going forward, as we seem to be entering a levelling-off period:

We can see that the model is quite sensitive to this assumption of intervention effectiveness. All data assumptions at an early stage in the exponential growth of pandemics are similarly sensitive, owing to the way that day-on-day growth rates work. The timing of the start of interventions (March 23rd in the case of the UK) and also, as we see below, the % effectiveness assumption both have a big effect.

Model sensitivities

This highlights the need to be cautious about adjusting lockdown measures (either in real life or in the models), until we can be sure what the likely impact of the “intervention effectiveness” percentage will be, and how this affects the outlook. Here we see two base-case sensitivity charts from Alex’s original paper at https://arxiv.org/pdf/2003.08824.pdf

These are generic logarithmic charts from his base model. On the left, we see the number of deaths after 60 days (bright red), 150 days (dark red) and 300 days (black) versus % effectiveness of intervention, for the base case of the (then) typical (outside China) doubling time 4 days, with intervention on day 30 (in the UK it (the lockdown) was on Day 51, March 23rd).

On the right, we see the number of deaths after 60 days (bright red), 150 days (dark red) and 300 days (black) versus the time of intervention, for the base case, doubling time 4 days, and intervention effectiveness 70 %.

Since the UK intervention on March 23rd, I feel that the intervention effectiveness has been quite good overall, so 85% and 90% don’t seem unreasonable options for the UK model.

For our UK intervention timing, the jury is out as to whether this was too late, or if an earlier intervention might have seen poorer compliance (effectiveness). The Government made the case (based on behavioural science, I believe) that the UK public would tire of lockdown, and therefore wanted to save lockdown for when they felt it would really be needed, to have its maximum effect nearer to when any peak in cases might have swamped the NHS.

The point in this section is to illustrate the sensitivity of outcomes to earlier choices of actions, modelled by such mathematical models; such models being the basis of some of the scientific advice to the UK Government (and other Governments).

Epidemic decline and multiple peak issues

A full discussion of the following points can be seen in Alex’s paper at https://arxiv.org/pdf/2003.08824.pdf

Alex had run his generic model (population 100m, with 100 infected, 10 sick and 1 seriously sick on day 1) for a couple of scenarios, in slow, medium and fast growth modes: one, that represented a “herd immunity approach” (which turns out not to be effective unless there is already a pharmaceutical treatment to reduce the number of infective people); and two, a “flattening the curve” approach, which modelled reducing the infectivity, but NOT controlling the disease (i.e. R0 still well above 1), which delays the peak, but still swamps the healthcare system, and allows very large numbers (about 2/3 of the base model numbers) of deaths.

This must have been the kind of advice the UK Government were getting in mid-March, because these approaches were demoted as the principal mechanisms in the UK (no more talk of herd immunity), in favour of the lockdown intervention on March 23rd.

What next after lockdown?

The following charts (from Alex’s paper) show the outcomes, in this generic case, for an intervention (in the form of a lockdown) around day 30 of the epidemic.

Starting from the base case, it is assumed that drastic social distancing measures are taken on day 30 that reduce R0 to below 1. It is assumed that the value of k11 (the starting infectivity) is reduced by 70 % (i.e., from 0.25 per day to 0.075 per day, equivalent to R0 decreasing from 2.82 to 0.85).

On the left chart, we see: Uninfected (black), all Infected (blue) and Deceased (red) people versus time. Base case, doubling time 4 days, intervention on day 30 with 70 % effectiveness.

On the right chart, we see Incubating (blue), Sick (yellow), Seriously Sick (brown), Recovering (green) and Deceased (red) people versus time. Base case, doubling time 4 days, intervention on day 30 with 70 % effectiveness.

Conclusions for lockdown

What is clear from these charts is that the decline of an epidemic is MUCH slower than its growth.

This has important repercussions for any public health policy aiming to save lives. Even seven months into this generic modelled intervention, the number of infected is comparable to the number of infected two and a half weeks before the intervention.

On the basis of these simulations, terminating the intervention would immediately relaunch the epidemic. The intervention would need to be maintained until the population can be vaccinated on a large scale.

This difficulty is at the heart of the debate right now about how and when UK (or any) lockdown might be eased.

This debate is also conducted in the light of the Imperial and Harvard modelling concerting the cyclical behaviour of the epidemic in response to periodic and partial relaxations of the lockdown.

This is where the two parts of the discussion in this blog post come together.

UK Coronavirus Modelling – match to data

Even having explored Prof. Alex Visscher’s published MatLab code for a week or two now, with UK data, even I am surprised at how well the model is matching up to published UK data so far (on April 18th 2020).

I have reported my previous work a couple of times, once relating to the modelling equations, at 7 Compartment modelling, Alex Visscher, and once on the nature of the models on this kind of work, and the r0 number in particular, at SIR models and R0.

I’ll just present the charts produced my my version of the model for now. One chart is with linear scales, and the other with a log y-axis to base 10 (each main gridline is 10x the previous one):

Linear scales
Logarithmic y-axis base 10

We can see that the upcoming acid test will be as the case numbers (and associated death rates) begin to fall, thanks to the social distancing, working from home and other measures people are being asked to take (successfully, in general, at the moment).

The new consideration (new in public, at least) is the question of whether there will be a vaccine, how well it might work, for how long immunity might last, and whether there will be re-infection, and at what rate. If there is no pharmaceutical resolution, or only a partial one, these single phase infection models will need to develop, and, as we saw in my last post at The potential cyclical future pandemic, a cyclical repetition of a lockdown/epidemic loop might look the most likely outcome for a while, until the vaccine point is resolved.

Meanwhile, this single phase model flattens out at a little over 39,000 deaths. I hope for the sake of everyone that the model turns out to be pessimistic.

Cambridge conversations 17th April 2020 – exit strategy?

As an alumnus, I had the opportunity today (with 2000 other people in over 60 countries) to attend a Cambridge University webinar on the current pandemic crisis. It was moderated by Chris Smith, a consultant medical virologist and lecturer at Cambridge (and a presenter on https://www.thenakedscientists.com/ where more about this event will be found).

The experts presenting were Dr Nick Matheson, a Principal Investigator in Therapeutic Immunology, and Honorary Consultant at Addenbrookes Hospital, and Prof. Ken Smith, Professor in Medicine at Cambridge, and Director of Studies for Clinical Medicine at Pembroke College.

The video of the 45 minute session is available, and I will share it here in due course (it’s only on a closed group at the moment, but will be on the Cambridge YouTube channel https://www.youtube.com/channel/UCfcLirZA16uMdL1sMsfNxng next week) but of most interest to me, since I was interested in the modelling of the pandemic outbreak, was the following slide put up by Nick Matheson, depicting research by Neil Ferguson’s group at Imperial College, and also by the research group at Harvard University School of Public Health. Here it is:

Modelling showing repeated cycles of Pandemic in response to lockdown changes

It seems to me that as the Neil Ferguson group is involved with this (and the Harvard slide seems to corroborate something similar) it is likely that the UK Government is receiving the same material.

I am generally in support of not forecasting or detailing any imminent reduction in lockdown measures, and I can see from this why the Government (led by the science) might be pretty reluctant too.

You will know from my previous posts on this topic why the R0 number is so important (the Reproductive Number, representing the number of people one person might infect at a particular time). Sir Patrick Vallance (CSA) has mentioned several times that there is evidence that is R0 is now below 1 (on the 16th April he actually said it was between .5 and 1) for the general population (although not so in care homes and hospitals, probably).

He and Prof. Chris Whitty (CMO) (an epidemiologist) have also been saying (and so have Government, of course, on their advice) that they don’t want to reduce restrictions yet, in order to avoid a second peak of cases.

But from the Imperial College and Harvard research (mathematical modelling, presumably, on the basis of “non-pharmaceutical” likely actions that might be taken from time to time to control the outbreak) it would seem that we are in for multiple peaks (although lower than the first we have experienced) going well into 2021, and into 2022 in the case of the Harvard chart.

I was one of a very large number of people putting in a written question during the webinar, and mine was regarding the assumptions here (in the modelling) about the likelihood (or otherwise) of a vaccine on the one hand, and repeat infections on the other.

It might be that any vaccine might NOT confer long lasting immunity. It might even be that, like Dengue Fever, there will be NO vaccine (this was stated verbally in the Q&A by Prof. Ken Smith).

All very sobering. And I haven’t been drinking. Here is the full slide set:

Making a little progress with a Covid-19 model and real data

In my research on appropriate (available and feasible (for me)) modelling tools for epidemics, I discovered this paper by Prof. Alex de Visscher (to be called Alex hereafter). He has been incredibly helpful.

https://arxiv.org/pdf/2003.08824.pdf

Thanks to Alex including the MatLab code (a maths package I have used before) in the paper, and also a detailed description of how the model works, it qualifies as both available and feasible. There is also a GNU (public and free) version of MatLab (called Octave)) which is mostly equivalent in terms of functionality for these purposes, which makes it practical to use the mathematical tools.

In my communications with Alex, he has previously helped me with working out the relationship between the R0 Reproductive Number (now being quoted every time on Government presentations) and the doubling period for the case number of disease outbreak. I published my work on this at:

https://docs.google.com/document/d/1b9lB4RszIHJipS27raXxN6ECTFbihJ_fQFhdy5oCEwA/edit?usp=sharing

since mathematical expression are too cumbersome here on WordPress.

I would still say the Government is being a little coy about these numbers, saying that R0 is below 1 (a crucial dividing point between epidemic increase at R0>1 and epidemic slow down and dying out for R0<1). They now say (April 14th) it’s less than one outside hospitals and care homes, but not inside these places, as if (as Alex says) Covid-19 is not lethal, except for those people who die from it.

At least the Government have now stopped using graphs with the misleading appearance of y-axis log charts.

These following charts (unlike the original Government ones) make it pretty clear what kind of chart is being shown (log or linear) and also illustrate how different an impression might be given to a member of the public looking on a TV screen a) if the labelling isn’t there or obscurely referenced, and b) how the log charts to some extent mask the perception of steep growth in the numbers up until late March.

This chart from 14th April show the Government doing a little better over recent days in presenting a chart that visually represents the situation more clearly. Maybe the lower growth makes them more comfortable showing the data on linear axes.

An April 14th Government Update chart comparing country death rates
Linear y-axis scale

Alex de Visscher model description

I’ll try to summarixe Alex’s approach in the model I have begun to use to work with the UK numbers (Alex had focused on other countries (US and Canada nearer to home, plus China, Italy, Spain, Iran with more developed outbreak numbers since the epidemic took hold).

I have previously described the SIR (Susceptible, Infected and Recovered) 3-compartment model on this blog at https://www.briansutton.uk/?p=1296, but in this new model we have as many as 7 compartments: Uninfected (U), Infected (I), Sick (S), Seriously Sick (SS), Deceased (D), Better but not recovered (B) and Recovered (R). They are represented by this diagram:

Alex de Visscher 7-compartment model for Coronavirus

Firstly, an overall infection rate r1 above can be calculated as a weighted sum of these four sub-rates;

r1 = (k11I+ k12S+ k13SS+k14B)U/P

where the infection rates k11, k12, k13, k14, are set separately, towards I, S, SS and B respectively, from healthy people, with the rates proportional to the fraction of uninfected people, compared to the total population, U/P.

All other transitions between compartments are assumed as first order derivatives with their own rate constants, with time t measured in days:

dU/dt = -r1; the rate of movement from Uninfected to Infected; and similarly, with in-/out-puts added and subtracted for other compartments:

Transition rates between compartmentsDifferential equation
r1 = (k11I+ k12S+ k13SS+k14B)U/PdU/dt = -r1
r2=k2I for I to S transitionswith dI/dt = r1-r2
r3=k3S for S to SS transitionswith dS/dt = r2-r3-r5
r4=k4SS for SS to D transitionswith dD/dt = r4
r5=k5S for S to B transitionswith dB/dt = r5+r6-r7
r6=k6SS for SS to B transitionswith dSS/dt = r3-r4-r6
r7=k7B for B to R transitionswith dR/dt = r7
Table 1: The possible transitions and their rates between compartments

Some simplifying assumptions are made: k12 = k11/2, k13 = k11/3 and k14 = k11/4, in other words scaling the infection rates to I from the other compartments S, SS and B, respectively, to the principal rate from U to I. (Subsequently k14 = 0*k11/4 was chosen to compensate for long contagious times once the SS median had been changed from to 10 days from 3.5 days. See below).

So we can recognise that the bones of the previous SIR model are here, but there is compartment refinement to allow an individual’s status to be more precisely defined, along with the related transitions.

The remaining parameters that can be set in the model are:

Parameter (units)Value
k11 (day-1)variable
k2 (day-1)loge(2)/5.1
k3 (day-1)k5/9
k4 (day-1)k6.(15/85)
k5 (day-1)loge(2)/3.5
k6 (day-1)loge(2)/10
k7 (day-1)loge(2)/10
Table 2: Parameters of the COVID-19 model

k11 was initially chosen by inspection of the observed doubling rate for the disease, and can be modified. k2 is based on the 5.1 day incubation rate observed in the paper by Lauer et al at https://annals.org/aim/fullarticle/2762808/incubation-period-coronavirus-disease-2019-covid-19-from-publicly-reported

k3/k5 is based on the assumption that 10% of the infected (I) become seriously sick (SS), whereas 90% get better without developing serious symptoms; less than the 80:20 proportion observed normally, because many with mild symptoms go unreported.

The k4/k6 ratio is set assuming a 15% non-survival rate for hospitalised cases.

k5 and k7 were set in the original model assuming a median rate of 3.5 days in each of the S and B stages before moving to the Recovered R stage (ie a median 7 days to recover from mild symptoms). k7 has now been set to a median of 10 days (implying a 13.5 day median for recovery from mild symptoms), and to compensate for the resulting long contagious times, k14 = 0*k11/4 has been chosen as stated above.

k6 is based on an SS stage patient remaining in that stage for 10 days before transitioning to B. Because the other pathway is to the Deceased D stage, the median is actually less, 8.5 days. This would be consistent with a mean duration of 12.26 days for getting better (B), which has been observed clinically.

Implementation of the model and further parameter setting

I have been operating the model with some UK data using Octave, but Alex was using the equivalent (and original licensed software) MatLab. He used the Ordinary Differential Equation Solver module in that software, ODE45, which I have used before for solving the astronomical Restricted Three Body Problem, so I know it quite well.

Just as all of the various Governmental, Johns Hopkins University and Worldometer published graphs and data are presented with a non-zero case baseline to start the simulations, as the lead-in time is quite long, so does this model. These initial parameters for the UK model so far are chosen as:

Population P67.8 million
Initial infected number I100
Initial Sick number S10
Initial Seriously Sick number SS1
Deaths D0
Better B0
Recovered R0
Table 3: Initial condition choices for the model

Thus there are (in the model) 111 infected people at the start, among the UK population of 67.8 million. This is still quite an early start for the simulation, with less than 10 known infections; the doubling time is estimated from the total of people in any infected stages at days 29 and 30 using a similar approach to the one I explained in that previous paper at https://docs.google.com/document/d/1xg4XY2wZPZFM2O5E76Y6zBPc_42l7PMPE4NF6XipDPQ/edit?usp=sharing

Thus the doubling time I called TD there is given by the equation:

TD loge2 = loge (C30 /C29)

where Cd is the reported number of cases on a given day d. Known cases are assumed to be 5% of infected cases I, 1/3 of sick cases S, 90% of seriously sick SS, 12% of recovering R and 90% of deceased cases D (not all are known but we assume, for example, that reporting is more accurate for SS and D). The calculated doubling time in the model is not sensitive to the choice of these fractions.

The critical factor R0, the Reproduction Number, can be calculated by a separate program module, for one infected patient, where the number of newly infected starting with that case can be calculated over time.

Solving the equations

The model, with the many interdependent parameter choices that can be made, solves the Differential Equations described earlier, using the initial conditions mentioned in Table 3. Alex’s paper describes runs of the model for different assumptions around interventions (and non-interventions), and he has presented some outcomes of National models (for example for China) on his YouTube channel at:

https://www.youtube.com/channel/UCSH5Bzb8Oa0BojCVMdXBW_g/

and Alex’s introduction to the model on that channel, for both China, and the world outside China which describes the operation of the model, and the post processing fitting to real data charts can be seen at

Model descriptions for China, and the rest of the world outside China

Alex’s early April update on Italy is at:

Model update for Italy

UK modelling

I’m at an early stage of working on a UK model, using Alex’s model and techniques, and the most recent UK data is presented at the UK Government site at:

https://www.gov.uk/government/publications/covid-19-track-coronavirus-cases

using the link to the total data that has been presented to date on that site. Unfortunately, each day I look, the presentation format changes (there used to be a spreadsheet which was the most convenient) and now the cumulative case count for the UK has increased suddenly by 3000 on top of today’s count. Death numbers are still in step. It isn’t hard to see why journalists might be frustrated with reporting by the UK Government.

I have, as a first cut, run the numbers through the model to get an initial feel as to outcomes. The charts produced are not yet annotated, but I can see the shape of the predictions, and I will just include one here before I go on to refine the inputs, the assumptions, the model parameters and the presentation.

The Figure 2 produced by the model for the UK input choices I made

In this chart, days are along the x-axis, and the y-axis in a log scale to base 10 (equal intervals are a multiple of 10 of the previous y-gridline) of numbers of individuals.

The red curve represents active cases, the purple curve the number of deaths D, and the yellow curve the number of seriously sick patients, SS. This chart is the simpler of the two that the model produces, but serves to illustrate what is going on.

I have not yet made any attempt to match the curves against actual UK outcomes for the first 50 days (say) as Alex has done for the other countries he has worked on.

The input choices for this version for the UK are:

interv_time = 52;    % intervention time: Day 51 = March 23 – 1 = March 22

    interv_success = 0.85; % fractional reduction of k11 during intervention

    k11 = 0.39;          % infection rate (per day)

    k5 = log(2)/3.5;

    k6 = log(2)/10;

    k7 = log(2)/10;

    fSS = 0.1;     % fraction seriously sick

    fCR = 0.04;    % fraction critical

    fmort = 0.015; % mortality rate

    correction = 0.0012;

    P = 67.8e6;       % total population

The correction number is set to adjust for the correct starting number of cases. On Alex’s advice I set k11 = 0.39 which seems to provide a good initial match to death data.

In terms of numerical outputs, this is just a model, and hasn’t even been tuned, let alone fine tuned. The charts above reflect the output numbers in the model, which were:

doubling_time = -7511748644 – not relevant here for a declining epidemic;
final_deaths = 39,319 reached after a little over 100 days;
total_infections = 2,621,289 peaking at about 60 days

and I’m still making sense of these and working out some possible adjustments. Firstly I want to match with real data up to to date, so there is some spreadsheet and plotting work to do.

Now the model has been run, there can be a combinatorial number of changes that could be made to get the model to fit the first 50 days’ data.

The work has only just started, but I feel that I have a worthwhile tool to begin to parallel what the Government advisers might be doing to prepare their advice to Government. Watch this space!

The SIR model and importance of the R0 Reproductive Number

In the recent daily UK Government presentations, the R0 Reproductive Number has been mentioned a few times, and with good reason. Its value is as a commonly accepted measure of the propensity of an infectious disease outbreak to become an epidemic.

It turns out to be a relatively simple number to define, although working back from current data to calculate it is awkward if you don’t have the data. That’s my next task, from public data.

If R0 is below 1, then the epidemic will reduce and disappear. If it is greater than 1, then the epidemic grows.

The UK Government and its health advisers have made a few statements about it, and I covered these in an earlier post.

This is a more technical post, just to present a derivation of R0, and its consequences. I have used a few research papers to help with this, but the focus, brevity(!) and mistakes are mine (although I have corrected some in the sources).

 

The Coronavirus briefing 2nd April 2020

This is an extract from the Government daily Coronavirus briefing on 2nd April 2020, led by Matt Hancock, with Professor Stephen Powis, Medical Director of the NHS in England.

In this clip, Prof. Powis states that he thinks there is “early academic evidence” that the R0 “Reproduction Number”, (what he calls the transmission rate) reflecting the average number of people infected by one person, “may have fallen below 1”. This is an important target, because when R0 for a given epidemic is below 1, the epidemic will die out.

Stephen Powis speaking at the Government Coronavirus briefing, April 2nd 2020

My issue here is that I would like to see the evidence.

In Matt Hancock’s statement, he said that “the best scientific analysis is that” the “doubling” period for the rate of infection of the virus (another important statistic, used by Johns Hopkins University in many of their charts for the epidemic) was between three and four days.

Matt Hancock presents the latest Coronavirus figures

In the early stages of a rapidly growing epidemic, either of three or four days lead to quite different outcomes (for power series (or exponential) growth) after, say, 30 days (since day zero, defined as 100 cases), for the total number of people infected – 3 days leads to 100,000 cases, and 4 days to 18,000.

See my previously posted spreadsheet to run your own numbers for this:

https://docs.google.com/spreadsheets/d/1kE_pNRlVaFBeY5DxknPgeK5wmXNeBuyslizpvJmoQDY/edit?usp=sharing

My opinion is that it’s an optimistic range, but in any case it is too wide to be meaningful without more detail. I suspect 2.5 days (which would lead to 400,000 cases) might be a better number, but without seeing the scientific data – the “evidence” in either case – who knows?

But do you see how sensitive outcomes are to quite small changes in estimates to the input data at this early stage in the epidemic?

Both of these video statements sound to me to be too optimistic, and smack of concern to reassure the public rather than undiluted science.

My mention of “exponential” above reminds me to repeat my usual issue with presentations of log scale graphs without explanation.

Log y-axis charts for Coronavirus deaths by country

The visual appearance of this chart from the 2nd April update, to the average member of the public, doesn’t really highlight visually that the growth in numbers is exponential – it looks more as if the rates are flattening. It requires scrutiny of the y-axis to see that the top gridline in the chart is representing a number (10,000) 200 times higher than the baseline, 50. It’s at least careless not to mention that, other than “Logarithmic scale” (not saying whether x- or y-axis) in a tiny font at the bottom of the chart. A similar chart was shown on 3rd April too.

I’m guessing that the chart above was drawn straight from either the Worldometer, or Johns Hopkins University site without due consideration of its meaning for non-practitioners.

They do know the difference, of course; this DHSC chart, with linear axes, was also presented, and it does show the characteristic exponential growth of the epidemic, in this case for the deaths from the outbreak.

Coronavirus deaths in the UK – linear x- and y-axes

Some progress with the Gillespie SIR model

In the news today (coincident with my current work on this post) we can see how important the measures taken through social distancing, self-isolation and (partial) lockdown are to reducing the rate of infections. So far, the SHTM researchers say “…the R0 could be cut…” and “likely lead to a substantial impact and decline…” – i.e. we are on track theoretically, but we are yet to see the full benefits. It is still my view, expressed in my last post, that in the UK, we are still at an R0 figure of 2.5-3.5, with a doubling period of 2-3 days (2 and 3 are VERY different figures in this exponential context, as we see from my earlier posts).

While it’s a small study, it’s interesting to note the intention to reduce the R0 Reproduction Number (that I talked about in my last post), equivalent to the average number of people a person infects in the “susceptible” part of the population (those not yet infected, or recovered). The SIR model I have been working on reflects this concept in generic terms, as described below.

The model is, in effect, a Markov Chain solution to the infection activity, where each state of the SIR (Susceptible-Infected-Recovered compartmentalised) population at time t+δt depends only on the previous state at time t, solved through Monte-Carlo sampling for the statistical behaviour (ie not analytically), using an iterative procedure called the Gillespie algorithm, described last time, to solve the differential equations of the SIR model.

Thanks to some great work by Tomasz Knapik (Sutton family friend) I can show show a couple of new charts relating to variations in the data for the Gillespie algorithm for solving the SIR equations I talked about in my previous post.

You’ll recall this chart from last time, where for a population of 350, with infection rate α, and recovery rate β set, the progress of this simple SIR model shows the peak in infections from 1 index case, with a corresponding reduction in susceptible individuals, and an increase in recoveries until the model terminates with all individuals recovered, and no susceptible individuals left.

I drew this with a log scale (base 10) for time, because in this model, with the default input data, not much happens for quite a while from time t=0, until the infections take off rapidly at t=0.1 (on the generic time scale) and reach a peak at t=0.3. The model is fully played out at t=10. The peak for infections, out of a population of 350, is 325, for this input data:

Default input data for the Gillespie SIR model

Key parameters for the basic SIR model are:

The total population, N; the rate of infection α; and the rate of cure β.

Base default data for the simulation is for 1 index infected case at t=0, and the “Spatial Parameter” V, set at 100 in the generic units used by the model. I’ll talk more about this parameter in a later post.

A larger V reduces the infection growth through “distancing” the infected cases from the susceptible compartment of the population, and a smaller V increases the pace of infection by reducing the distance in general terms.

The equations for the simple SIR model being analysed, for a limited total population N, where:

N = nS + nI + nR , the sum of the compartmentalised Susceptible, Infected and Recovered numbers at any time t;

SIR model equations

With linear scales for both the x-axis (time) and the y-axis (cases) we can see, in the following two charts, that a change in the spatial parameter (V) in the differential equations above does precisely what the Government is trying to achieve with its measures regarding self-isolation, social distancing and reduction in opportunities for people to meet other than in their household units.

Firstly, in the the modified Python Gillespie model (the Knapik, Sutton & Sutton model!) for the case illustrated above, but with linear axes:

Τhe SIR chart for the base data with Spatial parameter V=100

Second, I change the spatial parameter, V, to 1000, for illustrative purposes:

Τhe SIR chart for the base data with Spatial parameter V=1000

We see above that the spatial parameter, representing increasing “social distancing” in this SIR application, does what is intended by the social distancing behaviour encouraged by most Governments – it pushes the peak of the infections from t=0.3 further out, towards t=2, and also flattens the peak, down to about 140 cases, from 325 before.

My task now is to relate the following key parameters of this situation:

First of all, the Reproduction Number R0 (the number of people one person might infect).

Secondly, the generation interval (how long it takes that infection to take effect (a vital number, and the subject of some difference in the literature between medical practitioners on the one hand, and demographic / biological practitioners on the other). An epidemiological formula, and a demographic/ecological/evolutionary biology formula can both be used to assess the generation interval, from R0, but can lead to very different results, to summarise a point made in paper on exactly this topic (ref. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1766383/).

Third, the doubling period – what is the right logarithmic multiplier in the Johns Hopkins University charts, given R0?

And lastly, the time taken, relatively, to see the R0 come down to a value near 1, when the epidemic stabilises and would tend to die out if R0<1 can be achieved.

No account is taken by my SIR model of recovery and re-infection; the SEIS model mentioned last time, or a variation of it, would be needed for that.

No-one is clear, furthermore, as to the seasonality of this Coronavirus, Covid-19, with regard to winter and summer, ambient humidity and temperatures, or the likelihood of mutation.