Phenomenology & Coronavirus – modelling and curve-fitting

Introduction

I have been wondering for a while how to characterise the difference in approaches to Coronavirus modelling of cases and deaths, between “curve-fitting” equations and the SIR differential equations approach I have been using (originally developed in Alex de Visscher’s paper this year, which included code and data for other countries such as Italy and Iran) which I have adapted for the UK.

Part of my uncertainty has its roots in being a very much lapsed mathematician, and part is because, although I have used modelling tools before and worked in some difficult areas of mathematical physics, such as General Relativity and Cosmology, epidemiology is a new application area for me, with a wealth of practitioners and research history behind it.

Fitting curves such as the Sigmoid and Gompertz curves (all members of a family of curves known as logistics or Richards functions) to the Coronavirus cases or deaths numbers, as practised notably by Prof. Michael Levitt and his Stanford University team, has had success in predicting the situation in China, and is being applied in other localities too.

Michael’s team have now worked out a way of reducing the predictive use of the Gompertz function to a straight-line predictor of reported data, a much more efficient use of computer time than some other approaches.
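To illustrate why a Gompertz curve lends itself to a straight-line treatment, here is a minimal sketch. It relies only on a mathematical property of the Gompertz function (the logarithm of the day-to-day change in ln N falls linearly with time); it is not claimed to be the exact procedure used by Michael Levitt's team. The parameter values K, b and t0 are illustrative assumptions, and the data are synthetic.

```python
import math

def gompertz(t, K, b, t0):
    """Cumulative count N(t) = K * exp(-exp(-b * (t - t0)))."""
    return K * math.exp(-math.exp(-b * (t - t0)))

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

# Illustrative (assumed) parameters: plateau, decay rate, inflection day
K, b, t0 = 50000.0, 0.08, 40.0
days = list(range(0, 81))
counts = [gompertz(t, K, b, t0) for t in days]

# y_t = ln( ln N(t+1) - ln N(t) ) is exactly linear in t for a Gompertz curve
ys = [math.log(math.log(counts[t + 1] / counts[t])) for t in days[:-1]]
slope, intercept = fit_line(days[:-1], ys)
print(round(slope, 4))  # -0.08, i.e. the slope recovers -b
```

With real reported data the points would scatter about the line, and the fitted slope and intercept would then be estimates rather than exact values.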

The SIR model approach, setting up a series of related differential equations (something I am more used to in other settings) that describe postulated mechanisms and rates of virus transmission in the human population (hence called “mechanistic” modelling), looks beneath the surface presentation of the epidemic cases and deaths numbers and time series charts, to model the growth (or otherwise) of the epidemic based on postulated characteristics of viral transmission and behaviour.

Research literature

In researching the literature, I have become familiar with some names that have cropped up frequently in this area over the years.

Focusing on some familiar and frequently recurring names, rather than more recent practitioners, might lead me to fall into “The Trouble with Physics” trap (the tendency, highlighted by Lee Smolin in his book of that name, exhibited by some University professors to recruit research staff (“in their own image”) who are working in the mainstream, rather than outliers whose work might be seen as off-the-wall, and less worthy in some sense.)

In this regard, Michael Levitt’s new work in the curve-fitting approach to the Coronavirus problem might be seen by others who have been working in the field for a long time as on the periphery (despite his 2013 Nobel Prize in Chemistry, awarded for computational modelling of complex chemical systems, and his Stanford University position as Professor of Structural Biology).

His results (broadly forecasting, very early on, using his curve-fitting methods, a much lower incidence of the virus going forward, successfully so in the case of China; he used Sigmoid curves before the current Gompertz curves) are in direct contrast to those of some teams working as advisers to Governments, which have in many cases, all around the world, applied fairly severe lockdowns for a period of several months.

In particular, the Imperial College Covid response team and the London School of Hygiene and Tropical Medicine have been at the forefront of advice to the UK Government.

Some Governments have taken a different approach (Sweden stands out in Europe in this regard, for several reasons).

I am keen to understand the differences, or otherwise, in such approaches.

Michael chooses to publish his work on Twitter (owing to a glitch, at least for a time, with his Stanford University laboratory’s own publishing process). There are many useful links there to his work.

My own succession of blog posts (all more narrowly focused on the UK) has been automatically published to Twitter (a setting I use in WordPress) and also, more actively, shared by me on my Facebook page.

But I stopped using Twitter routinely a long while ago (after 8000+ posts) because, in my view, it is a limited communication medium (despite its reach), not allowing much room for nuanced posts. It attracts extremism at worst, conspiracy theorists to some extent, and, as with a lot of published media, many people who, through “confirmation bias”, choose to read only what they think they might agree with.

One has only to look at the thread of responses to Michael’s Twitter links to his forecasting results and opinions to see examples of all kinds of Twitter users: some genuinely academic and/or thoughtful; some criticising the lack of published forecasting methods, despite frequent posts, although they have now appeared as a preprint here; many advising to watch out (often in extreme terms) for “big brother” government when governments ask or require their populations to take precautions of various kinds; and others simply handclapping, because they think that the message is that this all might go away without much action on their part, some of them actively calling for resistance even to some of the most trivial precautionary requests.

Preamble

One of the recent papers I have found useful in marshalling my thoughts on methodologies is this 2016 one by Gerardo Chowell, and it finally led me to calibrate the differences in principle between the SIR differential equation approach I have been using (but with a 7-compartment model, not just three) and the curve-fitting approach.

I had been thinking of analogies to illustrate the differences (which I will come to later), but this 2016 Chowell paper, in particular, encapsulated the technical differences for me, and I summarise that below. The Sergio Alonso paper also covers this ground.

Categorization of modelling approaches

Gerardo Chowell’s 2016 paper summarises modelling approaches as follows.

Phenomenological models

A dictionary definition – “Phenomenology is the philosophical study of observed unusual people or events as they appear without any further study or explanation.”

Chowell states that phenomenological approaches for modelling disease spread are particularly suitable when significant uncertainty clouds the epidemiology of an infectious disease, including the potential contribution of multiple transmission pathways.

In these situations, phenomenological models provide a starting point for generating early estimates of the transmission potential and generating short-term forecasts of epidemic trajectory and predictions of the final epidemic size.

Such methods include curve fitting, as used by Michael Levitt, where an equation (represented by a curve on a time-incidence graph (say) for the virus outbreak), with sufficient degrees of freedom, is used to replicate the shape of the observed data with the chosen equation and its parameters. Sigmoid and Gompertz functions (types of “logistics” or Richards functions) have been used for such fitting – they produce the familiar “S”-shaped curves we see for epidemics. The starting growth rate, the intermediate phase (with its inflection point) and the slowing down of the epidemic, all represented by that S-curve, can be fitted with the equation’s parametric choices (usually three or four).
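The two S-curves mentioned above can be written down in a few lines. This is a minimal sketch of the standard three-parameter forms; the parameter names (K for the plateau, r or b for the rate, t0 for the inflection time) and the numerical values are illustrative assumptions, not fitted to any reported data.

```python
import math

def logistic(t, K, r, t0):
    """Sigmoid (logistic) curve: symmetric about its inflection at t0."""
    return K / (1.0 + math.exp(-r * (t - t0)))

def gompertz(t, K, b, t0):
    """Gompertz curve: asymmetric, with a slower approach to the plateau K."""
    return K * math.exp(-math.exp(-b * (t - t0)))

K = 10000.0  # assumed final epidemic size (plateau)

# At the inflection time t0 the logistic has reached K/2,
# whereas the Gompertz has reached only K/e (about 37% of K).
print(logistic(30, K, 0.2, 30))            # 5000.0
print(round(gompertz(30, K, 0.1, 30), 1))  # 3678.8
```

The earlier inflection point of the Gompertz curve (at K/e rather than K/2) is what gives it the long, slow tail that the symmetric Sigmoid lacks.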

A feature that some epidemic outbreaks share is that growth of the epidemic is not fully exponential, but is “sub-exponential” for a variety of reasons, and Chowell states that:

“Previous work has shown that sub-exponential growth dynamics was a common phenomenon across a range of pathogens, as illustrated by empirical data on the first 3-5 generations of epidemics of influenza, Ebola, foot-and-mouth disease, HIV/AIDS, plague, measles and smallpox.”

Choices of appropriate parameters for the fitting function can allow such sub-exponential behaviour to be reflected in the chosen function’s fit to the reported data, and it turns out that the Gompertz function is more suitable for this than the Sigmoid function, as Michael Levitt states in his recent paper.

Once a curve-fit to reported data to date is achieved, the curve can be used to make forecasts about future case numbers.

Mechanistic and statistical models

Chowell states that “several mechanisms have been put forward to explain the sub-exponential epidemic growth patterns evidenced from infectious disease outbreak data. These include spatially constrained contact structures shaped by the epidemiological characteristics of the disease (i.e., airborne vs. close contact transmission model), the rapid onset of population behavior changes, and the potential role of individual heterogeneity in susceptibility and infectivity.”

He goes on to say that “although attractive to provide a quantitative description of growth profiles, the generalized growth model (described earlier) is a phenomenological approach, and hence cannot be used to evaluate which of the proposed mechanisms might be responsible for the empirical patterns.

Explicit mechanisms can be incorporated into mathematical models for infectious disease transmission, however, and tested in a formal way. Identification and analysis of the impacts of these factors can lead ultimately to the development of more effective and targeted control strategies. Thus, although the phenomenological approaches above can tell us a lot about the nature of epidemic patterns early in an outbreak, when used in conjunction with well-posed mechanistic models, researchers can learn not only what the patterns are, but why they might be occurring.”

On the Imperial College team’s planning website, they state that their forecasting models (they have several for different purposes, for just these reasons I guess) fall variously into the “Mechanistic” and “Statistical” categories, as follows.

Mechanistic model: Explicitly accounts for the underlying mechanisms of diseases transmission and attempt to identify the drivers of transmissibility. Rely on more assumptions about the disease dynamics.

Statistical model: Do not explicitly model the mechanism of transmission. Infer trends in either transmissibility or deaths from patterns in the data. Rely on fewer assumptions about the disease dynamics.

Mechanistic models can provide nuanced insights into severity and transmission but require specification of parameters – all of which have underlying uncertainty. Statistical models typically have fewer parameters. Uncertainty is therefore easier to propagate in these models. However, they cannot then inform questions about underlying mechanisms of spread and severity.

So Imperial College’s “statistical” description corresponds more closely to Chowell’s description of a phenomenological approach, although it may not involve curve-fitting per se.

The SIR modelling framework, employing differential equations to represent postulated relationships and transitions between Susceptible, Infected and Recovered parts of the population (at its most simple) falls into this Mechanistic model category.

Chowell makes the following useful remarks about SIR style models.

“The SIR model and derivatives is the framework of choice to capture population-level processes. The basic SIR model, like many other epidemiological models, begins with an assumption that individuals form a single large population and that they all mix randomly with one another. This assumption leads to early exponential growth dynamics in the absence of control interventions and susceptible depletion and greatly simplifies mathematical analysis (note, though, that other assumptions and models can also result in exponential growth).

The SIR model is often not a realistic representation of the human behavior driving an epidemic, however. Even in very large populations, individuals do not mix randomly with one another—they have more interactions with family members, friends, and coworkers than with people they do not know.

This issue becomes especially important when considering the spread of infectious diseases across a geographic space, because geographic separation inherently results in nonrandom interactions, with more frequent contact between individuals who are located near each other than between those who are further apart.

It is important to realize, however, that there are many other dimensions besides geographic space that lead to nonrandom interactions among individuals. For example, populations can be structured into age, ethnic, religious, kin, or risk groups. These dimensions are, however, aspects of some sort of space (e.g., behavioral, demographic, or social space), and they can almost always be modeled in similar fashion to geographic space.”

Here we begin to see the difference I was trying to identify between the curve-fitting approach and my forecasting method. At one level, one could argue that curve-fitting and SIR-type modelling amount to the same thing – choosing parameters that make the theorised data model fit the reported data.

But, whether it produces better or worse results, or with more work rather than less, SIR modelling seeks to understand and represent the underlying virus incubation period, infectivity, transmissibility, duration and related characteristics such as recovery and immunity (for how long, or not at all) – the why and how, not just the what.

The (nonlinear) differential equations are then solved numerically (rather than analytically with exact functions) and there does have to be some fitting to the initial known data for the outbreak (i.e. the history up to the point the forecast is being done) to calibrate the model with relevant infection rates, disease duration and recovery timescales (and death rates).
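As an illustration of what "solved numerically" means here, the following is a minimal sketch of the basic 3-compartment SIR model stepped forward with the simple Euler method. It is not the 7-compartment model used for the forecasts in this blog; the function name, the step size and the rates beta (transmission) and gamma (recovery) are illustrative assumptions.

```python
def simulate_sir(beta, gamma, s0, i0, rec0, days, dt=0.1):
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I,
    dR/dt = gamma*I with fixed-step Euler updates.
    Returns a list of (time, S, I, R) tuples."""
    s, i, r = float(s0), float(i0), float(rec0)
    n = s + i + r                      # total population, held constant
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((k * dt, s, i, r))
    return history

# Assumed rates: beta = 0.5/day, gamma = 0.2/day, so R0 = beta/gamma = 2.5
history = simulate_sir(beta=0.5, gamma=0.2, s0=999_990, i0=10, rec0=0, days=200)
peak_time, _, peak_infected, _ = max(history, key=lambda row: row[2])
print(f"peak of {peak_infected:,.0f} infected on day {peak_time:.0f}")
```

Calibration then amounts to adjusting beta, gamma (and, in richer models, the other rates) until the simulated history matches the reported data up to the forecast date. A production model would normally use a higher-order integrator than Euler.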

This makes it look similar in some ways to choosing appropriate parameters for any function (Sigmoid, Gompertz or General Logistics function (often three or four parameters)).

But the curve-fitting approach is reproducing an observed growth pattern (one might say top-down, or focused on outputs), whereas the SIR approach is setting virological and other behavioural parameters to seek to explain the way the epidemic behaves (bottom-up, or focused on inputs).

Metapopulation spatial models

Chowell makes reference to population-level models, formulations that are used for the vast majority of population based models that consider the spatial spread of human infectious diseases and that address important public health concerns rather than theoretical model behaviour. These are beyond my scope, but could potentially address concerns about indirect impacts of the Covid-19 pandemic.

a) Cross-coupled metapopulation models

These models, which have been used since the 1940s, do not model the process that brings individuals from different groups into contact with one another; rather, they incorporate a contact matrix that represents the strength or sum total of those contacts between groups only. This contact matrix is sometimes referred to as the WAIFW, or “who acquires infection from whom” matrix.

In the simplest cross-coupled models, the elements of this matrix represent both the influence of interactions between any two sub-populations and the risk of transmission as a consequence of those interactions; often, however, the transmission parameter is considered separately. An SIR style set of differential equations is used to model the nature, extent and rates of the interactions between sub-populations.
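A minimal sketch of how a WAIFW matrix enters such a model: the "force of infection" on each sub-population is a weighted sum over the infected fractions of all sub-populations. The two groups, their sizes and the matrix entries below are purely illustrative assumptions, and the transmission parameter is taken to be folded into the matrix elements.

```python
def force_of_infection(waifw, infected, sizes):
    """Per-capita infection rate for each group i:
    lambda_i = sum over j of waifw[i][j] * I_j / N_j."""
    groups = range(len(sizes))
    return [
        sum(waifw[i][j] * infected[j] / sizes[j] for j in groups)
        for i in groups
    ]

# Hypothetical two-group example: contacts are stronger within groups
waifw = [[0.6, 0.1],
         [0.1, 0.3]]
infected = [100, 10]
sizes = [10_000, 5_000]

lambdas = force_of_infection(waifw, infected, sizes)
print([round(x, 4) for x in lambdas])  # approximately [0.0062, 0.0016]
```

Each group's SIR-style equations would then use its own lambda_i in place of the single beta*I/N term of the basic model.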

b) Mobility metapopulation models

These models incorporate into their structure a matrix to represent the interaction between different groups, but they are mechanistically oriented and do this by considering the actual process by which such interactions occur. Transmission of the pathogen occurs within sub-populations, but the composition of those sub-populations explicitly includes not only residents of the sub-population, but visitors from other groups.

One type of model uses a “gravity” approach for inter-population interactions, where contact rates are proportional to group size and inversely proportional to the distance between them.
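A minimal sketch of the "gravity" idea follows. The text only says that contact rates rise with group size and fall with distance; taking the product of the two group sizes and the first power of the distance, with a scaling constant k, is an assumption made here for illustration (published gravity models often fit exponents to data).

```python
def gravity_contact_rate(size_i, size_j, distance, k=1.0):
    """Assumed gravity form: k * N_i * N_j / distance."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * size_i * size_j / distance

# Hypothetical towns: doubling the separation halves the contact rate
near = gravity_contact_rate(50_000, 20_000, distance=10.0, k=1e-6)
far = gravity_contact_rate(50_000, 20_000, distance=20.0, k=1e-6)
print(near, far)
```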

Another type described by Chowell uses a “radiation” approach, which uses population data relating to home locations, and to job locations and characteristics, to theorise “travel to work” patterns, calculated using attractors that such job locations offer, influencing workers’ choices and resulting travel and contact patterns.

Transportation and mobile phone data can be used to populate such spatially oriented models. Again SIR-style differential equations are used to represent the assumptions in the model about between whom, and how the pandemic spreads.

Summary of model types

We see that there is a range of modelling methods, successively requiring more detailed data, but which seek increasingly to represent the mechanisms (hence “mechanistic” modelling) by which the virus might spread.

We can see the key difference between curve-fitting (what I called a surface level technique earlier) and the successively more complex models that seek to work from assumed underlying causations of infection spread.

An analogy (picking up on the word “surface” I have used here) might refer to explaining how waves in the sea behave. We are all aware that out at sea, wave behaviour is perceived more as a “swell”, somewhat long wavelength waves, sometimes of great height, compared with shorter, choppier wave behaviour closer to shore.

I’m not here talking about breaking waves – a whole separate theory is needed for those – René Thom’s Catastrophe Theory – but continuous waves.

A curve fitting approach might well find a very good fit using trigonometric sine waves to represent the wavelength and height of the surface waves, even recognising that these can be parametrised by the depth of the ocean, but it would need an understanding of hydrodynamics, as described, for example, by Bernoulli’s Equation, to represent how and why the wavelength and wave height (and speed*) change depending on the depth of the water (and some other characteristics).

(*PS remember that the water moves, pretty much, up and down, in an elliptical path for any fluid “particle”, not in the direction of travel of the observed (largely transverse) wave. The horizontal motion and speed of the wave is, in a sense, an illusion.)

There is a range of modelling methods, successively requiring more detailed data, from phenomenological (statistical and curve-fitting) methods, to those which seek increasingly to represent the mechanisms (hence “mechanistic”) by which the virus might spread.

We see the difference between curve-fitting and the successively more complex models that build a model from assumed underlying interactions, and causations of infection spread between parts of the population.

I do intend to cover the mathematics of curve fitting, but wanted first to be sure that the context is clear, and how it relates to what I have done already.

Models requiring detailed data about travel patterns are beyond my scope, but it is as well to set into context what IS feasible.

Setting an understanding of curve-fitting into the context of my own modelling was a necessary first step. More will follow.

References

I have found several papers very helpful on comparing modelling methods, embracing the Gompertz (and other) curve-fitting approaches, including Michael Levitt’s own recent June 30th one, which explains his methods quite clearly.

Gerardo Chowell’s 2016 paper on Mathematical model types, September 2016

The Coronavirus Chronologies, Michael Levitt, 13th March 2020

COVID-19 Virus Epidemiological Model, Alex de Visscher, Concordia University, Quebec, 22nd March 2020

Empiric model for short-time prediction of Covid-19 spreading, Sergio Alonso et al, Spain, 19th May 2020

Universality in Covid-19 spread in view of the Gompertz function, Akira Ohnishi et al, Kyoto University, 22nd June 2020

Predicting the trajectory of any Covid-19 epidemic from the best straight line, Michael Levitt et al, 30th June 2020

Some thoughts on the current UK Coronavirus position

Introduction

A couple of interesting articles on the Coronavirus pandemic came to my attention this week. A recent one in National Geographic, on June 26th, highlights a startling comparison between the USA’s cases history (and its recent spike in case numbers) and the equivalent European data. It refers to an older National Geographic article, from March, by Cathleen O’Grady, which references a specific chart based on work from the Imperial College Covid-19 Response team.

I noticed, and was interested in that reference following a recent interaction I had with that team, regarding their influential March 16th paper. It prompted more thought about “herd immunity” from Covid-19 in the UK.

Meanwhile, my own forecasting model is still tracking published data quite well, although over the last couple of weeks I think the published rate of deaths is slightly above other forecasts as well as my own.

The USA

The recent National Geographic article from June 26th, by Nsikan Akpan, is a review of the current situation in the USA with regard to the recent increased number of new confirmed Coronavirus cases. A remarkable chart at the start of that article immediately took my attention:

The thrust of the article concerned recommendations on public attitudes, activities and behaviour in order to reduce the transmission of the virus. Even the case rate (cases per 100,000 people) is worse, and growing, in the USA.

A link between this dire situation and my discussion below about herd immunity is provided by a reported statement in The Times by Dr Anthony Fauci, Director of the National Institute of Allergy and Infectious Diseases, and one of the lead members of the Trump Administration’s White House Coronavirus Task Force, addressing the Covid-19 pandemic in the United States.

If the take-up of the vaccine were 70%, and it were 70% effective, this would result in roughly 50% herd immunity (0.7 x 0.7 = 0.49).

If the innate characteristics of the SARS-CoV-2 virus don’t change (with regard to infectivity and duration), and there is no other, not yet understood, human resistance to the infection that might limit its transmission (there has been some debate about this latter point, but this blog author is not a virologist), then 50% is unlikely to be a sufficient level of population immunity.

My remarks later about the relative safety of vaccination (eg MMR) compared with the relevant diseases themselves (Rubella, Mumps and Measles in that case) might not be supported by the anti-Vaxxers in the US (one of whose leading lights is the disgraced British doctor, Andrew Wakefield).

This is just one more complication the USA will have in dealing with the Coronavirus crisis. It is one, at least, that in the UK we won’t face to anything like the same degree when the time comes.

The UK, and implications of the Imperial College modelling

That article is an interesting read, but my point here isn’t really about the USA (worrying though that is), but about a reference the article makes to some work in the UK, at Imperial College, regarding the effectiveness of various interventions that have been or might be made, in different combinations, work reported in the National Geographic back on March 20th, a pivotal time in the UK’s battle against the virus, and in the UK’s decision making process.

This chart reminded me of some queries I had made about the much-referenced paper by Neil Ferguson and his team at Imperial College, published on March 16th, that seemed (with others, such as the London School of Hygiene and Tropical Medicine) to have persuaded the UK Government towards a new approach in dealing with the pandemic, in mid to late March.

The thrust of this National Geographic article, by Cathleen O’Grady, was that we will need “herd immunity” at some stage, even if the Imperial College paper of March 16th (and other SAGE Committee advice, including from the Scientific Pandemic Influenza Group on Modelling (SPI-M)) had persuaded the Government to enforce several social distancing measures, and by March 23rd, a combination of measures known as UK “lockdown”, apparently abandoning the herd immunity approach.

The UK Government said that herd immunity had never been a strategy, even though it had been mentioned several times, in the Government daily public/press briefings, by Sir Patrick Vallance (UK Chief Scientific Adviser (CSA)) and Prof Chris Whitty (UK Chief Medical Officer (CMO)), the co-chairs of SAGE.

The particular part of the 16th March Imperial College paper I had queried with them a couple of weeks ago was this table, usefully colour coded (by them) to allow the relative effectiveness of the potential intervention measures in different combinations to be assessed visually.

Why was it, I wondered, that in this chart (on the very last page of the paper, and referenced within it) the effectiveness of the three measures “CI_HQ_SD” in combination (home isolation of cases, household quarantine & large-scale general population social distancing) taken together (orange and yellow colour coding), was LESS than the effectiveness of either CI_HQ or CI_SD taken as a pair of interventions (mainly yellow and green colour coding)?

The explanation for this was along the following lines.

It’s a dynamical phenomenon. Remember mitigation is a set of temporary measures. The best you can do, if measures are temporary, is go from the “final size” of the unmitigated epidemic to a size which just gives herd immunity.

If interventions are “too” effective during the mitigation period (like CI_HQ_SD), they reduce transmission to the extent that herd immunity isn’t reached when they are lifted, leading to a substantial second wave. Put another way, there is an optimal effectiveness of mitigation interventions which is <100%.

That is CI_HQ_SDOL70 for the range of mitigation measures looked at in the report (mainly a green shaded column in the table above).

While, for suppression, one wants the most effective set of interventions possible.

All of this is predicated on people gaining immunity, of course. If immunity isn’t relatively long-lived (>1 year), mitigation becomes an (even) worse policy option.

Herd Immunity

The impact of very effective lockdown on immunity in subsequent phases of lockdown relaxation was something I hadn’t included in my own (single phase) modelling. My model can only (at the moment) deal with one lockdown event, with a single-figure, averaged intervention effectiveness percentage starting at that point. Prior data is used to fit the model. It has served well so far, until the point (we have now reached) at which lockdown relaxations need to be modelled.

But in my outlook, potentially, to modelling lockdown relaxation, and the potential for a second (or multiple) wave(s), I had still been thinking only of higher % intervention effectiveness being better, without taking into account that negative feedback to the herd immunity characteristic, in any subsequent more relaxed phase, other than through the effect of the changing comparative compartment sizes in the SIR-style model differential equations.

I covered the 3-compartment SIR model in my blog post on April 8th, which links to my more technical derivation here, and more complex models (such as the Alex de Visscher 7-compartment model I use in modified form, and that I described on April 14th) that are based on this mathematical model methodology.

In that respect, the ability for the epidemic to reproduce, at a given time “t” depends on the relative sizes of the infected (I) vs. the susceptible (S, uninfected) compartments. If the R (recovered) compartment members don’t return to the S compartment (which would require a SIRS model, reflecting waning immunity, and transitions from R back to the S compartment) then the ability of the virus to find new victims is reducing as more people are infected. I discussed some of these variations in my post here on March 31st.

My method might have been to reduce the % intervention effectiveness from time to time (reflecting the partial relaxation of some lockdown measures, as Governments are now doing) and reimpose it to a higher % effectiveness if and when the Rt (the calculated R value at some time t into the epidemic) began to get out of control. For example, I might relax lockdown effectiveness from 90% to 70% when Rt reached Rt<0.7, and increase again to 90% when Rt reached Rt>1.2.
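The switching rule described above can be sketched as a small state machine with two thresholds (a hysteresis band), so that the intervention level only changes when Rt leaves the band between them. The function name and default values simply restate the example figures in the paragraph; how Rt itself would be computed from the model is outside this sketch.

```python
def next_effectiveness(current, rt,
                       relaxed=0.70, strict=0.90,
                       relax_below=0.7, tighten_above=1.2):
    """Return the intervention effectiveness for the next period.
    Relax when Rt < relax_below, re-tighten when Rt > tighten_above,
    otherwise keep the current level."""
    if rt < relax_below:
        return relaxed
    if rt > tighten_above:
        return strict
    return current

# Hypothetical sequence of Rt estimates over successive review periods
effectiveness = 0.90
levels = []
for rt in [0.9, 0.65, 1.0, 1.3, 1.1]:
    effectiveness = next_effectiveness(effectiveness, rt)
    levels.append(effectiveness)
print(levels)  # [0.9, 0.7, 0.7, 0.9, 0.9]
```

Using two separate thresholds rather than one avoids rapid flip-flopping when Rt hovers around a single trigger value.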

This was partly owing to the way the model is structured, and partly to the lack of disaggregated data I would have available to me for populating anything more sophisticated. Even then, the mathematics (differential equations) of the cyclical modelling was going to be a challenge.

In the Imperial College paper, which does model the potential for cyclical peaks (see below), the “trigger” that is used to switch on and off the various intervention measures doesn’t relate to Rt, but to the required ICU bed occupancy. As discussed above, the intervention effectiveness measures are a much more finely drawn range of options, with their overall effectiveness differing both individually and in different combinations. This is illustrated in the paper (a slide presented in the April 17th Cambridge Conversation I reported in my blog article on Model Refinement on April 22nd):

What is being said here is that if we assume a temporary intervention, to be followed by a relaxation in (some of) the measures, the state in which the population is left with regard to immunity at the point of change is an important by-product to be taken into account in selecting the (combination of) the measures taken, meaning that the optimal intervention for the medium/long term future isn’t necessarily the highest % effectiveness measure or combined set of measures today.

The phrase “herd immunity” has been an ugly one, and the public and press winced somewhat (as I did) when it was first used by Sir Patrick Vallance; but it is the standard term for what is often the objective in population infection situations, and the National Geographic articles are a useful reminder of that, to me at least.

The arithmetic of herd immunity, the R number and the doubling period

I covered the relevance and derivation of the R0 reproduction number in my post on SIR (Susceptible-Infected-Recovered) models on April 8th.

In the National Geographic article by Cathleen O’Grady, a useful rule of thumb was implied, regarding the relationship between the herd immunity percentage required to control the growth of the epidemic, and the much-quoted R0 reproduction number, interpreted sometimes as the number of people (in the susceptible population) one infected person infects on average at a given phase of the epidemic. When Rt reaches one or less, at a given time t into the epidemic, so that one person is infecting one or fewer people, on average, the epidemic is regarded as having stalled and to be under control.

Herd immunity and R0

One example given was measles, which was stated to have a possible starting R0 value of 18, in which case almost everyone in the population needs to act as a buffer between an infected person and a new potential host. Thus, if the starting R0 number is to be reduced from 18 to Rt<=1, measles needs a VERY high rate of herd immunity – around 17/18ths, or ~95%, of people needing to be immune (non-susceptible). For measles, this is usually achieved by vaccine, not by dynamic disease growth. (Dr Fauci had mentioned an over 95% success rate for measles in the US in the reported quotation above.)

Similarly, if Covid-19, as seems to be the case, has a lower starting infection rate (R0 number) than measles, nearer to between 2 and 3 (2.5, say, although this is probably less than it was in the UK during March; 3-4 might be nearer, given the epidemic case doubling times we were seeing at the beginning*), then the National Geographic article says that herd immunity should be achieved when around 60 percent of the population becomes immune to Covid-19. The required herd immunity H% is given by H% = (1 – (1/2.5))*100% = 60%.

Whatever the real Covid-19 innate infectivity, or reproduction number R0 (but assuming R0>1 so that we are in an epidemic situation), the required herd immunity H% is given by:

H%=(1-(1/R0))*100%  (1)

(*I had noted that 80% was referenced by Prof. Chris Whitty (CMO) as loose talk, in an early UK daily briefing, when herd immunity was first mentioned, going on to mention 60% as more reasonable (my words). 80% herd immunity would correspond to R0=5 in the formula above.)
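For anyone who wants to check the arithmetic of equation (1), here is a minimal Python sketch. The function name is purely illustrative (it is not taken from the Octave model), and it does nothing more than evaluate H = 1 - 1/R0 for the R0 values mentioned above.

```python
def herd_immunity_threshold(r0):
    """Fraction of the population that must be immune for Rt <= 1, per equation (1)."""
    if r0 <= 1:
        # With R0 <= 1 there is no epidemic growth to control
        return 0.0
    return 1.0 - 1.0 / r0


# R0 values quoted in the text: measles, the Covid-19 example, and the 80% case
for r0 in (18, 2.5, 5):
    print(f"R0 = {r0}: H% = {100 * herd_immunity_threshold(r0):.1f}%")
```

This gives roughly 94.4% for R0=18, 60% for R0=2.5 and 80% for R0=5, matching the figures quoted.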

R0 and the Doubling time

As a reminder, I covered the topic of the cases doubling time TD here, and showed how it is related to R0 by the formula:

R0=d(loge2)/TD  (2)

where d is the disease duration in days.

Thus, as I said in that paper, for a doubling period TD of 3 days, say, and a disease duration d of 2 weeks, we would have R0=14×0.7/3=3.266.

If the doubling period were 4 days, then we would have R0=14×0.7/4=2.45.
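The same calculation can be sketched in Python using the exact value of loge2 (about 0.693) rather than the rounded 0.7 used above. The function name is an illustrative assumption; the formula is just equation (2).

```python
import math


def r0_from_doubling_time(duration_days, doubling_days):
    """Equation (2): R0 = d * ln(2) / TD."""
    return duration_days * math.log(2) / doubling_days


# Disease duration d = 14 days, doubling periods of 3 and 4 days
for td in (3, 4):
    print(f"TD = {td} days: R0 = {r0_from_doubling_time(14, td):.2f}")
```

With the unrounded logarithm the results come out slightly lower than the hand calculation (about 3.23 and 2.43), which shows how sensitive these estimates are to small changes in the inputs.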

As late as April 2nd, Matt Hancock (UK secretary of State for Health) was saying that the doubling period was between 3 and 4 days (although either 3 or 4 days each leads to quite different outcomes in an exponential growth situation) as I reported in my article on 3rd April. The Johns Hopkins comparative charts around that time were showing the UK doubling period for cases as a little under 3 days (see my March 24th article on this topic, where the following chart is shown.)

In my blog post of 31st March, I reported a BBC article on the epidemic, where the doubling period for cases was shown as 3 days, but for deaths it was between 2 and 3 days (a Johns Hopkins University chart).

Doubling time and Herd Immunity

Doubling time, TD(t), and the reproduction number, Rt, can be measured at any time t during the epidemic, and their measured values will depend on any interventions in place at the time, including various versions of social distancing. Once any social distancing reduces or stops, then these measured values are likely to change – TD downwards and Rt upwards – as the virus finds it relatively easier to find victims.

Assuming no pharmacological interventions (e.g. vaccination) at such time t, the growth of the epidemic at that point will depend on its underlying R0 and duration d (innate characteristics of the virus, if it hasn’t mutated**) and the prevailing immunity in the population – herd immunity.

(**Mutation of the virus would be a concern. See this recent paper (not peer reviewed).)

The doubling period TD(t) might, therefore, have become higher after a phase of interventions, and correspondingly Rt < R0, leading to some lockdown relaxation; but with any such interventions reduced or removed, the subsequent disease growth rate will depend on the interactions between the disease’s innate infectivity, its duration in any infected person, and how many uninfected people it can find – i.e. those without the herd immunity at that time.

These factors will determine the doubling time as this next phase develops, and bearing these dynamics in mind, it is interesting to see how all three of these factors – TD(t), Rt and H(t) – might be related (remembering the time dependence – we might be at time t, and not necessarily at the outset of the epidemic, time zero).

Eliminating R from the two equations (1) and (2) above, we can find:

H=1-TD/(d(loge2))  (3)

So for doubling period TD=3 days, and disease duration d=14 days, H=0.7; i.e. the required herd immunity H% is 70% for control of the epidemic. (In this case, incidentally, remember from equation (2) that R0=14×0.7/3=3.266.)

(Presumably this might be why Dr Fauci would settle for a 70-75% effective vaccine (the H% number), but that would assume 100% take-up, or, if less than 100%, additional immunity acquired by people who have recovered from the infection. But that acquired immunity, if it exists (I’m guessing it probably would) is of unknown duration. So many unknowns!)

For this example with 14 day infection period d, and exploring the reverse implications by requiring Rt to tend to 1 (so postulating in this way (somewhat mathematically pathologically) that the epidemic has stalled at time t) and expressing equation (2) as:

TD (t)= d(loge2)/Rt (4)

then we see that TD(t)= 14*loge(2) ~= 10 days, at this time t, for Rt~=1.

Thus a sufficiently long doubling period, with the necessary minimum doubling period depending on the disease duration d (14 days in this case), will be equivalent to the Rt value being low enough for the growth of the epidemic to be controlled – i.e. Rt <=1 – so that one person infects one or fewer people on average.

Confirming this, equation (3) tells us, for the parameters in this (somewhat mathematically pathological) example, that with TD(t)=10 and d=14,

H(t) = 1 – 10/(14*loge(2)) ~= 1-1 ~= 0, at this time t.

In this situation, the herd immunity H(t) (at this time t) required is notionally zero, as we are not in epidemic conditions (Rt~=1). This is not to say that the epidemic cannot restart – it simply means that if these conditions are maintained, with Rt reducing to 1, and the doubling period being correspondingly long enough, possibly achieved through social distancing (temporarily), across whole or part of the population (which might be hard to sustain) then we are controlling the epidemic.
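Equations (3) and (4) can be checked together with a short Python sketch. The function names are illustrative assumptions; the formulas are those given above, evaluated with the unrounded value of loge2.

```python
import math


def required_herd_immunity(doubling_days, duration_days):
    """Equation (3): H = 1 - TD / (d * ln 2)."""
    return 1.0 - doubling_days / (duration_days * math.log(2))


def critical_doubling_time(duration_days):
    """Equation (4) with Rt = 1: the doubling time at which H falls to zero."""
    return duration_days * math.log(2)


print(f"H for TD=3, d=14: {required_herd_immunity(3, 14):.2f}")
print(f"Critical TD for d=14: {critical_doubling_time(14):.1f} days")
print(f"H for TD=10, d=14: {required_herd_immunity(10, 14):.2f}")
```

The critical doubling time comes out at about 9.7 days, so rounding it up to 10 days gives a very slightly negative H (about -0.03). That simply reflects an Rt marginally below 1, and is consistent with the "notionally zero" reading above.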

It is when the interventions are reduced, or removed altogether that the sufficiency of % herd immunity in the population will be tested, as we saw from the Imperial College answer to my question earlier. As they say in their paper:

Once interventions are relaxed (in the example in Figure 3, from September onwards), infections begin to rise, resulting in a predicted peak epidemic later in the year. The more successful a strategy is at temporary suppression, the larger the later epidemic is predicted to be in the absence of vaccination, due to lesser build-up of herd immunity.

Herd immunity summary

Usually herd immunity is achieved through vaccination (e.g. the MMR vaccination for Rubella, Mumps and Measles). It involves less risk than the symptoms and possible side-effects of the disease itself (for some diseases at least, if not for chicken-pox, for which I can recall parents hosting chicken-pox parties to get it over and done with!)

The issue, of course, with Covid-19, is that no-one knows yet if such a vaccine can be developed, if it would be safe for humans, if it would work at scale, for how long it might confer immunity, and what the take-up would be.

Until a vaccine is developed, and until the duration of any CoVid-19 immunity (of recovered patients) is known, this route remains unavailable.

Hence, as the National Geographic article says, there is continued focus on social distancing, as an effective part of even a somewhat relaxed lockdown, to control transmission of the virus.

Is there an uptick in the UK?

All of the above context serves as a (lengthy) introduction to why I am monitoring the published figures at the moment, as the UK has been (informally as well as formally) relaxing some aspects of its lockdown, imposed on March 23rd, but with gradual changes since about the end of May, both in the public’s response and in some of the Government interventions.

My own forecasting model (based on the Alex de Visscher MatLab code, and my variations, implemented in the free Octave version of the MatLab code-base) is still tracking published data quite well, although over the last couple of weeks I think the published rate of deaths is slightly above other forecasts, as well as my own.

Worldometers forecast

The Worldometers forecast is showing higher forecast deaths in the UK than when I reported before – 47,924 now vs. 43,962 when I last posted on this topic on June 11th:

My forecasts

The equivalent forecast from my own model still stands at 44,367 for September 30th, as can be seen from the charts below; but because we are still near the weekend, when the UK reported numbers are always lower, owing to data collection and reporting issues, I shall wait a day or two before updating my model to fit.

But having been watching this carefully for a few weeks, I do think that some unconscious public relaxation of social distancing in the fairer UK weather (in parks, on demonstrations and at beaches, as reported in the press since at least early June) might have something to do with a) case numbers, and b) subsequent numbers of deaths not falling at the expected rate. Here are two of my own charts that illustrate the situation.

In the first chart, we see the reported and modelled deaths to Sunday 28th June; this chart shows clearly that since the end of May, the reported deaths begin to exceed the model prediction, which had been quite accurate (even slightly pessimistic) up to that time.

In the next chart, I show the outlook to September 30th (comparable date to the Worldometers chart above) showing the plateau in deaths at 44,367 (cumulative curve on the log scale). In the daily plots, we can see clearly the significant scatter (largely caused by weekly variations in reporting at weekends) but with the daily deaths forecast to drop to very low numbers by the end of September.

I will update this forecast in a day or two, once this last weekend’s variations in UK reporting are corrected.


Current Coronavirus model forecast, and next steps

Introduction

This post covers the current status of my UK Coronavirus (SARS-CoV-2) model, stating the June 2nd position, and comparing with an update on June 3rd, reworking my UK SARS-CoV-2 model with 83.5% intervention effectiveness (down from 84%), which reduces the transmission rate to 16.5% of its pre-intervention value (instead of 16%), prior to the 23rd March lockdown.

This may not seem a big change, but as I have said before, small changes early on have quite large effects later. I did this because I see some signs of growth in the reported numbers, over the last few days, which, if it continues, would be a little concerning.
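To put a number on the size of that change, here is a small Python sketch of the arithmetic only. It is not the model itself, and the function name is my own illustrative choice.

```python
def residual_transmission(effectiveness):
    """Fraction of the pre-lockdown transmission rate that remains."""
    return 1.0 - effectiveness


before = residual_transmission(0.84)    # about 0.16
after = residual_transmission(0.835)    # about 0.165
relative_rise = after / before - 1.0
print(f"{before:.3f} -> {after:.3f}, a {100 * relative_rise:.1f}% rise in transmission")
```

So a half-point drop in effectiveness corresponds to roughly a 3% rise in the post-lockdown transmission rate, which then compounds over the following weeks.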

I sensed some urgency in the June 3rd Government update, on the part of the CMO, Chris Whitty (who spoke at much greater length than usual) and the CSA, Sir Patrick Vallance, to highlight the continuing risk, even though the UK Government is seeking to relax some parts of the lockdown.

They also mentioned more than once that the significant “R” reproductive number, although less than 1, was close to 1, and again I thought they were keen to emphasise this. The scientific and medical concern and emphasis was pretty clear.

These changes are in the context of quite a bit of debate around the science between key protagonists, and I begin with the background to the modelling and data analysis approaches.

Curve fitting and forecasting approaches

Curve-fitting approach

I have been doing more homework on Prof. Michael Levitt’s Twitter feed, where he publishes much of his latest work on Coronavirus. There’s a lot to digest (some of which I have already reported, such as his EuroMOMO work) and I see more methodology to explore, and also lots of third party input to the stream, including Twitter posts from Prof. Sir David Spiegelhalter, who also publishes on Medium.

I DO use Twitter, although a lot less nowadays than I used to (8.5k tweets over a few years, but not at such high rate lately); much less is social nowadays, and more is highlighting of my https://www.briansutton.uk/ blog entries.

Core to that work are Michael’s curve fitting methods, in particular regarding the Gompertz cumulative distribution function and the Change Ratio / Sigmoid curve references that Michael describes. Other functions are also available(!), such as the Richards function.

This curve-fitting work looks at an entity’s published data regarding cases and deaths (China, the Rest of the World and other individual countries were some important entities that Michael has analysed) and attempts to fit a postulated mathematical function to the data, first to enable a good fit, and then for projections into the future to be made.
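As an illustration of what fitting a postulated function means, here is a minimal Python sketch of one standard textbook form of the Gompertz cumulative curve. The parametrisation and all the numbers are assumptions made for illustration, and may well differ from the form Michael's team uses. The point is that ln(-ln(N/K)) is a straight line in time for data that follow this curve, where K is the plateau.

```python
import math


def gompertz(t, plateau, rate, t_mid):
    """Cumulative Gompertz curve: plateau * exp(-exp(-rate * (t - t_mid)))."""
    return plateau * math.exp(-math.exp(-rate * (t - t_mid)))


def linearised(count, plateau):
    """ln(-ln(count / plateau)), which is linear in t for Gompertz data."""
    return math.log(-math.log(count / plateau))


# Hypothetical parameters, chosen only to illustrate the shape
PLATEAU, RATE, T_MID = 40000.0, 0.05, 60.0
for day in (40, 60, 80):
    n = gompertz(day, PLATEAU, RATE, T_MID)
    print(day, round(n), round(linearised(n, PLATEAU), 3))
```

In this form the transformed values fall on the line -rate * (t - t_mid), so the slope and intercept of a straight-line fit would recover the growth parameters once a plateau is assumed.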

This has worked well, most notably in Michael’s work in forecasting, in early February, the situation in China at the end of March. I reported this on March 24th when the remarkable accuracy of that forecast was reported in the press:

Forecasting approach

Approaching the problem from a slightly different perspective, my model (based on a model developed by Prof. Alex de Visscher at Concordia University) is a forecasting model, with my own parameters and settings, and UK data, and is currently matching death rate data for the UK, on the basis of Government reported “all settings” deaths.

The model is calibrated to fit known data as closely as possible (using key parameters such as those describing virus transmission rate and incubation period), and then solves the Differential Equations describing the behaviour of the virus, to arrive at a predictive model for the future. No mathematical equation is assumed for the charts and curve shapes; their behaviour is constructed bottom-up from the known data, postulated parameters, starting conditions and differential equations.

The model solves the differential equations that represent an assumed relationship between “compartments” of people, including, but not necessarily limited to Susceptible (so far unaffected), Infected and Recovered people in the overall population.

I had previously explored such a generic SIR model (with just three such compartments) using code based on the Galbraith solution to the relevant Differential Equations. My subsequent article on the Reproductive number R0 was set in the context of the SIR (Susceptible-Infected-Recovered) model, but my current model is based on Alex’s 7 Compartment model, allowing for gradations of sickness and multiple compartment transition routes (although NOT with reinfection).

SEIR models allow for an Exposed but not Infected phase, and SEIRS models add a loss of immunity to Recovered people, returning them eventually to the Susceptible compartment. There are many such options – I discussed some in one of my first articles on SIR modelling, and then later on in the derivation of the SIR model, mentioning a reference to learn more.
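To make the compartment idea concrete, here is a minimal Python sketch of the basic three-compartment SIR equations, integrated with simple fixed-step Euler updates. It is not the 7 compartment model, nor the Octave code. The parameter values (R0 of 2.5, a 14 day infectious period, a population of one million) are illustrative assumptions only.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the basic SIR equations with fixed-step Euler updates.

    dS/dt = -beta*S*I/N,  dI/dt = beta*S*I/N - gamma*I,  dR/dt = gamma*I
    Returns a list of (day, S, I, R) tuples, one per whole day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = int(round(1 / dt))
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((day, s, i, r))
    return history


# Illustrative values only: R0 = beta / gamma = 2.5, 14-day infectious period
GAMMA = 1 / 14
BETA = 2.5 * GAMMA
run = simulate_sir(population=1_000_000, initial_infected=10,
                   beta=BETA, gamma=GAMMA, days=365)
peak_day, _, peak_infected, _ = max(run, key=lambda row: row[2])
print(f"Peak of {peak_infected:,.0f} infected on day {peak_day}")
print(f"Ever infected after a year: {run[-1][2] + run[-1][3]:,.0f}")
```

An SEIR version would add an Exposed compartment between S and I, and an SEIRS version would add a flow from R back to S. In both cases the structure of the update loop stays the same.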

Although, as Michael has said, the slowing of growth of SARS-CoV-2 might be because it finds it hard to locate further victims, I should have thought that this was already described in the Differential Equations for SIR related models, and that the compartment links in the model (should) take into account the effect of, for example, social distancing (via the effectiveness % parameter in my model). I will look at this further.

The June 2nd UK reported and modelled data

Here are my model output charts up to June 2nd exactly, as of the UK Government briefing that day, and they show (apart from the last few days over the weekend) a very close fit to reported death data**. The charts are presented as a sequence of slides:

These charts all represent the same UK deaths data, but presented in slightly different ways – linear and log y-axes; cumulative and daily numbers; and to date, as well as the long term outlook. The current long term outlook of 42,550 deaths in the UK is within error limits of the Worldometers linked forecast of 44,389, presented at https://covid19.healthdata.org/united-kingdom, but is not modelled on it.

**I suspected that my 84% effectiveness of intervention would need to be reduced a few points (c. 83.5%) to reflect a little uptick in the UK reported numbers in these charts, but I waited until midweek, to let the weekend under-reporting work through. See the update below**.

I will also be interested to see if that slight uptick we are seeing on the death rate in the linear axis charts is a consequence of an earlier increase in cases. I don’t think it will be because of the very recent and partial lockdown relaxations, as the incubation period of the SARS-CoV-2 virus means that we would not see the effects in the deaths number for a couple of weeks at the earliest.

I suppose, anecdotally, we may feel that UK public response to lockdown might itself have relaxed a little over the last two or three weeks, and might well have had an effect.

The periodic scatter of the reported daily death numbers around the model numbers is because of the regular weekend drop in numbers. Reporting is always delayed over weekends, with the ground caught up over the Monday and Tuesday, typically – just as for 1st and 2nd June here.

A few numbers are often reported for previous days at other times too, when the data wasn’t available at the time, and so a given daily total is typically not precisely, or only, the deaths that occurred on that particular day.

The cumulative charts tend to mask these daily variations as the cumulative numbers dominate small daily differences. This applies to the following updated charts too.
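A common alternative way to see through the weekend dips in the daily numbers is a trailing 7-day average. The Python sketch below shows the idea. It is not something my model does, and the daily counts are made-up numbers with a weekend dip, used purely for illustration.

```python
def rolling_mean(values, window=7):
    """Trailing moving average; entries before a full window is available are None."""
    averages = []
    for index in range(len(values)):
        if index + 1 < window:
            averages.append(None)
        else:
            chunk = values[index + 1 - window:index + 1]
            averages.append(sum(chunk) / window)
    return averages


# Hypothetical daily counts with a weekend dip in the last two days of each week
daily = [300, 310, 290, 305, 295, 120, 110,
         280, 290, 270, 285, 275, 100, 90]
smoothed = rolling_mean(daily)
print(smoothed[6], smoothed[13])
```

Because every 7-day window contains exactly one weekend, the weekly reporting pattern largely cancels out, leaving the underlying trend.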

**June 3rd update for 83.5% intervention effectiveness

I have reworked the model for 83.5% intervention effectiveness, which reduces the transmission rate to 16.5% of its starting value, prior to 23rd March lockdown. Here is the equivalent slide set, as of 3rd June, one day later, and included in this post to make comparisons easier:

These charts reflect the June 3rd reported deaths at 39,728 and daily deaths on 3rd June of 359. The model long-term prediction is 44,397 deaths in this scenario, almost exactly the Worldometers forecast illustrated above.

We also see the June 3rd reported and modelled cumulative numbers matching, but we will have to watch the growth rate.

Concluding remarks

I’m not as concerned to model cases data as accurately, because the reported numbers are somewhat uncertain, collected as they are in different ways by four Home Countries, and by many different regions and entities in the UK, with somewhat different definitions.

My next steps, as I said, are to look at the Sigmoid and data fitting charts Michael uses, and compare the same method to my model generated charts.

*NB The UK Office for National Statistics (ONS) has been working on the Excess Deaths measure, amongst other data, including deaths where Covid-19 is mentioned on the death certificate, not requiring a positive Covid-19 test as the Government numbers do.

As of 2nd June, the Government announced 39,369 deaths in its standard “all settings” – Hospitals, Community AND Care homes (with a Covid-19 test diagnosis) but the ONS are mentioning 62,000 Excess Deaths today. A little while ago, on the 19th May, the ONS figure was 55,000 Excess Deaths, compared with 35,341 for the “all settings” UK Government number. I reported that in my EuroMOMO data analysis post at https://www.briansutton.uk/?p=2302.

But none of the ways of counting deaths is without its issues. As the King’s Fund says on their website, “In addition to its direct impact on overall mortality, there are concerns that the Covid-19 pandemic may have had other adverse consequences, causing an increase in deaths from other serious conditions such as heart disease and cancer.

“This is because the number of excess deaths when compared with previous years is greater than the number of deaths attributed to Covid-19. The concerns stem, in part, from the fall in numbers of people seeking health care from GPs, accident and emergency and other health care services for other conditions.

“Some of the unexplained excess could also reflect under-recording of Covid-19 in official statistics, for example, if doctors record other causes of death such as major chronic diseases, and not Covid-19. The full impact on overall and excess mortality of Covid-19 deaths, and the wider impact of the pandemic on deaths from other conditions, will only become clearer when a longer time series of data is available.”


Another perspective on Coronavirus – Prof. Michael Levitt

Owing to the serendipity of a contemporary and friend of mine at King’s College London, Andrew Ennis, wishing one of HIS contemporaries in Physics, Michael Levitt, a happy birthday on 9th May, and mentioning me and my Coronavirus modelling attempts in passing, I am benefiting from another perspective on Coronavirus from Michael Levitt.

The difference is that Prof. Michael Levitt is a Nobel laureate in 2013 in computational biosciences…and I’m not! I’m not a Fields Medal winner either (there is no Nobel Prize for Mathematics, the Fields Medal being an equivalently prestigious accolade for mathematicians). Michael is Professor of Structural Biology at the Stanford School of Medicine.

I did win the Drew Medal for Mathematics in my day, but that’s another (lesser) story!

Michael has turned his attention, since the beginning of 2020, to the Coronavirus pandemic, and has kindly sent me a number of references to his work, and to his other recent work in the field.

I had already referred to Michael in an earlier blog post of mine, following a Times report of his amazingly accurate forecast of the limits to the epidemic in China (in which he was taking a particular interest).

I felt it would be useful to report on the most recent of the links Michael sent me regarding his work, the interview given to Freddie Sayers of UnHerd at https://unherd.com/thepost/nobel-prize-winning-scientist-the-covid-19-epidemic-was-never-exponential/ reported on May 2nd. I have added some extracts from UnHerd’s coverage of this interview, but it’s better to watch the interview.

As UnHerd’s report says, “With a purely statistical perspective, he has been paying close attention to the Covid-19 pandemic since January, when most of us were not even aware of it. He first spoke out in early February, when through analysing the numbers of cases and deaths in Hubei province he predicted with remarkable accuracy that the epidemic in that province would top out at around 3,250 deaths.

“His observation is a simple one: that in outbreak after outbreak of this disease, a similar mathematical pattern is observable regardless of government interventions. After around a two week exponential growth of cases (and, subsequently, deaths) some kind of break kicks in, and growth starts slowing down. The curve quickly becomes ‘sub-exponential’.

UnHerd reports that he takes specific issue with the Neil Ferguson paper which, along with some others, was hugely influential with the UK Government (amongst others) in taking drastic action, moving away from a “herd immunity” approach to a lockdown approach to suppress infection transmission.

“In a footnote to a table it said, assuming exponential growth of 15% for six days. Now I had looked at China and had never seen exponential growth that wasn’t decaying rapidly.

“The explanation for this flattening that we are used to is that social distancing and lockdowns have slowed the curve, but he is unconvinced. As he put it to me, in the subsequent examples to China of South Korea, Iran and Italy, ‘the beginning of the epidemics showed a slowing down and it was very hard for me to believe that those three countries could practise social distancing as well as China.’ He believes that both some degree of prior immunity and large numbers of asymptomatic cases are important factors.

“He disagrees with Sir David Spiegelhalter’s calculations that the total is around one additional year of excess deaths, while (by adjusting to match the effects seen on the quarantined Diamond Princess cruise ship, and also in Wuhan, China) he calculates that it is more like one month of excess death that is needed before the virus peters out.

“He believes the much-discussed R0 is a faulty number, as it is meaningless without the time infectious alongside.” I discussed R0 and its derivation in my article about the SIR model and R0.

Interestingly, Prof. Alex de Visscher, whose original model I have been adapting for the UK, also calibrated his thinking, in part, by considering the effect of the Coronavirus on the captive, closed community on the Diamond Princess, as I reported in my Model Update on Coronavirus on May 8th.

The UnHerd article finishes with this quote: “I think this is another foul-up on the part of the baby boomers. I am a real baby boomer — I was born in 1947, I am almost 73 years old — but I think we’ve really screwed up. We’ve caused pollution, we’ve allowed the world’s population to increase threefold in my lifetime, we’ve caused the problems of global warming and now we’ve left your generation with a real mess in order to save a relatively small number of very old people.”

I suppose, as a direct contemporary, that I should apologise too.

There’s a lot more at the UnHerd site, but better to hear it directly from Michael in the video.


Model update for the latest UK Coronavirus numbers

Introduction and summary

This is a brief update to my UK model predictions in the light of a week’s published data regarding Covid-19 cases and deaths in all settings – hospitals, care homes and the community – rather than just hospitals and the community, as previously.

In order to get the best fit between the model and the published data, I have had to reduce the effectiveness of interventions (lockdown, social distancing, home working etc) from 85% last week (in my post immediately following the Government change of reporting basis) to 84.1% at present.

This reflects the fact that care homes, new to the numbers, seem to influence the critical R0 number upwards on average, and it might be that R0 is between 0.7 and 0.9, which is uncomfortably near to 1. It is already higher in hospitals than in the community, but the care home figures in the last week have increased R0 on average. See my post on the SIR model and importance of R0 to review the meaning of R0.

Predicted cases are now at 2.8 million (not reflecting the published data, but an estimate of the underlying real cases) with fatalities at 42,000.

The Government have said that they are to sample people randomly in different settings (hospital, care homes and the community), and regionally, better to understand how the transmission rate, and the influence on the R0 reproductive number, differs in those settings, and also in different parts of the UK.

Ideally a model would forecast the pandemic growth on the basis of these individually, and then aggregate them, and I’m sure the Government advisers will be doing that. As for my model, I am adjusting overall parameters for the whole population on an average basis at this point.

Another model upgrade, which has already been made by academics at Imperial College and at Harvard, is to explore the cyclical behaviour that results from partial relaxations of the different lockdown components. Such a model represents the response of the pandemic to a relaxation (a probable increase in growth to some extent), then a re-tightening of lockdown measures to cope with that, followed by another fall in transmission rates. Repeating this loop into 2021 and 2022 shows a cyclical behaviour of the pandemic, excluding any pharmaceutical measures (e.g. vaccines and medicines). I covered this in my previous article on exit strategy.

This explains Government reluctance to promise any significant easing of lockdown in any specific timescales.

Current predictions

My UK model (based on the work of Prof. Alex de Visscher at Concordia University in Montreal for other countries) is calibrated on the most accurate published data up to the lockdown date, March 23rd, which is the data on daily deaths in the UK.

Once that fit of the model to the known data has been achieved, by adjusting the assumed transmission rates, the data for deaths after lockdown – the intervention – is matched by adjusting parameters reflecting the assumed effectiveness of the intervention measures.

Data on cases is not so accurate by a long way, and examples from “captive” communities indicate that deaths vs. cases run at about 1.5% (e.g. the Diamond Princess cruise ship data).

The Italy experience also plays into this relationship between deaths and actual (as opposed to published) case numbers – it is thought that a) only a single figure percentage of people ever get tested (8% was Alex’s figure), and b) again in Italy, the death rate was probably higher than 1.5% because their health service couldn’t cope for a while, with insufficient ICU provision.

In the model, allowing for that 8%, a factor of 12.5 is applied to public total and active cases data, to reflect the likely under-reporting of case data, since there are relatively few tests.
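The arithmetic behind these two adjustments can be sketched in a few lines of Python. The function names are illustrative assumptions. The second function is only a consistency check on the figures quoted in this post, not a description of how the model derives its case numbers.

```python
def underreporting_factor(fraction_tested):
    """Multiplier from reported to estimated actual cases (8% tested gives 12.5)."""
    return 1.0 / fraction_tested


def implied_cases(deaths, fatality_ratio):
    """Cases implied by a death count and an assumed deaths-per-case ratio."""
    return deaths / fatality_ratio


print(underreporting_factor(0.08))
print(f"{implied_cases(42_000, 0.015):,.0f}")
```

The second line of output shows that 42,000 deaths at a 1.5% deaths-to-cases ratio corresponds to about 2.8 million cases, which agrees with the model's predicted totals quoted above.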

In the model, once the fit to known data (particularly deaths to date) is made as close as possible, then the model is run over whatever timescale is desired, to look at its predictions for cases and deaths – at present a short-term forecast to June 2020, and a longer term outlook well into 2021, by when outcomes in the model have stabilised.

Model charts for deaths

The fit of the model here can be managed well, post lockdown, by adjusting the percentage effectiveness of the intervention measure, and this is currently set at 84.1%. This model predicts fatalities in the UK at 42,000. They are reported currently (8th May 2020) at 31,241.

Model charts for cases

As we can see here, the fit for cases isn’t as good, but the uncertainty in case number reporting accuracy, owing to the low level of testing, and the variable experience from other countries such as Italy, means that this is an innately less reliable basis for forecasting. The model prediction for the outcome of UK case numbers is 2.8 million.

If testing, tracking and tracing is launched effectively in the UK, then this would enable a better basis for predictions for case numbers than we currently have.

Conclusions?!

I’m certainly not at a concluding stage yet. A more complex model is probably necessary to predict the situation, once variations to the current lockdown measures begin to happen, likely over the coming month or two in the first instance.

Models are being developed and released by research groups, such as that being developed by the RAMP initiative at https://epcced.github.io/ramp/

Academics from many institutions are involved, and I will take a look at the models being released to see if they address the two points I mentioned here: the variability of R0 across settings and geography, and the cyclical behaviour of the pandemic in response to lockdown variations.

At the least, perhaps, my current model might be enhanced to allow a time-dependent interv_success variable, instead of a constant lockdown effectiveness representation.
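One simple way such a time-dependent variable might look is a piecewise-constant schedule, sketched below in Python rather than Octave. The 84.1% value from lockdown on 23rd March comes from this post, and I have taken that date as day 51 of the model run. The later steps and their dates are entirely hypothetical, as are the function names.

```python
# Hypothetical schedule: (first day it applies, effectiveness). Day 51 is taken
# to be the 23rd March lockdown; the later entries are made-up relaxation steps.
SCHEDULE = [(0, 0.0), (51, 0.841), (120, 0.80), (150, 0.75)]


def interv_success(day, schedule=SCHEDULE):
    """Piecewise-constant intervention effectiveness in force on a given day."""
    current = schedule[0][1]
    for start_day, effectiveness in schedule:
        if day >= start_day:
            current = effectiveness
        else:
            break
    return current


def transmission_rate(day, base_rate):
    """Base transmission rate scaled by the effectiveness in force that day."""
    return base_rate * (1.0 - interv_success(day))


for day in (10, 60, 130, 200):
    print(day, interv_success(day), round(transmission_rate(day, 1.0), 3))
```

A schedule like this would let successive relaxations and re-tightenings be represented without changing the rest of the model, although a smoother transition between steps might be more realistic.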


Re-modelling after changes to UK Coronavirus data collection and reporting

Change of reporting basis

The UK Government yesterday changed the reporting basis for Coronavirus numbers, retrospectively (since 6th March 2020) adding in deaths in the Care Home and other settings, and also modifying the “Active Cases” to match, and so I have adjusted my model to match.

This historic information is more easily found on the Worldometers site; apart from current day numbers, it is harder to find the tabular data on the UK.gov site, and I guess Worldometers have a reliable web services feed from most national reporting web pages.

The increase in daily and cumulative deaths over the period contrasts with a slight reduction in daily active case numbers over the period.

Understanding the variations in epidemic parameters

With more resources, it would make sense to model different settings separately, and then combine them. If (as it is) the reproduction number R0<1 for the community, the population at large (although varying by location, environment etc), but higher in hospitals, and even higher in Care Homes, then these scenarios would have different transmission rates in the model, different effectiveness of counter-measures, and differences in several other parameters of the model(s). Today the CSA (Sir Patrick Vallance) stated that indeed, there is to be a randomised survey of people in different places (geographically) and situations (travel, work etc) to work out where the R-value is in different parts of the population.

But I have continued with the means at my disposal (the excellent basis for modelling in Alex de Visscher’s paper that I have been using for some time).

Ultimately, as I said in my article at https://www.briansutton.uk/?p=1595, a multi-phase model will be needed (as per the Imperial College and Harvard models illustrated here):-

and I am sure that it is the Imperial College version of this (by Neil Ferguson and his team) that will be to the forefront in that advice. The models look at variations in policy regarding different aspects of the lockdown interventions, and the response of the epidemic to them. This leads to the cyclicity illustrated above.

In my model, the rate of deaths is the most accurate data available (even though the basis for reporting it has just changed), and the model fit is based on that. I have incorporated that reporting update into the model.

Up to lockdown (March 23rd in the UK, day 51), an infection transmission rate k11 (the rate of infection of previously uninfected people by those in the infected compartment) and a correction factor are used to get the model fit as close as possible prior to the intervention date. k11 is one of a combination of infection rates: the others are k12 from sick (S) people, k13 from seriously sick (SS) people and k14 from recovering (B, better) people, each acting on the uninfected community (U). All of those sub-rates could be adjusted in the model, and taken together they define the overall rate of transition of people from Uninfected to Infected.
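To make the combination of rates concrete, here is a minimal Python sketch of how four sub-rates could combine into a single Uninfected-to-Infected flow. The compartment labels and rate names follow the text. The standard mass-action form (each infectious compartment weighted by its rate, then scaled by the uninfected fraction) is an assumption rather than something taken from the model's code. The correction factor is left out because the text does not say how it enters.

```python
def infection_rate(U, I, S, SS, B, k11, k12, k13, k14, population):
    """People per day moving from Uninfected to Infected.

    Assumed mass-action form: every infectious compartment contributes
    its own rate constant times its headcount, and the total is scaled
    by the fraction of the population that is still uninfected.
    """
    force = k11 * I + k12 * S + k13 * SS + k14 * B
    return force * U / population


# Illustrative numbers only (not fitted values): 10 infected people,
# nobody yet in the other infectious compartments, k11 = 0.5 per day.
rate = infection_rate(U=1000, I=10, S=0, SS=0, B=0,
                      k11=0.5, k12=0.0, k13=0.0, k14=0.0,
                      population=1000)
print(rate)
```

Written this way, adjusting any one of the four sub-rates changes the overall transition rate, which is why they can be tuned jointly when fitting the pre-lockdown data.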

After lockdown, the various interventions – social distancing, school and large event closures, restaurant and pub closures and all the rest – are represented by an intervention effectiveness percentage, and this is modified (as an average across all those settings I mentioned before) to get the fit of the model after the lockdown measures as close as possible to the reported data, up to the current date.

I had been using an intervention effectiveness of 90% latterly, as the UK community response to the Government’s advice has been pretty good.

But with the UK Government’s move to include data from other settings (particularly the Care Home setting) I have had to reduce that overall percentage to 85% (having modelled several options from 80% upwards) to match the increased reported historic death rate.
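A small worked calculation may help show why a five-point change in effectiveness matters. It assumes, purely for illustration, that the intervention scales the pre-lockdown transmission rate by (1 - effectiveness); how the model actually applies the percentage is not stated here, and `residual_transmission` is a hypothetical helper.

```python
def residual_transmission(effectiveness):
    """Fraction of pre-lockdown transmission that remains (assumed form)."""
    return 1.0 - effectiveness


r90 = residual_transmission(0.90)  # about 10% of transmission remains
r85 = residual_transmission(0.85)  # about 15% of transmission remains

# Ratio of remaining transmission at 85% versus 90% effectiveness.
ratio = r85 / r90
print(round(ratio, 2))
```

Under that assumption, moving from 90% to 85% leaves roughly one and a half times as much transmission in play, so a seemingly small change in the percentage can shift the long-term outlook noticeably.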

It is, of course, more realistic to include all settings in the reported numbers, and in fact my model was predicting on that basis at the start. Now that we have a few more weeks of data, and all the reported data rather than just some of it, I am more confident that my original forecast of 39,000 deaths in the UK (for this single-phase outlook) is a better estimate than the update to 29,000 deaths that I made a week or so ago (with 90% intervention effectiveness) in the Model Refinement article referred to above, when I was trying to fit just hospital deaths (having no other reference point at that time).

Here are the charts for 85% intervention effectiveness; two for the long term outlook, into 2021, and two up until today’s date (with yesterday’s data):

Another output would be for UK cases, and I’ll just summarise with these charts for all cases up until June 2020 (where the modelled case numbers just begin to level off in the model):-

As we can see, the fit here isn’t as good, but this also reflects the fact that the data is less certain than for deaths, and is collected in many different ways across the UK, in the four home countries, and in the conurbations, counties and councils that input to the figures. I will probably have to adjust the model again within a few days, but the outlook, long term, of the model is for 2.6 million cases of all types. We’ll see…

Outlook beyond the Lockdown – again

I’m modest about my forecasts, but the methodology shows me the kind of advice the Government will be getting. The behavioural “science” part of the advice (not in the model) taking the public “tiredness” into account, was the reason for starting partial lockdown later, wasn’t it?

It would be more of the same if we pause the wrong aspects of lockdown too early for these reasons. Somehow the public have to “get” the rate of infection point into their heads, and that you can be infecting people before you have symptoms yourself. The presentation of the R number in today’s Government update might help that awareness. My article on R0 refers.

Neil Ferguson of Imperial College was publishing papers at least as far back as 2006 on the mitigation of flu epidemics by such lockdown means, modelling very similar non-pharmaceutical methods of controlling infection rates – social distancing, school closures, no public meetings and all the rest. Here is the 2006 paper, just one of 188 publications over the years by Ferguson and his team: https://www.nature.com/articles/nature04795

The following material is very recent, and, of course, focused on the current pandemic. https://www.imperial.ac.uk/…/Imperial-College-COVID19…

All countries would have been aware of this from the thinking around MERS, SARS and other outbreaks. We have a LOT of prepared models to fall back on.

As other commentators have said, Neil Ferguson has HUGE influence with the UK Government. I’m not sure how quickly UK scientists themselves were off the mark (as well as Government). We have moved from “herd immunity” and “flattening the curve” as mantras, to controlling the rate of infection by the measures we currently have in place, the type of lockdown that other countries were already using (in Europe, Italy did that two weeks before we did, although Government is saying that we did it earlier in the life of the epidemic here in the UK).

One or two advisory scientists have broken ranks on this (John Edmunds, reported at https://www.reuters.com/…/special-report-johnson… ) to say that the various Committees should have been faster with their firm advice to Government. Who knows?

But what is clear from the public pronouncements is that Governments are now VERY aware of the issue of further peaks in the epidemic, and I would be very surprised to see rapid or significant change in the lockdown (it already allows some freedoms here in the UK that are not available in some other countries, for example exercise outings once a day). I wouldn’t welcome being even more socially distanced than others, as a fit 70+ year-old person, through the policy of shielding, but if it has to be done, so be it.

The SIR model and importance of the R0 Reproductive Number

In the recent daily UK Government presentations, the R0 Reproductive Number has been mentioned a few times, and with good reason. Its value is as a commonly accepted measure of the propensity of an infectious disease outbreak to become an epidemic.

It turns out to be a relatively simple number to define, although working back from current data to calculate it is awkward if you don’t have the data. That’s my next task, from public data.

If R0 is below 1, then the epidemic will reduce and disappear. If it is greater than 1, then the epidemic grows.
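The threshold at 1 can be seen in a few lines of Python. This is a sketch of the textbook three-compartment SIR model with simple Euler time-stepping, not the multi-compartment model used in these posts. In this textbook form R0 = beta / gamma; the recovery rate, population and starting numbers are illustrative assumptions.

```python
def simulate_sir(r0, gamma=0.1, population=1_000_000,
                 initial_infected=100, days=200, dt=0.1):
    """Return the number of infectious people at the end of each day."""
    beta = r0 * gamma                  # transmission rate implied by R0
    s = float(population - initial_infected)
    i = float(initial_infected)
    history = [i]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
        history.append(i)
    return history


declining = simulate_sir(r0=0.8)   # below the threshold
growing = simulate_sir(r0=2.5)     # above the threshold

print(declining[-1] < declining[0])   # infections die away
print(max(growing) > growing[0])      # infections rise to a peak first
```

With R0 below 1 each infectious person replaces themselves with fewer than one new case, so the infectious count falls from day one; above 1 it climbs until enough of the population has been infected to bring the effective rate back under 1.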

The UK Government and its health advisers have made a few statements about it, and I covered these in an earlier post.

This is a more technical post, just to present a derivation of R0, and its consequences. I have used a few research papers to help with this, but the focus, brevity(!) and mistakes are mine (although I have corrected some in the sources).