
SARS-CoV-2 modelling situation report

Introduction

As we start September, the UK situation regarding Covid-19 cases and deaths has changed somewhat.

Since the UK Government re-assessed the way deaths data is collected and reported, the reported daily deaths resulting from Covid-19 infections have (thankfully) reduced to a very low level, as we see from the UK Government Covid-19 reporting website.

Cases, however, as we see from the Government chart on the right, have started to rise again, although for a number of reasons the impact on deaths has been less than before. Note that this chart plots people testing Covid-19 positive (daily and total to date) against time.

I have integrated this real-world UK reported data with my model data to assess what is happening.

Reporting changes for UK deaths

As I reported in my August 17th post, the reporting of daily deaths in England had previously set no time limit between an individual’s positive test for Covid-19 and that person’s death.

The three other home countries in the UK had already been applying a 28-day limit for this interval. It was felt that, for England, this lack of a limit on the time interval resulted in over-reporting of deaths from Covid-19. Even someone who had died in a road accident, say, would have been reported as a Covid-19 death if they had ever tested positive, and had then recovered from Covid-19, no matter how long before their death the positive test had occurred.

This adjustment to the reporting was applied retroactively in England for all reported daily deaths, which resulted in a cumulative reduction of c. 5,000 in the UK reported deaths up to August 12th.

Case numbers and antibody testing

You can see from the following Chart 10 that the plateau for modelled cases is of the order of 3 million. This startling view is supported by a recent Imperial College antibody study reported by U.K. Government here.

I have applied a factor of 8.3 to the reported cases in Chart 10 to bring them into line with the modelled cases, owing to significant under-reporting of the number of UK cases (based on positive Covid-19 tests).

Modelled Cases & Deaths development since Feb 1st – Uninfected, Cumulative Deaths, Uninfected & Seriously Sick

The reported cases (defined, as above, by UK Government as people who have had a positive Covid-19 test) are just 337,168 as at September 1st, as we see from the following Chart 9.
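As a quick check on the arithmetic, the scaling can be reproduced in a few lines of Python. This is only a sketch: the function names are made up for illustration, and the inputs are simply the reported figure above and the factor of 8.3 applied in Chart 10.

```python
# Inputs taken from the text; the function names are illustrative only.
REPORTED_CASES = 337_168   # UK reported cases as at September 1st
SCALING_FACTOR = 8.3       # factor applied to reported cases in Chart 10


def scaled_cases(reported, factor):
    """Estimate total infections by scaling up the reported cases."""
    return reported * factor


def detection_fraction(factor):
    """Share of infections that show up as reported cases for a given factor."""
    return 1.0 / factor


estimate = scaled_cases(REPORTED_CASES, SCALING_FACTOR)
print(f"Scaled case estimate: {estimate:,.0f}")
print(f"Implied share detected: {detection_fraction(SCALING_FACTOR):.1%}")
```

The scaled figure comes out at about 2.8 million, consistent with a modelled plateau of the order of 3 million, and a factor of 8.3 implies that roughly 12% of infections appear as reported cases.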

Modelled vs Reported Compartment development – Uninfected, Cumulative Cases & Deaths. Modelled Uninfected, All Infected & Seriously Sick

Testing, antibodies and case counting

The four pillars of Covid-19 testing include a single pillar of antibody testing, although it isn’t clear exactly which class of antibody is being tested. Not all antibody tests are the same.

It is also the case that despite more than 16 million Covid-19 tests having been processed in the UK to date (September 1st), the great majority of people have never been tested.

The under-reporting of cases (defining cases as those who have ever had Covid-19) was, in effect, confirmed by the major antibody testing programme, led by Imperial College London, involving over 100,000 people, finding that just under 6% of England’s population – an estimated 3.4 million people – had antibodies to Covid-19, and were therefore likely previously to have had the virus, prior to the end of June.

Even my modelled cases are likely to be a little under-estimated, and some update to my model’s calculation of cases will be made shortly.

Quite apart from the definition and counting of cases, according to a recent report by The Times, referencing this article from the BMJ, results obtained from some antibody testing might well be under-estimated too.

Stephen Burgess, from the Medical Research Council Biostatistics Unit at Cambridge University, and one of the authors, said: “It’s possible that somebody could have antibodies present in their saliva but not in their blood and it’s possible that somebody could have one class of antibody but not another class of antibodies.”

In particular, most antibody tests do not look for a type of response called IgA antibodies, which are made in mucus — in the mouth, eyes and nose. “In certain respiratory diseases, it’s well-documented that it’s possible to beat the infection with an IgA response,” he said.

When scientists have tested for IgA as well as the standard IgG antibodies, they have on occasions found hugely different results. In Luxembourg, IgA were found in 11 per cent of people compared with 2 per cent who tested positive using more conventional tests.

Dr Burgess said that calibrating tests using people who had been more severely ill may mean that a lot of asymptomatic infections are being missed.

The full report is here.

The Times concludes that it’s possible that herd immunity is closer than we think, with regional variations.

Reported Cases and Deaths

The following slide presentation shows only reported data for the UK. With Tom Sutton’s help, I have linked his previously developed Worldometers scraping code to my MATLAB/Octave model for Coronavirus (originally developed by Prof Alex de Visscher at Concordia University, Montreal). The scraping code interrogates the daily updated Worldometers site for the UK and retrieves the reported cases and deaths data to populate the model.

This allows me to plot both modelled forecast data and reported data on the same charts, plotted from the Octave forecasting model.

  • Reported UK Deaths vs. Cases since Feb 15th 2020, log chart
  • Reported UK Deaths vs. Cases since Feb 15th 2020, linear chart
  • Reported UK Deaths since Feb 15th 2020, linear chart
  • Reported UK Cases and Deaths since Feb 15th 2020, dual axis, log deaths, linear cases
  • Reported UK Cases and Deaths since Feb 15th 2020, linear dual-axis chart
  • Reported UK Cases and Deaths since Feb 15th 2020, log chart

Chart 3 shows reported deaths plotted against cases, on a log chart, and shows the log curve for deaths flattening as cumulative cases (on the linear x-axis) increase over time, indicating that the ratio of deaths/cases is reducing. This can also be seen very clearly on the linear scaled Chart 4.

Chart 5 shows cumulative deaths over time on linear axes, exhibiting the typical S-curve for infectious diseases; as of September 1st, daily deaths in the UK are in single figures.

Chart 6 shows deaths on a log y-axis (left) and cases on a linear y-axis (right).

Chart 7 plots both deaths and cases on linear y-axes (left and right respectively) for more direct comparison, and again we see that recently, since about Day 110 (June 1st), cases have increased proportionately much faster than deaths. This date is fairly close to the time that the UK started to ease its lockdown restrictions.

Finally, Chart 8, plotting both deaths and cases on the same log y-axis, shows the relative progression over nearly 200 days since the onset of the pandemic.

These different views clearly show the recent changes in the way the epidemic is playing out in the UK population. Bear in mind that reported cases need something like a factor of 10 applied to bring them to a realistic figure.

Evidence for the under-estimation of Cases

The Imperial College antibody study referenced above is also in line with the estimate made by Prof. Alex de Visscher, author of my original model code, that the number of cases is typically under-reported by a factor of 12.5 – i.e. that only c. 8% of cases are detected and reported, an estimate assessed in the early days for the Italian outbreak, at a time when “test and trace” wasn’t in place anywhere.

A further sanity check on my forecasted case numbers, relative to the forecasted number of deaths, would be the observed mortality from Covid-19, where this can be assessed.

A London School of Hygiene & Tropical Medicine team published, in March 2020, an analysis of the Covid-19 outbreak in the closed community of the Diamond Princess cruise ship.

Adjusting for delay from confirmation-to-death, this paper estimated case and infection fatality ratios (CFR, IFR) for COVID-19 on the Diamond Princess ship as 2.3% (0.75%-5.3%) and 1.2% (0.38%-2.7%) respectively. See the World Health Organisation (WHO) description of CFR & IFR here.

In broad terms, my model forecast of c. 42,000 deaths and up to 3 million cases would be a ratio of about 1.4%, and so the IFR relationship between the deaths and cases numbers in my charts seems reasonable.
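The same sanity check can be written out explicitly. The sketch below uses only the round numbers quoted in this section; the variable and function names are illustrative, and treating the modelled deaths/cases ratio as an IFR is the assumption already made in the text.

```python
# Round figures quoted in the text; names are illustrative only.
MODELLED_DEATHS = 42_000
MODELLED_CASES = 3_000_000

# Diamond Princess IFR estimate and range quoted above (as fractions).
DP_IFR_CENTRAL = 0.012
DP_IFR_RANGE = (0.0038, 0.027)


def fatality_ratio(deaths, infections):
    """Deaths divided by infections, as a fraction."""
    return deaths / infections


ifr = fatality_ratio(MODELLED_DEATHS, MODELLED_CASES)
within_range = DP_IFR_RANGE[0] <= ifr <= DP_IFR_RANGE[1]

print(f"Modelled deaths/cases ratio: {ifr:.1%}")
print(f"Inside the Diamond Princess IFR range: {within_range}")
```

The modelled ratio of 1.4% sits inside the quoted Diamond Princess IFR range, a little above its central estimate of 1.2%.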

(NB since we know that the risk of death from Covid-19 is higher in older people, and the age profile of cruise ship passengers is probably higher than average, the Diamond Princess percentages are at the high end of the spectrum.)

Reasons for the reducing deaths/cases ratio

Reported deaths per case are reducing significantly, because:

a) we are more aware of taking care of older people in Care Homes (and certainly not knowingly sending Covid-19 positive old folks to them), sadly lacking in the early days of the pandemic in many countries;

b) relatively more young people are being infected as compared with older people because they are the ones working and going out more, and they have lower mortality than older people;

c) we have some better experience and palliative treatments to help some people recover (e.g. dexamethasone, as described at https://www.sps.nhs.uk/articles/summary-of-covid-19-medicines-guidance-critical-care/); and

d) daily cases are increasing, rather than reducing, as deaths are.

This is covered in a very good article by Rowland Manthorpe, technology correspondent, and Isla Glaister, data editor of Sky News, whose reports I have read before. The article makes very clear the changes in the age-profile of cases from early March to the end of July.

UK weekly confirmed cases by age, published by Sky September 2nd 2020

Another view of this is from The Times on September 5th, data sourced from Public Health England (PHE);

UK weekly confirmed cases by age, published by The Times September 5th 2020

and, more specifically, here is how the proportion of cases has shifted between under 40s and over 50s from March until September.

Changing age profile of Covid-19 cases, published by The Times September 6th

Issues for modelling presented by local spikes

Modelling the epidemic for the UK is now really difficult, as nearly all of the cases having an impact on the UK national statistics are caused by local outbreaks, or spikes – what I call multiple super-spreader events. Although that isn’t quite the right description, these are being caused by behaviour such as lack of social distancing, and maybe erratic mask-wearing on flights returning to the UK with pre- and even post-diagnostic cases on board.

The super-spreader events in the early days in Italy (and in the UK) were caused by people, unknowingly and asymptomatically infected, returning to their home countries from overseas and infecting others.

The increasingly frequent recent events we are seeing are caused, it seems to me, by people who ought, nowadays, to have more awareness of the risks, and know better, compared to those in the early days.

What would be needed to model such events is good local data for each one, and some kind of model for how, when and how often, statistically, these events might occur (aircraft, pubs, clubs, demonstrations, illegal raves and all the rest). Possibly even religious gatherings and other such cultural (including sporting) gatherings have a role.

So modelling this bottom-up is difficult – but feasible, hopefully. In any case, what is needed at the moment is a time-dependent way of handling the infection risks, in the context of these events, the way that lockdown easing points have been introduced to the model.
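To illustrate what a time-dependent handling of infection risk can look like, here is a minimal sketch in Python. It is not the seven-compartment Octave model used for the charts in this post: it is a basic three-compartment SIR model, stepped forward with simple Euler integration, in which the transmission rate is multiplied by a factor that changes at chosen days. All parameter values, the change-point days and the function names are assumptions chosen for illustration.

```python
def transmission_rate(day, base_rate, change_points):
    """Transmission rate in force on `day`.

    `change_points` is a list of (start_day, multiplier) pairs sorted by
    start_day; the last pair whose start_day has been reached applies.
    """
    multiplier = 1.0
    for start_day, value in change_points:
        if day >= start_day:
            multiplier = value
    return base_rate * multiplier


def run_sir(population, initial_infected, base_rate, recovery_rate,
            change_points, days, dt=0.1):
    """Euler-integrate a basic SIR model; returns (day, S, I, R) per day."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = []
    for day in range(days):
        history.append((day, s, i, r))
        beta = transmission_rate(day, base_rate, change_points)
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = recovery_rate * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
    return history


# Illustrative values only (not fitted to UK data).
POPULATION = 66_000_000
MEASURES = [(51, 0.2), (105, 0.3)]  # lockdown, then a slight easing

with_measures = run_sir(POPULATION, 100, 0.3, 0.1, MEASURES, days=200)
without_measures = run_sir(POPULATION, 100, 0.3, 0.1, [], days=200)

print(f"Recovered by day 199 with measures:    {with_measures[-1][3]:,.0f}")
print(f"Recovered by day 199 without measures: {without_measures[-1][3]:,.0f}")
```

A local spike could be represented in the same way, as a further change point that raises the multiplier for a limited period, although a national multiplier is a crude stand-in for the local data that would really be needed.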

Worldometers/IHME forecasts and charts

I might say that modelling only by curve fitting, top-down, is pretty incomplete in my view. Phenomenological methods forecast the future based on the past with no ability to model or reflect changes in intervention methods, public behaviour and responses; and I see no capability in the methodology to take super-spreader events into account.

This might be difficult for bottom-up mechanistic modelling, but it’s impossible for broad, country-based curve-fitting, as no link can be made from input changes in government measures, population responses and individual behaviour, to their influence on outcomes.

I covered the comparative phenomenological and mechanistic methods in my previous posts on July 14th and July 18th.

In the charts that follow, we see that forecasts are made for three scenarios: current projections; mandates easing; and universal masks.

To do this, as IHME (Institute for Health Metrics and Evaluation at the University of Washington, USA) say at the IHME FAQ (Frequently Asked Questions) page, Worldometers/IHME forecasts rely on both statistical and disease transmission models: “Our current model is not a disease transmission model. It is a hybrid model that combines both a statistical modeling approach and a disease transmission approach, leveraging the strengths of both types of models, and scaling the results of the disease transmission model to the results of the statistical model.”

This enables them to calibrate outcomes based on three outbreak management scenarios.

Illustrating the point, I show the IHME forecast for the UK, followed by that for the USA. First the UK:

Worldometers forecast for the UK, with three scenarios and error bounds

It seems that IHME forecasts for the UK, linked to the Worldometers UK site, are based on a broader view of UK deaths, relating to those where Covid-19 is mentioned on the death certificate, as defined by the UK Office for National Statistics (ONS), but not necessarily cited as the cause of death.

This is even though the Worldometers current reporting charts themselves are consistent with UK Government reported data, which presents deaths in all settings (including hospitals, care homes and the community) but only when Covid-19 is cited as the cause of death.

These ONS and IHME numbers are higher than the UK Government (and Worldometers) statistic. The daily numbers I have been using, presented by the UK Government, continue to be based on the narrower definition – Covid-19 as the cause of death on the death certificate.

Nevertheless, my main point here isn’t about the absolute numbers, but about the forecasting scenarios. We can see that the IHME methodology allows for several forecasting scenarios – current projections based on the interventions currently in place; mandates easing; and universal mask-wearing.

The US IHME forecast is presented similarly:

Worldometers forecast for the USA, with three scenarios and error bounds

In the case of the USA, the numbers are far larger, for a much bigger population, and at worst they are staggering. Covid-19 deaths, currently 187,770 on this chart, have already exceeded Michael Levitt’s well-publicised curve-fitting Twitter forecast made in mid-July, which indicated that by August 25th USA excess deaths would have reduced to a very low level, and that the USA experience of the pandemic would essentially be over, with 170,000 deaths. It seems he agrees that the forecast, or at least the way he expressed it, was a mistake.

Michael Levitt’s well-publicised curve-fitting Twitter forecast made in mid-July, indicating that by August 25th the USA excess deaths will have reduced to a very low level, and that the USA experience of the pandemic would essentially be over, with 170,000 deaths
Michael Levitt’s statement that his estimate of 170,000 reported deaths made on 11th July was 7K too low.

See Michael’s new UnHerd interview with Freddie Sayers.

As for excess deaths, no measure is without its issues, and the problem there is that Covid-19 deaths will probably have replaced deaths from some other causes (people go out less, so there will be fewer road accident deaths, for example).

This means that excess deaths reducing to zero isn’t by any means a sufficient test that the SARS-CoV-2 pandemic is all over bar the shouting.

IHME can predict several scenarios, as for the UK, and at best they are predicting 288,381 deaths by the end of the year for the USA. At worst their number is over 600,000. I’m sure things wouldn’t be allowed to get to that.

But these kinds of scenarios for different potential interventions, in combinations, or when eased, just aren’t going to work with curve-fitting alone. Given just 3 (or at best, 4) parameters for a least-squares fit of a Sigmoid, Gompertz or more general Richards / Generalised Logistic curve to the reported data, any changes to Government interventions and/or public response (even nationally, let alone for local spikes) can’t be reflected. It’s a top-down view of reported data (however well-cleansed), not a bottom-up causation model with the ability to make variations to strategies for intervention.
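To make the point concrete, here is a minimal sketch in Python of a three-parameter Gompertz fit. It is not the method used by Michael Levitt’s team or by IHME, and the data are synthetic, generated from a known curve purely for illustration. The sketch assumes the form N(t) = K·exp(−b·exp(−c·t)); for each trial value of the plateau K it fits a straight line to ln(ln(K/N)) against t, and it keeps the K that gives the smallest sum of squared errors against the original counts. The function names and the grid of trial K values are my own choices.

```python
import math


def gompertz(t, k, b, c):
    """Cumulative count at time t: K * exp(-b * exp(-c * t))."""
    return k * math.exp(-b * math.exp(-c * t))


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_gompertz(times, counts, k_candidates):
    """Return (K, b, c) for the trial plateau K with the smallest squared error.

    Counts must be positive, and only trial values above max(counts) are used.
    """
    best = None
    for k in k_candidates:
        if k <= max(counts):
            continue
        # ln(ln(K/N)) = ln(b) - c*t is a straight line in t.
        ys = [math.log(math.log(k / n)) for n in counts]
        slope, intercept = fit_line(times, ys)
        b, c = math.exp(intercept), -slope
        sse = sum((gompertz(t, k, b, c) - n) ** 2 for t, n in zip(times, counts))
        if best is None or sse < best[0]:
            best = (sse, k, b, c)
    if best is None:
        raise ValueError("no trial plateau exceeds the largest observed count")
    return best[1:]


# Synthetic cumulative counts for the first 60 days of a known Gompertz curve.
times = list(range(60))
counts = [gompertz(t, 42_000, 8.0, 0.05) for t in times]

k_fit, b_fit, c_fit = fit_gompertz(times, counts, range(30_000, 60_001, 500))
print(f"K = {k_fit}, b = {b_fit:.3f}, c = {c_fit:.4f}")
print(f"Forecast for day 120: {gompertz(120, k_fit, b_fit, c_fit):,.0f}")
```

Nothing in K, b or c refers to lockdowns, easing or mask-wearing, which is exactly the limitation: the fitted curve can be extended into the future, but there is no input to change when an intervention changes.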

Mechanistic modelling is hard to do, takes longer and is more expensive in computer time (especially when trying to cover many countries individually); that is where a broader helicopter top-down view from curve-fitting can help to get started. But curve-fitting is not an actionable model for deciding between intervention methods.

I covered these methods in my blog posts on July 14th and July 18th as I was sanity checking my own outlook on modelling methods as between mechanistic modelling (the broad type of the model I use) and phenomenological / statistical methods.

The Imperial College resources

As I have already reported in my blog post on July 18th, Imperial College (and others such as The London School of Hygiene and Tropical Medicine) use a variety of model types and data sources (as do IHME) spanning both mechanistic and statistical methods (which include phenomenological techniques) for forecasts at different levels of detail and over different periods. These are described at the Imperial College’s Medical Research Council MRC Global Infectious Disease Analysis website, where this chart is presented, describing their different methods:

COVID-19 planning tools
Epidemiological models use a combination of mechanistic and statistical approaches.

and they go on to describe the key characteristics of the approaches:

Mechanistic model: Explicitly accounts for the underlying mechanisms of diseases transmission and attempt to identify the drivers of transmissibility. Rely on more assumptions about the disease dynamics.

Statistical model: Do not explicitly model the mechanism of transmission. Infer trends in either transmissibility or deaths from patterns in the data. Rely on fewer assumptions about the disease dynamics.

Mechanistic models can provide nuanced insights into severity and transmission but require specification of parameters – all of which have underlying uncertainty. Statistical models typically have fewer parameters. Uncertainty is therefore easier to propagate in these models. However, they cannot then inform questions about underlying mechanisms of spread and severity.

The forecasts they have made, as you can see, just as the IHME forecasts do, rely on several methodologies.

The table I have shown before from the pivotal Imperial College modelling team March 16th paper:

PC=school and university closure, CI=home isolation of cases, HQ=household quarantine, SD=large-scale general population social distancing, SDOL70=social distancing of those over 70 years for 4 months (a month more than other interventions)

shows the capability to model a range of Non Pharmaceutical Interventions (NPIs) alone or in different combinations to arrive at forecasts based on such strategies. I covered the NPI variations in some detail in my August 14th blog post, and the mechanistic, statistical and phenomenological approaches in my July 14th blog post and July 18th post.

Discussion

My UK model is tracking quite well after a small change in intervention effectiveness since March 23rd to reflect the retroactive August 12th Government changes in counting deaths, and a slight easing of lockdown on day 105 (May 17th). We see a lot happening here and in other countries, with travel restrictions and quarantining measures changing all the time. It is unlikely that countries will revert to large scale lockdowns.

This is partly because lockdown is seen by many to have done its job; partly because of its negative economic and social impacts; and partly because we know more about the effects of the individual interventions available. Mechanistic modelling methods help discriminate between the effects of the different interventions.

One of the key factors in the choice of interventions is on the basis of longer-term outcomes – the effect of actions taken today on future “herd” immunity of the population, which I covered in my July 31st blog post.

I mention again the influential March 16th Imperial College paper in this respect which, while published nearly 6 months ago, does give an insight into the complexity and capability of modelling methods and data sources and intervention discrimination available to Government advisers.

Modelling on an overall national basis will need some enhancement to cope with the large number of local “spikes” and other events that we have been seeing recently.

Concluding comments

There are reasons for concern – the possibility that current spikes in cases might lead to a major “wave” in the epidemic; that autumn isn’t too far away; and that influenza and other related diseases such as SARS-CoV-2 are more prevalent in the autumn/winter months.

The BBC have reported that the return of students to Universities in the UK is expected to lead to a high risk of increasing the rate of Covid-19 cases. We will see.

I leave it to the Sky News summary to express closing thoughts, and some optimism.

The fear among government scientists is that if the outbreak gets out of control among young people, it will eventually leak into the more vulnerable parts of the population. What might look like a divergence between cases and deaths is actually just a larger lag. To find the answer to that, the best places to look are France and Spain, where cases are rising fast, but deaths and hospitalisations are still low. But whatever happens, we should remember: this isn’t March all over again. We test so much more. We know so much more about treatment. And we all understand how to change our behaviour. That is cause for optimism as we face the next six months.


Phenomenology & Coronavirus – modelling and curve-fitting

Introduction

I have been wondering for a while how to characterise the difference in approaches to Coronavirus modelling of cases and deaths, between “curve-fitting” equations and the SIR differential equations approach I have been using (originally developed in Alex de Visscher’s paper this year, which included code and data for other countries such as Italy and Iran) which I have adapted for the UK.

Part of my uncertainty has its roots in being a very much lapsed mathematician, and part is because although I have used modelling tools before, and worked in some difficult areas of mathematical physics, such as General Relativity and Cosmology, epidemiology is a new application area for me, with a wealth of practitioners and research history behind it.

Fitting curves such as the Sigmoid and Gompertz curves (members of a family of curves known as logistics or Richards functions) to the Coronavirus cases or deaths numbers, as practised notably by Prof. Michael Levitt and his Stanford University team, has had success in predicting the situation in China, and is being applied in other localities too.

Michael’s team have now worked out a way of reducing the predictive aspect of the Gompertz function to a straight-line predictor of reported data, a much more efficient use of computer time than some other approaches.

The SIR model approach, setting up a series of related differential equations (something I am more used to in other settings) that describe postulated mechanisms and rates of virus transmission in the human population (hence called “mechanistic” modelling), looks beneath the surface presentation of the epidemic cases and deaths numbers and time series charts, to model the growth (or otherwise) of the epidemic based on postulated characteristics of viral transmission and behaviour.
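For reference, the simplest member of this family, the basic three-compartment SIR model, is usually written as below. The symbols are the usual textbook ones rather than notation taken from the model I use, which has more compartments: S, I and R are the numbers of susceptible, infected and removed (recovered or dead) people, N is the total population, β is the transmission rate and γ is the recovery rate.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I,
\end{aligned}
\qquad N = S + I + R .
```

In this simple form the basic reproductive number is the ratio β/γ, and interventions are represented by changing β over time.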

Research literature

In researching the literature, I have become familiar with some names that crop up frequently in this area over the years.

Focusing on some familiar and frequently recurring names, rather than more recent practitioners, might lead me to fall into “The Trouble with Physics” trap: the tendency, highlighted by Lee Smolin in his book of that name, of some University professors to recruit research staff “in their own image”, who are working in the mainstream, rather than outliers whose work might be seen as off-the-wall, and less worthy in some sense.

In this regard, Michael Levitt’s new work in the curve-fitting approach to the Coronavirus problem might be seen by others who have been working in the field for a long time as on the periphery (despite his 2013 Nobel Prize in Chemistry, awarded for computational modelling work, and his Stanford University position as Professor of Structural Biology).

Using his curve-fitting methods (he used Sigmoid curves before the current Gompertz curves), he broadly forecast, very early on, a much lower incidence of the virus going forward, successfully so in the case of China. His results are in direct contrast to those of some teams working as advisers to Governments which have, all around the world, applied fairly severe lockdowns, in most cases for a period of several months.

In particular, the work of the Imperial College Covid response team, and also that of the London School of Hygiene and Tropical Medicine, has been at the forefront of advice to the UK Government.

Some Governments have taken a different approach (Sweden stands out in Europe in this regard, for several reasons).

I am keen to understand the differences, or otherwise, in such approaches.

Twitter and publishing

Michael chooses to publish his work on Twitter, owing to a glitch (at least for a time) with his Stanford University laboratory’s own publishing process. There are many useful links there to his work.

My own succession of blog posts (all more narrowly focused on the UK) have been automatically published to Twitter (a setting I use in WordPress) and also, more actively, shared by me on my FaceBook page.

But I stopped using Twitter routinely a long while ago (after 8000+ posts) because, in my view, it is a limited communication medium (despite its reach), not allowing much room for nuanced posts. It attracts extremism at worst, conspiracy theorists to some extent, and, as with a lot of published media, many people who choose on a “confirmation bias” approach to read only what they think they might agree with.

One has only to look at the thread of responses to Michael’s Twitter links to his forecasting results and opinions to see examples of all kinds of Twitter users: some genuinely academic and/or thoughtful; some criticising the lack of published forecasting methods, despite frequent posts, although they have now appeared as a preprint here; many advising to watch out (often in extreme terms) for “big brother” government when governments ask or require their populations to take precautions of various kinds; and others simply handclapping, because they think that the message is that this all might go away without much action on their part, some of them actively calling for resistance even to some of the most trivial precautionary requests.

Preamble

One of the recent papers I have found useful in marshalling my thoughts on methodologies is this 2016 one by Gerardo Chowell, and it finally led me to calibrate the differences in principle between the SIR differential equation approach I have been using (but a 7-compartment model, not just three) and the curve-fitting approach.

I had been thinking of analogies to illustrate the differences (which I will come to later), but this 2016 Chowell paper, in particular, encapsulated the technical differences for me, and I summarise that below. The Sergio Alonso paper also covers this ground.

Categorization of modelling approaches

Gerardo Chowell’s 2016 paper summarises modelling approaches as follows.

Phenomenological models

A dictionary definition – “Phenomenology is the philosophical study of observed unusual people or events as they appear without any further study or explanation.”

Chowell states that phenomenological approaches for modelling disease spread are particularly suitable when significant uncertainty clouds the epidemiology of an infectious disease, including the potential contribution of multiple transmission pathways.

In these situations, phenomenological models provide a starting point for generating early estimates of the transmission potential and generating short-term forecasts of epidemic trajectory and predictions of the final epidemic size.

Such methods include curve fitting, as used by Michael Levitt, where an equation (represented by a curve on a time-incidence graph (say) for the virus outbreak), with sufficient degrees of freedom, is used to replicate the shape of the observed data with the chosen equation and its parameters. Sigmoid and Gompertz functions (types of “logistics” or Richards functions) have been used for such fitting – they produce the familiar “S”-shaped curves we see for epidemics. The starting growth rate, the intermediate phase (with its inflection point) and the slowing down of the epidemic, all represented by that S-curve, can be fitted with the equation’s parametric choices (usually three or four).

This chart was put up by Michael Levitt on July 8th to illustrate curve fitting methodology using the Gompertz function. See https://twitter.com/MLevitt_NP2013/status/1280926862299082754
Chart by Michael Levitt illustrating his Gompertz function curve fitting methodology

A feature that some epidemic outbreaks share is that growth of the epidemic is not fully exponential, but is “sub-exponential” for a variety of reasons, and Chowell states that:

“Previous work has shown that sub-exponential growth dynamics was a common phenomenon across a range of pathogens, as illustrated by empirical data on the first 3-5 generations of epidemics of influenza, Ebola, foot-and-mouth disease, HIV/AIDS, plague, measles and smallpox.”

Choices of appropriate parameters for the fitting function can allow such sub-exponential behaviour to be reflected in the chosen function’s fit to the reported data, and it turns out that the Gompertz function is more suitable for this than the Sigmoid function, as Michael Levitt states in his recent paper.

Once a curve-fit to reported data to date is achieved, the curve can be used to make forecasts about future case numbers.
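The fit-then-forecast idea can be sketched in a few lines. For a Gompertz curve, the logarithm of the daily log-growth, ln(ln(X[t+1]/X[t])), falls on a straight line in time, so an ordinary least-squares line recovers the parameters. This is my own illustrative sketch on synthetic, noise-free data; it is not necessarily the procedure used in Michael Levitt's paper, and real reported data would be noisy and need more care.

```python
import math

def gompertz(t, plateau, midpoint, timescale):
    return plateau * math.exp(-math.exp(-(t - midpoint) / timescale))

def fit_gompertz(cumulative):
    """Estimate (plateau, midpoint, timescale) from daily cumulative counts."""
    ts, ys = [], []
    for t in range(len(cumulative) - 1):
        growth = math.log(cumulative[t + 1] / cumulative[t])
        if growth > 0:  # skip days with no recorded growth
            ts.append(t)
            ys.append(math.log(growth))
    t_mean = sum(ts) / len(ts)
    y_mean = sum(ys) / len(ys)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    sxx = sum((t - t_mean) ** 2 for t in ts)
    slope = sxy / sxx                    # equals -1 / timescale
    intercept = y_mean - slope * t_mean
    timescale = -1.0 / slope
    midpoint = timescale * (intercept - math.log(1.0 - math.exp(-1.0 / timescale)))
    # Plateau implied by the most recent data point
    t_last = len(cumulative) - 1
    plateau = cumulative[t_last] * math.exp(math.exp(-(t_last - midpoint) / timescale))
    return plateau, midpoint, timescale

# Synthetic "reported" data for 60 days, generated from made-up parameters
data = [gompertz(t, 40000.0, 30.0, 10.0) for t in range(60)]
plateau, midpoint, timescale = fit_gompertz(data)
forecast_day_90 = gompertz(90, plateau, midpoint, timescale)
```

Once the three parameters are estimated from the history to date, the same function is simply evaluated at a future day to give the forecast.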

Mechanistic and statistical models

Chowell states that “several mechanisms have been put forward to explain the sub-exponential epidemic growth patterns evidenced from infectious disease outbreak data. These include spatially constrained contact structures shaped by the epidemiological characteristics of the disease (i.e., airborne vs. close contact transmission model), the rapid onset of population behavior changes, and the potential role of individual heterogeneity in susceptibility and infectivity.”

He goes on to say that “although attractive to provide a quantitative description of growth profiles, the generalized growth model (described earlier) is a phenomenological approach, and hence cannot be used to evaluate which of the proposed mechanisms might be responsible for the empirical patterns.

Explicit mechanisms can be incorporated into mathematical models for infectious disease transmission, however, and tested in a formal way. Identification and analysis of the impacts of these factors can lead ultimately to the development of more effective and targeted control strategies. Thus, although the phenomenological approaches above can tell us a lot about the nature of epidemic patterns early in an outbreak, when used in conjunction with well-posed mechanistic models, researchers can learn not only what the patterns are, but why they might be occurring.

On the Imperial College team’s planning website, they state that their forecasting models (they have several for different purposes, for just these reasons I guess) fall variously into the “Mechanistic” and “Statistical” categories, as follows.

COVID-19 planning tools
Imperial College models use a combination of mechanistic and statistical approaches.

Mechanistic model: Explicitly accounts for the underlying mechanisms of disease transmission and attempts to identify the drivers of transmissibility. Relies on more assumptions about the disease dynamics.

Statistical model: Do not explicitly model the mechanism of transmission. Infer trends in either transmissibility or deaths from patterns in the data. Rely on fewer assumptions about the disease dynamics.

Mechanistic models can provide nuanced insights into severity and transmission but require specification of parameters – all of which have underlying uncertainty. Statistical models typically have fewer parameters. Uncertainty is therefore easier to propagate in these models. However, they cannot then inform questions about underlying mechanisms of spread and severity.

So Imperial College’s “statistical” description matches more closely Chowell’s description of a phenomenological approach, although it may not involve curve-fitting per se.

The SIR modelling framework, employing differential equations to represent postulated relationships and transitions between Susceptible, Infected and Recovered parts of the population (at its most simple) falls into this Mechanistic model category.

Chowell makes the following useful remarks about SIR style models.

“The SIR model and derivatives is the framework of choice to capture population-level processes. The basic SIR model, like many other epidemiological models, begins with an assumption that individuals form a single large population and that they all mix randomly with one another. This assumption leads to early exponential growth dynamics in the absence of control interventions and susceptible depletion and greatly simplifies mathematical analysis (note, though, that other assumptions and models can also result in exponential growth).

The SIR model is often not a realistic representation of the human behavior driving an epidemic, however. Even in very large populations, individuals do not mix randomly with one another—they have more interactions with family members, friends, and coworkers than with people they do not know.

This issue becomes especially important when considering the spread of infectious diseases across a geographic space, because geographic separation inherently results in nonrandom interactions, with more frequent contact between individuals who are located near each other than between those who are further apart.

It is important to realize, however, that there are many other dimensions besides geographic space that lead to nonrandom interactions among individuals. For example, populations can be structured into age, ethnic, religious, kin, or risk groups. These dimensions are, however, aspects of some sort of space (e.g., behavioral, demographic, or social space), and they can almost always be modeled in similar fashion to geographic space.”

Here we begin to see the difference I was trying to identify between the curve-fitting approach and my forecasting method. At one level, one could argue that curve-fitting and SIR-type modelling amount to the same thing – choosing parameters that make the theorised data model fit the reported data.

But, whether it produces better or worse results, or with more work rather than less, SIR modelling seeks to understand and represent the underlying virus incubation period, infectivity, transmissibility, duration and related characteristics such as recovery and immunity (for how long, or not at all) – the why and how, not just the what.

The (nonlinear) differential equations are then solved numerically (rather than analytically with exact functions) and there does have to be some fitting to the initial known data for the outbreak (i.e. the history up to the point the forecast is being done) to calibrate the model with relevant infection rates, disease duration and recovery timescales (and death rates).
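As a hedged illustration of what “solved numerically” means, here is a minimal sketch of the basic three-compartment SIR equations integrated with a simple Euler scheme. It is not my 7-compartment model, and the function name, the parameter values and the step size are all illustrative assumptions.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the basic SIR equations with a simple Euler scheme.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    steps_per_day = int(round(1 / dt))
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))  # one (S, I, R) record per day
    return history

# Illustrative values only: R0 = beta / gamma = 3
history = simulate_sir(population=1_000_000, initial_infected=10,
                       beta=0.3, gamma=0.1, days=300)
peak_infected = max(i for _, i, _ in history)
```

Calibration then amounts to adjusting `beta`, `gamma` and the starting conditions until the modelled history matches the reported data to date, after which the same equations are run forward to produce the forecast.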

This makes it look similar in some ways to choosing appropriate parameters for any fitting function (Sigmoid, Gompertz or generalised logistic, each often with three or four parameters).

But the curve-fitting approach is reproducing an observed growth pattern (one might say top-down, or focused on outputs), whereas the SIR approach is setting virological and other behavioural parameters to seek to explain the way the epidemic behaves (bottom-up, or focused on inputs).

Metapopulation spatial models

Chowell makes reference to population-level models, formulations that are used for the vast majority of population based models that consider the spatial spread of human infectious diseases and that address important public health concerns rather than theoretical model behaviour. These are beyond my scope, but could potentially address concerns about indirect impacts of the Covid-19 pandemic.

a) Cross-coupled metapopulation models

These models, which have been used since the 1940s, do not model the process that brings individuals from different groups into contact with one another; rather, they incorporate a contact matrix that represents the strength or sum total of those contacts between groups only. This contact matrix is sometimes referred to as the WAIFW, or “who acquires infection from whom” matrix.

In the simplest cross-coupled models, the elements of this matrix represent both the influence of interactions between any two sub-populations and the risk of transmission as a consequence of those interactions; often, however, the transmission parameter is considered separately. An SIR style set of differential equations is used to model the nature, extent and rates of the interactions between sub-populations.
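A minimal sketch of a cross-coupled model follows, assuming one SIR system per group with the groups linked only through a WAIFW matrix. The function name, the two-group example and every number in it are hypothetical; here each matrix element combines contact strength and transmission risk, as in the simplest models described above.

```python
def simulate_cross_coupled(populations, infected, waifw, gamma, days, dt=0.1):
    """Euler integration of SIR groups coupled by a WAIFW matrix.

    waifw[a][b] is the transmission rate to susceptibles in group a
    from infected people in group b.
    """
    n = len(populations)
    s = [float(populations[k] - infected[k]) for k in range(n)]
    i = [float(x) for x in infected]
    r = [0.0] * n
    for _ in range(int(round(days / dt))):
        # Force of infection on each group, computed before any update
        force = [sum(waifw[a][b] * i[b] / populations[b] for b in range(n))
                 for a in range(n)]
        for a in range(n):
            new_inf = force[a] * s[a] * dt
            new_rec = gamma * i[a] * dt
            s[a] -= new_inf
            i[a] += new_inf - new_rec
            r[a] += new_rec
    return s, i, r

# Two equal groups; the outbreak is seeded in group 0 only
populations = [500_000, 500_000]
waifw = [[0.30, 0.02],
         [0.02, 0.30]]
s, i, r = simulate_cross_coupled(populations, [10, 0], waifw, gamma=0.1, days=300)
```

Even with weak off-diagonal coupling, the second group, which starts with no infections at all, is eventually drawn into the epidemic through its contacts with the first.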

b) Mobility metapopulation models

These models incorporate into their structure a matrix to represent the interaction between different groups, but they are mechanistically oriented and do this by considering the actual process by which such interactions occur. Transmission of the pathogen occurs within sub-populations, but the composition of those sub-populations explicitly includes not only residents of the sub-population, but visitors from other groups.

One type of model uses a “gravity” approach for inter-population interactions, where contact rates are proportional to group size and inversely proportional to the distance between them.
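The gravity rule as described above can be written as a one-line function. This is only a sketch: the constant `k`, the function name and the town figures are hypothetical, and published gravity models often raise the populations and the distance to fitted powers, which I have not done here.

```python
def gravity_contact_rate(pop_a, pop_b, distance, k=1.0):
    """Contact rate proportional to both group sizes and inversely proportional to distance."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * pop_a * pop_b / distance

# Hypothetical towns: the same pair of populations, 10 km and 20 km apart
near = gravity_contact_rate(50_000, 20_000, distance=10.0)
far = gravity_contact_rate(50_000, 20_000, distance=20.0)
```

Doubling the distance halves the contact rate, while doubling either population doubles it.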

Another type described by Chowell uses a “radiation” approach, which uses population data relating to home locations, and to job locations and characteristics, to theorise “travel to work” patterns, calculated using attractors that such job locations offer, influencing workers’ choices and resulting travel and contact patterns.

Transportation and mobile phone data can be used to populate such spatially oriented models. Again, SIR-style differential equations are used to represent the assumptions in the model about between whom, and how, the pandemic spreads.

Summary of model types

We see that there is a range of modelling methods, successively requiring more detailed data, but which seek increasingly to represent the mechanisms (hence “mechanistic” modelling) by which the virus might spread.

We can see the key difference between curve-fitting (what I called a surface level technique earlier) and the successively more complex models that seek to work from assumed underlying causations of infection spread.

An analogy (picking up on the word “surface” I have used here) might refer to explaining how waves in the sea behave. We are all aware that out at sea, wave behaviour is perceived more as a “swell”, somewhat long wavelength waves, sometimes of great height, compared with shorter, choppier wave behaviour closer to shore.

I’m not here talking about breaking waves – a whole separate theory is needed for those – René Thom‘s Catastrophe Theory – but continuous waves.

A curve-fitting approach might well find a very good fit using trigonometric sine waves to represent the wavelength and height of the surface waves, even recognising that the fitted parameters vary with the depth of the ocean. But it would need an understanding of hydrodynamics, as described, for example, by Bernoulli’s Equation, to represent how and why the wavelength and wave height (and speed*) change with the depth of the water (and some other characteristics).

(*PS remember that the water moves, pretty much, up and down, in an elliptical path for any fluid “particle”, not in the direction of travel of the observed (largely transverse) wave. The horizontal motion and speed of the wave is, in a sense, an illusion.)

Concluding comments

There is a range of modelling methods, successively requiring more detailed data, from phenomenological (statistical and curve-fitting) methods, to those which seek increasingly to represent the mechanisms (hence “mechanistic”) by which the virus might spread.

We see the difference between curve-fitting and the successively more complex models that build a model from assumed underlying interactions, and causations of infection spread between parts of the population.

I do intend to cover the mathematics of curve fitting, but wanted first to be sure that the context is clear, and how it relates to what I have done already.

Models requiring detailed data about travel patterns are beyond my scope, but it is as well to set into context what IS feasible.

Setting an understanding of curve-fitting into the context of my own modelling was a necessary first step. More will follow.

References

I have found several papers very helpful on comparing modelling methods, embracing the Gompertz (and other) curve-fitting approaches, including Michael Levitt’s own recent June 30th one, which explains his methods quite clearly.

Gerard Chowell’s paper on mathematical model types, September 2016

The Coronavirus Chronologies – Michael Levitt, 13th March 2020

COVID-19 Virus Epidemiological Model Alex de Visscher, Concordia University, Quebec, 22nd March 2020

Empiric model for short-time prediction of Covid-19 spreading – Sergio Alonso et al, Spain, 19th May 2020

Universality in Covid-19 spread in view of the Gompertz function – Akira Ohnishi et al, Kyoto University, 22nd June 2020

Predicting the trajectory of any Covid-19 epidemic from the best straight line – Michael Levitt et al 30th June 2020

Categories
Coronavirus Covid-19 Michael Levitt Reproductive Number Uncategorized

Current Coronavirus model forecast, and next steps

Introduction

This post covers the current status of my UK Coronavirus (SARS-CoV-2) model, stating the June 2nd position and comparing it with a June 3rd update. For the update I reworked the model with 83.5% intervention effectiveness (down from 84%), which leaves the transmission rate after the 23rd March lockdown at 16.5% of its pre-intervention value (instead of 16%).

This may not seem a big change, but as I have said before, small changes early on have quite large effects later. I did this because I see some signs of growth in the reported numbers, over the last few days, which, if it continues, would be a little concerning.

I sensed some urgency in the June 3rd Government update, on the part of the CMO, Chris Whitty (who spoke at much greater length than usual) and the CSA, Sir Patrick Vallance, to highlight the continuing risk, even though the UK Government is seeking to relax some parts of the lockdown.

They also mentioned more than once that the significant “R” reproductive number, although less than 1, was close to 1, and again I thought they were keen to emphasise this. The scientific and medical concern and emphasis was pretty clear.

These changes are in the context of quite a bit of debate around the science between key protagonists, and I begin with the background to the modelling and data analysis approaches.

Curve fitting and forecasting approaches

Curve-fitting approach

I have been doing more homework on Prof. Michael Levitt’s Twitter feed, where he publishes much of his latest work on Coronavirus. There’s a lot to digest (some of which I have already reported, such as his EuroMOMO work) and I see more methodology to explore, and also lots of third party input to the stream, including Twitter posts from Prof. Sir David Spiegelhalter, who also publishes on Medium.

I DO use Twitter, although a lot less nowadays than I used to (8.5k tweets over a few years, but not at such high rate lately); much less is social nowadays, and more is highlighting of my https://www.briansutton.uk/ blog entries.

Core to that work are Michael’s curve fitting methods, in particular regarding the Gompertz cumulative distribution function and the Change Ratio / Sigmoid curve references that Michael describes. Other functions are also available(!), such as the Richards function.

This curve-fitting work looks at an entity’s published data regarding cases and deaths (China, the Rest of the World and other individual countries were some important entities that Michael has analysed) and attempts to fit a postulated mathematical function to the data, first to enable a good fit, and then for projections into the future to be made.

This has worked well, most notably in Michael’s work in forecasting, in early February, the situation in China at the end of March. I reported this on March 24th when the remarkable accuracy of that forecast was reported in the press:

The Times coverage on March 24th of Michael Levitt’s accurate forecast for China

Forecasting approach

Approaching the problem from a slightly different perspective, my model (based on a model developed by Prof. Alex de Visscher at Concordia University) is a forecasting model, with my own parameters and settings, and UK data, and is currently matching death rate data for the UK, on the basis of Government reported “all settings” deaths.

The model is calibrated to fit known data as closely as possible (using key parameters such as those describing virus transmission rate and incubation period), and then solves the Differential Equations describing the behaviour of the virus, to arrive at a predictive model for the future. No mathematical equation is assumed for the charts and curve shapes; their behaviour is constructed bottom-up from the known data, postulated parameters, starting conditions and differential equations.

The model solves the differential equations that represent an assumed relationship between “compartments” of people, including, but not necessarily limited to Susceptible (so far unaffected), Infected and Recovered people in the overall population.

I had previously explored such a generic SIR model (with just three such compartments), using code based on the Galbraith solution to the relevant Differential Equations. My subsequent post on the Reproductive number R0 was set in the context of the SIR (Susceptible-Infected-Recovered) model, but my current model is based on Alex’s 7 Compartment model, allowing for gradations of sickness and multiple compartment transition routes (although NOT with reinfection).

SEIR models allow for an Exposed but not Infected phase, and SEIRS models add a loss of immunity to Recovered people, returning them eventually to the Susceptible compartment. There are many such options – I discussed some in one of my first articles on SIR modelling, and then later on in the derivation of the SIR model, mentioning a reference to learn more.
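To show how these variants differ only in their compartment links, here is a hedged sketch of a single Euler time step for an SEIRS model; setting the immunity-loss rate to zero reduces it to SEIR. The function and parameter names, and the rates used, are illustrative assumptions and are not taken from Alex de Visscher's model or from mine.

```python
def seirs_step(state, beta, sigma, gamma, omega, dt):
    """Advance (S, E, I, R) by one Euler step.

    sigma: rate of leaving Exposed (1 / incubation period)
    gamma: recovery rate
    omega: rate of immunity loss; omega = 0 gives a plain SEIR model
    """
    s, e, i, r = state
    n = s + e + i + r
    newly_exposed = beta * s * i / n * dt
    newly_infectious = sigma * e * dt
    newly_recovered = gamma * i * dt
    newly_susceptible = omega * r * dt
    return (s - newly_exposed + newly_susceptible,
            e + newly_exposed - newly_infectious,
            i + newly_infectious - newly_recovered,
            r + newly_recovered - newly_susceptible)

# One year of SEIR (omega = 0) with made-up rates, in steps of 0.1 day
state = (999_990.0, 0.0, 10.0, 0.0)
recovered_track = []
for _ in range(3650):
    state = seirs_step(state, beta=0.3, sigma=0.2, gamma=0.1, omega=0.0, dt=0.1)
    recovered_track.append(state[3])
```

With `omega` greater than zero, the last term steadily returns Recovered people to the Susceptible compartment, which is what allows later waves in an SEIRS model.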

Although, as Michael has said, the slowing of growth of SARS-CoV-2 might be because it finds it hard to locate further victims, I should have thought that this was already described in the Differential Equations for SIR related models, and that the compartment links in the model (should) take into account the effect of, for example, social distancing (via the effectiveness % parameter in my model). I will look at this further.

The June 2nd UK reported and modelled data

Here are my model output charts, exactly up to June 2nd, as of the UK Government briefing that day, and they show (apart from the last few days over the weekend) a very close fit to reported death data**. The charts are presented as a sequence of slides:

These charts all represent the same UK deaths data, but presented in slightly different ways – linear and log y-axes; cumulative and daily numbers; and to date, as well as the long term outlook. The current long term outlook of 42,550 deaths in the UK is within error limits of the Worldometers linked forecast of 44,389, presented at https://covid19.healthdata.org/united-kingdom, but is not modelled on it.

**I suspected that my 84% effectiveness of intervention would need to be reduced a few points (c. 83.5%) to reflect a little uptick in the UK reported numbers in these charts, but I waited until midweek, to let the weekend under-reporting work through. See the update below**.

I will also be interested to see if that slight uptick we are seeing on the death rate in the linear axis charts is a consequence of an earlier increase in cases. I don’t think it will be because of the very recent and partial lockdown relaxations, as the incubation period of the SARS-CoV-2 virus means that we would not see the effects in the deaths number for a couple of weeks at the earliest.

I suppose, anecdotally, we may feel that UK public response to lockdown might itself have relaxed a little over the last two or three weeks, and might well have had an effect.

The periodic scatter of the reported daily death numbers around the model numbers is because of the regular weekend drop in numbers. Reporting is always delayed over weekends, with the ground caught up over the Monday and Tuesday, typically – just as for 1st and 2nd June here.

A few deaths are often reported late for previous days too, when the data wasn’t available at the time, so a given daily total typically does not consist precisely, and only, of deaths on that particular day.

The cumulative charts tend to mask these daily variations as the cumulative numbers dominate small daily differences. This applies to the following updated charts too.

**June 3rd update for 83.5% intervention effectiveness

I have reworked the model for 83.5% intervention effectiveness, which reduces the transmission rate to 16.5% of its starting value, prior to 23rd March lockdown. Here is the equivalent slide set, as of 3rd June, one day later, and included in this post to make comparisons easier:
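The arithmetic behind that change is simple enough to sketch. The function name and the normalised pre-lockdown rate of 1.0 are illustrative assumptions; only the two effectiveness percentages come from my model settings.

```python
def post_intervention_rate(pre_rate, effectiveness):
    """Transmission rate remaining after an intervention of the given effectiveness (0 to 1)."""
    return pre_rate * (1.0 - effectiveness)

pre_rate = 1.0  # normalised pre-lockdown transmission rate (illustrative)
old_rate = post_intervention_rate(pre_rate, 0.84)    # about 0.16
new_rate = post_intervention_rate(pre_rate, 0.835)   # about 0.165
relative_increase = new_rate / old_rate - 1.0         # about 0.031
```

So a half-point drop in effectiveness means roughly 3% more transmission after lockdown than before the rework, which compounds over the following weeks and explains the noticeable shift in the long-term outlook.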

These charts reflect the June 3rd reported deaths at 39,728 and daily deaths on 3rd June of 359. The model long-term prediction is 44,397 deaths in this scenario, almost exactly the Worldometers forecast illustrated above.

We also see the June 3rd reported and modelled cumulative numbers matching, but we will have to watch the growth rate.

Concluding remarks

I’m not as concerned to model cases data as accurately, because the reported numbers are somewhat uncertain, collected as they are in different ways by four Home Countries, and by many different regions and entities in the UK, with somewhat different definitions.

My next steps, as I said, are to look at the Sigmoid and data fitting charts Michael uses, and compare the same method to my model generated charts.

*NB The UK Office for National Statistics (ONS) has been working on the Excess Deaths measure, amongst other data, including deaths where Covid-19 is mentioned on the death certificate, not requiring a positive Covid-19 test as the Government numbers do.

As of 2nd June, the Government announced 39,369 deaths in its standard “all settings” – Hospitals, Community AND Care homes (with a Covid-19 test diagnosis) – but the ONS are mentioning 62,000 Excess Deaths today. A little while ago, on the 19th May, the ONS figure was 55,000 Excess Deaths, compared with 35,341 for the “all settings” UK Government number. I reported that in my EuroMOMO data analysis post, https://www.briansutton.uk/?p=2302.

But none of the ways of counting deaths is without its issues. As the King’s Fund says on their website, “In addition to its direct impact on overall mortality, there are concerns that the Covid-19 pandemic may have had other adverse consequences, causing an increase in deaths from other serious conditions such as heart disease and cancer.

“This is because the number of excess deaths when compared with previous years is greater than the number of deaths attributed to Covid-19. The concerns stem, in part, from the fall in numbers of people seeking health care from GPs, accident and emergency and other health care services for other conditions.

“Some of the unexplained excess could also reflect under-recording of Covid-19 in official statistics, for example, if doctors record other causes of death such as major chronic diseases, and not Covid-19. The full impact on overall and excess mortality of Covid-19 deaths, and the wider impact of the pandemic on deaths from other conditions, will only become clearer when a longer time series of data is available.”

Categories
Coronavirus Covid-19 Michael Levitt

Michael Levitt’s analysis of European Covid-19 data

Introduction

I promised in an earlier blog post to present Prof. Michael Levitt’s analysis of Covid-19 data published on the EuroMOMO site for European health data over the last few years.

EuroMOMO

EuroMOMO is the European Mortality Monitoring Project. Based in Denmark, their website states that the overall objective of the original European Mortality Monitoring Project was to design a routine public health mortality monitoring system aimed at detecting and measuring, on a real-time basis, excess number of deaths related to influenza and other possible public health threats across participating European Countries. More is available here.

The Excess Deaths measure

We have heard a lot recently about using the measure of “excess deaths” (on an age related basis) as our own Office for National Statistics (ONS) works on establishing a more accurate measure of the impact of the Coronavirus (SARS-CoV-2) epidemic in the UK.

I think it is generally agreed that this is a better measure – a more complete one perhaps – than those currently used by the UK Government, and some others, because there is no argument about what is and what isn’t a Covid-19 death. It’s just excess deaths over and above the seasonal, age related numbers for the geography, country or community concerned, attributing the excess to the novel Coronavirus SARS-CoV-2, the new kid on the block.

That attribution, though, might have its own different issues, such as the inclusion (or not) of deaths related to people’s reluctance to seek hospital help for other ailments, and other deaths arising from the indirect consequences of lockdown related interventions.

There is no disputing, however, that the UK Government figures for deaths have been incomplete from the beginning; they were updated a few weeks ago to include Care Homes on a retrospective and continuing basis (what they called “all settings”) but some reporting of the ONS figures has indicated that when the Government “all settings” figure was 35,341, as of 19th May, the overall “excess deaths” figure might have been as high as 55,000. Look here for more detail and updates direct from the ONS.

The UK background during March 2020

The four policy stages the UK Government initially announced in early March were: Containment, Delay, Research and Mitigate, as reported here. It fairly soon became clear (after the outbreak was declared a pandemic on March 11th by the WHO) that the novel Coronavirus SARS-CoV-2 could not be contained, seeing what was happening in Italy, and with case numbers growing in the UK and deaths starting to be recorded on 10th March (at that time only recorded as caused by Covid-19 with a positive test in hospital).

The UK Government have since denied that “herd immunity” had been a policy, but it was mentioned several times in early March, pre-lockdown (which was March 23rd) by Government advisers Sir Patrick Vallance (Chief Scientific Adviser, CSA) and Prof. Chris Whitty (Chief Medical Officer, CMO), in the UK Government daily briefings, with even a mention of 60% population infection proportion to achieve it (at the same time as saying that 80% might be loose talk (my paraphrase)).

If herd immunity wasn’t a policy, it’s hard to understand why it was proactively mentioned by the CSA and CMO, at the same time as the repeated slogan Stay Home, Protect the NHS, Save Lives. This latter advice was intended to keep the outbreak within bounds that the NHS could continue to handle.

The deliberations of the SAGE Committee (Scientific Advisory Group for Emergencies) are not published, but senior advisers (including the CSA and CMO) sit on it, amongst many others (50 or so, not all scientists or medics). Given the references to herd immunity in the daily Government updates at that time, it’s hard to believe that herd immunity wasn’t at least regarded as a beneficial(?!) by-product of not requiring full lockdown at that time.

Full UK lockdown was announced on March 23rd; according to reports this was 9 days after it being accepted by the UK Government as inevitable (as a result of the 16th March Imperial College paper).

The Sunday Times (ST) of 24th May 2020 carried the story of how the forecasters took charge in mid-March as the UK Government allegedly dithered. The Tweets of the ST’s Insight team editor, Jonathan Calvert, and of his deputy editor, George Arbuthnott, refer to it, as does the related Apple podcast.

Prof. Michael Levitt

Michael (awarded the Nobel Prize in Chemistry in 2013 for his computational modelling work) correctly forecast in February the potential extent of the Chinese outbreak (Wuhan in the Hubei province) at the end of March. I first reported this in my blog post on 24th March, as his work on China, and his amazingly accurate forecast, were reported that day here in the UK, which I saw in The Times newspaper.

On May 18th I reported in my blog further aspects of Michael’s outlook on the modelling by Imperial College, the London School of Hygiene and Tropical Medicine (and others) which he says, and I paraphrase his words, caused western countries to trash their economies through the blanket measures they have taken, frightened into alternative action (away from what seems to have been, at least in part, a “herd-immunity” policy) by the forecasts from their advisers’ models, reported as between 200,000 and 500,000 deaths in some publications.

Michael and I have been in direct touch since early May, when a mutual friend, Andrew Ennis, mentioned my Coronavirus modelling to him in his birthday wishes! We were all contemporaries at King’s College, London in 1964-67; they in Physics, and I in Mathematics.

I mentioned Michael’s work in a further blog post on May 20th, covering his findings on the data at EuroMOMO and contrasting them with the Cambridge Conversation of 14th May. That is when I said that I would post a blog article purely on his EuroMOMO work, and this post is the delivery of that promise.

I have Michael’s permission (as do others who have received his papers) to publicise his recent EuroMOMO findings (his earlier work having been focused on China, as I have said, and then on the rest of the world).

He is senior Professor in Structural Biology at Stanford University School of Medicine, CA.

I’m reporting, and explaining a little (where possible!) Michael’s findings just now, rather than deeply analysing – I’m aware that he is a Nobel prize-winning data scientist, and I’m not (yet!) 😀

This blog post is therefore pretty much a recapitulation of his work, with some occasional explanatory commentary.

Michael’s EuroMOMO analysis

What follows is the content of several tweets published by Michael, at his account @MLevitt_NP2013, showing that in Europe, COVID19 is somewhat similar to the 2017/18 European Influenza epidemics, both in total number of excess deaths, and age ranges of these deaths.

Several other academics have also presented data that, whatever the absolute numbers, indicate that there is a VERY marked (“startling” was Prof. Sir David Spiegelhalter’s word) age dependency in the risk factors of dying from Covid-19. I return to that theme at the end of the post.

The EuroMOMO charts and Michael’s analysis

In summary, COVID19 Excess Deaths plateau at 153,006, 15% more than the 2017/18 Flu with similar age range counts. The following charts indicate the support for his view, including the correction of a large error Michael has spotted in one of the supporting EuroMOMO charts.

Firstly, here are the summary Excess Death Charts for all ages in 2018-20.

FIGURE 1. EuroMOMO excess death counts for calendar years 2018, 2019 & 2020

The excess deaths number for COVID19 is easily read as the difference between Week 19 (12 May ’20) and Week 8 (27 Feb ’20). The same is true of the 2018 part of the 2017/18 Influenza season. Getting the 2017 part of that season is harder. These notes are added to aid those interested in following the calculation, and hopefully help them in pointing out any errors.

The following EuroMOMO chart defines how excess deaths are measured.

FIGURE 2. EuroMOMO’s total and other categories of deaths

This is EuroMOMO’s Total (the solid blue line), Baseline (dashed grey line) and ‘Substantial increase’ (dashed red line) for years 2016 to the present. Green circles mark 2017/18 Flu and 2020 COVID-19. The difference between Total Deaths and Baseline Deaths is Excess Deaths.

Next, then, we see Michael’s own summary of the figures found from these earlier charts:

Table 3. Summary for 2020 COVID19 Season and 2017/18 Influenza Season.

Owing to baseline issues, we cannot estimate Age Range Mortality for the 2017 part of the Influenza season, so we base our analysis on the 2018 part, where data is available from EuroMOMO.

We see also the steep age dependency in deaths from under 65s to over 85s. I’ll present at the end of this post some new data on that aspect (it’s of personal interest too!)

Below we see EuroMOMO Excess Deaths from 2020 Week 8, now (on the 14th May) matching reported COVID Deaths @JHUSystems (Johns Hopkins University) closely (to better than 2%). In earlier weeks the reported deaths were lower, and Michael isn’t sure why. The close match nevertheless allows him to do this in-depth analysis & comparison with EuroMOMO influenza data.

FIGURE 4. The weekly EuroMOMO Excess Deaths are read off their graphs by mouse-over.

The weekly reported COVID19 deaths are taken from the Johns Hopkins University Github repository. The good agreement is an encouraging sign of reliable data, but there is an unexplained delay in EuroMOMO numbers.

Analysis of Europe’s Excess Deaths is hard: EuroMOMO provides beautiful plots, but extracting data requires hand-recorded mouse-overs on-screen*. COVID19 2020 – weeks 8-19, & Influenza 2018 – weeks 01-16 are relatively easy for all age ranges (totals 153,006 & 111,226). Getting the Dec. 2017 Influenza peak is very tricky.

(*My son, Dr Tom Sutton, has been extracting UK data from the Worldometers site for me, using a small but effective Python “scraping” script he developed. It is feasible, but much more difficult, to do this on the EuroMOMO site, owing to the vector coordinate definitions of the graphics, and Document Object Model they use for their charts.)

Figure 5. Deaths graphs from EuroMOMO allow the calculation of Excess deaths

FIGURE 5. The Excess deaths for COVID19 in 2020 and for Influenza in 2018 are easily read off the EuroMOMO graphs by hand recording four mouse-overs.

The same is done for all different age ranges allowing accurate determination of the age range mortalities. For COVID19, there are 174,801 minus 21,795 = 153,006 Excess Deaths. For 2018 Influenza, the difference is 111,226 minus zero = 111,226 Excess Deaths.
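The subtraction described above can be checked in a few lines of Python. This is only a sketch: the function name `excess_deaths` is my own, and the inputs are the cumulative mouse-over readings quoted in the text, typed in by hand rather than fetched from EuroMOMO.

```python
def excess_deaths(cumulative_end: int, cumulative_start: int) -> int:
    """Excess deaths over a period, as the difference of two cumulative readings."""
    return cumulative_end - cumulative_start


# Readings quoted in the text (hand-recorded mouse-overs, all ages).
covid_2020 = excess_deaths(174_801, 21_795)  # 2020, weeks 8 to 19
flu_2018 = excess_deaths(111_226, 0)         # 2018, weeks 01 to 16

print(covid_2020)  # 153006
print(flu_2018)    # 111226
```

The same function would apply to each age range in turn, given the corresponding pair of readings.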

Michael exposes an error in the EuroMOMO charts

In the following chart, it should be easy to calculate the Excess Deaths again, as mouse-over of the charts on the live EuroMOMO site gives two values a week: Actual death count & Baseline value.

Tests on the COVID19 peak gave a total of 127,062 Excess Deaths, not 153,006. Tabulating the values and superimposing them on the real plot showed why: the values labelled as Baseline are actually ‘Substantial increase’ values!! Wrong labelling?

Figure 6. Actual death count & Baseline value

In Figure 6, Excess Deaths can also be determined from the plots of Total and Baseline Deaths against week number. Many more numbers need to be recorded, but the result should be the same.


TABLE 7. The pairs of numbers recorded from EuroMOMO between weeks 08 and 19 of 2020 allow the Excess Deaths to be determined in a different way than from FIG. 5. The total Excess Deaths (127,062) should be the same as before (153,006) but it is not. Why? (Mislabelling of the EuroMOMO graph? What is “Substantial increase” anyway and why is it there? – BRS).

FIGURE 8. Analysing what is wrong with the EuroMOMO Excess Deaths count

FIGURE 8. The lower number in TABLE 7 is in fact not the Baseline Death value (grey dashed line) but the ‘Substantial increase’ value (red dashed line). Thus the numbers in the table are not Excess Deaths (Total minus Baseline level) but Total minus ‘Substantial increase’ level. The difference is made up by adding 12×1981** to 127,062 to get 153,006. This means that the baseline is about 2000 deaths a week below the red line. This cannot be intended and is a serious error in EuroMOMO. Michael has been looking for someone to help him contact them. (**(153,006 – 127,062)/12 = 25,944/12 = 2,162. So shouldn’t we be adding 12×2,162, Michael? – BRS)
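The arithmetic behind my query can be laid out as a short Python check. It is a sketch only, using the two totals quoted above and assuming that weeks 08 to 19 inclusive make 12 weekly readings; the variable names are mine.

```python
WEEKS = 12                   # weeks 08 to 19 inclusive (assumed count)
from_cumulative = 153_006    # Total minus Baseline, read as in FIG. 5
from_weekly_pairs = 127_062  # Total minus 'Substantial increase', from TABLE 7

gap = from_cumulative - from_weekly_pairs
per_week_needed = gap / WEEKS

print(gap)              # 25944
print(per_week_needed)  # 2162.0

# What each candidate weekly correction gives when added back:
print(from_weekly_pairs + WEEKS * 1981)  # 150834
print(from_weekly_pairs + WEEKS * 2162)  # 153006
```

On these figures, only the 2,162 per week correction brings the two counts back into agreement.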

Reconciling the numbers, and age range data

Requiring the two COVID19 death counts to match means reducing the Baseline value by 23,774/12 = 1,981**. Mouse-overs for 2017 weeks 46 to 52 gave the table below. Negative Excess Deaths in weeks 46 to 48 meant that the 2017 Influenza season began in Week 49, not 46. Michael tried to get Age Range data for 2017, but the table just uses 2018 Influenza data. (**see above also – same issue. Should be 25,944/12 = 2,162? – BRS)

TABLE 9. Estimating the Excess Deaths for the 2017 part of the 2017/18 influenza season

In TABLE 9, Michael tries to estimate the Excess Deaths for the 2017 part of the 2017/18 Influenza season by recording pairs of mouse-overs for seven weeks (46 to 52) and four age ranges. Because the Total Deaths are not always higher than the ‘Substantial increase’ base level, he uses differences as a sanity check. The red numbers for weeks 46 to 48 show that the Excess Deaths are negative and that the Influenza season did not start until week 49 of 2017.

TABLE 10. We try to combine the two parts of the 2017/18 Influenza season

TABLE 10 commentary. We try to combine the two parts of the 2017/18 Influenza season. The values for 2018 are straightforward as they are determined as shown in Fig. 5. For 2017, we need to use the values in Table 9 and add the baseline correction because the EuroMOMO mouse-overs are wrong, giving as they do the ‘Substantial increase’ value instead of the ‘Baseline’ value. We can use the same correction of 1981** (see my prior comments on this number – BRS) deaths per week as determined for all COVID19 data but we do not know what the correction is for other age ranges. An attempt to assume that the correction is proportional to the 2017 number of deaths in each age range gives strange age range mortalities. Thus, we choose to use the total for 2017 (21,972) but give the age range mortalities just from the deaths in 2018, as the 2017 data is arcane, unreliable or flawed.
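As a rough check on the headline comparison, the two parts of the 2017/18 Influenza season can be added and set against the COVID19 figure. This Python sketch uses only totals quoted in this post. That the "15% more" figure was derived in exactly this way is my assumption, not something Michael states.

```python
flu_2017 = 21_972     # 2017 part of the season (TABLE 10 total)
flu_2018 = 111_226    # 2018 part, weeks 01 to 16
covid_2020 = 153_006  # 2020, weeks 8 to 19

flu_season = flu_2017 + flu_2018
excess_over_flu = covid_2020 / flu_season - 1

print(flu_season)                      # 133198
print(f"{excess_over_flu:.1%} more")   # 14.9% more
```

That is consistent with the figure of about 15% quoted at the start of this section.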

Michael’s concluding statement

COVID19 is similar to Influenza only in total and in age range excess mortality. Flu is a different virus, has a safe vaccine & is much less a threat to heroic medical professionals.

Additional note on the age dependency of Covid-19 risk

In my earlier blog post, reporting the second Cambridge Conversation webinar I attended, the following slide from Prof. Sir David Spiegelhalter was one that drew the sharp distinction between the risk to people in different age ranges:

Age related increase in Covid-19 death rates

Prof. Spiegelhalter’s own Twitter account is also quite busy, and this particular chart was mentioned there, and also on his blog.

This week I was sent this NHS pre-print paper (pending peer review, as many Coronavirus research papers are) to look at the various Covid-19 risk factors and their dependencies, and to explain them. The focus of the 20-page paper is the potential for enhanced risk for people with Type-1 or Type-2 Diabetes, but Figure 2 towards the end of that paper shows the relative risk ratios for a number of other parameters too, including age range, gender, deprivation and ethnic group.

Risk ratios for different population characteristics

This chart extract, from the paper by corresponding author Prof. Jonathan Valabhji (Imperial College, London & NHS) and his colleagues, indicates a very strong dependency of Covid-19 risk on the age of the individual. The risk for a white woman under 40, with no deprivation factors, and no diabetes, is 1% of the risk for a control person (a 60-69 year old white woman, with no deprivation factors, and no diabetes). A white male under 40 with otherwise similar characteristics would have 1.94% of the risk of the control person.

Smaller reductions apply in the two 10-year age bands 40-49 and 50-59: a white woman (no deprivation factors or diabetes) in those age ranges has 11% and 36% of the control person’s risk respectively.

At 70-79, and above 80, the risk enhancement factors owing to age are x 2.63 and x 9.14 respectively.
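To get a feel for the spread, the age factors quoted above can be put in one small table and compared band against band. This is an illustrative Python sketch, not the paper's method: the dictionary and the helper `relative_risk` are mine, the values are as quoted in this post (for a white woman with no deprivation factors and no diabetes), and the under-40 figure of 1% looks rounded, so any ratio involving it is indicative only.

```python
# Risk relative to the control person (white woman aged 60-69,
# no deprivation factors, no diabetes), as quoted in this post.
age_risk_ratio = {
    "under 40": 0.01,
    "40-49": 0.11,
    "50-59": 0.36,
    "60-69": 1.00,  # control band
    "70-79": 2.63,
    "80+": 9.14,
}


def relative_risk(band_a: str, band_b: str) -> float:
    """Risk in band_a as a multiple of the risk in band_b."""
    return age_risk_ratio[band_a] / age_risk_ratio[band_b]


for band, ratio in age_risk_ratio.items():
    print(f"{band:>8}: {ratio:5.2f} x control")

# Oldest band against youngest band, using the rounded 1% figure.
print(round(relative_risk("80+", "under 40")))  # 914
```

On these quoted figures, the risk in the oldest band is several hundred times that in the youngest.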

So there is some agreement (at least on the principle of age dependency of risk, as represented by the data, if not the quantum) between EuroMOMO, Prof. Michael Levitt, Prof. Sir David Spiegelhalter and the paper by Prof. Jonathan Valabhji et al. that increasing age beyond middle age is a significant indicator of enhanced risk from Covid-19.

In some other respects, Michael is at odds with forecasts made by Prof. Neil Ferguson’s Imperial College group (and, by inference, also with the London School of Hygiene and Tropical Medicine) and with the analysis of the Imperial College paper by Prof. Spiegelhalter.

I reported this in my recent blog post on May 18th concerning the Cambridge Conversation of 14th May, highlighting the contrast with Michael’s interview with Freddie Sayers of UnHerd, which is available directly on YouTube at https://youtu.be/bl-sZdfLcEk.

I recommend going to the primary evidence and watching the videos in those posts.