
Coronavirus: Modelling an outbreak

Image credit: Centers for Disease Control and Prevention (CDC)

The emergence of nCoV-2019 is highlighting novel AI and ML techniques that can track and predict the spread of disease.

Predicting and tracking viral outbreaks is a growing business. Even Google has tried its hand at it, though unsuccessfully. The search giant continues to work with governments and healthcare organisations, but the most significant growth in the space is now taking place among privately funded artificial intelligence (AI) start-ups.

Among these, the one that has attracted the most attention during the novel coronavirus (nCoV-2019) outbreak in Wuhan, China, is BlueDot. That is because it sent an alert to its private and public sector customers on 31 December advising them to avoid the region. The World Health Organisation (WHO) did not release a similar warning until 9 January.

BlueDot and its rivals – such as Flowminder of Sweden and Metabiota of the US – are interesting in that they supplement medical data with that available from what they see as other relevant sources to fuel machine learning (ML) and forecasts.

For example, in identifying (and now tracking) nCoV-2019, BlueDot’s methodology builds in information about local demographics and movement, climate and global flight ticket sales (significantly, not just airline schedules).

Its AI also continuously scans news reports from recognised sources in 65 different languages, subjecting them to natural language processing and analysis that further inform its ML algorithms.

This, the company says, not only allowed it to get ahead of the game in identifying the Wuhan outbreak but also meant that it was able to predict correctly that the first signs of the virus’s migration outside China would appear in Bangkok (Thailand), Seoul (South Korea), Taipei (Taiwan) and Tokyo (Japan).

However, perhaps the most important thing to note is that the more widely recognised modelling players – be their focus healthcare, insurance technology or corporate risk management – emphasise the need to have clinical professionals on board in senior positions.

Kamran Khan, BlueDot’s founder and CEO, is an MD who was inspired to set up the company because of his experiences in Toronto during the 2003 SARS outbreak. At Flowminder, which primarily uses mobile data to monitor population movements during an outbreak, the executive director is Linus Bengtsson, another MD and former clinical epidemiologist.

Appointments like these highlight two important issues.

The first is the question of bias in AI and machine learning generally. Google’s earlier problems here illustrate how this can go beyond ‘human’ biases that may be written into the algorithms that drive AI and ML.

The search company closed its Google Flu Trends (GFT) programme after its prediction for the peak of the 2013 flu season missed by 140 per cent. Subsequent research suggested that one factor in GFT’s failure was that, between its launch in 2008 and its closure, Google introduced predictive search terms, and their incorporation within the forecasts threw off the numbers.
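To make the scale of such a miss concrete, here is a minimal sketch of how a forecast error is commonly expressed as a percentage of the observed value. The figures are hypothetical, chosen only so that the arithmetic comes out at 140 per cent. They are not GFT's actual numbers, and the function name is an assumption made for illustration.

```python
def percent_miss(predicted: float, observed: float) -> float:
    """Return the forecast error as a percentage of the observed value.

    A positive result means the forecast was too high, a negative one too low.
    """
    if observed <= 0:
        raise ValueError("observed must be positive")
    return (predicted - observed) / observed * 100.0


# Hypothetical illustration: a forecast of 2.4 against an observed 1.0
# (in any consistent unit) is a miss of 140 per cent.
miss = percent_miss(predicted=2.4, observed=1.0)
print(f"{miss:.0f} per cent")
```

On this reading, a 140 per cent miss means the forecast was well over double what was actually observed.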

Today, therefore, companies such as BlueDot argue, almost certainly rightly, that their kind of AI modelling still needs a human in the loop: a trained epidemiologist who reviews the output before the company raises any kind of flag.
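The human-in-the-loop idea can be sketched in a few lines. This is an illustration only, not BlueDot's system: the `OutbreakSignal` type, the `model_score` field, the 0.8 threshold and the stand-in reviewer are all hypothetical. The point is simply that a model score alone never raises an alert; a human reviewer must also approve it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class OutbreakSignal:
    location: str
    model_score: float  # hypothetical model confidence between 0 and 1


def alerts_to_raise(
    signals: List[OutbreakSignal],
    reviewer_approves: Callable[[OutbreakSignal], bool],
    threshold: float = 0.8,  # assumed cut-off, for illustration only
) -> List[OutbreakSignal]:
    """Keep only signals that clear the model threshold AND a human review."""
    return [
        s for s in signals
        if s.model_score >= threshold and reviewer_approves(s)
    ]


# Stand-in for an epidemiologist's judgement: reject one flagged location.
signals = [
    OutbreakSignal("City A", 0.95),
    OutbreakSignal("City B", 0.90),
    OutbreakSignal("City C", 0.40),
]
approved = alerts_to_raise(signals, lambda s: s.location != "City B")
print([s.location for s in approved])
```

Here City C never reaches the reviewer because its score is too low, while City B scores highly but is held back by the human check.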

The second issue applies particularly to medicine. The profession is inherently resistant to what it sees as ‘outside’ technologies (a resistance that, for example, can be seen as one of the reasons for the failure of the NHS National Programme for IT (NPfIT) in the UK about a decade ago). This does not come down to any kind of ‘God complex’ among doctors but to risk and, as one clinician said of the NPfIT fiasco, “the fundamental difficulties when one of the most analogue disciplines imaginable meets the digital”. More recently, the Theranos scandal has also seeded a great deal of scepticism.

One lesson that appears to have been drawn from these and other similar instances is that the use of AI, ML and other processing strategies within healthcare is likely to fail unless there are medical professionals leading the work at the executive level, during data processing and as evangelists. These companies must also be able to publish peer-reviewed results in recognised medical journals such as The Lancet.

Considering that last goal, BlueDot’s work on nCoV-2019 joins an increasing body of evidence that AI-led outbreak modelling and tracking has demonstrable value. The company also claims to have predicted a 2016 outbreak of the Zika virus in Florida six months ahead of time and has previously published results on its analysis of how the Ebola virus spread out of West Africa in 2014. Flowminder has been building its reputation since using mobile data to analyse the consequences of the 2010 earthquake in Haiti and the subsequent cholera outbreak.

There is quantifiable progress. But then two other concerns arise.

As noted at the beginning, BlueDot was able to notify its clients ahead of the public alert by the WHO. The issue of who should be told and, arguably more important, when is an extremely thorny one. Should the better disease warnings be available only on what appears to be a pay-to-play basis?

Certainly, BlueDot’s client base includes public healthcare organisations and most people would prefer to hear warnings from them rather than private companies – it’s their job. At the same time, the company says it is expanding its product range to include public data and alerts. Its rival, Metabiota, has already placed a freely available Epidemic Tracker online. Flowminder is a not-for-profit and describes its data as “global public goods” primarily targeted at governments, inter-governmental organisations and NGOs.

But, as the nCoV-2019 outbreak has already illustrated, there are good grounds here for the companies to be wary of the so-called ‘Chicken Little’ effect. What if they raise a flag and the sky does not fall? Moreover, the public’s ability to misinterpret even this kind of enhanced data profiling is considerable (which is largely where this series came in).

Then, there is the question of the data itself. Flowminder uses mobile data to good purpose, but how much of our personal information might we be willing to give up for healthcare purposes, as opposed to being sold goodies online? Answering that is not as simple as it sounds, given that systems such as these tend to become increasingly granular and one of their main supporters is the insurance industry. Privacy, privacy, privacy.

While healthcare organisations are clients and partners in initiatives across medical modelling and they are also supported by altruistic trusts such as The Bill and Melinda Gates Foundation, much of the real investment is coming from venture capital. Where’s the return coming from? What’s the exit strategy?

Success in the early-stage modelling and ongoing tracking of nCoV-2019 is obviously welcome. But it also raises serious issues for a fast-growing and fast-evolving part of the AI space that range from defining the most appropriate business model to fundamental ethics of communication. Right now, the world will take what it can get. But in the hope that nCoV-2019 is managed, there will then still be plenty to resolve.

The next part of this series on how digital technologies are being used to address nCoV-2019 will look at drug discovery, as medicine races to find treatments and ultimately a vaccine for the virus.
