Mobile maps: mapping live data with the help of mobile networks
Image credit: Telenor Norway
Mobile phone records could give us a window into everything from drug use and the spread of epidemics, to the reliability of networks needed to support the connected car.
In 2009, the German newspaper Die Zeit published an animated map of six months in the life of Malte Spitz, a Green Party politician, using his mobile call detail records (CDRs). These are the logs of how, when, where and with whom he communicated, collected by his phone supplier for billing purposes.
By matching these records to mentions of Spitz’s political life on websites and blogs, Die Zeit was able to pinpoint the places the politician visited, the routes he took, how long he’d stayed in each location, and the people he’d texted and spoken to on the phone at the time.
Die Zeit and Spitz wanted to show how intrusive the retention of CDRs can be, because the records can track individuals from one cellular antenna tower to the next. Now, nearly a decade later, CDRs are being used to map the movements of millions of people at once.
These new population-scale movement-mapping techniques are based around anonymised aggregations of CDRs. Once user IDs and location data are irreversibly scrambled, the message is that CDRs become statistical tools that can be used for the good of society.
En masse, CDRs can show patterns of human ebb and flow across cities, countries, and even continents. Linked to other datasets about climate or disease, they can reveal the influence of humans on epidemics, pollution and more.
The testing lab for ‘social good’ projects has largely been the developing world, which is where most of the world’s five billion mobile phone owners live.
Over the last few years, studies have used CDRs to model the spread of the dengue fever virus in Pakistan, Ebola in West Africa, and malaria in Kenya, with encouraging results.
The Kenya malaria project was one of the first. A team from Harvard School of Public Health (HSPH) with seven other institutions mapped every call or text over a year (from June 2008 to June 2009) made by each of 15 million Kenyan mobile phone subscribers to one of 11,920 cell towers.
Malaria is caused by parasites transmitted through mosquito bites. It infects over 200 million people a year, and kills around half a million, mostly children in sub-Saharan Africa.
Researchers were curious about how humans ‘importing’ infections might contribute to the transmission of parasites at distances beyond where mosquitoes might go.
Every time an individual left home (home in these studies is assumed to be near the antenna to which most of that person’s night-time calls connect), the researchers calculated where the person went and how long the trip lasted. They estimated the disease’s prevalence in each location using a 2009 malaria prevalence map from the Kenya Medical Research Institute (KEMRI) and the Malaria Atlas Project.
From this, they inferred each resident’s risk of being infected and the daily risk that visitors to particular areas would become infected.
It turned out that most infections carried by people ended up in the capital, Nairobi, brought back by residents returning from malaria hotspots such as Lake Victoria.
The conclusion of the team’s 2012 paper in the journal Science was that preventative schemes could target these volumes of human traffic between regions.
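The home-location assumption used in the Kenya study can be illustrated with a minimal Python sketch. It is not the researchers' code: the record layout, the tower names and the 21:00 to 06:00 night window are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical, simplified records: (anonymised user ID, timestamp, tower ID).
RECORDS = [
    ("u1", datetime(2008, 6, 1, 23, 10), "T_home"),
    ("u1", datetime(2008, 6, 2, 2, 5), "T_home"),
    ("u1", datetime(2008, 6, 2, 13, 0), "T_work"),
    ("u1", datetime(2008, 6, 3, 22, 40), "T_lake"),
    ("u1", datetime(2008, 6, 4, 12, 0), "T_lake"),
]

# Assumed night window: 21:00 up to (but not including) 06:00.
NIGHT_START, NIGHT_END = 21, 6


def is_night(ts):
    """True if the timestamp falls inside the assumed night window."""
    return ts.hour >= NIGHT_START or ts.hour < NIGHT_END


def home_tower(records, user):
    """Return the tower that handled most of the user's night-time events."""
    counts = Counter(t for u, ts, t in records if u == user and is_night(ts))
    return counts.most_common(1)[0][0] if counts else None


def towers_visited(records, user):
    """Return the sorted set of towers, other than home, the user connected to."""
    home = home_tower(records, user)
    return sorted({t for u, _, t in records if u == user and t != home})
```

For the sample data, `home_tower(RECORDS, "u1")` picks the tower with the most night-time events, and `towers_visited` lists the other towers as destinations. A real analysis would also need trip durations and a way to handle users with too few night-time events.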
One of the most ambitious disease-mapping initiatives is under way in India using CDR data from 280 million people to understand and control the spread of TB. Tuberculosis killed 423,000 Indians in 2016, which is a third of the world’s TB death toll.
A team from the GSMA (a trade body that represents the interests of mobile network operators worldwide), the mobile phone firm Bharti Airtel and Be He@lthy, Be Mobile (a collaboration between the WHO and the ITU) will be mining the CDR data to identify potential TB hotspots. The government will use the data to target vaccinations and campaigns as part of its plan to eradicate TB in India by 2025.
The United Nations Foundation is a supporting partner, and mobile firms across the world are backing the initiative.
These include Bharti Airtel, Deutsche Telekom, Hutchison, KDDI, KT Corporation, Megafon, Millicom, MTS, NTT Docomo, Orange, Safaricom, SK Telecom, Telefónica, Telenet, Telenor Group, Telia, Turkcell, Vodafone and Zain.
In Europe, the mobile operator Telenor is developing a predictive model of how seasonal flu spreads in Norway using phone records and case data from the Norwegian Institute of Public Health (NIPH).
According to WHO statistics, seasonal flu epidemics around the world cause three to five million cases of severe illness and kill 290,000 to 650,000 people each year.
“We want to understand how people have travelled in the past, and match this to case-study data of how flu has spread throughout Norway. Based on that information, we would like to simulate intervention strategies to see if we can reduce the spread of flu,” explains Kenth Engø-Monsen, a senior research scientist at Telenor Research in Norway.
A team at the University of Oslo will be building the computer model, which they will tune to predict the spread of flu during the actual flu season.
“If we can come up with reliable interventions that demonstrate in simulation that we can slow down the spread of flu, then the NIPH would be ready to implement these,” says Engø-Monsen.
Potentially, flu vaccinations could be targeted geographically, based on travel patterns, he says.
Environmental monitoring is another ‘social good’ application that is capturing the imagination of research teams and mobile companies.
A 2016 MIT study led by Marguerite Nyhan changed our view of air pollution in cities.
Using CDR data from 8.5 million people in New York City, Nyhan and colleagues showed that the daily movements of people around 71 districts had a major influence on air quality. They were interested in small particles less than 2.5µm in diameter (PM2.5) associated with the worst health effects.
Comparing active population exposure with home exposure (which assumed a static population), they found that districts that contributed most to overall PM2.5 exposure were those with most residents. Districts with higher relative influence on exposure tended to be clustered in the areas where New Yorkers work and socialise (lower regions of Manhattan and centralised areas of Brooklyn and Queens).
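The difference between 'home' and 'active' exposure can be shown with a small population-weighted average. This Python sketch is only an illustration of the idea, not the study's model: the two districts, the PM2.5 values, the head counts and the half-and-half split of the day are all invented.

```python
def weighted_mean(values, weights):
    """Average of values, weighted by the matching weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Invented PM2.5 concentrations (µg/m3) for two hypothetical districts.
pm25 = {"residential": 8.0, "business": 14.0}

# Invented head counts: where people live, and where they are during the day.
residents = {"residential": 900, "business": 100}
daytime = {"residential": 400, "business": 600}


def exposure(population):
    """Population-weighted PM2.5 for a given distribution of people."""
    districts = sorted(pm25)
    return weighted_mean(
        [pm25[d] for d in districts],
        [population[d] for d in districts],
    )


# Home exposure assumes nobody ever leaves their district of residence.
home_exposure = exposure(residents)

# Active exposure: assume half the day is spent at home, half at daytime locations.
active_exposure = 0.5 * exposure(residents) + 0.5 * exposure(daytime)
```

With these made-up numbers the active figure comes out higher than the home figure, because people spend part of the day in the more polluted business district.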
Mobile operator Telefónica has taken this idea a step further in São Paulo, Brazil’s largest city, as part of the GSMA’s Big Data for Social Good Initiative.
Combining mobile phone data, machine learning and data from weather, air quality and traffic sensors, Telefónica is predicting air quality across the city up to 48 hours in advance.
It’s a cheaper alternative to air-quality monitoring that allows local government to take further preventive actions, says Jeanine Vos, head of the GSMA’s Big Data for Social Good Initiative. “Using this predictive approach, you can see areas of high air pollution and take preventative steps like rerouting traffic, and giving warnings to people with health conditions such as asthma.”
In a more unusual vein, scientists at the Norwegian Institute for Water Research (NIVA) have been using CDRs and sewage analysis to understand the extent of drug use in Oslo.
Everything humans eat and drink, including drugs (legal and illegal), leaves a chemical signature in sewage that can be measured at sewage treatment facilities.
Population is the biggest uncertainty for these measurements because people take drugs on different days and the numbers of people in a sewage catchment area will vary.
“At NIVA, we had been studying Oslo quite intensively in terms of its drug use and I noticed as I sat on the Metro every day that far more people go in and out of the city on weekdays than weekends,” explains Kevin Thomas, director of the Queensland Alliance for Environmental Health Sciences at the University of Queensland and a research scientist at NIVA.
Thomas coordinates a network of laboratories across the world measuring drug use by taking sewage samples, as part of the international SCORE network.
In NIVA’s research they had seen big changes in drug and alcohol consumption at weekends, and Thomas was concerned that they were missing part of the picture.
It took Thomas three years to persuade a mobile phone company to agree to work on this. The project went ahead in 2016, using anonymous phone data from Telenor customers collected across a holiday period in June to July 2016.
They found that numbers of people in the sewage catchment area ebbed and flowed dramatically. 469,000 people, for example, were counted at 9am on 5 June and this had increased to 670,000 by 2pm the next day. Anyone tracking drug use at the time would have assumed there was a massive spike, when there were just more people.
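A short Python sketch shows why the head count matters. The two population figures are the ones quoted above; the drug loads are invented so that the per-person figure stays constant, and the function name is hypothetical.

```python
def load_per_thousand(load_mg, population):
    """Drug load in milligrams per 1,000 people in the catchment area."""
    return load_mg / population * 1000


# (label, people in the catchment, measured load in mg).
# Populations are from the Oslo study; the loads are invented for illustration.
samples = [
    ("5 June, 9am", 469_000, 46_900.0),
    ("6 June, 2pm", 670_000, 67_000.0),
]

# Raw change in the measured load between the two samples.
raw_change = samples[1][2] / samples[0][2] - 1

# The same comparison after dividing by the number of people present.
normalised = [load_per_thousand(load, pop) for _, pop, load in samples]
```

Here the raw load rises by about 43 per cent, yet the load per 1,000 people is the same in both samples: the apparent spike is entirely a population effect.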
Over the testing period, illicit drug use rose, with Ecstasy spiking at weekends. These results suggest that mobile data could help public health agencies, law enforcement and epidemiologists refine their understanding of drug use trends.
“The dynamic population shifts we can measure in this way are excellent,” says Thomas, who is now repeating the study over 12 months in Oslo as part of a European project looking at drugs markets.
In the future, says Thomas, these methods could be used to monitor the general health of a city, something he is researching in Australia.
“Sewage monitoring can tell us about people’s vitamin intake, alcohol intake, and you could look at very specific markers such as grain intake, how much fruit and vegetable [based on beta carotene levels], and exposure to pesticides, fire retardants and plasticisers. In fact all the chemicals you are exposed to in the home,” he elaborates.
The base stations connect your phone to other phone users or to the internet. They also generate the call detail records (CDRs) that log all communication activity for billing purposes.
CDRs detail the unique IDs of the caller and ‘callee’, and also those of their phones (international mobile equipment identity, or IMEI). Calls or SMS are time-stamped with a call-start and call-end, and the cell IDs and locations of the base stations involved are also logged. An extended version of these records, called xDR, also includes records of mobile data.
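As a rough picture of what one such record holds, here is a Python sketch of a CDR as a simple data structure. The field names and the sample values are assumptions for illustration; real operator formats differ and carry many more fields.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CallDetailRecord:
    """Hypothetical, simplified CDR with the fields described in the text."""
    caller_id: str      # unique ID of the caller
    callee_id: str      # unique ID of the person called
    caller_imei: str    # equipment identity of the caller's phone
    callee_imei: str    # equipment identity of the callee's phone
    start: datetime     # call-start timestamp
    end: datetime       # call-end timestamp
    cell_id: str        # ID of the base station that handled the call
    cell_lat: float     # latitude of that base station
    cell_lon: float     # longitude of that base station

    @property
    def duration(self) -> timedelta:
        """Length of the call, derived from the two timestamps."""
        return self.end - self.start


# Made-up example record.
example = CallDetailRecord(
    caller_id="user-0001",
    callee_id="user-0002",
    caller_imei="imei-aaaa",
    callee_imei="imei-bbbb",
    start=datetime(2016, 6, 5, 9, 0, 0),
    end=datetime(2016, 6, 5, 9, 3, 0),
    cell_id="cell-42",
    cell_lat=59.9139,
    cell_lon=10.7522,
)
```

Note that the location stored is that of the base station, not of the phone, which is why CDRs give only an approximate position.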
In urban environments, these details can locate individuals to an accuracy of around 200m, making it easy to track their approximate trajectory from tower to tower across a network.
By cross-referencing to road and rail maps, it is possible to infer more precisely users’ modes of transport and routes.
CDR data sets used in many of these social projects are based purely on SMS and call logs, which are generated only when people use these services. For applications that need more frequent location updates (every 15 to 30 minutes), there are xDRs (extended detail records), which include all the signals generated when smartphones transmit data packets.
“When you turn on your phone, you send a signal, when you turn it off, you send a signal, when you change your location, it sends a signal, and so on. Everything you do with your phone generates a signal, which is read by the cellular network and collected as the xDR,” explains Arturo Amador, a senior consultant at the ICT consultancy Acando, who has been involved in developing Telenor’s Mobility Analytics platform since 2015.
However, these larger data sets are ‘noisier’, making it harder to find signals of interest, points out Telenor’s Engø-Monsen.
“I’d prefer 10Mbytes or 100Mbytes of nicely curated mobility data on a daily resolution with a fairly good spatial resolution, than 20 terabytes of detailed browsing history,” he comments.
Commercial applications of movement monitoring in tourism, retail and transport are likely to benefit most from these larger (if noisier) sets of phone-use data.
A recent trial on roads around Dublin, carried out by Vodafone Ireland and the mobile analytics firm Cell Mining, is a good example of a transport application that uses frequent signal updates.
They were able to distinguish the fast-moving phone subscribers travelling in trains and cars using these records. By measuring the number of phone calls that were cut short or data sessions that dropped out along particular road or rail routes, they were able to create ‘mobile experience’ maps marking each cell site along a route in ‘traffic light’ colours, from red to green.
This trial has one eye to a future of autonomous cars when we will want to plan our journeys around roads with the best mobile network quality, instead of those that get us to our destinations the fastest.
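The 'traffic light' idea behind the Dublin maps can be sketched in a few lines of Python. The thresholds, the cell names and the session counts below are all assumptions; the trial's actual scoring method is not described here.

```python
def drop_rate(dropped, total):
    """Share of calls or data sessions that were cut short at a cell site."""
    return dropped / total if total else 0.0


def colour(rate, amber_at=0.02, red_at=0.05):
    """Map a drop rate to a traffic-light colour (thresholds are assumed)."""
    if rate >= red_at:
        return "red"
    if rate >= amber_at:
        return "amber"
    return "green"


# Hypothetical cell sites along one route: (cell ID, total sessions, dropped sessions).
route = [
    ("cell_A", 1000, 5),
    ("cell_B", 800, 24),
    ("cell_C", 500, 40),
]

# One colour per cell site, in route order.
experience_map = {
    cell: colour(drop_rate(dropped, total)) for cell, total, dropped in route
}
```

A route planner could then prefer roads whose cell sites are mostly green, which is the scenario the trial anticipates for connected cars.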
Worries about privacy (bearing in mind that many of these applications involve phone companies giving customer data to third parties) remain the single strongest limitation of this technology, which brings us back to Malte Spitz and his map.
The solution may be (as in Norway) to mandate that all processing is carried out on a secure platform on the mobile operator’s premises. “We cannot upload anything to the cloud and we cannot ship anything outside Norway,” explains Acando’s Arturo Amador.
Likewise, mobile companies should use state-of-the-art anonymisation techniques including hashing and encrypting all user identifiers and sensitive fields such as the location of antenna towers.
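One common way to hash identifiers is a keyed hash, where a secret key held by the operator is mixed into every hash so that outsiders cannot simply hash every possible phone number and compare. The Python sketch below uses HMAC-SHA256 from the standard library; it is an illustration of the general technique, not a description of any operator's scheme, and the function name and sample identifiers are hypothetical.

```python
import hashlib
import hmac
import secrets


def pseudonymise(identifier: str, key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (hex string)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Assumption: the key is generated once and kept on the operator's premises.
key = secrets.token_bytes(32)

# Made-up identifiers for illustration.
alias_a = pseudonymise("+47 900 00 001", key)
alias_b = pseudonymise("+47 900 00 002", key)
```

The same identifier always maps to the same alias under one key, so movements can still be linked into trips, while destroying the key makes the mapping practically impossible to reverse.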
Path obfuscation is an additional privacy enhancer that adds random ‘noise’ to the location fields, according to Amador. “Instead of a person starting their journey in location A, it becomes A+delta where delta is a small random number. And instead of ending up in location B, it becomes B+delta. Across a population, the results contain enough randomness to protect privacy,” he explains.
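The A+delta idea can be sketched in Python as follows. The uniform noise, the 0.005-degree bound (roughly half a kilometre of latitude) and the coordinates are assumptions chosen for illustration, not details of Telenor's platform.

```python
import random


def obfuscate(point, max_offset_deg=0.005, rng=random):
    """Shift a (lat, lon) point by a small random delta on each axis."""
    lat, lon = point
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


def obfuscate_trip(start, end, max_offset_deg=0.005, rng=random):
    """Apply independent noise to the start (A) and end (B) of a journey."""
    return obfuscate(start, max_offset_deg, rng), obfuscate(end, max_offset_deg, rng)


# Made-up journey between two points in Oslo.
location_a = (59.9139, 10.7522)
location_b = (59.9500, 10.7600)
noisy_a, noisy_b = obfuscate_trip(location_a, location_b)
```

Each individual journey is blurred, but because the noise averages out over many trips, aggregate flows between districts remain usable.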
If mobile operators get the privacy issues right, mapping population movements in this way could change the way governments and international agencies create and implement health and social policy.
Unlike surveys or censuses, which take years and cost millions, mobile movement maps can be generated quickly and cost-efficiently, and updated frequently, sometimes even in real time. Moreover, their diagnostic nature means that our mobile-tracked movements could become part of a series of huge policy feedback loops.
Exposure to air pollution in New York
Comparing active population exposure to air pollution with home exposure in New York. Districts with higher relative influence on exposure tended to be clustered in the areas where New Yorkers work and socialise (lower regions of Manhattan and centralised areas of Brooklyn and Queens).
Credit: ‘Exposure Track – The Impact of Mobile-Device-Based Mobility Patterns on Quantifying Population Exposure to Air Pollution’ by Marguerite Nyhan et al; Environ. Sci. Technol., 2016, 50 (17).
Spread of malaria in Kenya
Density maps showing sources (red) and sinks (blue) of human travel and total parasite movement in Kenya, where each settlement was designated as a relative source or sink based on yearly estimates.
(A): Travel sources and sinks. (B): Parasite sources and sinks.
Credit: Quantifying the Impact of Human Mobility on Malaria; Amy Wesolowski et al. Science 338, 267 (2012)
Mapping poverty with call detail records
Neeti Pokhriyal at the State University of New York at Buffalo has created ‘poverty maps’ in Senegal using CDRs from 9 million Sonatel customers. The thickness of each link indicates the volume of calls and texts exchanged between regions. The size of the circles indicates total incoming and outgoing calls and texts. The level of connectivity turns out to be a good proxy for the Multidimensional Poverty Index (MPI) for a region, which is a composite of 10 indicators.