‘Sitting on data is like sitting on an oilfield’: Mark Girolami, the Alan Turing Institute
Image credit: Nick Smith
Newly appointed chief scientist at the Alan Turing Institute in central London, Mark Girolami discusses why artificial intelligence has become ‘big news’ and how data-centric engineering is at its core.
“Artificial intelligence is very much an umbrella term,” says Mark Girolami. “When we say AI, what we’re really describing is a whole load of technologies that are characterised by three components at their core: data, computing and algorithms.” Girolami, who has taken up the post of the Alan Turing Institute’s first chief scientist, says that while there are plenty of people out there crossing over into philosophy and neural sciences “solving intelligence”, his approach to AI is based on these three inter-related parameters. A University of Cambridge academic, he also holds the Royal Academy of Engineering research chair in data-centric engineering. The two positions “feed off each other”, he says.
Girolami’s appointment at the Alan Turing Institute comes hot on the heels of the UK government publishing its National AI Strategy, a ten-year plan “to make the UK a global AI superpower”. What this means is that the 58-year-old British scientist has stepped into the role of the Turing’s chief scientist at a time described by the UK’s chief scientific adviser Sir Patrick Vallance as “a critical moment for UK science and technology, in particular AI and data science”. Girolami says the reason AI is “big news” now is that “there are a number of factors that have coalesced at one time”.
It’s important to place this in the context of “AI being discussed for many years”. Looking around the institute’s headquarters in London’s British Library, Girolami suggests that you can trace the term back to the mid-20th century mathematician Alan Turing, who famously raised questions such as whether machines could think. But now, there is a confluence of technologies that have taken the notion of AI out of the realms of abstract philosophy and into an area that is “actually useful”. Girolami isn’t having a swipe at philosophers: he’s drawing attention to the fact that “the lives of the average person in the street are now being affected by our ability to gather data on just about everything we do”.
He picks up his smartphone and explains that the amount of data it gathers “about me, where I go, what I do, what my interests are” is colossal. There was a time in a pre-digital landscape, he reflects, when data was in the hands of a few boffins doing weird and wonderful research “that didn’t have much to do with the real world. Now we can look at traffic flow in our cities, urban air quality. It goes on and on. We’re generating more data than ever: more than could have been imagined half a century ago. Also, our computing capabilities have shot up since Turing’s time.”
Given that AI algorithms rely on data and computing power, both of which are now everyday commodities, “AI is no longer restricted to laboratories of national importance. These two things mean that the AI algorithms of 30 years ago that plodded along without doing anything spectacular can now recognise patterns and make decisions at almost super-human performance levels. That’s why AI is big news today.” Girolami says that because AI is an umbrella term, you can include other emergent data-led phenomena under it, such as the Internet of Things (IoT), digital twins and Industry 4.0.
“They’re in the same family. They can all be traced back to the start of the internet. What did the internet do? It was the first wave in the way we changed our data capability from being able to view it on a local level to a global reach. Now with IoT, whole buildings are producing data: we can monitor energy efficiency, temperature distribution, occupancy levels and so on.” Yet the reason everyone knows about AI now, says Girolami, is that if you’ve got a smartphone, you’ve got AI in your pocket. From fitness monitoring to consumer behaviour patterns, “there’s pretty much nothing that hasn’t been impacted by the three fields of data, computing and algorithms. That all gets wrapped up in this term AI. The same goes for terms such as deep and machine learning.”
Girolami is keen to point out that today’s AI technologies transforming our lives and businesses have little to do with the horizon-scanning futurism enjoyed by middle-brow newspapers that continually inform us that robots will be taking our jobs. You can trace this back to the 1920s, says Girolami, who describes the scenario of Fritz Lang’s sci-fi movie ‘Metropolis’, in which the cybernetics take over the menial jobs. Yet that’s not what we mean by AI today (and it’s worth remembering that H G Wells thought that ‘Metropolis’ was “silly”). What we’re talking about, says Girolami, is “new technologies based on data-driven algorithms that exploit ubiquitous computing to solve real problems and to open up new markets and business opportunities”.
For all the popularity of the view that data is the bedrock of every technological megatrend in the 21st century, there is the counter argument that the amount of data that gets used productively is just the tip of the iceberg, while there’s not much evidence to suggest that organisations are using their data stock to drive their businesses forward. To judge from Girolami’s wry laugh, this is an argument he’s had to deal with on more than one occasion.
“Sitting on data is like sitting on an oilfield,” he explains. “The latent wealth is incredible. But getting to that wealth is another story entirely. And it’s exactly the same with data. There are lots of sectors that now realise they are sitting on great potential in terms of the data they’re producing. But how you get it out of the ground, how you refine it and apply it to the products that are really going to make you money... that’s the big question. Then there are all the legal, moral and ethical issues surrounding data which are very thorny.”
The big difference between fossil fuels and data, though, is that while reserves of oil and gas are finite and become progressively harder and more expensive to extract as resources dry up, “data is infinite. That’s one of the big challenges the Alan Turing Institute will be hoping to address: the amount of data we are producing is increasing.”
In describing the Alan Turing Institute, its first chief scientist Girolami uses the term start-up. It was originally conceived as recently as 2013, when the then government chief scientific adviser Sir Mark Walport “was talking about the transformation effect that Big Data was going to have. He spoke about ‘the age of the algorithm’ and encouraged the government to establish a national resource for data science, which is how the institute came about.”
By 2015, the Turing was established, named after the mathematician in recognition of his achievements, and by 2016 it had taken on the role of national body for artificial intelligence. “We’re only five years old. The thing about start-ups is that they’ve got to get going and start doing useful stuff. You’ve got to establish governance and get the finances right. This takes time. We’re now at the stage of the evolution of this national institute when it is time to say that we’ve got to be a bit more formal in terms of the science we are setting out to achieve. That’s why the role of chief scientist has been defined now and my commitment is 100 per cent.”
Not that Girolami is new to the institute. Prior to taking up the chief scientist role, he was director of the data-centric engineering (DCE) programme at Turing. He is simultaneously the Sir Kirby Laing professor of civil engineering at the University of Cambridge and the Royal Academy of Engineering research chair in DCE, and says “you can see how the jobs leverage together. The sum of their parts is far greater, while the work of the Turing, which is a big national resource and a big national investment, harnesses all the intellectual horsepower in data science and AI across the UK. The global reach of the Institute is strong, with organisations around the world – Australia, Canada and the US – wanting to work with us.”
When asked what he is specifically aiming to achieve as chief scientist, Girolami reframes the idea, expressing it as what he wants the Institute to achieve, which is to “focus on a number of global grand challenges. Part of the job is to define what these challenges will be.” At this point, Girolami reflects on progress so far in delivering what he describes as several ‘world firsts’ in engineering and engineering-related applications. There is the world’s first self-sensing 3D-printed stainless-steel pedestrian bridge in Amsterdam. The Institute has worked on projects to deliver sustainable and efficient underground agriculture, advanced AI-enabled city-scale air-quality monitoring systems, the development of an AI-enabled UK air traffic control service, and digital twin technologies in rail transportation and aerospace design. Turing has even been involved in city-level monitoring of social distancing and activity monitoring in London during the Covid-19 public health crisis.
At the heart of everything Girolami does is data-centric engineering. “It’s what I’ve led at the Turing for the past five years.” The first thing you need to understand about DCE, he says, is that it’s not new. “Engineering, engineering science and engineering practice has always been centred around data.” To press his point, Girolami goes back to the 19th century and reminds me of the quotation from Lord Kelvin, who famously said: “If you cannot measure it, you cannot improve it.” In other words, you need data, says Girolami: “One of the underlying principles of engineering is experimentation and what does experimentation produce? Data. So, you could ask what is the big deal about data-centric engineering, if engineering has always been centred around data?”
It’s a good question because DCE is rapidly becoming an industry buzzword. Girolami is the editor of the journal Data-Centric Engineering, published by Cambridge University Press to disseminate “research at the interface of data science and engineering”. He was also the driving force behind the first DCEng Summit in September 2021, an event that was advertised as bringing together “world-class thinking on how data science tools and methods can improve the reliability, resilience, safety, efficiency and usability of engineered systems”. One of the aims of the DCEng Summit was to help delegates to “devise your data-centric engineering roadmap and solve your business challenges by getting a better understanding of how data-centric engineering is transforming industry”.
Explaining the current surge of interest in an aspect of engineering that was being practised by Rolls-Royce as far back as the 1960s (when, rather than selling engines, they were marketing ‘power by the hour’), Girolami says there is a shift in the way in which we can now approach DCE, which has been ushered in by the age of AI.
“If you look at civil engineering, structural engineering, geo-technical engineering, agricultural engineering and so on, what is new is that these disciplines are all coming to a stage in their development where sensors, networks of sensors and systems of sensors are producing data related to the processes they control and manage at rates that they never have before.” For Girolami, it follows that this aspect of the data-centric view of engineering science across all disciplines is new. “What this means is that you now have, say, agricultural engineers saying: ‘wow, I could potentially improve the crop yields because I’m producing all these measurements and data.’ It’s at such a stage now that when the president of the Royal Academy of Engineering spoke at our summit, he recognised that markets are going to change, new businesses are going to emerge and new sciences will come about, all because data is taking on a greater level of importance.”
For Girolami and his work at the Alan Turing Institute, what’s critical about this mass of data is that it provides the raw material for artificial intelligence, sustaining his mantra of the three words that define it: data, computing, algorithms. The supply of data is “infinite”, says Girolami: it’s just a case of what we do with it from here on in.
As our conversation comes to a close, I ask the man who’s done more than any other to put data-centric engineering on the map what he thinks Alan Turing would have thought of recent developments in AI. Without hesitation, Girolami informs me that he’d be “over the moon. If you look at the pioneering things he did in areas such as pattern formation, he would be delighted to see the cool things that we’re doing with AI technology today. He’d just be relishing the whole thing.”