
‘We have to boost the pace of discovery’: Ed Pyzer-Knapp, AI lead at IBM
Image credit: Nick Smith
Worldwide research lead for AI-enriched modelling and simulation at IBM, Ed Pyzer-Knapp says that the ‘need for science has never been more urgent’, while the key to solving future global challenges lies in ‘accelerated discovery’.
“To deal with the challenges our world is facing, we have to boost the pace of discovery,” says Dr Edward O Pyzer-Knapp, worldwide research lead for AI-enriched modelling and simulation at multinational technology corporation IBM. The 33-year-old British scientist is a man on a mission: if you take only one thing away from meeting him, it will be that there is an imperative firmly placed on the shoulders of the STEM community to do things better and quicker.
“The need for science has never been more urgent,” says Pyzer-Knapp, and the way to get that message across is to talk about it in simple terms. A passionate advocate for the public understanding of science, he talks continually about “real-world applications”. You get the feeling that he’d be happier with a slightly less intimidating job title, which he accepts is “slightly convoluted”, while his enthusiasm and energy in the AI space appear to be limitless. Apart from his post at IBM, Pyzer-Knapp is also the editor-in-chief of the journal Applied AI Letters, as well as Visiting Professor in Industrially Applied AI at the University of Liverpool. “I work better when there’s a lot to do,” he says.
Pyzer-Knapp’s work focuses on artificial intelligence, which he describes as “one of a suite of tools that helps me to deliver what I am really passionate about, which is what’s now called accelerated discovery”. This term is of IBM coinage and, while it brings with it more than a hint of 21st-century corporate policy newspeak, Pyzer-Knapp says that it can be translated into meaning simply “investigating how we use current technology and methods to make us better at science”. What this further means for the piano-playing cricket enthusiast is “the blending of AI with HPC [high-performance computing] – including dabbling in quantum – and Cloud, to help us do discovery faster”.
Pyzer-Knapp is also passionate about communicating the value of science to the non-scientist and is eager to describe what he does in straightforward terms while referencing real-world applications. This is the founding philosophy of Applied AI Letters, which he explains is “as the title implies, all about applications”. He says it’s all right to have abstract knowledge and research, but you’ve got to do something with it. As an analogue, he refers to Damien Hirst’s art installation The Physical Impossibility of Death in the Mind of Someone Living, explaining that “you’d get people saying ‘anyone could have put a shark in a glass tank full of formaldehyde’. But Hirst did it.” Pyzer-Knapp was “enraged enough” by the lack of understanding of what AI could do in the real world that, as with Hirst, he went ahead and made something happen. And while Hirst pickled a tiger shark, Pyzer-Knapp set up a journal with the editorial principle that it had to be about something: “So I assembled a cast of editors and it’s going really well.”
“I like to think of the world splitting up into three parts,” he says. “There is the real world: you build a bridge. Then there is the data world, which is an historical agglomeration of what’s gone on in that real world. Then there is the digital world, which involves the simulation of the real world by maybe taking the laws of physics and saying: ‘I’m building a bridge. Before I do it, can I produce a grounded idea based on science of what’s going to happen?’” In this way, AI has revolutionised bridge-building, says Pyzer-Knapp, “because nobody builds one these days without simulating wind conditions”. He illustrates what happens when you don’t by referring to the collapse of the pre-AI Tacoma Narrows Bridge due to self-exciting aeroelastic flutter created by seemingly innocuous 40mph (64km/h) wind conditions. “The whole thing fell down. These days, that couldn’t happen because you’d have simulated everything in advance.”
‘Accelerated discovery means using current technology and methods to make us better at science.’
The problem with simulation, says Pyzer-Knapp, is that it is expensive. It’s also difficult because, “you need to ask the right questions. When I talk about AI-enriched modelling and simulation, I’m asking: ‘how do we get AI to help us answer those questions about the digital world so that we can do the best job that we can in the real world?’ It might be by suggesting the right protocols to use or replacing some of that expensive physics with neural networks.” When asked what his team does to make that happen, Pyzer-Knapp says that they “exist to work across the spectrum. We have done fundamental mathematical development, computer science algorithm deployment optimisation, as well as stringing this all together into complex heterogeneous workflows for applications.”
While this might sound abstract, it means that if a designer on, say, a Formula 1 car needs to optimise a spoiler profile, Pyzer-Knapp can help, specifically at IBM’s new Hartree National Centre for Digital Innovation. Announced in June this year, the centre is a £210m collaboration with the UK government’s Science and Technology Facilities Council (STFC) that, according to IBM’s blog ‘The Discovery Accelerator comes to Europe’, will “become a hub of engagement with collaborators across the UK’s industrial and research ecosystem seeking to drive innovations in life sciences, new materials development, environmental sustainability, and advanced manufacturing”. More than 100 scientists from IBM Research and STFC will work together over the next five years applying AI and quantum computing, “to produce innovations in materials, life sciences, climate, agriculture and manufacturing”.
Pyzer-Knapp has mixed feelings over his family teasing him “mercilessly” about the fact that his first interaction with a computer came at the tender age of five. “It was a BBC Micro, and I was the only person that could wire it up properly.” His first venture into programming was to write a search protocol for the school computer’s encyclopaedia: “If I’d had any kind of prescience, I’d have remembered that later in life and developed search algorithms when they were much more monetisable.” He describes himself as “very logical”, and it was the corresponding logic at the heart of computing that was its early attraction. While reading chemistry at Durham University, a chance sequence of events that included a software program licence expiring on him led Pyzer-Knapp to write his own software. It was the moment when “I started to realise that I enjoyed working with the data more than getting the data. My wife is a proper chemist, but for me there was only a certain amount of time I could spend scrubbing down a lab because something’s exploded before I was dissuaded from chemistry.”
His hands still fluorescing from the mishaps that led him to suspect that he was “terrible” at lab-based chemistry, he retreated to the University of Cambridge to do a purely theoretical PhD about the way molecules pack into solids. The title of his thesis was ‘Exploring the crystal energy landscapes of porous molecular crystals’, which was to bring out the latent periodical editor in Pyzer-Knapp, who is still annoyed at the bumpy scansion and repetition of ‘crystal’. “At least it didn’t have the word ‘towards’ in it,” he reflects. “That always sounds like you didn’t quite get there.”
Impatient to get onto real-world applications, he wrote his PhD in two-and-a-half years (it normally takes as much as three times longer to gain a doctorate), got on a plane, crossed the Atlantic and headed for Harvard, where he was in charge of the day-to-day running of the university’s Clean Energy Project: a White House project under the Obama administration in collaboration with IBM which combined massive distributed computing, quantum-mechanical simulations and machine learning to “accelerate discovery of the next generation of organic solar cells. The idea was that we understood that these are important, but hard to find good ones. So could we, by understanding the molecule, predict how good the cell it generated would be: and if so, could we screen through a large number of molecules? Where IBM came in was with technology called the World Community Grid, which basically said that when you weren’t using your computer you could either get it to show you pictures of your cat, or you could get it to do something useful. A quarter of a million people signed up for this and we had the equivalent of a 7,000-node cluster running just for me, every day. And that’s one hell of a beast to keep fed.
“The reason I got into AI was that I couldn’t do that manually. I needed some intelligence to drive this thing: something that would construct these molecules and prioritise them. This is what took me into the area called Bayesian optimisation, which is what I do most of my work in now.”
‘Bayesopt’, as the AI community calls it, is not only named after the 18th-century Presbyterian minister and statistician Thomas Bayes, but is also, according to Pyzer-Knapp, “a really underappreciated family of algorithms”, which he explains for the layperson in terms of looking for a lost set of keys in the real world. “How do you look for keys?” he asks rhetorically, before explaining that the one thing you don’t do is look in a focused way everywhere. “What you do is think about where you last saw them and you retrace your route. And you have some sort of idea about probabilities.” What bayesopt does is look for ‘good solutions’ in the way that you search for your keys. “It’s checking and updating what it thinks about the world now and it’s learning all the time.”
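For readers who want to see the “checking and updating” idea in concrete terms, the short Python sketch below applies Bayes’ rule to the lost-keys analogy. It is an illustration only, not Pyzer-Knapp’s or IBM’s code, and it shows the belief-updating step rather than a full Bayesian optimisation routine. The place names, the starting probabilities and the `p_find` value (the assumed chance of spotting the keys when looking in the right place) are all invented for the example.

```python
def update_after_failed_search(beliefs, searched, p_find=0.9):
    """Apply Bayes' rule after looking in `searched` and finding nothing.

    beliefs: dict mapping place -> probability the keys are there (sums to 1).
    p_find:  assumed chance of spotting the keys if they really are there.
    """
    posterior = {}
    for place, prior in beliefs.items():
        # Likelihood of "looked and found nothing" if the keys are at `place`.
        miss = (1.0 - p_find) if place == searched else 1.0
        posterior[place] = prior * miss
    total = sum(posterior.values())
    # Renormalise so the updated beliefs sum to 1 again.
    return {place: p / total for place, p in posterior.items()}


# Hypothetical starting beliefs about where the keys might be.
beliefs = {"hallway": 0.5, "kitchen": 0.3, "car": 0.2}

first_guess = max(beliefs, key=beliefs.get)   # most likely place first
beliefs = update_after_failed_search(beliefs, first_guess)
next_guess = max(beliefs, key=beliefs.get)    # best guess after the update
```

A failed search of the most likely place does not rule it out entirely, but it shifts most of the probability onto the places not yet checked, which is the “learning all the time” behaviour described above.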
I put it to Pyzer-Knapp that the logical loser of keys would hardly see much value in looking for lost items in the ‘most likely’ places, on the basis that if they were in any of those, you’d be hard-pushed to describe them as lost. This, he replies, is “precisely what makes bayesopt better than anything else because the optimisation part of it understands what it doesn’t know. It can make the options available for the next move in finding your keys sit on a scale between purely exploitative (‘this is where I think it is’) and purely exploratory (‘I’m going there to rule this out’). Then there’s a sliding scale in between where you are trading off these two things. What I’ve been doing for the past five years is inventing more strategies for how you do this in various situations.” Such situations might include a scenario where there are five clones looking at the same time or constraints that you can observe but not understand (‘you can’t look for your keys in there because the door’s locked from the other side...’).
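The sliding scale between exploiting and exploring can be sketched with one widely used rule, the upper confidence bound, which scores each option as its predicted value plus a multiple of the uncertainty about it. The article does not say which strategies Pyzer-Knapp’s team uses, so treat the Python below as one possible illustration. The function name `pick_next`, the `kappa` dial and the numbers are assumptions made for the example; in a real system the predictions and uncertainties would come from a statistical model rather than being typed in by hand.

```python
def pick_next(candidates, kappa):
    """Choose the next option to try using an upper-confidence-bound score.

    candidates: dict mapping name -> (mean, std), i.e. the predicted value
                of an option and the uncertainty about that prediction.
    kappa:      0 is purely exploitative; larger values favour options
                the model knows little about (exploration).
    """
    def score(name):
        mean, std = candidates[name]
        return mean + kappa * std

    return max(candidates, key=score)


# Hypothetical beliefs: (how promising, how uncertain) for each place.
candidates = {
    "hallway": (0.8, 0.05),   # promising and well understood
    "kitchen": (0.6, 0.20),   # decent, somewhat uncertain
    "attic": (0.1, 0.45),     # unpromising, but largely unknown
}

exploit = pick_next(candidates, kappa=0.0)    # "this is where I think it is"
balanced = pick_next(candidates, kappa=1.5)   # a trade-off between the two
explore = pick_next(candidates, kappa=10.0)   # "I'm going there to rule this out"
```

Turning the single `kappa` dial moves the choice from the safest bet, through a compromise, to the least understood option.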
Pyzer-Knapp is keen to point out how this could be used in the real world: “I’ve got 20,000 molecules and I want to know with a limited budget how to find the best one. Or I’m designing a front wing for an F1 car: I can deform it in a number of ways, but I’ve got a budget. How can I find the best way? We’ve done work with experimental design where we’re helping people making incredibly cool technology which stores information on DNA. But they need to tune their machines so that the process is more efficient. We use the technology everywhere.”
To make this work, “there are two things you need to think about: quasi-philosophical and quasi-practical. First, there’s how you interact with the technology from the user perspective, and second, how the system is there to be interacted with. I could build the best GUI in the world, but if the system behind it isn’t built to be interacted with in any kind of sensible way, I’m limited.” Pyzer-Knapp says these considerations were understood early on, “and so we spent some time developing ‘explainable AI’ models. Explainability and transparency is key for the long-term success of these models.”
As our time comes to a close, Pyzer-Knapp introduces the idea that the unifying thread that is woven through everything he does is “the diversity of thought and experience. If you only talk to one set of people, you’ll only ever get one set of ideas. It’s easy to operate in that echo chamber, but not have an impact on the real world. And the way to do that, the most important thing at this point in time, is accelerated discovery.”