vol 4 issue 8

The body in question

5 May 2009
By Christine Evans-Pughe

The UK's universities are leading the way in many aspects of ICT R&D. In the third of a three-part series, E&T surveys innovative projects at the University of Oxford's Computing Laboratory.

Established in 1957 out of the University's mathematics faculty, the Oxford Computing Laboratory is in many ways still characterised by those roots: a strong mathematical and theoretical tradition informs its research, which covers everything from Web science to quantum computation.

The Computing Laboratory has expanded since 2005 to 55 research staff, 40 postdoctoral researchers, 110 doctoral students, and £27m of research funding. Its highest-profile activity - and what has contributed a great deal to this growth - is its work on modelling and simulating the human heart, which is making significant inroads toward a more quantitative approach to medicine.

"Today's view of biology is in an exciting state of transition, moving from a mainly qualitative science to one, like physics and chemistry, where maths can be used to describe functions and processes," explains Nic Smith, Professor of Computational Physiology at Oxford Computing Laboratory, and a member of the it's Computational Biology Group. "In much the same way as we can simulate concrete in a bridge, we can now simulate the mechanical contraction of the heart, and it's the same mathematics - finite element analysis - developed by Boris Galerkin in the late 19th century."

Driving this change to what is often called 'systems biology' is the availability of high-performance computers, such as HECToR (High End Computing Terascale Resource), coupled with the rapid development of measurement technologies such as high-resolution clinical imaging. Yet despite the increasing detail of the data, it remains difficult to understand how specific measurements relate to the function of the system as a whole.

Systems biology

Smith's expertise and interest lie in solving the problem of linking experimental data to fully integrated function.

His focus is on modelling the heart's electrophysiology and contraction at the cellular level, and the multi-scale translation of these models to enable blood flow and cardiac electro-mechanics simulations. Smith is customising electro-mechanical and fluid-mechanical computer models of heart function to particular individuals for the €14m EU project euHEART, of which he is scientific director.

Part of the project involves working with heart surgeons and their patients at London's St Thomas's Hospital to study specific treatments, such as Cardiac Resynchronisation Therapy (CRT), and understand which patients can benefit.

"In heart failure, one side of the heart might contract and relax before the other - so there is a sort of hiccup in the electrical wave that causes the contraction," Smith explains. "If you put a pacemaker on each side of the heart it's possible to re-synchronise the contraction, but only about 60 per cent of patients regain proper function and it's hard to know why. By fitting our computer models to hearts and doing electromechanical simulation, we hope to work out who would respond well to CRT and where exactly each pacemaker should be placed."

In animated electromechanical computer models of a healthy heart pumping, developed at Oxford, the principal stresses in the heart are indicated with colour changes. One of Smith's doctoral students is adapting the model to a specific patient's heart by using a combination of measurements from various MRI scans that give the shape of the patient's heart, the muscle fibre orientation, and how the cells are contracting over a full heartbeat. Smith's team has also developed a fluid mechanical model to analyse how well the heart pumps by showing the fluid flow and the mechanical deformation of the muscle.

"One of the euHEART partners is a company called Berlin Heart that makes temporary pumps - Left Ventricular Assist Devices - that give the heart muscle a chance to recover," Smith continues. "We're using our fluid mechanical models to look at who recovers so we can get a better metric on which patients are a good bet for surgery and how to tune the pumps to ensure the best chance of recovery."

At this stage, fitting the patient-specific data to heart models is done manually. In the long term, though, the idea is that the models could be generated automatically from heart scans and made available as standard. This work requires a great deal of computing power, and the team is using HECToR.

While Smith's research is geared towards clinical applications, David Gavaghan, Professor of Computational Biology, is working with colleagues in the Computational Biology group on models of the heart and tumour growth that have potential applications in both basic science research and medicine. In the €4.2m EU project preDICT, for example, Gavaghan is working with Professor Denis Noble, co-director of computational physiology at Oxford, who has spent 50 years developing models of working heart cells, and with the pharmaceutical companies GSK, Novartis, and Roche to test the cardio-toxicity of particular drugs using detailed models of the heart's electrophysiology.

"A large proportion of drugs fail at clinical trials due to cardiac side-effects, in particular arrhythmia - the famous case being the painkiller Vioxx," Gavaghan explains. "Our aim is to build heart models that are comprehensive enough to firstly understand why existing drugs may have caused arrhythmia and ultimately to predict how new compounds will behave and whether certain people have a genetic susceptibility to an adverse reaction." It is still early days, Gavaghan says: "Our proposal is that we're setting up a framework in which we can look at this problem and try some case studies to see if it's possible. Once we've done this, we'll be able to go on to the more interesting stage of looking at particular compounds."

As part of preDICT, the Computational Biology Group is using an in-house developed general-purpose simulation package called CHASTE (Cancer Heart And Soft Tissue Environment), aimed specifically at the multi-scale, computationally demanding problems arising in biology and physiology. Unusually for software written in academia, CHASTE has been built using agile software engineering techniques such as test-driven development, pair programming, and code refactoring.
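CHASTE is a substantial piece of software in its own right; the fragment below is not CHASTE code, merely a toy illustration of the test-first style the group describes, in which the test is written before the model code it constrains (the cell model and its parameters are invented for the example).

```python
# Toy illustration of test-driven development (not CHASTE code): the test
# is written first and pins down the behaviour the model must satisfy.
import math

class ToyCellModel:
    """Deliberately simple: the membrane potential decays back to rest."""
    def __init__(self, resting_potential):
        self.rest = resting_potential
        self.potential = resting_potential

    def apply_stimulus(self, magnitude):
        self.potential += magnitude

    def step(self, dt, n_steps):
        for _ in range(n_steps):
            self.potential += dt * 0.05 * (self.rest - self.potential)

def test_cell_returns_to_rest_after_stimulus():
    cell = ToyCellModel(resting_potential=-85.0)
    cell.apply_stimulus(magnitude=40.0)
    cell.step(dt=1.0, n_steps=500)
    # After enough time the potential should have decayed back to rest.
    assert math.isclose(cell.potential, -85.0, abs_tol=1.0)

test_cell_returns_to_rest_after_stimulus()
```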

The idea is that CHASTE can serve as a highly efficient, standard software platform for these kinds of simulations, designed to run on large-scale, high-performance computer systems.

The level of physiological detail of these models means that even on state-of-the-art high performance computing facilities it can take several hours or days - depending on the application - to simulate a second or two of a complete beating heart.

The Computational Biology group is collaborating with vendor Fujitsu to see whether the company's next-generation 10-petaflop supercomputer can run such models in near real-time. One element of this work is to explore whether it is feasible to use adaptive numerical algorithms in heart models, so that fewer mesh points are used in areas of the model where the solutions to the underlying equations are changing slowly. These ideas have been used in oceanic modelling and in the automotive industry, but have not been applied to heart models before.
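As a rough illustration of that adaptive idea (not the Oxford algorithms themselves), the sketch below refines a one-dimensional mesh only where a solution with a steep front is poorly represented, and leaves it coarse where the solution changes slowly; the profile and tolerance are invented for the example.

```python
# Sketch of adaptive meshing: add points only where the solution changes
# quickly, keep the mesh coarse where it changes slowly. Illustrative only -
# real cardiac solvers adapt 3D meshes as the electrical wavefront moves.
import numpy as np

def adapt_mesh(x, u_of, tol=0.05, max_passes=6):
    """Insert midpoints wherever linear interpolation misses u by more than tol."""
    for _ in range(max_passes):
        u = u_of(x)
        mid = 0.5 * (x[:-1] + x[1:])
        # Error estimate: distance between the true value at the midpoint and
        # the straight line joining the two endpoints of each interval.
        err = np.abs(u_of(mid) - 0.5 * (u[:-1] + u[1:]))
        flagged = err > tol
        if not flagged.any():
            break
        x = np.sort(np.concatenate([x, mid[flagged]]))
    return x

# A travelling-front-like profile: steep near x = 0.5, flat elsewhere.
front = lambda x: np.tanh(40.0 * (x - 0.5))

coarse = np.linspace(0.0, 1.0, 11)
refined = adapt_mesh(coarse, front)
print(len(coarse), "->", len(refined), "mesh points, clustered near the front")
```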

The next stage - and perhaps the hardest - is to look at cancer. "Cancer modelling has typically focused on one particular aspect, such as how does a tumour invade, whereas you really want a multi-scale model just as we have with the heart, that can include everything starting from genetic mutations through to when a tumour starts to vascularise and metastasise," explains Gavaghan. "It's a huge problem, but if we can develop a framework first for the heart, and then transfer to cancer, we can make a good start."

Tools for quantifying uncertainty

In another part of the Computing Laboratory, Marta Kwiatkowska, Professor of Computing Systems, is - as it were - quantifying uncertainty. Her work has culminated in the development of a versatile model-checking tool called PRISM, which can be used for studying the behaviour of almost any system that behaves in a random or probabilistic way.

PRISM has been applied to a wide range of real problems, including quantifying the best way to avoid denial-of-service security threats, studying the behaviour of communications protocols, measuring the reliability of nanotechnology designs, and even examining biological cell behaviours.

"Anyone who wants to understand how systems built of components [such as sensors, processing elements, molecules] in parallel interact with each other or a user, or react to changes in the environment, can use PRISM," says Kwiatkowska. 

Kwiatkowska's group has been analysing wireless networking protocols: "We have unearthed some very strange behaviours," she reports. "For Bluetooth, we found that the worst-case expected time to send a message is over two seconds. You'd have to be lucky to find that by testing or simulation."

Formal verification allows unpredictable system behaviour to be checked exhaustively and then quantified using mathematical techniques. As computers become ubiquitous, says Kwiatkowska, applying these techniques will become more and more important - safety being the pressing reason.

A recent example of the inadequacy of standard testing and simulation regimes was in 2005 when Toyota had to recall 75,000 Prius cars after drivers reported sudden stalling or stopping at high speed due to a software glitch.

For proposed health applications such as using body sensor networks to monitor people after serious operations, formally checking for strange glitches will be crucial. For such a system, PRISM can answer questions like: 'what is the probability that a body sensor fails within ten minutes?' or 'what is the worst-case probability that more than 12 sensor readings have been lost by the first day?', and so on.
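PRISM has its own modelling language and checking engine; purely to illustrate the first kind of query, the toy sketch below steps a small discrete-time Markov chain for a single sensor, minute by minute, and reads off the probability that it has failed within ten minutes. The states and transition probabilities are invented for the example.

```python
# Toy discrete-time Markov chain, minute by minute, for one body sensor.
# States: 0 = working, 1 = degraded, 2 = failed (absorbing).
# The transition probabilities are invented for illustration; a tool such
# as PRISM answers this kind of bounded-reachability query exactly from a
# model written in its own language.
import numpy as np

P = np.array([
    [0.97, 0.02, 0.01],   # working  -> working / degraded / failed
    [0.00, 0.90, 0.10],   # degraded -> degraded / failed
    [0.00, 0.00, 1.00],   # failed stays failed
])

dist = np.array([1.0, 0.0, 0.0])   # start in the 'working' state
for minute in range(10):
    dist = dist @ P                # propagate the distribution one step

print(f"P(sensor has failed within 10 minutes) = {dist[2]:.4f}")
```

A realistic model has millions of states rather than three, which is exactly the representation problem Kwiatkowska describes next.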

The ongoing research challenge is to find more compact data-structures for handling this type of analysis, says Kwiatkowska: "A probabilistic system is one that doesn't behave in an entirely predictable way, so transitions from one state to another have to be annotated with probabilities or rates. And there may be millions of such probable states. So representing such systems efficiently in computer memory is a very difficult task."

PRISM can't yet be applied at the scale of ubiquitous computing because the resulting models are too complicated. "The Bluetooth analysis, for example, involved two devices - any more wouldn't have fitted into the computer memory," she adds, "but there is scope for clever model reductions and approximation methods to make this feasible."

Talking to computers

How to get computers to process language as if they really understood it remains one of the knottiest questions in computing. Take, for example, the sentence 'The porters refused admission to the students because they feared violence'. Most of us take 'they' to refer to the porters. Change the verb 'feared' to 'advocated', and we assume this time that 'they' refers to the students. Humans make such decisions by inference from our knowledge of the world - they are not thoughts about students and porters that we have consciously entertained before.

"Computer scientists are not much closer to figuring out how to make such fine distinctions than we were 30 years ago," says Stephen Pulman, Professor of Computational Linguistics and Director of the Computing Laboratory. "The best we can do is an approximation. If you have enough data to train a statistical model of likely interpretations, hopefully it will get it right more often than wrong."

The lab's Computational Linguistics Group's answer is to mix statistical methods with automated grammatical and semantic analysis. "Statistical methods have the virtue of behaving robustly on new data, but it is not easy to uncover deep aspects of the meanings of sentences this way. For that you need traditional syntactic and semantic analysis," avers Pulman. "We hope to combine depth of analysis with breadth of coverage using this combination of methods."

An interesting application is 'sentiment analysis' - automatically detecting attitudes towards products, people or policies from news reports and other online text. The standard technique is to score text against lists of words tagged as having positive, negative or neutral polarity.

A doctoral student in Pulman's group, Karo Moilanen, has extended this approach by basing the sentiment profile on a full syntactic analysis using compositional rules that recognise, for example, that while 'killing' is bad and 'bacteria' are bad, 'killing bacteria' is perceived as good. The result is a much more nuanced sentiment classification system, which is attracting commercial interest.

"If you have a sentence like 'the opposition criticised the government for cashing-in on the success of our Olympic team', then there are three entities we can analyse: the opposition, the government and the Olympic team. The attitude towards the team is positive, but the attitude to the government is negative. So you get a much more fine-grained profile of what people's opinions and motivations are towards particular entities," says Pulman.

"The system processes an average length news report in one second on a laptop, and can collect from many different sources and compare and present it in a digestible way. We plan also to try this system in the world of stocks and shares, and see whether we can predict market movements on the basis of textual data," he adds.

The lab is also applying a combination of techniques in the €13m EU-funded COMPANIONS project, which has the aim of developing 'embodied conversational agents', i.e., computer-based companions that can stay with someone for a long period, developing a relationship with them.

Pulman's group has the intriguing task of measuring what would be classed as a 'socially successful dialogue': "In commercial spoken language dialogue systems, such as those used by banks, automated dialogue is driven by the need to carry out a database transaction," he explains.

"You know the dialogue is successful when you get the money out or pay a bill. The type of conversation you might have with a companion isn't going to be like that. We aim to build a system that is sensitive to the reactions of the user, using sentiment analysis and also detecting linguistic cues associated with emotions, like surprise or joy and sadness.

"We intend the ECA dialogue system to use machine learning techniques to change its behaviour to maximise positive sentiment in the user."


The semantic way

Data cleaning

A reliable-looking website states that Anton Chekhov wrote 'One Day in the Life of Ivan Denisovich'. How do you know it's wrong? It's a trivial example, but as we come to depend more on information technology, finding ways to evaluate the quality of data sources is increasingly important.

Michael Benedikt, Professor of Computing Science at Oxford, is working on this kind of problem in two projects with Bell Labs, the research arm of Alcatel-Lucent. The first, which is EPSRC-funded and led by Professor Wenfei Fan of Edinburgh University, is to develop tools that can clean data from multiple sources by capturing data quality, integrating the data, and detecting and repairing inconsistencies.

The second project is gathering data from the Web to answer queries in a 'quality-aware' fashion rather than pre-processing data sources to improve quality in advance of queries. 

Benedikt is interested in evaluating data reliability with automated reasoning and using the results to produce an error profile. "A simple idea is to use semantic information about the data; for example, we know people can only have one father, so two fathers must be wrong.

"This is the constraint-based approach to data cleaning. A second approach is cross-referencing over several data sources; this can be combined with bootstrapping from one particular data source that is reliable," he explains. Producing an error profile, he says, involves combining all the indicators of reliability: trusted data such as an online encyclopaedia, manual verification by users, and constraint-based techniques.
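A minimal sketch of the constraint-based idea (toy data and an invented 'one father per person' check, not the project's actual tools) might merge records from two sources and flag the rows that violate the constraint:

```python
# Toy sketch of constraint-based data cleaning (illustrative only): merge
# records from two sources and flag entries that violate the semantic
# constraint that each person has at most one father.
from collections import defaultdict

source_a = [("Alice", "father", "Bob"), ("Carol", "father", "Dan")]
source_b = [("Alice", "father", "Robert"), ("Carol", "father", "Dan")]

facts = defaultdict(set)
for subject, relation, value in source_a + source_b:
    if relation == "father":
        facts[subject].add(value)

for person, fathers in facts.items():
    if len(fathers) > 1:
        # An inconsistency: a repair step would pick one value, e.g. by
        # preferring the more trusted source or by cross-referencing.
        print(f"Conflict for {person}: {sorted(fathers)}")
    else:
        print(f"{person}: consistent ({fathers.pop()})")
```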

The tools generated in the EPSRC project are intended for use in scientific data management in the Generation Scotland project, a partnership between academics and the Scottish NHS to compare health and illness factors passed on in families.
