Virtual talking head can express human emotions

19 March 2013
By Edd Gent
Research Engineer Dr Vincent Wan, from Toshiba Research in Cambridge, alongside the digital talking head named Zoe by researchers

Image showing the digital architecture behind Zoe's face (CREDIT: University of Cambridge / Toshiba Cambridge Research Lab)

A virtual talking head created by scientists in Cambridge can express human emotion.

The new system, called Zoe, can recite text while realistically recreating emotions such as happiness, anger and fear. It could lead to life-like computerised personal assistants similar to those seen in sci-fi films.

According to its designers, it is the most expressive controllable avatar ever created, and students have already remarked on the similarity between the disembodied head and Holly, the ship's computer in the cult comedy Red Dwarf.

Developed by Toshiba's Cambridge Research Lab and the University of Cambridge's Department of Engineering, the face is actually that of Zoe Lister, an actress best known as Zoe Carpenter in the Channel 4 series Hollyoaks.

Professor Roberto Cipolla said: "This technology could be the start of a whole new generation of interfaces which make interacting with a computer much more like talking to another human being.

"It took us days to create Zoe, because we had to start from scratch and teach the system to understand language and expression. Now that it already understands those things, it shouldn't be too hard to transfer the same blueprint to a different voice and face."

The system is light enough to run on mobile phones, and uses could include smartphone personal assistants or "face messages" to replace texts.

It works by using a set of fundamental emotions. Zoe's voice, for example, has six basic settings: happy, sad, tender, angry, afraid and neutral. The user can adjust these settings to different levels, as well as altering the pitch, speed and depth of the voice itself.

By combining these levels, it is possible to create almost infinite emotional combinations – for example, a combination of speed, anger and fear makes Zoe sound as if she is panicking.
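As a rough illustration of how such adjustable settings might be combined, here is a minimal Python sketch. It is not the researchers' implementation: the `VoiceSettings` class, the 0 to 1 level range, the multiplier-style pitch, speed and depth values, and the numbers in the `panicking` preset are all assumptions made for the example. Only the six emotion names and the three voice controls come from the description above.

```python
from dataclasses import dataclass, field

# The six basic emotion settings named in the article.
EMOTIONS = ("happy", "sad", "tender", "angry", "afraid", "neutral")


def _clamp(value, low=0.0, high=1.0):
    """Keep a level inside the assumed 0..1 range."""
    return max(low, min(high, value))


@dataclass
class VoiceSettings:
    """Hypothetical container for one combination of voice settings."""
    emotions: dict = field(
        default_factory=lambda: {name: 0.0 for name in EMOTIONS}
    )
    pitch: float = 1.0  # assumed multiplier; 1.0 means unchanged
    speed: float = 1.0
    depth: float = 1.0

    def set_emotion(self, name, level):
        """Set one emotion level and return self so calls can be chained."""
        if name not in EMOTIONS:
            raise ValueError(f"unknown emotion: {name}")
        self.emotions[name] = _clamp(level)
        return self

    def active(self):
        """Return only the emotions with a non-zero level."""
        return {n: lvl for n, lvl in self.emotions.items() if lvl > 0}


def panicking():
    """Speed plus anger plus fear, as in the article's panic example.

    The specific numbers are illustrative guesses.
    """
    settings = VoiceSettings(speed=1.5)
    return settings.set_emotion("angry", 0.6).set_emotion("afraid", 0.9)
```

Calling `panicking().active()` returns the anger and fear levels, with `speed` raised above its default. Because each level can vary continuously, even this toy model allows a very large number of distinct combinations.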

Scientists hope the software could soon be adapted to allow people to upload their own faces and voices in a matter of seconds.

If this can be developed, a user could, for example, text the message "I'm going to be late" and ask it to set the emotion to "frustrated", so their friend would then receive a face message that looked like the sender repeating the message in a frustrated way.

The team is also working with a school for autistic and deaf children to see if Zoe could be used to help pupils read emotions or learn to lip-read.

"Present-day human-computer interaction still revolves around typing at a keyboard or moving and pointing with a mouse," Prof Cipolla said. "For a lot of people, that makes computers difficult and frustrating to use.

"In the future, we will be able to open up computing to far more people if they can speak and gesture to machines in a more natural way. That is why we created Zoe: a more expressive, emotionally responsive face that human beings can actually have a conversation with."
