Advances in eye tracking and speech synthesis
E&T looks at developments in eye tracking and speech synthesis
Eyetracking technology like this Tobii IS-2 has come down in price
For those whose eye function cannot interact through gaze, Quovadis is developing a wearable brain interface
Children as young as 13 months have used Eyegaze Edge with some success
Eye tracking and speech synthesis are now able to give voice to those with even severely limited movement.
In 'Diamonds Are Forever' (1971) James Bond's nemesis Blofeld uses an electronic gadget to synthesise casino owner Willard Whyte's voice to fool our favourite spy. Fast-forward to today and we have technologies that can not only synthesise any voice or accent but that can generate speech by someone glancing at text on a screen. For Tony Nicklinson, suffering from locked-in syndrome after a brain stem stroke in 2005, communication software The Grid 2 linked to an eye-tracker enabled him to argue eloquently for doctors to end his life, until his natural death in August 2012 shortly after losing his High Court appeal.
Nicklinson used a southern English voice by Acapela called 'Graham' that comes with The Grid 2. 'Graham' is a modern text-to-speech engine built using strung-together snippets of real recorded speech that capture changes in intonation and frequency spectrum that make each human voice unique and expressive.
Edinburgh-based CereProc is another company whose high-quality British regional voices (including Scottish, Irish, and Black Country) show how far speech synthesis has progressed since we first heard the flat robotic tones of physicist Professor Stephen Hawking, whose voice harks back to an earlier technology based on mathematical models of the human vocal tract.
Eye'm in control
However, authentic synthetic speech is merely the headline technology in breaking down the human isolation of severe disability. The decreasing cost of eye-tracking interfaces over recent years has arguably been more important. Stephen Murray, a professional BMX rider before he was paralysed in 2007, still has his voice but describes his eye-control system from Swedish firm Tobii Technology as like an antidepressant that has put him back in control of his life.
Eye-tracking systems use tiny infrared-sensitive video cameras positioned below a screen to enable users of assistive communication software such as The Grid 2 to generate speech, control their environment (lights, call bells, television etc), use email, the Web and social media, and even work full-time using only eye movements.
A technique called Pupil Centre Corneal Reflection (PCCR) captures the instances when the eye pauses on a specific area of the screen, and tracks the rapid movements of the eyes between pauses. PCCR works by filming (at 30 to 60 frames per second) the reflections on the cornea (the transparent front of the eye) and in the pupil from an infrared LED light source. Image processing algorithms estimate the position of the eye and the point of gaze by analysing the vectors between the pupil centre and corneal reflections. Bright pupil eye-tracking, where the infrared LED is placed close to the optical axis of the camera (causing the pupil to appear lit up), is the most widely used form of lighting.
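As a rough illustration of the vector analysis described above, the sketch below maps the pupil-centre-to-reflection vector onto screen coordinates using a simple per-axis straight-line calibration. Everything in it is an assumption made for illustration: the function names, the calibration procedure and the linear model are hypothetical, and commercial trackers use more elaborate, proprietary models.

```python
from statistics import mean


def pupil_glint_vector(pupil_centre, glint):
    """Vector from the corneal reflection (glint) to the pupil centre, in image pixels."""
    return (pupil_centre[0] - glint[0], pupil_centre[1] - glint[1])


def fit_axis(inputs, targets):
    """Least-squares straight line: targets ~ slope * inputs + intercept."""
    mean_in, mean_out = mean(inputs), mean(targets)
    spread = sum((x - mean_in) ** 2 for x in inputs)
    if spread == 0:
        raise ValueError("calibration points must differ along this axis")
    slope = sum((x - mean_in) * (y - mean_out) for x, y in zip(inputs, targets)) / spread
    return slope, mean_out - slope * mean_in


def calibrate(samples):
    """samples: (pupil_centre, glint, screen_point) tuples recorded while the
    user looks at known on-screen targets (an assumed calibration routine)."""
    vectors = [pupil_glint_vector(pupil, glint) for pupil, glint, _ in samples]
    screen = [point for _, _, point in samples]
    fit_x = fit_axis([v[0] for v in vectors], [s[0] for s in screen])
    fit_y = fit_axis([v[1] for v in vectors], [s[1] for s in screen])
    return fit_x, fit_y


def estimate_gaze(calibration, pupil_centre, glint):
    """Convert one camera frame's measurements into a screen gaze point."""
    (slope_x, offset_x), (slope_y, offset_y) = calibration
    vx, vy = pupil_glint_vector(pupil_centre, glint)
    return (slope_x * vx + offset_x, slope_y * vy + offset_y)
```

Using the vector between the two features, rather than the pupil position alone, is what makes this kind of estimate tolerant of small head movements: both features shift together in the camera image, so their difference changes mainly with eye rotation.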
Some systems are so accurate that babies can use them. Among the youngest users of the Eyegaze Edge made by LC Technologies (an American company that built its first systems in 1986) is a 13-month-old baby girl with spinal muscular atrophy. "She is smart, understands cause and effect, and is able to run picture-based programs in The Grid 2," says Nancy Cleveland, the company's medical director.
A tiny red dot serves as a screen cursor (in effect the x-y coordinate of the gaze-point) to show where the eye is pointing to on the onscreen pictures of keys and symbols. These graphics make an audible click and flash when they activate, based on a gaze-time that can be set for each specific user. The shortest activation gaze-time is around one-fifth of a second.
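Gaze-time activation of this kind can be sketched in a few lines. The example below is a hypothetical illustration, not the Eyegaze Edge implementation: the cell layout, the sample format and the function name are assumptions, and the default dwell time of 0.2 seconds simply mirrors the one-fifth of a second quoted above.

```python
def dwell_activations(samples, cells, dwell_time=0.2):
    """Return a list of (timestamp, cell_name) activations.

    samples: (t_seconds, x, y) gaze points in time order.
    cells: dict mapping a cell name to its (left, top, right, bottom) rectangle.
    dwell_time: how long the gaze must rest on a cell before it activates.
    """
    def cell_at(x, y):
        for name, (left, top, right, bottom) in cells.items():
            if left <= x < right and top <= y < bottom:
                return name
        return None

    activations = []
    current = None      # cell the gaze is currently resting on
    entered_at = None   # time at which the gaze entered that cell
    fired = False       # True once this visit has already triggered
    for t, x, y in samples:
        cell = cell_at(x, y)
        if cell != current:
            # Gaze moved to a different cell (or off all cells): restart the timer.
            current, entered_at, fired = cell, t, False
        elif cell is not None and not fired and t - entered_at >= dwell_time:
            activations.append((t, cell))
            fired = True
    return activations
```

Each visit to a cell fires at most once, so a user who keeps looking at the same key does not trigger it repeatedly; looking away and back starts a new visit.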
Eyegaze Edge systems can work with only one eye, giving freedom of head position for the user, which means babies and adults who have to lie on one side with their head at an angle can use them. LC Technologies' previous youngest user was an 18-month-old who couldn't move or speak and was on a ventilator. "He figured out the system in no time at all and is now three years old and uses the system every day," says Cleveland. "A lot of what he understood at first was cause and effect. For example, looking at a cell with a picture of a lion would play a video of a lion. Now there is serious effort being made to teach him to use symbols to communicate."
Algorithms that interpret eye movements are the main patented IP behind these systems, most of which run on PC hardware with relatively low-cost software programs like The Grid 2. But eye-tracking systems still cost several thousand pounds because of the high cost of the camera hardware.
Mass market eye-gaze interaction
Tobii Technology started life in 2001 with eye-tracking systems for studying human behaviour and human-computer interaction. Tobii is now a leading seller of eye-controlled all-in-one computers for people with disabilities. Its recent projects include a concept eye-controlled laptop built by Lenovo; field tests of an eye-tracking system for driver drowsiness detection in cars; an Asteroids arcade game that works with both eye and head movements; a prototype eye-controlled television made by Hai; and, most recently, a concept tablet with embedded eye tracking by NTT Docomo that points the way forward.
In March 2012 the company received $21m from Intel Capital towards bringing its technology to the mass market. "Computer peripheral eye-trackers used in assistive communication cost around £4,000. To bring that price further down you need consumer volumes," says Sara Hyléen, Tobii's marketing director.
As part of its strategy to bring gaze interaction into the mainstream Tobii now has a 3W single-board eye-tracking camera component that can be integrated into any product. It includes system-independent processing and measures 200 x 25 x 15mm.
Back to the voice of the future
Off-the-peg text-to-speech engines are generally bundled into the communications software and so are not costly. But the future takes us back to Willard Whyte. Today's version of 'voice transformation' means capturing a small speech sample to quickly produce a custom voice. "The goal over the next three years is to be able to produce any voice in this way," says Acapela's chief technology officer Fabrice Malfrére.
Voice transformation uses Hidden Markov Models that 'learn' from a small database of information relating to linguistics and prosody (the music of speech) rather like the databases in today's unit selection-based voices. From this material it generates parameters to create speech from a mathematical speech model (vocoder).
Eventually anyone may be able to connect to a website, record 100 sentences or so and automatically get a synthetic version of their voice. R&D systems already exist, but for the moment they require more recordings to be able to produce commercially usable speech synthesis. Malfrére sees this technique as a way to quickly and cheaply add unique voices to all kinds of products as part of the brand identity, whether it is car GPS systems or voices that read the newspaper on your smartphone.
"Improving the quality of long pieces of text is the next challenge," says Malfrére. Building a text-to-speech synthesiser that could read a book (or this article) in a natural way is a task related to the computer understanding of meaning which means using elements of language-context analysis, text pattern recognition, sentiment- and humour-analysis.
Cloud computing would be one way to handle the complex processing, says Malfrére, allowing owners of smartphones, tablets and e-books to access reading services on-demand from a smart server.
Perhaps the population of ageing and increasingly infirm baby boomers who enjoyed James Bond gadgetry first time round will be equally appreciative of the modern successors.
Speech synthesis: Cut and paste
The smallest components of computer-synthesised speech are phonemes, which are in effect vowels and consonants. So 'Hello' can be split into four phonemes: /h/, /eh/, /l/ and /ow/. More of a voice's character can be captured using transitions between phonemes, known as diphones (there are over 1,400 diphones for English). 'Hello' is made up of five diphones: /silence:H/, /H:EH/, /EH:L/, /L:OW/ and /OW:silence/.
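The step from phonemes to diphones is mechanical enough to show in code. This is a minimal sketch; the function name and the 'silence' boundary label are illustrative assumptions, not taken from any particular synthesiser.

```python
def to_diphones(phonemes, boundary="silence"):
    """Pair each phoneme with its successor, padding both ends with silence."""
    padded = [boundary, *phonemes, boundary]
    return [f"{left}:{right}" for left, right in zip(padded, padded[1:])]


# The four phonemes of 'Hello' give the five diphones listed above.
hello = to_diphones(["h", "eh", "l", "ow"])
```

A word of n phonemes always yields n + 1 diphones, because the silences at either end each contribute one transition.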
Most modern text-to-speech synthesis programs use 'unit-selection': a program called a linguistic module first converts text into phoneme sequences (making use of lexicon or letter-to-sound rules). It then selects and strings together appropriate diphones that have previously been snipped out of real sentences generated in a recording studio by a specific speaker.
These multiple examples of possible diphones have linguistic context tags that allow the unit-selection algorithm to choose those that best match the context of the words in the text. For instance, tags will indicate the part of speech (noun, verb, etc) and also mark the diphone's original position (beginning or end) in both the syllable and the sentence it was snipped from. "If you have the same context but one diphone originates from a noun and the other from a verb, the algorithm will prefer the diphone that comes from a noun," says Fabrice Malfrére, CTO of Acapela, a text-to-speech company formed as a spin-off from Mons Polytechnic in Belgium.
A good unit-selection algorithm also takes account of 'acoustic cost': it tries to match adjacent diphones by length, pitch, and frequency spectrum. "For example, it will avoid putting a diphone with a rising pitch next to one with a falling pitch because that produces an impossible pitch pattern with no continuity," says Malfrére.
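The trade-off between context matching and acoustic cost can be sketched as a shortest-path search over the recorded candidates. The code below is an illustrative toy, not Acapela's algorithm: the `Unit` fields, the cost functions and their weights are all assumptions, and a real system would also compare length and frequency spectrum, not just pitch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    diphone: str
    part_of_speech: str  # context tag of the word the snippet was cut from
    pitch_start: float   # pitch in Hz at the start of the snippet
    pitch_end: float     # pitch in Hz at the end of the snippet


def target_cost(unit, wanted_pos):
    """Context cost: zero when the part-of-speech tag matches, else one."""
    return 0.0 if unit.part_of_speech == wanted_pos else 1.0


def join_cost(left, right):
    """Acoustic cost: pitch jump at the join, scaled so 100 Hz costs 1.0."""
    return abs(left.pitch_end - right.pitch_start) / 100.0


def select_units(targets, inventory):
    """Pick the cheapest chain of recorded units by dynamic programming.

    targets: list of (diphone, wanted_part_of_speech).
    inventory: dict mapping a diphone to its list of candidate Units.
    """
    if not targets:
        return []
    best = []  # best[i][j] = (cheapest total cost ending in candidate j, back-pointer)
    for i, (diphone, wanted_pos) in enumerate(targets):
        row = []
        for unit in inventory[diphone]:
            cost = target_cost(unit, wanted_pos)
            if i == 0:
                row.append((cost, None))
                continue
            previous = inventory[targets[i - 1][0]]
            path_cost, back = min(
                (best[i - 1][k][0] + join_cost(prev, unit), k)
                for k, prev in enumerate(previous)
            )
            row.append((path_cost + cost, back))
        best.append(row)
    # Trace the cheapest path backwards from the final position.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    chosen = []
    for i in range(len(targets) - 1, -1, -1):
        chosen.append(inventory[targets[i][0]][j])
        j = best[i][j][1]
    return chosen[::-1]
```

Searching over whole chains rather than choosing each diphone greedily matters because the best-matching snippet for one position may join badly with every candidate for the next.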
Prosody - the music of speech - is also important. Humans would say 'Charlie went to the shop to buy some coffee' with a slight break after 'shop' because the sentence consists of two smaller units. And we put pitch accents on words that carry the most information such as 'Charlie', 'shop', and 'coffee'. Unit-selection systems use techniques such as grammar tags and statistical rules to label each syllable with high or low pitch or a transition between high and low to achieve a similar effect.
Software: Grid 2 & JayBee
Sensory Software's The Grid 2 and Time Is Ltd's JayBee were both developed in the UK by British engineers. The Grid 2 provides text or speech output from libraries of symbols, pictures and words. "A child born with cerebral palsy, for instance, could start to talk and use the eye-gaze system to make requests and build sentences from early on using symbol libraries," explains Dougal Hawes, business development manager for Sensory Software and Smartbox AT.
For literate users, there are grids that include a full keyboard with word prediction, phrase prediction, instant message cells and also ready-made grids (some of which can be downloaded as apps) for common computer tasks including Internet browsing, Facebook, SMS text messaging, Twitter and so on.
Ian Schofield developed JayBee with the text-to-speech company CereProc after two friends succumbed to motor neurone disease (MND). JayBee uses a pattern-matching algorithm developed for the satellite industry to learn about the words and sentences a user commonly employs so it can flag them up as the user starts typing so they can speak in almost real-time. "We call our approach Predetermined Text as it adapts to the user's patterns of communication as it goes along," explains Schofield. "It means that the user almost never has to finish typing a word."
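The general idea of learning a user's vocabulary and offering completions as they type can be illustrated with a simple frequency-ranked prefix lookup. This is only a sketch of the concept: the class and method names are hypothetical, and it is not JayBee's pattern-matching algorithm, whose details are not described here.

```python
from collections import Counter


class PrefixPredictor:
    """Suggest previously used words that start with what has been typed so far."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, text):
        """Record every word of an utterance the user has produced."""
        self.counts.update(text.lower().split())

    def suggest(self, prefix, limit=3):
        """Return up to `limit` known words, most frequently used first."""
        prefix = prefix.lower()
        matches = [(word, n) for word, n in self.counts.items() if word.startswith(prefix)]
        # Sort by descending frequency, then alphabetically for stable output.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [word for word, _ in matches[:limit]]
```

Because the counts are updated with every utterance, the suggestions drift towards the words a particular user actually favours, which is the behaviour Schofield describes.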
JayBee has been successfully used with Alea Technologies' IntelliGaze eye-tracking system, and Schofield is working with an American company, Grinbath LLC, which has developed a low-cost eye-tracking system for around $500.