E&T looks at how software can help patients with speech difficulties.
Patricia Ruckoff was diagnosed with motor neurone disease (MND) on Christmas Eve, 2009. Her daughter Melissa found JayBee on the Internet and contacted Time Is Ltd. (TIL) to try to secure the speech-generation software it had developed. The local MND Association had a system available, originally purchased for a previous patient who had recently died of MND.
Patricia took delivery of the system in January 2010 and immediately tried it out, allowing JayBee to learn her communication traits. At that time, she was fully mobile, walking and climbing stairs, and her voice, although slightly slurred, was completely understandable.
Her home was soon broken into, and the system was stolen. After an insurance claim was filed, she received a replacement.
Staying in charge
Patricia has always been an active person. Her home is the centre of the family, always full of children and grandchildren. Since she was diagnosed with MND, her mobility and speech have deteriorated. She now relies on a motorised wheelchair and has completely lost her voice. However, with the chair and the software, she can still speak and interact with her family. She is, in short, still in charge.
I am a satellite control systems consultant and director of TIL, and my motivation to have the company write a computer system for MND sufferers was purely personal.
Jon and Bridget - the 'Jay' and 'Bee' of the product's title - were my good friend and my mother-in-law, the former succumbing to the condition in 2007, and the latter - in 1985.
Bridget had nothing to communicate with except paper and pencil. Jon had a machine, supposedly state-of-the-art, which could only produce a single word, question or response after lengthy and laborious typing. 'Food' or perhaps 'pillow' would be articulated in that well-known robotic voice which has become part of the persona of Professor Stephen Hawking - the best-known current sufferer of MND. The cost of the system was also prohibitive.
What is MND?
Motor neurone disease, also known as amyotrophic lateral sclerosis (ALS), or Lou Gehrig's disease, is a cruel and progressive neurodegenerative disease that attacks the upper and lower motor neurones. Degeneration of the motor neurones leads to weakness and wasting of muscles, causing loss of mobility in the limbs and difficulties with speech, swallowing and breathing.
Bridget's predicament back in the 1980s prompted me to begin creating a program based on an artificial intelligence system I had developed at the European Space Agency back in 1976. I wrote it on an Amiga computer.
Sadly, Bridget died before it was ready. The average life expectancy of an MND sufferer, once diagnosed, is not long. Finding Jon in the same situation, I was determined to develop something better, and at a fraction of the price of the leading suppliers.
Computers are faster now, but how could something be produced soon enough for a quickly deteriorating patient to be able to use it? The hunt was on for suitable existing technologies. The first, Microsoft Access, might be deemed somewhat curious, but has grown to provide an excellent basis for JayBee, being exceedingly fast both for development and in operation.
With the grunt-work of the processing taken care of by Access, the functionality was provided by an extensive Visual Basic for Applications (VBA) front end. TIL has invested over 2,500 hours of development in the current version.
The speech technology is by a company called CereProc, based in Edinburgh. CereProc specialises in producing natural-sounding voices, with real personality and emotion, and in creating voices with regional accents.
CereProc was recently in the news for its project to create a custom voice for Roger Ebert, the well-known US film critic. The company constructed his voice from archive DVD commentaries, recorded before he lost the power of speech.
CereProc can create voices that sound like the original speaker, because the basis for all is a real human voice. To produce a voice, they record a speaker reading a script, or, as in Roger Ebert's case, take archive recordings and have them transcribed. The speech is then broken down into its constituent sounds, or phonemes, and weightings applied to each phoneme. Once the voice has been built, the user can type in any text, and the best-matching phonemes will be selected and arranged to form the speech output. This process is also known as voice banking.
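The voice-banking process described above - recordings broken into phonemes, weightings applied to each, and the best-matching units selected at synthesis time - can be sketched in miniature. Everything below (the phoneme bank, the weightings and the pronunciation lexicon) is invented purely for illustration; real unit-selection synthesisers such as CereProc's use far richer acoustic features and join costs.

```python
# Toy sketch of unit selection from a "banked" voice. All data is invented
# for illustration; a real system works on acoustic features, not labels.

# Hypothetical phoneme bank: each phoneme maps to recorded candidate clips,
# each with a quality weighting assigned when the voice was built.
PHONEME_BANK = {
    "HH": [("hh_clip_01", 0.7), ("hh_clip_02", 0.9)],
    "EH": [("eh_clip_01", 0.8)],
    "L":  [("l_clip_01", 0.6), ("l_clip_02", 0.95)],
    "OW": [("ow_clip_01", 0.85)],
}

# Toy pronunciation lexicon (an assumption for this sketch).
LEXICON = {"hello": ["HH", "EH", "L", "OW"]}

def synthesise(text):
    """Pick the best-weighted recorded unit for each phoneme in the text."""
    units = []
    for word in text.lower().split():
        for phoneme in LEXICON.get(word, []):
            candidates = PHONEME_BANK[phoneme]
            best = max(candidates, key=lambda c: c[1])  # highest weighting
            units.append(best[0])
    return units  # in a real system these clips would be concatenated

print(synthesise("hello"))  # ['hh_clip_02', 'eh_clip_01', 'l_clip_02', 'ow_clip_01']
```

The point of the sketch is only the selection step: once a voice is banked, any typed text can be mapped to phonemes and rendered from the best available recorded units.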
Tongues and emotions
With CereProc voices it is possible to produce extremely expressive speech. In addition to recording the speaker's normal voice, the company also records the speaker's emotional registers. During synthesis, the CereProc voice is able to simulate happiness, calm, anger and sadness in the speech output. It is also possible to apply specific emphasis, speed or pitch by placing tags in the text around particular words or phrases.
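The tagged-text idea can be illustrated with a toy parser that splits tagged spans from plain text. The tag syntax here is invented for the sketch; CereProc's actual markup, and standards such as SSML, define their own element names.

```python
import re

# Hypothetical <style>...</style> tag syntax, invented for illustration only.
TAG_RE = re.compile(r"<(\w+)>(.*?)</\1>")

def parse_tagged(text):
    """Split tagged text into (style, span) pairs; untagged text is 'neutral'."""
    result, pos = [], 0
    for m in TAG_RE.finditer(text):
        if m.start() > pos:
            result.append(("neutral", text[pos:m.start()].strip()))
        result.append((m.group(1), m.group(2)))  # tag name, tagged span
        pos = m.end()
    if pos < len(text):
        result.append(("neutral", text[pos:].strip()))
    return [(style, span) for style, span in result if span]

print(parse_tagged("I am <happy>so pleased</happy> to see you"))
```

A synthesiser front end would then render each span with the prosody its tag requests, which is what gives tagged text its expressive power.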
CereProc provides its own API for software developers, or else the voices can be accessed through the standard Microsoft SAPI5 interface, as is the case with JayBee.
The quality of the voice is of paramount importance to patients, who no longer have the power of speech. It is important that what is produced is lifelike. After all, the synthesised robotic voice sounds unnatural and can be off-putting.
Although JayBee comes with two CereProc voices, 'William' and 'Sarah', any SAPI5-compliant voice can be used instantly. TIL has already provided Turkish and German versions. French is currently in production and Spanish is planned to follow.
JayBee is built to allow those who have lost the power of speech and have severe mobility difficulties to communicate fully and very quickly. MND is degenerative, with mobility and dexterity deteriorating rapidly. To counter this, JayBee has a variety of interface methods which allow it to be used despite advanced stages of deterioration as long as a single movement of any part of the body is still possible.
How JayBee works
The innovation that distinguishes JayBee is the prediction engine, which adapts constantly to the user's current usage of language. It is this adaptive prediction that forms the JayBee artificial intelligence. Text prediction is well known in mobile phones, but JayBee takes it much further. It predicts shortcuts, alphabetically sorted words, words sorted by likelihood of use at the moment, complete phrases and even complete essays. On pressing the letter 'H', for example, the screen is immediately populated with matching entries from all of these categories.
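A minimal sketch of such an adaptive predictor - assuming nothing about TIL's actual algorithm - is a prefix lookup ranked by per-user selection counts, so that the entries a user actually chooses rise to the top over time:

```python
from collections import defaultdict

class Predictor:
    """Minimal adaptive prefix predictor (a sketch, not JayBee's engine)."""

    def __init__(self, vocabulary):
        self.counts = defaultdict(int)   # usage counts; adapt with every use
        self.vocabulary = set(vocabulary)

    def predict(self, prefix, top=4):
        matches = [w for w in self.vocabulary if w.startswith(prefix)]
        # Most-used entries first, then alphabetical as a tie-break.
        matches.sort(key=lambda w: (-self.counts[w], w))
        return matches[:top]

    def select(self, word):
        """Record a selection so future predictions adapt to this user."""
        self.vocabulary.add(word)
        self.counts[word] += 1

p = Predictor(["hospital", "house", "happy", "hello", "help"])
p.select("hospital")
p.select("hospital")
p.select("help")
print(p.predict("h"))  # 'hospital' and 'help' now rank above the rest
```

Even this toy version shows the key property: the ranking is not fixed but learned from the individual, which is what makes a handful of keystrokes go so far.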
With just a few (typically three or four) keystrokes, mouse clicks, finger touches, switch clicks or even head movements, the user can access and speak out complete sentences, or even lengthy statements, pre-prepared and stored in JayBee's comprehensive library.
JayBee's current release has an initial vocabulary of 17,000 words, 1,500 phrases and 500 shortcuts. The user can add to any of these simply by using JayBee, and the number is limited only by the disk size of the computer.
One crucial factor in usability is the speed with which JayBee carries out its prediction processes, and that is one of the system's strengths. The rapidity of response is all the more difficult to achieve given the adaptive learning the system is performing. It quickly determines the most likely outcome of a user's input, and the more an individual uses the system, the better the adaptation becomes.
JayBee has numerous interface methods for the prediction engine. The simplest input method is a standard QWERTY keyboard, optimised for limitations familiar to MND/ALS sufferers. To counteract the inability to stretch the fingers out, JayBee employs forward slash (/) and back slash (\) keys at the bottom of the keyboard to perform functions needed by the user. The top row of function keys, often used by other suppliers, is virtually useless to MND/ALS patients.
The next input method available to JayBee users is a virtual on-screen keyboard. Any pointing device, such as mouse, rollerball or touch screen, can be used. One advanced way is Camera Mouse developed at Boston University. It uses the webcam to track any feature of the user's face and allows anyone with head movement to use any of JayBee's virtual keyboards.
There are four full on-screen keyboards built into JayBee. The first is the normal QWERTY layout, with only the keys needed by JayBee, making for a reduced key set. The second is alphabetic, the third - Dvorak, and the fourth - custom.
A fifth virtual keyboard is the predictive one, and its keys mimic a mobile phone layout. The difference here, however, is in the effect of using it. Because of the prediction engine, even after a single keystroke, JayBee will populate the four panels and locate suitable entries in the library.
Suppose, based on a patient's previous usage, JayBee has predicted the most likely words, namely 'I', 'hospital' and 'go'; these appear in the 'likely' panel. Notice, however, that JayBee has also predicted a phrase in the 'phrases' panel: 'I must go to the hospital tomorrow'. So, with just two clicks, the 'ghi' button and then the phrase itself, the user can communicate a complete sentence.
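The interaction just described can be modelled in a few lines: a single phone-style key press narrows both the word list and the phrase library to entries beginning with any of that key's letters. The word and phrase data below are illustrative only, not JayBee's actual library.

```python
# Sketch of the phone-style predictive keyboard. One key press filters the
# library; the data here is invented for illustration.
KEYS = {"4": "ghi"}  # standard phone mapping for the 'ghi' key

WORDS = ["I", "hospital", "go", "dinner", "thanks"]
PHRASES = ["I must go to the hospital tomorrow", "Dinner is ready"]

def press(key, entries):
    """Return the entries starting with any letter on the pressed key."""
    letters = tuple(KEYS[key])
    return [e for e in entries if e.lower().startswith(letters)]

print(press("4", WORDS))    # ['I', 'hospital', 'go']
print(press("4", PHRASES))  # ['I must go to the hospital tomorrow']
```

One press has already isolated the wanted phrase, so the second click can speak a full sentence.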
This virtual keyboard is particularly handy with Camera Mouse, where the user needs only move his or her head to control it. It can also be accessed using a standard numeric keypad attached to the computer.
One of JayBee's most advanced features is switch operation. This is designed for use by people with severe disabilities who can move one or two parts of their bodies. There are a multitude of switches which cater for almost any single movement, including eye blinks.
JayBee allows for single-switch (in which an automatic scan of the necessary keys is performed), or two-switch (where one switch moves the focus from key to key and the other performs the selection).
The user waits while JayBee scans across the top 10 buttons and, when the 'ghi' button is in focus, clicks. JayBee then generates a new vertical menu and begins to scan down it. The phrase needed is shown in the 'phrases' panel, so the user waits until the 'phrases' option is selected in the vertical scan and clicks again. One more click when the needed phrase is highlighted, and the phrase is spoken. The whole operation can be performed in two to three seconds, depending on the scanning speed the user has selected.
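Single-switch scanning reduces, in essence, to "focus advances automatically; a click selects whatever is in focus". The sketch below simulates that with discrete scan steps standing in for the timed interval the user would configure; the button labels are illustrative.

```python
# Simulation of single-switch scanning: focus advances automatically through
# the options, and a single click selects whatever is currently in focus.
def scan_select(options, clicks_at_step):
    """Return the option in focus when the user clicks.

    clicks_at_step is the number of scan steps the user waits before
    clicking; focus wraps around the option list.
    """
    return options[clicks_at_step % len(options)]

top_row = ["abc", "def", "ghi", "jkl", "mno"]
print(scan_select(top_row, 2))  # user waits two steps, then clicks on 'ghi'
```

The real system layers three such scans (top row, vertical menu, panel entry), which is why a whole phrase costs only three well-timed clicks.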
Single phrases aside, the system is able to cope with text of any length, prepared in advance of, say, seeing a doctor. Such statements can be prepared from within JayBee or cut and pasted from any text source, for example a Word document or the Internet. This text is then stored as an 'essay' in JayBee's library. Every entry in the library is subject to the prediction engine. JayBee even allows library entries to be emailed using the email client on the computer.
Via its many options, JayBee allows customisation. Fonts, colours, keyboard layouts and active features can all be adjusted to suit the user. There is also a complete administration capability for all data stored in the system.
JayBee has active on-screen help, context-sensitive help, a Flash demonstration, an interactive tutorial and a complete handbook.
Because of the tools used to create JayBee, tailored solutions can currently be prepared in around two weeks - the Turkish modification, for example, was produced in one week.
JayBee brings full, expressive and audible communication to many who would otherwise face very limited possibilities. It is not restricted to MND sufferers: a version is currently in preparation for a wheelchair-mounted system for a patient with cerebral palsy who has movement only of the head. Its aim is to provide autonomy to those who use it, helping them communicate and supporting independent decision-making.
If you are interested in helping or would like to try JayBee, visit the website at www.jaybee.org.uk, where you can contact Ian Schofield, try the demonstration or download a free 30-day trial. To contact CereProc, go to www.cereproc.com.