World Half Full

AI gives a paralysed woman her voice back

SCIENCE



Researchers at two Californian universities have developed a brain-computer interface (BCI) that has enabled a woman with severe paralysis from a brainstem stroke to speak through a digital avatar.


It’s the first time either speech or facial expressions have been synthesised from brain signals. The BCI can also decode these signals into text at nearly 80 words a minute, a vast improvement over commercially available technology.


Edward Chang, MD, chair of neurological surgery at the University of California San Francisco (UCSF), who has worked on BCI for more than a decade, hopes this latest breakthrough, published in Nature, will enable speech from brain signals in the near future.


“Our goal is to restore a full, embodied way of communicating, which is really the most natural way for us to talk with others,” says Chang, a member of the UCSF Weill Institute for Neuroscience and the Jeanne Robertson Distinguished Professor in Psychiatry. “These advancements bring us much closer to making this a real solution for patients.”


Chang’s team previously demonstrated it was possible to decode brain signals into text in a man who’d also experienced a brainstem stroke many years earlier. The current technology demonstrates something more ambitious: decoding brain signals into the richness of speech, along with movements that animate a person’s face during conversation.


Chang implanted a paper-thin rectangle of electrodes onto the surface of the woman’s brain over areas his team had discovered are critical for speech. The electrodes intercepted the brain signals that — if not for the stroke — would have gone to muscles in her tongue, jaw and larynx, as well as her face. A cable plugged into a port fixed to her head connected the electrodes to a bank of computers.


For weeks, the participant worked with the team to train the system’s artificial intelligence (AI) algorithms to recognise her unique brain signals for speech. This involved repeating different phrases from a 1,024-word conversational vocabulary over and over again, until the computer recognised the brain activity patterns associated with the sounds.


Rather than train the AI to recognise whole words, the researchers created a system that decodes words from phonemes. These are the sub-units of speech that form spoken words in the same way letters form written words. “Hello”, for example, contains four phonemes: HH, AH, L and OW.


Using this approach, the computer only needed to learn 39 phonemes to decipher any word in English. This both improved the system’s accuracy and made it three times faster.
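To make the phoneme idea concrete, here is a minimal sketch in Python of how a stream of phonemes can be turned back into words. It is not the researchers' decoder, which the article does not describe in detail. The tiny `LEXICON` and the `decode_words` function are hypothetical, made up for illustration, and the sketch assumes the phonemes have already been recognised correctly.

```python
from typing import Optional

# Toy pronunciation lexicon (an assumption for illustration only).
# Keys are phoneme sequences, values are the words they spell out.
LEXICON = {
    ("HH", "AH", "L", "OW"): "hello",
    ("HH", "AW"): "how",
    ("AA", "R"): "are",
    ("Y", "UW"): "you",
}


def decode_words(phonemes: list[str]) -> Optional[list[str]]:
    """Split a phoneme stream into lexicon words; return None if no split exists."""
    n = len(phonemes)
    # best[i] holds one word sequence covering phonemes[:i], or None if unreachable.
    best: list[Optional[list[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] is None:
                continue
            word = LEXICON.get(tuple(phonemes[start:end]))
            if word is not None:
                best[end] = best[start] + [word]
                break
    return best[n]


stream = ["HH", "AH", "L", "OW", "HH", "AW", "AA", "R", "Y", "UW"]
print(decode_words(stream))
```

Because every word is spelled from the same small set of phonemes, adding a word to the vocabulary only means adding a lexicon entry, not learning a new pattern of brain activity from scratch.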


“The accuracy, speed and vocabulary are crucial,” says Sean Metzger, who developed the text decoder with Alex Silva, both graduate students in the joint Bioengineering Program at University of California Berkeley (UCB) and UCSF. “It’s what gives a user the potential, in time, to communicate almost as fast as we do, and to have much more naturalistic and normal conversations.”


To create the voice, the team devised an algorithm for synthesising speech, which they personalised to sound like her voice before the injury, using a recording of her speaking at her wedding. They then animated the avatar using software that simulates and animates the face’s muscle movements, developed by Speech Graphics, a company that makes AI-driven facial animation. The researchers created customised machine-learning processes that allowed the company’s software to mesh with the signals sent from the woman’s brain as she tried to speak, and to convert them into movements on the avatar’s face. These included the jaw opening and closing, the lips protruding and pursing, and the tongue moving up and down, as well as the facial movements for happiness, sadness and surprise.


“We’re making up for the connections between the brain and vocal tract that have been severed by the stroke,” says Kaylo Littlejohn, a graduate student working with Chang and with Gopala Anumanchipalli, PhD, a professor of electrical engineering and computer sciences at UCB. “When the subject first used this system to speak and move the avatar’s face in tandem, I knew this was going to be something that would have a real impact.”


An important next step for the team is to create a wireless version that won’t require the user to be physically connected to the BCI.


“Giving people the ability to freely control their own computers and phones with this technology would have profound effects on their independence and social interactions,” says co-first author David Moses, PhD, an adjunct professor in neurological surgery.



