New visually-guided hearing aid uses eye gaze to focus amplification of sound signals from the direction of the user’s attention.
Reviewed by Dr Gerald Kidd.
The concept of a visually-guided hearing aid (VGHA) is intriguing and encompasses a number of different disciplines that are not for the faint of heart, considering the complexity of the technologies being brought together.
The VGHA is not a new idea. It evolved out of work on spatial hearing in the Psychoacoustics Laboratory at Boston University, Massachusetts, United States, and currently exists as a research prototype that incorporates acoustic beamforming.
Beamforming combines the signals from several microphones to focus amplification on sound arriving from a specific direction. In the VGHA, the beam is steered by eye gaze so that users can focus on one sound despite the presence of nearby competing sounds.
Dr Gerald Kidd Jr, a professor in the department of Speech, Language and Hearing Sciences and director of the Psychoacoustics Laboratory at Boston University, explained that one of the early concepts motivating this work was the finding that individuals can listen very selectively in space. “Along the dimension left to right—i.e., azimuth—they can focus their attention on a particular point of interest and attenuate sources of sound that are off the axis of the focus of attention,” he said.
The evidence for this spatial tuning effect with natural hearing was first described in 2000.1 Investigators then incorporated the effect into studies on hearing loss and improving hearing aids.
“In considering how we could improve individuals’ hearing in situations in which there are multiple spatially distributed sources of sound, which has been a perennial problem for people who wear hearing aids, we worked on devising an algorithm to replicate normal spatial hearing for those who cannot hear well; that is, by separating the sources in azimuth,” Dr Kidd said.
Dr Kidd and his colleagues worked in collaboration with investigators from Sensimetrics of Malden, Massachusetts, US. They designed a way to work with beamforming as a method to improve hearing aids by enhancing the signal-to-noise ratio for sounds that are immediately in front of the beamformer. This research extended earlier work at the Research Laboratory of Electronics at the Massachusetts Institute of Technology (MIT) in Cambridge, Massachusetts, US.2
One option was to mount microphones on a spectacle frame. Working with Dr Joseph Desloge, professor, Department of Speech, Language and Hearing Sciences and director of the Psychoacoustics Laboratory, Boston University, the two groups identified a problem with a beamformer mounted on a spectacle frame: namely, the inability to redirect it from one sound source to another without moving the head.
For example, it could not easily be used to follow conversations in a group. The listener’s head would have to move each time a different individual spoke, to point the beamformer in the direction of the speaker.
Another method of steering the beamformer was also considered: using a hand dial or phone controls to move the beam in a desired direction. However, the most natural way of steering the beam was in plain sight: moving it with eye gaze. The idea of linking auditory and visual attention was born.
The eyes and the beam move in tandem from left to right with the change in the source of the sound. This was considered to be a powerful way to improve amplification for individuals with hearing loss, Dr Kidd explained.
Dr Kidd and Dr Desloge settled on a system that used the signals from a commercially available eye tracker combined with a custom-made microphone array comprising four rows of four microphones, flush-mounted on a flexible band that can be positioned across the top of the user’s head.3
Dr Kidd explained that the outputs from the microphones (the acoustic component) are combined using an audio beamforming algorithm devised by the MIT group, among others. “This configuration optimises the response of the microphone array to the direction chosen by the user,” he said.
As Dr Kidd went on to explain, the signals from the microphone array are processed so as to be maximally responsive to an azimuth that is determined by gaze, as detected by the eye tracker. The eye tracker must have both a world-view camera that points outwards and a camera that points inwards to track pupil location. Used together, the two cameras calibrate where the eyes are positioned relative to the scene recorded by the outward-facing camera.
Associated software also contains a previously measured set of head-related impulse responses. “These responses provide the values across frequency for the algorithm that determines the optimal phase response, or time delay, and amplitude used to weight the response of each microphone to optimise the responsiveness to a particular azimuth,” Dr Kidd stated.
The system was fitted on a mannequin, and a set of impulse responses was recorded from a number of different locations in the front hemifield. The resolution obtained was good.
When placed on a user, the eye tracker senses the angle at which the eyes are positioned, selects the head-related transfer function that corresponds to that azimuth and convolves that with the approaching stimulus. “This provides a very highly directional response … as if the sound were coming from that location,” Dr Kidd explained.
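The gaze-steering step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the laboratory's software: the filter table `TOY_FILTERS`, the two-microphone setup and all function names are hypothetical, and the two-tap filters merely stand in for measured head-related impulse responses.

```python
def nearest_azimuth(gaze_deg, measured_azimuths):
    """Pick the measured azimuth closest to the gaze angle from the eye tracker."""
    return min(measured_azimuths, key=lambda az: abs(az - gaze_deg))


def convolve(signal, impulse_response):
    """Direct-form convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def filter_and_sum(mic_signals, filters):
    """Filter each microphone signal with its own filter, then add the results."""
    filtered = [convolve(sig, h) for sig, h in zip(mic_signals, filters)]
    out = [0.0] * max(len(f) for f in filtered)
    for f in filtered:
        for n, value in enumerate(f):
            out[n] += value
    return out


def steer_by_gaze(mic_signals, filters_by_azimuth, gaze_deg):
    """Select the filter set for the gaze direction and apply it."""
    azimuth = nearest_azimuth(gaze_deg, filters_by_azimuth)
    return azimuth, filter_and_sum(mic_signals, filters_by_azimuth[azimuth])


# Hypothetical filter table: one filter per microphone for each azimuth.
# Microphone 0 is on the left, so a source on the left reaches it one sample
# early and the -30 degree entry delays microphone 0 to realign the signals.
TOY_FILTERS = {
    -30: [[0.0, 0.5], [0.5, 0.0]],
    0: [[0.5, 0.0], [0.5, 0.0]],
    30: [[0.5, 0.0], [0.0, 0.5]],
}

# A click from the left: it reaches microphone 0 one sample before microphone 1.
click_from_left = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
azimuth, output = steer_by_gaze(click_from_left, TOY_FILTERS, gaze_deg=-25.0)
```

With the gaze near the source, the two filtered signals line up and add to a single full-height click. Looking straight ahead instead leaves them misaligned, so the same click comes out smeared across two samples at half the height.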
The spatial filtering of the VGHA is sharpest at the higher frequencies (short wavelengths) and broadest at the lower frequencies (long wavelengths), so the beam is most narrowly tuned for high-frequency sound.
Dr Kidd reported that sounds falling outside the focus of the beam are sharply attenuated at high frequencies and less attenuated at low frequencies, depending on the distance from the focus. “This typically alters the quality of the sounds that are off axis a bit,” he said. “However, the beamformer can provide a great improvement in the signal-to-noise ratio for nearby sound sources and in rooms with good acoustics.”
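The frequency dependence of the beam can be illustrated with a simple calculation. The sketch below assumes an idealised free-field line of four microphones spaced 5 cm apart with the beam pointed straight ahead. That geometry, the constant `MIC_POSITIONS_M` and the function name are assumptions for illustration only and do not describe the head-worn VGHA array.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature

# Hypothetical line array: four microphones, 5 cm apart, centred on the head.
MIC_POSITIONS_M = [-0.075, -0.025, 0.025, 0.075]


def off_axis_attenuation_db(mic_positions_m, offset_deg, freq_hz):
    """Attenuation (dB) of a source offset_deg away from a beam aimed straight ahead.

    The microphone outputs are summed with equal weights, which steers the
    beam to 0 degrees azimuth for a distant (plane-wave) source.
    """
    wavenumber = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    direction = math.sin(math.radians(offset_deg))
    total = 0j
    for x in mic_positions_m:
        # Phase of the arriving wave at this microphone relative to the centre
        total += cmath.exp(1j * wavenumber * x * direction)
    gain = abs(total) / len(mic_positions_m)
    return -20.0 * math.log10(max(gain, 1e-12))


low = off_axis_attenuation_db(MIC_POSITIONS_M, offset_deg=30.0, freq_hz=500.0)
high = off_axis_attenuation_db(MIC_POSITIONS_M, offset_deg=30.0, freq_hz=4000.0)
print(f"30 degrees off axis: {low:.1f} dB at 500 Hz, {high:.1f} dB at 4 kHz")
```

For this toy array, a source 30 degrees off axis is attenuated by less than 1 dB at 500 Hz but by more than 10 dB at 4 kHz, which is the pattern Dr Kidd describes: off-axis sounds lose their high frequencies first.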
This technology seems to be most effective when used in a small conference room with individuals sitting around a table, or in a “cocktail party” scenario in which selective hearing can be impaired when multiple individuals are speaking simultaneously. In larger settings, such as at a concert or a play, the technology would not be beneficial for selecting one speaker from among others at a greater distance. “The basic physics of this technology is that it will be more beneficial for nearby sound sources,” Dr Kidd stated.
A recent advance in Dr Kidd’s research has been the development of a triple beamformer that should provide enhanced hearing for individuals with cochlear implants.4 By way of comparison, Dr Kidd noted that the original beamformer had a single-channel output that comprised one spatial filter used to provide sound to one or both ears, but without any difference in the sound that went to the two ears.
“With single channel output, [although] hearing is enhanced, our natural binaural hearing that provides a big advantage in locating sounds is lost,” he said. The investigators are working on multiple beams to focus on the primary sound source of interest. A second beam that is pointed to the right funnels sound only to the right ear, and a third beam pointed to the left funnels sound only to the left ear.
“This approach restores some of the normal binaural hearing in addition to the benefit of the original beamformer,” Dr Kidd explained. This restoration, the major feature of the triple beamformer, combines the improved signal-to-noise ratio of the single-channel beam with better spatial hearing, through an improved ability to locate sound sources outside the beam.5
In patients with bilateral cochlear implants, the devices work mostly independently of each other, which results in loss of a great deal of the normal binaural spatial hearing. “The triple beamformer enhances the interaural differences that lead to spatial location and source segregation,” Dr Kidd stated. He believes that this research approach is promising for this patient population.
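The routing of the three beams to the two ears can be sketched as follows. This is a simplified illustration and not the published algorithm: the function name, the `side_gain` parameter and the assumption that the centre beam is simply added to each side beam are all hypothetical.

```python
def triple_beam_mix(center, left_beam, right_beam, side_gain=0.5):
    """Route three beam outputs to two ears.

    The centre beam goes to both ears; the left-pointing beam goes only to
    the left ear and the right-pointing beam only to the right ear.
    side_gain is a hypothetical mixing weight for the side beams.
    """
    left_ear = [c + side_gain * l for c, l in zip(center, left_beam)]
    right_ear = [c + side_gain * r for c, r in zip(center, right_beam)]
    return left_ear, right_ear


# A target in the centre beam plus a second talker picked up only by the
# right-pointing beam.
center = [1.0, 1.0]
left_beam = [0.0, 0.0]
right_beam = [0.5, 0.5]
left_ear, right_ear = triple_beam_mix(center, left_beam, right_beam)
```

Here the second talker raises the level in the right ear only, so the two ears receive different signals. The original single-channel output cannot provide that kind of difference between the ears.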
The VGHA research is currently confined to the laboratory, where computers perform the signal processing. For the technology to become commercially available, a portable system would have to be miniaturised and cosmetically acceptable enough to be worn on the head.
Dr Kidd said he anticipates that the technology may be first used in the cochlear implant community. “I hope that this research will lead to building a better hearing aid for solving situations such as the problem at the classical cocktail party,” he concluded. “In addition, the technology is not intended only for people with hearing loss. We believe that even people with normal hearing might benefit from this technology and it could have wide application.”