Abstract
This paper considers the processing required to globally maximize the intelligibility of speech that is acquired by multiple microphones and rendered by a single loudspeaker. Intelligibility is quantified by the mutual information rate between the message spoken by the talker and the message as interpreted by the listener. We prove that, under this measure, the processing in each of a set of narrow-band channels can be decomposed into a minimum variance distortionless response (MVDR) beamforming operation that reduces the noise in the talker environment, followed by a gain operation that, given the far-end noise and the beamforming operation, accounts for the noise at the listener end. Our experiments confirm that both processing steps are necessary for the effective conveyance of a message and, importantly, that the second step must be aware of the first step.
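As an illustration of the first processing step only, the following is a minimal sketch of the textbook MVDR weight computation for one narrow-band channel, w = R⁻¹d / (dᴴR⁻¹d). It is not the authors' implementation. The function names, the two-microphone steering vector `d`, and the noise covariance `R` are hypothetical values chosen for the example. The second-stage gain is not sketched because the abstract does not give its form.

```python
# Sketch of textbook MVDR beamforming for a single narrow-band channel.
# All names and numeric values are illustrative assumptions, not from the paper.

def solve(A, b):
    """Solve A x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    # Build the augmented matrix [A | b] with complex entries.
    M = [[complex(v) for v in A[i]] + [complex(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    x = [0j] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def mvdr_weights(R, d):
    """Return w = R^{-1} d / (d^H R^{-1} d) for noise covariance R and steering vector d."""
    r_inv_d = solve(R, d)
    denom = sum(complex(di).conjugate() * vi for di, vi in zip(d, r_inv_d))
    return [v / denom for v in r_inv_d]

def response(w, d):
    """Beamformer response w^H d toward the talker; equals 1 for MVDR (distortionless)."""
    return sum(wi.conjugate() * complex(di) for wi, di in zip(w, d))

def output_noise_power(w, R):
    """Residual noise power w^H R w at the beamformer output."""
    n = len(w)
    return sum(w[i].conjugate() * R[i][j] * w[j]
               for i in range(n) for j in range(n)).real

# Hypothetical two-microphone example for one frequency band.
d = [1 + 0j, 1j]               # assumed steering vector (talker to microphones)
R = [[2.0, 0.5], [0.5, 1.0]]   # assumed Hermitian, positive-definite noise covariance
w = mvdr_weights(R, d)
```

With these assumed values, `response(w, d)` equals 1 up to rounding, and `output_noise_power(w, R)` is below the noise power of either microphone alone (2.0 and 1.0). In a multi-band system the same computation would be repeated per band with that band's `R` and `d`.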
Original language | English |
---|---|
Title of host publication | 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Subtitle of host publication | Proceedings |
Editors | Min Dong, Thomas Fang Zheng |
Place of Publication | Danvers, MA |
Publisher | IEEE |
Pages | 654-658 |
Number of pages | 5 |
ISBN (Electronic) | 978-1-4799-9988-0 |
DOIs | |
Publication status | Published - 19 May 2016 |
Event | 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai International Convention Center, Shanghai, China. Duration: 20 Mar 2016 → 25 Mar 2016 |
Conference
Conference | 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 |
---|---|
Abbreviated title | ICASSP |
Country/Territory | China |
City | Shanghai |
Period | 20/03/16 → 25/03/16 |
Bibliographical note
Accepted Author Manuscript
Keywords
- speech intelligibility enhancement
- mutual information
- signal to noise ratio
- roduction
- multi-microphone
- speech enhancement
- minimum variance distortionless response (MVDR)
- beamformer