Adaptive time segmentation for improved speech enhancement

RC Hendriks; R Heusdens; J Jensen

Multimedia Computing

Research output: Contribution to journal › Article › Scientific › peer-review

20 Citations (Scopus)

Abstract

Single-channel enhancement algorithms are widely used to overcome the degradation of noisy speech signals. Speech enhancement gain functions are typically computed from two quantities, namely, an estimate of the noise power spectrum and of the noisy speech power spectrum. The variance of these power spectral estimates degrades the quality of the enhanced signal and smoothing techniques are, therefore, often used to decrease the variance. In this paper, we present a method to determine the noisy speech power spectrum based on an adaptive time segmentation. More specifically, the proposed algorithm determines for each noisy frame which of the surrounding frames should contribute to the corresponding noisy power spectral estimate. Further, we demonstrate the potential of our adaptive segmentation in both maximum likelihood and decision direction-based speech enhancement methods by making a better estimate of the a priori signal-to-noise ratio (SNR) $\xi$. Objective and subjective experiments show that an adaptive time segmentation leads to significant performance improvements in comparison to the conventionally used fixed segmentations, particularly in transitional regions, where we observe local SNR improvements in the order of 5 dB.
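As a rough illustration of the quantities the abstract mentions, the sketch below averages noisy periodograms over a chosen set of frames and derives a maximum-likelihood-style a priori SNR estimate and a Wiener-type gain from it. It is a minimal sketch under stated assumptions, not the authors' algorithm: the abstract does not say how the contributing frames are selected, so the segment is simply passed in, and the function names, the `segment` argument, and the choice of a Wiener gain are all hypothetical.

```python
import numpy as np


def ml_a_priori_snr(noisy_power, noise_power, segment):
    """Estimate the a priori SNR per frequency bin from a segment of frames.

    noisy_power: array of shape (frames, bins) holding periodograms |Y|^2.
    noise_power: array of shape (bins,) holding the noise power spectrum estimate.
    segment: (start, stop) frame indices, stop exclusive. Which frames belong
             to the segment is ASSUMED to be given; the selection rule is not
             described in the abstract.
    """
    start, stop = segment
    # Averaging periodograms over the segment reduces the estimator variance.
    smoothed = noisy_power[start:stop].mean(axis=0)
    # Subtract the noise contribution and clamp negative values to zero.
    return np.maximum(smoothed / noise_power - 1.0, 0.0)


def wiener_gain(xi):
    """Wiener-type gain for a given a priori SNR (an illustrative choice)."""
    return xi / (1.0 + xi)


# Toy data: three frames, two frequency bins.
noisy_power = np.array([[2.0, 8.0],
                        [4.0, 8.0],
                        [100.0, 0.5]])
noise_power = np.array([1.0, 2.0])

xi_short = ml_a_priori_snr(noisy_power, noise_power, (0, 2))
gain_short = wiener_gain(xi_short)
xi_last = ml_a_priori_snr(noisy_power, noise_power, (2, 3))
```

In this toy example the third frame differs sharply from the first two, so a segment limited to frames 0 and 1 avoids smearing the change into the estimate. That is the kind of transitional region where the abstract reports the largest improvements.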

Original language	Undefined/Unknown
Pages (from-to)	2064-2074
Number of pages	11
Journal	IEEE Transactions on Audio, Speech and Language Processing
Volume	14
Issue number	6
Publication status	Published - 2006

Bibliographical note

I believe this journal was previously called IEEE.... on speech and audio proc....., and its weighting factor was 0.96, so I have kept that value because this journal is listed as "unknown"

Keywords

academic journal papers
CWTS JFIS < 0.75

Cite this

@article{724108a829f347ab983b15faeac3785b,

title = "Adaptive time segmentation for improved speech enhancement",

abstract = "Single-channel enhancement algorithms are widely used to overcome the degradation of noisy speech signals. Speech enhancement gain functions are typically computed from two quantities, namely, an estimate of the noise power spectrum and of the noisy speech power spectrum. The variance of these power spectral estimates degrades the quality of the enhanced signal and smoothing techniques are, therefore, often used to decrease the variance. In this paper, we present a method to determine the noisy speech power spectrum based on an adaptive time segmentation. More specifically, the proposed algorithm determines for each noisy frame which of the surrounding frames should contribute to the corresponding noisy power spectral estimate. Further, we demonstrate the potential of our adaptive segmentation in both maximum likelihood and decision direction-based speech enhancement methods by making a better estimate of the a priori signal-to-noise ratio (SNR) $\xi$. Objective and subjective experiments show that an adaptive time segmentation leads to significant performance improvements in comparison to the conventionally used fixed segmentations, particularly in transitional regions, where we observe local SNR improvements in the order of 5 dB.",

keywords = "academic journal papers, CWTS JFIS < 0.75",

author = "RC Hendriks and R Heusdens and J Jensen",

note = "I believe this journal was previously called IEEE.... on speech and audio proc....., and its weighting factor was 0.96, so I have kept that value because this journal is listed as {"}unknown{"}",

year = "2006",

language = "Undefined/Unknown",

volume = "14",

pages = "2064--2074",

journal = "IEEE Transactions on Audio, Speech and Language Processing",

issn = "1558-7916",

publisher = "Institute of Electrical and Electronics Engineers (IEEE)",

number = "6",

}

TY - JOUR

T1 - Adaptive time segmentation for improved speech enhancement

AU - Hendriks, RC

AU - Heusdens, R

AU - Jensen, J

N1 - I believe this journal was previously called IEEE.... on speech and audio proc....., and its weighting factor was 0.96, so I have kept that value because this journal is listed as "unknown"

PY - 2006

Y1 - 2006

N2 - Single-channel enhancement algorithms are widely used to overcome the degradation of noisy speech signals. Speech enhancement gain functions are typically computed from two quantities, namely, an estimate of the noise power spectrum and of the noisy speech power spectrum. The variance of these power spectral estimates degrades the quality of the enhanced signal and smoothing techniques are, therefore, often used to decrease the variance. In this paper, we present a method to determine the noisy speech power spectrum based on an adaptive time segmentation. More specifically, the proposed algorithm determines for each noisy frame which of the surrounding frames should contribute to the corresponding noisy power spectral estimate. Further, we demonstrate the potential of our adaptive segmentation in both maximum likelihood and decision direction-based speech enhancement methods by making a better estimate of the a priori signal-to-noise ratio (SNR) $\xi$. Objective and subjective experiments show that an adaptive time segmentation leads to significant performance improvements in comparison to the conventionally used fixed segmentations, particularly in transitional regions, where we observe local SNR improvements in the order of 5 dB.

AB - Single-channel enhancement algorithms are widely used to overcome the degradation of noisy speech signals. Speech enhancement gain functions are typically computed from two quantities, namely, an estimate of the noise power spectrum and of the noisy speech power spectrum. The variance of these power spectral estimates degrades the quality of the enhanced signal and smoothing techniques are, therefore, often used to decrease the variance. In this paper, we present a method to determine the noisy speech power spectrum based on an adaptive time segmentation. More specifically, the proposed algorithm determines for each noisy frame which of the surrounding frames should contribute to the corresponding noisy power spectral estimate. Further, we demonstrate the potential of our adaptive segmentation in both maximum likelihood and decision direction-based speech enhancement methods by making a better estimate of the a priori signal-to-noise ratio (SNR) $\xi$. Objective and subjective experiments show that an adaptive time segmentation leads to significant performance improvements in comparison to the conventionally used fixed segmentations, particularly in transitional regions, where we observe local SNR improvements in the order of 5 dB.

KW - academic journal papers

KW - CWTS JFIS < 0.75

UR - http://ieeexplore.ieee.org/iel5/10376/36074/01709895.pdf?isnumber=36074&prod=JNL&arnumber=1709895&arSt=+2064&ared=+2074&arAuthor=+Hendriks%2C+R.C.%3B++Heusdens%2C+R.%3B++Jensen%2C+J.

M3 - Article

SN - 1558-7916

VL - 14

SP - 2064

EP - 2074

JO - IEEE Transactions on Audio, Speech and Language Processing

JF - IEEE Transactions on Audio, Speech and Language Processing

IS - 6

ER -
