Catalog data as of August 12, 2022
ISBN 978-3-8439-0363-9, Informationstechnik series
Codebook-Based Speech Enhancement – Robust and Efficient Approaches
181 pages, dissertation, Carl von Ossietzky Universität Oldenburg (2012), softcover, B5
This thesis presents a codebook-based speech enhancement approach. The approach builds on previous work on this topic and focuses on robustness and efficiency.
Codebook-based speech enhancement algorithms have the potential to accurately estimate, and therefore effectively reduce, not only stationary but also (highly) non-stationary noise. This can be achieved by incorporating a priori knowledge about speech and different noise classes in the form of trained codebooks, which contain typical spectral envelopes of the respective signals. However, these methods fail to estimate the noise if the noise type is unknown or if the model is inaccurate. Moreover, they suffer from very high computational complexity.
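The basic idea of a codebook search can be illustrated with a minimal sketch. It is not the method of the thesis. It assumes that envelopes are stored as plain lists of power-spectral values with at least one non-zero entry, that two non-negative broadband gains are fitted by least squares, and that the best codebook pair is the one with the smallest squared error (a simplification; other distortion measures are possible). The names `fit_error`, `fit_gains`, and `codebook_search` are hypothetical.

```python
from itertools import product


def fit_error(y, s, n, g_s, g_n):
    """Squared error between y and the model g_s*s + g_n*n."""
    return sum((c - g_s * a - g_n * b) ** 2 for c, a, b in zip(y, s, n))


def fit_gains(y, s, n):
    """Non-negative least-squares gains (g_s, g_n) with y ~ g_s*s + g_n*n."""
    ss = sum(a * a for a in s)
    nn = sum(b * b for b in n)
    sn = sum(a * b for a, b in zip(s, n))
    ys = sum(c * a for c, a in zip(y, s))
    yn = sum(c * b for c, b in zip(y, n))
    det = ss * nn - sn * sn
    if det > 1e-12:
        # Unconstrained 2x2 normal equations.
        g_s = (ys * nn - yn * sn) / det
        g_n = (yn * ss - ys * sn) / det
        if g_s >= 0 and g_n >= 0:
            return g_s, g_n
    # Otherwise the optimum lies on the boundary: one gain is zero.
    only_s = (max(ys / ss, 0.0), 0.0)
    only_n = (0.0, max(yn / nn, 0.0))
    return min((only_s, only_n), key=lambda g: fit_error(y, s, n, *g))


def codebook_search(y, speech_cb, noise_cb):
    """Try every speech/noise envelope pair and keep the best fit."""
    best = None
    for (i, s), (j, n) in product(enumerate(speech_cb), enumerate(noise_cb)):
        g_s, g_n = fit_gains(y, s, n)
        err = fit_error(y, s, n, g_s, g_n)
        if best is None or err < best[0]:
            best = (err, i, j, g_n)
    _, i, j, g_n = best
    noise_psd = [g_n * b for b in noise_cb[j]]  # scaled noise envelope
    return i, j, noise_psd


# Toy data (invented for illustration): y = 2 * speech_cb[1] + 3 * noise_cb[0]
speech_cb = [[4.0, 2.0, 1.0, 0.5], [1.0, 2.0, 4.0, 2.0]]
noise_cb = [[1.0, 1.0, 1.0, 1.0], [0.5, 1.0, 2.0, 4.0]]
y = [5.0, 7.0, 11.0, 7.0]
i, j, noise_psd = codebook_search(y, speech_cb, noise_cb)
```

The loop visits every combination of speech and noise entries, so the cost grows with the product of the codebook sizes. This is one way to see why the complexity of such a search becomes a concern for large codebooks.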
This thesis explicitly addresses these drawbacks. Different methods are proposed which significantly increase both robustness and efficiency of known codebook-based methods.
Robustness is increased by integrating the codebook-based approach into a recursive minimum tracking noise estimation algorithm, which is a state-of-the-art noise estimation method. Consequently, the integrated approach inherits the robustness of the recursive minimum tracking algorithm. As a further step, either the noise codebooks can be made adaptive or a novel codebook training method can be used. The latter method uses the cepstral difference between the actual noise spectrum and a robust estimate (obtained with the recursive minimum tracking approach) as input for the codebook training algorithm. The delta codebook obtained in this way is then centered around the robust noise estimate during operation. As a result, the robust estimate is always also a valid estimate of the codebook-based algorithm, so robustness is increased while the ability to track non-stationary noise is not affected.
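The delta-codebook idea can be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation. Cepstral vectors are plain lists. A zero delta is added explicitly so that the robust estimate itself is always among the candidates. The candidate is selected by squared cepstral distance to a hypothetical reference vector `observed`, which stands in for whatever selection criterion the actual algorithm uses. The codebook training step (for example, a clustering of the deltas) is omitted. All function names are hypothetical.

```python
def training_deltas(actual_ceps, robust_ceps):
    """Cepstral differences (actual noise minus robust estimate), frame by frame.

    These differences would be the input to a codebook training algorithm.
    """
    return [[a - r for a, r in zip(av, rv)]
            for av, rv in zip(actual_ceps, robust_ceps)]


def center_codebook(delta_cb, robust):
    """Center the delta codebook around the current robust noise estimate."""
    zero = [0.0] * len(robust)  # assumption: keeps the robust estimate as a candidate
    deltas = [zero] + [d for d in delta_cb if any(d)]
    return [[r + x for r, x in zip(robust, d)] for d in deltas]


def nearest(candidates, observed):
    """Pick the candidate with the smallest squared cepstral distance."""
    return min(candidates,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(c, observed)))


# Toy values, invented for illustration only.
robust = [1.0, 0.5, 0.0]
delta_cb = [[0.5, 0.0, 0.0], [-0.5, 0.25, 0.0]]
candidates = center_codebook(delta_cb, robust)
tracked = nearest(candidates, [1.4, 0.5, 0.0])   # a shifted candidate is closer
fallback = nearest(candidates, [1.0, 0.5, 0.0])  # the robust estimate itself wins
```

Because the candidates are recomputed from the current robust estimate, the codebook follows that estimate over time, and the zero delta means the search can never do worse than the robust estimate under the chosen distance.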
Further improvements over prior art include sophisticated memory models which model the temporal evolution of the speech and noise parameters, i.e., spectral shapes and broadband gain factors.
Efficiency is considerably improved by two means: First, a cepstral envelope model is used instead of an autoregressive model. Unlike the autoregressive model, the cepstral model strictly separates pitch components and spectral envelope. Thus, an additional degree of freedom in the codebook training can be avoided and the speech codebook size can be reduced. A further reduction of the speech codebook size ...