Data as of 30 January 2026



ISBN 978-3-8439-5689-5

€84.00 incl. VAT, plus shipping


Series: Electrical Engineering

Ragini Sinha
Deep Neural Network-based Approaches for Single-channel Speaker-conditioned Target Speaker Extraction

164 pages, dissertation, Carl von Ossietzky Universität Oldenburg (2025), hardcover, B5

Abstract

In everyday communication scenarios, such as meetings and social gatherings, interfering speakers and background noise often degrade the quality and intelligibility of the desired target speaker. Various approaches have been developed to address this issue, such as blind source separation and speaker-conditioned target speaker extraction (SC-TSE). SC-TSE algorithms aim to extract the desired speaker from the mixture by utilizing auxiliary information about the target speaker, such as reference speech, visual information, directional information, or speaker activity. A typical SC-TSE system consists of a speaker embedder network and a speaker separator network. The speaker embedder network generates target speaker-specific discriminative features from the auxiliary information, and these features guide the speaker separator network to extract the target speaker from the mixture.
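
The two-network structure described above can be illustrated with a toy sketch. This is not one of the architectures developed in the thesis, which the abstract does not specify. It only shows the data flow: reference speech goes into the embedder, and the resulting embedding conditions the separator. All function names, dimensions, and the random weights are hypothetical, and multiplicative fusion with mask-based extraction is assumed here as one common way to condition a separator.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weights; a stand-in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def speaker_embedder(reference_frames, w_embed):
    """Map reference-speech frames to one fixed-size speaker embedding."""
    projected = [[math.tanh(v) for v in matvec(w_embed, frame)]
                 for frame in reference_frames]
    n_frames = len(projected)
    # Average over time so the embedding does not depend on utterance length.
    return [sum(column) / n_frames for column in zip(*projected)]


def speaker_separator(mixture_frames, embedding, w_in, w_out):
    """Estimate a per-frame mask, conditioned on the speaker embedding."""
    extracted = []
    for frame in mixture_frames:
        hidden = matvec(w_in, frame)
        # Multiplicative fusion (assumed): scale hidden units by the embedding.
        fused = [math.tanh(h * e) for h, e in zip(hidden, embedding)]
        # A sigmoid keeps every mask value strictly between 0 and 1.
        mask = [1.0 / (1.0 + math.exp(-z)) for z in matvec(w_out, fused)]
        extracted.append([m * x for m, x in zip(mask, frame)])
    return extracted


# Hypothetical sizes: 8 magnitude features per frame, 4-dimensional embedding.
N_FEATURES, EMBED_DIM = 8, 4
rng = random.Random(0)
w_embed = make_matrix(EMBED_DIM, N_FEATURES, rng)
w_in = make_matrix(EMBED_DIM, N_FEATURES, rng)
w_out = make_matrix(N_FEATURES, EMBED_DIM, rng)

# Random non-negative "magnitude" frames stand in for real audio features.
reference = [[rng.random() for _ in range(N_FEATURES)] for _ in range(20)]
mixture = [[rng.random() for _ in range(N_FEATURES)] for _ in range(30)]

embedding = speaker_embedder(reference, w_embed)
estimate = speaker_separator(mixture, embedding, w_in, w_out)
```

With untrained weights the output carries no meaning. In a real system both networks would be trained, typically jointly, so that the mask suppresses the interfering speakers and noise.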

The aim of this thesis is to develop novel deep neural network (DNN)-based architectures that enhance the reliability, efficiency, and robustness of single-channel SC-TSE algorithms using reference speech as auxiliary information, and to evaluate these architectures both objectively and subjectively.