Use of Bimodal Coherence to Resolve Spectral Indeterminacy in Convolutive BSS

Liu, Qingju; Wang, Wenwu; Jackson, Philip

doi:10.1007/978-3-642-15995-4_17

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6365)

Included in the following conference series:

International Conference on Latent Variable Analysis and Signal Separation

Abstract

Recent studies show that the visual information in visual speech can improve the performance of audio-only blind source separation (BSS) algorithms. Such information is exploited through statistical characterisation of the coherence between audio and visual speech using, e.g., a Gaussian mixture model (GMM). In this paper, we present two new contributions. First, an adapted expectation maximization (AEM) algorithm is proposed for the training process to model the audio-visual coherence of the extracted features. Second, the coherence is exploited to solve the permutation problem in the frequency domain using a new sorting scheme. We test our algorithm on the XM2VTS multimodal database. The experimental results show that the proposed algorithm outperforms traditional audio-only BSS.
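
As a rough illustration of the idea in the abstract, the sketch below shows how an audio-visual coherence score from a GMM might be used to undo a permutation in a single frequency bin. It is not the paper's sorting scheme and does not implement the AEM training. The scalar features, the diagonal-covariance GMM, its hand-picked parameters, and the function names `gmm_log_likelihood` and `align_bin` are all assumptions made for this example, and the exhaustive search over orderings is a simple stand-in for a real sorting scheme.

```python
import itertools

import numpy as np


def gmm_log_likelihood(x, weights, means, variances):
    """Total log-likelihood of the rows of x under a diagonal-covariance GMM.

    x: (T, D) frames; weights: (K,); means, variances: (K, D).
    """
    diff = x[:, None, :] - means[None, :, :]            # (T, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)           # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                   # (T, K)
    log_weighted = log_comp + np.log(weights)
    # Log-sum-exp over components for numerical stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.sum(np.exp(log_weighted - peak), axis=1))
    return float(np.sum(per_frame))


def align_bin(audio_feats, visual_feats, gmm):
    """Return the output ordering that maximises audio-visual coherence.

    audio_feats:  (N, T) hypothetical scalar audio feature per separated output.
    visual_feats: (N, T) hypothetical scalar visual feature per speaker.
    gmm: (weights, means, variances) of a joint (audio, visual) model.
    The result perm satisfies: speaker i is matched to output perm[i].
    """
    n = audio_feats.shape[0]
    best_perm, best_score = None, -np.inf
    for perm in itertools.permutations(range(n)):
        score = 0.0
        for speaker, output in enumerate(perm):
            joint = np.stack([audio_feats[output], visual_feats[speaker]], axis=1)
            score += gmm_log_likelihood(joint, *gmm)
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm


# Toy data (assumed): coherent frames have audio feature == visual feature.
visual = np.array([[0, 1, 0, 1, 0, 1, 0, 1],
                   [0, 0, 1, 1, 0, 0, 1, 1]], dtype=float)
audio = visual[::-1].copy()  # separated outputs arrive in swapped order
toy_gmm = (np.array([0.5, 0.5]),
           np.array([[0.0, 0.0], [1.0, 1.0]]),
           np.full((2, 2), 0.05))
print(align_bin(audio, visual, toy_gmm))  # prints (1, 0)
```

In a frequency-domain BSS pipeline, a decision of this kind would be made for every frequency bin so that each output channel carries the same speaker across the whole spectrum. The exhaustive search is only practical for a small number of sources.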

This work was supported by the Engineering and Physical Sciences Research Council (EPSRC) (Grant number EP/H012842/1) and the MOD University Defence Research Centre on Signal Processing (UDRC).

Author information

Authors and Affiliations

Centre for Vision, Speech and Signal Processing, Faculty of Engineering and Physical Sciences, University of Surrey, Guildford, GU2 7XH, United Kingdom
Qingju Liu, Wenwu Wang & Philip Jackson

Editor information

Editors and Affiliations

Dept. of Electrical Engineering, Université d’Evry Val d’Essonne, 40 rue du Pelvoux, 91020, Courcouronnes, France
Vincent Vigneron
Laboratoire I3S, Les Algorithmes - Euclide-B, BP 121, Université de Nice-Sophia Antipolis, 2000 Route des Lucioles, 06903, Sophia Antipolis Cedex, France
Vicente Zarzoso
School of Engineering, Dept. of Telecommunications, ISITV, Université de Toulon, Avenue George Pompidou, BP 56, La Valette du Var, Cedex, 83162, France
Eric Moreau
INRIA France, Equipe-projet METISS, Centre de Recherche INRIA Rennes-Bretagne Atlantique, Campus de Beaulieu, 35042, Rennes cedex, France
Rémi Gribonval
INRIA France, Equipe-projet METISS, Centre de Recherche INRIA Rennes-Bretagne Atlantique, Campus de Beaulieu, 35042, Rennes Cedex, France
Emmanuel Vincent

About this paper

Cite this paper

Liu, Q., Wang, W., Jackson, P. (2010). Use of Bimodal Coherence to Resolve Spectral Indeterminacy in Convolutive BSS. In: Vigneron, V., Zarzoso, V., Moreau, E., Gribonval, R., Vincent, E. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2010. Lecture Notes in Computer Science, vol 6365. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15995-4_17

DOI: https://doi.org/10.1007/978-3-642-15995-4_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15994-7
Online ISBN: 978-3-642-15995-4
eBook Packages: Computer Science, Computer Science (R0)
