WO2006078003A2 - Method and system for separating acoustic signals - Google Patents

Method and system for separating acoustic signals

Info

Publication number
WO2006078003A2
Authority
WO
WIPO (PCT)
Prior art keywords
acoustic
signal
mixtures
separating
separated
Prior art date
Application number
PCT/JP2006/300918
Other languages
English (en)
Other versions
WO2006078003A3 (fr)
Inventor
Che-Ming Lin
Chien-Ming Wu
Original Assignee
Matsushita Electric Industrial Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co., Ltd.
Publication of WO2006078003A2
Publication of WO2006078003A3

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/028 Voice signal separating using properties of sound source

Definitions

  • the invention relates to a method and system for separating signals, more particularly to a method and system for separating acoustic signals.
  • BSS Blind source separation
  • $x(n) = [x_1(n) \ldots x_{d_x}(n)]^T$ in discrete form
  • $A(k)$ represents the impulse response of a transmission environment from each independent sound source to a respective microphone.
  • the dimension of $A(k)$ is $d_s \times d_x$, whereas r represents
  • each acoustic mixture is a result of computation of convolution of the original sound source with the impulse response of the transmission environment.
  • the conventional BSS technique primarily utilizes the acoustic mixtures to find a good separated matrix $W(k)$, and to perform
  • the conventional BSS technique assumes the signal points in each of the original sound sources $s_1(n) \ldots s_{d_s}(n)$ to be statistically independent of each other so that there is no spatial correlation, and the calculated separated matrix $W(k)$ will result in absence of spatial correlation
  • the BSS technique as disclosed includes the following steps:
  • Step (T1) is a linear prediction processing step, in which acoustic mixtures $x_1(n) \ldots x_{d_x}(n)$ undergo linear prediction processing to result in
  • the linear prediction processing is to try to remove the temporal correlation in each of the acoustic mixtures $x_1(n) \ldots x_{d_x}(n)$, i.e., using the
  • step (T1) still cannot completely or substantially remove the temporal correlation in each acoustic mixture. Therefore, although the temporal correlation in each of the residual signals $r_1(n) \ldots r_{d_x}(n)$ thus calculated is lower than that in the acoustic
  • in step (T2), the residual signals are subjected to independent component analysis (hereinafter referred to as ICA).
  • ICA independent component analysis
  • ICA processing is a known technique, for which reference can be made to the article "Independent Component Analysis, a new concept?" published in Signal Processing by P. Comon in 1994.
  • the conventional ICA processing scheme primarily involves calculating the separated matrix W(k) from the residual
  • the ICA processing scheme can effectively remove the spatial correlation among the signal points in each of the residual signals $r_1(n) \ldots r_{d_x}(n)$.
  • the method for calculating the separated matrix W(k) is to generate
  • $\mathrm{off\_diag}\{\cdot\}$ denotes the off-diagonal elements of the matrix
  • ⁇ and r are time indexes
  • L is a positive integer and represents the number of signal points in the acoustic mixture.
  • the $W_{new}(k)$ calculated from equation (6) is used as a new $W_r(k)$ for substitution into equation (4) to update the value of the signal m, and the new $W_r(k)$ and the updated m are substituted into equation (5) to
  • Equations (4), (5) and (6) are calculated respectively in this recursive manner until the $\Delta W_r(k)$ calculated from equation (5) approximates 0.
  • the $W_{new}(k)$ thus calculated from equation (6) at this time is the separated matrix $W(k)$.
  • the acoustic mixtures $x_1(n) \ldots x_{d_x}(n)$ and the separated matrix $W(k)$ are convolved according to equation (2) in step (T2) to obtain the separated signals $z_1(n), z_2(n), \ldots, z_{d_s}(n)$.
  • in step (T2), the temporal correlation in each residual signal cannot be further reduced to 0. Therefore, the separated matrix calculated in step (T2) still cannot achieve optimization, and the separated signals $z_1(n) \ldots z_{d_s}(n)$ still cannot be completely identical to the
  • in step (T1), the preprocessing by linear prediction not only fails to remove the temporal correlation at pitch positions, but also has the drawback that the order q must exceed 50, which makes the calculation of equation (3) rather complicated and time-consuming.
  • the primary object of the present invention is to provide a system for separating acoustic signals which can enhance the sound separation effect.
  • Another object of the present invention is to provide a method for separating acoustic signals.
  • the method can be employed to better separate acoustic mixtures into the original sound sources.
  • the system for separating acoustic signals of the present invention is adapted for separating a plurality of acoustic mixtures into at least one single sound source.
  • the system for separating acoustic signals includes: a pitch prediction module capable of removing temporal correlation in each of the acoustic mixtures according to the following equation:
  • p is the order
  • $\beta_i(k)$ is a pitch prediction coefficient
  • $D_k$ is a pitch position, $\beta_i(k)$ and $D_k$ being calculated as follows:
  • L is the number of signal points contained in each of the acoustic mixtures
  • D is a positive integer ranging from 1 to L
  • different values of D being substituted into the above equation to obtain different values of $\beta_i(D)$
  • $\beta_i(k)$ being the $k$th largest value of $\beta_i(D)$
  • $D_k$ being the D which makes $\beta_i(D)$ have the $k$th largest value
  • a linear prediction module connected electrically to the pitch prediction module for further removing the temporal correlation in each meta-signal $y_1(n) \ldots y_{d_x}(n)$ according to the following equation so as to
  • the method for separating acoustic signals of the present invention is adapted for separating a plurality of acoustic mixtures into at least one single sound source.
  • the method for separating the acoustic signals includes the following steps:
  • $y_i(n) = x_i(n) - \sum_{k=1}^{p} \beta_i(k)\, x_i(n - D_k)$
  • where $x_i(n)$ is the $i$th acoustic mixture, $y_i(n)$ is the $i$th processed
  • p is the order, $\beta_i(k)$ is a pitch prediction coefficient, and
  • $D_k$ is a pitch position, $\beta_i(k)$ and $D_k$ being calculated as follows:
  • L is the number of signal points contained in each of the acoustic mixtures
  • D is a positive integer ranging from 1 to L
  • different values of D being substituted into the above equation to obtain different values of $\beta_i(D)$, $\beta_i(k)$ being the $k$th largest value of $\beta_i(D)$
  • $D_k$ being the D that makes $\beta_i(D)$ have the $k$th largest value; (B) further removing the temporal correlation in each meta-signal
  • the effect of the present invention resides in that the pitch prediction module can substantially remove the temporal correlation of the acoustic mixtures so that the separated matrix can be optimized, thereby enhancing the effect of acoustic signal separation.
  • Figure 1 is a flowchart of a conventional BSS technique, which includes a linear prediction processing step
  • Figure 2 is a system block diagram of the preferred embodiment of a system for separating acoustic signals according to the present invention
  • Figure 3 is a flowchart of the preferred embodiment of the present invention.
  • the preferred embodiment of a system for separating acoustic signals is shown to include a sound receiving module 1, a pitch prediction module 2, a linear prediction module 3, an independent component analysis processing module 4 (hereinafter referred to as ICA processing module), and a sound playback unit 5.
  • the sound receiving module 1 includes dx microphones 11 and a sampling unit 12.
  • the microphones 11 receive acoustic signals, respectively.
  • the acoustic signal received by the $i$th microphone 11 is represented by $x_i(t)$, and $x_i(t)$ represents a continuous acoustic signal. It is noted that, in this embodiment, the number of the microphones 11 must be at least two.
  • the sampling unit 12 is connected electrically to the pitch prediction module 2.
  • the sampling unit 12 samples the acoustic signals
  • the sampling unit 12 samples the continuous acoustic signals $x_1(t) \ldots x_{d_x}(t)$ respectively at a sampling rate of 8000 samples per second to result in the acoustic mixtures $x_1(n) \ldots x_{d_x}(n)$ in discrete form. Therefore, each acoustic mixture $x_1(n) \ldots x_{d_x}(n)$ has 8000 samples per second.
  • the sampling rate according to the present invention should not be limited to 8000 samples per second.
  • the samples obtained from the acoustic mixture $x_1(n) \ldots x_{d_x}(n)$ are taken as one frame.
  • each frame includes 240
  • a frame may also be composed of samples taken from the acoustic mixture $x_1(n) \ldots x_{d_x}(n)$ at other durations of time, not limited to 30 ms.
  • a frame may also include all the samples in the acoustic mixture $x_i(n)$.
  • the pitch prediction module 2 is connected electrically to the sampling unit 12 and the linear prediction module 3.
  • the pitch prediction module 2 reads the acoustic mixtures $x_1(n) \ldots x_{d_x}(n)$ of the frame outputted by the sampling unit 12, and then removes the temporal correlation in each acoustic mixture according to the following equation (7) to result in meta-signals $y_1(n) \ldots y_{d_x}(n)$, and outputs the
  • the $k$th largest value, i.e., $\beta_i(k)$
  • ⁇ , ⁇ D the largest value of ⁇ , ⁇ D
  • the linear prediction module 3 is connected electrically to the ICA processing module 4, and reads the meta-signals $y_1(n) \ldots y_{d_x}(n)$ of the frame outputted by the pitch prediction module 2.
  • the linear prediction module 3 removes the temporal correlation in the meta-signals $y_1(n) \ldots y_{d_x}(n)$ according to the linear prediction scheme expressed in the
  • the ICA processing module 4 furthermore, it is possible to obtain a separated matrix $W(k)$.
  • the sound playback unit 5 receives the separated signal $z_i(n)$ in the frame outputted by the ICA processing module 4, and plays back the separated signal $z_i(n)$.
  • the preferred embodiment of the system for separating acoustic signals according to the present invention can be employed to separate ds separated signals from dx acoustic mixtures
  • a method employed by the system for separating acoustic signals according to this invention includes the following steps: (S1) receiving dx continuous acoustic signals $x_1(t) \ldots x_{d_x}(t)$
  • step (S7) playing back the separated signal $z_i(n)$ obtained in step (S6) using the sound playback unit 5, and thereafter skipping back to step (S3) to continue with the execution of steps (S3) to (S7) successively for a next frame, wherein steps (S3) to (S7) are repeated until all the frames have been processed.
  • this invention may include only the pitch prediction module 2, the linear prediction module 3, and the ICA processing module 4, and dispense with the sound receiving module 1 and the sound playback unit 5. That is, the acoustic mixtures to be analyzed are not necessarily received through the microphones 11, and can be directly inputted into the pitch prediction module 2 through network downloading, electrical interfaces, or a storage medium. For instance, the acoustic mixtures to be processed $x_1(n) \ldots x_{d_x}(n)$ can be inputted into the pitch prediction module 2 by
  • acoustic mixtures $x_1(n) \ldots x_{d_x}(n)$ in an external database through a universal serial bus (USB), or by using an optical disk drive to read the acoustic mixtures $x_1(n) \ldots x_{d_x}(n)$ stored on an optical disk.
  • USB universal serial bus
  • the data that have been processed by the ICA processing module 4 can also be sent to other systems for subsequent applications, and do not necessarily have to be played back.
  • the present invention may include only the pitch prediction module 2 and the ICA processing module 4, and dispense with the linear prediction module 3. Besides, even without the linear prediction module 3, the present invention can still effectively alleviate the drawbacks associated with the prior art. Furthermore, since the computational scheme adopted by the linear prediction module 3 is relatively complicated and takes a relatively large amount of computational time, dispensing with the linear prediction module 3 can render the present invention more time-saving in terms of computation, as compared with the prior art. In this case, the meta-signals
  • step (S4) the ICA processing module 4 calculates the separated matrix W(k)
  • since the pitch prediction module 2 of this invention can considerably remove the temporal correlation in each acoustic mixture at the pitch position, optimization of the separated matrix can be achieved.
  • the separated signals are less susceptible to distortion and can be identical to the original sound sources.
  • for the order p, a value of 1 or 2 is sufficient. Therefore, computational complexity can be reduced, and computational time can be saved.
  • the present invention can be applied to a method or a system for separating acoustic signals.
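
As an illustration of the pitch prediction step described above, the following Python sketch processes one 30 ms frame (240 samples at 8000 samples per second) of a single acoustic mixture. It is a minimal sketch under stated assumptions, not the patented implementation: the formula for β_i(D) is not recoverable from the text here, so `pitch_coefficients` assumes the least-squares gain Σ x(n)x(n−D) / Σ x(n−D)². The function names and the `max_lag` cap (half the frame, whereas the text lets D range from 1 to L) are hypothetical choices, made so that every sum has enough terms.

```python
SAMPLE_RATE = 8000                       # samples per second, as in the embodiment
FRAME_LEN = SAMPLE_RATE * 30 // 1000     # 30 ms frame -> 240 samples


def pitch_coefficients(x, max_lag):
    """Return {D: beta(D)} for D = 1..max_lag.

    ASSUMPTION: beta(D) is the least-squares gain
    sum(x[n] * x[n-D]) / sum(x[n-D]^2); the source formula is not shown.
    """
    betas = {}
    for lag in range(1, max_lag + 1):
        num = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
        den = sum(x[n - lag] * x[n - lag] for n in range(lag, len(x)))
        betas[lag] = num / den if den > 0 else 0.0
    return betas


def pitch_predict(x, order=1, max_lag=None):
    """Compute y(n) = x(n) - sum_{k=1..p} beta(k) * x(n - D_k) for one frame.

    Returns the meta-signal y and the selected (D_k, beta(k)) pairs.
    """
    if max_lag is None:
        max_lag = len(x) // 2            # hypothetical cap, see lead-in
    betas = pitch_coefficients(x, max_lag)
    # D_k is the lag whose beta(D) is the k-th largest; keep the top `order`.
    top = sorted(betas, key=betas.get, reverse=True)[:order]
    y = []
    for n in range(len(x)):
        # Samples before the start of the frame are treated as unavailable.
        pred = sum(betas[lag] * x[n - lag] for lag in top if n - lag >= 0)
        y.append(x[n] - pred)
    return y, [(lag, betas[lag]) for lag in top]


if __name__ == "__main__":
    # Toy frame: one pulse every 40 samples (a 200 Hz pitch at 8000 samples/s).
    frame = [1.0 if n % 40 == 0 else 0.0 for n in range(FRAME_LEN)]
    meta, picks = pitch_predict(frame, order=1)
    print(picks)
```

For an impulse train with one pulse every 40 samples, the sketch selects D_1 = 40 with a coefficient of 1, and the meta-signal is zero from sample 40 onward. With order p = 1 or 2 the cost per frame is dominated by the lag search, not by the prediction itself.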

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A system for separating acoustic signals, adapted for separating a plurality of acoustic mixtures into at least one single sound source. The system for separating acoustic signals includes: a pitch prediction module for removing the temporal correlation in each acoustic mixture to obtain a corresponding meta-signal; a linear prediction module connected electrically to the pitch prediction module for further removing the temporal correlation in each meta-signal to obtain a corresponding residual signal; and an independent component analysis processing module connected electrically to the linear prediction module for receiving the residual signals. The independent component analysis processing module calculates a separated matrix from the residual signals and convolves the separated matrix with the acoustic mixtures to separate them into at least one single sound source.
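
The final separation step summarized above, in which the separated matrix is convolved with the acoustic mixtures, can be sketched in Python as follows. This is a hedged illustration, not the patented implementation: the function name `apply_separation`, the nested-list layout of `W` (one d_s × d_x matrix per tap k), and the zero initial condition for n < k are assumptions. The sketch computes z(n) = Σ_k W(k) x(n − k) and does not estimate W(k) itself.

```python
def apply_separation(W, x):
    """Convolve a separating filter with the mixtures.

    W: list of K taps, each a d_s x d_x matrix given as nested lists.
    x: list of d_x channels, each a list of samples of equal length.
    Returns d_s separated channels z, with z_i(n) = sum_k sum_j W(k)[i][j] * x_j(n - k).
    """
    d_s = len(W[0])
    d_x = len(x)
    length = len(x[0])
    z = [[0.0] * length for _ in range(d_s)]
    for k, tap in enumerate(W):
        for i in range(d_s):
            for j in range(d_x):
                weight = tap[i][j]
                if weight == 0.0:
                    continue                 # skip taps that contribute nothing
                for n in range(k, length):   # x_j(n - k) is taken as 0 for n < k
                    z[i][n] += weight * x[j][n - k]
    return z


if __name__ == "__main__":
    mixtures = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    # A single identity tap leaves the mixtures unchanged.
    identity = [[[1.0, 0.0], [0.0, 1.0]]]
    print(apply_separation(identity, mixtures))
```

A filter with several taps lets each output depend on delayed samples of every mixture, which is what distinguishes this convolutive form from a single instantaneous unmixing matrix.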
PCT/JP2006/300918 2005-01-19 2006-01-17 Method and system for separating acoustic signals WO2006078003A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN 200510005544 CN1808571A (zh) 2005-01-19 2005-01-19 Sound signal separation system and method
CN200510005544.8 2005-01-19

Publications (2)

Publication Number Publication Date
WO2006078003A2 (fr) 2006-07-27
WO2006078003A3 (fr) 2007-02-08

Family

ID=36660000

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2006/300918 WO2006078003A2 (fr) 2005-01-19 2006-01-17 Method and system for separating acoustic signals

Country Status (2)

Country Link
CN (1) CN1808571A (fr)
WO (1) WO2006078003A2 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2123116A1 (fr) * 2007-01-26 2009-11-25 Microsoft Corporation Multi-sensor sound source localization
US8126829B2 (en) 2007-06-28 2012-02-28 Microsoft Corporation Source segmentation using Q-clustering
WO2012099518A1 (fr) * 2011-01-19 2012-07-26 Limes Audio Ab Method and device for microphone selection
US10032461B2 (en) 2013-02-26 2018-07-24 Koninklijke Philips N.V. Method and apparatus for generating a speech signal
CN113574597A (zh) * 2018-12-21 2021-10-29 弗劳恩霍夫应用研究促进协会 Apparatus and method for source separation using an estimation and control of sound quality

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
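CN1909064B (zh) * 2006-08-22 2011-05-18 复旦大学 Time-domain blind separation method for online convolutive mixtures of natural speech signals
CN104078051B (zh) * 2013-03-29 2018-09-25 南京中兴软件有限责任公司 Vocal extraction method and system, and vocal audio playback method and device
CN104269174B (zh) * 2014-10-24 2018-02-09 北京音之邦文化科技有限公司 Audio signal processing method and device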
US20220139368A1 (en) * 2019-02-28 2022-05-05 Beijing Didi Infinity Technology And Development Co., Ltd. Concurrent multi-path processing of audio signals for automatic speech recognition systems

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KOKKINAKIS K ET AL.: "BLIND SEPARATION OF ACOUSTIC MIXTURES BASED ON LINEAR PREDICTION ANALYSIS" 4TH INTERNATIONAL SYMPOSIUM ON INDEPENDENT COMPONENT ANALYSIS AND BLIND SIGNAL SEPARATION (ICA 2003), [Online] April 2003 (2003-04), pages 343-348, XP002391413 NARA (JP) Retrieved from the Internet: URL:http://www.kecl.ntt.co.jp/icl/signal/ica2003/cdrom/data/0128.pdf [retrieved on 2006-07-20] *
NISHIKAWA T ET AL: "STABLE LEARNING ALGORITHM FOR BLIND SEPARATION OF TEMPORALLY CORRELATED ACOUSTIC SIGNALS COMBINING MULTISTAGE ICA AND LINEAR PREDICTION" IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS, COMMUNICATIONS AND COMPUTER SCIENCES, ENGINEERING SCIENCES SOCIETY, TOKYO, JP, vol. E86-A, no. 8, August 2003 (2003-08), pages 2028-2036, XP001177859 ISSN: 0916-8508 *
ZHAO ZHIJIN ET AL: "The study on application of linear prediction filter in blind source separation" SIGNAL PROCESSING, 2004. PROCEEDINGS. ICSP '04. 2004 7TH INTERNATIONAL CONFERENCE ON BEIJING, CHINA AUG. 31 - SEPT 4, 2004, PISCATAWAY, NJ, USA,IEEE, 31 August 2004 (2004-08-31), pages 331-334, XP010809628 ISBN: 0-7803-8406-7 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2123116A1 (fr) * 2007-01-26 2009-11-25 Microsoft Corporation Multi-sensor sound source localization
EP2123116A4 (fr) * 2007-01-26 2012-09-19 Microsoft Corp Multi-sensor sound source localization
US8126829B2 (en) 2007-06-28 2012-02-28 Microsoft Corporation Source segmentation using Q-clustering
WO2012099518A1 (fr) * 2011-01-19 2012-07-26 Limes Audio Ab Method and device for microphone selection
US9313573B2 (en) 2011-01-19 2016-04-12 Limes Audio Ab Method and device for microphone selection
US10032461B2 (en) 2013-02-26 2018-07-24 Koninklijke Philips N.V. Method and apparatus for generating a speech signal
CN113574597A (zh) * 2018-12-21 2021-10-29 弗劳恩霍夫应用研究促进协会 Apparatus and method for source separation using an estimation and control of sound quality
CN113574597B (zh) * 2018-12-21 2024-04-12 弗劳恩霍夫应用研究促进协会 Apparatus and method for source separation using an estimation and control of sound quality

Also Published As

Publication number Publication date
WO2006078003A3 (fr) 2007-02-08
CN1808571A (zh) 2006-07-26

Similar Documents

Publication Publication Date Title
WO2006078003A2 (fr) Method and system for separating acoustic signals
CN108735227B (zh) Method and system for sound source separation of speech signals picked up by a microphone array
Tan et al. Audio-visual speech separation and dereverberation with a two-stage multimodal network
EP3649642A1 (fr) Method and system for enhancing a speech signal of a human speaker in a video using visual information
WO2019246220A1 (fr) Data-driven audio enhancement
JP5231139B2 (ja) Sound source extraction device
CN108429995B (zh) Sound processing device, sound processing method, and storage medium
WO2002065782A1 (fr) Multimedia content: generating and matching hashes
JP2007526511A (ja) Method and apparatus for blind separation of multipath multichannel mixed signals in the frequency domain
US5999567A (en) Method for recovering a source signal from a composite signal and apparatus therefor
CN112567459A (zh) Sound separation device, sound separation method, sound separation program, and sound separation system
EP1589783A2 (fr) Device and method for separating multiple sources using directional filtering
EP3392882A1 (fr) Method for processing an audio signal and corresponding electronic device, non-transitory computer-readable program product, and computer-readable storage medium
EP4211686A1 (fr) Machine learning for microphone style transfer
US20040054528A1 (en) Noise removing system and noise removing method
CN111128222A (zh) Speech separation method, speech separation model training method, and computer-readable medium
WO2003083858A1 (fr) Time-domain watermarking of multimedia signals
Furnon et al. Distributed speech separation in spatially unconstrained microphone arrays
GB2510650A (en) Sound source separation based on a Binary Activation model
JP3486975B2 (ja) Noise reduction device and method
WO2003083860A1 (fr) Window shaping functions for digital watermarking of multimedia signals
JP2020012980A (ja) Signal processing device, signal processing program, signal processing method, and sound collection device
EP3185242A1 (fr) Method and apparatus for processing audio content
JP4652116B2 (ja) Echo cancellation device
EP4214707A1 (fr) Method and device for processing a binaural recording

Legal Events

Date Code Title Description
NENP Non-entry into the national phase in: Ref country code: DE
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 06712134; Country of ref document: EP; Kind code of ref document: A2
122 Ep: pct application non-entry in european phase. Ref document number: 06712134; Country of ref document: EP; Kind code of ref document: A2
WWW Wipo information: withdrawn in national office. Ref document number: 6712134; Country of ref document: EP