WO2014133338A1 - Blind signal extraction method using direction of arrival information and de-mixing system therefor - Google Patents

Info

Publication number
WO2014133338A1
Authority
WO
WIPO (PCT)
Prior art keywords
mixing
signal
transfer function
filter
sound source
Prior art date
Application number
PCT/KR2014/001630
Other languages
French (fr)
Inventor
Soo Young Lee
Choong Hwan Choi
Ruxin Chen
Jae Kwon Yoo
Original Assignee
Korea Advanced Institute Of Science And Technology
Sony Computer Entertainment America Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Korea Advanced Institute Of Science And Technology, Sony Computer Entertainment America Llc
Publication of WO2014133338A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/028 Voice signal separating using properties of sound source
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/005 Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S3/00 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
    • G01S3/02 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using radio waves
    • G01S3/74 Multi-channel systems specially adapted for direction-finding, i.e. having a single antenna system capable of giving simultaneous indications of the directions of different signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00 Signal processing covered by H04R, not provided for in its groups
    • H04R2430/03 Synergistic effects of band splitting and sub-band processing

Definitions

  • the present invention relates to a blind signal extraction method using direction-of-arrival information and a de-mixing system therefor, and more particularly to a blind signal extraction algorithm for extracting, in a frequency domain, a signal of a sound source located in a specific direction from mixed signals received in a signal reception unit.
  • the signal may be a signal in which signals from two different sources are mixed. Accordingly, it is required that only a signal from a desired source is separated or extracted from the signal in which the signals from the two different sources are mixed.
  • a blind signal separation (BSS) algorithm and a blind signal extraction (BSE) algorithm are known as methods of separating or extracting the signal of the desired source.
  • the BSS separates signals from at least two sources and separately acquires the signal from each source.
  • the BSS may result in the separation of the signal, e.g., noise, from an undesired source.
  • the blind signal extraction is a method of extracting only a signal of a desired source from the mixed signals.
  • An algorithm for blind signal extraction in the time domain has already been proposed, but it requires a large amount of calculation and therefore a long calculation time.
  • an algorithm for blind signal extraction in the frequency domain, however, has not been put to practical use.
  • a permutation phenomenon occurs in which the signals separated in different frequency bins emerge in an inconsistent order and must be put into a consistent order.
  • to remove the permutation phenomenon, the signals separated in the different frequency bins are paired with one another.
  • in blind signal extraction, however, such pairing cannot be performed because only one signal is extracted.
  • the present invention has been made to solve the above-mentioned problem in the conventional art, and an aspect of the present invention is to provide a method of quickly extracting a signal of a sound source of a specific direction from a mixed signal through blind signal extraction in a frequency domain, while preventing a permutation phenomenon.
  • a method of extracting a signal y from a sound source placed in a specific direction in a frequency domain includes: receiving mixed signals through a signal reception unit from at least two sound sources including the sound source placed in the specific direction; and de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.
  • a transfer function W of a de-mixing filter is initialized and calculated by using the direction information.
  • a method of extracting a signal y from a sound source placed in a specific direction in a frequency domain includes: receiving mixed signals through at least two signal reception units from two or more sound sources; and de-mixing the received signals by using non-Gaussianity of the received signals, wherein in the de-mixing of the received signals, a learning is repeatedly performed in order to calculate a transfer function W of a de-mixing filter by using closeness constraints indicating flatness of a transfer function of a mixing filter in a frequency domain, and a value of a time delay when signals arrive from the two or more signal reception units is adaptively updated as the learning is repeatedly performed.
  • a de-mixing system for extracting a signal y from a sound source placed in a specific direction in a frequency domain.
  • the de-mixing system includes: a signal reception unit for receiving mixed signals from at least two sound sources including the sound source placed in the specific direction; and a de-mixing filter for de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.
  • the de-mixing system for blind signal extraction by using direction information may further include a filter parameter calculating unit for initializing and calculating a transfer function W of the de-mixing filter by using the direction information.
  • the de-mixing system includes: two or more signal reception units for receiving mixed signals from two or more sound sources; a de-mixing filter for de-mixing the received signals by using non-Gaussianity of the received signals; and a filter parameter calculating unit for repeatedly performing a learning in order to calculate a transfer function W of the de-mixing filter by using closeness constraints indicating flatness of a transfer function of a mixing filter in a frequency domain, wherein a value of a time delay τ when signals arrive from the two or more signal reception units is adaptively updated as the learning is repeatedly performed.
  • the method of extracting the signal of the sound source from the specific direction in the frequency domain, in which the blind signal extraction is performed in the frequency domain while preventing the permutation phenomenon can be provided.
  • a convergence rate of the blind signal extraction algorithm can be improved, resulting in the reduction of the amount of the calculation.
  • FIG. 1 is an exemplary view illustrating an environment of a de-mixing system in which a blind signal extraction algorithm is performed according to the embodiment of the present invention.
  • FIG. 2 is a view illustrating a room impulse response corresponding to a position of a sound source from a signal reception unit to seek closeness constraints according to the embodiment of the present invention.
  • FIG. 3 is a graph illustrating flatness of a signal in the frequency domain according to a distance between the sound source and the signal reception unit.
  • FIG. 4 is a view illustrating a geometric positional relationship between the sound source to which the blind signal extraction algorithm is applicable and the signal reception unit, according to the embodiment of the present invention.
  • FIG. 1 is an exemplary view illustrating an environment of a de-mixing system in which a blind signal extraction algorithm can be performed according to the embodiment of the present invention.
  • signals from at least two sound sources 10 and 12 are mixed and received in at least one signal reception unit 20 or 22.
  • a room environment is illustrated as an example. Accordingly, the signals from the sound sources 10 and 12 arrive at the signal reception unit 20 or 22 not only through direct paths D11, D12, D21 and D22, but also through reflection paths R11, R12, R21 and R22 after being reflected in the room.
  • the received signals of the sound sources may be input in a de-mixing system 30.
  • the mixed and received signals of the sound sources can be separated through a de-mixing performed by the de-mixing system 30.
  • the de-mixing system 30 may be regarded as including the signal reception unit 20 or 22.
  • a state in which there is no information on a signal from a sound source or a mixing environment is referred to as a blind state. That is, the embodiment of the present invention provides an algorithm for extracting a signal received in the blind state.
  • the blind signal extraction algorithm disclosed in U.S. Patent No. 7,797,153 B2 uses the fact that the source signals being mixed are statistically independent of one another, while different frequency bins within one signal are statistically dependent. By using this blind signal extraction algorithm, it is possible to solve the permutation phenomenon during the separation of the mixed signals.
  • ICA: Independent Component Analysis
  • BSS: Blind Signal Separation
  • FDICA: Frequency Domain ICA
  • the blind signal extraction algorithm disclosed in U.S. Patent No. 7,797,153 B2 corresponds to a multivariate extension of ICA, and resolves the permutation ambiguity by using the dependency among the frequency components.
  • for a detailed description, refer to U.S. Patent No. 7,797,153 B2.
  • the mixed signals received in the de-mixing system 30 may be expressed in the frequency domain through Short-Time Fourier Transform (STFT).
  • a signal y extracted by a de-mixing filter (not shown) in the de-mixing system 30 should be identical to a signal from a sound source 10 or 12. Accordingly, when the signal from the original sound source 10 or 12 is multiplied by a transfer function A of the mixing filter (the transfer function for the path of the signal from the sound source), and then multiplied by a transfer function W of the de-mixing filter, the signal from the original sound source 10 or 12 should be restored.
  • W_ij(z) indicates the transfer function, in the z-domain, for an input j and an output i of the de-mixing filter (not shown) in the de-mixing system 30.
  • a_ij(z) indicates the transfer function, in the z-domain, for the path from a source j to a signal reception unit i.
  • the extracted signal y may be expressed by Equation (2) using a multivariable probability density function.
  • the de-mixing filter calculates the transfer function W for the de-mixing.
  • a parameter of the transfer function W may be obtained in a filter parameter calculation unit (not shown) in a manner described below.
  • a negentropy function may be used as a cost function in order to maximize the non-Gaussianity of the signal in the blind signal extraction algorithm disclosed in U.S. Patent No. 7,797,153 B2, as indicated below.
  • in Equation (3), y_G is a Gaussian signal having the same mean and variance as y.
  • the following learning rule may be obtained in order to acquire an optimal extraction signal using the cost function expressed by Equation (3).
  • in Equation (4), η indicates the learning rate.
  • Equation (5) may be obtained as below.
  • the learning rule expressed by Equation (4) may be determined by using Equation (5).
  • x_f indicates the signal of the f-th frequency bin, which is received in the signal reception unit 20 or 22 and input into the de-mixing system 30.
  • a filter parameter calculation unit (not shown) involved in the de-mixing system 30 may obtain a transfer function W for de-mixing a signal by using a cost function expressed by Equation (3). Then, the de-mixing filter performs a de-mixing for the signal received in the de-mixing system 30 by using the obtained transfer function W.
  • the filter parameter calculation unit may receive an output from the de-mixing filter, and repeatedly obtain a filter parameter according to the learning rule expressed by Equation (4) based on the output, so as to provide the obtained filter parameter to the de-mixing filter.
  • the de-mixing filter may operate adaptively. That is, the calculation is repeated according to the learning rule of Equation (4), thereby adaptively obtaining the transfer function W. It is then determined whether the transfer function W has converged; if it has not, the previous step is performed again to recalculate the transfer function W, and the de-mixing is performed again.
  • the signal y can be extracted by using the transfer function W obtained in an adaptive manner as described above. The extracted signal may then be converted to and expressed in the time domain if necessary.
  • the signal y extracted by the above-mentioned algorithm may be a specific signal among the signals received through the signal reception unit 20 or 22. Generally, the extracted signal y is a signal with the strongest intensity among the signals received through the signal reception unit 20 or 22.
  • a signal of a sound source nearest to the signal reception unit 20 or 22 may be extracted.
  • a blind signal extraction algorithm capable of extracting a near field signal will be described.
  • FIG. 2 illustrates a room impulse response corresponding to a position of a sound source from a signal reception unit to seek closeness constraints according to the embodiment of the present invention.
  • the sound source is labeled "source"
  • the signal reception unit is labeled "mic".
  • the horizontal axis is a time axis indicating a time delay
  • the vertical axis indicates the magnitude of the response.
  • the room impulse response has a shape similar to that of a delta function.
  • in that case, the signal has a constant value at all frequencies. Accordingly, flatness, a quantity defined so that its value increases as the response approaches this constant value, is expressed by Equation (6).
  • a_ij^f denotes the transfer function at the f-th frequency bin for the path from a source j to a signal reception unit i.
  • FIG. 3 illustrates flatness of a signal in the frequency domain according to a distance between the sound source and the signal reception unit.
  • the flatness of each signal is indicated by a different color according to a reflection extent of each signal in the room.
  • the horizontal axis is a distance axis indicating the distance between a signal source and a microphone
  • the vertical axis is a magnitude axis indicating the magnitude of the flatness.
  • the value of the flatness defined by Equation (6) increases.
  • the flatness is calculated by using the transfer function a_ij^f of the path from the source to the signal reception unit.
  • the blind signal extraction algorithm can calculate the transfer function W of the de-mixing filter by adding the closeness constraints to the learning rule expressed by Equation (4).
  • Equation (1) described in this specification expresses 2×2 blind signal extraction (signals from two sound sources, and two microphones).
  • Blind signal extraction for N signals and N microphones may be expressed by Equation (7) below.
  • a relation between the de-mixing filter and the mixing filter may be obtained as Equation (8).
  • Equation (8) may be modified into a simpler form by applying the following two assumptions for obtaining the closeness constraints. The two assumptions are consistent with the purpose of extracting the signal closest to the signal reception unit 20 or 22, and they allow the transfer function a_ij^f corresponding to the mixing filter to be modeled simply.
  • the room impulse responses a_i1^f and a_k1^f for the two signal reception units may be expressed as Equation (9), in which they differ by a time delay.
  • in Equation (9), l means the time taken for a sound to travel from the closest sound source to the signal reception unit 1,
  • L_1 is the distance from the closest sound source to the signal reception unit 1, and
  • v means the velocity of the sound. Accordingly, if this relation is substituted into Equation (8) with respect to the transfer function a_11^f, Equation (10) may be obtained as below.
  • Equation (10) may be expressed more simply. That is, the transfer function a_11^f of the mixing filter may be treated as a constant because of the short distance to the source. Therefore, Equation (10) may be expressed as Equation (11) below.
  • by means of the two assumptions, which use the property of the near-field signal, the property that the de-mixing filter has for a signal close to the signal reception unit may be modeled by Equation (11).
  • the flatness expressed by Equation (6) is introduced, yielding the closeness constraints expressed by Equation (12) below.
  • Equation (12) becomes the closeness constraints for the extraction of the near field signal in the frequency domain according to the embodiment of the present invention.
  • for the learning rule of Equation (4), it is necessary to calculate the derivative of J_c.
  • the derivative of J_c may be expressed by Equation (13) below.
  • using Equation (13), the closeness constraints are added to the algorithm of Equation (4) for extracting a specific signal, thereby obtaining the learning rule for extraction of the near-field signal, expressed by Equation (14).
  • one coefficient is the weight of the learning rate and another is the weight of the closeness constraints; each is set to a specific value. If the two weights are too large, the learning rule expressed by Equation (14) diverges and learning may fail. If they are too small, learning takes a long time. In general, the largest value of each weight at which the learning rule still converges may be found by trial and error, and the weights may then be set to 70 to 80% of that largest value for stability.
  • J_Neg is the term that extracts one specific signal by using a negentropy function, and J_c indicates the closeness constraints.
  • the blind signal extraction algorithm may extract the closest signal to the signal reception unit 20 or 22.
  • the function J_c for the closeness constraints includes a time delay τ as a variable.
  • the blind signal extraction algorithm is executed with the value of the time delay τ fixed to zero (0) or to a specific value.
  • the time delay τ, which is the difference between the times at which the signals arrive at the plurality of signal reception units, need not be a fixed value; it may instead be adaptively calculated and updated. That is, the value of the time delay τ can be adaptively updated at each learning iteration of the blind signal extraction algorithm.
  • the blind signal extraction algorithm according to the embodiment of the present invention may converge more rapidly by adaptively updating the time delay τ at each learning iteration. That is, even though the value of the time delay τ is not input from outside, it is possible to predict the direction of the sound source close to the signal reception unit 20 or 22, and to learn efficiently. The amount of calculation can thereby be reduced.
  • FIG. 4 illustrates a geometric positional relationship between the sound source to which the blind signal extraction algorithm is applicable and the signal reception unit, according to the embodiment of the present invention.
  • a first microphone (microphone 1) and a second microphone (microphone 2) are depicted as the signal reception units, and a first sound source (sound source 1) is shown as a sound source.
  • the value of the time delay τ, which is the difference between the times at which the signals arrive at the plurality of signal reception units 20 and 22, is closely related to the direction in which the sound source is placed.
  • the maximum value of the time delay τ, which is the difference between the times at which the signals from the first sound source arrive at the first microphone and the second microphone, is determined by the distance between the two microphones.
  • a horizontal axis is defined as an x axis
  • a vertical axis is defined as a y axis.
  • the value of J_c, which is the closeness constraint, can be obtained depending on the value of the time delay τ.
  • the value of the function J_c can be obtained by stepping the time delay τ at a suitable resolution, e.g., at a time interval of 0.1 second.
  • these values of the time delay τ and the function J_c may be depicted in a graph.
  • a horizontal axis indicates the value of the time delay τ and a vertical axis shows the value of the function J_c, which is the closeness constraint, so that the relation between the two values may be indicated in the graph.
  • the learning objective of the blind signal extraction algorithm using the closeness constraints is to find a transfer function W of the de-mixing filter that maximizes the value of the function J_c.
  • the blind signal extraction algorithm may extract a specific signal, e.g., a signal having the largest intensity, from the signal reception unit 20 or 22.
  • the blind signal extraction algorithm may extract a signal of the sound source placed in the specific direction from the mixed signals.
  • the signal reception units 20 and 22 receive the mixed signals from the plurality of the sound sources 10 and 12. At this time, it is possible to extract the signal of the sound source placed in the specific direction from the signals received in the signal reception units 20 and 22.
  • Information on a desired direction may be used in order to extract the signal of the sound source 10 or 12 placed in the specific direction with relation to the arrangement of the signal reception units 20 and 22.
  • the direction information refers to a relative direction of the sound source to the signal reception unit 20 or 22.
  • the sound source may be regarded as located in a direction of 0 degrees.
  • the sound source may be regarded as located in a direction of +90 degrees or -90 degrees.
  • a transfer function from the sound source 10 or 12 to the signal reception unit 20 or 22, i.e., a transfer function of the mixing filter, may be modeled as Equation (15), depending on a position of the sound source 10 or 12 to the signal reception unit 20 or 22.
  • in Equation (15), τ_k1 is the time taken for a voice to travel from a sound source 1 to a signal reception unit k, l_k1 is the distance between the sound source 1 and the signal reception unit k, and v is the velocity of sound.
  • Information on the direction of the sound source, for example when two microphones are used, appears as a phase difference in the transfer function A from one sound source to each of the microphones. That is, the relation between the two transfer functions of the mixing filter for an m-th microphone and an n-th microphone is expressed as Equation (16).
  • by a calculation using Equation (15), the following Equation (17) can be obtained from Equation (8), which shows the relation between the de-mixing filter and the mixing filter.
  • the coefficients of the transfer function W of the de-mixing filter and the coefficients of the transfer function A of the mixing filter are each grouped together.
  • the coefficient of the mixing filter may be expressed as the phase difference.
  • the initialization of the transfer function W of the de-mixing filter can be expressed by Equation (18).
  • τ_m1 is the time taken for a signal to travel from the sound source 1 to the signal reception unit m.
  • this value τ_m1 is input according to the direction information of the sound source 1.
  • τ_m1 is a value of a time delay that arises because the distances over which the signals to be extracted travel from the sound source 1 to the signal reception units m and n are different.
  • the blind signal extraction algorithm using the direction information may be used along with a technique of adding the closeness constraints and/or a technique of adaptively updating the time delay τ. For example, when an error is present in the direction information of the sound source, the blind signal extraction algorithm can be applied while the time delay τ is updated so as to compensate for the wrong information on the direction of the sound source.
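
The relation between the direction of a sound source and the inter-microphone time delay described in the items above can be sketched as follows. This is only an illustration and does not reproduce Equations (15) to (18) of the publication. It assumes a far-field (plane-wave) source, two microphones, a speed of sound of 343 m/s, and an angle convention in which 0 degrees is broadside to the microphone pair. The names `time_delay` and `phase_factor` and the 0.10 m spacing are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def time_delay(angle_deg, mic_distance_m, v=SPEED_OF_SOUND):
    """Arrival-time difference (seconds) between two microphones.

    Assumes a far-field source; 0 degrees is broadside (equal distance to
    both microphones) and +/-90 degrees lies on the line through them.
    """
    return mic_distance_m * math.sin(math.radians(angle_deg)) / v


def phase_factor(freq_hz, tau_s):
    """Phase difference exp(-j*2*pi*f*tau) that a pure delay tau produces."""
    return cmath.exp(-2j * math.pi * freq_hz * tau_s)


mic_distance = 0.10  # metres (hypothetical spacing)
tau_max = mic_distance / SPEED_OF_SOUND  # largest possible delay magnitude

for angle in (-90, 0, 30, 90):
    tau = time_delay(angle, mic_distance)
    ratio = phase_factor(1000.0, tau)
    print(f"{angle:+4d} deg  tau = {tau * 1e6:8.2f} us  "
          f"phase at 1 kHz = {math.degrees(cmath.phase(ratio)):7.2f} deg")
```

Under these assumptions the delay magnitude never exceeds the microphone spacing divided by the speed of sound, which is consistent with the statement that the maximum delay is determined by the distance between the two microphones.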

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • General Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

Disclosed is a method of extracting a signal y from a sound source placed in a specific direction in a frequency domain. The method includes: receiving mixed signals through a signal reception unit from at least two sound sources including the sound source placed in the specific direction; and de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.

Description

BLIND SIGNAL EXTRACTION METHOD USING DIRECTION OF ARRIVAL INFORMATION AND DE-MIXING SYSTEM THEREFOR
This application claims the priority under 35 U.S.C. §119(a) to Korean Patent Application Serial No. 10-2013-0020917, which was filed in the Korean Intellectual Property Office on February 27, 2013, the entire content of which is hereby incorporated by reference.
The present invention relates to a blind signal extraction method using direction-of-arrival information and a de-mixing system therefor, and more particularly to a blind signal extraction algorithm for extracting, in a frequency domain, a signal of a sound source located in a specific direction from mixed signals received in a signal reception unit.
In the case of receiving a signal such as a voice, the signal may be a signal in which signals from two different sources are mixed. Accordingly, it is required that only a signal from a desired source is separated or extracted from the signal in which the signals from the two different sources are mixed. A blind signal separation (BSS) algorithm and a blind signal extraction (BSE) algorithm are known as methods of separating or extracting the signal of the desired source.
The BSS separates signals from at least two sources and separately acquires the signal from each source. However, the BSS may also separate out a signal from an undesired source, e.g., noise. Thus, there are problems in that the amount of calculation increases unnecessarily and the circuit structure becomes complex. Blind signal extraction, on the other hand, is a method of extracting only the signal of a desired source from the mixed signals. An algorithm for blind signal extraction in the time domain has already been proposed, but it requires a large amount of calculation and therefore a long calculation time.
For the following reason, an algorithm for blind signal extraction in the frequency domain has not been put to practical use. When blind signal separation is performed in the frequency domain, a permutation phenomenon occurs in which the signals separated in different frequency bins emerge in an inconsistent order and must be put into a consistent order. To remove the permutation phenomenon, the signals separated in the different frequency bins are paired with one another. In blind signal extraction, however, such pairing cannot be performed because only one signal is extracted.
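
The permutation phenomenon described above can be illustrated with a small numerical sketch. It is a toy example under stated assumptions, not the separation algorithm of any cited document: the mixing matrix of each frequency bin is taken as known, and the de-mixing matrix is its exact inverse, with the two output rows deliberately swapped in every other bin. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_frames = 4, 200

# Two independent complex-valued sources per frequency bin.
S = (rng.laplace(size=(n_bins, 2, n_frames))
     + 1j * rng.laplace(size=(n_bins, 2, n_frames)))
# One random 2x2 mixing matrix per bin, applied bin by bin.
A = rng.normal(size=(n_bins, 2, 2)) + 1j * rng.normal(size=(n_bins, 2, 2))
X = A @ S

swap = np.array([[0, 1], [1, 0]])
source_on_output0 = []
for f in range(n_bins):
    W = np.linalg.inv(A[f])
    if f % 2 == 1:
        W = swap @ W  # still a valid separator, but the outputs are exchanged
    Y = W @ X[f]
    source_on_output0.append(0 if np.allclose(Y[0], S[f, 0]) else 1)

print(source_on_output0)  # [0, 1, 0, 1]
```

Each bin is separated correctly on its own, yet output 0 carries source 0 in some bins and source 1 in others, so the bins cannot simply be concatenated into one spectrum without first aligning them.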
Therefore, a method of effectively extracting a desired signal in the frequency domain, in which a blind signal extraction algorithm is performed while not causing the permutation phenomenon, has been required.
Citation Literature
Patent Literature
Korean Patent Laid-Open Publication No. 10-2008-0019879, published on March 5, 2008.
U.S. Patent No. 7,797,153 B2, published on September 14, 2010.
The present invention has been made to solve the above-mentioned problem in the conventional art, and an aspect of the present invention is to provide a method of quickly extracting a signal of a sound source of a specific direction from a mixed signal through blind signal extraction in a frequency domain, while preventing a permutation phenomenon.
Technical problems which the present invention solves are not limited to the above-mentioned technical problems, and other technical problems which are not mentioned above may be understood by those skilled in the art through the description of the present invention.
In accordance with an aspect of the present invention, there is provided a method of extracting a signal y from a sound source placed in a specific direction in a frequency domain. The method includes: receiving mixed signals through a signal reception unit from at least two sound sources including the sound source placed in the specific direction; and de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.
According to the embodiment of the present invention, in the de-mixing of the received signals, a transfer function W of a de-mixing filter is initialized and calculated by using the direction information.
In accordance with another embodiment of the present invention, there is provided a method of extracting a signal y from a sound source placed in a specific direction in a frequency domain. The method includes: receiving mixed signals through at least two signal reception units from two or more sound sources; and de-mixing the received signals by using non-Gaussianity of the received signals, wherein in the de-mixing of the received signals, a learning is repeatedly performed in order to calculate a transfer function W of a de-mixing filter by using closeness constraints indicating flatness of a transfer function of a mixing filter in a frequency domain, and a value of a time delay when signals arrive from the two or more signal reception units is adaptively updated as the learning is repeatedly performed.
In accordance with another aspect of the present invention, there is provided a de-mixing system for extracting a signal y from a sound source placed in a specific direction in a frequency domain. The de-mixing system includes: a signal reception unit for receiving mixed signals from at least two sound sources including the sound source placed in the specific direction; and a de-mixing filter for de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.
According to the embodiment of the present invention, the de-mixing system for blind signal extraction by using direction information may further include a filter parameter calculating unit for initializing and calculating a transfer function W of the de-mixing filter by using the direction information.
In accordance with another embodiment of the present invention, there is provided a de-mixing system for extracting a signal y from a sound source placed in a specific direction in a frequency domain. The de-mixing system includes: two or more signal reception units for receiving mixed signals from two or more sound sources; a de-mixing filter for de-mixing the received signals by using non-Gaussianity of the received signals; and a filter parameter calculating unit for repeatedly performing a learning in order to calculate a transfer function W of the de-mixing filter by using closeness constraints indicating flatness of a transfer function of a mixing filter in a frequency domain, wherein a value of a time delay τ when signals arrive from the two or more signal reception units is adaptively updated as the learning is repeatedly performed.
According to the present invention, the method of extracting the signal of the sound source from the specific direction in the frequency domain, in which the blind signal extraction is performed in the frequency domain while preventing the permutation phenomenon, can be provided. According to the present invention, further, a convergence rate of the blind signal extraction algorithm can be improved, resulting in the reduction of the amount of the calculation.
The above and other aspects, features, and advantages of the present invention will be more apparent from the following detailed description taken in conjunction with the accompanying drawings, in which:
FIG. 1 is an exemplary view illustrating an environment of a de-mixing system in which a blind signal extraction algorithm is performed according to the embodiment of the present invention;
FIG. 2 is a view illustrating a room impulse response corresponding to a position of a sound source from a signal reception unit to seek closeness constraints according to the embodiment of the present invention;
FIG. 3 is a graph illustrating flatness of a signal in the frequency domain according to a distance between the sound source and the signal reception unit; and
FIG. 4 is a view illustrating a geographical position relation between the sound source to which the blind signal extraction algorithm is applicable, and the signal reception unit, according to the embodiment of the present invention.
Hereinafter, an embodiment of the present invention will be described in detail with reference to the accompanying drawings. In the drawings, shapes and sizes of structural elements may be exaggerated in order to clearly describe the structural elements. Also, it is noted that identical reference numerals and symbols denote the same structural elements throughout the drawings. In the following description of the present invention, a detailed description of known functions or configurations incorporated herein will be omitted when it may obscure the subject matter of the present invention.
FIG. 1 is an exemplary view illustrating an environment of a de-mixing system in which a blind signal extraction algorithm can be performed according to the embodiment of the present invention. As shown in FIG. 1, it may be considered that signals from at least two sound sources 10 and 12 are mixed and received in at least one signal reception unit 20 or 22. In FIG. 1, a room environment is illustrated as an example. Accordingly, the signals from the sound sources 10 and 12 arrive at the signal reception unit 20 or 22 not only through direct paths D11, D12, D21 and D22, but also through reflection paths R11, R12, R21 and R22 after being reflected in the room. The received signals of the sound sources may be input into a de-mixing system 30. The mixed and received signals of the sound sources can be separated through a de-mixing performed by the de-mixing system 30. Hereinafter, the de-mixing system 30 may be regarded as including the signal reception unit 20 or 22.
At this time, a state in which there is no information on a signal from a sound source or a mixing environment is referred to as a blind state. That is, the embodiment of the present invention provides an algorithm for extracting a signal received in the blind state.
A specific signal extraction from a mixed signal
Hereinafter, a blind signal extraction algorithm disclosed in U.S. Patent No. 7,797,153 B2 and usable in the embodiment of the present invention will be described. The blind signal extraction algorithm disclosed in U.S. Patent No. 7,797,153 B2 uses the fact that the signals being mixed are statistically independent of one another, while different frequency bins within one signal are statistically dependent. By using this blind signal extraction algorithm, it is possible to solve the permutation phenomenon during the separation of the mixed signals.
More particularly, Independent Component Analysis (ICA) is a Blind Signal Separation (BSS) algorithm that uses statistical independence between output signals. Frequency Domain ICA (FDICA) is used for convolutive BSS because convolutive mixtures in the time domain can be modeled as instantaneous mixtures in the frequency domain. This modeling makes the separation problem simple. The FDICA successfully separates the signal components of each frequency channel. However, a random permutation of the separated frequency components occurs between the frequency bins.
The blind signal extraction algorithm disclosed in U.S. Patent No. 7,797,153 B2 corresponds to a multivariate extension of the ICA, and resolves the permutation ambiguity by using the dependency between the frequency components. For a detailed description, refer to U.S. Patent No. 7,797,153 B2.
Firstly, the mixed signals received in the de-mixing system 30 may be expressed in the frequency domain through Short-Time Fourier Transform (STFT). A signal y extracted by a de-mixing filter (not shown) in the de-mixing system 30 should be identical to a signal from a sound source 10 or 12. Accordingly, when the signal from the initial sound source 10 or 12 is multiplied by a transfer function A (a transfer function for the mixing filter) for a path of the signal from the sound source, and additionally multiplied by a transfer function W of the de-mixing filter, the signal from the initial sound source 10 or 12 should be restored. This may be expressed by a matrix as follows.
Equation (1)
Figure PCTKR2014001630-appb-I000001
In Equation (1), Wij(Z) indicates a transfer function for an input j and an output i of the de-mixing filter (not shown) involved in the de-mixing system 30 in a z-domain, and Aij(Z) indicates a transfer function for a path from a source j to a signal reception unit i in the z-domain.
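As a rough numerical illustration of the relation that Equation (1) describes, the following sketch builds a hypothetical 2x2 mixing matrix for a single frequency bin and checks that a de-mixing matrix equal to its inverse restores the source values. The matrix entries, the source values, and the names `A`, `W`, `s`, `x` and `y` are illustrative assumptions and are not taken from the specification. In the blind state A is unknown, so W would have to be learned rather than computed by inversion as is done here.

```python
import numpy as np

# Hypothetical mixing matrix A for one frequency bin (2 sources, 2 reception units).
# A[i, j] plays the role of the transfer function from source j to reception unit i.
A = np.array([[1.0 + 0.0j, 0.4 - 0.3j],
              [0.5 + 0.2j, 1.0 + 0.0j]])

# Ideal de-mixing matrix for this bin: the inverse of A.
# A is assumed known here only for illustration.
W = np.linalg.inv(A)

s = np.array([0.7 - 0.1j, -0.2 + 0.9j])  # made-up source spectra at this bin
x = A @ s                                 # mixed signals at the reception units
y = W @ x                                 # de-mixed output

restored = np.allclose(y, s)              # True when the sources are recovered
```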
In the embodiment of the present invention, the extracted signal y may be expressed by Equation (2) using a multivariable probability density function.
Equation (2)
Figure PCTKR2014001630-appb-I000002
In Equation (2), yf is an output signal of an fth frequency, and y=[y1, ..., yf, ..., yK]. Further, K indicates the total number of frequency bins included in the signal y, and σf is a standard deviation of an absolute value of an fth frequency signal, which, for example, may be set to 1.
In order to de-mix the signal received in the de-mixing system 30, the de-mixing filter calculates the transfer function W for the de-mixing. For example, a parameter of the transfer function W may be obtained in a filter parameter calculation unit (not shown) in a manner described below.
According to the embodiment of the present invention, a negentropy function may be used as a cost function in order to maximize non-Gaussianity of the signal in the blind signal extraction algorithm disclosed in U.S. Patent 7,797,153 B2, as indicated below.
Equation (3)
Figure PCTKR2014001630-appb-I000003
In Equation (3), yG is a Gaussian signal having the same mean and variance as y. The following learning rule may be obtained in order to acquire an optimal extraction signal using the cost function expressed by Equation (3).
Equation (4)
Figure PCTKR2014001630-appb-I000004
In Equation (4), η indicates a learning rate. Differentiating Equation (3) yields Equation (5) below.
Equation (5)
Figure PCTKR2014001630-appb-I000005
In the embodiment of the present invention, the learning rule expressed by Equation (4) may be determined by using Equation (5). In Equation (5), xf indicates a signal of an fth frequency which is received in the signal reception unit 20 or 22 and input into the de-mixing system 30.
In the embodiment of the present invention, a filter parameter calculation unit (not shown) involved in the de-mixing system 30 may obtain a transfer function W for de-mixing a signal by using a cost function expressed by Equation (3). Then, the de-mixing filter performs a de-mixing for the signal received in the de-mixing system 30 by using the obtained transfer function W.
At this time, the filter parameter calculation unit may receive an output from the de-mixing filter, and repeatedly obtain a filter parameter according to the learning rule expressed by Equation (4) based on the output, so as to provide the obtained filter parameter to the de-mixing filter. Thereby, the de-mixing filter may adaptively operate. That is, the calculation is repeatedly performed according to the learning rule of Equation (4), thereby adaptively obtaining the transfer function W. Then, it is determined whether the transfer function W has converged, and if it has not converged, the previous step is performed again to calculate the transfer function W, so that the de-mixing is performed again.
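The repeat-until-converged procedure described above can be sketched as a generic gradient-ascent loop. This is a minimal sketch and not the learning rule of Equation (4) itself: the function name `learn_filter`, the tolerance `tol`, the iteration cap `max_iter`, and the toy cost used to exercise the loop are all assumptions made for illustration. In the embodiment, the gradient would come from the negentropy-based cost function.

```python
import numpy as np

def learn_filter(w0, gradient, eta=0.1, tol=1e-8, max_iter=10_000):
    """Repeat w <- w + eta * gradient(w) until the update is smaller than tol."""
    w = np.asarray(w0, dtype=complex)
    for iteration in range(1, max_iter + 1):
        w_new = w + eta * gradient(w)
        if np.linalg.norm(w_new - w) < tol:   # convergence check
            return w_new, iteration, True
        w = w_new                             # not converged: repeat the step
    return w, max_iter, False

# Toy stand-in cost J(w) = -||w - target||^2, with ascent direction -2 (w - target).
target = np.array([1.0 + 2.0j, -0.5j])
w, n_iter, converged = learn_filter(np.zeros(2), lambda w: -2.0 * (w - target))
```

The loop returns a flag rather than raising an error when the iteration cap is reached, so a caller can decide whether to lower the learning rate and try again.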
The signal y can be extracted by using the transfer function W obtained in an adaptive manner as described above. Then, this may be converted and expressed in the time domain if necessary. The signal y extracted by the above-mentioned algorithm may be a specific signal among the signals received through the signal reception unit 20 or 22. Generally, the extracted signal y is a signal with the strongest intensity among the signals received through the signal reception unit 20 or 22.
Signal extraction from a sound source nearest to a signal reception unit
As described above, as a specific condition is added to the blind signal extraction algorithm and then the signal y is extracted, a signal of a sound source nearest to the signal reception unit 20 or 22 may be extracted. Hereinafter, a blind signal extraction algorithm capable of extracting a near field signal according to the embodiment of the present invention will be described.
Firstly, a property of a signal from the sound source 10 or 12 placed near the signal reception unit 20 or 22 will be described.
FIG. 2 illustrates a room impulse response corresponding to a position of a sound source from a signal reception unit to seek closeness constraints according to the embodiment of the present invention. In FIG. 2, a sound source is indicated by "source", and the signal reception unit is depicted by "mic". In the graphs for the sources 1, 2, 3, 4 and 5 of FIG. 2, a transverse axis is a time axis indicating a time delay, and a longitudinal axis indicates a magnitude of a response.
As can be seen from the room impulse response for each source 1, 2, 3, 4, or 5 in FIG. 2, as a distance from the sound source to the signal reception unit (microphone) becomes shorter, a magnitude of a first component of the room impulse response rapidly increases in comparison with magnitudes of the other components. Further, as the distance from the sound source to the signal reception unit (microphone) becomes longer, the differences between the magnitude of the first component and the magnitudes of the other components of the room impulse response decrease.
As shown in FIG. 2, in the case that the distance between the sound source and the signal reception unit becomes extremely short, the room impulse response has a shape similar to that of a delta function. When a room impulse response similar to the delta function is expressed in the frequency domain, it takes a constant value at all frequencies. Accordingly, a flatness, defined so that its value increases as the room impulse response approaches such a constant value, is expressed by Equation (6).
Equation (6)
Figure PCTKR2014001630-appb-I000006
In Equation (6), aij f means a transfer function of an fth frequency domain on a path from a source j to a signal reception unit i.
FIG. 3 illustrates flatness of a signal in the frequency domain according to a distance between the sound source and the signal reception unit. In FIG. 3, the flatness of each signal is indicated by a different color according to a reflection extent of each signal in the room. In FIG. 3, a transverse axis is a distance axis indicating a distance between a signal source and a microphone, and a longitudinal axis is a magnitude axis indicating a magnitude of the flatness.
As shown in FIG. 3, as the source j becomes closer to the signal reception unit i, the magnitude of the transfer function in the frequency domain becomes constant. As a result, the value of the flatness defined by Equation (6) increases. The flatness is calculated by using the transfer function aij f of the path from the source to the signal reception unit.
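The relation between source distance and flatness can be illustrated numerically. The sketch below does not implement Equation (6); as a stand-in it uses the common spectral flatness measure (geometric mean divided by arithmetic mean of the magnitude spectrum). The synthetic impulse responses, the decay constant, and the names `spectral_flatness`, `near` and `far` are assumptions made for illustration only.

```python
import numpy as np

def spectral_flatness(impulse_response, n_fft=256):
    """Geometric mean over arithmetic mean of the magnitude spectrum (1.0 = flat)."""
    magnitude = np.abs(np.fft.rfft(impulse_response, n=n_fft)) + 1e-12
    return float(np.exp(np.mean(np.log(magnitude))) / np.mean(magnitude))

rng = np.random.default_rng(0)
n_taps = 128
decay = np.exp(-np.arange(n_taps) / 20.0)        # decaying envelope of reflections
reflections = 0.5 * rng.standard_normal(n_taps) * decay
reflections[0] = 0.0

near = 0.05 * reflections                        # near source: weak reflections
near[0] = 1.0                                    # dominant first (direct) component
far = reflections.copy()                         # far source: strong reflections
far[0] = 0.5                                     # direct component no longer dominant

flat_near = spectral_flatness(near)
flat_far = spectral_flatness(far)
```

With these synthetic responses the near-source response, which is close to a delta function, gives a flatness close to 1, while the far-source response gives a lower value.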
In order to extract the near field signal with the blind signal extraction algorithm according to the embodiment of the present invention, the property described above should be applied when the transfer function of the de-mixing filter is calculated. In the embodiment of the present invention, the blind signal extraction algorithm can calculate the transfer function W of the de-mixing filter by adding the closeness constraints to the learning rule expressed by Equation (4).
Hereinafter, the closeness constraints will be described.
Equation (1) described in this specification expresses blind signal extraction of 2x2 (signals from two sound sources, and two microphones). Blind signal extraction for N signals and N microphones may be expressed by Equation (7) below.
Equation (7)
Figure PCTKR2014001630-appb-I000007
In Equation (7), const. indicates a constant value. Accordingly, through Equation (7), a relation formula of the de-mixing filter and a mixing filter may be obtained as Equation (8).
Equation (8)
Figure PCTKR2014001630-appb-I000008
Equation (8) may be simplified by applying the following two assumptions for obtaining the closeness constraints. These two assumptions are consistent with the purpose of extracting the signal closest to the signal reception unit 20 or 22, and they allow the transfer function aij f corresponding to the mixing filter to be modeled simply.
1. A distance between two neighboring microphones is very short.
2. As a distance from the closest signal to two signal reception units is very short, all the room impulse responses which are transfer functions from a close signal to each of the signal reception units have a similar shape to the delta function, i.e., the first component is very large and the other components are very small. If the other components except for the first component are ignored, the room impulse responses ai1 f and ak1 f from the two signal reception units may be expressed as Equation (9) in which they have a difference corresponding to a time delay.
Equation (9)
Figure PCTKR2014001630-appb-I000009
In Equation (9), l means a time which is taken to transfer a sound from the closest sound source to the signal reception unit 1, L1 is a distance from the closest sound source to the signal reception unit 1, and v means a velocity of the sound. Accordingly, if this relation is substituted into Equation (8) with respect to a transfer function a11 f, Equation (10) may be obtained below.
Equation (10)
Figure PCTKR2014001630-appb-I000010
By using the above-mentioned assumption 2, Equation (10) may be more simply expressed. That is, the transfer function a11 f of the mixing filter may be treated as the constant because of a short distance to the source. Therefore, Equation (10) may be expressed as Equation (11) below.
Equation (11)
Figure PCTKR2014001630-appb-I000011
That is, a property of the de-mixing filter which the signal close to the signal reception unit has may be modeled by Equation (11) through the two assumptions using the property of the near field signal. In Equation (11), the flatness expressed by Equation (6) is introduced, resulting in an obtainment of the closeness constraints expressed by Equation (12) below.
Equation (12)
Figure PCTKR2014001630-appb-I000012
In Equation (12),
Figure PCTKR2014001630-appb-I000013
Jc obtained in Equation (12) becomes the closeness constraints for the extraction of the near field signal in the frequency domain according to the embodiment of the present invention. In order to apply the closeness constraints to the learning rule expressed by Equation (4), it is necessary to calculate a differential value of the Jc. The differential value of the Jc may be expressed by Equation (13) below.
Equation (13)
Figure PCTKR2014001630-appb-I000014
By using Equation (13), closeness constraints are added to the algorithm extracting a specific signal and expressed by Equation (4), thereby obtaining a learning rule for extraction of the near field signal, expressed by Equation (14).
Equation (14)
Figure PCTKR2014001630-appb-I000015
In Equation (14),
Figure PCTKR2014001630-appb-I000016
,
Figure PCTKR2014001630-appb-I000017
, and
Figure PCTKR2014001630-appb-I000018
.
Further,
Figure PCTKR2014001630-appb-I000019
, η is a weight of the learning rate and λ is a weight of the closeness constraints, each of which is set to a specific value. If the values of η and λ are too large, the learning rule expressed by Equation (14) diverges, so that the learning may fail. However, if these values are too small, the learning takes a long time. Generally, the largest values of η and λ for which the learning rule still converges may be found through several trials and errors. The values of η and λ may then be set to 70~80% of these largest values for stability. JNeg is a part extracting one specific signal by using a negentropy function, and Jc indicates closeness constraints.
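The trial-and-error selection described above can be sketched as follows. The helper `pick_learning_rate`, the candidate list, and the toy convergence test are hypothetical; the only element taken from the description is the idea of finding the largest value that still converges and then backing off to roughly 70~80% of it (0.75 is used here). The toy problem stands in for the learning rule of Equation (14), which is not reproduced.

```python
def pick_learning_rate(candidates, converges, safety=0.75):
    """Return safety * (largest candidate for which converges(candidate) is True)."""
    stable = [c for c in sorted(candidates) if converges(c)]
    if not stable:
        raise ValueError("no candidate learning rate converged")
    return safety * stable[-1]

def toy_converges(eta, n_iter=200):
    # Gradient descent on f(w) = w**2 (gradient 2w) converges only for 0 < eta < 1.
    w = 1.0
    for _ in range(n_iter):
        w = w - eta * 2.0 * w
    return abs(w) < 1e-6

eta = pick_learning_rate([0.01, 0.1, 0.5, 0.9, 1.1, 2.0], toy_converges)
```

In this toy setting the smallest candidate is rejected because it does not get close enough within the iteration budget, which mirrors the remark that values that are too small take a long time to learn.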
As described above, by using the transfer function W calculated by applying the learning rule including the closeness constraints, e.g., the learning rule expressed by Equation (14), the blind signal extraction algorithm according to the embodiment of the present invention may extract the closest signal to the signal reception unit 20 or 22.
The function Jc for the closeness constraints includes a time delay τ as a variable. The blind signal extraction algorithm may be executed in a state in which the value of the time delay τ is fixed to zero (0) or to a specific value. However, when the signal closest to the signal reception unit 20 or 22 is extracted through the blind signal extraction algorithm according to the embodiment of the present invention to which the closeness constraints are added, the time delay τ, which is the difference between the times at which the signals arrive at the plurality of the signal reception units, need not be used as a fixed value but may be adaptively calculated and updated. That is, the value of the time delay τ can be adaptively updated at each learning iteration of the blind signal extraction algorithm. By adaptively updating the time delay τ at each learning iteration, the blind signal extraction algorithm according to the embodiment of the present invention may converge more rapidly. That is, even though the value of the time delay τ is not input from outside, it is possible to predict a direction of the sound source close to the signal reception unit 20 or 22, and to learn efficiently. Thereby, the amount of the calculation can be reduced.
FIG. 4 illustrates a geographical position relation between the sound source to which the blind signal extraction algorithm is applicable, and the signal reception unit, according to the embodiment of the present invention. In FIG. 4, a first microphone (microphone 1) and a second microphone (microphone 2) are depicted as the signal reception units, and a first sound source (sound source 1) is shown as a sound source.
The value of the time delay τ, which is the difference between the times at which the signals arrive at the plurality of the signal reception units 20 and 22, is closely related to the direction in which the sound source is placed. In other words, in FIG. 4, a maximum value of the time delay τ, which is the difference between the times at which the signals from the first sound source arrive at the first microphone and the second microphone, is determined by the distance between the two microphones.
In FIG. 4, a transverse axis is defined as an x axis, and a longitudinal axis is defined as a y axis. If the first sound source is placed on the x axis, the value of the time delay τ, which is the difference between the times at which the signal from the first sound source arrives at the first microphone and the second microphone, becomes a maximum value. The value of the time delay τ at that time is referred to as T, which indicates a maximum time delay. If the first sound source is placed on the y axis, the value of the time delay τ becomes zero (0). Accordingly, the value of the time delay τ lies in a range of -T~T.
At this time, the value of Jc, which is the closeness constraint, can be obtained as a function of the value of the time delay τ. For example, the value of the function Jc can be obtained by sampling the value of the time delay τ at a suitable resolution, e.g., at a time interval of 0.1 second. These values of the time delay τ and the function Jc may be depicted in a graph. For example, a transverse axis indicates the value of the time delay τ and a longitudinal axis shows the value of the function Jc, which is the closeness constraint, so that the relation between the two values may be indicated in the graph.
A learning aim of the blind signal extraction algorithm using the closeness constraints is to seek a transfer function W of the de-mixing filter which enables the value of the function Jc to be a maximum value. Similarly, it is possible to seek the value of the time delay τ which enables the value of the function Jc to be a maximum value. As described above, by adaptively updating the time delay τ at each learning iteration of the blind signal extraction algorithm according to the embodiment of the present invention, to which the closeness constraints are added, the blind signal extraction algorithm can converge more rapidly. Thereby, the amount of the calculation can be reduced.
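Seeking the time delay that maximizes a criterion over the range -T~T can be sketched as a simple grid search. The function Jc of Equation (12) is not reproduced here; as a clearly different stand-in, the sketch scores each candidate delay by the magnitude of the delay-compensated cross-spectrum of two microphone signals under a delay-only model. The microphone spacing, sound velocity, frequency grid, and all names are assumptions for illustration.

```python
import numpy as np

def best_delay(x1, x2, freqs, candidates):
    """Return the candidate delay maximizing a stand-in criterion, plus all scores."""
    cross = x1 * np.conj(x2)                                  # cross-spectrum per bin
    steering = np.exp(-2j * np.pi * np.outer(candidates, freqs))
    scores = np.abs(steering @ cross)                         # one score per candidate
    return candidates[int(np.argmax(scores))], scores

rng = np.random.default_rng(1)
freqs = np.linspace(100.0, 4000.0, 64)                        # bin frequencies in Hz (assumed)
s = rng.standard_normal(64) + 1j * rng.standard_normal(64)    # made-up source spectrum
T = 0.1 / 343.0                                               # max delay: 0.1 m spacing, 343 m/s
tau_true = 0.4 * T
x1 = s
x2 = s * np.exp(-2j * np.pi * freqs * tau_true)               # second microphone lags by tau_true

candidates = np.linspace(-T, T, 201)                          # search range -T..T
tau_hat, scores = best_delay(x1, x2, freqs, candidates)
```

An adaptive scheme would repeat such an update inside the learning loop instead of scanning the whole range once; the grid search is shown only because it makes the "value of τ that maximizes the criterion" idea easy to inspect.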
Signal extraction from a sound source placed in a specific direction
In the case that the closeness constraints are not added, the blind signal extraction algorithm according to the embodiment of the present invention may extract a specific signal, e.g., a signal having the largest intensity, from the signal reception unit 20 or 22.
The blind signal extraction algorithm according to the embodiment of the present invention may extract a signal of the sound source placed in the specific direction from the mixed signals.
That is, the signal reception units 20 and 22 receive the mixed signals from the plurality of the sound sources 10 and 12. At this time, it is possible to extract the signal of the sound source placed in the specific direction from the signals received in the signal reception units 20 and 22.
Information on a desired direction may be used in order to extract the signal of the sound source 10 or 12 placed in the specific direction with relation to the arrangement of the signal reception units 20 and 22. Here, the direction information refers to a relative direction of the sound source to the signal reception unit 20 or 22. For example, if the sound source is placed on the y axis of FIG. 4, the sound source may be regarded as located in a direction of 0 degrees. Further, if the sound source is placed to the right or left of the first and second microphones on the x axis of FIG. 4, the sound source may be regarded as located in a direction of +90 degrees or -90 degrees.
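The angle convention above can be related to an inter-microphone time delay with a short sketch. It assumes a far-field source, so that the delay is d·sin(θ)/v for microphone spacing d and sound velocity v; this formula, the sign convention for the two sides of the x axis, the spacing of 0.1 m, and the velocity of 343 m/s are assumptions and are not stated in the specification.

```python
import math

def direction_to_delay(angle_deg, mic_spacing_m, sound_speed=343.0):
    """Arrival-time difference between two microphones for a far-field source."""
    return mic_spacing_m * math.sin(math.radians(angle_deg)) / sound_speed

d = 0.1                                   # microphone spacing in metres (illustrative)
T = d / 343.0                             # maximum delay, source on the microphone axis
tau_front = direction_to_delay(0.0, d)    # source on the y axis (0 degrees)
tau_right = direction_to_delay(90.0, d)   # source on the x axis, one side
tau_left = direction_to_delay(-90.0, d)   # source on the x axis, other side
```

Under these assumptions a source at 0 degrees gives zero delay, and sources at +90 and -90 degrees give delays of equal magnitude T and opposite sign.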
Hereinafter, a blind signal extraction algorithm capable of extracting a signal from a sound source placed in a specific direction according to the embodiment of the present invention will be described.
A transfer function from the sound source 10 or 12 to the signal reception unit 20 or 22, i.e., a transfer function of the mixing filter, may be modeled as Equation (15), depending on a position of the sound source 10 or 12 to the signal reception unit 20 or 22.
Equation (15)
Figure PCTKR2014001630-appb-I000020
In Equation (15), τk1 is a time spent to transfer a voice from a sound source 1 to a signal reception unit k, lk1 is a distance between the sound source 1 and the signal reception unit k, and v is a velocity of a sound. Information on the direction of the sound source, for example, when two microphones are used, is indicated as a phase difference in a transfer function A from one sound source to each of the microphones. That is, a relation formula of two transfer functions through the transfer function A of the mixing filter for an mth microphone and an nth microphone is expressed as Equation (16).
Equation (16)
Figure PCTKR2014001630-appb-I000021
If is calculated by using Equation (15), following Equation (17) can be obtained from Equation (8) showing a relation of the de-mixing filter and the mixing filter.
Equation (17)
Figure PCTKR2014001630-appb-I000022
As known from Equation (17), a coefficient of the transfer function W of the de-mixing filter and a coefficient of the transfer function A of the mixing filter are bundled respectively. Here, the coefficient of the mixing filter may be expressed as the phase difference. When the coefficient of the de-mixing filter is initialized, it is possible to compensate for the phase difference occurring in the mixing filter, based on the direction information. As described above, by initializing the transfer function W of the de-mixing filter, the signal from the sound source placed in the specific direction can be extracted according to the direction information.
The initialization of the transfer function W of the de-mixing filter can be expressed by Equation (18).
Equation (18)
Figure PCTKR2014001630-appb-I000023
Here, τm1 is a time spent to transfer a signal from the sound source 1 to the signal reception unit m. This value τm1 is a value input according to the direction information of the sound source 1. τm1 is a value of a time delay caused because distances over which signals to be extracted are transferred from the sound source 1 to the signal reception units m and n are different. When the blind signal extraction algorithm according to the embodiment of the present invention is applied to the transfer function W after the transfer function W of the de-mixing filter is initialized as indicated by Equation (18), the signal may be extracted from the sound source placed in a desired direction. For reference, in the case that the direction information is not provided, the transfer function W may be initialized into
Figure PCTKR2014001630-appb-I000024
in the blind signal extraction algorithm.
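The idea of initializing the de-mixing filter so that it compensates for the phase difference of the mixing filter can be sketched as follows. Equation (18) is not reproduced here; the sketch assumes a delay-only mixing model exp(-j2πfτ) for the target source and initial weights equal to the conjugate phase divided by the number of microphones. That form, the frequency grid, the delay values, and the names `init_demixing_row`, `a` and `W0` are assumptions for illustration.

```python
import numpy as np

def init_demixing_row(freqs, delays):
    """Phase-compensating initial de-mixing weights, shape (n_freqs, n_mics)."""
    delays = np.asarray(delays, dtype=float)
    return np.exp(2j * np.pi * np.outer(freqs, delays)) / delays.size

freqs = np.linspace(0.0, 8000.0, 129)               # bin frequencies in Hz (assumed)
delays = np.array([0.0, 1.2e-4])                    # per-microphone delays from the direction information
a = np.exp(-2j * np.pi * np.outer(freqs, delays))   # delay-only mixing model for the target source
W0 = init_demixing_row(freqs, delays)

response = np.sum(W0 * a, axis=1)                   # gain toward the target direction per bin
```

With this initialization the filter has unit gain toward the assumed direction at every frequency bin, so the subsequent learning starts from a point that already favors the desired source.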
The blind signal extraction algorithm using the direction information may be used along with a technique of adding the closeness constraints and/or a technique of adaptively updating the time delay τ. For example, when an error is present in the direction information of the sound source, the blind signal extraction algorithm can be applied while the time delay τ is updated so as to compensate for wrong information on the direction of the sound source.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, it will be understood that the present invention may be implemented in various embodiments without departing from the technical spirit and the scope of the present invention. Accordingly, it should be understood that the above-described embodiments are merely exemplary and not limiting, and it should be interpreted that the scope of the present invention is represented by the claims rather than the description, and the changes or modifications derived from the claims and the equivalents thereof pertain to the scope of the present invention.

Claims (12)

  1. A method of extracting a signal y from a sound source placed in a specific direction in a frequency domain, the method comprising:
    receiving mixed signals through a signal reception unit from two or more sound sources including the sound source placed in the specific direction; and
    de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.
  2. The method as claimed in claim 1, wherein in the de-mixing of the received signals, a transfer function W of a de-mixing filter is initialized and calculated by using the direction information.
  3. The method as claimed in claim 2, wherein the de-mixing of the received signals comprises:
    calculating a transfer function W of the de-mixing filter by using a cost function
    Figure PCTKR2014001630-appb-I000025
    , the transfer function being repeatedly calculated according to a learning rule
    Figure PCTKR2014001630-appb-I000026
    ; and
    de-mixing the received signals by using the transfer function,
    in which
    Figure PCTKR2014001630-appb-I000027
    , and η is a learning rate.
  4. A method of extracting a signal y from a sound source placed in a specific direction in a frequency domain, the method comprising:
    receiving mixed signals through two or more signal reception units from two or more sound sources; and
    de-mixing the received signals by using non-Gaussianity of the received signals,
    wherein in the de-mixing of the received signals, a learning is repeatedly performed in order to calculate a transfer function W of a de-mixing filter by using closeness constraints indicating flatness of a transfer function of a mixing filter in a frequency domain, and a value of a time delay when signals arrive from the two or more signal reception units is adaptively updated as the learning is repeatedly performed.
  5. The method as claimed in claim 4, wherein the de-mixing of the received signals comprises:
    calculating a transfer function W of the de-mixing filter by using a cost function
    Figure PCTKR2014001630-appb-I000028
    , the transfer function of the de-mixing filter being repeatedly calculated according to a learning rule
    Figure PCTKR2014001630-appb-I000029
    ; and
    de-mixing the received signals by using the transfer function of the de-mixing filter,
    in which
    Figure PCTKR2014001630-appb-I000030
    , Jc is the closeness constraint, λ is a learning rate, and is a weight of the closeness constraint.
  6. The method as claimed in claim 5, wherein the closeness constraint is expressed by
    Figure PCTKR2014001630-appb-I000031
    .
  7. A de-mixing system for extracting a signal y from a sound source placed in a specific direction in a frequency domain, the de-mixing system comprising:
    a signal reception unit for receiving mixed signals from at least two sound sources including the sound source placed in the specific direction; and
    a de-mixing filter for de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.
  8. The de-mixing system as claimed in claim 7, further comprising: a filter parameter calculating unit for initializing and calculating a transfer function W of the de-mixing filter by using the direction information.
  9. The de-mixing system as claimed in claim 8, wherein the filter parameter calculating unit calculates a transfer function W of the de-mixing filter by using a cost function
    Figure PCTKR2014001630-appb-I000032
    , the transfer function being repeatedly calculated according to a learning rule
    Figure PCTKR2014001630-appb-I000033
    , in which
    Figure PCTKR2014001630-appb-I000034
    , and η is a learning rate.
  10. A de-mixing system for extracting a signal y from a sound source placed in a specific direction in a frequency domain, the de-mixing system comprising:
    two or more signal reception units for receiving mixed signals from two or more sound sources;
    a de-mixing filter for de-mixing the received signals by using non-Gaussianity of the received signals; and
    a filter parameter calculating unit for repeatedly performing a learning in order to calculate a transfer function W of the de-mixing filter by using closeness constraints indicating flatness of a transfer function of a mixing filter in a frequency domain,
    wherein a value of a time delay τ when signals arrive from the two or more signal reception units is adaptively updated as the learning is repeatedly performed.
  11. The de-mixing system as claimed in claim 10, wherein the filter parameter calculating unit calculates a transfer function W of the de-mixing filter by using a cost function
    Figure PCTKR2014001630-appb-I000035
    , the transfer function being repeatedly calculated according to a learning rule
    Figure PCTKR2014001630-appb-I000036
    ,
    in which
    Figure PCTKR2014001630-appb-I000037
    , Jc is the closeness constraint, λ is a learning rate, and is a weight of the closeness constraint.
  12. The de-mixing system as claimed in claim 11, wherein the closeness constraint is expressed by .
PCT/KR2014/001630 2013-02-27 2014-02-27 Blind signal extraction method using direction of arrival information and de-mixing system therefor WO2014133338A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020130020917A KR101463955B1 (en) 2013-02-27 2013-02-27 Blind source extraction method using direction of arrival information and de-mixing system therefor
KR10-2013-0020917 2013-02-27

Publications (1)

Publication Number Publication Date
WO2014133338A1 true WO2014133338A1 (en) 2014-09-04

Family

ID=51428532

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2014/001630 WO2014133338A1 (en) 2013-02-27 2014-02-27 Blind signal extraction method using direction of arrival information and de-mixing system therefor

Country Status (2)

Country Link
KR (1) KR101463955B1 (en)
WO (1) WO2014133338A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101721424B1 (en) * 2015-12-31 2017-03-31 서강대학교산학협력단 Reverberation_robust multiple sound source localization based independent component analysis
CN106483502B (en) * 2016-09-23 2019-10-18 科大讯飞股份有限公司 A kind of sound localization method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20100066916A (en) * 2008-12-10 2010-06-18 한국전자통신연구원 Method for separating noise from audio signal
KR20110121955A (en) * 2010-05-03 2011-11-09 한국과학기술원 Method and apparatus for blind source extraction

Also Published As

Publication number Publication date
KR101463955B1 (en) 2014-11-21
KR20140106823A (en) 2014-09-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14757597

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14757597

Country of ref document: EP

Kind code of ref document: A1