WO2014133338A1

WO2014133338A1 - Blind signal extraction method using direction of arrival information and de-mixing system therefor

Info

Publication number: WO2014133338A1
Application number: PCT/KR2014/001630
Authority: WO
Inventors: Soo Young Lee; Choong Hwan Choi; Ruxin Chen; Jae Kwon Yoo
Original assignee: Korea Advanced Institute Of Science And Technology; Sony Computer Entertainment America Llc
Priority date: 2013-02-27
Filing date: 2014-02-27
Publication date: 2014-09-04
Also published as: KR101463955B1; KR20140106823A

Abstract

Disclosed is a method of extracting a signal y from a sound source placed in a specific direction in a frequency domain. The method includes: receiving mixed signals through a signal reception unit from at least two sound sources including the sound source placed in the specific direction; and de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.

Description

BLIND SIGNAL EXTRACTION METHOD USING DIRECTION OF ARRIVAL INFORMATION AND DE-MIXING SYSTEM THEREFOR

This application claims the priority under 35 U.S.C. §119(a) to Korean Patent Application Serial No. 10-2013-0020917, which was filed in the Korean Intellectual Property Office on February 27, 2013, the entire content of which is hereby incorporated by reference.

The present invention relates to a blind signal extraction method using direction of arrival information and a de-mixing system therefor, and more particularly to a blind signal extraction algorithm for extracting, in a frequency domain, a signal of a sound source located in a specific direction from mixed signals received in a signal reception unit.

A received signal such as a voice may be a mixture of signals from two different sources. Accordingly, only the signal from the desired source must be separated or extracted from the mixture. A blind signal separation (BSS) algorithm and a blind signal extraction (BSE) algorithm are known as methods of separating or extracting the signal of the desired source.

The BSS separates signals from at least two sources and separately acquires the signal from each source. However, the BSS may result in the separation of the signal, e.g., noise, from an undesired source. Thus, there are problems in that an amount of calculation unnecessarily increases, and a structure of a circuit is complex. On the other hand, the blind signal extraction is a method of extracting only a signal of a desired source from the mixed signals. An algorithm for blind signal extraction in a time domain has been already proposed, but has a disadvantage in that an amount of calculation significantly increases, resulting in prolongation of a calculating time.

Blind signal extraction in the frequency domain, however, has not been practical, for the following reason. When blind signal separation is performed in the frequency domain, a permutation phenomenon occurs: the signals separated in the different frequency bins are not consistently ordered, so they must be put in the same order across the bins. To remove the permutation phenomenon, the signals separated in the different frequency bins are paired with one another. In blind signal extraction, however, this pairing cannot be performed because only one signal is extracted.

Therefore, a method of effectively extracting a desired signal in the frequency domain, in which a blind signal extraction algorithm is performed while not causing the permutation phenomenon, has been required.

Citation Literature

Patent Literature

Korean Patent Laid-Open Publication No. 10-2008-0019879, published on March 5, 2008.

U.S. Patent No. 7,797,153 B2, published on September 14, 2010.

The present invention has been made to solve the above-mentioned problem in the conventional art, and an aspect of the present invention is to provide a method of quickly extracting a signal of a sound source of a specific direction from a mixed signal through blind signal extraction in a frequency domain, while preventing a permutation phenomenon.

Technical problems which the present invention solves are not limited to the above-mentioned technical problems, and other technical problems which are not mentioned above may be understood by those skilled in the art through the description of the present invention.

In accordance with an aspect of the present invention, there is provided a method of extracting a signal y from a sound source placed in a specific direction in a frequency domain. The method includes: receiving mixed signals through a signal reception unit from at least two sound sources including the sound source placed in the specific direction; and de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.

According to the embodiment of the present invention, in the de-mixing of the received signals, a transfer function W of a de-mixing filter is initialized and calculated by using the direction information.

In accordance with another embodiment of the present invention, there is provided a method of extracting a signal y from a sound source placed in a specific direction in a frequency domain. The method includes: receiving mixed signals through at least two signal reception units from two or more sound sources; and de-mixing the received signals by using non-Gaussianity of the received signals, wherein in the de-mixing of the received signals, a learning is repeatedly performed in order to calculate a transfer function W of a de-mixing filter by using closeness constraints indicating flatness of a transfer function of a mixing filter in a frequency domain, and a value of a time delay when signals arrive from the two or more signal reception units is adaptively updated as the learning is repeatedly performed.

In accordance with another aspect of the present invention, there is provided a de-mixing system for extracting a signal y from a sound source placed in a specific direction in a frequency domain. The de-mixing system includes: a signal reception unit for receiving mixed signals from at least two sound sources including the sound source placed in the specific direction; and a de-mixing filter for de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.

According to the embodiment of the present invention, the de-mixing system for blind signal extraction by using direction information may further include a filter parameter calculating unit for initializing and calculating a transfer function W of the de-mixing filter by using the direction information.

In accordance with another embodiment of the present invention, there is provided a de-mixing system for extracting a signal y from a sound source placed in a specific direction in a frequency domain. The de-mixing system includes: two or more signal reception units for receiving mixed signals from two or more sound sources; a de-mixing filter for de-mixing the received signals by using non-Gaussianity of the received signals; and a filter parameter calculating unit for repeatedly performing a learning in order to calculate a transfer function W of the de-mixing filter by using closeness constraints indicating flatness of a transfer function of a mixing filter in a frequency domain, wherein a value of a time delay τ between the arrivals of the signals at the two or more signal reception units is adaptively updated as the learning is repeatedly performed.

According to the present invention, the method of extracting the signal of the sound source from the specific direction in the frequency domain, in which the blind signal extraction is performed in the frequency domain while preventing the permutation phenomenon, can be provided. According to the present invention, further, a convergence rate of the blind signal extraction algorithm can be improved, resulting in the reduction of the amount of the calculation.

The above and other aspects, features, and advantages of the present invention will be more apparent from the following detailed description taken in conjunction with the accompanying drawings, in which:

FIG. 1 is an exemplary view illustrating an environment of a de-mixing system in which a blind signal extraction algorithm is performed according to the embodiment of the present invention;

FIG. 2 is a view illustrating a room impulse response corresponding to a position of a sound source from a signal reception unit to seek closeness constraints according to the embodiment of the present invention;

FIG. 3 is a graph illustrating flatness of a signal in the frequency domain according to a distance between the sound source and the signal reception unit; and

FIG. 4 is a view illustrating a geometric positional relationship between the sound source to which the blind signal extraction algorithm is applicable, and the signal reception unit, according to the embodiment of the present invention.

Hereinafter, an embodiment of the present invention will be described in detail with reference to the accompanying drawings. In the drawings, shapes and sizes of structural elements may be exaggerated in order to clearly describe the structural elements. Also, it is noted that identical reference numerals and symbols denote the same structural elements throughout the drawings. In the following description of the present invention, a detailed description of known functions or configurations incorporated herein will be omitted when it may make the subject matter of the present invention unclear.

FIG. 1 is an exemplary view illustrating an environment of a de-mixing system in which a blind signal extraction algorithm can be performed according to the embodiment of the present invention. As shown in FIG. 1, signals from at least two sound sources 10 and 12 are mixed and received in at least one signal reception unit 20 or 22. In FIG. 1, a room environment is illustrated as an example. Accordingly, the signals from the sound sources 10 and 12 arrive at the signal reception unit 20 or 22 not only through direct paths D11, D12, D21 and D22, but also through reflection paths R11, R12, R21 and R22 after being reflected in the room. The received signals of the sound sources may be input to a de-mixing system 30. The mixed and received signals of the sound sources can be separated through de-mixing performed by the de-mixing system 30. Hereinafter, the de-mixing system 30 may be regarded as including the signal reception unit 20 or 22.

At this time, a state in which there is no information on a signal from a sound source or a mixing environment is referred to as a blind state. That is, the embodiment of the present invention provides an algorithm for extracting a signal received in the blind state.

A specific signal extraction from a mixed signal

Hereinafter, a blind signal extraction algorithm disclosed in U.S. Patent No. 7,797,153 B2 and usable in the embodiment of the present invention will be described. The blind signal extraction algorithm disclosed in U.S. Patent No. 7,797,153 B2 uses the fact that the source signals that are mixed are statistically independent of one another, while the different frequency bins within one signal are statistically dependent. By using this blind signal extraction algorithm, the permutation phenomenon can be resolved during the separation of the mixed signals.

More particularly, Independent Component Analysis (ICA) is a Blind Signal Separation (BSS) algorithm that uses statistical independence between output signals. Frequency Domain ICA (FDICA) is used as a convolutive BSS algorithm because signals that are convolutively mixed in the time domain may be modeled as instantaneously mixed signals in the frequency domain. This modeling makes the separation problem simple. FDICA successfully separates the signal components within each frequency channel. However, a random permutation of the separated frequency components occurs between the frequency bins.
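The modeling step above rests on the convolution theorem. The following is a minimal sketch of that fact using NumPy; the signal `s`, the filter `h`, and their lengths are hypothetical values chosen only for illustration, and a full-length FFT is used here instead of an STFT for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.standard_normal(64)   # hypothetical source signal (time domain)
h = rng.standard_normal(8)    # hypothetical mixing-filter impulse response

# Time domain: convolutive mixing
x_time = np.convolve(s, h)    # length 64 + 8 - 1 = 71

# Frequency domain: zero-pad both signals to the full output length so that
# the circular convolution implied by the FFT equals the linear convolution
n = len(s) + len(h) - 1
x_freq = np.fft.ifft(np.fft.fft(s, n) * np.fft.fft(h, n)).real

# Per-frequency multiplication reproduces the time-domain convolution
matches = np.allclose(x_time, x_freq)
```

With an STFT the per-bin product is only an approximation, which is generally reasonable when the analysis window is long compared with the mixing filter.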

The blind signal extraction algorithm disclosed in U.S. Patent No. 7,797,153 B2 corresponds to a multivariate extension of ICA, and resolves the permutation ambiguity by using the dependency among the frequency components. For a detailed description, refer to U.S. Patent No. 7,797,153 B2.

First, the mixed signals received in the de-mixing system 30 may be expressed in the frequency domain through a Short-Time Fourier Transform (STFT). A signal y extracted by a de-mixing filter (not shown) in the de-mixing system 30 should be identical to the signal from a sound source 10 or 12. Accordingly, when the signal from the original sound source 10 or 12 is multiplied by a transfer function A (the transfer function of the mixing filter) for the path of the signal from the sound source, and additionally multiplied by a transfer function W of the de-mixing filter, the signal from the original sound source 10 or 12 should be restored. This may be expressed in matrix form as follows.

Equation (1)

In Equation (1), W_ij(Z) indicates a transfer function for an input j and an output i of the de-mixing filter (not shown) involved in the de-mixing system 30 in a z-domain, and A_ij(Z) indicates a transfer function for a path from a source j to a signal reception unit i in the z-domain.
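The relation between the mixing filter A and the de-mixing filter W can be illustrated numerically. The sketch below assumes an idealized 2x2 instantaneous mixture in each frequency bin with a known, invertible A, which is not available in the blind state; the array names, the number of bins, and the number of frames are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
K, n_frames = 4, 100   # hypothetical: 4 frequency bins, 100 STFT frames

# Two complex-valued source spectra per bin: shape (bin, source, frame)
S = rng.standard_normal((K, 2, n_frames)) + 1j * rng.standard_normal((K, 2, n_frames))

# Hypothetical mixing matrix per bin: shape (bin, microphone, source)
A = np.eye(2) + 0.3 * (rng.standard_normal((K, 2, 2)) + 1j * rng.standard_normal((K, 2, 2)))

X = A @ S              # mixed signals at the two signal reception units
W = np.linalg.inv(A)   # ideal de-mixing matrix, so that W A = I in every bin
Y = W @ X              # de-mixed signals

restored = np.allclose(Y, S)

# Extraction needs only one row of W per bin, here the row for source 1
y1 = W[:, :1, :] @ X
extracted = np.allclose(y1[:, 0, :], S[:, 0, :])
```

In practice A is unknown, so W has to be learned from the received signals alone, which is what the cost function and learning rule described next are for.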

In the embodiment of the present invention, the extracted signal y may be expressed by Equation (2) using a multivariable probability density function.

Equation (2)

In Equation (2), y^f is the output signal of the f^th frequency, and y=[y^1, ..., y^f, ..., y^K]. Further, K indicates the total number of frequency bins included in the signal y, and σ^f is a standard deviation of the absolute value of the f^th frequency signal, which may be set to 1, for example.
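For orientation only, a multivariate density of the kind commonly used in the independent vector analysis literature is shown below. It is an assumption about the general form, stated with the symbols defined above, and is not a reproduction of Equation (2).

```latex
p(\mathbf{y}) \;\propto\; \exp\!\left( -\sqrt{\sum_{f=1}^{K} \left| \frac{y^{f}}{\sigma^{f}} \right|^{2}} \right),
\qquad \mathbf{y} = [\, y^{1}, \dots, y^{K} \,]
```

Because all K frequency components appear under one square root, such a density does not factor into per-bin terms, which is one way to model the dependency between the frequency bins of a single source.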

In order to de-mix the signal received in the de-mixing system 30, the de-mixing filter calculates the transfer function W for the de-mixing. For example, a parameter of the transfer function W may be obtained in a filter parameter calculation unit (not shown) in a manner described below.

According to the embodiment of the present invention, a negentropy function may be used as a cost function in order to maximize non-Gaussianity of the signal in the blind signal extraction algorithm disclosed in U.S. Patent 7,797,153 B2, as indicated below.

Equation (3)

In Equation (3), y^G is a Gaussian signal having the same mean and variance as y. The following learning rule may be obtained in order to acquire an optimal extraction signal using the cost function expressed by Equation (3).

Equation (4)

In Equation (4), η indicates a learning rate. If Equation (3) is differentiated, Equation (5) below may be obtained.

Equation (5)

In the embodiment of the present invention, the learning rule expressed by Equation (4) may be determined by using Equation (5). In Equation (5), x^f indicates a signal of an f^th frequency which is received in the signal reception unit 20 or 22 and input into the de-mixing system 30.
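As a reading aid, the textbook definition of negentropy and a generic gradient-ascent update are given below. These are standard forms written with the symbols used in this description; H denotes differential entropy, which is a symbol introduced here for illustration, and the expressions are not reproductions of Equations (3) to (5).

```latex
J_{\mathrm{Neg}}(\mathbf{y}) = H(\mathbf{y}^{G}) - H(\mathbf{y}) \;\ge\; 0,
\qquad
\mathbf{W} \;\leftarrow\; \mathbf{W} + \eta \, \frac{\partial J_{\mathrm{Neg}}}{\partial \mathbf{W}}
```

The negentropy is non-negative because, among all signals with a given mean and variance, the Gaussian signal has the largest entropy. Maximizing it therefore drives the output y away from Gaussianity.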

In the embodiment of the present invention, a filter parameter calculation unit (not shown) involved in the de-mixing system 30 may obtain a transfer function W for de-mixing a signal by using a cost function expressed by Equation (3). Then, the de-mixing filter performs a de-mixing for the signal received in the de-mixing system 30 by using the obtained transfer function W.

At this time, the filter parameter calculation unit may receive an output from the de-mixing filter, and repeatedly obtain a filter parameter according to the learning rule expressed by Equation (4) based on the output, so as to provide the obtained filter parameter to the de-mixing filter. Thereby, the de-mixing filter may operate adaptively. That is, the calculation is repeatedly performed according to the learning rule of Equation (4), thereby adaptively obtaining the transfer function W. Then, it is determined whether the transfer function W has converged, and if it has not converged, the previous step is performed again to recalculate the transfer function W, so that the de-mixing is performed again.
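The repeat-until-converged structure described above can be sketched as a plain gradient-ascent loop. This is a minimal illustration under stated assumptions: the function `learn`, its parameters, and the toy objective are hypothetical, and the toy gradient stands in for the gradient of the actual cost function, which is not reproduced here.

```python
import numpy as np

def learn(w0, grad, eta=0.1, tol=1e-8, max_iter=10_000):
    """Gradient ascent: w <- w + eta * grad(w), repeated until the update is small."""
    w = np.asarray(w0, dtype=float)
    for it in range(1, max_iter + 1):
        step = eta * grad(w)
        w = w + step
        if np.linalg.norm(step) < tol:   # convergence check on the update size
            return w, it, True
    return w, max_iter, False

# Toy concave objective J(w) = -||w - target||^2, whose gradient is -2 (w - target)
target = np.array([1.0, -2.0])
w_opt, n_iter, converged = learn(np.zeros(2), lambda w: -2.0 * (w - target))
```

Testing the size of the update is only one possible convergence criterion; the change in the cost function between iterations could be used instead.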

The signal y can be extracted by using the transfer function W obtained in an adaptive manner as described above. Then, this may be converted and expressed in the time domain if necessary. The signal y extracted by the above-mentioned algorithm may be a specific signal among the signals received through the signal reception unit 20 or 22. Generally, the extracted signal y is the signal with the strongest intensity among the signals received through the signal reception unit 20 or 22.

Signal extraction from a sound source nearest to a signal reception unit

When a specific condition is added to the blind signal extraction algorithm described above and the signal y is then extracted, a signal of the sound source nearest to the signal reception unit 20 or 22 may be extracted. Hereinafter, a blind signal extraction algorithm capable of extracting a near field signal according to the embodiment of the present invention will be described.

First, a property of a signal from the sound source 10 or 12 placed near the signal reception unit 20 or 22 will be described.

FIG. 2 illustrates a room impulse response corresponding to a position of a sound source from a signal reception unit to seek closeness constraints according to the embodiment of the present invention. In FIG. 2, a sound source is indicated by "source", and the signal reception unit is indicated by "mic". In the graphs for the sources 1, 2, 3, 4 and 5 of FIG. 2, the horizontal axis is a time axis indicating a time delay, and the vertical axis indicates the magnitude of the response.

As can be seen from the room impulse response for each source 1, 2, 3, 4, or 5 in FIG. 2, as the distance from the sound source to the signal reception unit (microphone) becomes shorter, the magnitude of the first component of the room impulse response rapidly increases in comparison with the magnitudes of the other components. Further, as the distance from the sound source to the signal reception unit (microphone) becomes longer, the differences between the magnitude of the first component and the magnitudes of the other components of the room impulse response decrease.

As shown in FIG. 2, when the distance between the sound source and the signal reception unit becomes extremely short, the room impulse response has a shape similar to that of a delta function. When a room impulse response similar to the delta function is expressed in the frequency domain, its magnitude has a constant value at all frequencies. Accordingly, a flatness, which is defined so that its value increases as the frequency response of the room approaches a constant value, is expressed by Equation (6).

Equation (6)

In Equation (6), a_ij^f denotes the transfer function, at the f^th frequency bin, of the path from a source j to a signal reception unit i.
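The link between a delta-like impulse response and a flat spectrum can be checked numerically. The sketch below uses the ratio of the geometric mean to the arithmetic mean of the magnitude spectrum as a stand-in flatness measure; this measure, the function name `spectral_flatness`, and the synthetic impulse responses are assumptions for illustration and are not the flatness defined by Equation (6).

```python
import numpy as np

def spectral_flatness(h, n_fft=512):
    """Geometric mean over arithmetic mean of |H(f)|; 1 for a perfectly flat spectrum."""
    mag = np.abs(np.fft.rfft(h, n_fft)) + 1e-12   # small offset avoids log(0)
    return np.exp(np.mean(np.log(mag))) / np.mean(mag)

# Near source: impulse response close to a delta function
near = np.zeros(256)
near[0] = 1.0

# Far source: the same direct component plus a decaying random reflection tail
rng = np.random.default_rng(0)
far = near.copy()
far[1:] = 0.5 * rng.standard_normal(255) * np.exp(-np.arange(1, 256) / 60.0)

flat_near = spectral_flatness(near)
flat_far = spectral_flatness(far)
```

The near response gives a flatness of 1, while the reflection tail of the far response makes the magnitude spectrum uneven and lowers the value.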

FIG. 3 illustrates flatness of a signal in the frequency domain according to a distance between the sound source and the signal reception unit. In FIG. 3, the flatness of each signal is indicated by a different color according to the extent of reflection of each signal in the room. In FIG. 3, the horizontal axis is a distance axis indicating the distance between a signal source and a microphone, and the vertical axis is a magnitude axis indicating the magnitude of the flatness.

As shown in FIG. 3, as the source j becomes closer to the signal reception unit i, the magnitude of the transfer function in the frequency domain becomes more nearly constant. As a result, the value of the flatness defined by Equation (6) increases. The flatness is calculated by using the transfer function a_ij^f of the path from the source to the signal reception unit.

In order to extract the near field signal with the blind signal extraction algorithm according to the embodiment of the present invention, the property described above should be applied when the transfer function of the de-mixing filter is calculated. In the embodiment of the present invention, the blind signal extraction algorithm can calculate the transfer function W of the de-mixing filter by adding the closeness constraints to the learning rule expressed by Equation (4).

Hereinafter, the closeness constraints will be described.

Equation (1) described in this specification expresses blind signal extraction of 2x2 (signals from two sound sources, and two microphones). Blind signal extraction for N signals and N microphones may be expressed by Equation (7) below.

Equation (7)

In Equation (7), const. indicates a constant value. Accordingly, through Equation (7), a relation between the de-mixing filter and the mixing filter may be obtained as Equation (8).

Equation (8)

Equation (8) may be simplified by applying the following two assumptions for obtaining the closeness constraints. The following two assumptions are used without departing from the purpose of extracting the signal closest to the signal reception unit 20 or 22, thereby simply modeling the transfer function a_ij^f corresponding to the mixing filter.

1. The distance between two neighboring microphones is very short.

2. Because the distance from the closest sound source to the two signal reception units is very short, all the room impulse responses, which are the transfer functions from the close sound source to each of the signal reception units, have a shape similar to the delta function, i.e., the first component is very large and the other components are very small. If the components other than the first component are ignored, the room impulse responses a_i1^f and a_k1^f to the two signal reception units may be expressed as Equation (9), in which they differ only by a time delay.

Equation (9)

In Equation (9), τ_1 means the time taken to transfer a sound from the closest sound source to the signal reception unit 1, L₁ is the distance from the closest sound source to the signal reception unit 1, and v means the velocity of the sound. Accordingly, if this relation is substituted into Equation (8) with respect to the transfer function a₁₁^f, Equation (10) below may be obtained.

Equation (10)

By using the above-mentioned assumption 2, Equation (10) may be expressed more simply. That is, the transfer function a₁₁^f of the mixing filter may be treated as a constant because of the short distance to the source. Therefore, Equation (10) may be expressed as Equation (11) below.

Equation (11)

That is, the property that the de-mixing filter has for a signal close to the signal reception unit may be modeled by Equation (11) through the two assumptions using the property of the near field signal. Introducing the flatness expressed by Equation (6) into Equation (11) yields the closeness constraints expressed by Equation (12) below.

Equation (12)

In Equation (12),

J_c obtained in Equation (12) becomes the closeness constraints for the extraction of the near field signal in the frequency domain according to the embodiment of the present invention. In order to apply the closeness constraints to the learning rule expressed by Equation (4), it is necessary to calculate the derivative of J_c. The derivative of J_c may be expressed by Equation (13) below.

Equation (13)

By using Equation (13), closeness constraints are added to the algorithm extracting a specific signal and expressed by Equation (4), thereby obtaining a learning rule for extraction of the near field signal, expressed by Equation (14).

Equation (14)

In Equation (14), η is a weight of the learning rate and λ is a weight of the closeness constraints, each of which is set to a specific value. If the values of η and λ are too large, the learning rule expressed by Equation (14) diverges and learning may fail. However, if these values are too small, learning takes a long time. Generally, the largest values of η and λ for which the learning rule still converges may be found through trial and error, and the values of η and λ may then be set to about 70 to 80% of those largest values for stability. J_Neg is the part that extracts one specific signal by using the negentropy function, and J_c indicates the closeness constraints.

As described above, by using the transfer function W calculated by applying the learning rule including the closeness constraints, e.g., the learning rule expressed by Equation (14), the blind signal extraction algorithm according to the embodiment of the present invention may extract the signal closest to the signal reception unit 20 or 22.

The function J_c for the closeness constraints includes a time delay τ as a variable. The blind signal extraction algorithm may be executed with the value of the time delay τ fixed to zero (0) or to a specific value. However, when the signal closest to the signal reception unit 20 or 22 is extracted through the blind signal extraction algorithm according to the embodiment of the present invention to which the closeness constraints are added, the time delay τ, which is the difference between the times at which the signals arrive at the plurality of signal reception units, need not be fixed and may instead be adaptively calculated and updated. That is, the value of the time delay τ can be adaptively updated at each learning iteration of the blind signal extraction algorithm, which allows the algorithm to converge more rapidly. In other words, even though the value of the time delay τ is not input from outside, it is possible to predict the direction of the sound source close to the signal reception unit 20 or 22 and to learn efficiently. Thereby, the amount of calculation can be reduced.

FIG. 4 illustrates a geometric positional relationship between the sound source to which the blind signal extraction algorithm is applicable, and the signal reception unit, according to the embodiment of the present invention. In FIG. 4, a first microphone (microphone 1) and a second microphone (microphone 2) are depicted as the signal reception units, and a first sound source (sound source 1) is shown as a sound source.

The value of the time delay τ, which is the difference between the times at which the signals arrive at the plurality of signal reception units 20 and 22, is closely related to the direction in which the sound source is placed. In other words, in FIG. 4, the maximum value of the time delay τ, which is the difference between the times at which the signals from the first sound source arrive at the first microphone and the second microphone, is determined by the distance between the two microphones.

In FIG. 4, the horizontal axis is defined as an x axis, and the vertical axis is defined as a y axis. If the first sound source is placed on the x axis, the time delay τ, which is the difference between the times at which the signal from the first sound source arrives at the first microphone and the second microphone, takes its maximum value. The value of the time delay τ at that time is referred to as T, which indicates the maximum time delay. If the first sound source is placed on the y axis, the time delay τ becomes zero (0). Accordingly, the value of the time delay τ lies in the range from -T to T.
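The relation between direction and time delay can be sketched with a simple far-field model. The sketch assumes that the source is far enough away for the wavefront to be treated as planar, that the angle is measured from the y axis, and that the speed of sound is 343 m/s; the function name `time_delay` and the 5 cm microphone spacing are hypothetical.

```python
import numpy as np

def time_delay(theta_deg, mic_distance, v=343.0):
    """Far-field inter-microphone delay in seconds for a source at angle theta.

    theta_deg is measured from the y axis: 0 degrees is on the y axis,
    +90 or -90 degrees is on the x axis (the line through the microphones).
    """
    return mic_distance * np.sin(np.deg2rad(theta_deg)) / v

d = 0.05              # hypothetical microphone spacing in meters
T = d / 343.0         # maximum delay, reached when the source is on the x axis

tau_front = time_delay(0.0, d)     # source on the y axis
tau_right = time_delay(90.0, d)    # source on the x axis, one side
tau_left = time_delay(-90.0, d)    # source on the x axis, other side
```

Under these assumptions the delay is zero for a source on the y axis and reaches +T or -T for a source on the x axis, so every direction maps to a value in the range from -T to T.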

At this time, the value of J_c, which is the closeness constraint, can be obtained depending on the value of the time delay τ. For example, the value of the function J_c can be obtained at a suitable resolution of the value of the time delay τ, e.g., at a time interval of 0.1 second. These values of the time delay τ and the function J_c may be depicted in a graph. For example, the horizontal axis indicates the value of the time delay τ and the vertical axis shows the value of the function J_c, which is the closeness constraint, so that the relation between the two values may be indicated in the graph.

The learning aim of the blind signal extraction algorithm using the closeness constraints is to seek the transfer function W of the de-mixing filter that maximizes the value of the function J_c. Similarly, it is possible to seek the value of the time delay τ that maximizes the value of the function J_c. As described above, by adaptively updating the time delay τ at each learning iteration of the blind signal extraction algorithm according to the embodiment of the present invention, to which the closeness constraints are added, the algorithm can converge more rapidly. Thereby, the amount of calculation can be reduced.
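Seeking the value of τ that maximizes J_c can be sketched as a search over a grid in the range from -T to T. The function `best_delay`, its parameters, and the toy objective below are hypothetical; the toy objective is only a stand-in with a single peak and is not the function J_c of Equation (12).

```python
import numpy as np

def best_delay(objective, T, resolution):
    """Return the tau on a uniform grid over [-T, T] that maximizes objective."""
    n = int(round(2 * T / resolution)) + 1
    taus = np.linspace(-T, T, n)
    values = np.array([objective(t) for t in taus])
    return taus[np.argmax(values)], taus, values

# Toy stand-in for J_c with a single peak at tau = 0.2 * T
T = 1.0e-4
toy_jc = lambda tau: -(tau - 0.2 * T) ** 2
tau_hat, taus, values = best_delay(toy_jc, T, resolution=T / 50)
```

In an adaptive scheme such a search, or a gradient step on τ, would be repeated at each learning iteration with the current transfer function W held fixed.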

Signal extraction from a sound source placed in a specific direction

In the case that the closeness constraints are not added, the blind signal extraction algorithm according to the embodiment of the present invention may extract a specific signal, e.g., the signal having the largest intensity, from the signals received through the signal reception unit 20 or 22.

The blind signal extraction algorithm according to the embodiment of the present invention may extract a signal of the sound source placed in the specific direction from the mixed signals.

That is, the signal reception units 20 and 22 receive the mixed signals from the plurality of sound sources 10 and 12. At this time, it is possible to extract the signal of the sound source placed in the specific direction from the signals received in the signal reception units 20 and 22.

Information on a desired direction may be used in order to extract the signal of the sound source 10 or 12 placed in the specific direction in relation to the arrangement of the signal reception units 20 and 22. Here, the direction information refers to the direction of the sound source relative to the signal reception unit 20 or 22. For example, if the sound source is placed on the y axis of FIG. 4, the sound source may be regarded as located in a direction of 0 degrees. Further, if the sound source is placed to the right or left of the first and second microphones on the x axis of FIG. 4, the sound source may be regarded as located in a direction of +90 degrees or -90 degrees.

Hereinafter, a blind signal extraction algorithm capable of extracting a signal from a sound source placed in a specific direction according to the embodiment of the present invention will be described.

A transfer function from the sound source 10 or 12 to the signal reception unit 20 or 22, i.e., a transfer function of the mixing filter, may be modeled as Equation (15), depending on the position of the sound source 10 or 12 relative to the signal reception unit 20 or 22.

Equation (15)

In Equation (15), τ_k1 is a time spent to transfer a voice from a sound source 1 to a signal reception unit k, l_k1 is a distance between the sound source 1 and the signal reception unit k, and v is a velocity of a sound. Information on the direction of the sound source, for example, when two microphones are used, is indicated as a phase difference in a transfer function A from one sound source to each of the microphones. That is, a relation formula of two transfer functions through the transfer function A of the mixing filter for an m^th microphone and an n^th microphone is expressed as Equation (16).

Equation (16)
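For orientation, one plausible pure-delay form consistent with the definitions above is shown below. It assumes that only the direct path is modeled and that attenuation is ignored; ω_f, the angular frequency of the f^th bin, is a symbol introduced here for illustration. These expressions are not reproductions of Equations (15) and (16).

```latex
a_{k1}^{f} \approx e^{-j \omega_f \tau_{k1}}, \qquad \tau_{k1} = \frac{l_{k1}}{v},
\qquad
\frac{a_{m1}^{f}}{a_{n1}^{f}} \approx e^{-j \omega_f \left( \tau_{m1} - \tau_{n1} \right)}
```

Under this assumption the two transfer functions differ only in phase, and the phase difference grows linearly with frequency in proportion to the difference in arrival times.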

If is calculated by using Equation (15), following Equation (17) can be obtained from Equation (8) showing a relation of the de-mixing filter and the mixing filter.

Equation (17)

As known from Equation (17), a coefficient of the transfer function W of the de-mixing filter and a coefficient of the transfer function A of the mixing filter are bundled respectively. Here, the coefficient of the mixing filter may be expressed as the phase difference. When the coefficient of the de-mixing filter is initialized, it is possible to compensate for the phase difference occurring in the mixing filter, based on the direction information. As described above, by initializing the transfer function W of the de-mixing filter, the signal from the sound source placed in the specific direction can be extracted according to the direction information.

The initialization of the transfer function W of the de-mixing filter can be expressed by Equation (18).

Equation (18)

Here, τ_m1 is the time taken for a signal to travel from sound source 1 to signal reception unit m. This value is input according to the direction information of sound source 1. The time delay arises because the distances over which the signal to be extracted travels from sound source 1 to signal reception units m and n are different. When the blind signal extraction algorithm according to the embodiment of the present invention is applied to the transfer function W after the transfer function W of the de-mixing filter has been initialized as indicated by Equation (18), the signal may be extracted from the sound source placed in a desired direction. For reference, in the case that the direction information is not provided, the transfer function W may be initialized into

in the blind signal extraction algorithm.
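The idea of initializing W so that it compensates for the direction-dependent phase difference can be sketched as follows. This is not Equation (18) itself: it substitutes a delay-and-sum style initialization (the conjugate of an assumed pure-delay steering vector, divided by the number of microphones) under a far-field assumption. The function names, the 343 m/s speed of sound, the 10 cm spacing and the chosen angles are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for v


def relative_delays(theta_deg, mic_x, v=SPEED_OF_SOUND):
    """Far-field arrival delays, relative to the array centre, for microphones
    on the x axis. theta is measured from the y axis, positive towards +x."""
    s = math.sin(math.radians(theta_deg))
    return [-x * s / v for x in mic_x]


def steering(freq_hz, taus):
    """Assumed pure-delay mixing coefficients exp(-j * omega * tau) per microphone."""
    omega = 2.0 * math.pi * freq_hz
    return [cmath.exp(-1j * omega * t) for t in taus]


def init_demixing_row(freq_hz, theta_deg, mic_x):
    """Delay-and-sum style initial row of W for one frequency bin: the phase
    of each coefficient cancels the phase of the assumed mixing coefficient."""
    a = steering(freq_hz, relative_delays(theta_deg, mic_x))
    return [z.conjugate() / len(a) for z in a]


def apply_row(w, x):
    """Output y = sum_k w_k * x_k for one frequency bin."""
    return sum(wk * xk for wk, xk in zip(w, x))


mic_x = [-0.05, 0.05]   # hypothetical 10 cm microphone spacing
freq = 1000.0           # one frequency bin, in Hz

w0 = init_demixing_row(freq, 30.0, mic_x)  # direction information: +30 degrees

# Response of the initial filter to a unit source in the target direction
# and to a unit source at -60 degrees.
gain_target = apply_row(w0, steering(freq, relative_delays(30.0, mic_x)))
gain_interferer = apply_row(w0, steering(freq, relative_delays(-60.0, mic_x)))
print(abs(gain_target), abs(gain_interferer))
```

With this starting point the source in the given direction passes with unit gain while a source from another direction is attenuated, so a subsequent non-Gaussianity-based learning rule would begin from a filter that is already biased towards the desired source. In practice the initialization would be repeated for every frequency bin.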

The blind signal extraction algorithm using the direction information may be used together with a technique of adding the closeness constraints and/or a technique of adaptively updating the time delay τ. For example, when an error is present in the direction information of the sound source, the blind signal extraction algorithm can be applied while the time delay τ is updated so as to compensate for the incorrect information on the direction of the sound source.

Although the embodiments of the present invention have been described with reference to the accompanying drawings, it will be understood that the present invention may be implemented in various embodiments without departing from the technical spirit and the scope of the present invention. Accordingly, it should be understood that the above-described embodiments are merely exemplary and are not limiting, and that the scope of the present invention is represented by the claims rather than the description, and that changes or modifications derived from the claims and the equivalents thereof fall within the scope of the present invention.

Claims

A method of extracting a signal y from a sound source placed in a specific direction in a frequency domain, the method comprising:

receiving mixed signals through a signal reception unit from two or more sound sources including the sound source placed in the specific direction; and

de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.
The method as claimed in claim 1, wherein in the de-mixing of the received signals, a transfer function W of a de-mixing filter is initialized and calculated by using the direction information.
The method as claimed in claim 2, wherein the de-mixing of the received signals comprises:

calculating a transfer function W of the de-mixing filter by using a cost function
, the transfer function being repeatedly calculated according to a learning rule
; and

de-mixing the received signals by using the transfer function,

in which
, and η is a learning rate.
A method of extracting a signal y from a sound source placed in a specific direction in a frequency domain, the method comprising:

receiving mixed signals through two or more signal reception units from two or more sound sources; and

de-mixing the received signals by using non-Gaussianity of the received signals,

wherein in the de-mixing of the received signals, a learning is repeatedly performed in order to calculate a transfer function W of a de-mixing filter by using closeness constraints indicating flatness of a transfer function of a mixing filter in a frequency domain, and a value of a time delay when signals arrive from the two or more signal reception units is adaptively updated as the learning is repeatedly performed.
The method as claimed in claim 4, wherein the de-mixing of the received signals comprises:

calculating a transfer function W of the de-mixing filter by using a cost function
, the transfer function of the de-mixing filter being repeatedly calculated according to a learning rule
; and

de-mixing the received signals by using the transfer function of the de-mixing filter,

in which
, J_c is the closeness constraint, λ is a learning rate, and is a weight of the closeness constraint.
The method as claimed in claim 5, wherein the closeness constraint is expressed by
.
A de-mixing system for extracting a signal y from a sound source placed in a specific direction in a frequency domain, the de-mixing system comprising:

a signal reception unit for receiving mixed signals from at least two sound sources including the sound source placed in the specific direction; and

a de-mixing filter for de-mixing the received signals by using non-Gaussianity of the received signals and information on the direction.
The de-mixing system as claimed in claim 7, further comprising: a filter parameter calculating unit for initializing and calculating a transfer function W of the de-mixing filter by using the direction information.
The de-mixing system as claimed in claim 8, wherein the filter parameter calculating unit calculates a transfer function W of the de-mixing filter by using a cost function
, the transfer function being repeatedly calculated according to a learning rule
, in which
, and η is a learning rate.
A de-mixing system for extracting a signal y from a sound source placed in a specific direction in a frequency domain, the de-mixing system comprising:

two or more signal reception units for receiving mixed signals from two or more sound sources;

a de-mixing filter for de-mixing the received signals by using non-Gaussianity of the received signals; and

a filter parameter calculating unit for repeatedly performing a learning in order to calculate a transfer function W of the de-mixing filter by using closeness constraints indicating flatness of a transfer function of a mixing filter in a frequency domain,

wherein a value of a time delay τ when signals arrive from the two or more signal reception units is adaptively updated as the learning is repeatedly performed.
The de-mixing system as claimed in claim 10, wherein the filter parameter calculating unit calculates a transfer function W of the de-mixing filter by using a cost function
, the transfer function being repeatedly calculated according to a learning rule
,

in which
, J_c is the closeness constraint, λ is a learning rate, and is a weight of the closeness constraint.
The de-mixing system as claimed in claim 11, wherein the closeness constraint is expressed by .