CN108957401A

CN108957401A - Sound source number estimation method and device based on sparse representation

Info

Publication number: CN108957401A
Application number: CN201811030884.XA
Authority: CN
Inventors: 蒋灏; 张小博; 刘文娟
Original assignee: CETC 3 Research Institute
Current assignee: CETC 3 Research Institute
Priority date: 2018-09-05
Filing date: 2018-09-05
Publication date: 2018-12-07

Abstract

The present invention relates to a sound source number estimation method and device based on sparse representation. The method comprises: determining a global potential function based on the directions of the basis vectors; and computing the value of the potential function for all directions and counting the peaks of the potential function to obtain an estimate of the number of sound sources. The device comprises: a potential function determination module for determining the global potential function based on the directions of the basis vectors; and a sound source number estimation module for computing the value of the potential function for all directions and counting the peaks of the potential function to obtain an estimate of the number of sound sources. The invention achieves high estimation accuracy and remains effective when the number of sound sources is greater than the number of microphones.

Description

Sound source number estimation method and device based on sparse representation

Technical field

The invention belongs to the field of signal processing, and in particular relates to a sound source number estimation method and device based on sparse representation.

Background art

Microphone array signal processing, a branch of array signal processing, has attracted the interest of many researchers. As a combination of traditional speech signal processing and array signal processing, microphone array research can draw on the strengths of conventional arrays while also facing the difficulties peculiar to speech signal processing. It is well known that many classical algorithms in signal processing, such as DOA estimation and blind signal separation, require the number of signals to be known in advance. Likewise, in microphone array signal processing, enhancing a particular sound source signal, or separating a sound source of interest from interfering sources, requires an accurate description of the surrounding acoustic environment, including the number of sound sources and the estimation and tracking of source positions or DOAs. Accurate estimation of the number of sound sources is a prerequisite for DOA estimation and separation, because most existing algorithms assume that the number of sources is known a priori. This assumption is difficult to satisfy in real environments, and when the estimated number of sources differs from the actual number, the performance of subsequent DOA estimation and separation degrades significantly. Research on sound source number estimation therefore has broad application prospects and significant research value.

In narrowband signal processing, information-theoretic criteria have received wide attention because they are simple to implement and estimate well, and a large body of literature has been published. Methods for estimating the number of wideband signals are comparatively few; the current mainstream research directions are methods based on maximum likelihood and methods based on subspaces. Although maximum-likelihood methods are optimal in the statistical sense, they are computationally complex and difficult to apply in practice. Subspace-based methods can be divided into incoherent subspace methods and coherent subspace methods. Incoherent subspace methods generally transform the wideband signal into narrowband signals on multiple frequency bands, apply a narrowband source number estimation method in each band, and then average the per-band estimates in some way to obtain the final estimate of the number of sources. The drawback of such methods is that estimation performance degrades at low signal-to-noise ratio and coherent signals cannot be handled. Among coherent subspace methods, a classical approach first applies the short-time Fourier transform to the received signals, converting the convolutive mixture into instantaneous mixtures in the frequency domain, then applies a focusing transformation to the received signals in each frequency band, and finally estimates the number of sound source signals using an information-theoretic criterion. Because the frequency-domain received signals must be focused, the computational complexity of this algorithm is high. In addition, computing the focusing matrices requires a preliminary estimate of the directions of arrival of the source signals, so the performance of the algorithm is easily affected by the focusing matrices.
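As an illustration of the information-theoretic criteria mentioned above, the following Python sketch applies a minimum description length (MDL) rule to the eigenvalues of a narrowband sample covariance matrix. It is a textbook-style sketch under stated assumptions (complex snapshots, spatially white noise, more sensors than sources), not the specific method of any work discussed here; the function name `mdl_source_count` is hypothetical.

```python
import numpy as np

def mdl_source_count(X):
    """Estimate the number of narrowband sources from an M x T snapshot matrix.

    Assumes complex-valued snapshots with spatially white noise and fewer
    sources than sensors. Hypothetical helper for illustration only.
    """
    M, T = X.shape
    R = X @ X.conj().T / T                        # sample covariance matrix
    eig = np.sort(np.linalg.eigvalsh(R))[::-1]    # real eigenvalues, largest first
    scores = []
    for k in range(M):
        noise = eig[k:]                           # the M - k smallest eigenvalues
        geo = np.exp(np.mean(np.log(noise)))      # geometric mean
        arith = np.mean(noise)                    # arithmetic mean
        fit = -(M - k) * T * np.log(geo / arith)  # log-likelihood term
        penalty = 0.5 * k * (2 * M - k) * np.log(T)
        scores.append(fit + penalty)
    return int(np.argmin(scores))                 # k with the smallest MDL score
```

Because the candidate count k only ranges over 0 to M - 1, a rule of this kind cannot report more sources than sensors, which is one reason the underdetermined case calls for a different approach.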

Summary of the invention

To solve the above technical problems, the object of the present invention is to provide a sound source number estimation method and device based on sparse representation. The method achieves high estimation accuracy and remains effective when the number of sound sources is greater than the number of microphones (the underdetermined case).

The present invention provides a sound source number estimation method based on sparse representation, comprising:

determining a global potential function based on the directions of the basis vectors;

computing the value of the potential function for all directions, and counting the peaks of the potential function to obtain an estimate of the number of sound sources.

Further, the global potential function is:

where θ denotes the direction of a basis vector in the polar coordinate system; λ is the resolution adjustment parameter; T is the number of samples, t = 1, 2, …, T; r_t is the distance of the sample from the origin, used as a weight in the formula; f is the local basis function; θ_t is the absolute angle of each sample; and θ − θ_t is the local angle, i.e. the difference between the direction of the basis vector and the absolute angle of each sample.

The present invention also provides a sound source number estimation device based on sparse representation, comprising:

a potential function determination module for determining a global potential function based on the directions of the basis vectors;

a sound source number estimation module for computing the value of the potential function for all directions and counting the peaks of the potential function to obtain an estimate of the number of sound sources.

Further, the global potential function is:

Compared with the prior art, the beneficial effects of the present invention are high estimation accuracy and effectiveness even when the number of sound sources is greater than the number of microphones (the underdetermined case).

Brief description of the drawings

Fig. 1 is a flowchart of the sound source number estimation method based on sparse representation of the present invention;

Fig. 2 is a structural block diagram of the sound source number estimation device based on sparse representation of the present invention;

Fig. 3 shows the estimation performance for Experiment 1 of the simulation experiments;

Fig. 4 shows the estimation performance for Experiment 2 of the simulation experiments;

Fig. 5 shows the estimation performance for Experiment 3 of the simulation experiments;

Fig. 6 shows the estimation performance for Experiment 4 of the simulation experiments.

Specific embodiments

The present invention is described in detail below with reference to the embodiments shown in the accompanying drawings. It should be noted that these embodiments do not limit the present invention; equivalent transformations or substitutions in function, method, or structure made by those of ordinary skill in the art according to these embodiments all fall within the scope of protection of the present invention.

Referring to Fig. 1, this embodiment provides a sound source number estimation method based on sparse representation, comprising:

Step S1: determining a global potential function based on the directions of the basis vectors;

Step S2: computing the value of the potential function for all directions, and counting the peaks of the potential function to obtain an estimate of the number of sound sources.

The method achieves high estimation accuracy and remains effective when the number of sound sources is greater than the number of microphones (the underdetermined case).

Referring to Fig. 2, this embodiment provides a sound source number estimation device based on sparse representation, comprising:

a potential function determination module 10 for determining a global potential function based on the directions of the basis vectors;

a sound source number estimation module 20 for computing the value of the potential function for all directions and counting the peaks of the potential function to obtain an estimate of the number of sound sources.

The device achieves high estimation accuracy and remains effective when the number of sound sources is greater than the number of microphones (the underdetermined case).

The invention is described in further detail below.

Assume that N sound signals are incident on a microphone array composed of M microphones. The signal received by the microphone array at time t is then:

x(t) = Σ_{i=1}^{N} a_i s_i(t), t = 1, 2, …, T (1)

where N is the number of source signals and T is the number of samples. Writing the data of the T samples in matrix form gives

X = AS (2)

where X = [x(1), x(2), …, x(T)] denotes the received data matrix, A = [a₁, a₂, …, a_N] denotes the mixing matrix, and S = [s(1), s(2), …, s(T)] denotes the source signal data matrix.

The goal of the algorithm in this embodiment is to estimate the number of source signals when both the mixing matrix A and the source signals S are unknown.

The algorithm in this embodiment relies mainly on the sparsity assumption for the source signals, namely that among the values s_i(t), i = 1, 2, …, N; t = 1, 2, …, T, only a small number are nonzero. For the case M = 2, suppose that only the i-th source signal is nonzero. Then every x(t) is proportional to a_i, and all received data points concentrate along this direction.

Therefore, if the signals are sparse, then at a given data point one signal is large while the others are close to zero. The distribution density of the data points has two features: first, the farther from the origin, the sparser the data points; second, the data points cluster along the directions of the basis vectors a_i. Using these features, the number of source signals can be estimated by finding the directions in which the data are dense.
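The clustering of data points along the basis vector directions can be illustrated with a small synthetic example in Python. This is only a sketch under idealized assumptions: the mixing angles, the Laplace-distributed amplitudes, and the rule that exactly one source is active per sample are all hypothetical choices made for the demonstration, not values from the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

M, N, T = 2, 3, 1000   # microphones, sources, samples (illustrative values)

# Hypothetical mixing matrix: column i is the basis vector a_i at a chosen angle.
angles = np.deg2rad([-60.0, 0.0, 50.0])
A = np.vstack([np.cos(angles), np.sin(angles)])   # shape (M, N)

# Idealized sparse sources: at each sample exactly one source is nonzero.
S = np.zeros((N, T))
active = rng.integers(0, N, size=T)               # index of the active source
S[active, np.arange(T)] = rng.laplace(size=T)

X = A @ S                                         # equation (2): X = AS

# When only source i is active, x(t) is a scalar multiple of a_i, so the
# 2-D cross product between x(t) and a_i is zero up to rounding error.
cross = X[0] * A[1, active] - X[1] * A[0, active]
```

In this idealized setting every column of `X` lies exactly on one of three lines through the origin, even though there are more sources than microphones. With real speech the sparsity is only approximate, so the points form dense clusters around those lines rather than falling exactly on them.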

This embodiment considers an array of M = 2 microphones. The mixed-signal space is then a plane, and the direction of a basis vector a_i can be represented by an angle θ in the polar coordinate system. Each data point is expressed in polar form as

r_t = √(x₁(t)² + x₂(t)²) (3)

θ_t = arctan(x₂(t)/x₁(t)) (4)

where r_t and θ_t are the radius and angle of the data point x(t), and x₁(t) and x₂(t) are the first and second observed signals, respectively. The local basis function is constructed as follows:

The global potential function is:

where θ denotes the direction of a basis vector in the polar coordinate system. λ is the resolution adjustment parameter, which adjusts the resolution of the basis function: the larger its value, the narrower the basis function and the sharper the peaks of the resulting potential function. θ_t is the absolute angle of each sample, and θ − θ_t is the local angle, i.e. the difference between the direction of the basis vector and the absolute angle of each sample. f is the local basis function. T is the number of samples, t = 1, 2, …, T. r_t is the distance of the sample from the origin and is used as a weight in the formula: data points farther from the origin receive larger weights and data points closer to the origin receive smaller weights. Intuitively, points far from the origin show a stronger clustering tendency than points near the origin.
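A form of the global potential function that is consistent with the symbol definitions above can be written as follows. It is a reconstruction offered as an assumption, not a quotation of the patent's formula. The symbol Φ is introduced here for convenience, and the triangular shape shown for f is only one illustrative choice of local basis function.

```latex
% Assumed form: a weighted sum of local basis functions over all samples
\Phi(\theta;\lambda) = \sum_{t=1}^{T} r_t \, f\bigl(\lambda(\theta - \theta_t)\bigr)

% One illustrative (assumed) local basis function: a triangular kernel
f(\alpha) =
\begin{cases}
1 - \dfrac{|\alpha|}{\pi/4}, & |\alpha| < \pi/4 \\[4pt]
0, & \text{otherwise}
\end{cases}
```

With this form, each sample contributes only to directions θ within π/(4λ) of its own angle θ_t, so a larger λ narrows each contribution and sharpens the peaks, in line with the description of λ above.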

Finally, the value of the potential function is computed for all possible directions. Each local maximum corresponds to one sound source signal, so counting the peaks of the potential function yields the estimate of the number of sources.
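The peak-counting procedure can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the patent's implementation: the triangular `local_basis`, the grid size `n_grid`, the default `lam`, and the relative threshold `rel_threshold` used to ignore negligible peaks are all hypothetical choices, as are the function names.

```python
import numpy as np

def to_polar(X):
    """Radius r_t and direction theta_t of each column x(t) of a 2 x T array."""
    r = np.hypot(X[0], X[1])
    # arctan2 avoids division by zero; folding modulo pi keeps the line
    # direction of arctan(x2/x1), since x(t) and -x(t) lie on the same line.
    theta = (np.arctan2(X[1], X[0]) + np.pi / 2) % np.pi - np.pi / 2
    return r, theta

def local_basis(alpha):
    """Assumed triangular kernel: 1 at alpha = 0, linear decay to 0 at |alpha| = pi/4."""
    return np.clip(1.0 - np.abs(alpha) / (np.pi / 4), 0.0, None)

def estimate_source_count(X, lam=10.0, n_grid=360, rel_threshold=0.1):
    """Count peaks of an assumed potential function over a grid of directions."""
    r, theta_t = to_polar(X)
    grid = np.linspace(-np.pi / 2, np.pi / 2, n_grid, endpoint=False)
    # Angular differences wrapped to [-pi/2, pi/2): directions are pi-periodic.
    diff = (grid[:, None] - theta_t[None, :] + np.pi / 2) % np.pi - np.pi / 2
    # Radius-weighted sum of local basis functions for every grid direction.
    phi = (r[None, :] * local_basis(lam * diff)).sum(axis=1)
    # A grid point is a peak if it exceeds its left circular neighbour, is not
    # below its right one, and is above a fraction of the global maximum.
    left, right = np.roll(phi, 1), np.roll(phi, -1)
    peaks = (phi > left) & (phi >= right) & (phi > rel_threshold * phi.max())
    return int(peaks.sum()), grid[peaks], phi
```

Calling `estimate_source_count(X)` on a 2 x T array of two-microphone samples returns the peak count, the peak directions in radians, and the sampled potential function. On real recordings the potential function is noisy, so some smoothing or a more careful threshold would likely be needed; the fixed relative threshold here is only a placeholder.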

The effectiveness of the algorithm is verified below through simulation experiments.

These experiments use several groups of real data to compare the estimation accuracy of the proposed algorithm with that of conventional wideband source number estimation algorithms (combined with k-means clustering). All data were received with two microphones, with microphone spacings of 0.07 m and 0.12 m. The number of sound sources is greater than or equal to 2.

Experiment 1: The first group of data was collected in an anechoic room and received by two microphones 0.12 m apart. The sound source signals are a male voice 2.3 m from the microphone array, a song 2.5 m from the array, and a female voice 4 m from the array. Fig. 3(a) is the spectrogram of the received signal, and Fig. 3(b) shows the cost function as a function of θ. Table 1 compares the estimation performance of the proposed algorithm, the incoherent subspace method, and the coherent subspace method.

Table 1: Algorithm performance comparison

Algorithm	Actual number of signals	Estimated number of signals
Proposed algorithm	3	3
Incoherent subspace method	3	4
Coherent subspace method	3	2

Experiment 2: The second group of data was collected in an anechoic room and received by two microphones 0.12 m apart. The sound source signals are a male voice 1 m from the microphone array and a female voice 3 m from the array. Fig. 4(a) is the spectrogram of the received signal, and Fig. 4(b) shows the cost function as a function of θ. Table 2 compares the estimation performance of the proposed algorithm, the incoherent subspace method, and the coherent subspace method.

Table 2: Algorithm performance comparison

Algorithm	Actual number of signals	Estimated number of signals
Proposed algorithm	2	2
Incoherent subspace method	2	2
Coherent subspace method	2	2

Experiment 3: The third group of data was collected in an anechoic room and received by two microphones 0.12 m apart. The sound source signal is a female voice 2 m from the microphone array. Fig. 5(a) is the spectrogram of the received signal, and Fig. 5(b) shows the cost function as a function of θ. Table 3 compares the estimation performance of the proposed algorithm, the incoherent subspace method, and the coherent subspace method.

Table 3: Algorithm performance comparison

Algorithm	Actual number of signals	Estimated number of signals
Proposed algorithm	1	1
Incoherent subspace method	1	2
Coherent subspace method	1	1

Experiment 4: The fourth group of data was collected in an indoor environment and received by two microphones 0.07 m apart. The sound source signals are a male voice 1 m from the microphone array and another male voice 3 m from the array. Fig. 6(a) is the spectrogram of the received signal, and Fig. 6(b) shows the cost function as a function of θ. Table 4 compares the estimation performance of the proposed algorithm, the incoherent subspace method, and the coherent subspace method.

Table 4: Algorithm performance comparison

The comparison shows that the proposed algorithm estimates the number of signal sources effectively in the overdetermined, determined, and underdetermined cases, whereas the coherent subspace method and the incoherent subspace method work only in the determined case and fail completely in the underdetermined case.

The detailed descriptions listed above are only specific illustrations of feasible embodiments of the present invention and are not intended to limit its scope of protection. All equivalent embodiments or modifications that do not depart from the technical spirit of the present invention shall fall within its scope of protection.

It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be realized in other specific forms without departing from its spirit or essential attributes. The embodiments are therefore to be regarded in every respect as illustrative and not restrictive. The scope of the invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced by the invention.

Claims

1. A sound source number estimation method based on sparse representation, characterized by comprising:

determining a global potential function based on the directions of the basis vectors;

computing the value of the potential function for all directions, and counting the peaks of the potential function to obtain an estimate of the number of sound sources.

2. The sound source number estimation method based on sparse representation according to claim 1, characterized in that the global potential function is:

where θ denotes the direction of a basis vector in the polar coordinate system; λ is the resolution adjustment parameter; T is the number of samples, t = 1, 2, …, T; r_t is the distance of the sample from the origin, used as a weight in the formula; f is the local basis function; θ_t is the absolute angle of each sample; and θ − θ_t is the local angle, i.e. the difference between the direction of the basis vector and the absolute angle of each sample.

3. A sound source number estimation device based on sparse representation, characterized by comprising:

a potential function determination module for determining a global potential function based on the directions of the basis vectors;

a sound source number estimation module for computing the value of the potential function for all directions and counting the peaks of the potential function to obtain an estimate of the number of sound sources.

4. The sound source number estimation device based on sparse representation according to claim 3, characterized in that the global potential function is: