CN106526541A - Sound positioning method based on distribution matrix decision - Google Patents

Sound positioning method based on distribution matrix decision

Info

Publication number
CN106526541A
CN106526541A (application CN201610893331.1A)
Authority
CN
China
Prior art keywords
positioning
sound
signal
distribution matrix
frame
Prior art date
Legal status
Granted
Application number
CN201610893331.1A
Other languages
Chinese (zh)
Other versions
CN106526541B (en)
Inventor
王建中
叶凯
曹九稳
薛安克
王天磊
Current Assignee
Hangzhou Dianzi University
Hangzhou Electronic Science and Technology University
Original Assignee
Hangzhou Electronic Science and Technology University
Priority date
Filing date
Publication date
Application filed by Hangzhou Electronic Science and Technology University
Priority: CN201610893331.1A
Publication of CN106526541A
Application granted
Publication of CN106526541B
Status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)

Abstract

The invention discloses a sound localization method based on a distribution-matrix decision. The method comprises the following steps: 1) preprocessing, including framing, the multi-channel sound signals collected by an acoustic array; 2) running a sound recognition algorithm on the single-channel data, each frame yielding one recognition result; 3) performing wideband sound localization on the multi-channel data, each frame yielding one localization result; 4) constructing a distribution matrix whose rows and columns are indexed by the recognition and localization result sets obtained in the preceding steps; 5) once the distribution matrix is obtained, finding the localization distribution peak of the target sound source; and 6) selecting the peak and its two adjacent angle intervals and computing the statistical mean over these three intervals. The method improves the accuracy of sound localization, an effect that is especially pronounced under strong interference and complex background environments; the localization result depends only weakly on the recognition results; and the method is broadly applicable.

Description

Sound localization method based on distribution matrix decision-making
Technical field
The invention belongs to the field of signal processing, and more particularly relates to a sound localization method based on distribution-matrix decision-making.
Background technology
Traditional sound localization algorithms suffer from the following problems:
1. Poor anti-jamming capability. Indoors, in quiet and noise-free conditions, localization accuracy is high; but outdoors, in complex environments, any noise or interference severely degrades the localization result.
2. In sound signal processing, recognition and localization algorithms are closely related and complement each other, yet conventional localization algorithms do not exploit this, lacking the advantages of information fusion.
The content of the invention
To address the problems above, the invention provides a sound localization method based on distribution-matrix decision-making, illustrated below using a cross-shaped acoustic array as an example.
To achieve these goals, the technical solution adopted by the present invention comprises the following steps:
Step 1, preprocess the four-channel acoustic signals collected by the acoustic array; preprocessing includes framing.
Step 2, perform sound recognition on the single-channel data.
Step 3, perform wideband sound localization on the multi-channel data.
Step 4, build the distribution matrix from the recognition and localization result sets obtained in steps 2 and 3.
Step 5, after obtaining the distribution matrix, find the localization distribution peak of the target sound source.
Step 6, select the peak and its two adjacent angle intervals, and compute the statistical mean over these three intervals as the final localization result.
Step 1 in detail: the live sound signal is acquired with a cross acoustic array; denote the sampling frequency by f_s. The four-channel acoustic signal is split into frames; assume the number of frames after framing is m. Each frame is then processed as follows.
Step 2 in detail: each single-channel frame after framing is passed to the recognizer.
The algorithm recognizing the single-channel signal is the LPCC+SVM algorithm.
Each frame yields one recognition result, forming a recognition result array C of length m:
C = [c(1) c(2) ... c(m)]
Step 3 in detail: each four-channel frame after framing is passed to the wideband localization algorithm.
The wideband localization algorithm applied to the four-channel signal is the wideband MUSIC algorithm.
3-1. Select the frequency band and the center frequency f_0 as needed; both must be chosen according to the frequency characteristics of the actual target signal.
3-2. Apply an FFT to each four-channel frame; the model X(f_j) of each transformed frame is expressed as
X(f_j) = A_θ(f_j)S(f_j) + N(f_j), j = 1, 2, ..., J    (Formula 1)
where A_θ(f_j) is the steering vector, and S(f_j) and N(f_j) are the source signal and the noise after the FFT, respectively.
After the transform, the selected band is decomposed into a combination of narrowband signals at the frequencies f_j.
3-3. Using the focusing matrix T, focus each narrowband at frequency f_j onto the narrowband at the center frequency f_0:
T(f_j)A(f_j)S(f_j) = A(f_0)S(f_0)    (Formula 2)
The autocorrelation matrix at the center frequency f_0, used for localization, is then obtained by Formula 3:
R(f_0) = (1/J) Σ_{j=1}^{J} T(f_j) R(f_j) T^H(f_j)    (Formula 3)
3-4. Localize the narrowband at the center frequency f_0 to obtain the localization result of this frame. Each frame yields one localization result, forming a localization result array A of length m:
A = [a(1) a(2) ... a(m)]
Step 4 in detail: from the recognition result array C obtained in step 2 and the localization result array A obtained in step 3, construct the distribution matrix M.
With the values of recognition result array C as one axis and the angular value range of localization result array A as the other, traverse the result of every frame to build the matrix M, where M(C_i, A_j) denotes the number of frames whose recognition result is C_i and whose localization result is A_j.
Step 5 in detail: after obtaining the distribution matrix, find the localization distribution peak A_top of the target sound source along recognition result C_i.
Step 6 in detail: on the localization distribution of recognition result C_i, select the peak A_top and its two adjacent values A_top-1 and A_top+1, and compute the statistical mean over the matrix cells of these three values:
FDOA_{C_i} = P · (Σ_{l=top-1}^{top+1} A_l · M(C_i, A_l)) / (Σ_{l=top-1}^{top+1} M(C_i, A_l))
where P is the angular-interval resolution of the matrix's angle axis. For example, dividing the 360-degree circle into 36 angular intervals gives a resolution of P = 10.
The present invention has the following beneficial effects:
The collected acoustic signal is simultaneously recognized and localized, a distribution matrix is built from the results, and the final result is obtained by the decision algorithm. The invention makes full use of all the recognition and localization information in a sound clip: restricted to the frames whose recognition result is the target sound, the localization results of all frames are aggregated into a distribution from which the final localization result is obtained. The advantage is that the impact of interference and noise in the acoustic signal is rejected as far as possible, the dependence on the recognition algorithm is low, and the method is broadly applicable.
Description of the drawings
Fig. 1 is the overall algorithm flowchart of the present invention.
Fig. 2 is the flowchart of the localization part.
Fig. 3 is a schematic diagram of the distribution matrix.
Fig. 4 is the structure of the 4-channel cross acoustic array in a rectangular coordinate system.
Specific embodiment
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments; the description is only illustrative and does not restrict the present invention in any form.
Fig. 4 shows the structure of the 4-channel cross acoustic array in a rectangular coordinate system, where d is the spacing between two adjacent microphones, r is the radius of the cross array, s(t) is the sound source, whose direction is θ, and A, B, C, D in the figure correspond to channels 1, 2, 3 and 4, respectively. The signals of the 4 channels are then collected and denoted x_1(t), x_2(t), x_3(t), x_4(t).
The steering vector of the signals collected by the cross array can be expressed as
a(θ, f) = [e^{-jωτ_1(θ)} e^{-jωτ_2(θ)} e^{-jωτ_3(θ)} e^{-jωτ_4(θ)}]^T
where ω = 2πf, f is the signal frequency, and τ_p(θ) (p = 1, 2, 3, 4) is the time delay between the signals. The steering vector is used in the localization algorithm below.
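As a concrete illustration of this steering vector, the following minimal Python sketch computes a(θ, f) for the cross array, assuming a far-field source in the array plane and microphones at 0°, 90°, 180° and 270° on a circle of radius r; the default values r = 0.1 m and c = 340 m/s are illustrative assumptions, not taken from the text.

```python
import numpy as np

def steering_vector(theta, f, r=0.1, c=340.0):
    """a(theta, f) = [exp(-j*w*tau_p(theta))] for the 4 cross-array channels; theta in radians."""
    omega = 2 * np.pi * f
    mic_angles = np.deg2rad([0.0, 90.0, 180.0, 270.0])  # channels 1..4 (A, B, C, D)
    # Plane-wave delay of each microphone relative to the array center.
    tau = -(r / c) * np.cos(theta - mic_angles)
    return np.exp(-1j * omega * tau)  # shape (4,)
```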
Fig. 1 shows the overall flowchart of the algorithm. Following the steps in Fig. 1, after the four channel signals have been received by the four-channel acoustic array, they are preprocessed. The main preprocessing operation is framing: each of the four channel signals is framed, with a frame length of 1024 samples and a step of half the frame length (a minimal framing sketch follows). Assume the signal is divided into m frames of 1024 samples each; the algorithm then processes each of these frames.
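A minimal framing sketch under the stated parameters (1024-sample frames, 512-sample step); the helper name frame_signal is an assumption for illustration.

```python
import numpy as np

def frame_signal(x, frame_len=1024, step=512):
    """Split a 1-D signal into overlapping frames; returns shape (m, frame_len)."""
    m = (len(x) - frame_len) // step + 1
    return np.stack([x[i * step:i * step + frame_len] for i in range(m)])

# Applied per channel: frames_ch1 = frame_signal(x1), ..., frames_ch4 = frame_signal(x4)
```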
First, each single-channel frame is recognized.
Any sound recognition algorithm can be used; here we illustrate with LPCC feature extraction and SVM classification. We use 16th-order LPCC coefficients, and for the SVM kernel we choose the radial basis function (Radial Basis Function, RBF). Assume the sound types to be recognized are the five classes C1, C2, C3, C4 and C5.
The 12th-order linear prediction coefficients (Linear Prediction Coefficients, LPC) of each frame are computed; the LPC values can be solved with the Levinson-Durbin algorithm. The 16th-order LPCC values are then obtained from the correspondence between LPCC and LPC values.
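The LPC-to-LPCC step can be sketched as follows; librosa.lpc is used here as a stand-in LPC solver (it uses Burg's method rather than the Levinson-Durbin recursion named in the text), and the cepstral recursion is the standard one, so treat this as an assumed rendering rather than the patent's exact implementation.

```python
import numpy as np
import librosa

def lpcc(frame, p=12, q=16):
    """16 cepstral coefficients from a 12th-order LP model of one frame."""
    # librosa.lpc returns [1, a'_1, ..., a'_p]; negate to get predictor coefficients a_k.
    a = -librosa.lpc(frame.astype(float), order=p)[1:]
    c = np.zeros(q)
    for n in range(1, q + 1):
        # Standard LPC -> LPCC recursion: c_n = a_n + sum_k (k/n) c_k a_{n-k}.
        acc = sum((k / n) * c[k - 1] * a[n - k - 1] for k in range(max(1, n - p), n))
        c[n - 1] = (a[n - 1] if n <= p else 0.0) + acc
    return c
```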
The sound fingerprint base is built as follows:
The 16 LPCC values extracted from each frame are arranged by rows, and a label column is prepended: label '0' denotes C1, '1' denotes C2, '2' denotes C3, '3' denotes C4 and '4' denotes C5. Each row thus forms a 17-dimensional feature vector.
The SVM algorithm is implemented with the existing libsvm library, choosing the RBF as the classifier kernel. The RBF has two parameters, the penalty factor c and the parameter gamma, whose optimal values can be selected with libsvm's grid-search function opti_svm_coeff.
The training process uses the svmtrain function of the libsvm library, with four inputs: the feature vectors, namely the labelled LPCC values extracted above; the kernel type, here the RBF kernel; and the RBF kernel parameters c and gamma, determined by grid search. Calling svmtrain returns a variable named model that stores the trained model information; saving this variable yields the sound fingerprint base.
Sound recognition is realized with libsvm's svmtest: the LPCC values of each frame are classified with the svmtest function, which has three inputs. The first is the label, used for testing the recognition rate (when sounds of unknown type are recognized, this input has no practical meaning); the second is the feature vector, i.e. the variable storing the LPCC values; the third is the matching model, namely the return value of the svmtrain training step above. The return value of svmtest is the classification result, i.e. the label, from which the type of equipment producing the sound is determined.
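A hedged sketch of the training and recognition steps: the patent uses libsvm's svmtrain/svmtest interface, while this sketch substitutes scikit-learn's SVC (itself built on libsvm) with a grid search over c and gamma; the grid ranges and function names are illustrative assumptions, and labels 0-4 stand for the five classes C1-C5.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def train_fingerprint_base(features, labels):
    """features: (n_frames, 16) LPCC matrix; labels: (n_frames,) values in {0..4}."""
    grid = GridSearchCV(SVC(kernel="rbf"),                  # RBF kernel, as in the text
                        {"C": 2.0 ** np.arange(-5, 6),      # penalty factor c
                         "gamma": 2.0 ** np.arange(-8, 4)}) # RBF parameter gamma
    grid.fit(features, labels)
    return grid.best_estimator_  # the trained model, i.e. the sound fingerprint base

def recognize(model, lpcc_frame):
    return int(model.predict(lpcc_frame.reshape(1, -1))[0])  # class label 0..4
```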
In actual applications, features are extracted from the signal and compared against the established sound fingerprint base to perform recognition.
After this stage we obtain m recognition results, forming the array C:
C = [c(1) c(2) ... c(m)]
Next, the localization algorithm is applied to the four-channel signal of each frame.
Fig. 2 shows the detailed flowchart of the localization part, including the FFT of the sub-frames, the pre-estimated angle of each narrowband, and the wideband localization algorithm; here we illustrate with the MUSIC algorithm.
To compute the signal autocorrelation matrix, the four-channel frame is framed a second time, with a sub-frame length of 256 and a step of half the sub-frame length. An FFT is then applied to each sub-frame:
X(k) = Σ_{n=0}^{L-1} x(n) e^{-j2πnk/L}, k = 0, 1, ..., L-1
where L is the sub-frame length, i.e. 256.
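A sketch of this secondary framing and per-sub-frame FFT under the stated parameters (L = 256, 128-sample step); the array shapes and function name are illustrative assumptions.

```python
import numpy as np

def subframe_fft(frame_4ch, L=256, step=128):
    """frame_4ch: (4, 1024) one frame of 4-channel data -> (N, 4, L) spectra."""
    N = (frame_4ch.shape[1] - L) // step + 1  # number of sub-frames
    subs = np.stack([frame_4ch[:, i * step:i * step + L] for i in range(N)])
    return np.fft.fft(subs, n=L, axis=-1)     # one L-point spectrum per sub-frame and channel
```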
After the FFT, the data of each sub-frame can be expressed in the frequency domain bin by bin, where N is the number of sub-frames after the secondary framing.
The resulting frequency-domain signal model can be expressed as
X(f_j) = A_θ(f_j)S(f_j) + N(f_j), j = 1, 2, ..., J
where the bin frequencies f_j are determined by f_s, the sampling frequency of the signal. Since real signals are mostly wideband, a suitable wideband frequency range and a center frequency point f_0 must be chosen.
A wideband signal can be regarded as a combination of multiple narrowband signals. With the focusing matrices T_j, each narrowband can be focused onto the center frequency:
T(f_j)A(f_j)S(f_j) = A(f_0)S(f_0)
where A(f) is the steering vector used in the localization algorithm.
When computing the focusing matrices, we first run a narrowband MUSIC localization on each narrowband as a pre-estimate. The steps are as follows:
First compute the signal autocorrelation matrix R_f of each narrowband frequency and eigendecompose it:
R_f = U_S Λ_S U_S^H + U_N Λ_N U_N^H
where U_S is the subspace spanned by the eigenvectors of the large eigenvalues, namely the signal subspace, and U_N is the subspace spanned by the eigenvectors of the small eigenvalues, namely the noise subspace. The spatial spectrum estimator of the MUSIC algorithm is
P(θ) = 1 / (a^H(θ) U_N U_N^H a(θ))
where Θ denotes the observation sector.
Scanning θ over the observation sector Θ and evaluating the spectrum at each scan position, the bearing at which the function peaks is recorded as β_j, the direction estimate.
Running the MUSIC pre-estimation on every narrowband yields β = [β_1 β_2 ... β_J].
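A minimal narrowband MUSIC sketch implementing the two formulas above for one frequency bin: eigendecompose R_f, take the noise subspace U_N, and scan P(θ) over the candidate angles. It reuses the steering_vector sketch from earlier and assumes a single source; the function name is illustrative.

```python
import numpy as np

def music_doa(Rf, f, thetas, n_sources=1):
    """Peak of the MUSIC spectrum over candidate angles (radians)."""
    w, U = np.linalg.eigh(Rf)                  # eigenvalues in ascending order
    Un = U[:, : Rf.shape[0] - n_sources]       # noise subspace U_N (small eigenvalues)
    spectrum = [1.0 / np.abs(steering_vector(t, f).conj() @ Un @ Un.conj().T
                             @ steering_vector(t, f)) for t in thetas]
    return thetas[int(np.argmax(spectrum))]    # beta_j: bearing at the spectral peak
```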
Next, the focusing matrix is constructed from the pre-estimate:
T(f_j) = V(f_j)U(f_j)^H
where U(f_j) and V(f_j) are the left and right singular vectors of A(f_j, β)A^H(f_0, β), respectively. Applying the series of focusing matrices T(f_j) to the array data focuses it onto the single frequency point, giving the data autocorrelation matrix
R(f_0) = (1/J) Σ_{j=1}^{J} T(f_j) R(f_j) T^H(f_j)
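A sketch of this focusing construction: T(f_j) = V U^H from the SVD of A(f_j, β)A^H(f_0, β), followed by the focused covariance accumulation; the averaging over the J bins follows the reconstructed Formula 3 and is an assumption, and steering_vector is the earlier sketch.

```python
import numpy as np

def focusing_matrix(fj, f0, betas):
    """T(f_j) = V U^H from the SVD of A(f_j, beta) A^H(f_0, beta)."""
    A_j = np.column_stack([steering_vector(b, fj) for b in betas])
    A_0 = np.column_stack([steering_vector(b, f0) for b in betas])
    U, _, Vh = np.linalg.svd(A_j @ A_0.conj().T)
    return Vh.conj().T @ U.conj().T

def focused_covariance(R_list, freqs, f0, betas):
    """Average of T(f_j) R(f_j) T^H(f_j) over all narrowband bins."""
    R0 = np.zeros_like(R_list[0])
    for Rj, fj in zip(R_list, freqs):
        T = focusing_matrix(fj, f0, betas)
        R0 += T @ Rj @ T.conj().T
    return R0 / len(R_list)
```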
Likewise, once this autocorrelation matrix has been obtained, we run the MUSIC algorithm once more on the center-frequency narrowband to obtain the final localization result.
After this stage we obtain m localization results, forming the array A:
A = [a(1) a(2) a(3) a(4) ... a(m)]
As shown in Fig. 1, once the localization and recognition results are obtained we can build the distribution matrix M. Fig. 3 shows a schematic diagram of the distribution matrix: the horizontal axis is the range of possible localization results A, and the vertical axis represents the range of possible recognition results C. M(C_i, A_j) is the total number of frames in this data segment whose recognition result is C_i and whose localization result is A_j.
Once the distribution matrix statistics are available, the localization result of the target is obtained from the localization distribution of the target's recognition result.
The present invention selects the row whose recognition result is the target sound source, giving the target's localization distribution. It finds the peak A_top, takes the peak and its two adjacent values A_top-1 and A_top+1, and computes the statistical mean over the matrix cells of these 3 values as the final localization result.
The formula can be expressed as
FDOA_{C_i} = P · (Σ_{l=top-1}^{top+1} A_l · M(C_i, A_l)) / (Σ_{l=top-1}^{top+1} M(C_i, A_l))
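Finally, a sketch of the distribution-matrix decision itself, combining steps 4-6: count frames per (class, angle-interval) cell, find the peak of the target row, and average the peak and its two neighbours scaled by the resolution P; the simple wrap-around handling at 0°/360° is a simplifying assumption.

```python
import numpy as np

def decide_angle(C, A, target, n_classes=5, P=10):
    """C: per-frame class labels; A: per-frame angle estimates in degrees."""
    n_bins = 360 // P
    M = np.zeros((n_classes, n_bins))
    for c, a in zip(C, A):                    # one (recognition, localization) pair per frame
        M[c, int(a // P) % n_bins] += 1       # M(C_i, A_j): frame counts
    row = M[target]                           # localization distribution of the target class
    top = int(np.argmax(row))                 # A_top: the distribution peak
    ls = [top - 1, top, top + 1]              # peak and its two adjacent intervals
    num = sum(l * row[l % n_bins] for l in ls)
    den = sum(row[l % n_bins] for l in ls)
    return (P * num / den) % 360              # FDOA for the target class, in degrees
```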
Claims (7)

1. A sound localization method based on distribution-matrix decision-making, characterized by comprising the following steps:
Step 1, preprocessing the four-channel acoustic signals collected by the acoustic array;
Step 2, performing sound recognition on the single-channel data;
Step 3, performing wideband sound localization on the multi-channel data;
Step 4, building a distribution matrix from the recognition and localization result sets obtained in steps 2 and 3;
Step 5, after obtaining the distribution matrix, finding the localization distribution peak of the target sound source;
Step 6, selecting the peak and its two adjacent angle intervals and computing the statistical mean over these three intervals as the final localization result.
2. The sound localization method based on distribution-matrix decision-making according to claim 1, characterized in that in said step 1 the live sound signal is acquired with a cross acoustic array, with sampling frequency denoted f_s; the four-channel acoustic signal is framed, the number of frames after framing being m; each frame is then processed.
3. The sound localization method based on distribution-matrix decision-making according to claim 2, characterized in that in said step 2 the algorithm recognizing the single-channel signal is the LPCC+SVM algorithm;
each frame yields one recognition result, forming a recognition result array C of length m:
C = [c(1) c(2) ... c(m)].
4. The sound localization method based on distribution-matrix decision-making according to claim 3, characterized in that the algorithm performing wideband localization on the four-channel signal is the wideband MUSIC algorithm, as follows:
3-1. Select the frequency band and the center frequency f_0 as needed; both must be chosen according to the frequency characteristics of the actual target signal.
3-2. Apply an FFT to each four-channel frame; the model X(f_j) of each transformed frame is expressed as
X(f_j) = A_θ(f_j)S(f_j) + N(f_j), j = 1, 2, ..., J    (Formula 1)
where A_θ(f_j) is the steering vector, and S(f_j) and N(f_j) are the source signal and the noise after the FFT, respectively;
after the transform, the selected band is decomposed into a combination of narrowband signals at the frequencies f_j.
3-3. Using the focusing matrix T, focus each narrowband at frequency f_j onto the narrowband at the center frequency f_0:
T(f_j)A(f_j)S(f_j) = A(f_0)S(f_0)    (Formula 2)
and obtain the autocorrelation matrix at the center frequency f_0, used for localization, by Formula 3:
R(f_0) = (1/J) Σ_{j=1}^{J} T(f_j) R(f_j) T^H(f_j)    (Formula 3)
3-4. Localize the narrowband at the center frequency f_0 to obtain the localization result of the frame; each frame yields one localization result, forming a localization result array A of length m:
A = [a(1) a(2) ... a(m)].
5. The sound localization method based on distribution-matrix decision-making according to claim 4, characterized in that in said step 4 the distribution matrix M is constructed from the recognition result array C and the localization result array A obtained in steps 2 and 3:
with the values of recognition result array C as one axis and the angular value range of localization result array A as the other, the result of every frame is traversed to build the matrix M, where M(C_i, A_j) denotes the number of frames whose recognition result is C_i and whose localization result is A_j.
6. The sound localization method based on distribution-matrix decision-making according to claim 5, characterized in that in said step 5, after the distribution matrix is obtained, the localization distribution peak A_top of the target sound source is found along recognition result C_i.
7. The sound localization method based on distribution-matrix decision-making according to claim 6, characterized in that in said step 6, on the localization distribution of recognition result C_i, the peak A_top and its two adjacent values A_top-1 and A_top+1 are selected, and the statistical mean over the matrix cells of these three values is computed as
FDOA_{C_i} = P · (Σ_{l=top-1}^{top+1} A_l · M(C_i, A_l)) / (Σ_{l=top-1}^{top+1} M(C_i, A_l)).
CN201610893331.1A 2016-10-13 2016-10-13 Sound localization method based on distribution matrix decision Active CN106526541B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610893331.1A CN106526541B (en) 2016-10-13 2016-10-13 Sound localization method based on distribution matrix decision


Publications (2)

Publication Number Publication Date
CN106526541A (en) 2017-03-22
CN106526541B CN106526541B (en) 2019-01-18

Family

ID=58332047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610893331.1A Active CN106526541B (en) 2016-10-13 2016-10-13 Sound localization method based on distribution matrix decision

Country Status (1)

Country Link
CN (1) CN106526541B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102438189A (en) * 2011-08-30 2012-05-02 东南大学 Dual-channel acoustic signal-based sound source localization method
CN103439688A (en) * 2013-08-27 2013-12-11 大连理工大学 Sound source positioning system and method used for distributed microphone arrays
CN105609113A (en) * 2015-12-15 2016-05-25 中国科学院自动化研究所 Bispectrum weighted spatial correlation matrix-based speech sound source localization method
CN106023996A (en) * 2016-06-12 2016-10-12 杭州电子科技大学 Sound identification method based on cross acoustic array broadband wave beam formation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘春静 et al., "Broadband DOA Algorithm Based on Data Matrix Focusing," 《弹箭与制导学报》 (Journal of Projectiles, Rockets, Missiles and Guidance) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107493106A (en) * 2017-08-09 2017-12-19 河海大学 A kind of method of frequency and angle Combined estimator based on compressed sensing
CN107493106B (en) * 2017-08-09 2021-02-12 河海大学 Frequency and angle joint estimation method based on compressed sensing
CN112347984A (en) * 2020-11-27 2021-02-09 安徽大学 Olfactory stimulus-based EEG (electroencephalogram) acquisition and emotion recognition method and system

Also Published As

Publication number Publication date
CN106526541B (en) 2019-01-18


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant