CN109313909A - Method, apparatus and system for assessing microphone array consistency - Google Patents
Method, apparatus and system for assessing microphone array consistency Download PDF Info
- Publication number
- CN109313909A (application number CN201880001199.6A)
- Authority
- CN
- China
- Prior art keywords
- microphone
- signal
- audio signal
- frequency
- reference microphone
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R29/00—Monitoring arrangements; Testing arrangements
- H04R29/001—Monitoring arrangements; Testing arrangements for loudspeakers
Abstract
The embodiments of the present application provide a method, apparatus and system for assessing microphone array consistency. The consistency between different microphones in a microphone array can be assessed, and the assessment result can then guide the calibration of the microphone array and the evaluation of the robustness of multi-channel enhancement algorithms, improving user experience. The method comprises: obtaining N audio signals respectively acquired by N microphones, where the N microphones constitute a microphone array and N >= 2; according to the N audio signals, determining the phase spectrum difference and/or power spectrum difference between each microphone in the N microphones other than a reference microphone and the reference microphone, where the reference microphone is any one of the N microphones; and performing consistency assessment on the N microphones according to the phase spectrum difference and/or power spectrum difference between each microphone other than the reference microphone and the reference microphone.
Description
Technical field
This application relates to the fields of speech communication and intelligent voice interaction, and more particularly, to a method, apparatus and system for assessing microphone array consistency.
Background technique
In speech communication applications, speech enhancement technology can improve human auditory perception and the intelligibility of speech communication; in intelligent voice interaction applications, it can improve the accuracy of speech recognition and the user experience. Speech enhancement is therefore critical in both traditional speech communication and voice interaction. Speech enhancement technology divides into single-channel speech enhancement and multi-channel speech enhancement. Single-channel speech enhancement can suppress stationary noise but not non-stationary noise, and its signal-to-noise-ratio improvement comes at the cost of speech distortion: the greater the improvement, the greater the damage to the speech. Multi-channel speech enhancement acquires multiple signals with a microphone array and suppresses noise, including non-stationary noise, using the phase and coherence information between the microphone signals, with less damage to the speech.
In multi-channel speech enhancement, the consistency between different microphones in the microphone array directly affects algorithm performance. Existing schemes propose improved multi-channel enhancement algorithms that increase robustness and relax the consistency requirement between microphones; nevertheless, very poor consistency between microphones still degrades algorithm performance and thus the user experience.
Summary of the invention
This application provides a method, apparatus and system for assessing microphone array consistency. The consistency between different microphones in a microphone array can be assessed, so that the assessment result can guide the calibration of the microphone array and the evaluation of the robustness of multi-channel enhancement algorithms, improving user experience.
In a first aspect, a method for assessing microphone array consistency is provided, comprising:
obtaining N audio signals respectively acquired by N microphones, where the N microphones constitute a microphone array and N >= 2;
according to the N audio signals, determining the phase spectrum difference and/or power spectrum difference between each microphone in the N microphones other than a reference microphone and the reference microphone, where the reference microphone is any one of the N microphones;
performing consistency assessment on the N microphones according to the phase spectrum difference and/or power spectrum difference between each microphone other than the reference microphone and the reference microphone.
It should be noted that the consistency assessment of the N microphones can be used to guide the placement of the microphones in the microphone array, to guide a redesign of the microphone placement, to guide a redesign of the microphone array itself, or to evaluate the robustness of a multi-channel enhancement algorithm.
For example, when the assessment shows that the consistency between microphone 1 and microphone 2 is poor, it can guide an adjustment of the placement of microphone 1 or microphone 2 in the array, or guide a redesign of microphone 1 or microphone 2.
As another example, when the assessment shows that the consistency between microphone 1 and several other microphones is poor, it can guide an adjustment of the placement of microphone 1 in the array, a redesign of microphone 1, or a redesign of the microphone array.
In the embodiments of the present application, the phase spectrum difference and/or power spectrum difference between each microphone and the reference microphone is determined from the N audio signals respectively acquired by the N microphones, and the N microphones are assessed for consistency accordingly. This eliminates the influence of inter-microphone inconsistency on multi-channel speech enhancement algorithms and improves user experience.
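The overall procedure (frame the N signals, window and FFT them, then compare phase and power spectra against a reference microphone) can be sketched as follows. This is a minimal illustrative sketch, not the patented implementation: the function name `consistency_report`, the 50% overlap, the Hamming window, and the frame-averaged `imag(ln(...))` phase estimate are assumptions based on the formulas given later in this description.

```python
import numpy as np

def consistency_report(signals, frame_len=512, ref=0):
    """For N equal-length microphone recordings (entries of `signals`):
    frame each with 50% overlap, apply a Hamming window, FFT each frame,
    then report per-frequency phase- and power-spectrum differences of
    every microphone against the reference microphone `ref`."""
    hop = frame_len // 2                 # 50% overlap between frames
    win = np.hamming(frame_len)

    def frames_fft(x):
        n = 1 + (len(x) - frame_len) // hop
        return np.fft.rfft(
            np.stack([x[k * hop : k * hop + frame_len] * win
                      for k in range(n)]),
            axis=1)

    Y = [frames_fft(np.asarray(x, dtype=float)) for x in signals]
    P = [np.mean(np.abs(Yi) ** 2, axis=0) for Yi in Y]   # power spectra
    report = {}
    for i in range(len(Y)):
        if i == ref:
            continue
        report[i] = {
            # frame-averaged phase of Y_ref / Y_i per frequency bin
            "phase_diff": np.mean(np.imag(np.log(Y[ref] / Y[i])), axis=0),
            # PD_i(w) = P_ref(w) - P_i(w)
            "power_diff": P[ref] - P[i],
        }
    return report
```

A microphone whose phase and power differences stay below chosen thresholds at all frequencies would be judged consistent with the reference, per the threshold scheme described in this summary.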
In some possible implementations, performing consistency assessment on the N microphones according to the phase spectrum difference between each microphone other than the reference microphone and the reference microphone comprises:
assessing the phase consistency between each such microphone and the reference microphone according to the phase spectrum difference between them.
It should be noted that the smaller the phase spectrum difference between two microphones, the better the phase consistency between them.
For example, if the phase spectrum difference between microphone 1 and the reference microphone is A, then the smaller A is, the better the phase consistency between microphone 1 and the reference microphone.
Optionally, a threshold may be set: if the phase spectrum difference between two microphones is below the threshold, the phase consistency between the two microphones meets the design requirement, and the consistency between them has a negligible effect, or no effect, on the multi-channel speech enhancement algorithm.
It should be noted that the threshold can be configured flexibly for different multi-channel speech enhancement algorithms.
In some possible implementations, the method further comprises:
measuring, for each microphone other than the reference microphone, the difference between its distance to the sound source and the reference microphone's distance to the sound source;
calculating, from the measured distance differences, the fixed phase offset between each such microphone and the reference microphone;
calibrating the corresponding phase spectrum differences using the fixed phase offsets.
For example, if the fixed phase offset between microphone 1 and the reference microphone is A, and the measured phase spectrum difference between them is B, then after calibration the phase spectrum difference between microphone 1 and the reference microphone is C = B - A.
In some possible implementations, calculating, from the measured distance differences, the fixed phase offset between each microphone other than the reference microphone and the reference microphone comprises:
calculating the fixed phase offsets according to the formula Δφ_i(ω) = 2πωd_i/c,
where Y_i(ω) denotes the spectrum of the i-th microphone, Y_1(ω) denotes the spectrum of the reference microphone, ω denotes frequency, d_i denotes the difference between the i-th microphone's and the reference microphone's distances to the sound source, c denotes the speed of sound, and 2πωd_i/c is the fixed phase offset between the i-th microphone and the reference microphone.
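As a sketch, the fixed offset A = 2πωd_i/c and the calibration C = B − A from the example above can be computed as follows. The function names are illustrative, and the default c = 343 m/s is an assumed speed of sound:

```python
import numpy as np

def fixed_phase_offset(omega, d_i, c=343.0):
    """Fixed phase offset 2*pi*omega*d_i/c between microphone i and the
    reference microphone, caused by the path-length difference d_i
    (metres) to the sound source; omega is frequency in Hz."""
    return 2.0 * np.pi * omega * d_i / c

def calibrate_phase_diff(measured_diff, omega, d_i, c=343.0):
    """C = B - A: subtract the geometry-induced fixed offset A from the
    measured phase-spectrum difference B."""
    return measured_diff - fixed_phase_offset(omega, d_i, c)
```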
In some possible implementations, performing consistency assessment on the N microphones according to the power spectrum difference between each microphone other than the reference microphone and the reference microphone comprises:
assessing the amplitude consistency between each such microphone and the reference microphone according to the power spectrum difference between them.
It should be noted that the smaller the power spectrum difference between two microphones, the better the amplitude consistency between them.
For example, if the power spectrum difference between microphone 1 and the reference microphone is A, then the smaller A is, the better the amplitude consistency between microphone 1 and the reference microphone.
Optionally, a threshold may be set: if the power spectrum difference between two microphones is below the threshold, the amplitude consistency between the two microphones meets the design requirement, and the consistency between them has a negligible effect, or no effect, on the multi-channel speech enhancement algorithm.
It should be noted that the threshold can be configured flexibly for different multi-channel speech enhancement algorithms.
In some possible implementations, when phase consistency is assessed, the N audio signals are signals acquired in an environment in which sweep-signal data is played.
In some possible implementations, when amplitude consistency is assessed, the N audio signals are signals acquired in an environment in which Gaussian white noise data or sweep-signal data is played.
In some possible implementations, the sweep signal is any one of a linear sweep signal, a logarithmic sweep signal, a linear stepped sweep signal, and a logarithmic stepped sweep signal.
In some possible implementations, determining, according to the N audio signals, the phase spectrum difference and/or power spectrum difference between each microphone other than the reference microphone and the reference microphone comprises:
framing each of the N audio signals to obtain K signal frames of equal length, K >= 2;
applying a window to each of the K signal frames to obtain K windowed signal frames;
applying a Fast Fourier Transform (FFT) to each of the K windowed signal frames to obtain K target signal frames;
determining, from the K target signal frames corresponding to each audio signal, the phase spectrum difference and/or power spectrum difference between each microphone other than the reference microphone and the reference microphone.
Optionally, K denotes the total number of frames of the signal collected by each microphone.
It should be noted that windowing eliminates the truncation effect introduced by framing. Optionally, a Hamming window may be applied to each of the K signal frames.
In some possible implementations, any two adjacent frames among the K signal frames overlap by R%, R > 0. For example, R is 25 or 50.
Optionally, the signal amplitude remains unchanged after overlapped windowing.
It should be understood that with overlap each frame contains a portion of the previous frame, which prevents discontinuities between frames.
In some possible implementations, the i-th audio signal is framed into K signal frames of equal length, written in the following vector form:
x_i(t) = [x_{i,1}(t), x_{i,2}(t), …, x_{i,K}(t)]^T
where x_i(t) denotes the i-th audio signal, K denotes the total number of frames of the signal collected by each microphone, and [·]^T denotes the transpose of a vector or matrix.
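The framing, Hamming-windowing, and FFT steps above can be sketched as below. The name `frame_fft` is illustrative, and the default 50% overlap is one of the example values (R = 50) mentioned in this description:

```python
import numpy as np

def frame_fft(x, frame_len, overlap=0.5):
    """Split x into equal-length frames with the given fractional overlap,
    apply a Hamming window to suppress the framing truncation effect, and
    return the FFT of each windowed frame (one row per target frame)."""
    hop = int(frame_len * (1.0 - overlap))
    n_frames = 1 + (len(x) - frame_len) // hop
    win = np.hamming(frame_len)
    frames = np.stack([x[k * hop : k * hop + frame_len] * win
                       for k in range(n_frames)])
    return np.fft.rfft(frames, axis=1)   # shape: (K frames, n_bins)
```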
In some possible implementations, determining, from the K target signal frames corresponding to each audio signal, the phase spectrum difference between each microphone other than the reference microphone and the reference microphone comprises:
determining the phase spectrum differences according to the formula
Δφ_i(ω) = (1/K) Σ_{j=1}^{K} imag( ln( Y_{1,j}(ω) / Y_{i,j}(ω) ) ),
where imag(·) takes the imaginary part, ln(·) takes the natural logarithm, Δφ_i(ω) denotes the phase spectrum difference between the i-th microphone and the reference microphone, Y_{1,j}(ω) denotes the j-th target signal frame of the reference microphone, Y_{i,j}(ω) denotes the j-th target signal frame of the i-th microphone, and ω denotes frequency.
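Assuming the reconstruction of the formula above (a frame average of imag(ln(Y_{1,j}/Y_{i,j})), which is the average phase angle of the per-frame ratio), a minimal sketch:

```python
import numpy as np

def phase_spectrum_difference(Y_ref, Y_i):
    """Per-frequency phase-spectrum difference between microphone i and the
    reference microphone, averaged over the K target frames. Y_ref and Y_i
    are (K, n_bins) complex arrays; imag(ln(a/b)) is the phase of a/b."""
    return np.mean(np.imag(np.log(Y_ref / Y_i)), axis=0)
```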
In some possible implementations, determining, from the K target signal frames corresponding to each audio signal, the power spectrum difference between each microphone other than the reference microphone and the reference microphone comprises:
determining the power spectrum of each audio signal from its K target signal frames;
determining, from the power spectra, the power spectrum difference between each microphone other than the reference microphone and the reference microphone.
In some possible implementations, determining the power spectrum of each audio signal from its K target signal frames comprises:
calculating the power spectrum of each audio signal according to the formula
P_i(ω) = (1/K) Σ_{j=1}^{K} |Y_{i,j}(ω)|²,
where P_i(ω) denotes the power spectrum of the i-th audio signal, Y_{i,j}(ω) denotes the j-th target signal frame of the i-th audio signal, K denotes the total number of frames of the signal received by each microphone, and ω denotes frequency.
In some possible implementations, determining, from the power spectra, the power spectrum difference between each microphone other than the reference microphone and the reference microphone comprises:
calculating the power spectrum differences according to the formula PD_i(ω) = P_1(ω) − P_i(ω),
where PD_i(ω) denotes the power spectrum difference between the i-th microphone and the reference microphone, P_1(ω) denotes the power spectrum of the reference microphone, and P_i(ω) denotes the power spectrum of the i-th microphone.
In some possible implementations, obtaining the N audio signals respectively acquired by the N microphones comprises:
determining the sampling frequency F_s of the N microphones when sampling the audio signals and the FFT size N_fft, playing Gaussian white noise data or sweep-signal data through a loudspeaker, and acquiring the N audio signals with the N microphones, where, if the data played by the loudspeaker is sweep-signal data, the sweep-signal data consists of M+1 segments of equal length and unequal frequency.
It should be noted that the FFT size N_fft is an even number, typically 32, 64, 128, …, 1024, etc.; the more points, the greater the saving in computation.
In some possible implementations, the frequency of each of the M+1 segments is calculated according to the formula f_i = i·F_s/N_fft, and each segment is calculated according to the formula S_i(t) = sin(2πf_i·t),
where f_i denotes the frequency of the i-th segment, F_s denotes the sampling frequency, N_fft denotes the FFT size, S_i(t) denotes the i-th segment, and the length of S_1(t) is an integer multiple of the period T, T = 1/f_1.
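A sketch of the stepped sweep under the assumed per-segment frequency f_i = i·F_s/N_fft (this reconstruction is an assumption: it aligns every segment with an exact FFT bin, and each segment lasts an integer number of periods of f_1 = F_s/N_fft, satisfying the requirement on S_1(t) stated above):

```python
import numpy as np

def stepped_sweep(Fs=16000, Nfft=512, M=8, periods=16):
    """M+1 equal-length sine segments S_0..S_M with assumed frequencies
    f_i = i * Fs / Nfft. Each segment is `periods` periods of f_1 long,
    i.e. periods * Nfft samples, an integer multiple of T = 1/f_1.
    Note that under this assumption S_0 (f_0 = 0) is a silent segment."""
    t = np.arange(periods * Nfft) / Fs
    return np.concatenate(
        [np.sin(2.0 * np.pi * (i * Fs / Nfft) * t) for i in range(M + 1)])
```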
In some possible implementations, the sweep-signal data played by the loudspeaker can be written in the following vector form:
S(t) = [S_0(t), S_1(t), …, S_M(t)]^T
where S(t) denotes the sweep-signal data played by the loudspeaker, S_i(t) denotes the i-th segment, and [·]^T denotes the transpose of a vector or matrix.
In some possible implementations, the N microphones respectively collect the N audio signals, where the audio signal collected by the i-th microphone is expressed as x_i(t) and can be written in the following vector form:
x_i(t) = [x_{i,1}(t), x_{i,2}(t), …, x_{i,K}(t)]^T
where x_i(t) denotes the audio signal collected by the i-th microphone, K denotes the total number of frames of the signal collected by each microphone, and [·]^T denotes the transpose of a vector or matrix.
In some possible implementations, obtaining the N audio signals respectively acquired by the N microphones comprises:
placing the N microphones in a test room in which a loudspeaker is arranged, with the N microphones located in front of the loudspeaker;
controlling the loudspeaker to play Gaussian white noise data or sweep-signal data, and controlling the N microphones to respectively acquire the N audio signals.
In some possible implementations, the test room provides a noise-reduced (anechoic) environment, the loudspeaker is an artificial mouth dedicated to audio testing, and the artificial mouth is calibrated with a standard microphone before use.
In some possible implementations, before the loudspeaker is controlled to play Gaussian white noise data or sweep-signal data, the method further comprises:
in a quiet environment, obtaining first audio data X_1(n) acquired by the N microphones over a first duration T_1;
in an environment in which Gaussian white noise data or sweep-signal data is played, obtaining second audio data X_2(n) acquired by the N microphones over a second duration T_2;
calculating the signal-to-noise ratio (SNR) according to the formula SNR = 10·log10( Σ_n X_2²(n) / Σ_n X_1²(n) ), and ensuring that the SNR is greater than a first threshold.
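Assuming the SNR formula reconstructed above (energy of the recording made while the test signal plays over the energy of the quiet-room recording), a minimal sketch:

```python
import numpy as np

def snr_db(x_quiet, x_playing):
    """SNR = 10*log10(sum X2^2 / sum X1^2), where x_quiet is the
    quiet-environment recording X1(n) and x_playing is the recording
    X2(n) made while the noise or sweep signal is played. The exact
    formula is an assumed reconstruction from the surrounding text."""
    return 10.0 * np.log10(np.sum(np.square(x_playing)) /
                           np.sum(np.square(x_quiet)))
```

A test run would be rejected (and the room or playback level adjusted) whenever this value falls below the chosen first threshold.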
Second aspect provides a kind of equipment for assessing microphone array consistency, comprising:
Acquiring unit, the N number of audio signal acquired respectively for obtaining N number of microphone, N number of microphone constitute Mike
Wind array, N >=2;
Processing unit, for determining in N number of microphone in addition to reference microphone according to N number of audio signal
Each microphone and the reference microphone between phase spectrum difference and/or power spectrum difference, the reference microphone is
Any one microphone in N number of microphone;
The processing unit is also used to according to each Mike in N number of microphone in addition to the reference microphone
Phase spectrum difference and/or power spectrum difference between wind and the reference microphone carry out consistency to N number of microphone and comment
Estimate.
In some possible implementations, the processing unit is specifically configured to:
assess the phase consistency between each microphone other than the reference microphone and the reference microphone according to the phase spectrum difference between them.
In some possible implementations, the processing unit is further configured to:
measure, for each microphone other than the reference microphone, the difference between its distance to the sound source and the reference microphone's distance to the sound source;
calculate, from the measured distance differences, the fixed phase offset between each such microphone and the reference microphone;
calibrate the corresponding phase spectrum differences using the fixed phase offsets.
In some possible implementations, the processing unit is specifically configured to:
calculate the fixed phase offsets according to the formula Δφ_i(ω) = 2πωd_i/c,
where Y_i(ω) denotes the spectrum of the i-th microphone, Y_1(ω) denotes the spectrum of the reference microphone, ω denotes frequency, d_i denotes the difference between the i-th microphone's and the reference microphone's distances to the sound source, c denotes the speed of sound, and 2πωd_i/c is the fixed phase offset between the i-th microphone and the reference microphone.
In some possible implementations, the processing unit is specifically configured to:
assess the amplitude consistency between each microphone other than the reference microphone and the reference microphone according to the power spectrum difference between them.
In some possible implementations, the N audio signals are signals acquired in an environment in which sweep-signal data is played.
In some possible implementations, the N audio signals are signals acquired in an environment in which Gaussian white noise data or sweep-signal data is played.
In some possible implementations, the sweep signal is any one of a linear sweep signal, a logarithmic sweep signal, a linear stepped sweep signal, and a logarithmic stepped sweep signal.
In some possible implementations, the processing unit is specifically configured to:
frame each of the N audio signals to obtain K signal frames of equal length, K >= 2;
apply a window to each of the K signal frames to obtain K windowed signal frames;
apply an FFT to each of the K windowed signal frames to obtain K target signal frames;
determine, from the K target signal frames corresponding to each audio signal, the phase spectrum difference and/or power spectrum difference between each microphone other than the reference microphone and the reference microphone.
In some possible implementations, any two adjacent frames among the K signal frames overlap by R%, R > 0.
In some possible implementations, R is 25 or 50.
In some possible implementations, the i-th audio signal is framed into K signal frames of equal length, written in the following vector form:
x_i(t) = [x_{i,1}(t), x_{i,2}(t), …, x_{i,K}(t)]^T
where x_i(t) denotes the i-th audio signal, K denotes the total number of frames of the signal collected by each microphone, and [·]^T denotes the transpose of a vector or matrix.
In some possible implementations, the processing unit is specifically configured to:
determine the phase spectrum differences according to the formula
Δφ_i(ω) = (1/K) Σ_{j=1}^{K} imag( ln( Y_{1,j}(ω) / Y_{i,j}(ω) ) ),
where imag(·) takes the imaginary part, ln(·) takes the natural logarithm, Δφ_i(ω) denotes the phase spectrum difference between the i-th microphone and the reference microphone, Y_{1,j}(ω) denotes the j-th target signal frame of the reference microphone, Y_{i,j}(ω) denotes the j-th target signal frame of the i-th microphone, and ω denotes frequency.
In some possible implementations, the processing unit is specifically configured to:
determine the power spectrum of each audio signal from its K target signal frames;
determine, from the power spectra, the power spectrum difference between each microphone other than the reference microphone and the reference microphone.
In some possible implementations, the processing unit is specifically configured to:
calculate the power spectrum of each audio signal according to the formula P_i(ω) = (1/K) Σ_{j=1}^{K} |Y_{i,j}(ω)|²,
where P_i(ω) denotes the power spectrum of the i-th audio signal, Y_{i,j}(ω) denotes the j-th target signal frame of the i-th audio signal, K denotes the total number of frames of the signal collected by each microphone, and ω denotes frequency.
In some possible implementations, the processing unit is specifically configured to:
calculate the power spectrum differences according to the formula PD_i(ω) = P_1(ω) − P_i(ω),
where PD_i(ω) denotes the power spectrum difference between the i-th microphone and the reference microphone, P_1(ω) denotes the power spectrum of the reference microphone, and P_i(ω) denotes the power spectrum of the i-th microphone.
In some possible implementations, the processing unit is specifically configured to:
determine the sampling frequency F_s of the N microphones when sampling the audio signals and the FFT size N_fft, play Gaussian white noise data or sweep-signal data through a loudspeaker, and control the N microphones to acquire the N audio signals, where, if the data played by the loudspeaker is sweep-signal data, the sweep-signal data consists of M+1 segments of equal length and unequal frequency.
In some possible implementations, the processing unit is further configured to:
calculate the frequency of each of the M+1 segments according to the formula f_i = i·F_s/N_fft, and calculate each segment according to the formula S_i(t) = sin(2πf_i·t),
where f_i denotes the frequency of the i-th segment, F_s denotes the sampling frequency, N_fft denotes the FFT size, S_i(t) denotes the i-th segment, and the length of S_1(t) is an integer multiple of the period T, T = 1/f_1.
In some possible implementations, the sweep-signal data played by the loudspeaker is written in the following vector form:
S(t) = [S_0(t), S_1(t), …, S_M(t)]^T
where S(t) denotes the sweep-signal data played by the loudspeaker, S_i(t) denotes the i-th segment, and [·]^T denotes the transpose of a vector or matrix.
In some possible implementations, the N microphones respectively collect the N audio signals, where the audio signal collected by the i-th microphone is expressed as x_i(t) and can be written in the following vector form:
x_i(t) = [x_{i,1}(t), x_{i,2}(t), …, x_{i,K}(t)]^T
where x_i(t) denotes the audio signal collected by the i-th microphone, K denotes the total number of frames of the signal collected by each microphone, and [·]^T denotes the transpose of a vector or matrix.
In some possible implementations, the acquiring unit is specifically used for:
The N microphones are placed in a test room, a loudspeaker is configured in the test room, and the N
microphones are located directly in front of the loudspeaker;
the loudspeaker is controlled to play white Gaussian noise data or swept-frequency signal data, and the N
microphones are controlled to respectively acquire the N audio signals.
In some possible implementations, the test room has an anechoic-room environment, the loudspeaker is an artificial
mouth dedicated to audio testing, and the artificial mouth is calibrated with a standard microphone before use.
In some possible implementations, before the processing unit controls the loudspeaker to play white Gaussian noise data
or swept-frequency signal data, the acquiring unit is further configured to:
in a quiet environment, obtain first audio data X_1(n) collected by the N microphones within a first duration T_1;
in an environment in which white Gaussian noise data or swept-frequency signal data are played, obtain second audio data X_2(n)
collected by the N microphones within a second duration T_2;
trigger the processing unit to calculate the signal-to-noise ratio SNR according to the formula SNR = 10·lg[(Σ_n X_2²(n)/T_2)/(Σ_n X_1²(n)/T_1)], and ensure that the
SNR is greater than a first threshold.
In a third aspect, a device for assessing microphone array consistency is provided, comprising:
a memory, for storing a program and data; and
a processor, for calling and running the program and data stored in the memory;
wherein the device is configured to execute the method in the above first aspect or any possible implementation thereof.
In a fourth aspect, a system for assessing microphone array consistency is provided, comprising:
N microphones constituting a microphone array, N ≥ 2;
at least one audio source; and
a device including a memory for storing a program and data and a processor for calling and running the program and data
stored in the memory, the device being configured to execute the method in the above first aspect or any possible implementation thereof.
In a fifth aspect, a computer storage medium is provided, the computer storage medium storing program code, the
program code being used to instruct execution of the method in the above first aspect or any possible implementation thereof.
In a sixth aspect, a computer program product comprising instructions is provided, which, when run on a computer, causes the
computer to execute the method in the above first aspect or any possible implementation thereof.
Detailed description of the invention
Fig. 1 is the schematic flow chart of the method for the assessment microphone array consistency of the embodiment of the present application.
Fig. 2 is the test environment schematic according to the embodiment of the present application.
Fig. 3 is the schematic diagram according to the calculating phase spectrum difference of the embodiment of the present application.
Fig. 4 is the schematic diagram according to the calculating power spectrum difference of the embodiment of the present application.
Fig. 5 is the schematic diagram of the phase spectrum difference between two microphones according to the embodiment of the present application.
Fig. 6 is the schematic diagram of the phase spectrum difference after calibrating between two microphones according to the embodiment of the present application.
Fig. 7 a is the schematic diagram according to the power spectrum of two microphones of the embodiment of the present application.
Fig. 7 b is the schematic diagram of the power spectrum difference between two microphones according to the embodiment of the present application.
Fig. 8 is the schematic diagram according to a kind of equipment of assessment microphone array consistency of the embodiment of the present application.
Fig. 9 is the schematic diagram according to a kind of device of assessment microphone array consistency of the embodiment of the present application.
Figure 10 is the schematic diagram according to a kind of system of assessment microphone array consistency of the embodiment of the present application.
Specific embodiment
The technical solutions in the embodiments of the present application are described clearly below in conjunction with the
accompanying drawings in the embodiments of the present application.
A microphone array (Microphone Array) refers to a system composed of a certain number of microphones (acoustic sensors) and used
to sample and process the spatial characteristics of a sound field. By using the differences between the phases of the sound waves
received by two microphones to filter the sound waves, environmental background sound can be eliminated to the greatest extent, leaving only the desired sound waves.
Multi-channel speech enhancement algorithms assume that the target speech components received by the multiple microphones in
a microphone array are highly correlated, and that the target speech is uncorrelated with non-target interference; therefore, the consistency
between different microphones in a microphone array directly affects algorithm performance.
The quantitative assessment of microphone consistency can be used to guide the design of microphones and microphone arrays. The
circuits, electronic components, and acoustic structure of a microphone array all affect the consistency of the microphones; when
designing a microphone array, the influence of various factors on consistency can be tested item by item, so that the microphone consistency meets the system requirements.
The quantitative assessment of microphone consistency can also be used to compare the robustness of different algorithms: under the
premise of achieving the same speech enhancement performance, the lower the consistency requirement, the more robust the algorithm.
In the embodiments of the present application, consistency is measured in terms of both amplitude spectrum difference and phase
spectrum difference, which is objective and accurate, and the quantitative consistency assessment method can objectively guide the
design of microphone arrays and objectively compare the robustness of multi-channel speech enhancement algorithms.
Hereinafter, the method for assessing microphone array consistency of the embodiments of the present application is described in detail in conjunction with Fig. 1 to Fig. 7.
Fig. 1 is a schematic flow chart of the method for assessing microphone array consistency according to an embodiment of the present application.
It should be understood that Fig. 1 shows the steps or operations of the method, but these steps or operations are only examples; the
embodiments of the present application may also perform other operations or variations of the operations in Fig. 1. The method may be
executed by a device for assessing microphone array consistency, where the device may be a mobile phone, a tablet computer, a portable
computer, a personal digital assistant (Personal Digital Assistant, PDA), or the like.
S110: obtain N audio signals respectively acquired by N microphones, where the N microphones constitute a microphone array and N ≥ 2.
When performing a consistency evaluation on the N microphones, the environment in which the N microphones are located needs to be
constrained; that is, the N audio signals are acquired under a special test environment.
Specifically, as shown in Fig. 2, a microphone array 201 composed of N microphones is placed in a test room 202,
and a loudspeaker 203 is configured in the test room 202. In particular, the microphone array 201 is located directly in front
of the loudspeaker 203, and the microphone array 201 and the loudspeaker 203 are connected to a control device 204 such as a computer.
The control device 204 can control the loudspeaker 203 to play specific audio data, for example white Gaussian noise data or
swept-frequency signal data; meanwhile, the control device 204 can obtain, from the microphone array 201, the N audio signals
respectively collected by the N microphones.
It should be noted that microphone consistency evaluation requires the signal-to-noise ratio of the acquired audio signals to be
sufficiently high and the background noise to be sufficiently weak, so the test environment is required to be quiet. In particular,
an anechoic-room environment is required in the test room 202. The loudspeaker 203 is required to have low noise and a flat
frequency-response curve; in particular, the loudspeaker uses an artificial mouth dedicated to audio testing, calibrated with a
standard microphone before use. The microphone array 201 is placed in front of the loudspeaker 203, in particular at the position
where the standard microphone was calibrated.
Optionally, before formal audio signal acquisition, signal-to-noise ratio (signal-to-noise ratio, SNR) detection also needs
to be performed on the above test environment.
Specifically, under the test environment shown in Fig. 2: first, in a quiet environment (i.e., with the loudspeaker 203 turned
off), the first audio data X_1(n) collected by the N microphones within a first duration T_1 is obtained; then, in an environment
in which white Gaussian noise data or swept-frequency signal data are played (i.e., the control device 204 controls the loudspeaker 203
to play white Gaussian noise data or swept-frequency signal data), the second audio data X_2(n) collected by the N microphones within
a second duration T_2 is obtained; next, the SNR is calculated according to the following formula 1:
SNR = 10·lg[(Σ_n X_2²(n)/T_2)/(Σ_n X_1²(n)/T_1)]  (formula 1)
Finally, when the SNR is greater than a set threshold, the detection passes; otherwise, the detection fails.
Here, T_1 denotes the first duration, T_2 denotes the second duration, X_1(n) denotes the first audio data, and X_2(n) denotes the second audio data.
It should be noted that if the detection fails, the above test environment needs to be adjusted or calibrated to eliminate
factors that may affect the signal-to-noise ratio, until the SNR calculated according to the above formula 1 is greater than the set threshold.
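The SNR pre-check above can be sketched as follows. This is a minimal illustration, assuming (as in the reconstruction of formula 1 given above) that the SNR is the dB ratio of the average power of the recording made while the loudspeaker plays to the average power of the quiet-room recording; the function name and the 40 dB threshold are illustrative placeholders, not values from the patent.

```python
import numpy as np

def snr_precheck(x_quiet, x_play, threshold_db=40.0):
    """Compare average power with the loudspeaker on vs. off:
    SNR = 10*lg(P_play / P_quiet) in dB (assumed form of formula 1)."""
    p_quiet = np.mean(np.asarray(x_quiet, dtype=float) ** 2)  # background power
    p_play = np.mean(np.asarray(x_play, dtype=float) ** 2)    # playback power
    snr_db = 10.0 * np.log10(p_play / p_quiet)
    return snr_db, snr_db > threshold_db                      # value, pass/fail
```

If the check fails, the test environment would be adjusted or calibrated and the measurement repeated, as the text describes.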
Optionally, in the embodiments of the present application, acquiring the audio signals using the test environment shown in Fig. 2 may
specifically include:
determining the sampling frequency F_s and the number of FFT points N_fft used by the N microphones when performing audio signal
acquisition, playing white Gaussian noise data or swept-frequency signal data with the loudspeaker, and acquiring the N audio signals with the N microphones.
Optionally, the number of FFT points N_fft is an even number, generally 32, 64, 128, …, 1024, etc.; the more points, the greater
the amount of computation.
It should be noted that if the data played by the loudspeaker are swept-frequency signal data, the swept-frequency signal data consist of M+1
signal segments of equal length and unequal frequency.
Optionally, the frequency of each segment of the M+1 signal segments can be calculated according to the following formula 2, and each
segment of the M+1 signal segments can be calculated according to the following formula 3.
f_i = i·F_s/N_fft  (formula 2)
Wherein f_i is the frequency of the i-th signal segment, F_s is the sampling frequency, and N_fft denotes the number of FFT points.
S_i(t) = sin(2π·f_i·t)  (formula 3)
Wherein S_i(t) denotes the i-th signal segment and f_i is the frequency of the i-th signal segment.
It should be noted that the length of the first signal segment S_1(t) is an integer multiple of the period T, T = 1/f_1.
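The construction of the stepped swept-frequency signal can be sketched as follows. This assumes the reconstructed reading of formula 2 above (f_i = i·F_s/N_fft, i.e., the i-th FFT bin frequency); the segment length is a free parameter here and should be chosen so that each segment holds an integer number of periods, as the text requires for S_1(t).

```python
import numpy as np

def stepped_sweep(fs, nfft, m, seg_len):
    """M+1 equal-length segments S_i(t) = sin(2*pi*f_i*t)  (formula 3),
    with f_i = i*fs/nfft (assumed reading of formula 2)."""
    t = np.arange(seg_len) / fs  # time axis of one segment, in seconds
    segments = [np.sin(2 * np.pi * (i * fs / nfft) * t) for i in range(m + 1)]
    return np.concatenate(segments)  # S(t) = [S_0(t), ..., S_M(t)]
```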
Optionally, the swept-frequency signal data played by the loudspeaker can be written in the following vector form:
S(t) = [S_0(t), S_1(t), …, S_M(t)]^T
wherein S(t) denotes the swept-frequency signal data played by the loudspeaker, S_i(t) denotes the i-th signal segment, and
[·]^T denotes the transposition of a vector or matrix.
Optionally, the N microphones respectively collect N audio signals, wherein the audio signal collected by the i-th microphone
is expressed as x_i(t), and x_i(t) can be written in the following vector form:
x_i(t) = [x_i,1(t), x_i,2(t), …, x_i,K(t)]^T
wherein x_i(t) denotes the audio signal collected by the i-th microphone, K denotes the total number of signal frames
collected by each microphone, and [·]^T denotes the transposition of a vector or matrix.
S120: according to the N audio signals, determine the phase spectrum difference and/or power spectrum difference between each
microphone other than a reference microphone among the N microphones and the reference microphone, where the reference microphone is
any one of the N microphones.
Optionally, in the embodiments of the present application, after the N audio signals are acquired, the phase spectrum differences
between different microphones can be obtained by framing the audio signals, windowing each frame of each audio signal, and applying an FFT transform to each windowed frame.
Specifically, as shown in Fig. 3, suppose the N audio signals are x_1(t), x_2(t), …, x_N(t). Each of the N audio
signals is divided into frames to obtain K signal frames of equal length, K ≥ 2; for example, the i-th audio signal is
divided into frames, and the K signal frames of equal length are written in the following vector form:
x_i(t) = [x_i,1(t), x_i,2(t), …, x_i,K(t)]^T
wherein x_i(t) denotes the i-th audio signal, K denotes the total number of signal frames collected by each microphone, and [·]^T denotes
the transposition of a vector or matrix;
windowing is applied to each of the K signal frames to obtain K windowed signal frames; for example, windowing the
j-th frame x_i,j of the i-th audio signal yields the j-th windowed signal frame y_i,j = x_i,j × Win of the i-th audio signal;
an FFT transform is applied to each of the K windowed signal frames to obtain K target signal frames; for example,
applying the FFT transform to the j-th windowed signal frame y_i,j(t) of the i-th audio signal yields the j-th target
signal frame Y_i,j(ω) of the i-th audio signal;
according to the K target signal frames corresponding to each audio signal, the phase spectrum difference between each
microphone other than the reference microphone among the N microphones and the reference microphone is determined; for example, suppose the dominant
frequency of the j-th target signal frame is ω_0; then the phase spectrum difference between the i-th microphone and the reference microphone
at the fundamental frequency ω_0 can be calculated according to the following formula 4.
PDiff_i(ω_0) = imag(ln(Y_1,j(ω_0)/Y_i,j(ω_0)))  (formula 4)
Wherein imag(·) takes the imaginary part, ln(·) takes the natural logarithm, PDiff_i(ω_0) denotes the phase spectrum difference between the
i-th microphone and the reference microphone, Y_1,j(ω_0) denotes the j-th target signal frame of the reference microphone, Y_i,j(ω_0) denotes the
j-th target signal frame of the i-th microphone, and ω_0 denotes the fundamental frequency.
It should be noted that in the above Fig. 3, the first microphone is taken as the reference microphone, and the phase spectrum
difference between each microphone other than the first microphone and the first microphone is calculated separately; the first
microphone corresponds to audio signal x_1(t), the second microphone corresponds to audio signal x_2(t), …, and the N-th microphone
corresponds to audio signal x_N(t).
Optionally, K denotes the total number of signal frames received by each microphone.
It should be noted that windowing is used to eliminate the truncation effect caused by framing. Optionally, a Hamming window
may be applied to each of the K signal frames.
In some possible implementations, any two adjacent signal frames among the K signal frames overlap by R%, R > 0. For
example, R is 25 or 50; in other words, any two adjacent signal frames among the K signal frames overlap by 25% or 50%.
Optionally, the signal amplitude remains unchanged after overlapped windowing.
It should be understood that each frame after overlapping contains components of the previous frame, which prevents discontinuity between two frames.
Optionally, in the embodiments of the present application, when performing a phase-consistency assessment, the N audio signals are
signals acquired in an environment in which swept-frequency signal data are played. In other words, when calculating the above phase
spectrum difference, the N audio signals are signals acquired in an environment in which swept-frequency signal data are played.
It is thus possible to calculate the phase difference at any frequency ω, obtaining the phase spectrum difference PDiff_i(ω)
between the i-th microphone and the reference microphone, i.e., the above PDiff_i(ω_0) evaluated at each frequency.
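The phase-difference path of S120 (framing, Hamming windowing, FFT, then formula 4) can be sketched as below. The 512-sample frame with 50% overlap is an illustrative choice, and averaging the per-frame differences across the K frames is an assumption not spelled out in the text.

```python
import numpy as np

def frame_fft(x, frame_len, hop):
    """Frame the signal, apply a Hamming window, and FFT each frame."""
    n_frames = 1 + (len(x) - frame_len) // hop
    win = np.hamming(frame_len)
    frames = np.stack([x[j * hop : j * hop + frame_len] * win
                       for j in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: K x (frame_len//2 + 1)

def phase_spectrum_diff(x_ref, x_i, frame_len=512, hop=256):
    """Formula 4 per frame and bin: PDiff = imag(ln(Y_ref / Y_i)),
    here averaged across frames (illustrative choice)."""
    y_ref = frame_fft(x_ref, frame_len, hop)
    y_i = frame_fft(x_i, frame_len, hop)
    pdiff = np.imag(np.log(y_ref / y_i))  # wrapped phase difference
    return pdiff.mean(axis=0)
```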
Optionally, in the embodiments of the present application, after the N audio signals are acquired, the power spectrum differences
between different microphones can be obtained by framing the audio signals, windowing each frame of each audio signal, applying an FFT
transform to each windowed frame, and computing the power spectrum of each transformed frame.
Specifically, suppose the N audio signals are x_1(t), x_2(t), …, x_N(t). Each of the N audio
signals is divided into frames to obtain K signal frames of equal length, K ≥ 2; for example, the i-th audio signal is
divided into frames, and the K signal frames of equal length are written in the following vector form:
x_i(t) = [x_i,1(t), x_i,2(t), …, x_i,K(t)]^T
wherein x_i(t) denotes the i-th audio signal, K denotes the total number of signal frames received by each microphone, and [·]^T denotes
the transposition of a vector or matrix;
windowing is applied to each of the K signal frames to obtain K windowed signal frames; for example, windowing the
j-th frame x_i,j of the i-th audio signal yields the j-th windowed signal frame y_i,j = x_i,j × Win of the i-th audio signal;
an FFT transform is applied to each of the K windowed signal frames to obtain K target signal frames; for example,
applying the FFT transform to the j-th windowed signal frame y_i,j(t) of the i-th audio signal yields the j-th target
signal frame Y_i,j(ω) of the i-th audio signal;
according to the K target signal frames corresponding to each audio signal, the power spectrum of each audio signal is
determined; for example, the power spectrum of the i-th audio signal is calculated according to the following formula 5;
according to the power spectrum of each audio signal, the power spectrum difference between each microphone other than the
reference microphone among the N microphones and the reference microphone is determined; for example, the power spectrum difference
between the i-th microphone and the reference microphone is calculated according to the following formula 6.
P_i(ω) = (1/K)·Σ_{j=1…K} |Y_i,j(ω)|²  (formula 5)
Wherein P_i(ω) denotes the power spectrum of the i-th audio signal, Y_i,j(ω) denotes the j-th target signal frame of the i-th audio
signal, ω denotes frequency, and K denotes the total number of signal frames collected by each microphone.
PD_i(ω) = P_1(ω) − P_i(ω)  (formula 6)
Wherein PD_i(ω) denotes the power spectrum difference between the i-th microphone and the reference microphone, P_1(ω) denotes the
power spectrum of the reference microphone, and P_i(ω) denotes the power spectrum of the i-th microphone.
It should be noted that in the above Fig. 4, the first microphone is taken as the reference microphone, and the power spectrum
difference between each microphone other than the first microphone and the first microphone is calculated separately; the first
microphone corresponds to audio signal x_1(t), the second microphone corresponds to audio signal x_2(t), …, and the N-th microphone
corresponds to audio signal x_N(t).
It should be noted that windowing is used to eliminate the truncation effect caused by framing. Optionally, a Hamming window
may be applied to each of the K signal frames.
In some possible implementations, any two adjacent signal frames among the K signal frames overlap by R%, R > 0. For
example, R is 25 or 50; in other words, any two adjacent signal frames among the K signal frames overlap by 25% or 50%.
Optionally, the signal amplitude remains unchanged after overlapped windowing.
It should be understood that each frame after overlapping contains components of the previous frame, which prevents discontinuity between two frames.
Optionally, in the embodiments of the present application, when performing an amplitude-consistency assessment, the N audio signals are
signals acquired in an environment in which white Gaussian noise data or swept-frequency signal data are played. In other words, when
calculating the above power spectrum difference, the N audio signals are signals acquired in an environment in which white Gaussian
noise data or swept-frequency signal data are played.
S130: perform a consistency evaluation on the N microphones according to the phase spectrum difference and/or power spectrum
difference between each microphone other than the reference microphone among the N microphones and the reference microphone.
Specifically, the phase spectrum difference is used for phase-consistency assessment, and the power spectrum difference is used for
amplitude-consistency assessment.
Optionally, in the embodiments of the present application, the phase consistency between a microphone and the reference microphone
is assessed according to the phase spectrum difference between that microphone, among the N microphones other than the reference
microphone, and the reference microphone.
It should be noted that the smaller the phase spectrum difference between two microphones, the better the phase consistency
between the two microphones.
For example, if the phase spectrum difference between microphone 1 and the reference microphone is A, then the smaller A is, the
better the phase consistency between microphone 1 and the reference microphone.
It is alternatively possible to a threshold value be arranged, if the phase spectrum difference between two microphones is less than this threshold value, table
Show that the phase equalization between the two microphones meets design requirement, the consistency between the two microphones is to multichannel language
Sound enhancing algorithm influence can ignore or the two microphones between consistency to multicenter voice enhancing algorithm do not have
It influences.
It should be noted that above-mentioned threshold value can enhance algorithm flexible configuration according to different multicenter voices.
It should be noted that, because the distances from the different microphones to the sound source can hardly be exactly the same
during data acquisition, there is a fixed phase offset between different microphones.
Optionally, in the embodiments of the present application, the above phase spectrum difference can be calibrated by the fixed phase offset.
Specifically, the distance difference to the sound source between each microphone other than the reference microphone among the N
microphones and the reference microphone is measured respectively; for example, d_i denotes the difference between the distances from
the i-th microphone and from the reference microphone to the sound source;
according to the measured distance differences, the fixed phase offset between each microphone other than the reference microphone
among the N microphones and the reference microphone is calculated separately; for example, the fixed phase offset between the i-th
microphone and the reference microphone can be calculated according to the following formula 7;
Y_i(ω) = Y_1(ω)·e^(−j·2πω·d_i/c)  (formula 7)
according to the fixed phase offset between each microphone other than the reference microphone among the N microphones and the
reference microphone, the corresponding phase spectrum differences are calibrated respectively.
Wherein Y_i(ω) denotes the spectrum of the i-th microphone, Y_1(ω) denotes the spectrum of the reference microphone, ω denotes
frequency, d_i denotes the distance difference from the i-th microphone and the reference microphone to the sound source, c denotes
the speed of sound, and 2πω·d_i/c denotes the fixed phase offset between the i-th microphone and the reference microphone.
It should be noted that the fixed phase offset and the signal frequency satisfy a linear relationship; therefore, the fixed phase
offset can be determined by linear fitting.
For example, suppose the fixed phase offset between microphone 1 and the reference microphone is A, and the phase spectrum difference
between microphone 1 and the reference microphone is B. As shown in Fig. 5, the straight-line portion represents the fitted fixed phase
offset between microphone 1 and the reference microphone, and the curved portion represents the phase spectrum difference between
microphone 1 and the reference microphone; overall, as the frequency increases from 0 Hz to 8000 Hz, the phase spectrum difference
between microphone 1 and the reference microphone decreases from 0 radians to −2 radians. After calibration, the phase spectrum
difference between microphone 1 and the reference microphone is C, as shown by the curve in Fig. 6; at this point, C = B − A, and
overall, as the frequency increases from 0 Hz to 8000 Hz, the phase spectrum difference between microphone 1 and the reference
microphone fluctuates around 0 radians, within ±0.5 radians.
Comparing Fig. 5 and Fig. 6 shows that the fixed phase offset has a considerable influence on the phase spectrum difference between
two microphones; therefore, when performing a phase-consistency assessment on two microphones, the influence caused by the fixed phase
offset between the two microphones needs to be eliminated.
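The calibration by linear fitting described above can be sketched as follows. A zero-intercept fit is used here because the fixed offset 2πω·d_i/c vanishes at 0 Hz; this specific fitting choice is an assumption, not a detail given in the text.

```python
import numpy as np

def calibrate_phase_diff(freqs, pdiff):
    """Remove the fixed phase offset by linear fitting: the offset is
    linear in frequency and zero at f = 0, so fit a zero-intercept line
    A(f) = slope*f to the measured difference B(f) and return C = B - A.
    The fitted slope corresponds to -2*pi*d_i/c in the text."""
    freqs = np.asarray(freqs, dtype=float)
    slope = freqs @ pdiff / (freqs @ freqs)  # zero-intercept least squares
    return pdiff - slope * freqs             # calibrated difference C
```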
Optionally, in the embodiments of the present application, the amplitude consistency between a microphone and the reference microphone
is assessed according to the power spectrum difference between that microphone, among the N microphones other than the reference
microphone, and the reference microphone.
It should be noted that the smaller the power spectrum difference between two microphones, the better the amplitude consistency
between the two microphones.
For example, as shown in Fig. 7: specifically, Fig. 7a shows the power spectrum of microphone 1 and the power spectrum of the
reference microphone, and Fig. 7b shows the power spectrum difference between microphone 1 and the reference microphone; the power
spectra of microphone 1 and the reference microphone differ little, and the maximum of their power spectrum difference is less than ±1 decibel (dB).
It is alternatively possible to a threshold value be arranged, if the power spectrum difference between two microphones is less than this threshold value, table
Show that the amplitude coincidence between the two microphones meets design requirement, the consistency between the two microphones is to multichannel language
Sound enhancing algorithm influence can ignore or the two microphones between consistency to multicenter voice enhancing algorithm do not have
It influences.
It should be noted that above-mentioned threshold value can enhance algorithm flexible configuration according to different multicenter voices.
Optionally, in the embodiments of the present application, the influence of factors such as the circuits, electronic components, and
acoustic structure of the microphone array on microphone consistency can be tested item by item to guide the calibration of the
microphone array; specifically, this can guide the design of microphones and microphone arrays and assess the robustness of
multi-channel enhancement algorithms.
Therefore, in the embodiments of the present application, according to the N audio signals respectively acquired by the N microphones,
the phase spectrum difference and/or power spectrum difference between each microphone and the reference microphone can be determined
to perform a consistency evaluation on the N microphones, eliminating the influence of inter-microphone consistency on multi-channel
speech enhancement algorithms and improving user experience.
Optionally, as shown in Fig. 8, an embodiment of the present application provides equipment 800 for assessing microphone array
consistency, comprising:
an acquiring unit 810, configured to obtain N audio signals respectively acquired by N microphones, the N microphones constituting
a microphone array, N ≥ 2;
a processing unit 820, configured to determine, according to the N audio signals, the phase spectrum difference and/or power spectrum
difference between each microphone other than a reference microphone among the N microphones and the reference microphone, the
reference microphone being any one of the N microphones;
the processing unit 820 being further configured to perform a consistency evaluation on the N microphones according to the phase
spectrum difference and/or power spectrum difference between each microphone other than the reference microphone among the N
microphones and the reference microphone.
Optionally, the processing unit 820 is specifically used for:
assess the phase consistency between a corresponding microphone and the reference microphone according to the phase spectrum
difference between each microphone other than the reference microphone among the N microphones and the reference microphone.
Optionally, the processing unit 820 is also used to:
measure respectively the distance difference to the sound source between each microphone other than the reference microphone
among the N microphones and the reference microphone;
calculate separately, according to the measured distance differences, the fixed phase offset between each microphone other than
the reference microphone among the N microphones and the reference microphone;
calibrate respectively the corresponding phase spectrum differences according to the fixed phase offset between each microphone
other than the reference microphone among the N microphones and the reference microphone.
Optionally, the processing unit 820 is specifically used for:
calculate separately, according to the formula Y_i(ω) = Y_1(ω)·e^(−j·2πω·d_i/c), the fixed phase offset between each microphone
other than the reference microphone among the N microphones and the reference microphone,
wherein Y_i(ω) denotes the spectrum of the i-th microphone, Y_1(ω) denotes the spectrum of the reference microphone, ω denotes
frequency, d_i denotes the distance difference from the i-th microphone and the reference microphone to the sound source, c denotes
the speed of sound, and 2πω·d_i/c denotes the fixed phase offset between the i-th microphone and the reference microphone.
Optionally, the processing unit 820 is specifically used for:
assess the amplitude consistency between a corresponding microphone and the reference microphone according to the power spectrum
difference between each microphone other than the reference microphone among the N microphones and the reference microphone.
Optionally, the N audio signals are signals acquired in an environment in which swept-frequency signal data are played.
Optionally, the N audio signals are signals acquired in an environment in which white Gaussian noise data or swept-frequency
signal data are played.
Optionally, the swept-frequency signal is any one of a linear swept-frequency signal, a logarithmic swept-frequency signal, a
linear stepped swept-frequency signal, and a logarithmic stepped swept-frequency signal.
Optionally, the processing unit 820 is specifically used for:
divide each audio signal of the N audio signals into frames to obtain K signal frames of equal length, K ≥ 2;
apply windowing to each of the K signal frames to obtain K windowed signal frames;
apply an FFT transform to each of the K windowed signal frames to obtain K target signal frames;
determine, according to the K target signal frames corresponding to each audio signal, the phase spectrum difference and/or power
spectrum difference between each microphone other than the reference microphone among the N microphones and the reference microphone.
Optionally, any two adjacent signal frames among the K signal frames overlap by R%, R > 0.
Optionally, R is 25 or 50.
Optionally, the i-th audio signal is divided into frames, and the K signal frames of equal length are written in the following
vector form:
x_i(t) = [x_i,1(t), x_i,2(t), …, x_i,K(t)]^T
wherein x_i(t) denotes the i-th audio signal, K denotes the total number of signal frames collected by each microphone, and [·]^T denotes
the transposition of a vector or matrix.
Optionally, the processing unit 820 is specifically configured to:
determine the phase spectrum difference between each of the N microphones other than the reference microphone and the reference microphone according to the formula
Δφ_i(ω_0) = imag(ln(Y_{1,j}(ω_0) / Y_{i,j}(ω_0)))
where imag(·) takes the imaginary part, ln(·) takes the natural logarithm, Δφ_i(ω_0) denotes the phase spectrum difference between the i-th microphone and the reference microphone, Y_{1,j}(ω_0) denotes the j-th target signal frame of the reference microphone, Y_{i,j}(ω_0) denotes the j-th target signal frame of the i-th microphone, and ω_0 denotes the fundamental frequency.
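A minimal sketch of this comparison in code (an illustration only; it assumes the phase spectrum difference is the imaginary part of the natural logarithm of the ratio of the reference frame spectrum to the i-th frame spectrum, consistent with the imag(·) and ln(·) operators defined here; the names are not from the patent):

```python
import numpy as np

def phase_spectrum_difference(Y_ref: np.ndarray, Y_i: np.ndarray) -> np.ndarray:
    """Per-bin phase spectrum difference imag(ln(Y_ref / Y_i)).

    The imaginary part of the complex log of a ratio is the phase
    difference of the two spectra, wrapped to (-pi, pi].
    """
    return np.imag(np.log(Y_ref / Y_i))

# Example: two unit-amplitude spectral bins whose phases differ by 0.3 rad
Y1 = np.array([np.exp(1j * 0.5)])   # reference microphone frame bin
Yi = np.array([np.exp(1j * 0.2)])   # i-th microphone frame bin
pd = phase_spectrum_difference(Y1, Yi)   # approximately 0.3 rad
```

Because the result is inherently wrapped, any fixed deviation larger than π (e.g. from a large path-length difference) would need the geometric calibration described earlier in this text.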
Optionally, the processing unit 820 is specifically configured to:
determine the power spectrum of each audio signal from its corresponding K target signal frames; and
determine, according to the power spectrum of each audio signal, the power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone.
Optionally, the processing unit 820 is specifically configured to:
calculate the power spectrum of each audio signal according to the formula
P_i(ω) = (1/K) · Σ_{j=1}^{K} |Y_{i,j}(ω)|²
where P_i(ω) denotes the power spectrum of the i-th audio signal, Y_{i,j}(ω) denotes the j-th target signal frame of the i-th audio signal, K denotes the total number of frames collected by each microphone, and ω denotes frequency.
Optionally, the processing unit 820 is specifically configured to:
calculate the power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone according to the formula PD_i(ω) = P_1(ω) − P_i(ω),
where PD_i(ω) denotes the power spectrum difference between the i-th microphone and the reference microphone, P_1(ω) denotes the power spectrum of the reference microphone, and P_i(ω) denotes the power spectrum of the i-th microphone.
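Taken together, the per-microphone power spectrum and the difference PD_i(ω) = P_1(ω) − P_i(ω) can be sketched as follows (illustrative only; it assumes the power spectrum is the average squared magnitude over the K target signal frames, consistent with the definitions in this text):

```python
import numpy as np

def power_spectrum(frames: np.ndarray) -> np.ndarray:
    """P(w) = (1/K) * sum_j |Y_j(w)|^2 over K target signal frames.

    frames has shape (K, n_bins); the result has shape (n_bins,).
    """
    return np.mean(np.abs(frames) ** 2, axis=0)

def power_spectrum_difference(frames_ref: np.ndarray, frames_i: np.ndarray) -> np.ndarray:
    """PD_i(w) = P_1(w) - P_i(w): reference spectrum minus i-th spectrum."""
    return power_spectrum(frames_ref) - power_spectrum(frames_i)

# Example: microphone i is 6 dB below the reference (amplitude factor 0.5)
ref = np.ones((4, 8), dtype=complex)          # K=4 frames, 8 bins, |Y| = 1
mic = 0.5 * np.ones((4, 8), dtype=complex)    # |Y| = 0.5, so power 0.25
pd = power_spectrum_difference(ref, mic)      # 1.0 - 0.25 = 0.75 per bin
```

A per-bin difference near zero across the band then indicates good amplitude consistency between the two microphones.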
Optionally, the processing unit 820 is specifically configured to:
determine the sampling frequency F_s and the number of FFT points N_fft used when the N microphones sample the audio signals, play white Gaussian noise data or swept-frequency signal data through a loudspeaker, and control the N microphones to acquire the N audio signals,
where, if the data played by the loudspeaker is swept-frequency signal data, the swept-frequency signal data consists of M+1 signal segments of equal length and unequal frequency.
Optionally, the processing unit 820 is further configured to:
calculate the frequency of each of the M+1 signal segments according to the formula f_i = i·F_s/N_fft, and
calculate each of the M+1 signal segments according to the formula S_i(t) = sin(2πf_i·t),
where f_i denotes the frequency of the i-th segment, F_s denotes the sampling frequency, N_fft denotes the number of FFT points, S_i(t) denotes the i-th segment, and the length of S_1(t) is an integer multiple of the period T, T = 1/f_1.
Optionally, the swept-frequency signal data played by the loudspeaker is written in the following vector form:
S(t) = [S_0(t), S_1(t), …, S_M(t)]^T
where S(t) denotes the swept-frequency signal data played by the loudspeaker, S_i(t) denotes the i-th segment signal, and [·]^T denotes the transpose of a vector or matrix.
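The construction of the stepped swept-frequency data can be sketched as follows (illustrative only; it assumes segment frequencies lie on the FFT bin grid, f_i = i·F_s/N_fft, which is one consistent reading of the definitions here, and the function name and segment length are not from the patent):

```python
import numpy as np

def stepped_sweep(fs: int, nfft: int, n_segments: int, seg_len: int) -> np.ndarray:
    """Concatenate equal-length sinusoid segments S_i(t) = sin(2*pi*f_i*t).

    Segment frequencies are assumed bin-aligned: f_i = i * fs / nfft,
    so each segment falls exactly on one FFT bin of an nfft-point analysis.
    """
    t = np.arange(seg_len) / fs
    segments = [np.sin(2 * np.pi * (i * fs / nfft) * t)
                for i in range(n_segments)]
    return np.concatenate(segments)

# Example: 5 segments of 1024 samples at 16 kHz with a 512-point FFT grid
sweep = stepped_sweep(fs=16000, nfft=512, n_segments=5, seg_len=1024)
```

Note that the i = 0 segment is silence (f_0 = 0), matching a vector S(t) that starts at S_0(t); bin-aligned segment frequencies avoid spectral leakage in the later per-bin comparisons.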
Optionally, the N microphones respectively collect N audio signals, where the audio signal collected by the i-th microphone is denoted x_i(t), and x_i(t) can be written in the following vector form:
x_i(t) = [x_{i,1}(t), x_{i,2}(t), …, x_{i,K}(t)]^T
where x_i(t) denotes the audio signal collected by the i-th microphone, K denotes the total number of frames collected by each microphone, and [·]^T denotes the transpose of a vector or matrix.
Optionally, the acquiring unit 810 is specifically configured to:
place the N microphones in a test room in which a loudspeaker is arranged, the N microphones being located in front of the loudspeaker; and
control the loudspeaker to play white Gaussian noise data or swept-frequency signal data, and control the N microphones to respectively acquire the N audio signals.
Optionally, the test room has an anechoic (noise-reduced) environment, the loudspeaker is an artificial mouth dedicated to audio testing, and the artificial mouth is calibrated with a standard microphone before use.
Optionally, before the processing unit 820 controls the loudspeaker to play white Gaussian noise data or swept-frequency signal data, the acquiring unit 810 is further configured to:
obtain, in a quiet environment, first audio data X_1(n) acquired by the N microphones over a first duration T_1;
obtain, in an environment where white Gaussian noise data or swept-frequency signal data is played, second audio data X_2(n) acquired by the N microphones over a second duration T_2; and
trigger the processing unit 820 to calculate the signal-to-noise ratio according to the formula SNR = 10·log_10(Σ_n X_2²(n) / Σ_n X_1²(n)), and ensure that the SNR is greater than a first threshold.
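The signal-to-noise check can be sketched as follows (illustrative only; the exact SNR formula is not legible in this text, so the conventional energy-ratio form 10·log_10(ΣX_2²/ΣX_1²) is assumed, and the threshold value is hypothetical):

```python
import numpy as np

def snr_db(x_signal: np.ndarray, x_quiet: np.ndarray) -> float:
    """SNR = 10*log10( sum(x_signal^2) / sum(x_quiet^2) ) in dB.

    x_quiet is the noise-floor recording made in a quiet environment;
    x_signal is the recording made while the test data is played.
    """
    return 10.0 * np.log10(np.sum(x_signal ** 2) / np.sum(x_quiet ** 2))

# Example gate: require the playback recording to sit well above the floor
quiet = 0.01 * np.ones(1000)   # stand-in for X_1(n), the noise floor
loud = 1.0 * np.ones(1000)     # stand-in for X_2(n), with playback on
ratio_db = snr_db(loud, quiet)                 # 40 dB for these stand-ins
ok = ratio_db > 20.0                           # hypothetical first threshold
```

If the check fails, the playback level or the room isolation is insufficient, and acquisition would be repeated before the consistency evaluation proceeds.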
Optionally, as shown in Fig. 9, an embodiment of the present application provides a device 900 for evaluating microphone array consistency, comprising:
a memory 910 for storing programs and data; and
a processor 920 for calling and running the programs and data stored in the memory;
the device 900 being configured to execute the methods shown in Figs. 1 to 7 above.
Optionally, as shown in Fig. 10, an embodiment of the present application provides a system 1000 for evaluating microphone array consistency, comprising:
N microphones constituting a microphone array 1010, N ≥ 2;
at least one audio source 1020; and
a device 1030, including a memory 1031 for storing programs and data and a processor 1032 for calling and running the programs and data stored in the memory, the device 1030 being configured to execute the methods shown in Figs. 1 to 7 above.
It should be understood that the sequence numbers of the above processes do not imply any order of execution among the various embodiments of the present application; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered to be beyond the scope of the present application.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative; for instance, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of a given embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any person familiar with the art can readily conceive of changes or substitutions within the technical scope disclosed in the present application, and these should all be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (48)
1. A method for evaluating microphone array consistency, comprising:
obtaining N audio signals respectively acquired by N microphones, the N microphones constituting a microphone array, N ≥ 2;
determining, according to the N audio signals, a phase spectrum difference and/or a power spectrum difference between each of the N microphones other than a reference microphone and the reference microphone, the reference microphone being any one of the N microphones; and
performing a consistency evaluation on the N microphones according to the phase spectrum difference and/or power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone.
2. The method according to claim 1, wherein performing a consistency evaluation on the N microphones according to the phase spectrum difference between each of the N microphones other than the reference microphone and the reference microphone comprises:
assessing the phase consistency between each microphone other than the reference microphone and the reference microphone according to the phase spectrum difference between that microphone and the reference microphone.
3. The method according to claim 2, further comprising:
measuring, for each of the N microphones other than the reference microphone, the difference in distance from that microphone and the reference microphone to the sound source;
calculating, according to the measured distance differences, the fixed phase deviation between each of the N microphones other than the reference microphone and the reference microphone; and
calibrating the corresponding phase spectrum differences according to the fixed phase deviations between each of the N microphones other than the reference microphone and the reference microphone.
4. The method according to claim 3, wherein calculating, according to the measured distance differences, the fixed phase deviation between each of the N microphones other than the reference microphone and the reference microphone comprises:
calculating the fixed phase deviations according to the formula
Y_i(ω) = Y_1(ω) · e^(−j·2πωd_i/c)
where Y_i(ω) denotes the spectrum of the i-th microphone, Y_1(ω) denotes the spectrum of the reference microphone, ω denotes frequency, d_i denotes the difference in distance from the i-th microphone and the reference microphone to the sound source, c denotes the speed of sound, and 2πωd_i/c denotes the fixed phase deviation between the i-th microphone and the reference microphone.
5. The method according to any one of claims 1 to 4, wherein performing a consistency evaluation on the N microphones according to the power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone comprises:
assessing the amplitude consistency between each microphone other than the reference microphone and the reference microphone according to the power spectrum difference between that microphone and the reference microphone.
6. The method according to any one of claims 2 to 4, wherein the N audio signals are signals acquired in an environment where swept-frequency signal data is played.
7. The method according to claim 5, wherein the N audio signals are signals acquired in an environment where white Gaussian noise data or swept-frequency signal data is played.
8. The method according to claim 6 or 7, wherein the swept-frequency signal is any one of a linear sweep signal, a logarithmic sweep signal, a linear stepped sweep signal, and a logarithmic stepped sweep signal.
9. The method according to any one of claims 1 to 8, wherein determining, according to the N audio signals, the phase spectrum difference and/or power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone comprises:
dividing each of the N audio signals into frames to obtain K signal frames of equal length, K ≥ 2;
applying a window function to each of the K signal frames to obtain K windowed signal frames;
applying an FFT to each of the K windowed signal frames to obtain K target signal frames; and
determining, from the K target signal frames corresponding to each audio signal, the phase spectrum difference and/or power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone.
10. The method according to claim 9, wherein any two adjacent signal frames among the K signal frames overlap by R%, R > 0.
11. The method according to claim 10, wherein R is 25 or 50.
12. The method according to any one of claims 9 to 11, wherein the i-th audio signal is divided into frames, and the K equal-length signal frames are written in the following vector form:
x_i(t) = [x_{i,1}(t), x_{i,2}(t), …, x_{i,K}(t)]^T
where x_i(t) denotes the i-th audio signal, K denotes the total number of frames collected by each microphone, and [·]^T denotes the transpose of a vector or matrix.
13. The method according to any one of claims 9 to 12, wherein determining, from the K target signal frames corresponding to each audio signal, the phase spectrum difference between each of the N microphones other than the reference microphone and the reference microphone comprises:
determining the phase spectrum differences according to the formula
Δφ_i(ω_0) = imag(ln(Y_{1,j}(ω_0) / Y_{i,j}(ω_0)))
where imag(·) takes the imaginary part, ln(·) takes the natural logarithm, Δφ_i(ω_0) denotes the phase spectrum difference between the i-th microphone and the reference microphone, Y_{1,j}(ω_0) denotes the j-th target signal frame of the reference microphone, Y_{i,j}(ω_0) denotes the j-th target signal frame of the i-th microphone, and ω_0 denotes the fundamental frequency.
14. The method according to any one of claims 9 to 13, wherein determining, from the K target signal frames corresponding to each audio signal, the power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone comprises:
determining the power spectrum of each audio signal from its corresponding K target signal frames; and
determining, according to the power spectrum of each audio signal, the power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone.
15. The method according to claim 14, wherein determining the power spectrum of each audio signal from its corresponding K target signal frames comprises:
calculating the power spectrum of each audio signal according to the formula
P_i(ω) = (1/K) · Σ_{j=1}^{K} |Y_{i,j}(ω)|²
where P_i(ω) denotes the power spectrum of the i-th audio signal, Y_{i,j}(ω) denotes the j-th target signal frame of the i-th audio signal, K denotes the total number of frames collected by each microphone, and ω denotes frequency.
16. The method according to claim 14 or 15, wherein determining, according to the power spectrum of each audio signal, the power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone comprises:
calculating the power spectrum difference between each microphone other than the reference microphone and the reference microphone according to the formula PD_i(ω) = P_1(ω) − P_i(ω),
where PD_i(ω) denotes the power spectrum difference between the i-th microphone and the reference microphone, P_1(ω) denotes the power spectrum of the reference microphone, and P_i(ω) denotes the power spectrum of the i-th microphone.
17. The method according to any one of claims 1 to 16, wherein obtaining the N audio signals respectively acquired by the N microphones comprises:
determining the sampling frequency F_s and the number of FFT points N_fft used when the N microphones sample the audio signals, playing white Gaussian noise data or swept-frequency signal data through a loudspeaker, and having the N microphones acquire the N audio signals,
wherein, if the data played by the loudspeaker is swept-frequency signal data, the swept-frequency signal data consists of M+1 signal segments of equal length and unequal frequency.
18. The method according to claim 17, wherein:
the frequency of each of the M+1 signal segments is calculated according to the formula f_i = i·F_s/N_fft, and
each of the M+1 signal segments is calculated according to the formula S_i(t) = sin(2πf_i·t),
where f_i denotes the frequency of the i-th segment, F_s denotes the sampling frequency, N_fft denotes the number of FFT points, S_i(t) denotes the i-th segment, and the length of S_1(t) is an integer multiple of the period T, T = 1/f_1.
19. The method according to claim 18, wherein the swept-frequency signal data played by the loudspeaker is written in the following vector form:
S(t) = [S_0(t), S_1(t), …, S_M(t)]^T
where S(t) denotes the swept-frequency signal data played by the loudspeaker, S_i(t) denotes the i-th segment signal, and [·]^T denotes the transpose of a vector or matrix.
20. The method according to any one of claims 1 to 19, wherein the N microphones respectively collect N audio signals, the audio signal collected by the i-th microphone being denoted x_i(t), and x_i(t) can be written in the following vector form:
x_i(t) = [x_{i,1}(t), x_{i,2}(t), …, x_{i,K}(t)]^T
where x_i(t) denotes the audio signal collected by the i-th microphone, K denotes the total number of frames collected by each microphone, and [·]^T denotes the transpose of a vector or matrix.
21. The method according to any one of claims 1 to 20, wherein obtaining the N audio signals respectively acquired by the N microphones comprises:
placing the N microphones in a test room in which a loudspeaker is arranged, the N microphones being located in front of the loudspeaker; and
controlling the loudspeaker to play white Gaussian noise data or swept-frequency signal data, and controlling the N microphones to respectively acquire the N audio signals.
22. The method according to claim 21, wherein the test room has an anechoic (noise-reduced) environment, the loudspeaker is an artificial mouth dedicated to audio testing, and the artificial mouth is calibrated with a standard microphone before use.
23. The method according to claim 21 or 22, wherein, before controlling the loudspeaker to play white Gaussian noise data or swept-frequency signal data, the method further comprises:
obtaining, in a quiet environment, first audio data X_1(n) acquired by the N microphones over a first duration T_1;
obtaining, in an environment where white Gaussian noise data or swept-frequency signal data is played, second audio data X_2(n) acquired by the N microphones over a second duration T_2; and
calculating the signal-to-noise ratio according to the formula SNR = 10·log_10(Σ_n X_2²(n) / Σ_n X_1²(n)), and ensuring that the SNR is greater than a first threshold.
24. A device for evaluating microphone array consistency, comprising:
an acquiring unit, configured to obtain N audio signals respectively acquired by N microphones, the N microphones constituting a microphone array, N ≥ 2; and
a processing unit, configured to determine, according to the N audio signals, the phase spectrum difference and/or power spectrum difference between each of the N microphones other than a reference microphone and the reference microphone, the reference microphone being any one of the N microphones;
the processing unit being further configured to perform a consistency evaluation on the N microphones according to the phase spectrum difference and/or power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone.
25. The device according to claim 24, wherein the processing unit is specifically configured to:
assess the phase consistency between each microphone other than the reference microphone and the reference microphone according to the phase spectrum difference between that microphone and the reference microphone.
26. The device according to claim 25, wherein the processing unit is further configured to:
measure, for each of the N microphones other than the reference microphone, the difference in distance from that microphone and the reference microphone to the sound source;
calculate, according to the measured distance differences, the fixed phase deviation between each of the N microphones other than the reference microphone and the reference microphone; and
calibrate the corresponding phase spectrum differences according to the fixed phase deviations between each of the N microphones other than the reference microphone and the reference microphone.
27. The device according to claim 26, wherein the processing unit is specifically configured to:
calculate the fixed phase deviations according to the formula
Y_i(ω) = Y_1(ω) · e^(−j·2πωd_i/c)
where Y_i(ω) denotes the spectrum of the i-th microphone, Y_1(ω) denotes the spectrum of the reference microphone, ω denotes frequency, d_i denotes the difference in distance from the i-th microphone and the reference microphone to the sound source, c denotes the speed of sound, and 2πωd_i/c denotes the fixed phase deviation between the i-th microphone and the reference microphone.
28. The device according to any one of claims 24 to 27, wherein the processing unit is specifically configured to:
assess the amplitude consistency between each of the N microphones other than the reference microphone and the reference microphone according to the power spectrum difference between that microphone and the reference microphone.
29. The device according to any one of claims 25 to 27, wherein the N audio signals are signals acquired in an environment where swept-frequency signal data is played.
30. The device according to claim 28, wherein the N audio signals are signals acquired in an environment where white Gaussian noise data or swept-frequency signal data is played.
31. The device according to claim 29 or 30, wherein the swept-frequency signal is any one of a linear sweep signal, a logarithmic sweep signal, a linear stepped sweep signal, and a logarithmic stepped sweep signal.
32. The device according to any one of claims 24 to 31, wherein the processing unit is specifically configured to:
divide each of the N audio signals into frames to obtain K signal frames of equal length, K ≥ 2;
apply a window function to each of the K signal frames to obtain K windowed signal frames;
apply an FFT to each of the K windowed signal frames to obtain K target signal frames; and
determine, from the K target signal frames corresponding to each audio signal, the phase spectrum difference and/or power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone.
33. The device according to claim 32, wherein any two adjacent signal frames among the K signal frames overlap by R%, R > 0.
34. The device according to claim 33, wherein R is 25 or 50.
35. The device according to any one of claims 32 to 34, wherein the i-th audio signal is divided into frames, and the K equal-length signal frames are written in the following vector form:
x_i(t) = [x_{i,1}(t), x_{i,2}(t), …, x_{i,K}(t)]^T
where x_i(t) denotes the i-th audio signal, K denotes the total number of frames collected by each microphone, and [·]^T denotes the transpose of a vector or matrix.
36. The device according to any one of claims 32 to 35, wherein the processing unit is specifically configured to:
determine the phase spectrum differences according to the formula
Δφ_i(ω_0) = imag(ln(Y_{1,j}(ω_0) / Y_{i,j}(ω_0)))
where imag(·) takes the imaginary part, ln(·) takes the natural logarithm, Δφ_i(ω_0) denotes the phase spectrum difference between the i-th microphone and the reference microphone, Y_{1,j}(ω_0) denotes the j-th target signal frame of the reference microphone, Y_{i,j}(ω_0) denotes the j-th target signal frame of the i-th microphone, and ω_0 denotes the fundamental frequency.
37. The device according to any one of claims 32 to 36, wherein the processing unit is specifically configured to:
determine the power spectrum of each audio signal from its corresponding K target signal frames; and
determine, according to the power spectrum of each audio signal, the power spectrum difference between each of the N microphones other than the reference microphone and the reference microphone.
38. The device according to claim 37, wherein the processing unit is specifically configured to:
calculate the power spectrum of each audio signal according to the formula
P_i(ω) = (1/K) · Σ_{j=1}^{K} |Y_{i,j}(ω)|²
where P_i(ω) denotes the power spectrum of the i-th audio signal, Y_{i,j}(ω) denotes the j-th target signal frame of the i-th audio signal, K denotes the total number of frames collected by each microphone, and ω denotes frequency.
39. The device according to claim 37 or 38, wherein the processing unit is specifically configured to:
calculate the power spectrum difference between each microphone other than the reference microphone and the reference microphone according to the formula PD_i(ω) = P_1(ω) − P_i(ω),
where PD_i(ω) denotes the power spectrum difference between the i-th microphone and the reference microphone, P_1(ω) denotes the power spectrum of the reference microphone, and P_i(ω) denotes the power spectrum of the i-th microphone.
40. The device according to any one of claims 24 to 39, wherein the processing unit is specifically configured to:
determine the sampling frequency F_s and the number of FFT points N_fft used when the N microphones sample the audio signals, play white Gaussian noise data or swept-frequency signal data through a loudspeaker, and control the N microphones to acquire the N audio signals,
wherein, if the data played by the loudspeaker is swept-frequency signal data, the swept-frequency signal data consists of M+1 signal segments of equal length and unequal frequency.
41. The device according to claim 40, wherein the processing unit is further configured to:
calculate the frequency of each of the M+1 signal segments according to the formula f_i = i·F_s/N_fft, and
calculate each of the M+1 signal segments according to the formula S_i(t) = sin(2πf_i·t),
where f_i denotes the frequency of the i-th segment, F_s denotes the sampling frequency, N_fft denotes the number of FFT points, S_i(t) denotes the i-th segment, and the length of S_1(t) is an integer multiple of the period T, T = 1/f_1.
42. The device according to claim 41, wherein the swept-frequency signal data played by the loudspeaker is written in the following vector form:
S(t) = [S_0(t), S_1(t), …, S_M(t)]^T
where S(t) denotes the swept-frequency signal data played by the loudspeaker, S_i(t) denotes the i-th segment signal, and [·]^T denotes the transpose of a vector or matrix.
43. The device according to any one of claims 24 to 42, wherein the N microphones respectively collect N audio signals, the audio signal collected by the i-th microphone being denoted x_i(t), and x_i(t) can be written in the following vector form:
x_i(t) = [x_{i,1}(t), x_{i,2}(t), …, x_{i,K}(t)]^T
where x_i(t) denotes the audio signal collected by the i-th microphone, K denotes the total number of frames collected by each microphone, and [·]^T denotes the transpose of a vector or matrix.
44. The device according to any one of claims 24 to 43, wherein the acquiring unit is specifically configured to:
place the N microphones in a test room in which a loudspeaker is arranged, the N microphones being located in front of the loudspeaker; and
control the loudspeaker to play white Gaussian noise data or swept-frequency signal data, and control the N microphones to respectively acquire the N audio signals.
45. The equipment according to claim 44, wherein the test room provides an anechoic (noise-reducing) environment, the loudspeaker is an artificial mouth dedicated to audio testing, and the artificial mouth is calibrated with a standard microphone before use.
46. The equipment according to claim 44 or 45, wherein before the processing unit controls the loudspeaker to play the white Gaussian noise data or swept-frequency signal data, the acquiring unit is further configured to:
obtain first audio data X1(n) collected by the N microphones over a first duration T1 in a quiet environment; and
obtain second audio data X2(n) collected by the N microphones over a second duration T2 in an environment in which the white Gaussian noise data or swept-frequency signal data is played;
and the processing unit is triggered to calculate the signal-to-noise ratio SNR according to the formula SNR = 10·lg(ΣX2²(n) / ΣX1²(n)), and to ensure that the SNR is greater than a first threshold.
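The SNR pre-check in claim 46 could be sketched as below, using the common definition SNR = 10·log10(P_signal / P_noise) over average power; the patent's exact formula is not reproduced here, and the threshold and signal levels are assumptions for illustration:

```python
import numpy as np

def snr_db(x_quiet, x_playing):
    """SNR in dB from a quiet-room recording and a recording during playback."""
    p_noise = np.mean(x_quiet ** 2)    # average power of X1(n), quiet environment
    p_total = np.mean(x_playing ** 2)  # average power of X2(n), during playback
    return 10 * np.log10(p_total / p_noise)

rng = np.random.default_rng(0)
x1 = 0.01 * rng.standard_normal(16000)  # X1(n): quiet-room noise (assumed level)
x2 = 1.0 * rng.standard_normal(16000)   # X2(n): recording during playback

first_threshold = 20.0                   # dB; illustrative, not from the patent
assert snr_db(x1, x2) > first_threshold  # the check claim 46 requires to pass
```

Running the SNR check before the consistency measurement rejects setups where the playback level barely rises above the room noise, which would corrupt the later spectrum comparison.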
47. A device for evaluating microphone array consistency, comprising:
a memory for storing a program and data; and
a processor for calling and running the program and data stored in the memory,
wherein the device is configured to perform the method according to any one of claims 1 to 23.
48. A system for evaluating microphone array consistency, comprising:
N microphones constituting a microphone array, N ≥ 2;
at least one audio source; and
a device comprising a memory for storing a program and data, and a processor for calling and running the program and data stored in the memory, the device being configured to:
perform the method according to any one of claims 1 to 23.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310466643.4A CN116437280A (en) | 2018-08-22 | 2018-08-22 | Method, device, apparatus and system for evaluating consistency of microphone array |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/101766 WO2020037555A1 (en) | 2018-08-22 | 2018-08-22 | Method, device, apparatus, and system for evaluating microphone array consistency |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310466643.4A Division CN116437280A (en) | 2018-08-22 | 2018-08-22 | Method, device, apparatus and system for evaluating consistency of microphone array |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109313909A (en) | 2019-02-05 |
CN109313909B (en) | 2023-05-12 |
Family
ID=65221692
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201880001199.6A Active CN109313909B (en) | 2018-08-22 | 2018-08-22 | Method, device, apparatus and system for evaluating consistency of microphone array |
CN202310466643.4A Pending CN116437280A (en) | 2018-08-22 | 2018-08-22 | Method, device, apparatus and system for evaluating consistency of microphone array |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310466643.4A Pending CN116437280A (en) | 2018-08-22 | 2018-08-22 | Method, device, apparatus and system for evaluating consistency of microphone array |
Country Status (2)
Country | Link |
---|---|
CN (2) | CN109313909B (en) |
WO (1) | WO2020037555A1 (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006033734A (en) * | 2004-07-21 | 2006-02-02 | Sanyo Electric Co Ltd | Sound inspection method and device of electric product |
CN1756444A (en) * | 2004-09-30 | 2006-04-05 | 富迪科技股份有限公司 | Self detection and correction method for electroacoustic system |
US20100135501A1 (en) * | 2008-12-02 | 2010-06-03 | Tim Corbett | Calibrating at least one system microphone |
WO2011057346A1 (en) * | 2009-11-12 | 2011-05-19 | Robert Henry Frater | Speakerphone and/or microphone arrays and methods and systems of using the same |
CN102075848A (en) * | 2011-02-17 | 2011-05-25 | 深圳市豪恩声学股份有限公司 | Method and system for testing array microphone and rotating device |
CN102111697A (en) * | 2009-12-28 | 2011-06-29 | 歌尔声学股份有限公司 | Method and device for controlling noise reduction of microphone array |
CN102461203A (en) * | 2009-06-09 | 2012-05-16 | 高通股份有限公司 | Systems, methods, apparatus, and computer-readable media for phase-based processing of multichannel signal |
CN103247298A (en) * | 2013-04-28 | 2013-08-14 | 华为技术有限公司 | Sensitivity calibration method and audio frequency apparatus |
CN103559330A (en) * | 2013-10-10 | 2014-02-05 | 上海华为技术有限公司 | Method and system for detecting data consistency |
US20140119552A1 (en) * | 2012-10-26 | 2014-05-01 | Broadcom Corporation | Loudspeaker localization with a microphone array |
CN103917886A (en) * | 2011-08-31 | 2014-07-09 | 弗兰霍菲尔运输应用研究公司 | Direction of arrival estimation using watermarked audio signals and microphone arrays |
CN105554674A (en) * | 2015-12-28 | 2016-05-04 | 努比亚技术有限公司 | Microphone calibration method, device and mobile terminal |
CN106161751A (en) * | 2015-04-14 | 2016-11-23 | 电信科学技术研究院 | A kind of noise suppressing method and device |
WO2016209098A1 (en) * | 2015-06-26 | 2016-12-29 | Intel Corporation | Phase response mismatch correction for multiple microphones |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103871420B (en) * | 2012-12-13 | 2016-12-21 | 华为技术有限公司 | The signal processing method of microphone array and device |
EP2942982A1 (en) * | 2014-05-05 | 2015-11-11 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | System, apparatus and method for consistent acoustic scene reproduction based on informed spatial filtering |
CN107864444B (en) * | 2017-11-01 | 2019-10-29 | 大连理工大学 | A kind of microphone array frequency response calibration method |
2018
- 2018-08-22 WO PCT/CN2018/101766 patent/WO2020037555A1/en active Application Filing
- 2018-08-22 CN CN201880001199.6A patent/CN109313909B/en active Active
- 2018-08-22 CN CN202310466643.4A patent/CN116437280A/en active Pending
Non-Patent Citations (1)
Title |
---|
Ye Yong et al.: "Research on array-element screening methods for microphone arrays", Chinese Journal of Sensors and Actuators (《传感技术学报》) *
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110111807A (en) * | 2019-04-27 | 2019-08-09 | 南京理工大学 | A kind of indoor sound source based on microphone array follows and Enhancement Method |
CN110111807B (en) * | 2019-04-27 | 2022-01-11 | 南京理工大学 | Microphone array-based indoor sound source following and enhancing method |
CN110636432A (en) * | 2019-09-29 | 2019-12-31 | 深圳市火乐科技发展有限公司 | Microphone testing method and related equipment |
CN111065036B (en) * | 2019-12-26 | 2021-08-31 | 北京声智科技有限公司 | Frequency response testing method and device of microphone array |
CN111065036A (en) * | 2019-12-26 | 2020-04-24 | 北京声智科技有限公司 | Frequency response testing method and device of microphone array |
CN112672265A (en) * | 2020-10-13 | 2021-04-16 | 珠海市杰理科技股份有限公司 | Method and system for detecting microphone consistency and computer readable storage medium |
CN112889299A (en) * | 2021-01-12 | 2021-06-01 | 华为技术有限公司 | Method and apparatus for evaluating microphone array consistency |
WO2022150950A1 (en) * | 2021-01-12 | 2022-07-21 | 华为技术有限公司 | Method and apparatus for evaluating consistency of microphone array |
CN113259830A (en) * | 2021-04-26 | 2021-08-13 | 歌尔股份有限公司 | Multi-microphone consistency test system and method |
CN114390421A (en) * | 2021-12-03 | 2022-04-22 | 伟创力电子技术(苏州)有限公司 | Automatic testing method for microphone matrix and loudspeaker |
CN114222234A (en) * | 2021-12-31 | 2022-03-22 | 思必驰科技股份有限公司 | Microphone array consistency detection method, electronic device and storage medium |
CN114449434A (en) * | 2022-04-07 | 2022-05-06 | 荣耀终端有限公司 | Microphone calibration method and electronic equipment |
CN115776626A (en) * | 2023-02-10 | 2023-03-10 | 杭州兆华电子股份有限公司 | Frequency response calibration method and system of microphone array |
Also Published As
Publication number | Publication date |
---|---|
CN109313909B (en) | 2023-05-12 |
CN116437280A (en) | 2023-07-14 |
WO2020037555A1 (en) | 2020-02-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109313909A (en) | Method, unit and system for assessing microphone array consistency | |
CN110491403B (en) | Audio signal processing method, device, medium and audio interaction equipment | |
CN106486131B (en) | Speech denoising method and device | |
CN105469785B (en) | Voice activity detection method and device in communication terminal dual microphone noise-canceling system | |
CN108766454A (en) | Voice noise suppression method and device | |
WO2020181824A1 (en) | Voiceprint recognition method, apparatus and device, and computer-readable storage medium | |
CN107910014A (en) | Test method, device and the test equipment of echo cancellor | |
EP3826012B1 (en) | Method and device for evaluating performance of speech enhancement algorithm, electronic device and computer-readable storage medium | |
WO2021093380A1 (en) | Noise processing method and apparatus, and system | |
CN109102819A (en) | Howling detection method and device | |
CN103903634B (en) | Method and apparatus for activation sound detection | |
JP2012155339A (en) | Improvement in multisensor sound quality using sound state model | |
CN106161751A (en) | Noise suppression method and device | |
CN110491373A (en) | Model training method, device, storage medium and electronic equipment | |
WO2022174727A1 (en) | Howling suppression method and apparatus, hearing aid, and storage medium | |
CN108200526A (en) | Sound system tuning method and device based on a confidence curve | |
WO2020112577A1 (en) | Similarity measure assisted adaptation control of an echo canceller | |
US11915718B2 (en) | Position detection method, apparatus, electronic device and computer readable storage medium | |
WO2022105571A1 (en) | Speech enhancement method and apparatus, and device and computer-readable storage medium | |
JP7352740B2 (en) | Method and apparatus for wind noise attenuation | |
EP3549355A1 (en) | Combined audio signal output | |
BR112014009647B1 (en) | NOISE Attenuation APPLIANCE AND NOISE Attenuation METHOD | |
JP2009276365A (en) | Processor, voice recognition device, voice recognition system and voice recognition method | |
CN111192569B (en) | Double-microphone voice feature extraction method and device, computer equipment and storage medium | |
CN106710602A (en) | Acoustic reverberation time estimation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||