CN109309764A

CN109309764A - Audio data processing method, device, electronic equipment and storage medium

Info

Publication number: CN109309764A
Application number: CN201710632689.3A
Authority: CN
Inventors: 李洋; 纪璇; 陈伟
Original assignee: Beijing Sogou Technology Development Co Ltd
Current assignee: Beijing Sogou Technology Development Co Ltd
Priority date: 2017-07-28
Filing date: 2017-07-28
Publication date: 2019-02-05
Anticipated expiration: 2037-07-28
Also published as: CN109309764B

Abstract

Embodiments of the invention provide an audio data processing method, a device, an electronic device and a storage medium for effectively eliminating echo in recorded audio. The method includes: acquiring a voice signal and determining a far-end (remote) signal according to a frame shift, where the frame shift is not equal to the block length; determining a preset number of target far-end signals from the far-end signal, where some of the target far-end signals are identical to some of the target far-end signals of a set earlier frame, the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length; and performing echo cancellation on the voice signal using the target far-end signals to obtain the echo-cancelled target signal.

Description

Audio data processing method, device, electronic equipment and storage medium

Technical field

This application relates to the technical field, and in particular to an audio data processing method, an audio data processing device, an electronic device and a readable storage medium.

Background art

With the rapid development of communication technology, terminals such as mobile phones and tablet computers have become increasingly common, bringing great convenience to people's life, study and work.

When using a terminal, a user can interact with other users by voice, video and so on, for example by making a phone call or a video call. During these interactions the terminal usually turns on the microphone (Mic) to record speech and send it to the other party, and also plays the other party's voice data through the loudspeaker. In practice, therefore, the audio data recorded by the microphone contains both the voice of the local user and the voice of the other party played by the loudspeaker; the recorded loudspeaker playback of the other party's voice is called echo. To improve communication quality and prevent the echo from interfering with the normal speech content of the audio, the echo needs to be eliminated.

Summary of the invention

Embodiments of the invention provide an audio data processing method for effectively eliminating echo in recorded audio.

Correspondingly, embodiments of the invention also provide an audio data processing device, an electronic device and a storage medium to ensure the implementation and application of the above method.

To solve the above problem, an embodiment of the invention discloses an audio data processing method, comprising: acquiring a voice signal, and determining a far-end signal according to a frame shift, where the frame shift is not equal to the block length; determining a preset number of target far-end signals from the far-end signal, where some of the target far-end signals are identical to some of the target far-end signals of a set earlier frame, the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length; and performing echo cancellation on the voice signal using the target far-end signals to obtain the echo-cancelled target signal.

Optionally, determining the far-end signal according to the frame shift comprises: determining, according to the frame shift, a far-end signal of a first length, where the first length is related to the frame shift; and splicing far-end signals according to the frame length and the frame shift to obtain a far-end signal of a second length, where the second length is related to the frame length.

Optionally, splicing far-end signals according to the frame length and the frame shift to obtain the far-end signal of the second length comprises: determining, according to the frame length, a far-end signal of a third length that precedes the far-end signal of the first length, the third length being the difference between the second length and the first length; and splicing the far-end signal of the third length with the far-end signal of the first length to obtain the far-end signal of the second length.

Optionally, determining the preset number of target far-end signals from the far-end signal comprises: determining a first number of target far-end signals of a fourth length from the far-end signal of the second length; and obtaining a second number of target far-end signals of the fourth length stored for the set earlier frame, where the sum of the first number and the second number is the preset number, and the fourth length is related to the block length.

Optionally, performing echo cancellation on the voice signal using the target far-end signals to obtain the echo-cancelled target signal comprises: determining the echo signal to be cancelled from the preset number of target far-end signals of the fourth length; and subtracting the echo signal to be cancelled from the voice signal to obtain the echo-cancelled target signal.

Optionally, determining the echo signal to be cancelled from the preset number of target far-end signals of the fourth length comprises: processing the first number of target far-end signals of the fourth length to obtain a first far-end signal in the frequency domain; obtaining the second far-end signal in the frequency domain that corresponds to the second number of target far-end signals of the fourth length; and processing the first and second frequency-domain far-end signals with the spatial impulse response to obtain the echo signal to be cancelled.

Optionally, the method further comprises: updating the frame number corresponding to the second number of target far-end signals of the fourth length.

An embodiment of the invention also discloses an audio data processing device, comprising: a signal acquisition module for acquiring a voice signal and determining a far-end signal according to a frame shift, where the frame shift is not equal to the block length; a signal processing module for determining a preset number of target far-end signals from the far-end signal, where some of the target far-end signals are identical to some of the target far-end signals of a set earlier frame, the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length; and an echo cancellation module for performing echo cancellation on the voice signal using the target far-end signals to obtain the echo-cancelled target signal.

Optionally, the signal acquisition module comprises: a far-end acquisition submodule for determining, according to the frame shift, a far-end signal of a first length, where the first length is related to the frame shift; and a splicing submodule for splicing far-end signals according to the frame length and the frame shift to obtain a far-end signal of a second length, where the second length is related to the frame length.

Optionally, the splicing submodule is configured to determine, according to the frame length, a far-end signal of a third length that precedes the far-end signal of the first length, the third length being the difference between the second length and the first length; and to splice the far-end signal of the third length with the far-end signal of the first length to obtain the far-end signal of the second length.

Optionally, the signal processing module comprises: a target determination submodule for determining a first number of target far-end signals of a fourth length from the far-end signal of the second length; and a cache acquisition submodule for obtaining a second number of target far-end signals of the fourth length stored for the set earlier frame, where the sum of the first number and the second number is the preset number, and the fourth length is related to the block length.

Optionally, the echo cancellation module comprises: an echo determination submodule for determining the echo signal to be cancelled from the preset number of target far-end signals of the fourth length; and a cancellation submodule for subtracting the echo signal to be cancelled from the voice signal to obtain the echo-cancelled target signal.

Optionally, the echo determination submodule is configured to process the first number of target far-end signals of the fourth length to obtain a first far-end signal in the frequency domain; to obtain the second far-end signal in the frequency domain that corresponds to the second number of target far-end signals of the fourth length; and to process the first and second frequency-domain far-end signals with the spatial impulse response to obtain the echo signal to be cancelled.

Optionally, the echo cancellation module is also configured to update the frame number corresponding to the second number of target far-end signals of the fourth length.

An embodiment of the invention also discloses an electronic device, characterized by comprising a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs include instructions for performing the following operations:

acquiring a voice signal, and determining a far-end signal according to a frame shift, where the frame shift is not equal to the block length;

determining a preset number of target far-end signals from the far-end signal, where some of the target far-end signals are identical to some of the target far-end signals of a set earlier frame, the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length;

performing echo cancellation on the voice signal using the target far-end signals to obtain the echo-cancelled target signal.

Optionally, the programs also include instructions for performing the following operation: updating the frame number corresponding to the second number of target far-end signals of the fourth length.

An embodiment of the invention also discloses a readable storage medium, characterized in that, when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the audio data processing method described in one or more of the embodiments of the invention.

Embodiments of the invention have the following advantages:

Embodiments of the invention can acquire a voice signal and determine a far-end signal according to a frame shift that is not equal to the block length, and then determine a preset number of target far-end signals from the far-end signal, some of which were already determined for a set earlier frame; the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length. When echo cancellation is then performed on the voice signal using the target far-end signals, the repeated target far-end signals need not be recomputed, which effectively reduces the amount of computation. The echo-cancelled target signal is thereby obtained, so the echo in the voice signal is effectively eliminated and the voice delay can be shortened.

Brief description of the drawings

Fig. 1 is a flow chart of the steps of an audio data processing method embodiment of the application;

Fig. 2 is a flow chart of the steps of another audio data processing method embodiment of the application;

Fig. 3 is a structural block diagram of an audio data processing device embodiment of the application;

Fig. 4 is a structural block diagram of another audio data processing device embodiment of the application;

Fig. 5 is a structural block diagram of an electronic device for audio data processing according to an exemplary embodiment;

Fig. 6 is a structural schematic diagram of an electronic device for audio data processing according to another exemplary embodiment of the application.

Detailed description of embodiments

To make the above objects, features and advantages of the application clearer, the application is described in further detail below with reference to the drawings and specific embodiments.

In communication that includes voice, acoustic echo is difficult to avoid. The far-end signal is transmitted to the near end by telephone or network and played by the loudspeaker; after propagating through the space, it is picked up by the near-end microphone, and the far-end signal sent back in this way is the acoustic echo. In the time domain, the voice signal received by the microphone can be modelled as:

y(n) = h(n) * x(n) + d(n)

where y(n) is the voice signal captured by the microphone; x(n) is the far-end signal; h(n) is the impulse response of the space; h(n) * x(n) is the convolution of x(n) and h(n), representing the far-end signal as picked up by the near-end microphone after propagating through the space; and d(n) is the near-end signal, that is, the echo-cancelled target signal.
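The signal model above can be illustrated with a minimal sketch. The function names and the sample values (a 2-tap impulse response, four far-end samples, a constant near-end signal) are hypothetical and chosen only to make the arithmetic easy to follow; they do not come from the patent.

```python
def convolve(h, x):
    """Full linear convolution of impulse response h with signal x."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            out[i + j] += hv * xv
    return out


def microphone_signal(h, x, d):
    """Model y(n) = h(n) * x(n) + d(n), truncated to len(x) samples."""
    echo = convolve(h, x)[:len(x)]
    return [e + dn for e, dn in zip(echo, d)]


# Hypothetical values, for illustration only.
h = [0.5, 0.25]            # spatial impulse response
x = [1.0, 0.0, 2.0, 0.0]   # far-end signal
d = [0.1, 0.1, 0.1, 0.1]   # near-end signal
y = microphone_signal(h, x, d)
```

Here the echo component is h(n) * x(n), and y differs from it only by the near-end signal d.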

An Acoustic Echo Cancellation (AEC) algorithm can be used to eliminate the acoustic echo signal. Such an algorithm usually has two steps: the first is an adaptive filter algorithm and the second is a residual-echo post-filtering algorithm, which together yield the echo-cancelled target signal. Embodiments of the invention improve on the AEC algorithm to eliminate echo more effectively.

Referring to Fig. 1, a flow chart of the steps of an audio data processing method embodiment of the application is shown, which may specifically include the following steps:

Step 102, acquire a voice signal, and determine a far-end signal according to the frame shift, where the frame shift is not equal to the block length.

When a device such as a terminal is used for communication that includes voice, a microphone can be used to record the voice data, that is, to acquire the voice signal. A microphone is a transducer that converts sound signals into electrical signals; in this embodiment it may be built into the device or be an external microphone connected to the device. The acquired voice signal contains echo, which is the signal picked up by the microphone after being played by the loudspeaker, that is, the far-end signal received by the microphone. The device can also acquire the far-end signal, which is the signal transmitted to the device by telephone or network during communication and played by the loudspeaker. Echo cancellation can be performed periodically, that is, signals are acquired and cancelled at regular intervals. In this embodiment the far-end signal can be determined according to the frame length and the frame shift, and the frame shift is not equal to the block length.

In one example, the adaptive filter algorithm may use a partitioned block frequency domain adaptive filter (Partitioned Block Frequency Domain Adaptive Filter, PBFDAF). In applying the PBFDAF algorithm, this embodiment sets the block length of the partitions equal to the length of each block filter, while the frame shift of the speech processing is not equal to the block length. Let the block length be N, so that the block filter length is also N, and let the frame shift be M; then N ≠ M, where N and M are integers and may, for example, be set to powers of 2. The block length and the frame shift are both parameters of the PBFDAF algorithm.

Step 104, determine a preset number of target far-end signals from the far-end signal, where some of the target far-end signals are identical to some of the target far-end signals of a set earlier frame, the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length.

In the PBFDAF algorithm, once the far-end signal has been determined, the target far-end signals can be determined from it; these are the far-end signals that must be processed to perform echo cancellation. In this embodiment the target far-end signals consist of a preset number of far-end signals of a preset length. The preset number can be determined from the frame length, the block length and so on, and the preset length is related to the block length. Because the far-end signal is acquired according to the frame shift, the resulting target far-end signals are also related to the frame length and the block length, and the set frame is related to the frame shift and the block length. Comparing the preset number of target far-end signals determined at the current frame shift with those of earlier frames shows that part of the target far-end signals of the current frame repeats part of those of the set earlier frame. Suppose the current frame is frame i and the set frame is frame i-b; then some of the target far-end signals of frame i-b are identical to some of those of frame i. The repeated target far-end signals therefore need not be recomputed during processing; only the non-repeated ones have to be computed.

Step 106, perform echo cancellation on the voice signal using the target far-end signals to obtain the echo-cancelled target signal.

The target far-end signals are then processed by converting them from the time domain to the frequency domain, for example with operations based on the Fourier transform such as the discrete Fourier transform (DFT) or the fast Fourier transform (FFT), to obtain the far-end signal in the frequency domain. Together with the spatial impulse response h(n), the frequency-domain far-end signal determines, in the frequency domain, the noise, that is, the echo signal received by the microphone. The Fourier transform results of the repeated target far-end signals need not be recomputed; it suffices to fetch the results computed for the set earlier frame.

The noise, that is, the echo signal received by the microphone, can be determined from the frequency-domain far-end signal: h(n) * x(n) can be computed in the frequency domain as the product of the frequency-domain far-end signal and the frequency-domain spatial impulse response, giving the far-end signal as picked up by the near-end microphone after propagating through the space. Echo cancellation is then performed against the voice signal to remove the echo from it and obtain the echo-cancelled target signal, thereby eliminating the echo in the recorded audio data. For example, during voice or video calls, echo can be removed as far as possible from the microphone recording before it is transmitted to the other party, ensuring call quality. Let the voice signal be y(n). The convolution of the far-end signal with the spatial impulse response in the time domain corresponds to their product in the frequency domain, which yields the echo signal h(n) * x(n); the echo-cancelled target signal is then d(n) = y(n) - h(n) * x(n).
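The relation d(n) = y(n) - h(n) * x(n), with the convolution evaluated as a product in the frequency domain, can be sketched as follows. This is a minimal illustration that assumes the impulse response h is already known; in an actual AEC it is estimated by the adaptive filter. The naive `dft` helper, the function name `cancel_echo` and the sample values are assumptions made for the example.

```python
import cmath


def dft(a, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(a)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(a[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def cancel_echo(y, x, h):
    """Return d(n) = y(n) - (h * x)(n), computing h * x via a frequency-domain product."""
    size = len(x) + len(h) - 1  # zero-pad so the circular product equals linear convolution
    X = dft(list(x) + [0.0] * (size - len(x)))
    H = dft(list(h) + [0.0] * (size - len(h)))
    echo = [v.real for v in dft([a * b for a, b in zip(X, H)], inverse=True)]
    return [yn - en for yn, en in zip(y, echo)]


# Hypothetical values: y was formed as h * x plus a constant near-end signal of 0.1.
h = [0.5, 0.25]
x = [1.0, 0.0, 2.0, 0.0]
y = [0.6, 0.35, 1.1, 0.6]
d_hat = cancel_echo(y, x, h)
```

With an exact h the subtraction recovers the near-end signal up to rounding error; with an estimated h some residual echo remains, which is what the post-filtering step addresses.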

In a PBFDAF algorithm where the block length equals the frame shift, a large block length helps echo cancellation but increases the voice delay, whereas a small block length keeps the voice delay low but degrades echo cancellation performance. By contrast, in the scheme of this embodiment the frame shift is not equal to the block length and can be set smaller than the block length, so echo can be cancelled effectively while the voice delay is reduced.

In summary, a voice signal can be acquired and a far-end signal determined according to a frame shift that is not equal to the block length, and a preset number of target far-end signals can then be determined from the far-end signal, some of which were already determined for a set earlier frame; the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length. When echo cancellation is performed on the voice signal using the target far-end signals, the repeated target far-end signals need not be recomputed, which effectively reduces the amount of computation. The echo-cancelled target signal is thereby obtained, so the echo in the voice signal is effectively eliminated and the voice delay can be shortened.

Referring to Fig. 2, a flow chart of the steps of another audio data processing method embodiment of the application is shown, which may specifically include the following steps:

Step 202, acquire a voice signal.

Step 204, determine, according to the frame shift, a far-end signal of a first length, where the first length is related to the frame shift.

Step 206, splice far-end signals according to the frame length and the frame shift to obtain a far-end signal of a second length, where the second length is related to the frame length.

When a device such as a terminal is used for communication that includes voice, a microphone can be used to acquire the voice signal. The acquired voice signal contains echo, which is the signal picked up by the microphone after being played by the loudspeaker, that is, the far-end signal received by the microphone. The far-end signal can also be acquired; it is the signal transmitted to the device by telephone or network during communication and played by the loudspeaker. Echo cancellation can be performed periodically, that is, signals are acquired and cancelled at regular intervals. In this embodiment the far-end signal can be determined according to the frame length and the frame shift, and the frame shift is not equal to the block length. Let the block length be N and the frame shift be M; the block filter length is then also N, with N ≠ M and N and M positive integers.

When echo cancellation is performed, a far-end signal of the first length is obtained first; the first length is related to the frame shift M. The far-end signal is received continuously during this process, and this embodiment performs echo cancellation periodically, so each time a far-end signal of the first length has accumulated, that signal is obtained, and one processing cycle is run for every first length of far-end signal. The far-end signal to be spliced is then determined from the frame length and spliced with the far-end signal of the first length to obtain the far-end signal of the second length, where the second length is related to the frame length.

In an optional embodiment, splicing far-end signals according to the frame length and the frame shift to obtain the far-end signal of the second length comprises: determining, according to the frame length, a far-end signal of a third length that precedes the far-end signal of the first length, the third length being the difference between the second length and the first length; and splicing the far-end signal of the third length with the far-end signal of the first length to obtain the far-end signal of the second length. The second length can be determined from the frame length, and the third length is then the second length minus the first length; the far-end signal of the third length that precedes the far-end signal of the first length is obtained. The far-end signal of the third length and the far-end signal of the first length are then spliced in order, for example in time order, to obtain the far-end signal of the second length.
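The splicing step can be sketched as below, assuming the second length equals the frame length L and the first length equals the frame shift M, so that the third length is L - M. The class name `FrameSplicer` and the zero-filled initial history are illustrative assumptions, not part of the patent.

```python
from collections import deque


class FrameSplicer:
    """Builds a frame of length L from the previous L - M samples plus M new ones."""

    def __init__(self, frame_len, frame_shift):
        if not 0 < frame_shift <= frame_len:
            raise ValueError("frame shift must be in (0, frame_len]")
        self.frame_shift = frame_shift
        keep = frame_len - frame_shift  # third length: L - M
        # Assumption: history starts as silence (zeros).
        self.history = deque([0.0] * keep, maxlen=keep)

    def push(self, new_samples):
        """Splice the stored third-length signal with M new far-end samples."""
        if len(new_samples) != self.frame_shift:
            raise ValueError("expected exactly frame_shift new samples")
        frame = list(self.history) + list(new_samples)
        self.history.extend(new_samples)  # oldest samples fall out automatically
        return frame


splicer = FrameSplicer(frame_len=8, frame_shift=2)
first = splicer.push([1.0, 2.0])
second = splicer.push([3.0, 4.0])
```

Each call corresponds to one processing cycle: only M new samples arrive, yet a full frame of L samples is available for block partitioning.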

Step 208, determine a first number of far-end signals of a fourth length from the far-end signal of the second length.

Step 210, obtain a second number of far-end signals of the fourth length stored for the set earlier frame, where the sum of the first number and the second number is the preset number, and the fourth length is related to the block length.

Far-end signals of the fourth length can be determined from the far-end signal of the second length; a preset number of far-end signals of the fourth length serve as the target far-end signals. The fourth length is related to the block length, and the preset number is related to the frame length and the block length.

In this embodiment, the number of repeated signals among the preset number of target far-end signals is the second number and the number of non-repeated signals is the first number, so first number + second number = preset number. The second number can be determined from the block length and the frame shift. The first number of far-end signals of the fourth length can be determined from the far-end signal of the second length, and the stored second number of far-end signals of the fourth length can be fetched from a cache, memory or the like. The first number and the second number of far-end signals of the fourth length together form the preset number of target far-end signals of the fourth length.
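As a rough sketch of how the preset number of target far-end signals might be cut from the far-end history, the function below takes the fourth length to be 2N and the preset number to be k = L/N, as in the worked example later in the description. The name `target_blocks` and the list-based history are assumptions for illustration. Note that the oldest block reaches N samples further back than the frame itself, so (k + 1) * N samples of history are required.

```python
def target_blocks(history, block_len, frame_len):
    """Return k = L/N overlapping blocks of length 2N, newest block first.

    history[-1] is the newest far-end sample; block j covers the samples
    from (j + 2) * N before the end up to j * N before the end.
    """
    k = frame_len // block_len
    if len(history) < (k + 1) * block_len:
        raise ValueError("need at least (k + 1) * N samples of history")
    end = len(history)
    return [history[end - (j + 2) * block_len:end - j * block_len] for j in range(k)]


# Hypothetical sizes: N = 4, L = 8, so k = 2 blocks of length 8.
history = list(range(20))
blocks = target_blocks(history, block_len=4, frame_len=8)
```

Consecutive blocks overlap by N samples, which is the usual overlap-save arrangement for block frequency-domain filtering.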

Step 212, perform Fourier transform processing on the first number of target far-end signals of the fourth length to obtain a first far-end signal in the frequency domain.

Step 214, obtain the stored second far-end signal in the frequency domain that corresponds to the second number of target far-end signals of the fourth length.

Step 216, process the first and second frequency-domain far-end signals with the spatial impulse response to obtain the echo signal to be cancelled.

A fast Fourier transform (FFT) can then be applied to the target far-end signals, that is, the preset number of far-end signals of the fourth length, to obtain the first far-end signal in the frequency domain. For the repeated part of the target far-end signals, namely the second number of target far-end signals of the fourth length, the second frequency-domain far-end signal produced by the Fourier transform has already been stored, so the stored second frequency-domain far-end signal can simply be fetched. The first and second frequency-domain far-end signals together form the frequency-domain target far-end signals, which can be assembled according to timing information; the frequency-domain target far-end signals are then multiplied by the spatial impulse response to obtain the echo signal to be cancelled.
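The multiplication of the frequency-domain target far-end signals by the spatial impulse response can be sketched with standard partitioned overlap-save convolution. The patent does not spell out this computation, so the sketch below is an assumption about one common way to do it: the filter of k * N taps is split into k partitions of N taps, each zero-padded to 2N; the per-block products are summed; and only the last N samples of the inverse transform are kept. The names `dft` and `partitioned_echo` and the test values are hypothetical, and a naive DFT stands in for an FFT.

```python
import cmath


def dft(a, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(a)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(a[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def partitioned_echo(blocks, h, block_len):
    """Echo estimate for the newest N samples.

    blocks: k far-end blocks of length 2N, newest first.
    h: impulse response with k * N taps (assumed known here).
    """
    acc = [0j] * (2 * block_len)
    for j, block in enumerate(blocks):
        taps = list(h[j * block_len:(j + 1) * block_len]) + [0.0] * block_len
        X, H = dft(block), dft(taps)
        acc = [a + xv * hv for a, xv, hv in zip(acc, X, H)]
    y = dft(acc, inverse=True)
    return [v.real for v in y[block_len:]]  # first N outputs are circular wrap-around


# Hypothetical values: N = 2, k = 2.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
h = [0.5, 0.25, -0.125, 0.0625]
blocks = [x[4:8], x[2:6]]
echo = partitioned_echo(blocks, h, block_len=2)
```

Because each block's spectrum enters the sum independently, a spectrum computed for an earlier frame can be reused unchanged whenever the same block of samples reappears.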

Because the current frame and the frame a set number of frames before it share repeated target far-end signals, the FFT of the repeated second number of target far-end signals of the fourth length can be computed at that earlier frame to obtain the second frequency-domain far-end signal. When processing has advanced by the set number of frames and reaches the current frame, the previously computed second far-end signal is fetched directly, and the FFT does not need to be recomputed.

Step 218, perform echo cancellation on the voice signal using the echo signal to obtain the echo-cancelled target signal.

After the echo signal is obtained, echo cancellation can be performed against the voice signal: for example, the echo signal is converted to the time domain by the inverse Fourier transform, and the time-domain echo signal is subtracted from the voice signal; the result is the echo-cancelled target signal. The echo may not be completely removed in this process, so further processing, for example based on a residual-echo post-filtering algorithm, can be applied to eliminate the echo further.

Here, the voice signal of the fourth length corresponding to the target far-end signals is obtained. Performing echo cancellation on the voice signal using the echo signal to obtain the echo-cancelled target signal comprises: subtracting the echo signal from the voice signal of the fourth length to obtain the echo-cancelled target signal. Because this embodiment performs echo cancellation periodically, the collected voice signal can be segmented at the corresponding period boundaries; for example, the voice signal of the fourth length corresponding to the target far-end signals can be obtained from timing information, and the time-domain echo signal subtracted from it to obtain the echo-cancelled target signal.

In this embodiment, after the repeated target far-end signals and their second echo information are computed for the first time, they can be stored and assigned a frame number; each time the repeated target far-end signals and their second echo information are subsequently fetched, the corresponding frame number can be updated to facilitate the next fetch.

In one example, let the block length be N, the frame shift be M and the frame length be L. The number of blocks in one frame of speech is then k = L/N, and k is usually a power of 2. During periodic echo cancellation:

At time t, the M new far-end signal points brought in by the frame shift give the far-end signal of the first length M: x(0), x(1), ......, x(M-1)

Suppose the second length equals the frame length L; the third length is then (L-M). Splicing with the preceding (L-M) points gives the far-end signal of the current frame, that is, the far-end signal of length L:

x(M-L),x(M-L+1),......,x(M-1)

The preset number is k, the number of blocks in one frame of speech, and the fourth length is 2*N, that is, twice the block length. The k target far-end signals of length 2*N are:

x(M-2*N),x(M-2*N+1),......,x(M-1)

x(M-3*N),x(M-3*N+1),......,x(M-N-1)

……

x(M-(k+1)*N),x(M-(k+1)*N+1),......,x(M-(k-1)*N-1)

Since k=L/N, the last target far-end signal of length 2*N can also be written as:

x(M-L-N),x(M-L-N+1),......,x(M-L+N-1)
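
The index pattern above amounts to cutting k overlapping windows of length 2*N, each stepping back by N samples. The sketch below shows that slicing on a plain list. The helper name `target_blocks` and the sizes N = 2, k = 3 are assumptions for illustration, and the last element of `history` stands in for x(M-1).

```python
def target_blocks(history, block_len, num_blocks):
    """Cut k overlapping blocks of length 2*N from the far-end history.

    history[-1] plays the role of x(M-1). Block j (j = 0..k-1) covers
    x(M-(j+2)*N) .. x(M-j*N-1), so neighbouring blocks overlap by N samples.
    """
    end = len(history)
    if end < (num_blocks + 1) * block_len:
        raise ValueError("history must hold at least (k + 1) * N samples")
    blocks = []
    for j in range(num_blocks):
        stop = end - j * block_len
        blocks.append(list(history[stop - 2 * block_len:stop]))
    return blocks


# Hypothetical sizes: N = 2, L = 6, so k = L / N = 3.
history = list(range(10))
blocks = target_blocks(history, block_len=2, num_blocks=3)
```

Note that the k blocks together reach back (k+1)*N = L+N samples, which is why the oldest index in the text is x(M-L-N) rather than x(M-L).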

On the same principle, the far-end signal of frame i-b is:

x(M-b*M-L),x(M-b*M-L+1),……,x(M-b*M-1)

Since b=N/M, the far-end signal of frame i-b can also be written as:

x(M-N-L),x(M-N-L+1),……,x(M-N-1)

For frame i-b, the corresponding k target far-end signals of length 2*N are:

x(M-3*N),x(M-3*N+1),......,x(M-N-1)

x(M-4*N),x(M-4*N+1),......,x(M-2*N-1)

……

x(M-(k+2)*N),x(M-(k+2)*N+1),......,x(M-k*N-1)

Comparing the k target far-end signals of length 2*N for frame i with the k target far-end signals of length 2*N for frame i-b shows that the last k-1 target far-end signals of frame i are identical to the first k-1 target far-end signals of frame i-b.
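
This overlap can be checked numerically. The following sketch uses a stand-in sample stream and hypothetical sizes (N = 4, M = 2, k = 4); the helper `blocks_ending_at` is an illustrative name, not something defined in the source.

```python
def blocks_ending_at(stream, end, block_len, num_blocks):
    """Return the k blocks of length 2*N whose newest sample is stream[end - 1]."""
    return [
        stream[end - (j + 2) * block_len:end - j * block_len]
        for j in range(num_blocks)
    ]


# Hypothetical sizes: N = 4, M = 2, k = 4, so b = N / M = 2.
N, M, k = 4, 2, 4
b = N // M
stream = list(range(100))  # stand-in far-end samples
end_i = 60                 # frame i ends just before index 60
frame_i = blocks_ending_at(stream, end_i, N, k)
frame_i_minus_b = blocks_ending_at(stream, end_i - b * M, N, k)

# Last k-1 blocks of frame i against the first k-1 blocks of frame i-b.
shared = frame_i[1:] == frame_i_minus_b[:-1]
```

Because frame i-b ends exactly b*M = N samples earlier, every block of frame i except the newest one lines up with a block that was already formed for frame i-b.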

Therefore, when echo cancellation is performed for frame i, the FFT results already computed for the far-end signal of frame i-b can be read from a cache, which avoids k-1 of the FFT computations for frame i and effectively reduces the amount of computation.

Accordingly, echo cancellation for each frame uses the far-end FFT results from b frames earlier. The cached results of the previous b frames are then updated in real time (that is, the frame numbers are updated) for use in the echo cancellation of the next frame. This repeats every period and reduces the amount of computation.

That is, for the k target far-end signals of length 2*N of frame i, an FFT is applied to the first target far-end signal of frame i to obtain the first far-end signal in the frequency domain, while for the remaining k-1 target far-end signals of length 2*N the stored second far-end signals in the frequency domain are retrieved. Together these give the frequency-domain target far-end signals, which are multiplied by the spatial impulse response to obtain the echo signal to be cancelled.
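
One way to picture the caching described above is a small store keyed by frame number. The sketch below is an assumption-laden illustration: the class `BlockSpectrumCache`, its `transforms` counter and the naive `dft` (used here in place of a real FFT so the example needs only the standard library) are all hypothetical and not taken from the source.

```python
import cmath


def dft(block):
    """Naive O(n^2) discrete Fourier transform, standing in for an FFT."""
    n = len(block)
    return [
        sum(block[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]


class BlockSpectrumCache:
    """Reuse block spectra from frame i - b when processing frame i."""

    def __init__(self, block_len, num_blocks, frames_back):
        self.block_len = block_len      # N
        self.num_blocks = num_blocks    # k
        self.frames_back = frames_back  # b = N / M
        self.store = {}                 # frame number -> list of k spectra
        self.transforms = 0             # number of transforms actually computed

    def spectra_for(self, frame_no, stream, end):
        """Return the k block spectra of the frame whose newest sample is stream[end - 1]."""
        n, k = self.block_len, self.num_blocks

        def block(j):
            return stream[end - (j + 2) * n:end - j * n]

        cached = self.store.pop(frame_no - self.frames_back, None)
        if cached is None:
            # Start-up: nothing stored for frame i - b yet, transform all k blocks.
            spectra = [dft(block(j)) for j in range(k)]
            self.transforms += k
        else:
            # Steady state: transform only the newest block, reuse k - 1 spectra.
            spectra = [dft(block(0))] + cached[:k - 1]
            self.transforms += 1
        self.store[frame_no] = spectra  # stored under the updated frame number
        return spectra


# Hypothetical sizes: N = 4, M = 2, k = 2, so b = N / M = 2.
stream = [float((7 * t) % 5) for t in range(40)]
cache = BlockSpectrumCache(block_len=4, num_blocks=2, frames_back=2)
results = [cache.spectra_for(i, stream, end=12 + 2 * i) for i in range(6)]
```

In this run the first b frames each cost k transforms and every later frame costs one, while the store never holds more than b frames of spectra.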

In the embodiment of the present invention, the voice signal of length 2*N at time t is also taken from the collected voice signal, namely:

y(M-2*N),y(M-2*N+1),......,y(M-1)

On this basis, the frequency-domain echo signal to be cancelled is inverse-transformed into a time-domain echo signal, and adaptive cancellation against the voice signal yields the estimated echo-cancelled signal d(n).
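
The frequency-domain multiplication, inverse transform and subtraction can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, a naive DFT stands in for the FFT, the weight spectra are taken as given (the adaptive update is omitted), and the subtraction is done sample by sample over the whole block because the text does not say which output samples are retained.

```python
import cmath


def dft(seq, inverse=False):
    """Naive O(n^2) DFT or inverse DFT, standing in for an FFT/IFFT pair."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = [
        sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    return [v / n for v in out] if inverse else out


def estimate_echo(block_spectra, weight_spectra):
    """Sum X_j(f) * W_j(f) over the k blocks and return the time-domain result."""
    n = len(block_spectra[0])
    acc = [0j] * n
    for spectrum, weights in zip(block_spectra, weight_spectra):
        for f in range(n):
            acc[f] += spectrum[f] * weights[f]
    return [v.real for v in dft(acc, inverse=True)]


def cancel(voice, echo_estimate):
    """Subtract the time-domain echo estimate from the voice segment."""
    return [y - d for y, d in zip(voice, echo_estimate)]


# Hypothetical case: one block (k = 1) and an impulse response that passes
# the far-end block through unchanged, so the echo estimate equals the block.
far_end = [1.0, 2.0, 3.0, 4.0]
weights = dft([1.0, 0.0, 0.0, 0.0])
echo = estimate_echo([dft(far_end)], [weights])
voice = [1.5, 2.5, 3.5, 4.5]   # far-end echo plus a constant 0.5 of near-end signal
residual = cancel(voice, echo)
```

With these stand-in values the residual is the constant near-end component that was added on top of the echo.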

The above is only an example. In actual processing, each length (the first, second, third and fourth lengths) need not equal the corresponding parameter and can be set as required, for example by setting the length for each parameter according to a certain ratio.

Based on the above processing, with b=N/M, setting b greater than or equal to 2 makes the frame shift smaller than the block length, so the delay of the above PBFDAF algorithm with variable frame shift length can be shortened without affecting echo cancellation. The above process also preserves echo cancellation performance while keeping the FFT length of the signal unchanged, and the partially repeated target far-end signals effectively reduce the amount of computation and improve processing efficiency.

In addition, suppose the computation of PBFDAF for every N samples is C. Since N=b*M, the computation of the above process is b*C. If b is greater than or equal to 2, the shorter frame shift also speeds up the convergence of the adaptive filtering algorithm.

Thus, with reasonably chosen parameters, the above block adaptive algorithm based on a variable frame shift length preserves echo cancellation performance without adding excessive computation, and also speeds up the convergence of the algorithm and shortens the delay of the AEC algorithm.
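
The trade-off between delay and computation can be made concrete with a small calculation. The sketch below assumes that the buffering delay is one frame shift of samples and that each filter update costs the same amount C. The 16 kHz sample rate and the sizes N = 256, M = 64 are illustrative values that do not come from the source.

```python
def frame_shift_tradeoff(block_len, frame_shift, sample_rate_hz, cost_per_update=1.0):
    """Relate the frame shift M to buffering delay and per-block computation.

    Assumes one filter update per frame shift, so every N samples see
    b = N / M updates at cost C each, for a total of b * C.
    """
    if block_len % frame_shift != 0:
        raise ValueError("block length must be a multiple of the frame shift")
    b = block_len // frame_shift
    return {
        "b": b,
        "buffer_delay_ms": 1000.0 * frame_shift / sample_rate_hz,
        "cost_per_block": b * cost_per_update,
    }


# Hypothetical sizes: N = 256, M = 64 at an assumed 16 kHz sample rate.
variable_shift = frame_shift_tradeoff(256, 64, 16000)
# Conventional setting for comparison: frame shift equal to the block length.
fixed_shift = frame_shift_tradeoff(256, 256, 16000)
```

Under these assumptions, shortening the frame shift by a factor of b shortens the buffering delay by the same factor and multiplies the number of updates per N samples by b, which is the cost that the cached far-end spectra help to offset.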

It should be noted that, for simplicity of description, the method embodiments are described as a series of action combinations, but those skilled in the art should understand that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.

On the basis of the above embodiments, an embodiment of the present invention further provides an audio data processing device, which can be applied to terminal devices such as mobile phones and tablet computers.

Referring to Fig. 3, a structural block diagram of an embodiment of an audio data processing device of the present application is shown, which may specifically include the following modules:

A signal acquisition module 302, configured to collect a voice signal and determine a far-end signal according to the frame shift, wherein the frame shift is not equal to the block length.

A signal processing module 304, configured to determine a preset number of target far-end signals according to the far-end signal, wherein some of the target far-end signals are identical to some of the target far-end signals of a previous set frame, the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length.

An echo cancellation module 306, configured to perform echo cancellation according to the voice signal and the target far-end signals to obtain the echo-cancelled signal.

Referring to Fig. 4, a structural block diagram of another embodiment of an audio data processing device of the present application is shown, which may specifically include the following modules:

The signal acquisition module 302 comprises:

A voice collection submodule 3022, configured to collect a voice signal.

A far-end acquisition submodule 3024, configured to determine a far-end signal of a first length according to the frame shift, wherein the first length is related to the frame shift.

A splicing submodule 3026, configured to splice the far-end signal according to the frame length and the frame shift to obtain a far-end signal of a second length, wherein the second length is related to the frame length.

The splicing submodule 3026 is configured to determine, according to the frame length, a far-end signal of a third length that precedes the far-end signal of the first length, the third length being the difference between the second length and the first length; and to splice the far-end signal of the third length with the far-end signal of the first length to obtain the far-end signal of the second length.

The signal processing module 304 comprises:

A target determination submodule 3042, configured to determine a first number of target far-end signals of a fourth length according to the far-end signal of the second length.

A cache acquisition submodule 3044, configured to obtain a second number of target far-end signals of the fourth length stored for the previous set frame, wherein the sum of the first number and the second number is the preset number, and the fourth length is related to the block length.

The echo cancellation module 306 comprises:

An echo determination submodule 3062, configured to determine the corresponding echo signal to be cancelled according to the preset number of target far-end signals of the fourth length.

A cancellation submodule 3064, configured to subtract the echo signal to be cancelled from the voice signal to obtain the echo-cancelled signal.

The echo determination submodule 3062 is configured to process the first number of target far-end signals of the fourth length to obtain first far-end signals in the frequency domain; to obtain the second far-end signals in the frequency domain corresponding to the second number of target far-end signals of the fourth length; and to process the first and second far-end signals in the frequency domain with the spatial impulse response to obtain the echo signal to be cancelled.

The echo cancellation module 306 is further configured to update the frame number corresponding to the second number of target far-end signals of the fourth length.

Based on the above processing, with b=N/M, setting b greater than or equal to 2 makes the frame shift smaller than the block length, so the delay of the above PBFDAF algorithm with variable frame shift length can be shortened without affecting echo cancellation. The above process also preserves echo cancellation performance while keeping the FFT length of the signal unchanged.

In addition, suppose the computation of PBFDAF for every N samples is C. Since N=b*M, the computation of the above process is b*C. If b is greater than or equal to 2, the shorter frame shift also speeds up the convergence of the adaptive filtering algorithm.

When echo cancellation is performed for frame i, the FFT results already computed for the far-end signal of frame i-b can be read from a cache, which avoids k-1 of the FFT computations for frame i and effectively reduces the amount of computation. Accordingly, echo cancellation for each frame uses the far-end FFT results from b frames earlier. The cached results of the previous b frames are then updated in real time (that is, the frame numbers are updated) for use in the echo cancellation of the next frame. This repeats every period and reduces the amount of computation.

Since the device embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the description of the method embodiment.

Fig. 5 is a structural block diagram of an electronic device 500 for audio data processing according to an exemplary embodiment. For example, the electronic device 500 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and so on; it may also be a server-side device, such as a server.

Referring to Fig. 5, the electronic device 500 may include one or more of the following components: a processing component 502, a memory 504, a power component 506, a multimedia component 508, an audio component 510, an input/output (I/O) interface 512, a sensor component 514, and a communication component 516.

The processing component 502 typically controls the overall operation of the electronic device 500, such as operations associated with display, telephone calls, data communication, camera operation and recording. The processing component 502 may include one or more processors 520 to execute instructions so as to perform all or part of the steps of the above method. In addition, the processing component 502 may include one or more modules that facilitate interaction between the processing component 502 and other components. For example, the processing component 502 may include a multimedia module to facilitate interaction between the multimedia component 508 and the processing component 502.

The memory 504 is configured to store various types of data to support operation of the device 500. Examples of such data include instructions for any application or method operating on the electronic device 500, contact data, phone book data, messages, pictures, videos, and so on. The memory 504 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk.

The power component 506 provides power to the various components of the electronic device 500. The power component 506 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the electronic device 500.

The multimedia component 508 includes a screen that provides an output interface between the electronic device 500 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 508 includes a front camera and/or a rear camera. When the electronic device 500 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.

The audio component 510 is configured to output and/or input audio signals. For example, the audio component 510 includes a microphone (MIC), which is configured to receive external audio signals when the electronic device 500 is in an operation mode, such as a call mode, a recording mode or a voice recognition mode. The received audio signal may be further stored in the memory 504 or sent via the communication component 516. In some embodiments, the audio component 510 further includes a loudspeaker for outputting audio signals.

The I/O interface 512 provides an interface between the processing component 502 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and so on. These buttons may include, but are not limited to, a home button, a volume button, a start button and a lock button.

The sensor component 514 includes one or more sensors for providing status assessments of various aspects of the electronic device 500. For example, the sensor component 514 can detect the open/closed state of the device 500 and the relative positioning of components, such as the display and keypad of the electronic device 500. The sensor component 514 can also detect a change in position of the electronic device 500 or of one of its components, the presence or absence of user contact with the electronic device 500, the orientation or acceleration/deceleration of the electronic device 500, and temperature changes of the electronic device 500. The sensor component 514 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 514 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 514 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.

The communication component 516 is configured to facilitate wired or wireless communication between the electronic device 500 and other devices. The electronic device 500 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one exemplary embodiment, the communication component 516 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 516 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.

In an exemplary embodiment, the electronic device 500 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for performing the above method.

In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 504 including instructions, which can be executed by the processor 520 of the electronic device 500 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and so on.

A non-transitory computer-readable storage medium, wherein, when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform an audio data processing method, the method comprising: collecting a voice signal, and determining a far-end signal according to the frame shift, wherein the frame shift is not equal to the block length; determining a preset number of target far-end signals according to the far-end signal, wherein some of the target far-end signals are identical to some of the target far-end signals of a previous set frame, the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length; and performing echo cancellation according to the voice signal and the target far-end signals to obtain the echo-cancelled signal.

Fig. 6 is a structural schematic diagram of an electronic device 600 for audio data processing according to another exemplary embodiment of the present application. The electronic device 600 may be a server, which may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 622 (for example, one or more processors), a memory 632, and one or more storage media 630 (for example, one or more mass storage devices) storing application programs 642 or data 644. The memory 632 and the storage medium 630 may provide transient or persistent storage. The program stored in the storage medium 630 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 622 may be configured to communicate with the storage medium 630 and execute, on the server, the series of instruction operations in the storage medium 630.

The server may also include one or more power supplies 626, one or more wired or wireless network interfaces 650, one or more input/output interfaces 658, one or more keyboards 656, and/or one or more operating systems 641, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.

In an exemplary embodiment, the server is configured such that one or more central processing units 622 execute one or more programs including instructions for performing the following operations: collecting a voice signal, and determining a far-end signal according to the frame shift, wherein the frame shift is not equal to the block length; determining a preset number of target far-end signals according to the far-end signal, wherein some of the target far-end signals are identical to some of the target far-end signals of a previous set frame, the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length; and performing echo cancellation according to the voice signal and the target far-end signals to obtain the echo-cancelled signal.

The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be referred to one another.

The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems) and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, such that a series of operation steps are performed on the computer or other programmable terminal device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn the basic inventive concept. The appended claims are therefore intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.

Finally, it should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or terminal device that includes the element.

The audio data processing method, audio data processing device, electronic device and storage medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, those of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims

1. An audio data processing method, characterized by comprising:

collecting a voice signal, and determining a far-end signal according to the frame shift, wherein the frame shift is not equal to the block length;

determining a preset number of target far-end signals according to the far-end signal, wherein some of the target far-end signals are identical to some of the target far-end signals of a previous set frame, the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length;

performing echo cancellation according to the voice signal and the target far-end signals to obtain the echo-cancelled signal.

2. The method according to claim 1, characterized in that determining the far-end signal according to the frame shift comprises:

determining a far-end signal of a first length according to the frame shift, wherein the first length is related to the frame shift;

splicing the far-end signal according to the frame length and the frame shift to obtain a far-end signal of a second length, wherein the second length is related to the frame length.

3. The method according to claim 2, characterized in that splicing the far-end signal according to the frame length and the frame shift to obtain the far-end signal of the second length comprises:

determining, according to the frame length, a far-end signal of a third length that precedes the far-end signal of the first length, the third length being the difference between the second length and the first length;

splicing the far-end signal of the third length with the far-end signal of the first length to obtain the far-end signal of the second length.

4. The method according to claim 2, characterized in that determining the preset number of target far-end signals according to the far-end signal comprises:

determining a first number of target far-end signals of a fourth length according to the far-end signal of the second length;

obtaining a second number of target far-end signals of the fourth length stored for the previous set frame, wherein the sum of the first number and the second number is the preset number, and the fourth length is related to the block length.

5. The method according to claim 4, characterized in that performing echo cancellation according to the voice signal and the target far-end signals to obtain the echo-cancelled signal comprises:

determining the corresponding echo signal to be cancelled according to the preset number of target far-end signals of the fourth length;

subtracting the echo signal to be cancelled from the voice signal to obtain the echo-cancelled signal.

6. The method according to claim 5, characterized in that determining the corresponding echo signal to be cancelled according to the preset number of target far-end signals of the fourth length comprises:

processing the first number of target far-end signals of the fourth length to obtain first far-end signals in the frequency domain;

obtaining the second far-end signals in the frequency domain corresponding to the second number of target far-end signals of the fourth length;

processing the first far-end signals and the second far-end signals in the frequency domain with the spatial impulse response to obtain the echo signal to be cancelled.

7. The method according to claim 4, characterized by further comprising:

updating the frame number corresponding to the second number of target far-end signals of the fourth length.

8. An audio data processing device, characterized by comprising:

a signal acquisition module, configured to collect a voice signal and determine a far-end signal according to the frame shift, wherein the frame shift is not equal to the block length;

a signal processing module, configured to determine a preset number of target far-end signals according to the far-end signal, wherein some of the target far-end signals are identical to some of the target far-end signals of a previous set frame, the preset number is related to the frame length and the block length, and the set frame is related to the frame shift and the block length;

an echo cancellation module, configured to perform echo cancellation according to the voice signal and the target far-end signals to obtain the echo-cancelled signal.

9. An electronic device, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:

10. A readable storage medium, characterized in that, when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the audio data processing method according to one or more of claims 1-7.