CN111970610B - Echo path detection method, audio signal processing method and system, storage medium, and terminal

Info

Publication number: CN111970610B
Application number: CN202010873815.6A
Authority: CN
Inventors: 雍雅琴; 潘思伟; 罗本彪; 董斐; 林福辉
Original assignee: Spreadtrum Communications Shanghai Co Ltd
Current assignee: Spreadtrum Communications Shanghai Co Ltd
Priority date: 2020-08-26
Filing date: 2020-08-26
Publication date: 2022-05-20
Anticipated expiration: 2040-08-26
Also published as: CN111970610A

Abstract

An echo path detection method, an audio signal processing method and system, a storage medium, and a terminal are provided. The echo path detection method comprises: receiving an error signal, wherein the error signal is obtained by performing an adaptive echo cancellation operation on an audio signal and comprises a residual echo; and detecting whether the echo path has changed according to the residual amount of the residual echo in the error signal. The scheme can detect echo path changes in real time and accurately distinguish double-talk from an echo path change, with high detection accuracy. It thereby balances double-talk performance against echo suppression after an echo path change, ensuring a good call experience for the far-end listener.

Description

Echo path detection method, audio signal processing method and system, storage medium, and terminal

Technical Field

The present invention relates to the field of voice communication technologies, and in particular, to an echo path detection method, an audio signal processing method and system, a storage medium, and a terminal.

Background

In audio systems, acoustic echo arises from coupling between the loudspeaker and the microphone, so the microphone signal contains not only the useful uplink speech signal but also the echo. If the microphone signal is not processed, the echo is transmitted to the far end and played by the far-end loudspeaker. The far-end party then hears a delayed copy of their own voice, which is uncomfortable and interferes with the uplink speech signal, degrading call quality.

With the rapid development of scientific technology, communication modes and application scenes are increasingly diversified, and communication terminals are increasingly miniaturized and portable, so that the coupling between a loudspeaker and a microphone is stronger and the echo channel is more and more complex and changeable, which brings great challenges to acoustic echo cancellation in voice communication.

To ensure good speech quality, it is common practice to use an Adaptive Echo Canceller (AEC) together with an Echo Suppressor (ES) to cancel the echo. The basic principle of AEC is to use a filter to adaptively estimate the echo propagation path, and hence the echo signal received by the microphone, and to subtract the estimated echo from the microphone signal. In the hands-free state, the adaptive filter can remove the linear echo and part of the nonlinear echo, about 20 dB in total; the residual echo must then be removed by echo suppression so that the echo is finally eliminated completely.

The sound emitted by the loudspeaker travels through air or another propagation medium along a path, defined as the echo path, and finally reaches the microphone, where it is picked up. When the loudspeaker, the microphone, or the propagation path is obstructed, for example when a hand covers the loudspeaker or a face approaches a telephone watch, the propagation path of the acoustic echo changes. The obstruction, the loudspeaker, and the microphone then form a new echo path, and the echo received by the microphone changes: in general, the nonlinear component of the echo increases and the overall amplitude of the echo becomes larger. The AEC and ES parameters originally used by the voice call device cannot cancel the increased echo, leaving a large residual echo.

To eliminate the large echo that follows an echo path change, the prior art generally sets strong AEC and ES parameters in advance, so that the echo can still be eliminated completely after the echo path changes. However, while the echo path is unchanged, these strong echo suppression parameters mean that when uplink speech and echo are present at the same time, that is, during double-talk (DT), the uplink speech is removed together with the echo. In other words, double-talk performance is poor.

Alternatively, normal AEC and ES parameters could be used while the echo path is unchanged, to keep double-talk continuous, and adjusted when the echo path changes so that the echo after the change is cancelled. This requires an Echo-path Change Detector (ECD).

However, existing echo path detection mechanisms have many shortcomings: their detection accuracy is low, and they struggle to balance double-talk performance against echo suppression after an echo path change during hands-free calls.

Disclosure of Invention

The invention solves the technical problem of how to detect the change of an echo path in real time and accurately distinguish the change from double-talk.

To solve the foregoing technical problem, an embodiment of the present invention provides an echo path detection method, including: receiving an error signal, wherein the error signal is obtained by performing an adaptive echo cancellation operation on an audio signal, and the error signal comprises a residual echo; and detecting whether the echo path changes or not according to the residual quantity of the residual echo in the error signal.

Optionally, the residual amount is based on a mean square error of the error signal, or the residual amount is based on an energy or amplitude characterization of a residual echo in the error signal.

Optionally, the detecting whether the echo path changes according to the residual amount of the residual echo in the error signal includes: comparing the residual quantity with a preset threshold value; if the residual quantity is larger than the preset threshold value, detecting that the echo path changes; and if the residual quantity is smaller than the preset threshold value, determining that the echo path is not changed.

To solve the foregoing technical problem, an embodiment of the present invention further provides an audio signal processing method, including: receiving the audio signal, wherein the audio signal comprises an uplink voice signal and an echo signal; performing an adaptive echo cancellation operation on the audio signal to obtain an error signal, wherein the error signal comprises a residual echo; detecting whether the echo path changes according to the residual quantity of the residual echo in the error signal; generating a control signal when the echo path is detected to be changed, wherein the control signal is used for adjusting a first parameter used when the echo cancellation operation is executed.

Optionally, the control signal is further configured to adjust a second parameter used when performing an echo suppression operation, where the echo suppression operation is configured to eliminate a residual echo in the error signal.

Optionally, the error signal is a difference between the audio signal and an estimated echo signal, where the estimated echo signal is estimated according to the first parameter and the downlink speech signal.

Optionally, before detecting whether the echo path is changed according to a residual amount of residual echo in the error signal, the audio signal processing method further includes: judging whether a downlink voice signal is received or not; and when the judgment result shows that the downlink voice signal is received, detecting whether the echo path is changed or not according to the residual quantity of the residual echo in the error signal.

Optionally, the audio signal is acquired by a microphone.

To solve the above technical problem, an embodiment of the present invention further provides an audio signal processing system, including: an acquisition module, configured to acquire an audio signal, wherein the audio signal comprises an uplink voice signal and an echo signal; an adaptive echo cancellation module, coupled to the acquisition module, which performs an adaptive echo cancellation operation on the audio signal according to a first parameter to obtain an error signal, wherein the error signal comprises a residual echo; and an echo path detection module, coupled to the adaptive echo cancellation module, which detects whether the echo path has changed according to the residual amount of the residual echo in the error signal; when detecting that the echo path has changed, the echo path detection module generates a control signal to adjust the first parameter.

Optionally, the audio signal processing system further includes: an echo suppression module, coupled to the adaptive echo cancellation module, which cancels the residual echo in the error signal according to a second parameter.

Optionally, the control signal is further configured to adjust the second parameter.

Optionally, the adaptive echo cancellation module includes: an adaptive filter, which receives a downlink voice signal and estimates an estimated echo signal according to the first parameter and the downlink voice signal; and a computing unit, coupled to the adaptive filter and the acquisition module, which computes the error signal from the audio signal and the estimated echo signal.

Optionally, the echo path detecting module includes: the comparison unit is used for comparing the residual quantity with a preset threshold value; the first detection unit detects that the echo path changes if the residual quantity is greater than the preset threshold value; and the second detection unit is used for determining that the echo path is not changed if the residual quantity is smaller than the preset threshold value.

Optionally, the audio signal processing system further includes: a receiving module, configured to receive the downlink voice signal.

Optionally, the echo path detecting module is coupled to the receiving module, and when it is determined that the receiving module receives the downlink voice signal, the echo path detecting module detects whether the echo path changes.

To solve the above technical problem, an embodiment of the present invention further provides a storage medium, on which a computer program is stored, and the computer program executes the steps of the above method when being executed by a processor.

In order to solve the above technical problem, an embodiment of the present invention further provides a terminal, including: the audio acquisition equipment is used for acquiring audio signals; the audio signal processing system is coupled with the audio acquisition equipment so as to acquire and process the audio signal through the audio acquisition equipment.

Optionally, the audio capture device comprises a microphone.

Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:

the embodiment of the invention provides an echo path detection method, which comprises the following steps: receiving an error signal, wherein the error signal is obtained by performing an adaptive echo cancellation operation on an audio signal, and the error signal comprises a residual echo; and detecting whether the echo path changes or not according to the residual quantity of the residual echo in the error signal.

Compared with existing echo path detection methods based on cross-correlation, the scheme of this embodiment detects echo path changes by measuring the residual amount of the residual echo in the error signal. Because the residual amount during double-talk differs markedly from the residual amount after an echo path change, double-talk is not mistaken for an echo path change, so the two can be distinguished accurately and the detection accuracy is high. In addition, the residual amount directly reflects the size of the echo remaining in the audio signal, so this embodiment can detect an echo path change directly from the error signal, with little computation and a timely response.

Further, an embodiment of the present invention further provides an audio signal processing method, including: receiving the audio signal, wherein the audio signal comprises an uplink voice signal and an echo signal; performing an adaptive echo cancellation operation on the audio signal to obtain an error signal, wherein the error signal comprises a residual echo; detecting whether the echo path changes according to the residual quantity of the residual echo in the error signal; generating a control signal when the echo path is detected to be changed, wherein the control signal is used for adjusting a first parameter used when the echo cancellation operation is executed.

This embodiment can balance double-talk performance against echo suppression after an echo path change, ensuring a good call experience for the far-end listener. Specifically, echo path changes are detected by measuring the residual amount of the residual echo in the error signal, and the echo suppression parameters are adjusted according to the detection result. By tracking the current echo path state in real time, the echo is cancelled cleanly and double-talk remains continuous under the normal echo path, while the larger echo that follows an echo path change is still cancelled effectively.

Further, judging whether a downlink voice signal is received or not; and when the judgment result shows that the downlink voice signal is received, detecting whether the echo path is changed or not according to the residual quantity of the residual echo in the error signal.

Since the echo is generated only when the downlink voice signal exists, the present embodiment determines whether to apply the echo path detection algorithm according to whether the downlink voice signal is received.

Drawings

FIG. 1 is a flow chart of an echo path detection method according to an embodiment of the present invention;

FIG. 2 is a flow chart of an audio signal processing method according to an embodiment of the invention;

FIG. 3 is a schematic diagram of an audio signal processing system according to an embodiment of the invention;

FIG. 4 is a schematic illustration of an unprocessed audio signal;

FIG. 5 is a schematic diagram of a downstream speech signal as an echo suppression reference signal;

FIG. 6 is a schematic illustration of the processing results for the audio signal shown in FIG. 4 without echo path detection;

FIG. 7 is a schematic illustration of the result of a process for enhanced suppression of the audio signal of FIG. 4 without echo path detection;

FIG. 8 is a diagram illustrating the detection result of the audio signal of FIG. 4 by the echo path detection method of FIG. 1;

FIG. 9 is a schematic diagram of a processing result of the audio signal shown in FIG. 4 by using the audio signal processing method shown in FIG. 2.

Detailed Description

As noted in the Background, echo path detection is needed to provide good speech quality. However, existing echo path detection algorithms mainly rely on the cross-correlation between the microphone signal and the AEC signal, and the cross-correlation between the downlink reference signal and the AEC signal. Such methods generally have difficulty distinguishing double-talk from an echo path change, which reduces detection accuracy and makes it hard to balance double-talk performance against echo suppression after an echo path change during hands-free calls.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.

This embodiment is applicable to scenarios in which the echo path changes under specific conditions and the echo becomes larger, such as mobile communication terminals, wearable devices, smart speakers, and smart home devices. For example, such conditions may include a user holding a mobile phone during a hands-free call and covering the speaker, or a face or palm coming close to a telephone watch or smart watch.

By adopting the embodiment, when the echo path is changed due to the fact that the loudspeaker is shielded in the conversation process of the voice communication terminal such as a hands-free mode, continuous double-talk and better echo cancellation performance can be ensured at the same time.

Taking a mobile terminal as an example, when the mobile terminal such as a mobile phone is in a hands-free mode, the sound of the mobile terminal is played by a speaker located at the lower left corner of the back surface of the mobile terminal, and a microphone is located at the right side of the bottom of the mobile terminal. When the user's hand is located at the middle upper portion of the mobile terminal, the echo path between the speaker and the microphone may be considered as a normal echo path. That is, the normal echo path refers to an echo path when there is no object occlusion between the microphone and the speaker. When the hand of the user is located at the lower part of the mobile terminal, the palm of the user is close to the speaker and the microphone, and at the moment, the palm, the speaker and the microphone form a new echo path, so that the original normal echo path is changed. At this time, the echo received by the microphone is large and is difficult to eliminate.

Taking wearable devices as an example, the speaker of a wearable device such as a smart watch is usually located below the device, and the microphone is located above and to the right of the device. When a human face or a palm approaches to the wearable device, the human face or the palm, the loudspeaker and the microphone form a new echo path together, and the original normal echo path is changed. At this time, the echo received by the microphone is large and is difficult to eliminate.

By adopting the embodiment, whether the echo path is changed can be accurately and timely detected. Further, if the echo path is changed, the echo suppression parameters are adjusted in time, so that the echo after the path change is eliminated completely, and good conversation experience of a far-end listener is ensured.

Fig. 1 is a flowchart of an echo path detection method according to an embodiment of the present invention.

Specifically, referring to fig. 1, the echo path detection method according to this embodiment may include the following steps:

step S101, receiving an error signal, wherein the error signal is obtained by performing adaptive echo cancellation operation on an audio signal, and the error signal comprises a residual echo;

and step S102, detecting whether the echo path changes according to the residual quantity of the residual echo in the error signal.

In one implementation, the residual amount may be based on a mean square error characterization of the error signal. Alternatively, the residual amount may also be characterized based on the energy or amplitude of the residual echo in the error signal.

For example, the error signal is the difference between the audio signal and the estimated echo signal, and the mean square error is the expectation of the squared error signal. The audio signal includes an uplink speech signal and an echo signal (which may be understood as the true echo signal), and the estimated echo signal is the echo signal estimated by the adaptive filter from the downlink speech signal and the first parameter. The mean square error therefore measures the degree of difference between the estimate (the estimated echo signal) and the quantity being estimated (the true echo signal), that is, the residual amount of the residual echo.

Also for example, the energy or amplitude of the residual echo in the error signal may also measure the residual amount.

In one implementation, an exponential recursive weighting algorithm may be employed to characterize the mean square error.
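As an illustration of exponential recursive weighting, the following minimal Python sketch keeps a running estimate of the mean square error. The function names, the default weight `lam = 0.9`, and the particular recursion (previous estimate weighted by `lam`, new squared sample weighted by `1 - lam`) are assumptions made for this example; it is one common form of such a recursion, not necessarily the exact one used in the embodiment.

```python
def update_mse(prev_mse, error_sample, lam=0.9):
    """One step of an exponentially weighted estimate of E[e^2(n)].

    lam is an assumed weight in [0, 1): values closer to 1 give a
    smoother estimate that reacts more slowly to changes.
    """
    return lam * prev_mse + (1.0 - lam) * error_sample * error_sample


def running_mse(errors, lam=0.9, initial=0.0):
    """Return the recursive estimate after each error sample."""
    estimates = []
    mse = initial
    for e in errors:
        mse = update_mse(mse, e, lam)
        estimates.append(mse)
    return estimates
```

Only the previous estimate and the current sample are needed at each step, so the residual amount can be tracked sample by sample without storing a window of past errors.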

In one implementation, the step S102 may include the steps of: comparing the residual quantity with a preset threshold value; if the residual quantity is larger than the preset threshold value, detecting that the echo path changes; and if the residual quantity is smaller than the preset threshold value, determining that the echo path is not changed.

Specifically, the preset threshold may be a mean square error threshold of the mean square error. For example, the mean square error threshold may be an empirical value obtained from a previous test and stored in the mobile terminal.

The echo path detection method shown in fig. 1 can be applied to an audio signal processing scene, and the echo suppression parameters are adjusted in time by detecting the state of the echo path in real time during a voice call, so that on the basis of accurately distinguishing double-talk and echo path change, echoes can be completely eliminated before and after the echo path change, the double-talk effect cannot be influenced, and good call experience of a far-end listener is ensured.

Specifically, referring to fig. 2, the audio signal processing method according to this embodiment may include the following steps:

step S201, receiving the audio signal, wherein the audio signal comprises an uplink voice signal and an echo signal;

step S202, performing adaptive echo cancellation operation on the audio signal to obtain an error signal, wherein the error signal contains residual echo;

step S203, detecting whether an echo path changes according to the residual quantity of the residual echo in the error signal;

step S204, when detecting that the echo path changes, generating a control signal, where the control signal is used to adjust a first parameter used when performing the echo cancellation operation.

More specifically, the first parameter may comprise the echo suppression parameter, such as an AEC parameter.

The explanation of the terms in this embodiment may refer to the description related to the embodiment shown in fig. 1, and will not be repeated here.
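Steps S202 to S204 can be sketched end to end as follows. This is a simplified illustration under stated assumptions: the filter coefficients `h_hat` are held fixed instead of being adapted, the residual amount is tracked with an exponentially weighted mean of the squared error, and the function name and the boolean "control flag" are hypothetical stand-ins for the control signal.

```python
def run_detection(mic, downlink, h_hat, threshold, lam=0.9):
    """Return one change flag per sample (True = echo path change detected).

    mic       : microphone samples m(n)
    downlink  : downlink samples x(n), same length as mic
    h_hat     : fixed filter coefficients (no adaptation in this sketch)
    threshold : preset threshold on the residual amount
    lam       : assumed weight of the recursive mean-square estimate
    """
    order = len(h_hat)
    mse = 0.0
    flags = []
    for n, m_n in enumerate(mic):
        # x = [x(n), x(n-1), ..., x(n-L+1)], zero-padded before the start
        x_vec = [downlink[n - k] if n - k >= 0 else 0.0 for k in range(order)]
        y_hat = sum(h * x for h, x in zip(h_hat, x_vec))   # estimated echo
        e_n = m_n - y_hat                                  # S202: error signal
        mse = lam * mse + (1.0 - lam) * e_n * e_n          # residual amount
        flags.append(mse > threshold)                      # S203 / S204
    return flags
```

For instance, with a constant downlink of 1.0, `h_hat = [0.5]`, and a microphone signal that jumps from 0.5 to 1.5 halfway through, the error is zero while the filter matches the path and becomes 1.0 after the jump, so the flags switch to `True` shortly after the change.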

The audio signal processing method according to this embodiment can be implemented based on the audio signal processing system 3 shown in fig. 3. The following describes the processing flow of the audio signal including echo path detection according to the present embodiment in detail with reference to fig. 2 and 3.

In one implementation, in conjunction with fig. 2 and 3, the audio signal processing system 3 may include a collecting module 31 for collecting the audio signal, wherein the audio signal includes an uplink voice signal and an echo signal.

In particular, the acquisition module 31 may be coupled to an audio acquisition device 4 to acquire the audio signal by the audio acquisition device 4.

For smart devices and wearable devices, the limited device size means the audio capture device 4 and the audio output device 5 may be very close together. The audio captured by the audio capture device 4 may therefore include, in addition to the uplink speech signal, the downlink speech signal played by the audio output device 5, that is, the echo signal produced by the acoustic echo of the audio output device 5. This is particularly noticeable in hands-free call mode.

During the hands-free call, when an object (such as a face, a hand, etc.) approaches, the audio acquisition device 4, the approaching object, and the audio output device 5 together form a new echo path, resulting in a change in the original normal echo path.

In this embodiment, the local device is used as a reference, and a signal which is acquired by the audio acquisition device 4 and needs to be transmitted to the remote communication device is referred to as an uplink voice signal and is denoted by v (n); the signal transmitted from the remote communication device is referred to as a downstream voice signal, and is denoted by x (n).

The audio acquisition device 4 may comprise a microphone and the audio output device 5 may comprise a speaker. Correspondingly, the audio signal received by the acquisition module 31 may also be referred to as a microphone signal and is denoted as m (n), where n is a sampling point.

In one specific implementation, the audio signal processing system 3 may further include an adaptive echo cancellation module 32 coupled to the acquisition module 31, where the adaptive echo cancellation module 32 performs an adaptive echo cancellation operation on the audio signal m(n) according to a first parameter to obtain an error signal, denoted e(n). Because the adaptive echo cancellation operation cancels the true echo signal in the audio signal m(n) using an estimated echo signal $\hat{y}(n)$, the resulting error signal e(n) contains residual echo.

In particular, the adaptive echo cancellation module 32 may perform the adaptive echo cancellation operation based on the AEC principle to cancel linear echoes and partially non-linear echoes in the microphone signals m (n).

For example, the adaptive echo cancellation module 32 may include an adaptive filter 321, which receives the downlink voice signal x(n) and estimates an estimated echo signal $\hat{y}(n)$ according to the first parameter and the downlink voice signal x(n).

The downlink voice signal may be obtained from the receiving module 33, and the receiving module 33 is further coupled to the audio output device 5 to play the received downlink voice signal.

The downlink voice signal x(n) sent by the far-end communication device passes through the real echo path $\mathbf{h}$ to produce the real echo signal $y(n) = \mathbf{x}\mathbf{h}^T$, where $\mathbf{h} = [h_0, h_1, \ldots, h_{L-1}]$, $T$ denotes transposition, $L$ is the order of the adaptive filter 321, and $\mathbf{x} = [x(n), x(n-1), \ldots, x(n-L+1)]$.

The microphone signal m(n) comprises the uplink speech signal v(n) and the real echo signal y(n), i.e., $m(n) = v(n) + y(n)$.
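The signal model above can be simulated with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, samples before the start of the downlink signal are assumed to be zero, and the echo path is modelled as a short list of coefficients.

```python
def true_echo(x_vec, h):
    """y(n) = x h^T for x = [x(n), x(n-1), ..., x(n-L+1)]."""
    return sum(x_k * h_k for x_k, h_k in zip(x_vec, h))


def microphone_signal(downlink, uplink, h):
    """Build m(n) = v(n) + y(n) sample by sample.

    downlink : far-end samples x(n)
    uplink   : near-end speech samples v(n), same length as downlink
    h        : echo path coefficients [h_0, ..., h_{L-1}]
    """
    if len(downlink) != len(uplink):
        raise ValueError("downlink and uplink must have the same length")
    order = len(h)
    mic = []
    for n in range(len(downlink)):
        # Samples before n = 0 are assumed to be zero.
        x_vec = [downlink[n - k] if n - k >= 0 else 0.0 for k in range(order)]
        mic.append(uplink[n] + true_echo(x_vec, h))
    return mic
```

Feeding a unit impulse as the downlink signal with silent uplink returns the echo path coefficients themselves, which is a quick sanity check of the indexing.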

However, the real echo signal y(n) cannot be observed directly, so the adaptive filter 321 estimates an estimated echo signal $\hat{y}(n) = \mathbf{x}\hat{\mathbf{h}}^T$ according to the first parameter and the downlink speech signal x(n), where $\hat{\mathbf{h}}$ is the filter coefficient vector of the adaptive filter 321.

Further, the adaptive echo cancellation module 32 may further include a computing unit 322 coupled to the adaptive filter 321 and the acquisition module 31. The computing unit 322 may be configured to calculate the error signal from the audio signal m(n) and the estimated echo signal $\hat{y}(n)$.

The audio signal m(n) obtained from the acquisition module 31 enters with a positive sign ("+") and the estimated echo signal $\hat{y}(n)$ obtained from the adaptive filter 321 enters with a negative sign ("-"); the computing unit 322 adds the two signed quantities to obtain the error signal e(n). That is, the error signal is $e(n) = m(n) - \hat{y}(n)$.

Generally, the filter coefficients of the adaptive filter 321 are influenced by the error signal e(n) and the first parameter (i.e., the AEC parameter); for example, the AEC parameter affects whether the filter coefficients converge to the optimum. However, in existing schemes the AEC parameter is fixed and does not change, so during audio signal processing only the error signal e(n) drives the filter coefficients of the adaptive filter 321. As a result, the response speed is unsatisfactory and the echo estimate is not accurate enough.

With the embodiment, whether the echo path changes is judged by directly detecting the error signal e (n), and the first parameter (i.e. the AEC parameter) is adjusted in time when the echo path changes. Therefore, the filter coefficients of the adaptive filter 321 can be simultaneously influenced from the AEC parameters and the error signal e (n), which is beneficial to improving the response speed and the accuracy of echo estimation.
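To make the interplay between the error signal, the AEC parameter, and the filter coefficients concrete, the sketch below uses a normalized LMS (NLMS) update. NLMS is swapped in here purely as a familiar example of an adaptive filter; the source does not name the adaptation algorithm, and treating the step size `mu` as the "first parameter" is an assumption of this example.

```python
def nlms_step(h_hat, x_vec, m_n, mu=0.5, eps=1e-8):
    """One NLMS update; returns (new coefficients, error sample).

    h_hat : current filter coefficients
    x_vec : [x(n), x(n-1), ..., x(n-L+1)]
    m_n   : microphone sample m(n)
    mu    : step size, standing in for the adjustable AEC parameter
    eps   : small constant that avoids division by zero on silence
    """
    y_hat = sum(h * x for h, x in zip(h_hat, x_vec))   # estimated echo
    e_n = m_n - y_hat                                  # error signal e(n)
    norm = sum(x * x for x in x_vec) + eps
    new_h = [h + mu * e_n * x / norm for h, x in zip(h_hat, x_vec)]
    return new_h, e_n
```

In this sketch both the error sample and `mu` scale the coefficient update, so raising `mu` after a detected echo path change would let the filter re-converge faster, at the cost of a noisier estimate.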

In one implementation, the audio signal processing system 3 may further include: an echo path detecting module 34, coupled to the adaptive echo canceling module 32, wherein the echo path detecting module 34 detects whether the echo path changes according to a residual amount of the residual echo in the error signal e (n).

Specifically, the echo path detecting module 34 may be coupled to the calculating unit 322 to obtain the error signal e (n) output by the calculating unit 322.

Further, the echo path detection module 34 may generate a control signal to adjust the first parameter when detecting that the echo path changes.

For example, the control signal may be transmitted to the adaptive filter 321 to adjust a first parameter used by the adaptive filter 321, as shown by the data transmission path shown by the dotted line in fig. 3.

In one implementation, the echo path detecting module 34 may be further coupled to the receiving module 33 to receive the downlink voice signal x (n).

Specifically, the input signals of the echo path detecting module 34 are an error signal e (n) and a downlink voice signal x (n), and when it is determined that the receiving module 33 receives the downlink voice signal x (n), the echo path detecting module 34 is enabled and executes the scheme of the embodiment shown in fig. 1 to detect whether the echo path changes.
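The enabling condition can be sketched as a simple gate in front of the threshold test. How the presence of the downlink voice signal is determined is not specified in the text; the frame-energy check and the `energy_floor` value below are assumptions made for illustration, as are the function names.

```python
def downlink_active(x_frame, energy_floor=1e-4):
    """Assumed activity test: mean energy of the downlink frame."""
    if not x_frame:
        return False
    energy = sum(s * s for s in x_frame) / len(x_frame)
    return energy > energy_floor


def gated_detect(x_frame, residual, threshold, energy_floor=1e-4):
    """Run the threshold test only while downlink speech is present."""
    if not downlink_active(x_frame, energy_floor):
        # No downlink signal means no echo, so detection is skipped.
        return False
    return residual > threshold
```

With this gate, a large residual that occurs while the far end is silent (for example, near-end speech alone) does not trigger a change decision.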

In one implementation, the echo path detecting module 34 may include: a comparison unit (not shown) for comparing the residual amount with a preset threshold; a first detecting unit (not shown), configured to detect that the echo path changes if the residual amount is greater than the preset threshold; and a second detecting unit (not shown), if the residual amount is smaller than the preset threshold, determining that the echo path is unchanged.

Specifically, the residual amount is measured based on the mean square error of the error signal e(n), or based on the energy or amplitude of the residual echo in the error signal e(n). Taking the mean square error as an example, it is the expected value of the squared error signal e(n), denoted $\sigma_e^2$. The residual amount may be expressed by the formula $\sigma_e^2 = E[e^2(n)]$, where $E[\cdot]$ denotes expectation.
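The link between this mean square error and the echo path can be made explicit by expanding the error signal with the signal model, writing the estimated echo as the downlink vector times the filter coefficients. The short expansion below is a worked illustration, not part of the original text. It additionally assumes that the uplink speech v(n) is zero-mean and uncorrelated with the downlink vector, and that the coefficient mismatch is treated as fixed; these are common modelling assumptions that the source does not state.

```latex
\begin{aligned}
e(n) &= m(n) - \hat{y}(n)
      = v(n) + \mathbf{x}\mathbf{h}^T - \mathbf{x}\hat{\mathbf{h}}^T
      = v(n) + \mathbf{x}\,(\mathbf{h} - \hat{\mathbf{h}})^T, \\
\sigma_e^2 &= E[e^2(n)]
      = E[v^2(n)] + E\!\left[\big(\mathbf{x}\,(\mathbf{h} - \hat{\mathbf{h}})^T\big)^2\right].
\end{aligned}
```

Under these assumptions the second term depends only on the mismatch between the real echo path and the filter coefficients, so it grows when the echo path moves away from what the filter has learned, while the first term is present only when there is uplink speech.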

The following is a detailed description of the mean square error.

For example, in actual calculations, the result of an exponential recursive weighting algorithm may be used to characterize the mean square error. The residual amount may then be computed by a recursion of the form $\sigma_e^2(n) = \lambda\,\sigma_e^2(n-1) + (1-\lambda)\,e^2(n)$, where $\lambda$ is a weight coefficient.

A preset threshold is defined for $\sigma_e^2$. When $\sigma_e^2$ is greater than the preset threshold, the echo path has changed; otherwise, the echo path has not changed.

In one implementation, the audio signal processing system 3 may further include: an echo suppression module 35, coupled to the adaptive echo cancellation module 32, where the echo suppression module 35 further suppresses the residual echo in the error signal e(n) according to a second parameter until the echo is completely eliminated.

Specifically, the echo suppressing module 35 may be coupled to the calculating unit 322 to obtain the error signal e (n) output by the calculating unit 322.

Accordingly, the output of the echo path detection module 34 is a control signal (shown by a dashed line in fig. 3), which can be transmitted to the adaptive echo cancellation module 32 and the echo suppression module 35 to adjust the first parameter and the second parameter, respectively.

For example, when no change in the echo path is detected, the adaptive echo cancellation operation and the echo suppression operation may be performed according to the first parameter and the second parameter of the normal echo path. When a change in the echo path is detected, the echo path detecting module 34 triggers the control signal to control the adaptive echo cancellation module 32 and the echo suppression module 35 to adjust the first parameter and the second parameter, respectively, for example, so that subsequent operations use a larger first parameter and a larger second parameter.
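The parameter switching driven by the control signal might look like the sketch below. The class name, the field names, and all numeric values are hypothetical; the text only states that a larger first parameter and a larger second parameter may be used after a change is detected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EchoParams:
    first: float   # parameter used by the adaptive echo cancellation (AEC) module
    second: float  # parameter used by the echo suppression (ES) module


# Illustrative values only; the text gives no concrete numbers.
NORMAL = EchoParams(first=0.1, second=1.0)
ENHANCED = EchoParams(first=0.5, second=2.0)


def select_params(path_changed, normal=NORMAL, enhanced=ENHANCED):
    """Pick the parameter set according to the echo path detection result."""
    return enhanced if path_changed else normal


if __name__ == "__main__":
    print(select_params(False))  # normal echo path parameters
    print(select_params(True))   # larger parameters after a path change
```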

Thus, this embodiment detects changes in the echo path by measuring the residual amount of the residual echo in the error signal, so the response to an echo path change is more timely. Because the residual amount during double talk differs markedly from the residual amount after an echo path change, double talk is not mistaken for an echo path change; the two can be accurately distinguished, and the detection accuracy is high. Specifically, the residual amount directly represents the size of the echo in the audio signal, so the embodiment can detect whether the echo path changes directly from the error signal with a small amount of computation.

Furthermore, the embodiment can balance double-talk performance against echo suppression after an echo path change, ensuring a good call experience for far-end listeners. Specifically, the change in the echo path is detected by measuring the residual amount of the residual echo in the error signal, and the echo suppression parameters are adjusted according to the detection result. By distinguishing the current echo path state in real time, the echo under the normal echo path can be cancelled cleanly while double talk remains continuous, and the larger echo after an echo path change can also be cancelled effectively.

Fig. 4 shows a segment of the audio signal collected by a microphone, without echo cancellation processing, when a voice communication terminal performs an echo path change test in hands-free narrowband call mode. It can be seen that in the echo path change segment (at sampling point 650000), the echo signal increases significantly. In addition, sampling points 460000 to 600000 are in the double-talk state. Fig. 5 shows the downlink signal, which serves as the echo suppression reference signal.

Fig. 6 shows the output after AEC and ES processing, without applying the echo path detection algorithm and the corresponding suppression. It can be seen that the echo under the normal echo path (before the echo path change segment) is cancelled cleanly and the double-talk segment remains intact, but much of the larger echo remains after the echo path changes.

Fig. 7 shows the output after processing with enhanced AEC and ES parameters, again without applying the echo path detection algorithm and the corresponding suppression. It can be seen that the echo after the echo path is changed can be basically eliminated, but the uplink voice of the double-talk segment (sample point 460000-.

Fig. 8 shows the decision obtained for each frame of the audio signal using the echo path detection method described in this embodiment. It can be seen that the echo path detection algorithm employed in this embodiment accurately determines whether the echo path has changed and distinguishes an echo path change from the double-talk segment.

Fig. 9 is the output after applying the echo path detection algorithm and suppression. It can be seen that when the echo path is not changed, the echo can be cancelled cleanly, and the double talk segment remains intact. When the echo path changes, the echo can also be eliminated cleanly and the double talk is unaffected.

In the plots shown in figs. 4 to 9, the abscissa is the sampling point, except in fig. 8, where the abscissa is the number of frames; the ordinate of all plots is the amplitude of the signal. The number of frames is the total number of samples divided by the number of samples per frame, and in the plots shown in figs. 4 to 9 the number of samples per frame is 160.
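To relate the sample-point axis of figs. 4 to 7 and 9 to the frame axis of fig. 8, a sample index can be converted to a frame index using the stated 160 samples per frame. The sketch below is illustrative; the function name is hypothetical, and zero-based frame numbering is an assumption.

```python
SAMPLES_PER_FRAME = 160  # frame length stated for figs. 4 to 9


def sample_to_frame(sample_index, samples_per_frame=SAMPLES_PER_FRAME):
    """Map a sample index to the zero-based index of the frame containing it."""
    return sample_index // samples_per_frame


if __name__ == "__main__":
    print(sample_to_frame(460000))  # 2875
    print(sample_to_frame(600000))  # 3750
    print(sample_to_frame(650000))  # 4062
```

Under these assumptions, the echo path change near sampling point 650000 would appear around frame 4062 on the abscissa of fig. 8.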

Therefore, by applying this embodiment, whether the echo path has changed is determined simply and effectively by calculating the mean square error, the energy, or the amplitude of the error signal; different echo suppression parameters are selected according to the echo path detection result, and the current echo path state is distinguished in real time. As a result, the echo under the normal echo path is cancelled cleanly, double talk remains continuous, and the larger echo after an echo path change is also cancelled cleanly. This improves both echo cancellation and double-talk performance, and improves the call experience of hands-free voice call terminals.

Embodiments of the present invention further provide a storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the steps of the method shown in fig. 1 or fig. 2. Preferably, the storage medium may include a computer-readable storage medium such as a non-volatile memory or a non-transitory memory. The storage medium may include ROM, RAM, magnetic or optical disks, etc.

An embodiment of the present invention further provides a terminal, where the terminal may include: an audio acquisition device; and the audio signal processing system 3 shown in fig. 3, coupled to the audio acquisition device to process the audio signal acquired by the audio acquisition device. For example, the terminal may include a mobile communication terminal, a wearable device, a smart speaker, or a smart home device. In particular, the audio acquisition device may comprise a microphone and a speaker.

Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. An echo path detection method, comprising:

receiving an error signal, wherein the error signal is obtained by performing an adaptive echo cancellation operation on an audio signal, and the error signal comprises a residual echo;

detecting whether the echo path changes according to the residual quantity of the residual echo in the error signal;

wherein the residual amount is characterized based on a mean square error of the error signal, the mean square error is characterized by a result of an exponential recursive weighting algorithm, and the residual amount is expressed by the formula

$$\sigma_e^2(n) = \lambda\,\sigma_e^2(n-1) + (1-\lambda)\,e^2(n)$$

wherein $\sigma_e^2(n)$ is the residual amount of the residual echo in the error signal e(n), λ is the weight coefficient, and n is the sampling point.

2. The echo path detecting method according to claim 1, wherein said detecting whether the echo path has changed according to a residual amount of residual echo in the error signal comprises:

comparing the residual quantity with a preset threshold value;

if the residual quantity is larger than the preset threshold value, detecting that the echo path changes;

and if the residual quantity is smaller than the preset threshold value, determining that the echo path is not changed.

3. An audio signal processing method, comprising:

receiving the audio signal, wherein the audio signal comprises an uplink voice signal and an echo signal;

performing an adaptive echo cancellation operation on the audio signal to obtain an error signal, wherein the error signal comprises a residual echo;

detecting whether an echo path changes according to the residual quantity of residual echo in the error signal, wherein the residual quantity is represented by the mean square error of the error signal, the mean square error is represented by the result of an exponential recursive weighting algorithm, and the residual quantity is expressed by the formula

$$\sigma_e^2(n) = \lambda\,\sigma_e^2(n-1) + (1-\lambda)\,e^2(n)$$

wherein $\sigma_e^2(n)$ is the residual quantity of the residual echo in the error signal e(n), λ is the weight coefficient, and n is the sampling point;

when detecting that the echo path changes, generating a control signal, wherein the control signal is used for adjusting a first parameter used when the echo cancellation operation is executed.

4. The audio signal processing method of claim 3, wherein the control signal is further configured to adjust a second parameter used in performing an echo suppression operation, wherein the echo suppression operation is configured to cancel residual echo in the error signal.

5. The audio signal processing method of claim 3, wherein the detecting whether the echo path is changed according to a residual amount of residual echo in the error signal comprises:

comparing the residual quantity with a preset threshold value;

if the residual quantity is greater than the preset threshold value, detecting that the echo path changes;

6. The audio signal processing method of claim 3, wherein the error signal is a difference between the audio signal and an estimated echo signal, wherein the estimated echo signal is estimated according to the first parameter and a downlink speech signal.

7. The audio signal processing method of claim 3, further comprising, before detecting whether the echo path is changed according to a residual amount of residual echo in the error signal:

judging whether a downlink voice signal is received or not;

and when the judgment result shows that the downlink voice signal is received, detecting whether the echo path is changed or not according to the residual quantity of the residual echo in the error signal.

8. The audio signal processing method according to claim 3, wherein the audio signal is acquired by a microphone.

9. An audio signal processing system, comprising:

an acquisition module, configured to acquire an audio signal, wherein the audio signal comprises an uplink voice signal and an echo signal;

the adaptive echo cancellation module is coupled with the acquisition module and performs adaptive echo cancellation operation on the audio signal according to a first parameter to obtain an error signal, wherein the error signal comprises residual echo;

an echo path detection module coupled to the adaptive echo cancellation module, the echo path detection module detecting whether the echo path changes according to a residual amount of a residual echo in the error signal, wherein the residual amount is characterized by a mean square error of the error signal, the mean square error is characterized by a result of an exponential recursive weighting algorithm, and the residual amount is expressed by the formula

$$\sigma_e^2(n) = \lambda\,\sigma_e^2(n-1) + (1-\lambda)\,e^2(n)$$

wherein $\sigma_e^2(n)$ is the residual amount of the residual echo in the error signal e(n), λ is the weight coefficient, and n is the sampling point;

when detecting that the echo path changes, the echo path detection module generates a control signal to adjust the first parameter.

10. The audio signal processing system of claim 9, further comprising:

and the echo suppression module is coupled with the adaptive echo cancellation module and cancels the residual echo in the error signal according to a second parameter.

11. The audio signal processing system of claim 10, wherein the control signal is further configured to adjust the second parameter.

12. The audio signal processing system of claim 9, wherein the adaptive echo cancellation module comprises:

the adaptive filter receives a downlink voice signal and estimates an estimated echo signal according to the first parameter and the downlink voice signal;

and the computing unit is coupled with the adaptive filter and the acquisition module and computes to obtain the error signal according to the audio signal and the estimated echo signal.

13. The audio signal processing system of claim 9, wherein the echo path detection module comprises:

the comparison unit is used for comparing the residual quantity with a preset threshold value;

the first detection unit detects that the echo path changes if the residual quantity is greater than the preset threshold value;

and the second detection unit is used for determining that the echo path is not changed if the residual quantity is smaller than the preset threshold value.

14. The audio signal processing system of claim 9, further comprising:

and the receiving module is used for receiving the downlink voice signal.

15. The audio signal processing system of claim 14, wherein the echo path detection module is coupled to the receiving module, and wherein the echo path detection module detects whether the echo path has changed when it is determined that the receiving module receives a downlink voice signal.

16. A storage medium having a computer program stored thereon, the computer program, when being executed by a processor, performing the steps of the method according to any of the claims 1 to 8.

17. A terminal, comprising:

an audio capture device, configured to acquire an audio signal;

the audio signal processing system of any one of claims 9 to 15, coupled to the audio capture device to process the audio signal acquired by the audio capture device.

18. The terminal of claim 17, wherein the audio capture device comprises a microphone.