CN113903351A - Echo cancellation method, device, equipment and storage medium - Google Patents
- Publication number
- CN113903351A (application number CN202111171723.4A)
- Authority
- CN
- China
- Prior art keywords
- voice interaction
- interaction device
- time delay
- time
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0224—Processing in the time domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Telephone Function (AREA)
- Circuit For Audible Band Transducer (AREA)
- Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
Abstract
The disclosure provides an echo cancellation method, apparatus, device and storage medium. The method includes: a computing device estimates the time delay between a reference signal played by a second voice interaction apparatus and the collected echo signal corresponding to the reference signal, where the second voice interaction apparatus is the voice interaction apparatus currently used by the computing device; and the computing device cancels the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay. The present disclosure improves the echo cancellation effect.
Description
This application is a divisional application of the application with application number 201910205707.9, filed on March 18, 2019, and entitled "Echo cancellation method, device, equipment and storage medium".
Technical Field
The present disclosure relates to the field of signal processing, and in particular, to a method, an apparatus, a device, and a storage medium for echo cancellation.
Background
Currently, in speech recognition, echo in a collected speech signal can be cancelled through echo cancellation processing, for example with an Acoustic Echo Cancellation (AEC) algorithm.
In the prior art, echo cancellation processing cancels the echo signal contained in the speech signal collected by a microphone according to the time delay between a played reference signal and the echo signal, collected by the microphone, that corresponds to the reference signal. This recovers the speech uttered by the talker and avoids the echo that arises when the echo signal is superimposed on that speech. In general, the time delay used in echo cancellation processing is a default time delay; that is, the echo signal contained in the speech signal collected by the microphone is cancelled based on the default time delay.
However, because the prior art uses a default time delay in echo cancellation processing, the echo cancellation effect is poor.
Disclosure of Invention
The embodiment of the disclosure provides an echo cancellation method, an echo cancellation device, an echo cancellation apparatus, and a storage medium, which are used to solve the problem in the prior art that an echo cancellation effect is poor due to the use of a default time delay in echo cancellation processing.
In a first aspect, an embodiment of the present disclosure provides an echo cancellation method, including:
when the voice interaction apparatus used by a computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the computing device estimates the time delay between a reference signal played by the second voice interaction apparatus and the collected echo signal corresponding to the reference signal;
and the computing device cancels the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay.
In a possible implementation, if a connection object of the computing device changes, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus.
In one possible implementation, if the computing device is changed from being connected with a target device to not being connected with the target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the target device includes the first voice interaction apparatus, and the computing device includes the second voice interaction apparatus;
or, if the computing device is changed from being not connected with the target device to being connected with the target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the computing device includes the first voice interaction apparatus, and the target device includes the second voice interaction apparatus.
In one possible implementation, the target device is a vehicle.
In a possible implementation, if the computing device is changed from being connected with a first target device to being connected with a second target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the first target device includes the first voice interaction apparatus, and the second target device includes the second voice interaction apparatus.
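The connection cases above (connected to not connected, not connected to connected, and one target device to another) can be illustrated with a minimal sketch. The `VoiceApparatus` type, the function names, and the policy of preferring a connected target device's apparatus over the built-in one are all assumptions made for illustration; the disclosure does not prescribe them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoiceApparatus:
    """Hypothetical handle for one voice interaction apparatus."""
    owner: str  # the device that contains the apparatus, e.g. "phone" or "vehicle"
    name: str


def active_apparatus(builtin: VoiceApparatus,
                     connected: Optional[VoiceApparatus]) -> VoiceApparatus:
    """Assumed policy: use the connected target device's apparatus when there is one."""
    return connected if connected is not None else builtin


def apparatus_changed(builtin: VoiceApparatus,
                      old_connection: Optional[VoiceApparatus],
                      new_connection: Optional[VoiceApparatus]) -> bool:
    """Return True when a connection change switches the apparatus in use."""
    before = active_apparatus(builtin, old_connection)
    after = active_apparatus(builtin, new_connection)
    return before != after
```

Under this assumed policy, connecting a phone to a vehicle, disconnecting from it, or moving from one target device to another each reports a change, which is the point at which the time delay would be estimated again.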
In one possible implementation, the estimating, by the computing device, a time delay between a reference signal played by the second voice interaction apparatus and a collected echo signal corresponding to the reference signal includes:
the computing device determines, according to a plurality of first time points and a plurality of second time points in one-to-one correspondence with the first time points, the time difference between each first time point and its corresponding second time point, so as to obtain a plurality of time differences, wherein a first time point is a time point at which the second voice interaction apparatus plays the reference signal, and a second time point is the time point at which the second voice interaction apparatus collects the echo signal corresponding to the reference signal played at the corresponding first time point;
the computing device determines the time delay between the reference signal and the echo signal according to the plurality of time differences.
In one possible implementation, the computing device determining, from the plurality of time differences, a time delay of the reference signal and the echo signal includes:
the computing device determines the time delay between the reference signal and the echo signal according to the plurality of time differences and a preset estimation algorithm.
In one possible implementation, the preset estimation algorithm is a least mean square (LMS) algorithm.
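As a hedged illustration of the two steps above, the sketch below first forms the time differences from paired time points and then refines a single delay estimate with an LMS-style recursion on a one-parameter constant model. The function names, the step size, and the use of seconds as the unit are assumptions; the disclosure only states that an LMS algorithm may be used and does not specify its form.

```python
def time_differences(play_times, capture_times):
    """Pair each playback time point with its capture time point and subtract."""
    if len(play_times) != len(capture_times):
        raise ValueError("time points must correspond one to one")
    return [capture - play for play, capture in zip(play_times, capture_times)]


def estimate_delay_lms(differences, step=0.1, initial=0.0, passes=1):
    """Scalar LMS recursion: nudge the estimate toward each observed difference."""
    estimate = initial
    for _ in range(passes):
        for observed in differences:
            error = observed - estimate  # instantaneous estimation error
            estimate += step * error     # LMS update for a constant model
    return estimate
```

For example, `estimate_delay_lms(time_differences(play, capture), step=0.5, passes=20)` settles near the common value of the differences when they agree, and near a recency-weighted average when they are noisy. A smaller step size smooths more but converges more slowly.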
In a possible implementation, the eliminating, by the computing device, the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay includes:
the computing device determines whether the time delay is within a preset time delay range;
if the time delay is within the time delay range, the computing device cancels the echo signal in the original signal collected by the second voice interaction apparatus according to the time delay;
and if the time delay is not within the time delay range, the computing device cancels the echo signal in the original signal collected by the second voice interaction apparatus according to a time delay that is within the time delay range.
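The range check can be sketched as follows. The disclosure does not say which in-range value is used when the estimate falls outside the preset range; choosing the nearest bound, as done here, is an assumption (falling back to a default in-range delay would be another reading). The function and parameter names are hypothetical.

```python
def select_delay(estimated, min_delay, max_delay):
    """Return the estimate if it is in range, otherwise an in-range substitute."""
    if min_delay > max_delay:
        raise ValueError("min_delay must not exceed max_delay")
    if min_delay <= estimated <= max_delay:
        return estimated
    # Assumption: substitute the nearest bound of the preset range.
    return min_delay if estimated < min_delay else max_delay
```

The check guards against implausible estimates, for instance one produced while the reference signal was silent.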
In one possible implementation, the canceling, by the computing device, the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay includes:
the computing device cancels the echo signal in the original signal collected by the second voice interaction apparatus by using an Acoustic Echo Cancellation (AEC) algorithm according to the estimated time delay.
In a possible implementation, after the computing device cancels the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay, the method further includes:
carrying out voice recognition on the voice signal obtained after the elimination to obtain a voice recognition result;
and performing subsequent processing according to the voice recognition result.
In one possible implementation, the subsequent processing includes a wake-up processing and/or an output processing.
In a second aspect, an embodiment of the present disclosure provides an echo cancellation apparatus applied to a computing device, including:
the estimation module is used for estimating the time delay between a reference signal played by the second voice interaction device and an acquired echo signal corresponding to the reference signal when the voice interaction device used by the computing equipment is changed from a first voice interaction device to a second voice interaction device;
and the elimination module is used for eliminating the echo signal in the original signal acquired by the second voice interaction device according to the estimated time delay.
In a possible implementation, if a connection object of the computing device changes, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus.
In one possible implementation, if the computing device is changed from being connected with a target device to not being connected with the target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the target device includes the first voice interaction apparatus, and the computing device includes the second voice interaction apparatus;
or, if the computing device is changed from being not connected with the target device to being connected with the target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the computing device includes the first voice interaction apparatus, and the target device includes the second voice interaction apparatus.
In a possible implementation, if the computing device is changed from being connected with a first target device to being connected with a second target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the first target device includes the first voice interaction apparatus, and the second target device includes the second voice interaction apparatus.
In one possible implementation, the estimation module is specifically configured to:
determining, according to a plurality of first time points and a plurality of second time points in one-to-one correspondence with the first time points, the time difference between each first time point and its corresponding second time point, so as to obtain a plurality of time differences, wherein a first time point is a time point at which the second voice interaction apparatus plays the reference signal, and a second time point is the time point at which the second voice interaction apparatus collects the echo signal corresponding to the reference signal played at the corresponding first time point;
and determining the time delay between the reference signal and the echo signal according to the plurality of time differences.
In a possible implementation, the estimation module being configured to determine the time delay between the reference signal and the echo signal according to the plurality of time differences specifically includes:
determining the time delay between the reference signal and the echo signal according to the plurality of time differences and a preset estimation algorithm.
In one possible implementation, the preset estimation algorithm is a least mean square (LMS) algorithm.
In one possible implementation, the cancellation module is specifically configured to:
judging whether the time delay is within a preset time delay range or not;
if the time delay is within the time delay range, eliminating the echo signal in the original signal collected by the second voice interaction device according to the time delay;
and if the time delay is not within the time delay range, canceling the echo signal in the original signal collected by the second voice interaction apparatus according to a time delay that is within the time delay range.
In a possible implementation, the cancellation module canceling the echo signal in the original signal collected by the second voice interaction apparatus according to the time delay specifically includes:
canceling the echo signal in the original signal collected by the second voice interaction apparatus by using an Acoustic Echo Cancellation (AEC) algorithm according to the estimated time delay.
In one possible implementation, the apparatus further comprises: a response module;
the response module is configured to: carrying out voice recognition on the voice signal obtained after the elimination to obtain a voice recognition result; and performing subsequent processing according to the voice recognition result.
In one possible implementation, the subsequent processing includes a wake-up processing and/or an output processing.
In a third aspect, an embodiment of the present disclosure provides an echo cancellation device, including:
a processor and a memory for storing computer instructions; the processor executes the computer instructions to perform the method of any of the first aspects described above.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing instructions that, when executed by a processor of an echo cancellation device, enable the echo cancellation device to perform the method of any one of the above first aspects.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product, including: a computer program, stored in a readable storage medium, from which at least one processor of an electronic device can read the computer program, execution of the computer program by the at least one processor causing the electronic device to perform the method of any of the first aspects.
With the echo cancellation method, apparatus, device and storage medium provided in the embodiments of the present disclosure, when the voice interaction apparatus used by a computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the computing device estimates the time delay between a reference signal played by the second voice interaction apparatus and the collected echo signal corresponding to the reference signal, and cancels the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay. In this way, when the voice interaction apparatus used by the computing device changes, the time delay of the new voice interaction apparatus can be estimated in time, and the echo signal in the original signal it collects is cancelled based on that estimate. This avoids the poor echo cancellation caused by using a default time delay, and also the poor echo cancellation caused by an inaccurate time delay when the time delay of the previous voice interaction apparatus is used to cancel the echo signal in the original signal collected by the new one, thereby improving the echo cancellation effect.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art according to the drawings.
Fig. 1 is a schematic view of a first application scenario of an echo cancellation method according to an embodiment of the present disclosure;
fig. 2 is a schematic view of a second application scenario of the echo cancellation method according to the embodiment of the present disclosure;
fig. 3 is a schematic view of a third application scenario of the echo cancellation method according to the embodiment of the present disclosure;
fig. 4 is a schematic flowchart of a first echo cancellation method according to an embodiment of the present disclosure;
fig. 5 is a schematic flowchart of a second echo cancellation method according to an embodiment of the present disclosure;
fig. 6 is a schematic flowchart of a third echo cancellation method according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of a first echo cancellation device according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of a second echo cancellation device according to an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments obtained based on the embodiments in the disclosure belong to the protection scope of the disclosure.
Fig. 1 is a schematic view of an application scenario of the echo cancellation method according to the embodiment of the present disclosure. As shown in fig. 1, the application scenario may include a computing device 11, and the computing device 11 may include at least two voice interaction apparatuses, for example, a voice interaction apparatus a and a voice interaction apparatus b in fig. 1. The computing device 11 may use the voice interaction apparatus a or the voice interaction apparatus b to perform voice interaction with the user. Specifically, the computing device 11 may use its voice interaction apparatus a to collect voice and to play voice, such as playing music or navigation prompts; alternatively, the computing device 11 may use its voice interaction apparatus b to collect voice and to play voice.
Fig. 2 is a schematic view of an application scenario of the echo cancellation method according to the embodiment of the present disclosure. As shown in fig. 2, the application scenario may include a computing device 11 and a first target device 12, where the computing device 11 may include at least one voice interaction apparatus and the first target device 12 may include at least one voice interaction apparatus; for example, in fig. 2, the computing device 11 includes a voice interaction apparatus a, and the first target device 12 includes a voice interaction apparatus b. The computing device 11 may use the voice interaction apparatus a of the computing device 11 or the voice interaction apparatus b of the first target device 12 to perform voice interaction with the user. Specifically, the computing device 11 may use the voice interaction apparatus a of the computing device 11 to collect voice and to play voice, such as playing music or navigation prompts; alternatively, the computing device 11 may use the voice interaction apparatus b of the first target device 12 to collect voice and to play voice.
Fig. 3 is a schematic diagram of an application scenario of the echo cancellation method according to the embodiment of the present disclosure. As shown in fig. 3, the application scenario may include a computing device 11, a first target device 12, and a second target device 13, where the first target device 12 may include at least one voice interaction apparatus and the second target device 13 may include at least one voice interaction apparatus; for example, in fig. 3, the first target device 12 includes a voice interaction apparatus a, and the second target device 13 includes a voice interaction apparatus b. The computing device 11 may use the voice interaction apparatus a of the first target device 12 or the voice interaction apparatus b of the second target device 13 to perform voice interaction with the user. Specifically, the computing device 11 may use the voice interaction apparatus a of the first target device 12 to collect voice and to play voice, such as playing music or navigation prompts; alternatively, the computing device 11 may use the voice interaction apparatus b of the second target device 13 to collect voice and to play voice.
It is understood that the above three application scenarios may be combined. One application scenario may include the computing device 11, the first target device 12, and the second target device 13, where the computing device 11 may include at least two voice interaction apparatuses, and the first target device 12 and the second target device 13 may each include one voice interaction apparatus. The computing device 11 may use one voice interaction apparatus of the computing device 11 to collect voice and to play voice; alternatively, the computing device 11 may use another voice interaction apparatus of the computing device 11 to collect voice and to play voice; alternatively, the computing device 11 may use the voice interaction apparatus of the first target device 12 to collect voice and to play voice; alternatively, the computing device 11 may use the voice interaction apparatus of the second target device 13 to collect voice and to play voice.
It should be noted that the voice interaction device in the embodiment of the present disclosure may be any entity device capable of collecting voice and playing the voice.
It should be noted that the computing device 11 may be any device that can play and collect voice through a voice interaction apparatus and that has a certain computing capability (e.g., for estimating a time delay). The present disclosure does not limit the specific type of computing device; it may be, for example, a mobile phone, a tablet, a wearable device, or the like.
It should be noted that the present disclosure does not limit the manner of connection between the computing device and the target device, or the voice interaction apparatus of the target device, in fig. 2 and fig. 3.
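Fig. 4 is a schematic flowchart of a first embodiment of an echo cancellation method according to an embodiment of the present disclosure. The method of this embodiment may be performed by a computing device. As shown in fig. 4, the method of this embodiment may include:
Step 401: when the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the computing device estimates the time delay between a reference signal played by the second voice interaction apparatus and the collected echo signal corresponding to the reference signal.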
In this step, the first voice interaction device may be understood as the voice interaction device a, and the second voice interaction device may be understood as the voice interaction device b; alternatively, the first voice interaction device may be understood as the voice interaction device b, and the second voice interaction device may be understood as the voice interaction device a. The voice interaction device used by the computing equipment can be understood as a voice interaction device used by the computing equipment for playing and collecting voice, and a user can perform voice interaction with the computing equipment through the voice interaction device.
For the application scenario shown in fig. 1, the voice interaction means used by the computing device is changed from the first voice interaction means to the second voice interaction means, for example, the voice interaction means used by the computing device 11 can be changed from the voice interaction means a of the computing device 11 to the voice interaction means b of the computing device 11. At this time, the voice interaction device a of the computing apparatus 11 may be understood as a first voice interaction device, and the voice interaction device b of the computing apparatus 11 may be understood as a second voice interaction device.
For the application scenario shown in fig. 2, the voice interaction means used by the computing device is changed from the first voice interaction means to the second voice interaction means, for example, the voice interaction means used by the computing device 11 is changed from the voice interaction means a of the computing device 11 to the voice interaction means b of the first target device 12. At this time, the voice interaction apparatus a of the computing device 11 may be understood as a first voice interaction apparatus, and the voice interaction apparatus b of the first target device 12 may be understood as a second voice interaction apparatus.
For the application scenario shown in fig. 3, the voice interaction means used by the computing device is changed from the first voice interaction means to the second voice interaction means, for example, the voice interaction means used by the computing device 11 is changed from the voice interaction means a of the first target device 12 to the voice interaction means b of the second target device 13. At this time, the voice interaction apparatus a of the first target device 12 may be understood as a first voice interaction apparatus, and the voice interaction apparatus b of the second target device 13 may be understood as a second voice interaction apparatus.
The voice signal that the computing device plays through the voice interaction apparatus may be referred to as the reference signal, and the voice signal that the computing device collects through the voice interaction apparatus may be referred to as the original signal. It is understood that after the computing device plays the reference signal, the played sound may be picked up by the voice interaction apparatus; that is, the collected original signal may include the echo signal corresponding to the reference signal played by the computing device.
Because different voice interaction apparatuses have different hardware structures, the time delay between the reference signal played by the computing device and the collected echo signal corresponding to the reference signal may differ from one voice interaction apparatus to another. Here, the time delay for the second voice interaction apparatus is estimated when the voice interaction apparatus used by the computing device is changed from the first voice interaction apparatus to the second voice interaction apparatus, so that the time delay that applies after the change is obtained in time.
It will be appreciated that during the playing of the reference signal by the computing device, the original signal collected may also include the user's speech signal when the user speaks.
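The relationship between the signals described above can be written compactly with a conventional linear echo model. This is only an illustrative sketch: the symbols are assumptions introduced here and do not appear in the disclosure. Let x(n) be the reference signal, y(n) the collected original signal, s(n) the user's speech, d the time delay in samples, and h_k the coefficients of an echo path of length K. Cancellation then subtracts an estimate of the echo built from the reference signal shifted by the estimated delay.

```latex
y(n) = s(n) + \sum_{k=0}^{K-1} h_k \, x(n - d - k)

\hat{s}(n) = y(n) - \sum_{k=0}^{K-1} \hat{h}_k \, x(n - \hat{d} - k)
```

When the estimated delay is far from the true delay d, the shifted reference no longer lines up with the echo inside the K-tap window, which is why an inaccurate or default delay degrades cancellation.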
It should be noted that the present disclosure does not limit the specific manner in which the computing device estimates the time delay between the reference signal played by the second voice interaction apparatus and the collected echo signal corresponding to the reference signal.
It should be noted that the embodiments of the present disclosure also do not limit the specific manner in which the computing device determines that the voice interaction apparatus it uses has changed from the first voice interaction apparatus to the second voice interaction apparatus. For example, the computing device may monitor the voice interaction apparatus in use to determine whether it has changed, that is, whether it has changed from the first voice interaction apparatus to the second voice interaction apparatus.
In this step, the embodiment of the present disclosure does not limit the specific manner of cancelling the echo signal in the original signal collected by the second voice interaction apparatus according to the time delay estimated in step 401. For example, the reference signal may be shifted according to the estimated time delay, and the echo signal in the original signal collected by the second voice interaction apparatus may be cancelled according to the collected original signal and the shifted reference signal.
Here, in step 401 the time delay between the reference signal played by the second voice interaction apparatus and the collected echo signal corresponding to the reference signal is estimated when the voice interaction apparatus used by the computing device is changed from the first voice interaction apparatus to the second voice interaction apparatus. In step 402, this time delay can therefore be used to cancel the echo signal in the original signal collected by the second voice interaction apparatus. This avoids the poor echo cancellation effect caused by an inaccurate time delay that would result if, after the change, the echo signal in the original signal collected by the second voice interaction apparatus were still cancelled using the time delay of the first voice interaction apparatus.
In the echo cancellation method provided in this embodiment, when the voice interaction apparatus used by the computing device is changed from the first voice interaction apparatus to the second voice interaction apparatus, the computing device estimates the time delay between the reference signal played by the second voice interaction apparatus and the collected echo signal corresponding to the reference signal, and cancels the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay. In this way, when the voice interaction apparatus used by the computing device changes, the time delay of the changed voice interaction apparatus is estimated promptly, and the echo signal in the original signal collected by the changed voice interaction apparatus is cancelled based on that estimate. This avoids the poor echo cancellation effect caused by using a default time delay, and also avoids the poor echo cancellation effect caused by an inaccurate time delay when, after the change, the echo signal in the original signal collected by the changed voice interaction apparatus (i.e. the second voice interaction apparatus) is still cancelled using the time delay of the voice interaction apparatus before the change (i.e. the first voice interaction apparatus), thereby improving the echo cancellation effect.
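The shift-then-cancel example above can be sketched as follows. This is only a minimal illustration, not the method of the disclosure: the function names, the use of a delay expressed in samples, and the fixed `echo_gain` are all assumptions introduced here, and a practical canceller would estimate the echo path adaptively rather than subtract a scaled copy.

```python
def shift_reference(reference, delay_samples):
    """Delay the reference by delay_samples, padding the start with zeros."""
    if delay_samples <= 0:
        return list(reference)
    padded = [0.0] * delay_samples + list(reference)
    return padded[:len(reference)]


def cancel_echo(original, reference, delay_samples, echo_gain=1.0):
    """Subtract the shifted, scaled reference from the collected original signal."""
    shifted = shift_reference(reference, delay_samples)
    return [o - echo_gain * r for o, r in zip(original, shifted)]


# Toy data (hypothetical): the echo is the reference delayed by 2 samples
# at half amplitude, mixed with a short burst of user speech.
reference = [1.0, -1.0, 0.5, 0.0, 0.0, 0.0]
speech = [0.0, 0.0, 0.0, 0.2, 0.3, 0.0]
echo = [0.0, 0.0, 0.5, -0.5, 0.25, 0.0]
original = [s + e for s, e in zip(speech, echo)]

residual = cancel_echo(original, reference, delay_samples=2, echo_gain=0.5)
```

With the correct delay, `residual` matches `speech` up to rounding; with a wrong delay the shifted reference no longer lines up with the echo, which is the failure the estimated time delay is meant to prevent.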
Fig. 5 is a flowchart illustrating a second echo cancellation method according to an embodiment of the present disclosure. On the basis of the embodiment shown in fig. 4, this embodiment mainly describes an optional implementation manner in which, when the voice interaction apparatus changes, the computing device estimates the time delay between the reference signal played by the second voice interaction apparatus and the collected echo signal corresponding to the reference signal.
In this step, if the connection object of the computing device changes, it may indicate that the voice interaction apparatus changes, that is, the voice interaction apparatus used by the computing device changes from the first voice interaction apparatus to the second voice interaction apparatus. If the connection object of the computing device is not changed, it may indicate that the voice interaction apparatus is not changed, that is, the voice interaction apparatus used by the computing device is not changed from the first voice interaction apparatus to the second voice interaction apparatus.
The first voice interaction device can be understood as a voice interaction device used before the voice interaction device used by the computing equipment is changed. The second voice interaction device can be understood as a voice interaction device used by the computing equipment after being changed.
Optionally, the change of the connection object of the computing device may specifically be a change between two states: the computing device being connected with a target device, and the computing device not being connected with the target device.
Specifically, if the computing device is changed from being connected with a target device to being unconnected with the target device, a voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the target device includes the first voice interaction apparatus, and the computing device includes the second voice interaction apparatus; or, if the computing device is changed from being not connected with the target device to being connected with the target device, the voice interaction apparatus used by the computing device is changed from the first voice interaction apparatus to the second voice interaction apparatus, the computing device includes the first voice interaction apparatus, and the target device includes the second voice interaction apparatus.
For example, as shown in fig. 2, when the computing device 11 is connected to the first target device 12, the computing device 11 may perform voice interaction with the user using the voice interaction apparatus b of the first target device 12; when the computing device 11 is not connected with the first target device 12, the computing device 11 may perform voice interaction with the user using the voice interaction apparatus a of the computing device 11. Therefore, a change in the connection state between the computing device 11 and the first target device 12 may indicate that the voice interaction apparatus used by the computing device changes from the first voice interaction apparatus to the second voice interaction apparatus. Specifically, when the computing device 11 changes from being connected with the first target device 12 to not being connected with the first target device 12, the voice interaction apparatus b may be regarded as the first voice interaction apparatus, and the voice interaction apparatus a may be regarded as the second voice interaction apparatus; when the computing device 11 changes from not being connected with the first target device 12 to being connected with the first target device 12, the voice interaction apparatus a may be regarded as the first voice interaction apparatus, and the voice interaction apparatus b may be regarded as the second voice interaction apparatus.
It should be noted that the target device may specifically be a device with which the computing device 11 can establish a connection and part of whose hardware the computing device 11 can control, where that part of the hardware includes a voice interaction apparatus. For example, the target device may be a vehicle; in this case, the computing device may be a computing device that supports a specific function, namely the function of establishing a connection with the target device and controlling part of the hardware of the target device.
Alternatively, the change of the connection object of the computing device may specifically be a change between two states: the computing device being connected with one target device, and the computing device being connected with another target device. Specifically, if the computing device is changed from being connected with a first target device to being connected with a second target device, the voice interaction apparatus used by the computing device is changed from the first voice interaction apparatus to the second voice interaction apparatus, the first target device includes the first voice interaction apparatus, and the second target device includes the second voice interaction apparatus.
For example, as shown in fig. 3, when the computing device 11 is connected to the first target device 12, the computing device 11 may perform voice interaction with the user using the voice interaction apparatus a of the first target device 12; when the computing device 11 is connected with the second target device 13, the computing device 11 may perform voice interaction with the user using the voice interaction apparatus b of the second target device 13. Therefore, a change in the connection state between the computing device 11 and the first target device 12 and the second target device 13 may indicate that the voice interaction apparatus used by the computing device changes from the first voice interaction apparatus to the second voice interaction apparatus. Specifically, when the computing device 11 changes from being connected with the first target device 12 to being connected with the second target device 13, the voice interaction apparatus a may be regarded as the first voice interaction apparatus, and the voice interaction apparatus b may be regarded as the second voice interaction apparatus; when the computing device 11 changes from being connected with the second target device 13 to being connected with the first target device 12, the voice interaction apparatus b may be regarded as the first voice interaction apparatus, and the voice interaction apparatus a may be regarded as the second voice interaction apparatus.
If the connection object of the computing device changes, executing step 502; and if the connection object of the computing equipment is not changed, ending the process.
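The judgment on the connection object described above can be sketched as follows. The labels, the function names, and the convention that `None` means "not connected to any target device" are hypothetical choices made for illustration; the disclosure does not specify how the connection state is represented.

```python
from typing import Optional, Tuple

# Hypothetical label for the voice interaction apparatus built into the
# computing device itself (used when no target device is connected).
LOCAL_APPARATUS = "computing-device"


def active_apparatus(connected_target: Optional[str]) -> str:
    """Return the apparatus in use for a given connection object."""
    return LOCAL_APPARATUS if connected_target is None else connected_target


def detect_change(previous_target: Optional[str],
                  current_target: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return (first_apparatus, second_apparatus) if the connection object
    changed, or None if it did not (in which case the process ends)."""
    first = active_apparatus(previous_target)
    second = active_apparatus(current_target)
    if first == second:
        return None
    return first, second
```

For instance, `detect_change("target-12", None)` covers the connected-to-unconnected case, and `detect_change("target-12", "target-13")` covers switching between two target devices; only a non-`None` result would trigger the delay estimation of step 502.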
In this step, the second voice interaction apparatus may be understood as a voice interaction apparatus currently used by the computing device. Optionally, the time delay may be determined by:
Step A, the computing device determines, according to a plurality of first time points and a plurality of second time points in one-to-one correspondence with the plurality of first time points, the time difference between each first time point and its corresponding second time point, to obtain a plurality of time differences, wherein a first time point is a time point at which the second voice interaction apparatus plays the reference signal, and the corresponding second time point is the time point at which the second voice interaction apparatus collects the echo signal corresponding to the reference signal played at that first time point.
Here, in order to avoid the problem that the determined time delay is inaccurate due to inaccuracy of a single time difference, optionally, a plurality of time differences may be obtained according to the plurality of first time points and the plurality of second time points. For example, the computing device may record a time point 1 (which may be understood as a first time point) at which the speech signal x is played (which may be understood as a reference signal), collect an original signal, and record a time point 2 at which the original signal is collected, where if the speech signal x is included in the original signal, the time point 2 is a second time point corresponding to the time point 1, and further, may obtain a time difference between the time point 2 and the time point 1. For another example, when playing the voice signal y (which may be understood as a reference signal), the computing device may record a time point 3 (which may be understood as a first time point) at which the voice signal y is played, collect an original signal, and record a time point 4 at which the original signal is collected, where if the original signal includes the voice signal y, the time point 4 is a second time point corresponding to the time point 3, and further, may obtain a time difference between the time point 4 and the time point 3.
It should be noted that the present disclosure does not limit the specific manner of determining whether the collected original signal includes the reference signal.
Step B, the computing device determines the time delay between the reference signal and the echo signal according to the plurality of time differences.
Specifically, the time delay between the reference signal and the echo signal may be obtained by performing mathematical calculation on a plurality of time differences, for example, the time delay may be obtained by averaging a plurality of time differences. Optionally, when the time delay is obtained according to the time difference, a certain estimation algorithm may be adopted. Further optionally, step B may specifically include: and the computing equipment determines the time delay of the reference signal and the echo signal according to the time differences and a preset estimation algorithm.
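Steps A and B, with averaging as the mathematical calculation, can be sketched as follows. The function names and the millisecond timestamps are hypothetical; the sketch also assumes the first and second time points have already been paired one to one.

```python
from statistics import mean


def time_differences(first_points, second_points):
    """Step A: difference between each play time and its matching capture time."""
    if len(first_points) != len(second_points):
        raise ValueError("time points must correspond one to one")
    return [t2 - t1 for t1, t2 in zip(first_points, second_points)]


def estimate_delay(first_points, second_points):
    """Step B: average the time differences to obtain the time delay."""
    return mean(time_differences(first_points, second_points))


# Hypothetical timestamps in milliseconds.
play_times_ms = [0, 1000, 2000]
capture_times_ms = [118, 1122, 2120]
delay_ms = estimate_delay(play_times_ms, capture_times_ms)
```

Here the three differences are 118, 122 and 120 ms, so the averaged delay is 120 ms; using several differences dampens the error of any single measurement.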
Illustratively, the preset estimation algorithm is a Least-Mean-Square (LMS) algorithm. Using the LMS algorithm as the preset estimation algorithm allows the time delay to be determined from the plurality of time differences in a machine-learning manner, which improves the accuracy of the time delay determination.
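One way an LMS update could be applied to the time differences is sketched below. The disclosure does not describe how the LMS algorithm is configured, so this is an assumption: a single adaptive weight with a constant input of 1, so that the weight itself is the running delay estimate. The function name, step size, and sample values are hypothetical.

```python
def lms_delay_estimate(time_differences, step_size=0.1, initial=0.0):
    """Single-weight LMS: the input is the constant 1, so the prediction
    equals the weight, and each observed time difference nudges the weight
    by step_size times the prediction error."""
    weight = initial
    for observed in time_differences:
        error = observed - weight      # desired value minus prediction
        weight += step_size * error    # LMS update with input x = 1
    return weight


# Hypothetical noisy measurements in milliseconds, centred on 120 ms.
measured_ms = [118.0, 122.0, 120.0, 121.0, 119.0] * 40
delay_ms = lms_delay_estimate(measured_ms)
```

With this configuration the estimate settles close to the centre of the measurements, and a smaller `step_size` trades slower convergence for less fluctuation around the final value.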
In this step, optionally, the echo signal in the acquired original signal may be eliminated by using an AEC algorithm. Specifically, step 503 may include: and the computing equipment adopts an Acoustic Echo Cancellation (AEC) algorithm to cancel the echo signal in the original signal acquired by the second voice interaction device according to the time delay obtained by estimation.
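As a rough illustration of how an AEC algorithm might use the estimated time delay, the sketch below applies a normalized LMS (NLMS) adaptive filter to the delay-aligned reference. The disclosure does not say which AEC algorithm is used, so NLMS is a swapped-in assumption, as are the function name, the delay expressed in samples, the filter length, and the synthetic echo path.

```python
import random


def nlms_echo_cancel(original, reference, delay, taps=4, step=0.5, eps=1e-8):
    """Return the residual after adaptively subtracting the estimated echo.

    The reference is read with an offset of `delay` samples, so the short
    adaptive filter only has to model the echo path after that delay."""
    weights = [0.0] * taps
    residual = []
    for n in range(len(original)):
        # Delay-aligned reference window: x[n - delay - k] for k = 0..taps-1.
        window = [
            reference[n - delay - k] if 0 <= n - delay - k < len(reference) else 0.0
            for k in range(taps)
        ]
        echo_estimate = sum(w * x for w, x in zip(weights, window))
        error = original[n] - echo_estimate
        energy = sum(x * x for x in window) + eps
        weights = [w + step * error * x / energy for w, x in zip(weights, window)]
        residual.append(error)
    return residual


# Synthetic test signal (hypothetical): echo = reference delayed by 5 samples
# and passed through a two-tap path; no near-end speech is present.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
original = [
    0.6 * (reference[n - 5] if n >= 5 else 0.0)
    + 0.3 * (reference[n - 6] if n >= 6 else 0.0)
    for n in range(2000)
]
residual = nlms_echo_cancel(original, reference, delay=5)
tail_power = sum(e * e for e in residual[-200:]) / 200
```

Because the filter has only a few taps, it can model the echo only if `delay` is close to the true delay; this is why an inaccurate delay degrades the cancellation effect.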
Once an AEC algorithm has been selected, the range of time delays to which it is applicable is fixed. To avoid the poor echo cancellation effect that results when the determined time delay falls outside that range, step 503 may optionally include: the computing device judges whether the time delay is within a preset time delay range; if the time delay is within the time delay range, cancelling the echo signal in the original signal collected by the second voice interaction apparatus according to the time delay; and if the time delay is not within the time delay range, cancelling the echo signal in the original signal collected by the second voice interaction apparatus according to a time delay that is within the time delay range.
In the echo cancellation method provided in this embodiment, it is judged whether the connection object of the computing device changes; if the connection object changes, the computing device estimates the time delay between the reference signal played by the second voice interaction apparatus and the collected echo signal corresponding to the reference signal, and cancels the echo signal in the collected original signal according to the estimated time delay. In this way, a change of the connection object of the computing device is used to indicate that the voice interaction apparatus used by the computing device is changed from the first voice interaction apparatus to the second voice interaction apparatus.
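The range check can be sketched as follows. The disclosure only requires that a time delay within the range be used when the estimate falls outside it; clamping to the nearest bound is an assumption made here, and the function name and the example bounds are hypothetical.

```python
def select_delay(estimated, low, high):
    """Return the estimated delay if it lies within [low, high];
    otherwise fall back to the nearest bound of the range (an assumption)."""
    if low <= estimated <= high:
        return estimated
    return low if estimated < low else high


# Hypothetical range supported by the chosen AEC algorithm, in milliseconds.
usable_delay_ms = select_delay(800, low=0, high=500)
```

In this example the 800 ms estimate is outside the 0 to 500 ms range, so the canceller would run with 500 ms instead.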
Fig. 6 is a schematic flowchart of a third echo cancellation method according to an embodiment of the present disclosure. On the basis of the foregoing embodiments, the present embodiment mainly describes an alternative implementation manner after performing echo cancellation. As shown in fig. 6, the method of this embodiment may include:
It should be noted that step 601 is similar to step 401, and is not described herein again.
It should be noted that step 602 is similar to step 402, and is not described herein again.
In this step, the speech recognition result may be, for example, "power on", "weather", or the like. The present disclosure is not limited to a specific embodiment of performing speech recognition on a speech signal obtained after cancellation.
Since the echo cancellation effect can be improved in step 601 and step 602, the accuracy of the speech signal on which the speech recognition is performed in step 603 is higher, so that the accuracy of the speech recognition result can be improved.
And step 604, performing subsequent processing according to the voice recognition result.
In this step, after the voice recognition result is obtained, certain processing may be performed based on the voice recognition result. Here, the present disclosure may not be limited as to the type of processing, and the subsequent processing may include, for example, a wake-up processing and/or an output processing.
For the wake-up processing, for example, it may be determined whether the voice recognition result is the same as a preset wake-up instruction, and if the voice recognition result is the same as the preset wake-up instruction, the application program of the computing device corresponding to the preset wake-up instruction is woken up. For the output processing, for example, the voice recognition result may be output in a text box of an input interface.
In the echo cancellation method provided by this embodiment, when the voice interaction apparatus used by the computing device is changed from the first voice interaction apparatus to the second voice interaction apparatus, the computing device estimates the time delay between the reference signal played by the second voice interaction apparatus and the collected echo signal corresponding to the reference signal, and cancels the echo signal in the collected original signal according to the estimated time delay. Voice recognition is then performed on the voice signal obtained after cancellation to obtain a voice recognition result, and subsequent processing is performed according to the voice recognition result. Because the echo cancellation effect is improved, the accuracy of the voice recognition result on which the subsequent processing is based can be improved.
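The two kinds of subsequent processing can be sketched as follows. The function name, the return labels, and the choice to treat the two as mutually exclusive are assumptions made for illustration; the disclosure allows wake-up processing and/or output processing.

```python
def process_recognition_result(result, wake_instruction, text_box):
    """Wake the application if the result equals the preset wake-up
    instruction; otherwise append the result to the text box."""
    if result == wake_instruction:
        return "wake"
    text_box.append(result)
    return "output"


# Hypothetical wake-up instruction and recognition results.
text_box = []
first_action = process_recognition_result("power on", "power on", text_box)
second_action = process_recognition_result("weather", "power on", text_box)
```

Here "power on" matches the preset wake-up instruction, while "weather" does not and is written to the text box instead.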
Fig. 7 is a schematic structural diagram of a first embodiment of an echo cancellation device according to the present disclosure, where the device provided in this embodiment may be applied to the foregoing method embodiment to implement the function of a computing device thereof. As shown in fig. 7, the apparatus of the present embodiment may include: an estimation module 701 and a cancellation module 702.
The estimating module 701 is configured to estimate a time delay between a reference signal played by a second voice interaction device and an acquired echo signal corresponding to the reference signal when a voice interaction device used by the computing device is changed from a first voice interaction device to the second voice interaction device;
a cancellation module 702, configured to cancel, according to the estimated time delay, the echo signal in the original signal collected by the second voice interaction apparatus.
In a possible implementation, if a connection object of the computing device changes, a voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus.
In one possible implementation, if the computing device is changed from being connected with a target device to not being connected with the target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the target device includes the first voice interaction apparatus, and the computing device includes the second voice interaction apparatus;
or, if the computing device is changed from being not connected with the target device to being connected with the target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the computing device includes the first voice interaction apparatus, and the target device includes the second voice interaction apparatus.
In one possible implementation, the target device is a vehicle.
In a possible implementation, if the computing device is changed from being connected with a first target device to being connected with a second target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the first target device includes the first voice interaction apparatus, and the second target device includes the second voice interaction apparatus.
In one possible implementation, the estimation module 701 is specifically configured to:
determining, according to a plurality of first time points and a plurality of second time points in one-to-one correspondence with the plurality of first time points, the time difference between each first time point and its corresponding second time point, to obtain a plurality of time differences, wherein a first time point is a time point at which the second voice interaction apparatus plays the reference signal, and the corresponding second time point is the time point at which the second voice interaction apparatus collects the echo signal corresponding to the reference signal played at that first time point;
and determining the time delay of the reference signal and the echo signal according to the plurality of time differences.
In a possible implementation, the estimating module 701 is configured to determine, according to the multiple time differences, a time delay between the reference signal and the echo signal, and specifically includes:
and determining the time delay of the reference signal and the echo signal according to the time differences and a preset estimation algorithm.
In one possible implementation, the preset estimation algorithm is a least mean square (LMS) algorithm.
In one possible implementation, the elimination module 702 is specifically configured to:
judging whether the time delay is within a preset time delay range or not;
if the time delay is within the time delay range, eliminating the echo signal in the original signal collected by the second voice interaction device according to the time delay;
and if the time delay is not within the time delay range, eliminating the echo signal in the original signal collected by the second voice interaction apparatus according to a time delay that is within the time delay range.
In a possible implementation, the elimination module 702 being configured to eliminate the echo signal in the original signal collected by the second voice interaction apparatus according to the time delay specifically includes:
eliminating, according to the estimated time delay, the echo signal in the original signal collected by the second voice interaction apparatus by using an acoustic echo cancellation (AEC) algorithm.
In one possible implementation, the apparatus further comprises: a response module 703;
the response module 703 is configured to: carrying out voice recognition on the voice signal obtained after the elimination to obtain a voice recognition result; and performing subsequent processing according to the voice recognition result.
In one possible implementation, the subsequent processing includes a wake-up processing and/or an output processing.
The apparatus of this embodiment may be configured to implement the technical solutions of the embodiments shown in the foregoing methods, and the implementation principles and technical effects are similar, which are not described herein again.
Fig. 8 is a schematic structural diagram of a second echo cancellation device according to an embodiment of the present disclosure, and as shown in fig. 8, the device may include: a processor 801 and a memory 802 for storing computer instructions.
The processor 801 executes the computer instructions to perform the following method:
when a voice interaction device used by computing equipment is changed from a first voice interaction device to a second voice interaction device, the computing equipment estimates the time delay between a reference signal played by the second voice interaction device and an echo signal corresponding to the acquired reference signal;
and the computing equipment eliminates the echo signal in the original signal acquired by the second voice interaction device according to the estimated time delay.
In a possible implementation, if a connection object of the computing device changes, a voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus.
In one possible implementation, if the computing device is changed from being connected with a target device to not being connected with the target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the target device includes the first voice interaction apparatus, and the computing device includes the second voice interaction apparatus;
or, if the computing device is changed from being not connected with the target device to being connected with the target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the computing device includes the first voice interaction apparatus, and the target device includes the second voice interaction apparatus.
In one possible implementation, the target device is a vehicle.
In a possible implementation, if the computing device is changed from being connected with a first target device to being connected with a second target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the first target device includes the first voice interaction apparatus, and the second target device includes the second voice interaction apparatus.
In one possible implementation, the method for estimating, by the computing device, a time delay between a played reference signal and a collected echo signal corresponding to the reference signal includes:
the computing equipment determines a time difference between each first time point in the first time points and a second time point corresponding to each first time point according to a plurality of first time points and a plurality of second time points corresponding to the first time points one by one to obtain a plurality of time differences, wherein the first time point is a time point when the second voice interaction device plays the reference signal, and the second time point is a time point when the second voice interaction device acquires the echo signal corresponding to the reference signal played by the corresponding first time point;
the computing device determines a time delay of the reference signal and the echo signal according to the plurality of time differences.
In one possible implementation, the computing device determining, from the plurality of time differences, a time delay of the reference signal and the echo signal includes:
and the computing equipment determines the time delay of the reference signal and the echo signal according to the time differences and a preset estimation algorithm.
In one possible implementation, the preset estimation algorithm is a least mean square (LMS) algorithm.
In a possible implementation, the eliminating, by the computing device, the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay includes:
the computing equipment judges whether the time delay is within a preset time delay range or not;
if the time delay is within the time delay range, eliminating the echo signal in the original signal collected by the second voice interaction device according to the time delay;
and if the time delay is not within the time delay range, eliminating the echo signal in the original signal collected by the second voice interaction apparatus according to a time delay that is within the time delay range.
In a possible implementation, the eliminating, by the computing device, the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay includes:
and the computing equipment adopts an Acoustic Echo Cancellation (AEC) algorithm to cancel the echo signal in the original signal acquired by the second voice interaction device according to the time delay obtained by estimation.
In a possible implementation, after the computing device cancels the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay, the method further includes:
carrying out voice recognition on the voice signal obtained after the elimination to obtain a voice recognition result;
and performing subsequent processing according to the voice recognition result.
In one possible implementation, the subsequent processing includes a wake-up processing and/or an output processing.
The embodiments of the present disclosure also provide a computer-readable storage medium; when instructions in the storage medium are executed by a processor of an echo cancellation device, the echo cancellation device is enabled to perform an echo cancellation method, the method comprising:
when a voice interaction device used by a computing device is changed from a first voice interaction device to a second voice interaction device, the computing device estimates the time delay between a reference signal played by the second voice interaction device and an echo signal corresponding to the acquired reference signal, wherein the voice interaction device is used for voice interaction between a user and the computing device;
and the computing equipment eliminates the echo signal in the original signal acquired by the second voice interaction device according to the estimated time delay.
In a possible implementation, if a connection object of the computing device changes, a voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus.
In one possible implementation, if the computing device is changed from being connected with a target device to not being connected with the target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the target device includes the first voice interaction apparatus, and the computing device includes the second voice interaction apparatus;
or, if the computing device is changed from being not connected with the target device to being connected with the target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the computing device includes the first voice interaction apparatus, and the target device includes the second voice interaction apparatus.
In one possible implementation, the target device is a vehicle.
In a possible implementation, if the computing device is changed from being connected with a first target device to being connected with a second target device, the voice interaction apparatus used by the computing device is changed from a first voice interaction apparatus to a second voice interaction apparatus, the first target device includes the first voice interaction apparatus, and the second target device includes the second voice interaction apparatus.
In one possible implementation, the estimating, by the computing device, a time delay between a reference signal played by the second voice interaction apparatus and a collected echo signal corresponding to the reference signal includes:
the computing device determines, from a plurality of first time points and a plurality of second time points in one-to-one correspondence with the first time points, the time difference between each first time point and its corresponding second time point, to obtain a plurality of time differences, wherein a first time point is a time point at which the second voice interaction device plays the reference signal, and the corresponding second time point is the time point at which the second voice interaction device acquires the echo signal of the reference signal played at that first time point;
the computing device determines the time delay between the reference signal and the echo signal according to the plurality of time differences.
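As an illustration of the two steps above, the following sketch pairs each play time with its capture time, takes the differences, and combines them into one delay. Combining by the arithmetic mean is an assumption made here for simplicity; the function names and the millisecond timestamps are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def time_differences(play_times: Sequence[float],
                     capture_times: Sequence[float]) -> List[float]:
    """Difference between each capture time and its corresponding play time."""
    if len(play_times) != len(capture_times):
        raise ValueError("time points must correspond one to one")
    return [captured - played
            for played, captured in zip(play_times, capture_times)]


def estimate_delay(play_times: Sequence[float],
                   capture_times: Sequence[float]) -> float:
    """Assumed combiner: the mean of the per-pair time differences."""
    return mean(time_differences(play_times, capture_times))


# Example with timestamps in milliseconds (illustrative values only).
played = [0, 1000, 2000]
captured = [120, 1118, 2122]
diffs = time_differences(played, captured)
delay_ms = estimate_delay(played, captured)
```

A median or a trimmed mean could be substituted if individual measurements are expected to contain outliers.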
In one possible implementation, the computing device determining, from the plurality of time differences, a time delay of the reference signal and the echo signal includes:
the computing device determines the time delay between the reference signal and the echo signal according to the plurality of time differences and a preset estimation algorithm.
In one possible implementation, the preset estimation algorithm is the least mean square (LMS) algorithm.
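The text does not spell out how LMS is applied to the time differences. One possible reading, sketched below as an assumption, treats the delay as a single adaptive weight driven by a constant input of 1, so that each measured time difference nudges the estimate by a fraction of the error. The function name `lms_delay_estimate` and the step size are hypothetical.

```python
from typing import Iterable


def lms_delay_estimate(differences: Iterable[float],
                       step_size: float = 0.1,
                       initial: float = 0.0) -> float:
    """Scalar LMS: minimise the squared error between each measured
    time difference and the running delay estimate."""
    estimate = initial
    for observed in differences:
        error = observed - estimate      # instantaneous error
        estimate += step_size * error    # LMS update with constant input 1
    return estimate


# Example: measurements that all equal 100 ms drive the estimate toward 100.
converged = lms_delay_estimate([100.0] * 200)
```

A smaller step size smooths measurement noise at the cost of slower adaptation after the voice interaction apparatus changes.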
In a possible implementation, the eliminating, by the computing device, the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay includes:
the computing device determines whether the time delay is within a preset time delay range;
if the time delay is within the time delay range, the computing device cancels the echo signal in the original signal acquired by the second voice interaction device according to the time delay;
and if the time delay is not within the time delay range, the computing device cancels the echo signal in the original signal acquired by the second voice interaction device according to a time delay that is within the time delay range.
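The range check can be sketched as follows. The text only requires that an out-of-range estimate be replaced by some delay inside the preset range; choosing the nearest bound (clamping) is an assumption of this sketch, and `select_delay` and the example bounds are hypothetical.

```python
def select_delay(estimated: float, min_delay: float, max_delay: float) -> float:
    """Return the estimated delay if it lies in [min_delay, max_delay];
    otherwise fall back to the nearest bound (an assumed policy)."""
    if min_delay > max_delay:
        raise ValueError("min_delay must not exceed max_delay")
    if min_delay <= estimated <= max_delay:
        return estimated
    return min(max(estimated, min_delay), max_delay)


# Example with an illustrative range of 50 ms to 300 ms.
in_range = select_delay(120.0, 50.0, 300.0)
too_large = select_delay(900.0, 50.0, 300.0)
too_small = select_delay(-5.0, 50.0, 300.0)
```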
In a possible implementation, the eliminating, by the computing device, the echo signal in the original signal collected by the second voice interaction apparatus according to the estimated time delay includes:
the computing device uses an acoustic echo cancellation (AEC) algorithm to cancel the echo signal in the original signal acquired by the second voice interaction device according to the estimated time delay.
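The disclosure does not name a specific AEC algorithm. As a minimal sketch of how an estimated delay can feed one, the code below shifts the reference signal by the delay (expressed in samples) and then runs a short normalized LMS (NLMS) adaptive filter, a common textbook choice that is swapped in here purely for illustration. The function name `cancel_echo` and all parameter values are hypothetical.

```python
from typing import List, Sequence


def cancel_echo(mic: Sequence[float],
                reference: Sequence[float],
                delay_samples: int,
                taps: int = 8,
                step_size: float = 0.5,
                eps: float = 1e-8) -> List[float]:
    """Subtract an adaptively estimated echo of `reference` from `mic`.

    The reference is first aligned using the estimated delay, so the
    adaptive filter only has to model the short remaining echo path.
    """
    weights = [0.0] * taps
    residual = []
    for n in range(len(mic)):
        # Delay-aligned reference window: reference[n - delay - k], k = 0..taps-1
        window = []
        for k in range(taps):
            idx = n - delay_samples - k
            window.append(reference[idx] if 0 <= idx < len(reference) else 0.0)
        echo_estimate = sum(w * x for w, x in zip(weights, window))
        error = mic[n] - echo_estimate           # echo-reduced output sample
        norm = sum(x * x for x in window) + eps  # NLMS normalisation term
        weights = [w + step_size * error * x / norm
                   for w, x in zip(weights, window)]
        residual.append(error)
    return residual
```

With a good delay estimate the filter needs only a few taps; without it, the filter would have to span the whole playback-to-capture latency, which slows convergence.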
In a possible implementation, after the computing device cancels the echo signal in the original signal acquired by the second voice interaction apparatus according to the estimated time delay, the method further includes:
performing voice recognition on the voice signal obtained after the cancellation to obtain a voice recognition result;
and performing subsequent processing according to the voice recognition result.
In one possible implementation, the subsequent processing includes wake-up processing and/or output processing.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above. The storage medium includes various media that can store program code, such as ROM, RAM, and magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present disclosure, and not for limiting the same; while the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.
Claims (15)
1. An echo cancellation method, comprising:
a computing device estimating a time delay between a reference signal played by a second voice interaction device and an acquired echo signal corresponding to the reference signal, wherein the second voice interaction device is the voice interaction device currently used by the computing device;
and the computing device canceling the echo signal in the original signal acquired by the second voice interaction device according to the estimated time delay.
2. The method of claim 1, wherein the computing device estimating a time delay between a reference signal played by a second voice interaction device and the acquired echo signal corresponding to the reference signal comprises:
the computing device determining, from a plurality of first time points and a plurality of second time points in one-to-one correspondence with the first time points, the time difference between each first time point and its corresponding second time point, to obtain a plurality of time differences, wherein a first time point is a time point at which the second voice interaction device plays the reference signal, and the corresponding second time point is the time point at which the second voice interaction device acquires the echo signal of the reference signal played at that first time point;
and the computing device determining the time delay between the reference signal and the echo signal according to the plurality of time differences.
3. The method of claim 2, wherein the computing device determining the time delay of the reference signal and the echo signal from the plurality of time differences comprises:
the computing device determining the time delay between the reference signal and the echo signal according to the plurality of time differences and a preset estimation algorithm.
4. The method according to any one of claims 1 to 3, wherein the computing device canceling the echo signal in the original signal acquired by the second voice interaction device according to the estimated time delay comprises:
the computing device determining whether the time delay is within a preset time delay range;
if the time delay is within the time delay range, canceling the echo signal in the original signal acquired by the second voice interaction device according to the time delay;
and if the time delay is not within the time delay range, canceling the echo signal in the original signal acquired by the second voice interaction device according to a time delay that is within the time delay range.
5. The method of any one of claims 1 to 4, wherein before the computing device estimates the time delay between the reference signal played by the second voice interaction apparatus and the acquired echo signal corresponding to the reference signal, the method further comprises:
the computing device determines the second voice interaction device as the currently used voice interaction device.
6. The method of claim 5, wherein the computing device determining that the currently used voice interaction apparatus is the second voice interaction apparatus comprises at least one of:
if the computing device plays and collects voice through the second voice interaction device, the computing device determines the second voice interaction device as the currently used voice interaction device; or,
if the computing device changes from being connected with a target device to not being connected with the target device, the computing device determines the second voice interaction device as the currently used voice interaction device, wherein the target device comprises a first voice interaction device and the computing device comprises the second voice interaction device; or,
if the computing device changes from not being connected with the target device to being connected with the target device, the computing device determines the second voice interaction device as the currently used voice interaction device, wherein the computing device comprises a first voice interaction device and the target device comprises the second voice interaction device; or,
if the computing device changes from being connected with a first target device to being connected with a second target device, the computing device determines the second voice interaction device as the currently used voice interaction device, wherein the first target device comprises the first voice interaction device and the second target device comprises the second voice interaction device.
7. An echo cancellation device, comprising:
the estimation module is used for estimating the time delay between a reference signal played by a second voice interaction device and an acquired echo signal corresponding to the reference signal, wherein the second voice interaction device is a voice interaction device currently used by the computing equipment;
and the elimination module is used for eliminating the echo signal in the original signal acquired by the second voice interaction device according to the estimated time delay.
8. The apparatus of claim 7, wherein the estimation module is specifically configured to:
determining, from a plurality of first time points and a plurality of second time points in one-to-one correspondence with the first time points, the time difference between each first time point and its corresponding second time point, to obtain a plurality of time differences, wherein a first time point is a time point at which the second voice interaction device plays the reference signal, and the corresponding second time point is the time point at which the second voice interaction device acquires the echo signal of the reference signal played at that first time point;
and determining the time delay of the reference signal and the echo signal according to the plurality of time differences.
9. The apparatus of claim 8, wherein the estimation module is specifically configured to:
and determining the time delay of the reference signal and the echo signal according to the time differences and a preset estimation algorithm.
10. The apparatus of any one of claims 7 to 9, the cancellation module being specifically configured to:
judging whether the time delay is within a preset time delay range or not;
if the time delay is within the time delay range, eliminating the echo signal in the original signal collected by the second voice interaction device according to the time delay;
and if the time delay is not within the time delay range, canceling the echo signal in the original signal acquired by the second voice interaction device according to a time delay that is within the time delay range.
11. The apparatus of any of claims 7 to 10, the estimation module further to:
and determining the second voice interaction device as the currently used voice interaction device.
12. The apparatus of claim 11, wherein the estimation module is specifically configured to perform at least one of:
if the computing device plays and collects voice through the second voice interaction device, determining the second voice interaction device as the currently used voice interaction device; or,
if the computing device changes from being connected with a target device to not being connected with the target device, determining the second voice interaction device as the currently used voice interaction device, wherein the target device comprises a first voice interaction device and the computing device comprises the second voice interaction device; or,
if the computing device changes from not being connected with the target device to being connected with the target device, determining the second voice interaction device as the currently used voice interaction device, wherein the computing device comprises a first voice interaction device and the target device comprises the second voice interaction device; or,
if the computing device changes from being connected with a first target device to being connected with a second target device, determining the second voice interaction device as the currently used voice interaction device, wherein the first target device comprises the first voice interaction device and the second target device comprises the second voice interaction device.
13. An echo cancellation device, comprising:
a processor and a memory for storing computer instructions; the processor executes the computer instructions to perform the method of any of claims 1-6.
14. A computer-readable storage medium having instructions that, when executed by a processor of an echo cancellation device, enable the echo cancellation device to perform the method of any of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111171723.4A CN113903351A (en) | 2019-03-18 | 2019-03-18 | Echo cancellation method, device, equipment and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910205707.9A CN110265048B (en) | 2019-03-18 | 2019-03-18 | Echo cancellation method, device, equipment and storage medium |
CN202111171723.4A CN113903351A (en) | 2019-03-18 | 2019-03-18 | Echo cancellation method, device, equipment and storage medium |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910205707.9A Division CN110265048B (en) | 2019-03-18 | 2019-03-18 | Echo cancellation method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113903351A (en) | 2022-01-07 |
Family
ID=67913077
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111171723.4A Pending CN113903351A (en) | 2019-03-18 | 2019-03-18 | Echo cancellation method, device, equipment and storage medium |
CN201910205707.9A Active CN110265048B (en) | 2019-03-18 | 2019-03-18 | Echo cancellation method, device, equipment and storage medium |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910205707.9A Active CN110265048B (en) | 2019-03-18 | 2019-03-18 | Echo cancellation method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN113903351A (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111613238B (en) * | 2020-05-21 | 2023-09-19 | 阿波罗智联(北京)科技有限公司 | Method, device, equipment and storage medium for determining delay between signals |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130332156A1 (en) * | 2012-06-11 | 2013-12-12 | Apple Inc. | Sensor Fusion to Improve Speech/Audio Processing in a Mobile Device |
CN106470284B (en) * | 2015-08-20 | 2020-02-11 | 钉钉控股(开曼)有限公司 | Method, device, system, server and communication device for eliminating acoustic echo |
US9997151B1 (en) * | 2016-01-20 | 2018-06-12 | Amazon Technologies, Inc. | Multichannel acoustic echo cancellation for wireless applications |
US9779755B1 (en) * | 2016-08-25 | 2017-10-03 | Google Inc. | Techniques for decreasing echo and transmission periods for audio communication sessions |
CN106210371B (en) * | 2016-08-31 | 2018-09-18 | 广州视源电子科技股份有限公司 | A kind of the determination method, apparatus and intelligent meeting equipment of echo delay time |
JP6670224B2 (en) * | 2016-11-14 | 2020-03-18 | 株式会社日立製作所 | Audio signal processing system |
US9916840B1 (en) * | 2016-12-06 | 2018-03-13 | Amazon Technologies, Inc. | Delay estimation for acoustic echo cancellation |
CN109285554B (en) * | 2017-07-20 | 2023-07-07 | 阿里巴巴集团控股有限公司 | Echo cancellation method, server, terminal and system |
CN107785026B (en) * | 2017-10-18 | 2020-10-20 | 会听声学科技(北京)有限公司 | Time delay estimation method for indoor echo cancellation of set top box |
CN107966910B (en) * | 2017-11-30 | 2021-08-03 | 深圳Tcl新技术有限公司 | Voice processing method, intelligent sound box and readable storage medium |
CN109040501A (en) * | 2018-09-10 | 2018-12-18 | 成都擎天树科技有限公司 | A kind of echo cancel method improving VOIP phone quality |
CN109087660A (en) * | 2018-09-29 | 2018-12-25 | 百度在线网络技术(北京)有限公司 | Method, apparatus, equipment and computer readable storage medium for echo cancellor |
CN109273020B (en) * | 2018-09-29 | 2022-04-19 | 阿波罗智联(北京)科技有限公司 | Audio signal processing method, apparatus, device and storage medium |
-
2019
- 2019-03-18 CN CN202111171723.4A patent/CN113903351A/en active Pending
- 2019-03-18 CN CN201910205707.9A patent/CN110265048B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN110265048A (en) | 2019-09-20 |
CN110265048B (en) | 2021-11-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7018130B2 (en) | Echo cancellation method and equipment based on delay time estimation | |
EP2573768A2 (en) | Reverberation suppression device, reverberation suppression method, and computer-readable storage medium storing a reverberation suppression program | |
WO2020097828A1 (en) | Echo cancellation method, delay estimation method, echo cancellation apparatus, delay estimation apparatus, storage medium, and device | |
CN113507662B (en) | Noise reduction processing method, apparatus, device, storage medium, and program | |
CN109756818B (en) | Dual-microphone noise reduction method and device, storage medium and electronic equipment | |
WO2020252629A1 (en) | Residual acoustic echo detection method, residual acoustic echo detection device, voice processing chip, and electronic device | |
CN111048061A (en) | Method, device and equipment for obtaining step length of echo cancellation filter | |
CN105432062A (en) | Echo removal | |
CN111028855B (en) | Echo suppression method, device, equipment and storage medium | |
CN111081246B (en) | Method and device for awakening live broadcast robot, electronic equipment and storage medium | |
CN110265048B (en) | Echo cancellation method, device, equipment and storage medium | |
CN113329315B (en) | Detection method, device and equipment of audio playing equipment and storage medium | |
CN112289336B (en) | Audio signal processing method and device | |
CN110021289B (en) | Sound signal processing method, device and storage medium | |
CN110913312B (en) | Echo cancellation method and device | |
JP5438629B2 (en) | Stereo echo canceling method, stereo echo canceling device, stereo echo canceling program | |
JP6537997B2 (en) | Echo suppressor, method thereof, program, and recording medium | |
JP6644213B1 (en) | Acoustic signal processing device, acoustic system, acoustic signal processing method, and acoustic signal processing program | |
US20220270630A1 (en) | Noise suppression apparatus, method and program for the same | |
CN110232905B (en) | Uplink noise reduction method and device and electronic equipment | |
JP6343585B2 (en) | Unknown transmission system estimation device, unknown transmission system estimation method, and program | |
CN112201266B (en) | Echo suppression method and device | |
KR102218742B1 (en) | Adaptive delay diversity filter, echo cancel device using the same, and echo cancel method thereof | |
JP2018061228A (en) | Noise suppression device and noise suppression method | |
JP4094523B2 (en) | Echo canceling apparatus, method, echo canceling program, and recording medium recording the program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||