CN113824843A - Voice call quality detection method, device, equipment and storage medium - Google Patents

Voice call quality detection method, device, equipment and storage medium

Info

Publication number
CN113824843A
Authority
CN
China
Prior art keywords
signal
voice
terminal
residual signal
voice signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010566405.7A
Other languages
Chinese (zh)
Other versions
CN113824843B (en
Inventor
王夏鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Volkswagen Mobvoi Beijing Information Technology Co Ltd
Original Assignee
Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority to CN202010566405.7A
Publication of CN113824843A
Application granted
Publication of CN113824843B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/22: Arrangements for supervision, monitoring or testing
    • H04M3/2236: Quality of speech transmission monitoring

Abstract

An embodiment of the invention discloses a voice call quality detection method, a voice call quality detection device, voice call quality detection equipment and a storage medium. The method comprises the following steps: during a voice call between a first terminal and a second terminal, acquiring a first voice signal collected by the first terminal and sending the first voice signal to the second terminal so that the second terminal plays the first voice signal; acquiring a second voice signal collected by the second terminal; determining an estimated voice signal according to the first voice signal and the voice call parameters; determining the voice call state of the second terminal according to the first voice signal and the second voice signal; and, if the voice call state is an unmanned call, determining a first residual signal according to the second voice signal and the estimated voice signal, and generating a voice reception detection result of the second terminal. The embodiment prevents abnormal sound reception during a voice call from blocking feedback about the reception problem between the calling parties, improves on the existing voice call quality detection approach, and improves call efficiency.

Description

Voice call quality detection method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a voice call quality detection method, a voice call quality detection device, voice call quality detection equipment and a storage medium.
Background
In current voice call systems, call quality can be judged by measuring the network speed, the packet loss rate and similar indicators, so that the party whose signal is normal is proactively notified when the other party cannot communicate normally because of a network fault. In actual use, however, a large number of scenarios remain in which the speaking party must first confirm that sound reception is normal (that is, that the sound receiving party can hear the speaking party) before starting the conversation.
In the related art, the speaking party usually learns whether the sound receiving party is receiving sound normally only from the spoken feedback of the sound receiving party. If sound reception at the sound receiving party becomes abnormal during a call, the speaking party cannot detect this directly and must rely on that spoken feedback. At that point, however, the abnormal sound reception is already disrupting the conversation itself, so a long time passes between the onset of the problem, both parties confirming it, and the call being paused to resolve it, which reduces call efficiency.
Disclosure of Invention
The embodiment of the invention provides a voice call quality detection method, a voice call quality detection device, voice call quality detection equipment and a storage medium, which are used for optimizing the existing voice call quality detection mode, actively detecting abnormal sound reception and improving the call efficiency.
In a first aspect, an embodiment of the present invention provides a method for detecting voice call quality, including:
in the process of voice communication between a first terminal and a second terminal, acquiring a first voice signal acquired by the first terminal, and sending the first voice signal to the second terminal so as to enable the second terminal to play the first voice signal;
acquiring a second voice signal corresponding to the first voice signal and acquired by the second terminal;
determining an estimated voice signal corresponding to the first voice signal according to the first voice signal and the voice call parameter of the second terminal;
determining the voice call state of the second terminal according to the first voice signal and the second voice signal;
and if the voice call state of the second terminal is unmanned call, determining a first residual signal corresponding to the second voice signal according to the second voice signal and the estimated voice signal, and generating a voice reception detection result of the second terminal according to the first residual signal.
In a second aspect, an embodiment of the present invention further provides a device for detecting voice call quality, including:
the first signal acquisition module is used for acquiring a first voice signal acquired by a first terminal in the voice call process between the first terminal and a second terminal, and sending the first voice signal to the second terminal so as to enable the second terminal to play the first voice signal;
the second signal acquisition module is used for acquiring a second voice signal which is acquired by the second terminal and corresponds to the first voice signal;
the estimated signal determining module is used for determining an estimated voice signal corresponding to the first voice signal according to the first voice signal and the voice call parameter of the second terminal;
the call state determining module is used for determining the voice call state of the second terminal according to the first voice signal and the second voice signal;
and the first result generation module is used for determining a first residual signal corresponding to the second voice signal according to the second voice signal and the estimated voice signal and generating a voice receiving detection result of the second terminal according to the first residual signal if the voice call state of the second terminal is unmanned call.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the voice call quality detection method according to the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the voice call quality detection method according to the embodiment of the present invention.
According to the technical solution of the embodiment of the invention, the echo signal at the sound receiving party during a voice call is estimated from a reference signal and compared algorithmically with the echo signal actually captured at the sound receiving party. Whether the sound receiving party is receiving sound normally is judged from this comparison, and the speaking party is prompted accordingly. This prevents abnormal sound reception during a voice call from blocking feedback about the reception problem between the calling parties, improves on the existing voice call quality detection approach, and improves call efficiency.
Drawings
Fig. 1 is a flowchart of a voice call quality detection method according to an embodiment of the present invention.
Fig. 2 is a flowchart of a voice call quality detection method according to a second embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a voice call quality detection apparatus according to a third embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention.
It should be further noted that, for the convenience of description, only some but not all of the relevant aspects of the present invention are shown in the drawings. Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Example one
Fig. 1 is a flowchart of a voice call quality detection method according to an embodiment of the present invention. The embodiment is applicable to situations in which abnormal sound reception during a voice call hinders feedback about the reception problem between the calling parties. The method can be executed by the voice call quality detection device provided by the embodiment of the invention; the device can be implemented in software and/or hardware and can generally be integrated in computer equipment, such as a cloud server. As shown in fig. 1, the method of the embodiment of the present invention specifically includes:
Step 101, in a voice communication process between a first terminal and a second terminal, acquiring a first voice signal acquired by the first terminal, and sending the first voice signal to the second terminal so that the second terminal plays the first voice signal.
In this embodiment, the first terminal is a terminal device of a speaking party in a voice call, and may be any electronic device having a voice call function, such as a mobile phone, a fixed-line telephone, and the like, and the number of the first terminals may be one or more, corresponding to a two-party call and a multi-party call; the second terminal is a terminal device at the receiving end in the voice call, and can be any electronic device with a voice call function, such as a mobile phone, a fixed-line telephone, and the like, and the number of the second terminals can be one or more, which corresponds to the situations of two-party call and multi-party call; the first voice signal is a voice signal which can be collected by the first terminal when the speaker starts speaking, and includes but is not limited to voice generated by speaking of the speaker and noise in the environment where the speaker is located; the device in the first terminal for acquiring the first voice signal may be a microphone, or may be any other voice signal input device that may be connected to the first terminal, without limitation; the device in the second terminal for playing the first voice signal may be a speaker, or may be any other voice signal output device that can be connected to the second terminal, without limitation.
Step 102, acquiring a second voice signal which is acquired by the second terminal and corresponds to the first voice signal.
The second voice signal is a sound signal which can be collected by the second terminal when the second terminal speaker plays the first voice signal, and includes but is not limited to noise in the environment where the sound receiver is located, the sound generated by the second terminal speaker playing the first voice signal and the sound generated by the second terminal speaker playing the first voice signal after being reflected by a complex and changeable wall surface; the device in the second terminal for acquiring the second voice signal may be a microphone, or may be any other voice signal input device that may be connected to the second terminal, without limitation.
Step 103, determining an estimated voice signal corresponding to the first voice signal according to the first voice signal and the voice call parameter of the second terminal.
The voice call parameters include parameters that can be set by the calling party in the second terminal and parameters inherent to the device, specifically, but not limited to, a system volume value of the second terminal, a microphone sensitivity of the second terminal, and a speaker frequency response curve and a speaker power of the second terminal.
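As an illustration only, the voice call parameters listed above could be grouped in a small record such as the following Python sketch. The class name, field names and units are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VoiceCallParams:
    """Voice call parameters of the second terminal (all names are hypothetical)."""
    system_volume: float           # V: system volume value, settable by the user
    mic_sensitivity: float         # S: microphone sensitivity, assumed to be a linear gain
    speaker_response: List[float]  # f(F): assumed to be one linear gain per frequency bin
    speaker_power_w: float         # speaker power in watts (assumed unit)

    def __post_init__(self) -> None:
        # Basic sanity checks so that obviously invalid parameters are rejected early.
        if self.system_volume < 0:
            raise ValueError("system_volume must be non-negative")
        if not self.speaker_response:
            raise ValueError("speaker_response needs at least one frequency bin")


params = VoiceCallParams(system_volume=0.8, mic_sensitivity=0.5,
                         speaker_response=[0.9, 1.0, 1.1], speaker_power_w=2.0)
```

Making the record immutable reflects the fact that the parameters are read once for a given estimation; a real system might instead refresh them whenever the user changes the volume.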
In this embodiment, optionally, the voice call parameter may be acquired in a cloud server before acquiring the first voice signal acquired by the first terminal.
The estimated voice signal corresponding to the first voice signal may be determined by substituting the voice call parameters into the formula of a suitable algorithm. Preferably, this embodiment provides the following voice signal estimation formula for calculating the estimated voice signal corresponding to the first voice signal:
$$B(F,t)=\alpha(V)\cdot V\cdot S\cdot f(F)\cdot A(F,t),$$
wherein $B(F,t)$ is the estimated voice signal corresponding to the first voice signal; $V$ is the system volume value of the second terminal; $\alpha(V)$ is an empirical amplification coefficient that is a function of $V$, namely the ratio of the loudness of the second voice signal collected by the second terminal to the loudness of the first voice signal collected by the first terminal at a given system volume value of the second terminal; $S$ is the microphone sensitivity of the second terminal; $f(F)$ is the speaker frequency response curve of the second terminal; and $A(F,t)$ is the first voice signal.
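The estimation formula can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the signal is represented as a grid of linear amplitudes indexed by frequency bin and time frame, and `example_alpha` is a made-up stand-in for the empirical coefficient α(V), for which the embodiment gives no values.

```python
from typing import Callable, List

Spectrogram = List[List[float]]  # indexed as [frequency_bin][time_frame]


def example_alpha(volume: float) -> float:
    """Hypothetical empirical amplification coefficient alpha(V)."""
    return 0.5 + 0.5 * volume


def estimate_voice_signal(first_signal: Spectrogram, volume: float,
                          mic_sensitivity: float, speaker_response: List[float],
                          alpha: Callable[[float], float] = example_alpha) -> Spectrogram:
    """Compute B(F, t) = alpha(V) * V * S * f(F) * A(F, t) for every bin and frame."""
    if len(speaker_response) != len(first_signal):
        raise ValueError("speaker_response must have one gain per frequency bin")
    # The factors that do not depend on frequency are combined once.
    scale = alpha(volume) * volume * mic_sensitivity
    return [[scale * gain * a for a in row]
            for gain, row in zip(speaker_response, first_signal)]


A = [[1.0, 2.0], [4.0, 0.0]]  # 2 frequency bins x 2 time frames
B = estimate_voice_signal(A, volume=1.0, mic_sensitivity=0.5,
                          speaker_response=[1.0, 2.0])
print(B)  # [[0.5, 1.0], [4.0, 0.0]]
```

With V = 1.0 the hypothetical α(V) equals 1.0, so each value of A is scaled only by S and by the per-bin speaker gain.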
Step 104, determining the voice call state of the second terminal according to the first voice signal and the second voice signal.
The voice call state of the second terminal may be determined with an existing double-talk detection algorithm, which judges whether human voice is present in the first voice signal and the second voice signal at the same time. If human voice is not present in both signals simultaneously, the voice call state of the second terminal is determined to be an unmanned call. Existing double-talk detection algorithms include the energy comparison method, the correlation comparison method, the double-filter method, and the like; this embodiment does not limit the choice.
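By way of illustration, a much simplified energy comparison could look like the following Python sketch. It is not the detection algorithm of the embodiment: the decision rule, the function names and the `ratio_threshold` value are assumptions chosen only to show the idea of comparing the energy captured at the second terminal with the energy of the played first voice signal.

```python
from typing import Sequence


def frame_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in samples) / len(samples)


def call_state(first_frame: Sequence[float], second_frame: Sequence[float],
               ratio_threshold: float = 0.5) -> str:
    """Classify one frame pair as 'manned' or 'unmanned'.

    The frame is treated as 'manned' when the second terminal's microphone
    carries more energy, relative to the played first signal, than the
    hypothetical ratio_threshold allows for echo alone.
    """
    far = frame_energy(first_frame)
    near = frame_energy(second_frame)
    if far == 0.0:
        # Nothing is being played back, so any microphone energy is local sound.
        return "manned" if near > 0.0 else "unmanned"
    return "manned" if near / far > ratio_threshold else "unmanned"


played = [1.0, -1.0, 1.0, -1.0]
echo_only = [0.3, -0.3, 0.3, -0.3]    # attenuated echo, nobody speaking
with_speech = [0.9, -1.1, 1.0, -0.8]  # echo plus local speech
print(call_state(played, echo_only))    # unmanned
print(call_state(played, with_speech))  # manned
```

A practical detector would smooth the decision over several frames and would have to separate local speech from ambient noise, which this sketch does not attempt.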
Step 105, if the voice call state of the second terminal is unmanned call, determining a first residual signal corresponding to the second voice signal according to the second voice signal and the estimated voice signal, and generating a voice reception detection result of the second terminal according to the first residual signal.
The first residual signal is a signal representing a difference between the second voice signal and the estimated voice signal when the voice call state of the second terminal is an unmanned call, and specifically may be a difference obtained by subtracting the estimated voice signal from the second voice signal.
Preferably, the present embodiment provides the following residual signal calculation formula for calculating the first residual signal corresponding to the second speech signal:
$$e_1(F,t)=C(F,t)-B(F,t),$$
wherein $e_1(F,t)$ is the first residual signal corresponding to the second voice signal, $C(F,t)$ is the second voice signal, and $B(F,t)$ is the estimated voice signal.
Preferably, the voice reception detection result of the second terminal includes normal voice reception and too small voice reception.
Optionally, the generating a sound receiving detection result of the second terminal according to the first residual signal includes: calculating an integral area of a frequency-time coordinate that makes the first residual signal amplitude positive; calculating the frequency domain time domain area corresponding to the first residual signal; calculating a variance of the first residual signal; and if the ratio of the integral area of the frequency-time coordinate with the positive amplitude of the first residual signal to the frequency-domain time-domain area corresponding to the first residual signal is larger than a first preset threshold value, and the variance of the first residual signal is smaller than a second preset threshold value, determining that the voice receiving detection result of the second terminal is normal voice receiving.
Optionally, the first preset threshold and the second preset threshold are preset empirical parameters and may be adjusted according to the actual performance of the voice call system. Illustratively, the first preset threshold is 90% and the second preset threshold is 6 dB.
Preferably, generating the voice reception detection result of the second terminal according to the first residual signal includes: calculating the integral area of the frequency-time coordinates at which the amplitude of the first residual signal is positive, which in the notation of this embodiment is $\iint_{e_1(F,t)>0} dF\,dt$; calculating the frequency-domain time-domain area corresponding to the first residual signal, $(F_{max}-F_{min})\,t_{max}$, wherein $F_{max}$ is the maximum frequency of the first residual signal, $F_{min}$ is the minimum frequency of the first residual signal, and $t_{max}$ is the maximum time of the first residual signal; and calculating the variance of the first residual signal, which in the notation of this embodiment is $s^2(e_1)$. If the ratio of the integral area of the frequency-time coordinates at which the amplitude of the first residual signal is positive to the frequency-domain time-domain area corresponding to the first residual signal is greater than the first preset threshold, and the variance of the first residual signal is smaller than the second preset threshold, the loudness of the second voice signal is large enough and stable enough to maintain a normal conversation, and the voice reception detection result of the second terminal is determined to be normal sound reception.
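The normal-reception test can be sketched in Python as follows, under several assumptions that are not stated in the embodiment: the residual is sampled on a uniform grid of frequency bins and time frames, so the area ratio reduces to the fraction of grid cells with a positive residual; the variance is computed as a sample variance; and the residual values are expressed in the same unit as the second threshold. The default thresholds reuse the illustrative values 90% and 6 given in the text.

```python
import statistics
from typing import List

Spectrogram = List[List[float]]  # indexed as [frequency_bin][time_frame]


def residual(second: Spectrogram, estimated: Spectrogram) -> Spectrogram:
    """e1(F, t) = C(F, t) - B(F, t), element by element."""
    return [[c - b for c, b in zip(c_row, b_row)]
            for c_row, b_row in zip(second, estimated)]


def positive_area_ratio(e: Spectrogram) -> float:
    """On a uniform grid, area(e > 0) / total area is the fraction of positive cells."""
    cells = [v for row in e for v in row]
    return sum(1 for v in cells if v > 0) / len(cells)


def is_normal_reception(e: Spectrogram, first_threshold: float = 0.9,
                        second_threshold: float = 6.0) -> bool:
    """Apply both conditions: mostly positive residual and small variance."""
    cells = [v for row in e for v in row]
    return (positive_area_ratio(e) > first_threshold
            and statistics.variance(cells) < second_threshold)


C = [[5.0, 6.0], [7.0, 6.0]]  # second voice signal (made-up values)
B = [[4.0, 4.0], [4.0, 4.0]]  # estimated voice signal (made-up values)
e1 = residual(C, B)
print(e1)                       # [[1.0, 2.0], [3.0, 2.0]]
print(is_normal_reception(e1))  # True
```

Here every residual value is positive and the values are close together, so both conditions hold.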
Optionally, the generating a sound receiving detection result of the second terminal according to the first residual signal further includes: calculating an integral area of a frequency-time coordinate that makes the first residual signal amplitude negative; determining an absolute value signal corresponding to the first residual signal and calculating a mean value of the absolute value signal corresponding to the first residual signal; and if the ratio of the integral area of the frequency-time coordinate making the amplitude of the first residual signal negative to the frequency-domain time-domain area corresponding to the first residual signal is larger than a third preset threshold value, and the mean value of the absolute value signal corresponding to the first residual signal is larger than a fourth preset threshold value, determining that the reception detection result of the second terminal is too small, and sending the reception detection result to the first terminal.
Optionally, the third preset threshold and the fourth preset threshold are preset empirical parameters and may be adjusted according to the actual performance of the voice call system. Illustratively, the third preset threshold is 90% and the fourth preset threshold is 20 dB.
Preferably, generating the voice reception detection result of the second terminal according to the first residual signal further includes: calculating the integral area of the frequency-time coordinates at which the amplitude of the first residual signal is negative, which in the notation of this embodiment is $\iint_{e_1(F,t)<0} dF\,dt$; and determining the absolute value signal corresponding to the first residual signal and calculating the mean value of that absolute value signal.
[Formula image BDA0002547782930000081: mean value of the absolute value signal corresponding to the first residual signal]
If the ratio of the integral area of the frequency-time coordinates at which the amplitude of the first residual signal is negative to the frequency-domain time-domain area corresponding to the first residual signal is greater than the third preset threshold, and the mean value of the absolute value signal corresponding to the first residual signal is greater than the fourth preset threshold, the loudness of the second voice signal is too small to maintain a normal conversation; the voice reception detection result of the second terminal is determined to be sound reception too small, and the detection result is sent to the first terminal.
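The too-small test can be sketched in the same way. As before, this is only a sketch under assumptions: a uniform frequency-time grid (so the area ratio is the fraction of cells with a negative residual), residual values in the same unit as the fourth threshold, and hypothetical function names. The default thresholds reuse the illustrative values 90% and 20 from the text.

```python
from typing import List

Spectrogram = List[List[float]]  # indexed as [frequency_bin][time_frame]


def negative_area_ratio(e: Spectrogram) -> float:
    """On a uniform grid, area(e < 0) / total area is the fraction of negative cells."""
    cells = [v for row in e for v in row]
    return sum(1 for v in cells if v < 0) / len(cells)


def mean_abs(e: Spectrogram) -> float:
    """Mean value of the absolute value signal |e(F, t)|."""
    cells = [abs(v) for row in e for v in row]
    return sum(cells) / len(cells)


def is_reception_too_small(e: Spectrogram, third_threshold: float = 0.9,
                           fourth_threshold: float = 20.0) -> bool:
    """Apply both conditions: mostly negative residual with a large mean magnitude."""
    return (negative_area_ratio(e) > third_threshold
            and mean_abs(e) > fourth_threshold)


quiet = [[-25.0, -30.0], [-22.0, -27.0]]    # captured far below the estimate
slightly_low = [[-5.0, -6.0], [-4.0, -5.0]] # below the estimate, but not by much
print(is_reception_too_small(quiet))         # True
print(is_reception_too_small(slightly_low))  # False
```

In a full system a positive result would trigger the notification sent to the first terminal; that transport step is outside the scope of this sketch.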
The embodiment of the invention provides a voice call quality detection method, which estimates echo signals of a receiving party in a voice call process according to reference signals, compares the estimated echo signals with the echo signals of an actual receiving party through an algorithm, judges whether the receiving party is normal when no person speaks and gives a corresponding prompt to the speaking party, avoids the feedback obstacle of the receiving problem between the speaking parties caused by abnormal receiving in the voice call process, optimizes the existing voice call quality detection mode and improves the call efficiency.
Example two
Fig. 2 is a flowchart of a voice call quality detection method according to a second embodiment of the present invention. In this embodiment, after the voice call state of the second terminal is determined according to the first voice signal and the second voice signal, the method may further include: if the voice call state of the second terminal is a manned call, determining a second residual signal corresponding to the second voice signal according to the estimated voice signal and an echo signal, corresponding to the first voice signal, that is estimated by a preset adaptive filter; and generating a voice reception detection result of the second terminal according to the second residual signal.
As shown in fig. 2, the method of the embodiment of the present invention specifically includes:
Step 201, in a voice communication process between a first terminal and a second terminal, acquiring a first voice signal acquired by the first terminal, and sending the first voice signal to the second terminal, so that the second terminal plays the first voice signal.
Step 202, acquiring a second voice signal corresponding to the first voice signal and acquired by the second terminal.
Step 203, determining an estimated voice signal corresponding to the first voice signal according to the first voice signal and the voice call parameter of the second terminal.
Step 204, determining the voice call state of the second terminal according to the first voice signal and the second voice signal.
Step 205, if the voice call state of the second terminal is a manned call, determining a second residual signal corresponding to the second voice signal according to the estimated voice signal and an echo signal, corresponding to the first voice signal, that is estimated by a preset adaptive filter, and generating a voice reception detection result of the second terminal according to the second residual signal.
In this embodiment, when the voice call state of the second terminal is a manned call, the preset adaptive filter may remove, from the sound collected by the second terminal, the sound signals other than the echo signal corresponding to the first voice signal, so as to obtain an estimated echo signal corresponding to the first voice signal for comparison with the estimated voice signal.
The second residual signal is a signal representing a difference between the estimated echo signal corresponding to the first voice signal and the estimated voice signal when the voice call state of the second terminal is a person call, and specifically may be a difference obtained by subtracting the estimated voice signal from the estimated echo signal corresponding to the first voice signal.
Preferably, the present embodiment provides the following residual signal calculation formula for calculating the second residual signal corresponding to the second speech signal:
$$e_2(F,t)=W(F,t)-B(F,t),$$
wherein $e_2(F,t)$ is the second residual signal corresponding to the second voice signal, $W(F,t)$ is the echo signal corresponding to the first voice signal as estimated by the preset adaptive filter, and $B(F,t)$ is the estimated voice signal.
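The embodiment does not specify the type of the preset adaptive filter. Purely as an illustration, the following Python sketch uses a normalized least-mean-squares (NLMS) filter, which is a common choice for echo estimation and is an assumption here. It operates on time-domain samples, whereas W(F,t) in the formula above is a frequency-time quantity; the tap count, step size and the two-tap echo path in the example are made-up values.

```python
import random
from typing import List, Sequence, Tuple


def nlms_echo_estimate(reference: Sequence[float], microphone: Sequence[float],
                       taps: int = 4, step: float = 0.5,
                       eps: float = 1e-8) -> Tuple[List[float], List[float]]:
    """Adapt an FIR filter so that the filtered reference tracks the echo in `microphone`.

    Returns (echo_estimate, weights).
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    echo_estimate: List[float] = []
    for x, d in zip(reference, microphone):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # current echo estimate
        err = d - y                                       # part the filter cannot explain
        norm = sum(h * h for h in history) + eps          # input power for normalization
        weights = [w + step * err * h / norm for w, h in zip(weights, history)]
        echo_estimate.append(y)
    return echo_estimate, weights


rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]  # stands in for the first voice signal
# Hypothetical echo path: direct sound with gain 0.6 plus one reflection with gain 0.3.
microphone = [0.6 * reference[n] + (0.3 * reference[n - 1] if n > 0 else 0.0)
              for n in range(len(reference))]
echo, weights = nlms_echo_estimate(reference, microphone)
print([round(w, 3) for w in weights])
```

To obtain a second residual in the sense of the formula above, the echo estimate would presumably first be converted to the same frequency-time representation as B(F,t) and the two would then be subtracted; that conversion is not shown here.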
Preferably, the voice reception detection result of the second terminal includes normal voice reception and too small voice reception.
Optionally, the generating a sound receiving detection result of the second terminal according to the second residual signal includes: calculating an integral area of a frequency-time coordinate that makes the second residual signal amplitude positive; calculating the frequency domain time domain area corresponding to the second residual signal; calculating a variance of the second residual signal; and if the ratio of the integral area of the frequency-time coordinate with the positive amplitude of the second residual signal to the frequency-domain time-domain area corresponding to the second residual signal is greater than a fifth preset threshold, and the variance of the second residual signal is smaller than a sixth preset threshold, determining that the voice reception detection result of the second terminal is normal.
Optionally, the fifth preset threshold and the sixth preset threshold are preset empirical parameters and may be adjusted according to the actual performance of the voice call system. Illustratively, the fifth preset threshold is 90% and the sixth preset threshold is 6 dB.
Preferably, generating the voice reception detection result of the second terminal according to the second residual signal includes: calculating the integral area of the frequency-time coordinates at which the amplitude of the second residual signal is positive, which in the notation of this embodiment is $\iint_{e_2(F,t)>0} dF\,dt$; calculating the frequency-domain time-domain area corresponding to the second residual signal, $(F_{max}-F_{min})\,t_{max}$, wherein $F_{max}$ is the maximum frequency of the second residual signal, $F_{min}$ is the minimum frequency of the second residual signal, and $t_{max}$ is the maximum time of the second residual signal; and calculating the variance of the second residual signal, which in the notation of this embodiment is $s^2(e_2)$. If the ratio of the integral area of the frequency-time coordinates at which the amplitude of the second residual signal is positive to the frequency-domain time-domain area corresponding to the second residual signal is greater than the fifth preset threshold, and the variance of the second residual signal is smaller than the sixth preset threshold, the loudness of the second voice signal is large enough and stable enough to maintain a normal conversation, and the voice reception detection result of the second terminal is determined to be normal sound reception.
Optionally, the generating a sound receiving detection result of the second terminal according to the second residual signal further includes: calculating an integral area of a frequency-time coordinate that makes the second residual signal amplitude negative; determining an absolute value signal corresponding to the second residual signal and calculating a mean value of the absolute value signal corresponding to the second residual signal; and if the ratio of the integral area of the frequency-time coordinate making the amplitude of the second residual signal negative to the frequency-domain time-domain area corresponding to the second residual signal is greater than a seventh preset threshold value, and the mean value of the absolute value signal corresponding to the second residual signal is greater than an eighth preset threshold value, determining that the reception detection result of the second terminal is too small, and sending the reception detection result to the first terminal.
Optionally, the seventh preset threshold and the eighth preset threshold are preset empirical parameters and may be adjusted according to the actual performance of the voice call system. Illustratively, the seventh preset threshold is 90% and the eighth preset threshold is 20 dB.
Preferably, generating the voice reception detection result of the second terminal according to the second residual signal further includes: calculating the integral area of the frequency-time coordinates at which the amplitude of the second residual signal is negative, which in the notation of this embodiment is $\iint_{e_2(F,t)<0} dF\,dt$; and determining the absolute value signal corresponding to the second residual signal and calculating the mean value of that absolute value signal.
[Formula image BDA0002547782930000121: mean value of the absolute value signal corresponding to the second residual signal]
If the ratio of the integral area of the frequency-time coordinates at which the amplitude of the second residual signal is negative to the frequency-domain time-domain area corresponding to the second residual signal is greater than the seventh preset threshold, and the mean value of the absolute value signal corresponding to the second residual signal is greater than the eighth preset threshold, the loudness of the second voice signal is too small to maintain a normal conversation; the voice reception detection result of the second terminal is determined to be sound reception too small, and the detection result is sent to the first terminal.
The embodiment of the invention provides a voice call quality detection method that estimates the echo signal at the receiving party during a voice call from a reference signal, compares the estimate with the echo signal actually captured at the receiving party, judges whether the receiving party receives the voice normally, and prompts the speaking party accordingly. This removes the obstacle to feeding back reception problems between the calling parties when reception is abnormal during a voice call, improves on existing voice call quality detection, and raises call efficiency.
Example three
Fig. 3 is a schematic structural diagram of a voice call quality detection apparatus according to a third embodiment of the present invention. As shown in Fig. 3, the apparatus includes:
the first signal obtaining module 311 is configured to obtain a first voice signal collected by a first terminal during a voice call between the first terminal and a second terminal, and send the first voice signal to the second terminal, so that the second terminal plays the first voice signal;
a second signal obtaining module 312, configured to obtain a second voice signal corresponding to the first voice signal, where the second voice signal is collected by the second terminal;
an estimated signal determining module 313, configured to determine, according to the first voice signal and the voice call parameter of the second terminal, an estimated voice signal corresponding to the first voice signal;
a call state determining module 314, configured to determine a voice call state of the second terminal according to the first voice signal and the second voice signal;
a first result generating module 315, configured to, if the voice call state of the second terminal is an unmanned call, determine a first residual signal corresponding to the second voice signal according to the second voice signal and the estimated voice signal, and generate a voice reception detection result of the second terminal according to the first residual signal.
Optionally, the voice call quality detection apparatus may further include a call parameter obtaining module, configured to obtain voice call parameters of the first terminal and the second terminal before obtaining the first voice signal collected by the first terminal.
Further, the estimated signal determining module 313 is specifically configured to calculate the estimated speech signal corresponding to the first speech signal according to the following speech signal estimation formula:
B(F,t)=α(V)*V*S*f(F)A(F,t),
wherein B(F, t) is the estimated speech signal corresponding to the first speech signal, α(V) is an empirical amplification factor, V is the system volume value of the second terminal, S is the microphone sensitivity of the second terminal, f(F) is the speaker frequency response curve of the second terminal, and A(F, t) is the first speech signal.
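As a rough illustration of the estimation formula, the sketch below evaluates B(F, t) on a discrete grid. It assumes the first speech signal is stored as rows indexed by frequency bin and columns indexed by time frame, and that the empirical factor α(V) is supplied as a callable. The function name `estimate_played_signal`, its parameter names, and the numeric values are hypothetical and chosen only for the example.

```python
def estimate_played_signal(first_signal, volume, mic_sensitivity,
                           speaker_response, gain_for_volume):
    """Compute B(F, t) = alpha(V) * V * S * f(F) * A(F, t) on a grid.

    first_signal: A(F, t), rows indexed by frequency bin, columns by frame.
    speaker_response: f(F), one value per frequency bin.
    gain_for_volume: callable standing in for the empirical factor alpha(V).
    """
    if len(first_signal) != len(speaker_response):
        raise ValueError("need one speaker response value per frequency bin")
    # Terms that do not depend on F or t are folded into a single scale.
    scale = gain_for_volume(volume) * volume * mic_sensitivity
    return [
        [scale * response * sample for sample in row]
        for row, response in zip(first_signal, speaker_response)
    ]


# Hypothetical values purely for illustration.
estimated = estimate_played_signal(
    first_signal=[[1.0, 2.0], [3.0, 4.0]],
    volume=0.5,
    mic_sensitivity=2.0,
    speaker_response=[1.0, 0.5],
    gain_for_volume=lambda v: 1.0,
)
```

With these inputs the scale factor is 1.0, so the second frequency bin is simply halved by its speaker response value.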
Further, the call state determining module 314 is specifically configured to determine the voice call state of the second terminal according to the first voice signal and the second voice signal by using a dual-talk detection algorithm.
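The text does not say which double-talk detection algorithm is used. Purely as an illustration, the sketch below uses the classical Geigel detector, which declares near-end speech when the microphone sample exceeds a fraction of the recent far-end peak. Treating "near-end speech detected" as the manned call state and its absence as the unmanned call state is an assumption, as are the function name `geigel_near_end_active`, the window length, and the threshold.

```python
def geigel_near_end_active(far_end, mic, window=4, threshold=0.5):
    """Flag, per sample, whether near-end speech is likely present.

    Geigel rule: near-end speech is declared when |mic[n]| exceeds
    threshold * max(|far_end[n]|, ..., |far_end[n - window + 1]|).
    far_end must be at least as long as mic.
    """
    flags = []
    for n, mic_sample in enumerate(mic):
        start = max(0, n - window + 1)
        recent_peak = max(abs(x) for x in far_end[start:n + 1])
        flags.append(abs(mic_sample) > threshold * recent_peak)
    return flags


# Made-up sample values: the microphone first carries echo only,
# then echo plus a local talker on samples 2 and 3.
far = [1.0, 0.8, 0.9, 1.0, 0.7, 0.9]
mic_echo_only = [0.2, 0.3, 0.25, 0.3, 0.2, 0.25]
mic_with_talker = [0.2, 0.3, 0.9, 1.1, 0.2, 0.25]
echo_only_flags = geigel_near_end_active(far, mic_echo_only)
talker_flags = geigel_near_end_active(far, mic_with_talker)
```

A practical detector would usually hold the decision for a short hangover period rather than switching state on every sample.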
Further, the first result generating module 315 is specifically configured to:
calculating a first residual signal corresponding to the second speech signal according to the following residual signal calculation formula:
e1(F,t)=C(F,t)-B(F,t),
wherein e1(F, t) is the first residual signal corresponding to the second speech signal, C(F, t) is the second speech signal, and B(F, t) is the estimated speech signal;
calculating an integral area of a frequency-time coordinate that makes the first residual signal amplitude positive;
calculating the frequency domain time domain area corresponding to the first residual signal;
calculating a variance of the first residual signal;
if the ratio of the integral area of the frequency-time coordinate with the positive amplitude of the first residual signal to the frequency-domain time-domain area corresponding to the first residual signal is larger than a first preset threshold value, and the variance of the first residual signal is smaller than a second preset threshold value, determining that the voice receiving detection result of the second terminal is normal;
calculating an integral area of a frequency-time coordinate that makes the first residual signal amplitude negative;
determining an absolute value signal corresponding to the first residual signal and calculating a mean value of the absolute value signal corresponding to the first residual signal;
and if the ratio of the integral area of the frequency-time coordinate making the amplitude of the first residual signal negative to the frequency-domain time-domain area corresponding to the first residual signal is larger than a third preset threshold value, and the mean value of the absolute value signal corresponding to the first residual signal is larger than a fourth preset threshold value, determining that the reception detection result of the second terminal is too small, and sending the reception detection result to the first terminal.
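The steps performed by the first result generating module can be sketched as a single function. This is a minimal illustration, assuming both signals are sampled on the same uniform frequency-time grid so that each area ratio becomes a fraction of grid cells. The function name `classify_reception`, the return labels, and the threshold values used in the example are hypothetical stand-ins for the first to fourth preset thresholds.

```python
def classify_reception(second_signal, estimated_signal,
                       pos_ratio_min, variance_max,
                       neg_ratio_min, mean_abs_min):
    """Classify reception from e1(F, t) = C(F, t) - B(F, t).

    Both inputs are grids (rows = frequency bins, columns = time frames).
    Returns "normal", "too_small", or "undetermined".
    """
    residual = [
        c - b
        for c_row, b_row in zip(second_signal, estimated_signal)
        for c, b in zip(c_row, b_row)
    ]
    if not residual:
        raise ValueError("signals are empty")
    count = len(residual)
    mean = sum(residual) / count
    variance = sum((e - mean) ** 2 for e in residual) / count
    positive_ratio = sum(1 for e in residual if e > 0) / count
    negative_ratio = sum(1 for e in residual if e < 0) / count
    mean_abs = sum(abs(e) for e in residual) / count

    # Mostly positive, low-variance residual: reception is normal.
    if positive_ratio > pos_ratio_min and variance < variance_max:
        return "normal"
    # Mostly negative residual with a large mean magnitude: too small.
    if negative_ratio > neg_ratio_min and mean_abs > mean_abs_min:
        return "too_small"
    return "undetermined"


# Hypothetical thresholds and signal levels for illustration only.
thresholds = dict(pos_ratio_min=0.5, variance_max=4.0,
                  neg_ratio_min=0.9, mean_abs_min=20.0)
estimated = [[60.0, 62.0], [58.0, 61.0]]
result_normal = classify_reception([[61.0, 63.0], [59.0, 62.0]],
                                   estimated, **thresholds)
result_quiet = classify_reception([[30.0, 35.0], [32.0, 33.0]],
                                  estimated, **thresholds)
```

The "undetermined" label is an addition of this sketch for residuals that satisfy neither pair of conditions.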
The embodiment of the invention provides a voice call quality detection device that estimates the echo signal at the receiving party during a voice call from a reference signal, compares the estimate with the echo signal actually captured at the receiving party, judges whether reception at the receiving party is normal when no one is speaking there, and prompts the speaking party accordingly. This removes the obstacle to feeding back reception problems between the calling parties when reception is abnormal during a voice call, improves on existing voice call quality detection, and raises call efficiency.
In an optional implementation manner of the embodiment of the present invention, the apparatus for detecting voice call quality further includes:
a second result generation module, configured to, if the voice call state of the second terminal is a manned call, determine a second residual signal corresponding to the second voice signal according to the estimated voice signal and an echo signal that corresponds to the first voice signal and is estimated by a preset adaptive filter, and generate a voice reception detection result of the second terminal according to the second residual signal.
Further, the second result generation module is specifically configured to:
calculating a second residual signal corresponding to the second speech signal according to the following residual signal calculation formula:
e2(F,t)=W(F,t)-B(F,t)
wherein e2(F, t) is the second residual signal corresponding to the second speech signal, W(F, t) is the echo signal corresponding to the first speech signal estimated by the preset adaptive filter, and B(F, t) is the estimated speech signal;
calculating an integral area of a frequency-time coordinate that makes the second residual signal amplitude positive;
calculating the frequency domain time domain area corresponding to the second residual signal;
calculating a variance of the second residual signal;
if the ratio of the integral area of the frequency-time coordinate with the positive amplitude of the second residual signal to the frequency-domain time-domain area corresponding to the second residual signal is greater than a fifth preset threshold, and the variance of the second residual signal is less than a sixth preset threshold, determining that the voice reception detection result of the second terminal is normal;
calculating an integral area of a frequency-time coordinate that makes the second residual signal amplitude negative;
determining an absolute value signal corresponding to the second residual signal and calculating a mean value of the absolute value signal corresponding to the second residual signal;
and if the ratio of the integral area of the frequency-time coordinate making the amplitude of the second residual signal negative to the frequency-domain time-domain area corresponding to the second residual signal is greater than a seventh preset threshold value, and the mean value of the absolute value signal corresponding to the second residual signal is greater than an eighth preset threshold value, determining that the reception detection result of the second terminal is too small, and sending the reception detection result to the first terminal.
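The text only states that W(F, t) comes from a preset adaptive filter. As one possible illustration, the sketch below uses a normalized least mean squares (NLMS) filter in the time domain to estimate the echo of the far-end signal contained in the microphone signal. NLMS itself, the function name `nlms_echo_estimate`, the tap count, and the step size are assumptions; transforming the estimate to the frequency-time domain before forming e2(F, t) = W(F, t) - B(F, t) is left out.

```python
import random


def nlms_echo_estimate(far_end, mic, taps=4, mu=0.5, eps=1e-8):
    """Estimate the echo of far_end contained in mic with an NLMS filter.

    Returns the per-sample echo estimate, which plays the role of the
    echo signal W before any transform to the frequency-time domain.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    estimates = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))
        err = d - y
        # Normalizing by the input energy makes the step size scale-free.
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * err * h / norm for w, h in zip(weights, history)]
        estimates.append(y)
    return estimates


# Synthetic test: the microphone hears a two-tap echo of the far-end signal.
rng = random.Random(0)
far = [rng.uniform(-1.0, 1.0) for _ in range(500)]
mic = [0.6 * far[n] + (0.3 * far[n - 1] if n >= 1 else 0.0)
       for n in range(500)]
echo = nlms_echo_estimate(far, mic)
final_error = abs(echo[-1] - mic[-1])
```

In this noise-free synthetic case the estimate converges to the true echo; during a manned call the near-end speech acts as a disturbance, which is why the echo estimate rather than the raw microphone signal is compared with the estimated speech signal.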
The voice call quality detection device provided in this embodiment estimates the echo signal at the receiving party during a voice call from a reference signal, compares the estimate with the echo signal actually captured at the receiving party, judges whether the receiving party receives the voice normally, and prompts the speaking party accordingly. This removes the obstacle to feeding back reception problems between the calling parties when reception is abnormal during a voice call, improves on existing voice call quality detection, and raises call efficiency.
The voice call quality detection device can execute the voice call quality detection method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the voice call quality detection method.
Example four
Fig. 4 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention. FIG. 4 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in FIG. 4 is only one example and should not bring any limitations to the functionality or scope of use of embodiments of the present invention.
As shown in FIG. 4, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors 16, a memory 28, and a bus 18 that connects the various system components (including the memory 28 and the processors 16).
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, and commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, computer device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be appreciated that although not shown in FIG. 4, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processor 16 executes various functional applications and data processing by running the program stored in the memory 28, so as to implement the voice call quality detection method provided by the embodiment of the present invention: in the process of voice communication between a first terminal and a second terminal, acquiring a first voice signal acquired by the first terminal, and sending the first voice signal to the second terminal so as to enable the second terminal to play the first voice signal; acquiring a second voice signal corresponding to the first voice signal and acquired by the second terminal; determining an estimated voice signal corresponding to the first voice signal according to the first voice signal and the voice call parameter of the second terminal; determining the voice call state of the second terminal according to the first voice signal and the second voice signal; and if the voice call state of the second terminal is unmanned call, determining a first residual signal corresponding to the second voice signal according to the second voice signal and the estimated voice signal, and generating a voice reception detection result of the second terminal according to the first residual signal.
Example five
Fifth embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where when the computer program is executed by a processor, the method for detecting voice call quality provided in the fifth embodiment of the present invention is implemented: in the process of voice communication between a first terminal and a second terminal, acquiring a first voice signal acquired by the first terminal, and sending the first voice signal to the second terminal so as to enable the second terminal to play the first voice signal; acquiring a second voice signal corresponding to the first voice signal and acquired by the second terminal; determining an estimated voice signal corresponding to the first voice signal according to the first voice signal and the voice call parameter of the second terminal; determining the voice call state of the second terminal according to the first voice signal and the second voice signal; and if the voice call state of the second terminal is unmanned call, determining a first residual signal corresponding to the second voice signal according to the second voice signal and the estimated voice signal, and generating a voice reception detection result of the second terminal according to the first residual signal.
Any combination of one or more computer-readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or computer device. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (14)

1. A voice call quality detection method is characterized by comprising the following steps:
in the process of voice communication between a first terminal and a second terminal, acquiring a first voice signal acquired by the first terminal, and sending the first voice signal to the second terminal so as to enable the second terminal to play the first voice signal;
acquiring a second voice signal corresponding to the first voice signal and acquired by the second terminal;
determining an estimated voice signal corresponding to the first voice signal according to the first voice signal and the voice call parameter of the second terminal;
determining the voice call state of the second terminal according to the first voice signal and the second voice signal;
and if the voice call state of the second terminal is unmanned call, determining a first residual signal corresponding to the second voice signal according to the second voice signal and the estimated voice signal, and generating a voice reception detection result of the second terminal according to the first residual signal.
2. The method of claim 1, further comprising, after determining the voice call state of the second terminal based on the first voice signal and the second voice signal:
and if the voice call state of the second terminal is the manned call, determining a second residual signal corresponding to the second voice signal according to an echo signal corresponding to the first voice signal and the estimated voice signal which are pre-estimated by a preset adaptive filter, and generating a voice receiving detection result of the second terminal according to the second residual signal.
3. The method of claim 1, wherein determining an estimated voice signal corresponding to the first voice signal according to the first voice signal, the voice call parameter of the second terminal, and a preset voice signal estimation formula comprises:
calculating an estimated speech signal corresponding to the first speech signal according to the following speech signal estimation formula:
B(F,t)=α(V)*V*S*f(F)A(F,t),
wherein B(F, t) is the estimated speech signal corresponding to the first speech signal, α(V) is an empirical amplification factor, V is the system volume value of the second terminal, S is the microphone sensitivity of the second terminal, f(F) is the speaker frequency response curve of the second terminal, and A(F, t) is the first speech signal.
4. The method of claim 1, wherein the determining the voice call state of the second terminal according to the first voice signal and the second voice signal comprises:
and determining the voice call state of the second terminal according to the first voice signal and the second voice signal by using a double-talk detection algorithm.
5. The method of claim 1, wherein determining a first residual signal corresponding to the second speech signal based on the second speech signal and the estimated speech signal comprises:
calculating a first residual signal corresponding to the second speech signal according to the following residual signal calculation formula:
e1(F,t)=C(F,t)-B(F,t),
wherein e1(F, t) is the first residual signal corresponding to the second speech signal, C(F, t) is the second speech signal, and B(F, t) is the estimated speech signal.
6. The method according to claim 5, wherein generating the voice reception detection result of the second terminal according to the first residual signal comprises:
calculating an integral area of a frequency-time coordinate that makes the first residual signal amplitude positive;
calculating the frequency domain time domain area corresponding to the first residual signal;
calculating a variance of the first residual signal;
and if the ratio of the integral area of the frequency-time coordinate with the positive amplitude of the first residual signal to the frequency-domain time-domain area corresponding to the first residual signal is larger than a first preset threshold value, and the variance of the first residual signal is smaller than a second preset threshold value, determining that the voice receiving detection result of the second terminal is normal voice receiving.
7. The method of claim 6, wherein generating the voice reception detection result of the second terminal according to the first residual signal further comprises:
calculating an integral area of a frequency-time coordinate that makes the first residual signal amplitude negative;
determining an absolute value signal corresponding to the first residual signal and calculating a mean value of the absolute value signal corresponding to the first residual signal;
and if the ratio of the integral area of the frequency-time coordinate making the amplitude of the first residual signal negative to the frequency-domain time-domain area corresponding to the first residual signal is larger than a third preset threshold value, and the mean value of the absolute value signal corresponding to the first residual signal is larger than a fourth preset threshold value, determining that the reception detection result of the second terminal is too small, and sending the reception detection result to the first terminal.
8. The method of claim 2, wherein determining a second residual signal corresponding to the second speech signal according to the echo signal corresponding to the first speech signal and the estimated speech signal estimated by the preset adaptive filter comprises:
calculating a second residual signal corresponding to the second speech signal according to the following residual signal calculation formula:
e2(F,t)=W(F,t)-B(F,t)
wherein e2(F, t) is the second residual signal corresponding to the second speech signal, W(F, t) is the echo signal corresponding to the first speech signal estimated by the preset adaptive filter, and B(F, t) is the estimated speech signal.
9. The method according to claim 8, wherein generating the voice reception detection result of the second terminal according to the second residual signal comprises:
calculating an integral area of a frequency-time coordinate that makes the second residual signal amplitude positive;
calculating the frequency domain time domain area corresponding to the second residual signal;
calculating a variance of the second residual signal;
and if the ratio of the integral area of the frequency-time coordinate with the positive amplitude of the second residual signal to the frequency-domain time-domain area corresponding to the second residual signal is greater than a fifth preset threshold, and the variance of the second residual signal is smaller than a sixth preset threshold, determining that the voice reception detection result of the second terminal is normal.
10. The method of claim 9, wherein generating the voice reception detection result of the second terminal according to the second residual signal further comprises:
calculating an integral area of a frequency-time coordinate that makes the second residual signal amplitude negative;
determining an absolute value signal corresponding to the second residual signal and calculating a mean value of the absolute value signal corresponding to the second residual signal;
and if the ratio of the integral area of the frequency-time coordinate making the amplitude of the second residual signal negative to the frequency-domain time-domain area corresponding to the second residual signal is greater than a seventh preset threshold value, and the mean value of the absolute value signal corresponding to the second residual signal is greater than an eighth preset threshold value, determining that the reception detection result of the second terminal is too small, and sending the reception detection result to the first terminal.
11. A voice call quality detection apparatus, comprising:
the first signal acquisition module is used for acquiring a first voice signal acquired by a first terminal in the voice call process between the first terminal and a second terminal, and sending the first voice signal to the second terminal so as to enable the second terminal to play the first voice signal;
the second signal acquisition module is used for acquiring a second voice signal which is acquired by the second terminal and corresponds to the first voice signal;
the estimated signal determining module is used for determining an estimated voice signal corresponding to the first voice signal according to the first voice signal and the voice call parameter of the second terminal;
the call state determining module is used for determining the voice call state of the second terminal according to the first voice signal and the second voice signal;
and the first result generation module is used for determining a first residual signal corresponding to the second voice signal according to the second voice signal and the estimated voice signal and generating a voice receiving detection result of the second terminal according to the first residual signal if the voice call state of the second terminal is unmanned call.
12. The apparatus of claim 11, further comprising:
and the second result generation module is used for determining a second residual signal corresponding to the second voice signal according to an echo signal which is estimated by a preset adaptive filter and corresponds to the first voice signal and the estimated voice signal if the voice call state of the second terminal is a manned call, and generating a voice receiving detection result of the second terminal according to the second residual signal.
13. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the voice call quality detection method according to any one of claims 1 to 10 when executing the computer program.
14. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a voice call quality detection method according to any one of claims 1 to 10.
CN202010566405.7A 2020-06-19 2020-06-19 Voice call quality detection method, device, equipment and storage medium Active CN113824843B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010566405.7A CN113824843B (en) 2020-06-19 2020-06-19 Voice call quality detection method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113824843A (en) 2021-12-21
CN113824843B (en) 2023-11-21

Family

ID=78911610

Country Status (1)

Country Link
CN (1) CN113824843B (en)

Also Published As

Publication number Publication date
CN113824843B (en) 2023-11-21

Similar Documents

Publication Publication Date Title
US7945442B2 (en) Internet communication device and method for controlling noise thereof
EP2987316B1 (en) Echo cancellation
US9768829B2 (en) Methods for processing audio signals and circuit arrangements therefor
US8842851B2 (en) Audio source localization system and method
US11297178B2 (en) Method, apparatus, and computer-readable media utilizing residual echo estimate information to derive secondary echo reduction parameters
EP3058710B1 (en) Detecting nonlinear amplitude processing
CN112071328B (en) Audio noise reduction
EP2982101B1 (en) Noise reduction
US10978086B2 (en) Echo cancellation using a subset of multiple microphones as reference channels
JP2009503568A (en) Steady separation of speech signals in noisy environments
CN112004177B (en) Howling detection method, microphone volume adjustment method and storage medium
JP3507020B2 (en) Echo suppression method, echo suppression device, and echo suppression program storage medium
CN109215672B (en) Method, device and equipment for processing sound information
WO2020252629A1 (en) Residual acoustic echo detection method, residual acoustic echo detection device, voice processing chip, and electronic device
CN111556210B (en) Call voice processing method and device, terminal equipment and storage medium
CN112581960A (en) Voice wake-up method and device, electronic equipment and readable storage medium
JP2024507916A (en) Audio signal processing method, device, electronic device, and computer program
US8345860B1 (en) Method and system for detection of onset of near-end signal in an echo cancellation system
CN112235462A (en) Voice adjusting method, system, electronic equipment and computer readable storage medium
CN113824843B (en) Voice call quality detection method, device, equipment and storage medium
CN110992975A (en) Voice signal processing method and device and terminal
CN111210799A (en) Echo cancellation method and device
WO2019169272A1 (en) Enhanced barge-in detector
CN111681666B (en) Backup method and device for filter coefficient and computer storage medium
JP3466049B2 (en) Voice switch for talker

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant