CN111445916A - Audio dereverberation method, device and storage medium in conference system - Google Patents

Publication number
CN111445916A
Authority
CN
China
Prior art keywords
reverberation time
audio
reverberation
audio signal
dereverberation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010160669.2A
Other languages
Chinese (zh)
Other versions
CN111445916B (en)
Inventor
黄景标
林聚财
殷俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN202010160669.2A
Publication of CN111445916A
Application granted
Publication of CN111445916B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech

Abstract

The invention discloses an audio dereverberation method, an audio dereverberation device and a storage medium in a conference system. The audio dereverberation method in the conference system comprises the following steps: calculating a first reverberation time under an audio scene by using an acoustic echo path; calculating a second reverberation time under the audio scene by using the audio signal received by the microphone; calculating a path deviation of the acoustic echo path, and performing weight distribution on the first reverberation time and the second reverberation time according to the path deviation, so that the first reverberation time and the second reverberation time after weight distribution are the same, and defining that the first reverberation time and the second reverberation time after weight distribution are third reverberation time; and performing dereverberation processing on the audio signal according to the third reverberation time. The invention can improve the effectiveness and robustness of estimating the reverberation time and more effectively remove the reverberation component in the audio signal.

Description

Audio dereverberation method, device and storage medium in conference system
Technical Field
The present invention relates to the field of audio signal processing technologies, and in particular, to an audio dereverberation method, device and storage medium in a conference system.
Background
In the case of sound signal collection or recording, the microphone receives not only the part of the sound wave emitted by the desired sound source and directly arriving, but also the sound wave emitted by the sound source and arriving by other routes, and the undesired sound wave (i.e. background noise) generated by other sound sources in the environment. Acoustically, the reflected wave with a delay time of about 50ms or more is called echo, and the effect of the remaining reflected wave is called reverberation. The reverberation phenomenon will have an effect on the reception effect of the desired acoustic signal. In many cases, reverberation tends to cause interference, resulting in poor performance of the acoustic receiving system. Therefore, it is important to reduce the influence of reverberation on the sound receiving system, i.e. dereverberation.
The inventors of the present application found that existing dereverberation processing is not sufficiently effective.
Disclosure of Invention
The invention provides a method and a device for removing reverberation of audio in a conference system and a storage medium, which can solve the technical problem of poor reverberation removing effect in the prior art.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
an audio dereverberation method in a conference system, comprising the steps of:
calculating a first reverberation time in the audio scene using the acoustic echo path, and
calculating a second reverberation time in the audio scene using the audio signal received by the microphone;
calculating a path deviation of the acoustic echo path, and performing weight distribution on the first reverberation time and the second reverberation time according to the path deviation, so that the first reverberation time and the second reverberation time after weight distribution are the same, and defining that the first reverberation time and the second reverberation time after weight distribution are third reverberation time;
and performing dereverberation processing on the audio signal according to the third reverberation time.
The technical scheme adopted by the invention also comprises the following steps: before calculating the first reverberation time in the audio scene using the acoustic echo path, the method further comprises:
and acquiring the acoustic echo path between a loudspeaker and a microphone under the audio scene by using an echo cancellation algorithm.
The technical scheme adopted by the invention also comprises the following steps: said weight assigning the first and second reverberation times according to the path deviation further comprises:
if the path deviation is larger than a set path deviation threshold value, indicating that the convergence of the echo cancellation algorithm is unsuccessful, distributing a smaller weight to the first reverberation time; if the path deviation is smaller than a set path deviation threshold value, indicating that the convergence of the echo cancellation algorithm is successful, allocating a larger weight to the first reverberation time;
the weight of the second reverberation time is: the total weight minus the weight assigned to the first reverberation time.
The technical scheme adopted by the invention also comprises the following steps: the dereverberating the audio signal according to the third reverberation time further comprises:
and calculating the late reverberation power spectral density of the audio signal by using the third reverberation time.
The technical scheme adopted by the invention also comprises the following steps: the dereverberating the audio signal according to the third reverberation time further comprises:
and carrying out short-time Fourier transform on the audio signal to obtain the representation of the audio signal on a short-time frequency domain, and calculating the noise power spectral density of the audio signal by using a noise estimation algorithm.
The technical scheme adopted by the invention also comprises the following steps: the dereverberating the audio signal according to the third reverberation time further comprises:
and based on the calculation results of the noise power spectral density and the late reverberation power spectral density, performing voice enhancement processing on each frequency point in the audio signal by using a voice enhancement mode to eliminate a reverberation part in the audio signal.
The technical scheme adopted by the invention also comprises the following steps: the speech enhancement mode comprises spectral subtraction, Wiener filtering or an MMSE estimator.
The invention adopts another technical scheme that: an audio dereverberation apparatus in a conference system, the apparatus comprising:
a first reverberation time estimation module: for calculating a first reverberation time in the audio scene using the acoustic echo path;
a second reverberation time estimation module: for calculating a second reverberation time in the audio scene using the audio signal received by the microphone;
a weight assignment module: for calculating a path deviation of the acoustic echo path, and performing weight distribution on the first reverberation time and the second reverberation time according to the path deviation, so that the first reverberation time and the second reverberation time after weight distribution are the same, and the first reverberation time and the second reverberation time after weight distribution are both defined as a third reverberation time;
a voice enhancement module: for dereverberating the audio signal in accordance with the third reverberation time.
In order to solve the technical problems, the invention adopts another technical scheme that: there is provided an audio dereverberation apparatus in a conference system, comprising a processor, a memory coupled to the processor, wherein,
the memory stores program instructions for implementing the audio dereverberation method in the conferencing system as set forth above;
the processor is to execute the program instructions stored by the memory to dereverberate an audio signal.
In order to solve the technical problems, the invention adopts another technical scheme that: a storage medium storing program instructions executable by a processor to perform the audio dereverberation method in a conference system as described above.
The invention has the beneficial effects that: according to the audio dereverberation method, device and storage medium in the conference system, the reverberation time of the audio signal is estimated by using the acoustic echo path, which improves the effectiveness and robustness of the reverberation time estimate; in addition, because the acoustic echo path is time-varying and its estimate may change during updating, the calculated reverberation times are weighted, which further improves the accuracy of the reverberation time estimate, so that reverberation components in the audio signal are removed more effectively.
Drawings
Fig. 1 is a flow chart illustrating an audio dereverberation method in a conference system according to a first embodiment of the present invention;
Fig. 2 is a flow chart of an audio dereverberation method in a conference system according to a second embodiment of the present invention;
Fig. 3 is a schematic diagram of a first structure of an audio dereverberation apparatus in a conference system according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a second structure of an audio dereverberation apparatus in the conference system according to the embodiment of the present invention;
Fig. 5 is a schematic diagram of a storage medium structure according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first", "second" and "third" in the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one of the feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise. All directional indicators (such as up, down, left, right, front, and rear … …) in the embodiments of the present invention are only used to explain the relative positional relationship between the components, the movement, and the like in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indicator is changed accordingly. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Example one
Please refer to fig. 1, which is a flowchart illustrating an audio dereverberation method in a conference system according to a first embodiment of the present invention. The audio dereverberation method in the conference system of the first embodiment of the present invention includes the steps of:
S100: calculating a first reverberation time in an audio scene by using an acoustic echo path;
In S100, the acoustic echo path is first estimated by adaptive filtering:
[Formula (1): equation image BDA0002405660810000051 not reproduced in this text]
In formula (1), ω_1 denotes the estimate of ω, and μ_adp is a step-size factor in the range [0, 1];
[symbol image BDA0002405660810000052 not reproduced] is the residual signal;
[symbol image BDA0002405660810000053 not reproduced] is the average power of the reference audio signal, calculated as follows:
[Formula (2): equation image BDA0002405660810000061 not reproduced in this text]
In formula (2), λ is a smoothing factor, usually set to 0.98.
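As an illustration only, the adaptive estimate in S100 can be sketched in Python. Formulas (1) and (2) are not reproduced in this text, so the update rule below is an assumption: a normalized LMS filter whose step is scaled by a recursively smoothed reference power. The function name `nlms_echo_path`, the tap count `num_taps`, and the regularizer `eps` are hypothetical; only the step-size factor (`mu_adp`, in [0, 1]) and the smoothing factor (`lam`, usually 0.98) come from the description.

```python
import numpy as np

def nlms_echo_path(x, d, num_taps=64, mu_adp=0.2, lam=0.98, eps=1e-8):
    """Estimate an echo path with a normalized LMS filter (assumed update rule).

    x: reference (loudspeaker) signal, d: microphone signal, same length.
    """
    w = np.zeros(num_taps)      # current estimate of the echo path
    buf = np.zeros(num_taps)    # most recent reference samples, newest first
    power = eps                 # recursively smoothed reference power
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        e = d[n] - w @ buf      # residual signal after subtracting the echo estimate
        power = lam * power + (1.0 - lam) * x[n] ** 2
        # Step normalized by the smoothed power times the filter length
        w = w + mu_adp * e * buf / (num_taps * power + eps)
    return w
```

In a noiseless simulation with a white reference signal, the returned vector should approach the true impulse response; with real conference audio, convergence depends on the excitation and on double-talk handling, which this sketch does not cover.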
Then, the first reverberation time is calculated using the acoustic echo path.
The acoustic echo path is converted into a dB representation:
[equation image BDA0002405660810000062 not reproduced in this text]
A regression coefficient c is estimated by linear fitting. With the fitted line set as cn + b, the regression coefficient c is calculated as:
[equation image BDA0002405660810000063 not reproduced in this text]
where
[equation image BDA0002405660810000064 not reproduced in this text]
[equation image BDA0002405660810000065 not reproduced in this text]
The first reverberation time calculated using the acoustic echo path is:
[equation image BDA0002405660810000066 not reproduced in this text]
where [symbol image BDA0002405660810000067 not reproduced] is an adjustment factor.
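A minimal sketch of the linear-fit step follows, under stated assumptions. The equation images for the dB conversion and the reverberation time are missing from this text, so the code assumes the level is 20·log10|ω(n)|, fits the line cn + b by least squares, and takes the reverberation time as the time for the fitted line to fall by 60 dB. The adjustment factor mentioned above is omitted (treated as 1), and `t60_from_echo_path` and `floor_db` are hypothetical names.

```python
import numpy as np

def t60_from_echo_path(w, fs, floor_db=-120.0):
    """Fit a line c*n + b to the echo path level in dB and convert the slope to T60."""
    n = np.arange(len(w))
    # Level in dB, floored to avoid log of zero for empty taps
    level_db = 20.0 * np.log10(np.maximum(np.abs(w), 10.0 ** (floor_db / 20.0)))
    c, b = np.polyfit(n, level_db, 1)   # least-squares slope (dB/sample) and intercept
    if c >= 0:
        raise ValueError("echo path does not decay; cannot estimate T60")
    return -60.0 / (c * fs)             # seconds needed for a 60 dB drop
```

For an echo path with an ideal exponential envelope the fit recovers the decay exactly; for measured paths, the direct sound and early reflections would normally bias the slope, which is one plausible role for the adjustment factor.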
S101: calculating a second reverberation time in the audio scene by using the audio signal received by the microphone;
In S101, the audio signal received by the microphone is assumed to be:
[Formula (7): equation image BDA0002405660810000068 not reproduced in this text]
In formula (7),
[symbol image BDA0002405660810000069 not reproduced]
is the signal that reaches the microphone after the audio played by the loudspeaker propagates through the conference room; the acoustic echo path has length N and is ω(n) = [ω_0(n), …, ω_{N-1}(n)]^T;
[equation image BDA00024056608100000610 not reproduced in this text]
x_revb(n) is the audio reverberation signal and v(n) is the background noise.
Echo cancellation is applied to the audio signal d(n) received by the microphone to obtain d′(n), and the second reverberation time is estimated from d′(n):
d′(n) = x_revb(n) + v(n)    (8)
In formula (8), x_revb(n) may be expressed as
[equation image BDA0002405660810000071 not reproduced in this text]
where T_s is the inverse of the sampling rate and
[symbol image BDA0002405660810000072 not reproduced]
is referred to as the reverberation attenuation factor;
The reverberation attenuation factor is estimated by maximum likelihood estimation:
ρ = arg max L(d′, ρ)    (9)
In formula (9),
[equation image BDA0002405660810000073 not reproduced in this text]
[symbol image BDA0002405660810000074 not reproduced]
is the power of the noise.
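The maximum likelihood step can be illustrated with a simplified sketch. The likelihood function of formula (9) is not reproduced in this text, so the code assumes a noise-free model d′(n) = ρ^n·σ·g(n) with g(n) standard Gaussian, profiles out σ², and searches ρ over a grid. The noise power term of the source is ignored, and the mapping from ρ to a reverberation time (60 dB amplitude decay) is likewise an assumption. All function names are hypothetical.

```python
import numpy as np

def ml_decay_factor(d_tail, rho_grid):
    """Grid-search ML estimate of rho for d(n) = rho**n * sigma * g(n), g ~ N(0, 1)."""
    n = np.arange(len(d_tail))
    best_rho, best_ll = None, -np.inf
    for rho in rho_grid:
        # ML estimate of sigma^2 for this rho (decay undone sample by sample)
        sigma2 = np.mean(d_tail ** 2 * rho ** (-2.0 * n))
        # Log-likelihood with sigma^2 profiled out, additive constants dropped
        ll = -np.log(rho) * n.sum() - 0.5 * len(n) * np.log(sigma2)
        if ll > best_ll:
            best_rho, best_ll = rho, ll
    return best_rho

def t60_from_decay_factor(rho, fs):
    """Time for rho**n to fall by 60 dB in amplitude, with n counted in samples."""
    return -3.0 / (fs * np.log10(rho))
```

The grid search is chosen for clarity; an iterative solver would be the usual choice in a real-time system, and the segment passed in should be a free-decay portion of d′(n).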
S102: calculating a path deviation of an acoustic echo path, and performing weight distribution on the first reverberation time and the second reverberation time according to the path deviation so that the first reverberation time and the second reverberation time after weight distribution are the same, and defining that the first reverberation time and the second reverberation time after weight distribution are third reverberation time;
in S102, the weight distribution strategy adopted by the present invention is: firstly, calculating the path deviation of an acoustic echo path, and if the path deviation is greater than a set path deviation threshold value, indicating that the convergence of an echo cancellation algorithm is unsuccessful, distributing a smaller weight to the first reverberation time; if the path deviation is smaller than the set path deviation threshold value, indicating that the convergence of the echo cancellation algorithm is successful, distributing a larger weight to the first reverberation time; the weight of the second reverberation time is assigned as: the total weight minus the weight assigned to the first reverberation time.
S103: performing dereverberation processing on the audio signal according to the third reverberation time;
In S103, dereverberation specifically includes: calculating the noise power spectral density of the audio signal by using a noise estimation algorithm, calculating the late reverberation power spectral density of the audio signal by using the third reverberation time, and dereverberating the audio signal by a speech enhancement method based on the calculated noise power spectral density and late reverberation power spectral density.
According to the audio dereverberation method in the conference system, the first reverberation time of the audio signal is estimated by using the acoustic echo path, the second reverberation time is calculated by using the audio signal received by the microphone, then the two reverberation times are subjected to weight distribution, and the audio signal is subjected to dereverberation treatment according to the reverberation time after the weight distribution, so that the dereverberation effect of the audio signal is improved.
Example two
Please refer to fig. 2, which is a flowchart illustrating an audio dereverberation method in a conference system according to a second embodiment of the present invention. The audio dereverberation method in the conference system of the second embodiment of the present invention includes the steps of:
S200: acquiring an acoustic echo path between a loudspeaker and a microphone in a current audio scene;
In S200, the acoustic echo path is obtained by an echo cancellation algorithm in the conference scene. The embodiment of the invention uses adaptive filtering to estimate the acoustic echo path:
[Formula (1): equation image BDA0002405660810000081 not reproduced in this text]
In formula (1), ω_1 denotes the estimate of ω, and μ_adp is a step-size factor in the range [0, 1];
[symbol image BDA0002405660810000082 not reproduced] is the residual signal;
[symbol image BDA0002405660810000083 not reproduced] is the average power of the reference audio signal, calculated as follows:
[Formula (2): equation image BDA0002405660810000084 not reproduced in this text]
In formula (2), λ is a smoothing factor, usually set to 0.98.
S201: calculating a first reverberation time in the current audio scene by using the acoustic echo path;
In S201, the reverberation time is the time required for the sound energy to decay by 60 dB after the sound source stops emitting sound; it characterizes the degree of reverberation of the room and can also be used to estimate the power of late reverberation. The first reverberation time is calculated from the acoustic echo path as follows:
The acoustic echo path is converted into a dB representation:
[equation image BDA0002405660810000085 not reproduced in this text]
A regression coefficient c is estimated by linear fitting. With the fitted line set as cn + b, the regression coefficient c is calculated as:
[equation image BDA0002405660810000086 not reproduced in this text]
where
[equation image BDA0002405660810000091 not reproduced in this text]
[equation image BDA0002405660810000092 not reproduced in this text]
The first reverberation time calculated using the acoustic echo path is:
[equation image BDA0002405660810000093 not reproduced in this text]
where [symbol image BDA0002405660810000094 not reproduced] is an adjustment factor.
S202: acquiring an audio signal received by a microphone, calculating a second reverberation time in the current audio scene by using the received audio signal, and executing S203 and S204 respectively;
In S202, a conference scene usually contains a loudspeaker and a microphone. The audio played by the loudspeaker comes from an audio signal sent over the network, and the audio signal received by the microphone includes both the audio played by the loudspeaker and the speech of the talker in the current conference scene. The audio signal received by the microphone is assumed to be:
[Formula (7): equation image BDA0002405660810000095 not reproduced in this text]
In formula (7),
[symbol image BDA0002405660810000096 not reproduced]
is the signal that reaches the microphone after the audio played by the loudspeaker propagates through the conference room; the acoustic echo path has length N and is ω(n) = [ω_0(n), …, ω_{N-1}(n)]^T;
[equation image BDA0002405660810000097 not reproduced in this text]
x_revb(n) is the audio reverberation signal and v(n) is the background noise.
Echo cancellation is applied to the audio signal d(n) received by the microphone to obtain d′(n), and the second reverberation time is estimated from d′(n):
d′(n) = x_revb(n) + v(n)    (8)
In formula (8), x_revb(n) may be expressed as
[equation image BDA0002405660810000098 not reproduced in this text]
where T_s is the inverse of the sampling rate and
[symbol image BDA0002405660810000099 not reproduced]
is referred to as the reverberation attenuation factor;
The reverberation attenuation factor is estimated by maximum likelihood estimation:
ρ = arg max L(d′, ρ)    (9)
In formula (9),
[equation image BDA0002405660810000101 not reproduced in this text]
[symbol image BDA0002405660810000102 not reproduced]
is the power of the noise.
S203: analyzing the acoustic echo path, calculating a path deviation of the acoustic echo path, performing weight distribution on the first reverberation time and the second reverberation time according to the path deviation of the acoustic echo path, so that the first reverberation time and the second reverberation time after the weight distribution are the same, obtaining a final third reverberation time, and executing S205;
in S203, since the reverberation time is not affected by the distance between the microphone and the sound source, the accuracy of the first reverberation time calculated using the acoustic echo path is higher than that of the second reverberation time calculated using the audio signal. Due to environmental factors, the acoustic echo path is time-varying, so that the acoustic echo path calculated by using an echo cancellation algorithm changes during updating, and aiming at the situation, the weight distribution strategy adopted by the invention is as follows: firstly, calculating the path deviation of an acoustic echo path calculated by using an echo cancellation algorithm, and if the path deviation is greater than a set path deviation threshold value, indicating that the convergence of the echo cancellation algorithm is unsuccessful, distributing a smaller weight to a first reverberation time calculated by the acoustic echo path; on the contrary, if the path deviation is smaller than the set path deviation threshold value, which indicates that the convergence of the echo cancellation algorithm is successful, a larger weight is assigned to the first reverberation time calculated by the acoustic echo path. The first reverberation time and the second reverberation time are subjected to weight distribution through the path deviation of the acoustic echo path, namely the first reverberation time calculated by the acoustic echo path is corrected by the second reverberation time calculated by the audio signal, so that the final third reverberation time is more accurate, and the reverberation component in the audio signal can be more effectively removed in the subsequent dereverberation.
In the embodiment of the present invention, the weight w of the first reverberation time is:
[equation image BDA0002405660810000103 not reproduced in this text]
[equation image BDA0002405660810000104 not reproduced in this text]
where S(x) = 1/(1 + e^{-x}); the weight of the second reverberation time is 1 - w. It is understood that the weight distribution includes, but is not limited to, the above form, and can be adjusted or set according to actual operation.
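The weighting strategy of S203 might be sketched as follows. The exact weight formula is not reproduced in this text; only the sigmoid S(x) and the complementary weight 1 - w are given. The definition of the path deviation (relative squared change between successive echo path estimates), the `threshold` and `slope` values, and the function names are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    """S(x) = 1 / (1 + exp(-x)), as named in the description."""
    return 1.0 / (1.0 + np.exp(-x))

def path_deviation(w_prev, w_curr, eps=1e-12):
    """Relative change between successive echo path estimates (assumed definition)."""
    return np.sum((w_curr - w_prev) ** 2) / (np.sum(w_prev ** 2) + eps)

def fuse_reverberation_times(t60_path, t60_signal, deviation, threshold=0.1, slope=50.0):
    """Blend the two estimates; a small deviation favours the echo path estimate."""
    w = sigmoid(slope * (threshold - deviation))    # weight of the first reverberation time
    t60_third = w * t60_path + (1.0 - w) * t60_signal
    return t60_third, w
```

A smooth sigmoid avoids abrupt jumps in the third reverberation time when the deviation hovers around the threshold; a hard switch between two fixed weights would also match the strategy described above.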
S204: carrying out short-time Fourier transform on the audio signal received by the microphone to obtain the representation of the audio signal on a short-time frequency domain, and calculating the noise power spectral density in the audio signal by using a noise estimation algorithm;
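For illustration, S204 can be sketched with NumPy alone. The description does not name a specific noise estimation algorithm, so the minimum-tracking estimator below is a stand-in chosen for brevity, not the method of the invention. The frame length, hop size, smoothing constant `alpha`, and window length `window_frames` are hypothetical values.

```python
import numpy as np

def stft(signal, frame_len=512, hop=256):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(num_frames)])
    return np.fft.rfft(frames, axis=1)

def noise_psd_min_tracking(spec, alpha=0.9, window_frames=50):
    """Crude noise PSD: smooth |D|^2 over time, then take a running minimum per bin."""
    power = np.abs(spec) ** 2
    smoothed = np.empty_like(power)
    smoothed[0] = power[0]
    for n in range(1, len(power)):
        smoothed[n] = alpha * smoothed[n - 1] + (1.0 - alpha) * power[n]
    noise = np.empty_like(power)
    for n in range(len(power)):
        start = max(0, n - window_frames + 1)
        noise[n] = smoothed[start:n + 1].min(axis=0)
    return noise
```

A running minimum underestimates the true noise level, so practical estimators apply a bias compensation factor; that refinement is left out here.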
S205: calculating the late reverberation power spectral density of the audio signal by using the third reverberation time;
In S205, the late reverberation power spectral density is calculated as:
η_nlk(n, k) = e^{-2β(n)R} η(n - N_e, k)    (10)
In formula (10), η_nlk(n, k) is the power of the late reverberation component, n is the time frame index, k is the frequency bin index, and η(n, k) is the average power of the signal, η(n, k) = αη(n - 1, k) + (1 - α)|d′(n, k)|^2, where α is a smoothing factor, typically 0.95; N_e is an adjustment parameter, typically 8; and R is the hop size between successive time frames;
[equation image BDA0002405660810000111 not reproduced in this text]
fs is the sampling rate;
[equation image BDA0002405660810000112 not reproduced in this text]
where
[symbol image BDA0002405660810000113 not reproduced]
represents the first reverberation time estimated from the acoustic transfer path, and
[symbol image BDA0002405660810000114 not reproduced]
represents the second reverberation time estimated from the audio signal.
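A minimal sketch of formula (10) follows. The recursion for η(n, k) and the typical values α = 0.95 and N_e = 8 are taken from the description; the expression for β is missing from this text, so the code assumes β = 3·ln(10)/(T60·fs), which corresponds to a 60 dB energy decay over the third reverberation time. The exponent is taken literally from formula (10) as extracted. The function name is hypothetical.

```python
import numpy as np

def late_reverb_psd(spec, t60, fs, hop, n_e=8, alpha=0.95):
    """Late reverberation PSD: exp(-2*beta*R) * eta(n - N_e, k), with R the hop size."""
    power = np.abs(spec) ** 2
    eta = np.empty_like(power)
    eta[0] = power[0]
    for n in range(1, len(power)):
        # Recursive average of the signal power, per frequency bin
        eta[n] = alpha * eta[n - 1] + (1.0 - alpha) * power[n]
    beta = 3.0 * np.log(10.0) / (t60 * fs)      # assumed decay rate per sample
    late = np.zeros_like(power)
    late[n_e:] = np.exp(-2.0 * beta * hop) * eta[:-n_e]   # first n_e frames stay zero
    return late
```

Here `spec` is the short-time spectrum of d′ with shape (frames, bins), and `t60` would be the third reverberation time obtained from the weighting step.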
S206: based on the calculation results of the noise power spectral density and the late reverberation power spectral density, performing voice enhancement processing on each frequency point in the audio signal by using a voice enhancement mode to eliminate a reverberation part in the audio signal;
In S206, the speech enhancement method includes, but is not limited to, spectral subtraction, Wiener filtering, or an MMSE estimator. Taking Wiener filtering as an example, the filter can be expressed as:
[Formula (11): equation image BDA0002405660810000115 not reproduced in this text]
In formula (11), ξ(n, k) represents the a priori signal-to-noise ratio,
[equation image BDA0002405660810000116 not reproduced in this text]
[equation image BDA0002405660810000117 not reproduced in this text]
ξ_min is the lower limit of the a priori signal-to-noise ratio and can be set according to the actual situation.
Finally, the dereverberated audio signal is obtained: x_e(n, k) = H(n, k) d′(n, k).
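The final enhancement step can be sketched as below. Formula (11) and the definition of the a priori signal-to-noise ratio are not reproduced in this text, so the code assumes the textbook Wiener gain H = ξ/(1 + ξ) and a simple SNR estimate that treats late reverberation plus noise as interference, floored at ξ_min. The function name and the default `xi_min` are hypothetical.

```python
import numpy as np

def wiener_dereverb(spec, late_psd, noise_psd, xi_min=0.01, eps=1e-12):
    """Apply a Wiener gain H = xi / (1 + xi) with a floored a priori SNR."""
    interference = late_psd + noise_psd + eps
    # Assumed SNR estimate: power in excess of the interference, relative to it
    xi = np.maximum(np.abs(spec) ** 2 / interference - 1.0, xi_min)
    gain = xi / (1.0 + xi)
    return gain * spec, gain    # x_e(n, k) = H(n, k) * d'(n, k)
```

The floor ξ_min limits the maximum attenuation, which trades residual reverberation against musical-noise artifacts; the time-domain signal would then be recovered by an inverse short-time Fourier transform with overlap-add.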
The audio dereverberation method in the conference system of the second embodiment of the invention estimates the reverberation time of the audio signal by using the acoustic echo path, which improves the effectiveness and robustness of the reverberation time estimate; in addition, because the acoustic echo path is time-varying and its estimate may change during updating, the calculated reverberation times are weighted, which further improves the accuracy of the reverberation time estimate, so that reverberation components in the audio signal are removed more effectively.
Referring to fig. 3, fig. 3 is a schematic diagram illustrating a first structure of an audio dereverberation apparatus in a conference system according to an embodiment of the present invention. The apparatus 40 comprises:
the first reverberation time estimation module 41: for calculating a first reverberation time in the audio scene using the acoustic echo path;
the second reverberation time estimation module 42: for calculating a second reverberation time in the audio scene using the audio signal received by the microphone;
weight assignment module 43: for calculating the path deviation of the acoustic echo path, and performing weight distribution on the first reverberation time and the second reverberation time according to the path deviation, so that the first reverberation time and the second reverberation time after the weight distribution are the same, and the first reverberation time and the second reverberation time after the weight distribution are both defined as a third reverberation time;
the speech enhancement module 44: for dereverberating the audio signal in accordance with the third reverberation time.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating a second structure of the audio dereverberation apparatus in the conference system according to the present invention. As shown in fig. 4, the apparatus 50 includes a processor 51, and a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the audio dereverberation method in the conference system described above.
The processor 51 is operative to execute program instructions stored in the memory 52 to dereverberate the audio signal.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip having signal processing capabilities. The processor 51 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a storage medium according to an embodiment of the invention. The storage medium of the embodiment of the present invention stores a program file 61 capable of implementing all the methods described above, wherein the program file 61 may be stored in the storage medium in the form of a software product, and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute all or part of the steps of the methods of the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a mobile hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, or terminal devices, such as a computer, a server, a mobile phone, and a tablet.
In the embodiments provided by the present invention, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and an actual implementation may divide them differently: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. The above description is only an embodiment of the present invention and is not intended to limit the scope of the present invention. All equivalent structures or equivalent process modifications made using the contents of this specification and the accompanying drawings, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of protection of the present invention.

Claims (10)

1. A method for audio dereverberation in a conference system, comprising the steps of:
calculating a first reverberation time under an audio scene by using an acoustic echo path, and calculating a second reverberation time under the audio scene by using an audio signal received by a microphone;
calculating a path deviation of the acoustic echo path, and performing weight distribution on the first reverberation time and the second reverberation time according to the path deviation, so that the first reverberation time and the second reverberation time after weight distribution are the same, and defining the first reverberation time and the second reverberation time after weight distribution as a third reverberation time;
and performing dereverberation processing on the audio signal according to the third reverberation time.
2. The method of claim 1, wherein the calculating the first reverberation time of the audio scene using the acoustic echo path comprises:
and acquiring the acoustic echo path between a loudspeaker and a microphone under the audio scene by using an echo cancellation algorithm.
3. The audio dereverberation method in a conference system according to claim 2,
said weight assigning the first and second reverberation times according to the path deviation further comprises:
if the path deviation is larger than a set path deviation threshold value, indicating that the convergence of the echo cancellation algorithm is unsuccessful, distributing a smaller weight to the first reverberation time; if the path deviation is smaller than a set path deviation threshold value, indicating that the convergence of the echo cancellation algorithm is successful, allocating a larger weight to the first reverberation time;
the weight of the second reverberation time is: the total weight minus the weight assigned to the first reverberation time.
4. The audio dereverberation method in a conference system as claimed in any one of claims 1 to 3, wherein the dereverberation processing of the audio signal according to the third reverberation time further comprises:
and calculating the late reverberation power spectral density of the audio signal by using the third reverberation time.
5. The method of claim 4, wherein the dereverberating the audio signal according to the third reverberation time further comprises:
and carrying out short-time Fourier transform on the audio signal to obtain the representation of the audio signal on a short-time frequency domain, and calculating the noise power spectral density of the audio signal by using a noise estimation algorithm.
6. The method of claim 5, wherein the dereverberating the audio signal according to the third reverberation time further comprises:
and based on the calculation results of the noise power spectral density and the late reverberation power spectral density, performing voice enhancement processing on each frequency point in the audio signal by using a voice enhancement mode to eliminate a reverberation part in the audio signal.
7. The method as claimed in claim 6, wherein the speech enhancement mode comprises spectral subtraction, Wiener filtering, or an MMSE estimator.
8. An apparatus for audio dereverberation in a conference system, the apparatus comprising:
a first reverberation time estimation module: for calculating a first reverberation time in the audio scene using the acoustic echo path;
a second reverberation time estimation module: for calculating a second reverberation time in the audio scene using the audio signal received by the microphone;
a weight assignment module: for calculating a path deviation of the acoustic echo path, and performing weight distribution on the first reverberation time and the second reverberation time according to the path deviation, so that the first reverberation time and the second reverberation time after weight distribution are the same, and the first reverberation time and the second reverberation time after weight distribution are both defined as a third reverberation time;
a voice enhancement module: for dereverberating the audio signal in accordance with the third reverberation time.
9. An audio dereverberation apparatus in a conference system, the apparatus comprising a processor, a memory coupled to the processor, wherein,
the memory stores program instructions for implementing an audio dereverberation method in a conference system as claimed in any one of claims 1 to 7;
the processor is to execute the program instructions stored by the memory to dereverberate an audio signal.
10. A storage medium having stored thereon program instructions executable by a processor to perform the method of audio dereverberation in a conference system as claimed in any one of claims 1 to 7.
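
As an illustration of the steps recited in claims 1 to 7, the following Python sketch shows one way the weighting and enhancement steps could fit together. It is not the patented implementation. The weighted-sum fusion, the threshold and weight values, the exponential-decay model used for the late reverberation power spectral density, the Wiener-style gain rule, and all function and parameter names are assumptions introduced here for illustration. The short-time Fourier transform and the noise estimation of claim 5 are assumed to be done by the caller.

```python
import numpy as np


def fuse_reverberation_time(t60_echo_path, t60_mic, path_deviation,
                            deviation_threshold=0.1, high_weight=0.8,
                            total_weight=1.0):
    """Combine the two reverberation-time estimates into a third one.

    Follows the rule of claim 3: a path deviation above the threshold is
    taken to mean the echo canceller has not converged, so the echo-path
    estimate gets the smaller weight. The threshold, the weight values and
    the weighted-sum form are assumptions; a deviation exactly equal to the
    threshold is treated here as converged.
    """
    low_weight = total_weight - high_weight
    w_first = low_weight if path_deviation > deviation_threshold else high_weight
    w_second = total_weight - w_first  # claim 3: total weight minus first weight
    return (w_first * t60_echo_path + w_second * t60_mic) / total_weight


def late_reverb_psd(power_spec, t60, frame_hop_s, delay_frames):
    """Estimate the late reverberation PSD from a power spectrogram.

    Assumed model: reverberant energy decays by 60 dB over t60 seconds, so
    the late part at frame l is the power at frame l - delay_frames scaled
    by exp(-2 * delta * delay), with delta = 3 * ln(10) / t60.
    power_spec has shape (frames, frequency_bins).
    """
    if delay_frames < 1:
        raise ValueError("delay_frames must be at least 1")
    delta = 3.0 * np.log(10.0) / t60
    decay = np.exp(-2.0 * delta * delay_frames * frame_hop_s)
    late = np.zeros_like(power_spec)
    late[delay_frames:] = decay * power_spec[:-delay_frames]
    return late


def enhancement_gain(power_spec, late_psd, noise_psd, gain_floor=0.1, eps=1e-12):
    """Per-bin Wiener-style gain that suppresses late reverberation and noise.

    The gain floor limits musical-noise artefacts; its value is an assumption.
    """
    interference = late_psd + noise_psd
    gain = 1.0 - interference / np.maximum(power_spec, eps)
    return np.clip(gain, gain_floor, 1.0)


if __name__ == "__main__":
    # Synthetic power spectrogram standing in for |STFT|^2 of a microphone signal.
    rng = np.random.default_rng(0)
    power = rng.uniform(0.5, 2.0, size=(50, 129))
    noise = np.full_like(power, 0.05)  # stand-in for a noise PSD estimate

    t60 = fuse_reverberation_time(0.6, 0.4, path_deviation=0.05)
    late = late_reverb_psd(power, t60, frame_hop_s=0.016, delay_frames=3)
    gain = enhancement_gain(power, late, noise)
    print(t60, gain.shape)
```

In such a sketch the gain would be multiplied with the complex short-time spectrum of the audio signal before the inverse transform. Spectral subtraction or an MMSE estimator, as named in claim 7, could replace the gain rule without changing the other two functions.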
CN202010160669.2A 2020-03-10 2020-03-10 Audio dereverberation method, device and storage medium in conference system Active CN111445916B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010160669.2A CN111445916B (en) 2020-03-10 2020-03-10 Audio dereverberation method, device and storage medium in conference system

Publications (2)

Publication Number Publication Date
CN111445916A true CN111445916A (en) 2020-07-24
CN111445916B CN111445916B (en) 2022-10-28

Family

ID=71627390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010160669.2A Active CN111445916B (en) 2020-03-10 2020-03-10 Audio dereverberation method, device and storage medium in conference system

Country Status (1)

Country Link
CN (1) CN111445916B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006157498A (en) * 2004-11-30 2006-06-15 Matsushita Electric Ind Co Ltd Sound echo canceller, hands free telephone using the same, and sound echo canceling method
JP2006270368A (en) * 2005-03-23 2006-10-05 Yamaha Corp Howling canceler
US20090154692A1 (en) * 2007-12-13 2009-06-18 Sony Corporation Voice processing apparatus, voice processing system, and voice processing program
CN103262163A (en) * 2010-10-25 2013-08-21 弗兰霍菲尔运输应用研究公司 Echo suppression comprising modeling of late reverberation components
CN106128451A (en) * 2016-07-01 2016-11-16 北京地平线机器人技术研发有限公司 Method for voice recognition and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
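Yuan Jianeng et al.: "An improved dual-channel filtering echo cancellation algorithm: the oversampled delayless subband method", Audio Engineering (《电声技术》) *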

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023040456A1 (en) * 2021-09-20 2023-03-23 International Business Machines Corporation Dynamic mute control for web conferencing
US11838340B2 (en) 2021-09-20 2023-12-05 International Business Machines Corporation Dynamic mute control for web conferencing
CN115665642A (en) * 2022-12-12 2023-01-31 杭州兆华电子股份有限公司 Noise elimination method and system
CN115665642B (en) * 2022-12-12 2023-03-17 杭州兆华电子股份有限公司 Noise elimination method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant