CN115762547A - Method, device, coding method, medium and equipment for detecting and eliminating noise - Google Patents


Info

Publication number
CN115762547A
Authority
CN
China
Prior art keywords
noise
current frame
pseudo
audio
flatness
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211212786.4A
Other languages
Chinese (zh)
Inventor
李强
王尧
叶东翔
朱勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Barrot Wireless Co Ltd
Original Assignee
Barrot Wireless Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Barrot Wireless Co Ltd filed Critical Barrot Wireless Co Ltd
Priority to CN202211212786.4A priority Critical patent/CN115762547A/en
Publication of CN115762547A publication Critical patent/CN115762547A/en
Pending legal-status Critical Current

Abstract

The application discloses a method, a device, an encoding method, a medium and equipment for detecting and eliminating noise, belonging to the technical field of Bluetooth. The method comprises the following steps: acquiring the original frequency-domain spectral coefficients of the current frame audio while a codec with modified discrete cosine transform (MDCT) processing encodes or decodes the audio; calculating a pseudo-spectral flatness from the original frequency-domain spectral coefficients; when the pseudo-spectral flatness is greater than a preset threshold, setting a noise update speed parameter according to the pseudo-spectral flatness and calculating the noise energy of the current frame audio; and calculating updated spectral coefficients of the current frame audio from the original frequency-domain spectral coefficients and the noise energy, and performing the subsequent encoding or decoding process with the updated spectral coefficients. The method reuses the existing encoding process in the encoder and uses the pseudo-spectral flatness to set the noise update speed parameter so that the transition between frames stays smooth, which reduces power consumption and delay and improves the user experience.

Description

Method, device, coding method, medium and equipment for detecting and eliminating noise
Technical Field
The present application relates to the field of Bluetooth technology, and in particular, to a method, an apparatus, an encoding method, a medium, and a device for detecting and eliminating noise.
Background
Hiss noise is a stationary additive noise that spans the full frequency band and sounds like hissing. It is often introduced when old recordings are transferred to digital storage, and it is also easily introduced by recording equipment during live broadcasting, which degrades the user experience. To eliminate such noise, the prior art includes the following processing flow for suppressing Hiss noise: 1. inputting the audio signal to be processed (PCM format); 2. analysis windowing; 3. Fourier transform; 4. frame type identification; 5. Hiss noise estimation and updating; 6. spectral gain calculation; 7. spectrum smoothing; 8. multiplication; 9. inverse Fourier transform; 10. synthesis windowing; 11. overlap-add; 12. outputting the audio signal with the Hiss noise suppressed. Bluetooth applications require high sound quality, low power consumption and low latency. First, this flow has many complex processing steps, which increases the power consumption of the Bluetooth device. Second, it relies on overlap-add to keep the signal smooth between frames, which increases delay, fails to meet the low-latency requirement of Bluetooth, and degrades the user experience.
Disclosure of Invention
The application provides a method, a device, an encoding method, a medium and equipment for detecting and eliminating noise, aiming at the problems of complex flow, high power consumption and high delay when Hiss noise is detected and eliminated.
In a first aspect, the present application provides a method for detecting and eliminating noise, including: acquiring the original frequency-domain spectral coefficients corresponding to the current frame audio while a codec with modified discrete cosine transform (MDCT) processing encodes or decodes the audio; calculating the pseudo-spectral flatness of the current frame audio according to the original frequency-domain spectral coefficients; setting a noise update speed parameter according to the pseudo-spectral flatness when the pseudo-spectral flatness is greater than a preset threshold; calculating the noise energy of the current frame audio according to the noise update speed parameter; and calculating updated spectral coefficients of the current frame audio according to the original frequency-domain spectral coefficients and the noise energy, and performing the subsequent encoding or decoding process according to the updated spectral coefficients.
Optionally, acquiring the original frequency-domain spectral coefficients corresponding to the current frame audio while the codec with MDCT processing encodes or decodes the audio includes: performing, in the standard encoding or decoding process of the codec, the modified discrete cosine transform on the current frame audio to obtain the original frequency-domain spectral coefficients.
Optionally, calculating the pseudo-spectral flatness of the current frame audio according to the original frequency domain spectral coefficients, including: calculating a pseudo spectrum corresponding to the current frame audio according to the original frequency domain spectral coefficient; calculating and obtaining a pseudo-spectrum geometric mean value and a pseudo-spectrum arithmetic mean value corresponding to the current frame audio according to the pseudo-spectrum; and calculating to obtain the pseudospectral flatness according to the pseudospectral geometric mean and the pseudospectral arithmetic mean.
Optionally, setting a noise update speed parameter according to the pseudospectral flatness under the condition that the pseudospectral flatness is greater than a preset threshold, including: and setting a noise updating speed parameter according to the size of the pseudospectral flatness, wherein the size of the noise updating speed parameter is in direct proportion to the size of the pseudospectral flatness.
Optionally, setting a noise update speed parameter according to the pseudo-spectral flatness when the pseudo-spectral flatness is greater than a preset threshold further includes: when the pseudo-spectral flatness is greater than the preset threshold, determining whether any audio frame among a preset number of frames preceding the current frame audio has a pseudo-spectral flatness smaller than the preset threshold; and if not, setting the noise update speed parameter according to the magnitude of the pseudo-spectral flatness of the current frame audio.
Optionally, calculating the noise energy of the current frame audio according to the noise update speed parameter includes: selecting a non-speech frequency band of the current frame audio, and calculating a pseudo-spectral median corresponding to the current frame audio; and calculating to obtain the current frame moving average energy corresponding to the current frame audio according to the pseudo-spectral median, the noise updating speed parameter and the moving average energy of the previous frame audio, and taking the current frame moving average energy as the noise energy of the current frame audio.
In a second aspect, the present application provides an apparatus for detecting and eliminating noise, comprising: a frequency-domain spectral coefficient acquisition module, which acquires the original frequency-domain spectral coefficients corresponding to the current frame audio while a codec with modified discrete cosine transform (MDCT) processing encodes or decodes the audio; a pseudo-spectral flatness calculation module, which calculates the pseudo-spectral flatness of the current frame audio according to the original frequency-domain spectral coefficients; a noise update speed parameter determination module, which sets a noise update speed parameter according to the pseudo-spectral flatness when the pseudo-spectral flatness is greater than a preset threshold; a noise energy calculation module, which calculates the noise energy of the current frame audio according to the noise update speed parameter; and a spectral coefficient updating module, which calculates updated spectral coefficients of the current frame audio according to the original frequency-domain spectral coefficients and the noise energy, and performs the subsequent encoding or decoding process according to the updated spectral coefficients.
In a third aspect, the present application provides an audio encoding method, including: acquiring the original frequency-domain spectral coefficients corresponding to the current frame audio while an encoder with modified discrete cosine transform (MDCT) processing encodes the audio; calculating the pseudo-spectral flatness of the current frame audio according to the original frequency-domain spectral coefficients; setting a noise update speed parameter according to the pseudo-spectral flatness when the pseudo-spectral flatness is greater than a preset threshold; calculating the noise energy of the current frame audio according to the noise update speed parameter; calculating updated spectral coefficients of the current frame audio according to the original frequency-domain spectral coefficients and the noise energy; and updating the other encoding modules in the encoder according to the updated spectral coefficients, and continuing to encode the current frame audio with the updated encoding modules.
In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program, wherein the computer program is operative to perform the method of detecting and eliminating noise of the first aspect or the audio encoding method of the third aspect.
In a fifth aspect, the present application provides a computer apparatus comprising a processor and a memory, the memory storing a computer program, wherein the processor runs the computer program to perform the method of detecting and eliminating noise of the first aspect or the audio encoding method of the third aspect.
The method for detecting and eliminating the noise encodes the audio data by utilizing the existing encoding process in the encoder, sets the noise updating speed parameter by utilizing the pseudospectral flatness, performs transition among multiple frames, has the characteristics of low power consumption and low delay, and improves the use experience of users.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below illustrate some embodiments of the present application by way of example.
FIG. 1 is a schematic diagram illustrating one embodiment of a method of detecting and canceling noise according to the present application;
FIG. 2 is a diagram illustrating a section of noise and its corresponding noise flatness according to the present application;
FIG. 3 is a schematic diagram of a section of a human voice and its corresponding flatness of the human voice spectrum of the present application;
FIG. 4 is a schematic diagram illustrating one embodiment of an apparatus for detecting and canceling noise according to the present disclosure;
FIG. 5 is a schematic diagram illustrating an embodiment of an audio encoding method of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
The following detailed description of the preferred embodiments of the present application, taken in conjunction with the accompanying drawings, will provide those skilled in the art with a better understanding of the advantages and features of the present application, and will make the scope of the present application more clear and definite.
It should be noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
Hiss noise is a stationary additive noise that spans the full frequency band and sounds like hissing. It is often introduced when old recordings are transferred to digital storage, and it is also easily introduced by recording equipment during live broadcasting, which degrades the user experience. To eliminate such noise, the prior art includes the following processing flow for suppressing Hiss noise: 1. inputting the audio signal to be processed (PCM format); 2. analysis windowing; 3. Fourier transform; 4. frame type identification; 5. Hiss noise estimation and updating; 6. spectral gain calculation; 7. spectrum smoothing; 8. multiplication; 9. inverse Fourier transform; 10. synthesis windowing; 11. overlap-add; 12. outputting the audio signal with the Hiss noise suppressed. Bluetooth applications require high sound quality, low power consumption and low delay. First, this flow has many complex processing steps, which increases the power consumption of the Bluetooth device. Second, it relies on overlap-add to keep the signal smooth between frames, which increases delay, fails to meet the low-delay requirement of Bluetooth, and degrades the user experience.
In view of the above problems, the present application provides a method, an apparatus, an encoding method, a medium, and a device for detecting and eliminating noise, where the method includes: acquiring the original frequency-domain spectral coefficients corresponding to the current frame audio while a codec with modified discrete cosine transform (MDCT) processing encodes or decodes the audio; calculating the pseudo-spectral flatness of the current frame audio according to the original frequency-domain spectral coefficients; setting a noise update speed parameter according to the pseudo-spectral flatness when the pseudo-spectral flatness is greater than a preset threshold; calculating the noise energy of the current frame audio according to the noise update speed parameter; and calculating updated spectral coefficients of the current frame audio according to the original frequency-domain spectral coefficients and the noise energy, and performing the subsequent encoding or decoding process according to the updated spectral coefficients. The technical solution of the application is mainly directed at Hiss noise. Noise comes in various types, including stationary and non-stationary noise and additive and multiplicative noise; the application is also effective for other additive stationary noise.
The method reuses the existing encoding or decoding process of a codec with modified discrete cosine transform (MDCT) processing to obtain the original frequency-domain spectral coefficients, derives the pseudo-spectral flatness from them, and uses the pseudo-spectral flatness to set the noise update speed parameter so that the transition between frames stays smooth. It therefore has low power consumption and low delay, and improves the user experience.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific examples. The specific embodiments described below can be combined with each other to form new embodiments. The same or similar ideas or processes described in one embodiment may not be repeated in other embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram illustrating one embodiment of a method for detecting and canceling noise according to the present application.
In the embodiment shown in fig. 1, the method for detecting and removing noise of the present application includes a process S101, where in the process of encoding or decoding the audio by the codec with the modified discrete cosine transform process, the original frequency-domain spectral coefficients corresponding to the current frame audio are obtained.
In this embodiment, the method reuses the existing standard encoding or decoding process of a codec with modified discrete cosine transform (MDCT) processing: the original frequency-domain spectral coefficients of the current frame audio are obtained during encoding or decoding, and the subsequent noise detection is performed on those coefficients.
Optionally, acquiring the original frequency-domain spectral coefficients corresponding to the current frame audio while the codec with MDCT processing encodes or decodes the audio includes: performing, in the standard encoding or decoding process of the codec, the modified discrete cosine transform on the current frame audio to obtain the original frequency-domain spectral coefficients.
In this optional embodiment, for convenience of description, the technical solution is described using the example of an LC3 encoder, as a codec with MDCT processing, encoding audio. The process is similar for other codecs with MDCT processing.
The existing MDCT module in the LC3 encoder transforms the current frame audio to obtain the corresponding original frequency-domain spectral coefficients, which are then used to detect Hiss noise in the audio data. As noted in the background, the prior-art approach to eliminating Hiss noise includes steps such as analysis windowing, Fourier transform and inverse Fourier transform, and therefore suffers from high power consumption, high delay and poor user experience. The present method relies on the existing encoding modules in the encoder, so these steps can be omitted, which reduces power consumption and lowers the storage and computing requirements of the whole encoding system. Moreover, the Fourier transform, inverse transform, synthesis window and analysis window of the prior-art scheme introduce certain errors; because the present method omits these modules, it avoids those errors as well as the impact of high delay on sound quality. In addition, the lower computing requirement helps preserve the battery life of Bluetooth embedded devices.
Specifically, the acquisition of the original frequency-domain spectral coefficients is described using the example of an LC3 encoder that encodes audio data with a frame length of 10 ms at a sampling rate of 48 kHz. Following the standard encoding process of the LC3 encoder, the low-delay modified discrete cosine transform (LD-MDCT) is computed on the input audio data with a frame length of 10 ms to obtain the corresponding original frequency-domain spectral coefficients. The audio data of the current frame is $x_s(n)$, $n = 0, 1, 2, \ldots, N_F - 1$, and:
$$t(n) = x_s(Z - N_F + n), \quad n = 0, \ldots, 2 N_F - 1 - Z$$
$$t(2 N_F - Z + n) = 0, \quad n = 0, \ldots, Z - 1$$
[Equation image not reproduced in this text: the LD-MDCT formula that computes $X(k)$ from $t(n)$ and the window $w_{N_{ms}\_N_F}(n)$]
In the above formulas, based on the LC3 standard specification, $N_F$ is 480, $Z$ is 180, $w_{N_{ms}\_N_F}(n)$ is the low-delay MDCT window, and $X(k)$ denotes the frequency-domain spectral coefficients corresponding to the time-domain audio data $x_s(n)$ of the current frame.
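The buffer construction and transform described above can be sketched as follows. This is a minimal illustration, not the LC3 reference implementation: the function name `ld_mdct` is hypothetical, the cosine kernel and the $\sqrt{2/N_F}$ scaling follow the LD-MDCT form commonly given for LC3 and should be checked against the specification, and a plain sine window stands in for the tabulated LC3 window $w_{N_{ms}\_N_F}(n)$, which is not reproduced here.

```python
import numpy as np

def ld_mdct(prev_tail, frame, z, window=None):
    """Direct O(N^2) low-delay MDCT sketch.

    prev_tail: the last (N_F - Z) samples before the current frame.
    frame:     the N_F samples of the current frame.
    z:         number of trailing zeros Z in the analysis buffer.
    window:    analysis window of length 2*N_F (assumption: a sine window
               is used when none is given; LC3 uses its own table).
    """
    frame = np.asarray(frame, dtype=float)
    prev_tail = np.asarray(prev_tail, dtype=float)
    n_f = frame.size
    if prev_tail.size != n_f - z:
        raise ValueError("prev_tail must hold N_F - Z samples")

    # t(n) = x_s(Z - N_F + n) for n < 2*N_F - Z, then Z zeros.
    t = np.concatenate([prev_tail, frame, np.zeros(z)])

    n = np.arange(2 * n_f)
    if window is None:
        window = np.sin(np.pi * (n + 0.5) / (2 * n_f))  # stand-in window

    k = np.arange(n_f)[:, None]
    basis = np.cos(np.pi / n_f * (n[None, :] + 0.5 + n_f / 2) * (k + 0.5))
    return np.sqrt(2.0 / n_f) * (basis @ (window * t))
```

With the values from the text, `prev_tail` would hold 480 - 180 = 300 samples and `frame` 480 samples, giving 480 coefficients per 10 ms frame. A real encoder already computes these coefficients, which is the point of the method: no extra transform is needed.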
In the embodiment shown in fig. 1, the method for detecting and removing noise of the present application includes a process S102 of calculating a pseudo-spectral flatness of the audio of the current frame according to the original frequency-domain spectral coefficients.
Optionally, calculating the pseudo-spectral flatness of the current frame audio according to the original frequency-domain spectral coefficients includes: calculating the pseudo spectrum corresponding to the current frame audio from the original frequency-domain spectral coefficients; calculating the geometric mean and the arithmetic mean of the pseudo spectrum; and calculating the pseudo-spectral flatness from the pseudo-spectral geometric mean and the pseudo-spectral arithmetic mean.
Specifically, the pseudo-spectral flatness of the current frame audio is calculated from the original frequency-domain spectral coefficients as follows:
Calculating the pseudo spectrum:
[Equation image not reproduced in this text: definition of the pseudo spectrum $X_{pseudo}(k)$ in terms of $X(k)$]
where $X(k) = 0$ when $k = -1$ or $k = N_F$.
Calculating the pseudo-spectral geometric mean:
[Equation image not reproduced in this text: geometric mean of $X_{pseudo}(k)$]
Calculating the pseudo-spectral arithmetic mean:
[Equation image not reproduced in this text: arithmetic mean of $X_{pseudo}(k)$]
Calculating the pseudo-spectral flatness:
[Equation image not reproduced in this text: pseudo-spectral flatness computed from the geometric mean and the arithmetic mean]
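The steps above can be sketched as follows. Two assumptions are made because the formulas themselves are not available here: the pseudo spectrum is taken as $\sqrt{X(k)^2 + (X(k+1) - X(k-1))^2}$, a commonly used MDCT pseudo-spectrum form that is consistent with the boundary condition $X(-1) = X(N_F) = 0$, and the flatness is taken as the usual ratio of geometric mean to arithmetic mean over all $N_F$ bins. The function names and the small `eps` guard are hypothetical.

```python
import numpy as np

def pseudo_spectrum(x):
    """Assumed pseudo spectrum: sqrt(X(k)^2 + (X(k+1) - X(k-1))^2)."""
    x = np.asarray(x, dtype=float)
    padded = np.concatenate(([0.0], x, [0.0]))  # X(-1) = X(N_F) = 0
    diff = padded[2:] - padded[:-2]             # X(k+1) - X(k-1)
    return np.sqrt(x ** 2 + diff ** 2)

def spectral_flatness(ps, eps=1e-12):
    """Geometric mean divided by arithmetic mean (value in (0, 1])."""
    ps = np.asarray(ps, dtype=float) + eps      # eps avoids log(0)
    geometric_mean = np.exp(np.mean(np.log(ps)))
    arithmetic_mean = np.mean(ps)
    return geometric_mean / arithmetic_mean
```

A flat, noise-like spectrum gives a flatness close to 1, while a spectrum dominated by a few strong peaks gives a value close to 0, which is what makes the measure usable for separating tonal frames from noise-like frames.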
In the embodiment shown in fig. 1, the method for detecting and eliminating noise of the present application includes a process S103, where a noise update speed parameter is set according to the pseudo-spectral flatness if the pseudo-spectral flatness is greater than a preset threshold.
In this embodiment, when noise removal of audio is performed, because noise data in an audio frame is removed, there may be a difference in the degree of noise removal between two consecutive frames, so that the originally smooth audio fluctuates greatly, thereby reducing the user experience. In order to ensure the smoothness of transition among multiple frames after noise processing, the noise updating speed parameter is set in the process of noise elimination so as to adjust the degree of noise elimination among different audio frames, ensure the audio tone quality and eliminate noise.
Specifically, the preset threshold is set to 0.15. The threshold distinguishes whether the current frame audio is a tonal frame (pitch frame) of normal audio or a frame containing noise. If the pseudo-spectral flatness is less than the preset threshold, the current frame is a tonal signal, that is, a tonal frame, and no noise elimination is needed; if the pseudo-spectral flatness is greater than or equal to the preset threshold, the frame is a noise-containing frame and noise elimination is required.
Optionally, under the condition that the pseudo-spectral flatness is greater than the preset threshold, setting a noise update speed parameter according to the pseudo-spectral flatness, including: and setting a noise updating speed parameter according to the size of the pseudospectral flatness, wherein the size of the noise updating speed parameter is in direct proportion to the size of the pseudospectral flatness.
In this alternative embodiment, when determining the noise update speed parameter, reasonable setting is performed according to the magnitude of the pseudo-spectral flatness. Wherein, the magnitude of the noise updating speed parameter is in direct proportion to the magnitude of the pseudospectral flatness. Correspondingly setting a larger noise updating speed parameter when the pseudo-spectral flatness of the current frame audio data is larger; the smaller the pseudospectral flatness of the current frame audio data is, the smaller the noise update speed parameter is correspondingly set.
Optionally, setting a noise update speed parameter according to the pseudo-spectral flatness when the pseudo-spectral flatness is greater than a preset threshold further includes: when the pseudo-spectral flatness is greater than the preset threshold, determining whether any audio frame among a preset number of frames preceding the current frame audio has a pseudo-spectral flatness smaller than the preset threshold; and if not, setting the noise update speed parameter according to the magnitude of the pseudo-spectral flatness of the current frame audio.
In this optional embodiment, an additional check is applied when the pseudo-spectral flatness of the current frame audio is greater than or equal to the preset threshold and the frame would be judged to contain noise. For speech, the unvoiced sound that follows a voiced sound generally has a spectral flatness similar to that of noise. To make the noise processing more accurate, a tonal-frame (pitch-frame) delay factor is set: if the current frame is a tonal frame, the following frames within the preset number are also treated as tonal frames regardless of their spectral flatness. If a frame is a tonal frame, Hiss noise is absent or its energy is very small, so the spectral coefficients are not updated, which avoids damaging the sound quality. The noise calculation is still updated, however, so that a reasonable noise estimate is available when the spectral coefficients of a later frame need to be updated. Specifically, the preset number of frames may be set to 5. The preset threshold and the preset number of frames may be adjusted according to actual processing requirements, and the present application does not specifically limit them.
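The threshold test combined with the tonal-frame delay can be sketched as a small stateful classifier. The class and attribute names are hypothetical, the default values 0.15 and 5 are the examples given in the text, and treating the very first frames (which have no history) as having no preceding tonal frame is an assumption of this sketch.

```python
class FrameClassifier:
    """Decides whether a frame should receive noise suppression."""

    def __init__(self, threshold=0.15, hangover_frames=5):
        self.threshold = threshold
        self.hangover_frames = hangover_frames
        # Number of frames seen since the last tonal frame; start
        # "expired" so that an initial noisy frame is processed.
        self._since_tonal = hangover_frames

    def is_noise_frame(self, flatness):
        if flatness < self.threshold:
            # Tonal frame: no suppression, restart the delay counter.
            self._since_tonal = 0
            return False
        # Flatness is high: suppress only if none of the preceding
        # `hangover_frames` frames was tonal.
        noisy = self._since_tonal >= self.hangover_frames
        self._since_tonal += 1
        return noisy
```

For example, after one tonal frame the next five high-flatness frames are still treated as tonal, and suppression resumes on the sixth. As the text notes, the noise estimate itself can keep being updated during tonal frames even though the spectral coefficients are left untouched.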
The noise update speed α is set according to the spectral flatness: the lower the spectral flatness, the stronger the tonal component and the slower the noise update speed, and vice versa. Specifically, α may be set to 0.05 when the spectral flatness is 0.1 and to 0.25 when the spectral flatness is 0.8; other values follow from the linear relationship through these two points:
$$\alpha = 0.05 + \frac{0.25 - 0.05}{0.8 - 0.1}\,(\text{flatness} - 0.1)$$
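As a worked example, the linear relationship can be written as a function that interpolates between the two anchor points stated in the text, (0.1, 0.05) and (0.8, 0.25). The function name is hypothetical, and no clamping is applied outside that range because the text does not specify any.

```python
def noise_update_speed(flatness):
    """Linear map from spectral flatness to the update speed alpha."""
    f_lo, a_lo = 0.1, 0.05   # anchor point from the text
    f_hi, a_hi = 0.8, 0.25   # anchor point from the text
    slope = (a_hi - a_lo) / (f_hi - f_lo)   # 0.2 / 0.7, about 0.286
    return a_lo + slope * (flatness - f_lo)
```

At the detection threshold of 0.15 this gives α of roughly 0.064, so frames that only just exceed the threshold change the noise estimate slowly.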
Fig. 2 shows a schematic diagram of a section of noise and its corresponding noise flatness according to the present application. Fig. 3 shows a schematic diagram of a section of human voice and its corresponding human voice spectrum flatness.
In the embodiment shown in fig. 1, the method for detecting and removing noise of the present application includes a process S104 of calculating noise energy of the current frame audio according to the noise update speed parameter.
Optionally, calculating the noise energy of the current frame audio according to the noise update speed parameter includes: selecting a non-speech frequency band of the current frame audio, and calculating a pseudo-spectral median corresponding to the current frame audio; and calculating to obtain the current frame moving average energy corresponding to the current frame audio according to the pseudo-spectral median, the noise updating speed parameter and the moving average energy of the previous frame audio, and taking the current frame moving average energy as the noise energy of the current frame audio.
Specifically, a non-speech frequency band is selected and the median of the pseudo spectrum in that band is calculated. The main energy of speech is usually concentrated in the range of 300 Hz to 3400 Hz. Taking the currently configured sampling rate of 48 kHz as an example, the effective audio bandwidth is 24 kHz, so each of the 480 spectral coefficients covers 50 Hz. The band from 8 kHz to 20 kHz can be selected to calculate and detect Hiss noise, and the corresponding spectral coefficient index k ranges from 160 to 400.
Median value of the non-speech band pseudo spectrum of the current frame:
$$X_{pseudo\text{-}Med} = \mathrm{median}\big(X_{pseudo}(k)\big), \quad k = 160, \ldots, 400$$
where the $\mathrm{median}(\cdot)$ operation takes the median value.
The moving average energy of the current frame is:
$$P_{ma} = \alpha \cdot P_{ma\text{-}last} + (1 - \alpha) \cdot X_{pseudo\text{-}Med}$$
where $P_{ma\text{-}last}$ is the moving average energy of the previous frame, and $\alpha$ is a moving average factor, which is used to control the noise update speed and avoid abrupt changes in the inter-frame noise estimate, typically taking a value of 0.95.
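The band selection, median and moving average can be sketched as follows. The function names are hypothetical, and the defaults (48 kHz, 480 coefficients, bins 160 to 400 inclusive) are the example values from the text. The text gives both a typical factor of 0.95 and a flatness-dependent setting for α; this sketch does not try to reconcile the two and simply takes α as an argument.

```python
import numpy as np

def bin_index(freq_hz, sample_rate=48000, n_f=480):
    """Spectral coefficient index for a frequency (50 Hz per bin here)."""
    bin_width = (sample_rate / 2) / n_f
    return int(round(freq_hz / bin_width))

def band_median(pseudo_spec, k_lo=160, k_hi=400):
    """Median of the pseudo spectrum over bins k_lo..k_hi inclusive."""
    pseudo_spec = np.asarray(pseudo_spec, dtype=float)
    return float(np.median(pseudo_spec[k_lo:k_hi + 1]))

def update_noise_energy(p_last, x_med, alpha):
    """P_ma = alpha * P_ma_last + (1 - alpha) * X_pseudo_Med."""
    return alpha * p_last + (1.0 - alpha) * x_med
```

Using the median rather than the mean keeps a few strong high-frequency partials (cymbals, sibilants) from inflating the noise estimate, since they occupy only a minority of the bins in the band.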
In the embodiment shown in fig. 1, the method for detecting and removing noise of the present application includes a process S105, calculating updated spectral coefficients of the current frame audio according to the original frequency domain spectral coefficients and the noise energy, and performing a subsequent encoding or decoding process according to the updated spectral coefficients.
Specifically, as mentioned above, when the spectral flatness is greater than 0.15, which indicates that there is noise in the current frame audio, the noise processing is started, and the noise is removed by updating the spectral coefficients:
[Equation image not reproduced in this text: formula for the updated spectral coefficients in terms of the original spectral coefficients $X(k)$ and the noise energy $P_{ma}$]
The spectral coefficients are updated as shown in the above formula. After the new spectral coefficients are obtained, the remaining modules of the encoder complete the encoding process according to the LC3 standard specification, including spectral noise shaping, temporal noise shaping, quantization and noise level estimation, arithmetic and residual coding, and bitstream packing.
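For illustration only, the sketch below shows one common way such an update can be done: a per-bin spectral-subtraction gain with a floor. This is an assumption standing in for the update formula referred to above, which is not available here, and it should not be read as the method of this application. The function name, the `gain_floor` parameter and the use of the pseudo spectrum as the per-bin level are all hypothetical.

```python
import numpy as np

def suppress_noise(x, pseudo_spec, noise_energy, gain_floor=0.1, eps=1e-12):
    """Scale each MDCT coefficient by a spectral-subtraction style gain.

    x:            original spectral coefficients X(k)
    pseudo_spec:  per-bin level estimate (same length as x)
    noise_energy: scalar noise estimate for the frame
    """
    x = np.asarray(x, dtype=float)
    pseudo_spec = np.asarray(pseudo_spec, dtype=float)
    gain = 1.0 - noise_energy / (pseudo_spec + eps)
    gain = np.clip(gain, gain_floor, 1.0)  # floor limits musical noise
    return gain * x
```

Because the gain is real and applied to the MDCT coefficients directly, the signs of the coefficients are preserved and the rest of the encoder can run unchanged on the result.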
The method for detecting and eliminating noise can be applied in the encoding process of an encoder, or the decoding process of a decoder, that uses modified discrete cosine transform (MDCT) processing, to detect and eliminate Hiss noise. In addition, the method reuses the existing encoding process in the encoder to encode the audio data and sets the noise update speed parameter by using the pseudo-spectral flatness, so that the noise estimate transitions smoothly across multiple frames; it has the characteristics of low power consumption and low delay, and improves the user experience.
Fig. 4 shows an embodiment of the apparatus for detecting and removing noise according to the present application.
In the embodiment shown in fig. 4, the apparatus for detecting and eliminating noise of the present application includes: a frequency-domain spectral coefficient obtaining module 401, which obtains an original frequency-domain spectral coefficient corresponding to the current frame audio in the process of encoding or decoding the audio by the codec with the modified discrete cosine transform processing; a pseudo-spectral flatness calculation module 402, which calculates the pseudo-spectral flatness of the current frame audio according to the original frequency domain spectral coefficients; a noise update speed parameter determination module 403, configured to set a noise update speed parameter according to the pseudo-spectrum flatness if the pseudo-spectrum flatness is greater than a preset threshold; a noise energy calculating module 404, which calculates the noise energy of the current frame audio according to the noise update speed parameter; and a spectral coefficient updating module 405, which calculates an updated spectral coefficient of the current frame audio according to the original frequency domain spectral coefficient and the noise energy, and performs a subsequent encoding or decoding process according to the updated spectral coefficient.
Optionally, in the frequency domain spectral coefficient obtaining module 401, in the standard encoding or decoding process of the codec, the modified discrete cosine transform is performed on the current frame audio to obtain the original frequency domain spectral coefficient.
Optionally, in the pseudo-spectrum flatness calculation module 402, a pseudo-spectrum corresponding to the current frame audio is calculated according to the original frequency domain spectral coefficient; calculating and obtaining a pseudo-spectrum geometric mean value and a pseudo-spectrum arithmetic mean value corresponding to the current frame audio according to the pseudo-spectrum; and calculating to obtain the pseudospectral flatness according to the pseudospectral geometric mean and the pseudospectral arithmetic mean.
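The flatness computation described for module 402 is the ratio of the geometric mean to the arithmetic mean of the pseudo-spectrum. The following is a minimal sketch under stated assumptions: the pseudo-spectrum itself is taken as an input, because its formula is not given in this passage, and the function name and the small floor `eps` (used to avoid taking the logarithm of zero) are hypothetical.

```python
import math


def pseudo_spectral_flatness(pseudo_spectrum, eps=1e-12):
    """Ratio of geometric mean to arithmetic mean of the pseudo-spectrum.

    Values close to 1 indicate a flat, noise-like spectrum; values close
    to 0 indicate a peaky, tonal spectrum. `eps` is a guard added for this
    sketch only, so that zero-valued bins do not break the logarithm.
    """
    values = [max(v, eps) for v in pseudo_spectrum]
    # Geometric mean computed in the log domain for numerical stability.
    log_mean = sum(math.log(v) for v in values) / len(values)
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(values) / len(values)
    return geometric_mean / arithmetic_mean


# Usage: a flat spectrum versus one dominated by a single strong bin.
flat = pseudo_spectral_flatness([2.0] * 50)
peaky = pseudo_spectral_flatness([1000.0] + [1e-6] * 99)
```

A flat input gives a value near 1 and therefore exceeds a threshold such as 0.15, while the peaky input stays far below it.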
Optionally, in the noise update speed parameter determining module 403, a noise update speed parameter is set according to the magnitude of the pseudo-spectral flatness, where the magnitude of the noise update speed parameter is in a direct proportional relationship with the magnitude of the pseudo-spectral flatness.
Optionally, in the noise update speed parameter determining module 403, under the condition that the pseudo-spectral flatness is greater than the preset threshold, it is determined whether there is a pseudo-spectral flatness corresponding to an audio frame that is less than the preset threshold in a preset number of frames of audio before the current frame of audio; if not, setting a noise updating speed parameter according to the size of the pseudospectral flatness corresponding to the current frame audio.
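The history check described for module 403 can be sketched with a fixed-length buffer of recent flatness values. Everything named here is an assumption for illustration: the class `NoiseUpdateGate`, the default history length of 5 frames, and the choice to refuse updates until the history buffer is full are not specified in the text, and the mapping from flatness to the actual noise update speed parameter is left out because its exact form is not given.

```python
from collections import deque


class NoiseUpdateGate:
    """Decide whether the noise update speed parameter may be set this frame."""

    def __init__(self, threshold=0.15, history_frames=5):
        self.threshold = threshold
        # Flatness values of the preceding frames (oldest dropped automatically).
        self.history = deque(maxlen=history_frames)

    def should_update(self, flatness):
        # Assumption of this sketch: wait until a full history is available.
        history_full = len(self.history) == self.history.maxlen
        # No preceding frame in the window may fall below the threshold.
        none_below = all(f >= self.threshold for f in self.history)
        decision = flatness > self.threshold and history_full and none_below
        self.history.append(flatness)
        return decision


# Usage: one low-flatness frame blocks updates until it leaves the window.
gate = NoiseUpdateGate(threshold=0.15, history_frames=3)
decisions = [gate.should_update(f)
             for f in [0.3, 0.3, 0.3, 0.3, 0.05, 0.3, 0.3, 0.3, 0.3]]
```

Requiring an uninterrupted run of noisy frames is one way to keep a single noise-like frame inside speech from triggering noise processing.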
Optionally, in the noise energy calculation module 404, a non-speech frequency band of the current frame audio is selected, and a pseudo-spectral median corresponding to the current frame audio is calculated; and calculating to obtain the current frame moving average energy corresponding to the current frame audio according to the pseudo-spectral median, the noise updating speed parameter and the moving average energy of the previous frame audio, and taking the current frame moving average energy as the noise energy of the current frame audio.
The device for detecting and eliminating the noise encodes the audio data by utilizing the existing encoding process in the encoder, sets the noise updating speed parameter by utilizing the pseudospectral flatness, performs transition among multiple frames, has the characteristics of low power consumption and low delay, and improves the use experience of users.
Fig. 5 shows an embodiment of the audio encoding method of the present application.
In the embodiment shown in fig. 5, the audio encoding method of the present application includes: a process S501, obtaining the original frequency domain spectral coefficients corresponding to the current frame audio while an encoder that uses modified discrete cosine transform processing encodes the audio; a process S502, calculating the pseudo-spectral flatness of the current frame audio according to the original frequency domain spectral coefficients; a process S503, setting a noise update speed parameter according to the pseudo-spectral flatness if the pseudo-spectral flatness is greater than a preset threshold; a process S504, calculating the noise energy of the current frame audio according to the noise update speed parameter; a process S505, calculating updated spectral coefficients of the current frame audio according to the original frequency domain spectral coefficients and the noise energy; and a process S506, updating other coding modules in the encoder according to the updated spectral coefficients, and continuing to encode the current frame audio by using the updated coding modules.
Optionally, obtaining the original frequency domain spectral coefficients corresponding to the current frame audio, in the process of encoding or decoding the audio by the codec with modified discrete cosine transform processing, includes: performing the modified discrete cosine transform on the current frame audio in the standard encoding or decoding process of the codec to obtain the original frequency domain spectral coefficients.
Optionally, calculating the pseudo-spectral flatness of the current frame audio according to the original frequency domain spectral coefficients, including: calculating a pseudo spectrum corresponding to the current frame audio according to the original frequency domain spectral coefficient; calculating and obtaining a pseudo-spectrum geometric mean value and a pseudo-spectrum arithmetic mean value corresponding to the current frame audio according to the pseudo-spectrum; and calculating to obtain the pseudospectral flatness according to the pseudospectral geometric mean and the pseudospectral arithmetic mean.
Optionally, setting a noise update speed parameter according to the pseudospectral flatness under the condition that the pseudospectral flatness is greater than a preset threshold, including: and setting a noise updating speed parameter according to the size of the pseudospectral flatness, wherein the size of the noise updating speed parameter is in direct proportion to the size of the pseudospectral flatness.
Optionally, setting a noise update speed parameter according to the pseudospectral flatness under the condition that the pseudospectral flatness is greater than a preset threshold further comprises: under the condition that the pseudospectral flatness is greater than the preset threshold, judging whether, among a preset number of audio frames before the current frame audio, there is an audio frame whose pseudospectral flatness is smaller than the preset threshold; if not, setting the noise update speed parameter according to the size of the pseudospectral flatness corresponding to the current frame audio.
Optionally, calculating the noise energy of the current frame audio according to the noise update speed parameter includes: selecting a non-speech frequency band of the current frame audio, and calculating a pseudo-spectral median corresponding to the current frame audio; and calculating to obtain the current frame moving average energy corresponding to the current frame audio according to the pseudo-spectral median, the noise updating speed parameter and the moving average energy of the previous frame audio, and taking the current frame moving average energy as the noise energy of the current frame audio.
The audio coding method of the application codes audio data by using the existing coding process in the coder, sets the noise updating speed parameter by using the pseudo-spectral flatness, performs transition among multiple frames, has the characteristics of low power consumption and low delay, and improves the user experience.
In one embodiment of the present application, a computer-readable storage medium stores computer instructions, wherein the computer instructions are operable to perform the method of detecting and removing noise or the method of audio coding described in any of the embodiments. The storage medium may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two.
A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
The Processor may be a Central Processing Unit (CPU), other general-purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), other Programmable logic devices, discrete Gate or transistor logic, discrete hardware components, or any combination thereof. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one embodiment of the present application, a computer device includes a processor and a memory, the memory storing computer instructions, wherein: the processor operates the computer instructions to perform the method of detecting and eliminating noise or the method of audio encoding described in any of the embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative: the division into units is merely a logical division, and an actual implementation may use another division; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above embodiments are merely examples, which are not intended to limit the scope of the present disclosure, and all equivalent structural changes made by using the contents of the specification and the drawings, or any other related technical fields, are also included in the scope of the present disclosure.

Claims (10)

1. A method of detecting and canceling noise, comprising:
acquiring an original frequency domain spectral coefficient corresponding to the current frame audio in the process of encoding or decoding the audio by a codec with modified discrete cosine transform processing;
calculating the pseudo-spectral flatness of the current frame audio according to the original frequency domain spectral coefficient;
setting a noise updating speed parameter according to the pseudo-spectrum flatness under the condition that the pseudo-spectrum flatness is larger than a preset threshold;
calculating the noise energy of the current frame audio according to the noise updating speed parameter; and
calculating to obtain an updated spectral coefficient of the current frame audio according to the original frequency domain spectral coefficient and the noise energy, and performing a subsequent encoding or decoding process according to the updated spectral coefficient.
2. The method for detecting and removing noise according to claim 1, wherein the obtaining of the original frequency-domain spectral coefficients corresponding to the current frame audio in the encoding or decoding process of the audio by the codec with modified discrete cosine transform process comprises:
in the standard encoding or decoding process of the codec, performing the modified discrete cosine transform on the current frame audio to obtain the original frequency domain spectral coefficient.
3. The method for detecting and removing noise according to claim 1, wherein said calculating the pseudo-spectral flatness of the current frame audio according to the original frequency-domain spectral coefficients comprises:
calculating a pseudo spectrum corresponding to the current frame audio according to the original frequency domain spectral coefficient;
calculating and obtaining a pseudo-spectrum geometric mean value and a pseudo-spectrum arithmetic mean value corresponding to the current frame audio according to the pseudo-spectrum;
and calculating to obtain the pseudospectral flatness according to the pseudospectral geometric mean and the pseudospectral arithmetic mean.
4. The method for detecting and eliminating noise according to claim 1, wherein the setting a noise update speed parameter according to the pseudo-spectral flatness if the pseudo-spectral flatness is greater than a preset threshold comprises:
setting the noise updating speed parameter according to the size of the pseudo-spectrum flatness, wherein the size of the noise updating speed parameter is in direct proportion to the size of the pseudo-spectrum flatness.
5. The method for detecting and removing noise according to claim 4, wherein the setting of the noise update speed parameter according to the pseudo-spectral flatness if the pseudo-spectral flatness is greater than a preset threshold further comprises:
under the condition that the pseudo-spectrum flatness is larger than a preset threshold, judging whether, among a preset number of audio frames before the current frame audio, there is an audio frame whose pseudo-spectrum flatness is smaller than the preset threshold;
if not, setting the noise updating speed parameter according to the size of the pseudo-spectral flatness corresponding to the current frame audio.
6. The method for detecting and removing noise according to claim 1, wherein said calculating the noise energy of the current frame audio according to the noise update speed parameter comprises:
selecting a non-voice frequency band of the current frame audio, and calculating a pseudo-spectral median corresponding to the current frame audio;
and calculating to obtain the current frame moving average energy corresponding to the current frame audio according to the pseudo-spectral median, the noise updating speed parameter and the moving average energy of the previous frame audio, and using the current frame moving average energy as the noise energy of the current frame audio.
7. An apparatus for detecting and canceling noise, comprising:
the frequency domain spectral coefficient acquisition module is used for acquiring an original frequency domain spectral coefficient corresponding to the current frame audio in the process of encoding or decoding the audio by the codec with modified discrete cosine transform processing;
the pseudo-spectral flatness calculation module is used for calculating the pseudo-spectral flatness of the current frame audio according to the original frequency domain spectral coefficients;
the noise updating speed parameter determining module is used for setting a noise updating speed parameter according to the pseudo-spectrum flatness under the condition that the pseudo-spectrum flatness is larger than a preset threshold;
the noise energy calculation module calculates the noise energy of the current frame audio according to the noise updating speed parameter; and
the spectral coefficient updating module is used for calculating to obtain an updated spectral coefficient of the current frame audio according to the original frequency domain spectral coefficient and the noise energy, and performing a subsequent encoding or decoding process according to the updated spectral coefficient.
8. An audio encoding method, comprising:
in the process of encoding the audio by an encoder with modified discrete cosine transform processing, obtaining an original frequency domain spectral coefficient corresponding to the current frame audio;
calculating the pseudo-spectral flatness of the current frame audio according to the original frequency domain spectral coefficients;
setting a noise updating speed parameter according to the pseudo-spectrum flatness under the condition that the pseudo-spectrum flatness is larger than a preset threshold;
calculating the noise energy of the current frame audio according to the noise updating speed parameter;
calculating to obtain an updated spectral coefficient of the current frame audio according to the original frequency domain spectral coefficient and the noise energy; and
updating other encoding modules in the encoder according to the updated spectral coefficients, and continuing to encode the current frame audio by using the updated encoding modules.
9. A computer-readable storage medium storing a computer program, wherein the computer program is operative to perform the method of detecting and removing noise of any one of claims 1 to 6 or the audio encoding method of claim 8.
10. A computer device comprising a processor and a memory, the memory storing a computer program, wherein: the processor operates the computer program to perform the method of detecting and removing noise of any one of claims 1-6 or the audio encoding method of claim 8.
CN202211212786.4A 2022-09-29 2022-09-29 Method, device, coding method, medium and equipment for detecting and eliminating noise Pending CN115762547A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211212786.4A CN115762547A (en) 2022-09-29 2022-09-29 Method, device, coding method, medium and equipment for detecting and eliminating noise

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211212786.4A CN115762547A (en) 2022-09-29 2022-09-29 Method, device, coding method, medium and equipment for detecting and eliminating noise

Publications (1)

Publication Number Publication Date
CN115762547A true CN115762547A (en) 2023-03-07

Family

ID=85350762

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211212786.4A Pending CN115762547A (en) 2022-09-29 2022-09-29 Method, device, coding method, medium and equipment for detecting and eliminating noise

Country Status (1)

Country Link
CN (1) CN115762547A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination