CN112185408A - Audio noise reduction method and device, electronic equipment and storage medium - Google Patents

Publication number
CN112185408A
Authority
CN
China
Prior art keywords
audio
noise reduction
scene
algorithm
distortion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011080460.1A
Other languages
Chinese (zh)
Other versions
CN112185408B (en)
Inventor
蒋燚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202011080460.1A priority Critical patent/CN112185408B/en
Publication of CN112185408A publication Critical patent/CN112185408A/en
Application granted granted Critical
Publication of CN112185408B publication Critical patent/CN112185408B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272Voice signal separating
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The application discloses an audio noise reduction method and apparatus, an electronic device, and a storage medium, and relates to the technical field of electronic devices. The method comprises: obtaining an audio signal to be denoised and the audio usage scene corresponding to that signal; selecting a plurality of target audio noise reduction algorithms based on the audio usage scene; and denoising the audio signal with each of the target audio noise reduction algorithms in turn, following a specified noise reduction processing order, to obtain a denoised audio signal. Because the noise reduction algorithms are selected according to the audio usage scene of the signal to be denoised, a corresponding number of algorithms is applied according to the actual voice quality requirement, which improves the audio processing effect.

Description

Audio noise reduction method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of electronic device technologies, and in particular, to an audio noise reduction method and apparatus, an electronic device, and a storage medium.
Background
Noise falls into many complex categories, including stationary noise and non-stationary noise. At present, most noise reduction solutions on the market apply a single noise reduction algorithm, and a small number apply two.
Disclosure of Invention
In view of the above problems, the present application provides an audio noise reduction method, apparatus, electronic device and storage medium to solve the above problems.
In a first aspect, an embodiment of the present application provides an audio noise reduction method, where the method includes: acquiring an audio signal to be denoised, and acquiring an audio usage scene corresponding to the audio signal to be denoised; selecting a plurality of target audio noise reduction algorithms based on the audio usage scene; and, according to a specified noise reduction processing order, denoising the audio signal to be denoised with each of the plurality of target audio noise reduction algorithms in turn to obtain a denoised audio signal.
In a second aspect, an embodiment of the present application provides an audio noise reduction apparatus, including: an audio usage scene acquisition module, configured to acquire an audio signal to be denoised and to acquire an audio usage scene corresponding to the audio signal to be denoised; an audio noise reduction algorithm acquisition module, configured to select a plurality of target audio noise reduction algorithms based on the audio usage scene; and an audio noise reduction processing module, configured to denoise the audio signal to be denoised with each of the plurality of target audio noise reduction algorithms in turn, according to a specified noise reduction processing order, to obtain a denoised audio signal.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory and a processor, the memory being coupled to the processor, the memory storing instructions, and the processor performing the above method when the instructions are executed by the processor.
In a fourth aspect, the present application provides a computer-readable storage medium, in which a program code is stored, and the program code can be called by a processor to execute the above method.
With the audio noise reduction method, apparatus, electronic device, and storage medium provided by the embodiments of the present application, an audio signal to be denoised and its corresponding audio usage scene are acquired, a plurality of target audio noise reduction algorithms are selected based on the audio usage scene, and the audio signal is denoised by each of the target audio noise reduction algorithms in turn, according to a specified noise reduction processing order, to obtain the denoised audio signal. Because the algorithms are selected according to the audio usage scene of the signal to be denoised, a corresponding number of audio noise reduction algorithms can be applied according to the actual voice quality requirement, which improves the audio processing effect.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on them without creative effort.
FIG. 1 is a schematic flowchart of an audio noise reduction method according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of an audio noise reduction method according to another embodiment of the present application;
FIG. 3 is a schematic flowchart of an audio noise reduction method according to still another embodiment of the present application;
FIG. 4 is a schematic flowchart of an audio noise reduction method according to a further embodiment of the present application;
FIG. 5 is a schematic flowchart of an audio noise reduction method according to yet another embodiment of the present application;
FIG. 6 is a block diagram of an audio noise reduction apparatus provided in an embodiment of the present application;
FIG. 7 is a block diagram of an electronic device for executing an audio noise reduction method according to an embodiment of the present application;
FIG. 8 illustrates a storage unit for storing or carrying program code that implements an audio noise reduction method according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
With the development of science and technology, electronic devices are used more and more widely, and collecting and playing audio through electronic devices is increasingly common. However, collected audio generally contains noise, and noise falls into many complex categories, including stationary noise and non-stationary noise. At present, most noise reduction solutions on the market apply a single noise reduction algorithm, and a small number apply two.
In view of the above problems, the inventors, after long-term research, propose the audio noise reduction method, apparatus, electronic device, and storage medium provided in the embodiments of the present application. A plurality of audio noise reduction algorithms are selected according to the audio usage scene corresponding to the audio signal to be denoised, so that a corresponding number of audio noise reduction algorithms is applied according to the actual voice quality requirement, thereby improving the audio processing effect. The specific audio noise reduction method is described in detail in the following embodiments.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of an audio noise reduction method according to an embodiment of the present application. The method selects a plurality of audio noise reduction algorithms according to the audio usage scene corresponding to the audio signal to be denoised, so that a corresponding number of audio noise reduction algorithms is applied according to the actual voice quality requirement and the audio processing effect is improved. In a specific embodiment, the audio noise reduction method is applied to the audio noise reduction apparatus 200 shown in FIG. 6 and to the electronic device 100 (FIG. 7) equipped with the audio noise reduction apparatus 200. The specific process of this embodiment is described below taking an electronic device as an example. It is understood that the electronic device in this embodiment may be a smartphone, a tablet computer, a wearable electronic device, and the like, which is not limited herein. With reference to the flow shown in FIG. 1, the audio noise reduction method may specifically include the following steps:
Step S110: acquiring an audio signal to be denoised, and acquiring an audio usage scene corresponding to the audio signal to be denoised.
In this embodiment, an electronic device is used to obtain an audio signal to be noise-reduced, and obtain an audio usage scene corresponding to the audio signal to be noise-reduced.
In a noisy indoor or outdoor environment, many different sound sources are present at the same time: for example, several people speaking at once, tableware clattering, music, and the reflections of these sounds off surrounding objects. As the sound waves propagate, the waves emitted by the different sources (different speakers and other vibrating objects), both direct and reflected, superpose in the propagation medium (usually air) to form a complex mixed sound wave. The mixed sound wave that reaches the listener's ear canal therefore no longer contains a separate wave for each source, and audio noise reduction processing is needed to recover relatively independent sound waves. Such a mixed sound wave can be used as the audio signal to be denoised.
In some embodiments, the electronic device may obtain the audio signal to be denoised locally, from a server, or from another electronic device, or may collect it with a built-in audio collecting device, and the like, which is not limited herein. When the electronic device collects the audio signal to be denoised with a built-in audio collecting device, it may do so through a built-in microphone.
In some embodiments, the audio usage scene may include a recording scene, a far-field pickup scene, a speech recognition scene, a voiceprint wake-up scene, and so on. Therefore, in this embodiment, the electronic device may detect whether the audio usage scene corresponding to the audio signal to be denoised is a recording scene, a far-field pickup scene, a speech recognition scene, or a voiceprint wake-up scene, obtain a detection result for each, and then determine the audio usage scene corresponding to the audio signal to be denoised based on the detection results.
As one mode, the electronic device may provide four options, namely "recording scene", "far-field pickup scene", "speech recognition scene", and "voiceprint wake-up scene", and may detect which option is selected when detecting the audio usage scene corresponding to the audio signal to be denoised. The audio usage scene is then determined as the scene named by the selected option: the recording scene when "recording scene" is selected, the far-field pickup scene when "far-field pickup scene" is selected, the speech recognition scene when "speech recognition scene" is selected, and the voiceprint wake-up scene when "voiceprint wake-up scene" is selected.
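As an illustration only, the option-based scene detection described above might be sketched as follows. The `AudioScene` enumeration and the `scene_from_option` helper are hypothetical names introduced for this sketch; the application does not specify how the options are represented.

```python
from enum import Enum
from typing import Optional


class AudioScene(Enum):
    """Hypothetical labels for the four audio usage scenes named in the text."""
    RECORDING = "recording scene"
    FAR_FIELD_PICKUP = "far-field pickup scene"
    SPEECH_RECOGNITION = "speech recognition scene"
    VOICEPRINT_WAKEUP = "voiceprint wake-up scene"


def scene_from_option(selected: str) -> Optional[AudioScene]:
    """Return the scene whose option label was selected, or None if no option matches."""
    normalized = selected.strip().lower()
    for scene in AudioScene:
        if scene.value == normalized:
            return scene
    return None
```

Returning `None` for an unrecognized option is a design choice of this sketch; an implementation could equally fall back to a default scene.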
Step S120: selecting a plurality of target audio noise reduction algorithms based on the audio usage scene.
In this embodiment, after the audio usage scene corresponding to the audio signal to be denoised is obtained, a plurality of target audio denoising algorithms may be selected based on the audio usage scene corresponding to the audio signal to be denoised. In some embodiments, the number of the selected target audio noise reduction algorithms is at least three, so as to improve the noise reduction processing effect of the audio signal to be noise reduced.
As a mode, a plurality of audio noise reduction algorithms may be pre-stored locally in the electronic device, and after an audio usage scene is acquired, a plurality of target audio noise reduction algorithms may be selected from the plurality of locally-stored audio noise reduction algorithms based on the audio usage scene. As another way, the electronic device may not locally store a plurality of audio noise reduction algorithms in advance, and after the audio usage scenario is obtained, a plurality of target audio noise reduction algorithms may be selected from a server in communication with the electronic device based on the audio usage scenario.
In some embodiments, when the electronic device stores a plurality of audio noise reduction algorithms in advance, after the audio usage scene is acquired, the plurality of audio noise reduction algorithms to be used may be determined based on the audio usage scene, and the electronic device may detect whether its locally stored audio noise reduction algorithms include all of the algorithms to be used. When the locally stored algorithms include all of the algorithms to be used, the algorithms to be used may be selected from the locally stored algorithms as the plurality of target audio noise reduction algorithms. When the locally stored algorithms do not include all of the algorithms to be used, the algorithms to be used that are available locally may be selected from local storage, the remaining algorithms to be used may be selected from a server in communication with the electronic device, and the algorithms selected locally and from the server may together serve as the plurality of target audio noise reduction algorithms.
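The local-then-server selection described above can be sketched as a simple partition of the required algorithms. This is a minimal sketch under stated assumptions: the function name `plan_algorithm_sources` and the algorithm name strings are hypothetical, and the actual retrieval from the server is not modeled.

```python
from typing import Dict, Iterable, List


def plan_algorithm_sources(needed: Iterable[str], local: Iterable[str]) -> Dict[str, List[str]]:
    """Split the algorithms to be used into those found locally and those to fetch from a server.

    The order of `needed` is preserved within each group.
    """
    local_set = set(local)
    plan: Dict[str, List[str]] = {"local": [], "server": []}
    for name in needed:
        plan["local" if name in local_set else "server"].append(name)
    return plan
```

When the `"server"` list comes back empty, the locally stored algorithms cover everything that is needed, which corresponds to the first case in the text.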
In some embodiments, the plurality of target audio noise reduction algorithms may include a beamforming algorithm, a blind source separation algorithm, a wiener filter algorithm, a spectral subtraction algorithm, a deep neural network noise reduction algorithm, and the like, which are not limited herein.
The beamforming algorithm processes the spatial information and time-frequency information provided by the microphones: it estimates the time delays between the microphones of the array, synchronizes the speech signals of the channels, and then delays and sums them (delay-and-sum beamforming) to cancel zero-mean background noise. The algorithm mainly suppresses sound interference outside the main lobe. For example, when the beam is steered to the zero-degree direction, a signal arriving from that direction reaches all microphones of the array without any delay difference, so the speech is reinforced by superposition. Noise and speech arriving from other directions are not reinforced, because their delay differences vary or their correlation is low, so the beam gain in the zero-degree direction is larger than the gain for signals incident from other directions. Beamforming can therefore filter out most non-stationary noise from other directions, and its suppression of such noise is pronounced. However, the sound arriving from the target direction may itself contain non-stationary noise, such as music played by a sound system, and stationary noise, such as an air conditioner, and such noise usually has no spatial directivity.
The blind source separation algorithm uses blind source separation (BSS), which recovers unknown source signals from the observed mixed signals. To separate the mixture, assumptions must be made about the source signals, for example that the sources are statistically independent of each other and that at most one of them follows a Gaussian distribution. Under these assumptions, blind source separation can be cast as independent component analysis (ICA): an optimization algorithm separates the received mixed signal into several independent components according to the principle of statistical independence, and the main work of separation lies in solving for the separation matrix with a learning strategy. The blind source separation algorithm thus uses the higher-order statistics of the signal to separate the target speech, and can separate all non-Gaussian sound sources that are relatively independent of each other.
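A minimal sketch of the delay-and-sum step is given below. It assumes the per-channel delays are already known, are non-negative whole numbers of samples, and are measured relative to the earliest channel; the delay estimation itself is not shown. The function name `delay_and_sum` is hypothetical.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Align each channel by its integer sample delay, then average across channels.

    delays[i] is the number of samples by which channel i lags the reference channel.
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    # Only the samples available in every channel after alignment can be summed.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[d + n] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(usable)
    ]
```

After alignment, the target signal adds coherently across channels, while uncorrelated zero-mean noise tends to average out, which is the superposition effect described above.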
The Wiener filtering algorithm may first perform a discrete Fourier transform on the audio signal to be denoised, calculate the probabilities of speech and noise in the frequency domain, suppress the noise through Wiener filtering, and then perform an inverse discrete Fourier transform back to the time domain to obtain the final denoised speech signal. The Wiener filtering algorithm therefore suppresses stationary noise well (such as the noise of a fan at constant speed or of household appliances) and introduces very little distortion.
The spectral subtraction algorithm estimates the power spectrum of the noise in the frequency domain and subtracts it from the power spectrum of the noisy speech, which in theory yields an estimate of the clean speech power spectrum and thus enhances the speech. However, spectral subtraction tends to introduce musical noise. To mitigate this, a psychoacoustic model may be introduced to improve the gain function, so that noise is suppressed effectively while less musical noise is introduced. The spectral subtraction algorithm therefore suppresses stationary noise well and can remove the residual noise left by other noise reduction algorithms.
The deep neural network noise reduction algorithm is obtained by training on a combination of clean speech and many types of noise data: using backpropagation, the output of the network for noisy speech is compared with the corresponding clean speech, and the weighting coefficients of the network are adjusted continuously. The trained network then estimates the clean speech signal, and the noise reduction effect is pronounced in an environment with a high signal-to-noise ratio. However, although the algorithm suppresses specific types of noise markedly, it suppresses unusual noise poorly.
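As a rough illustration of the frequency-domain suppression step, the sketch below computes a per-bin Wiener gain from power spectra. It makes several assumptions that are not in the application: the noise power per bin is already estimated, the speech power is approximated by subtracting the noise power from the noisy power (rather than by the speech and noise probabilities mentioned above), and the forward and inverse transforms are omitted. The name `wiener_gains` is hypothetical.

```python
from typing import List, Sequence


def wiener_gains(noisy_power: Sequence[float], noise_power: Sequence[float]) -> List[float]:
    """Per-bin Wiener gain S / (S + N), with the speech power S estimated as max(P_noisy - N, 0)."""
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        speech = max(p_noisy - p_noise, 0.0)
        denom = speech + p_noise
        # A bin with no energy at all gets zero gain to avoid dividing by zero.
        gains.append(speech / denom if denom > 0.0 else 0.0)
    return gains
```

Each gain lies between 0 and 1 and would be multiplied onto the corresponding frequency bin of the noisy spectrum before the inverse transform.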
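The subtraction of power spectra can be sketched as follows. The over-subtraction factor and the spectral floor are common refinements of basic spectral subtraction and are assumptions of this sketch, not features stated in the application; the floor is one simple way to limit the isolated residual peaks that are heard as musical noise, and it stands in for, rather than implements, the psychoacoustic gain function mentioned above. The name `spectral_subtract` and its default values are hypothetical.

```python
from typing import List, Sequence


def spectral_subtract(
    noisy_power: Sequence[float],
    noise_power: Sequence[float],
    over_subtraction: float = 1.0,
    floor: float = 0.01,
) -> List[float]:
    """Subtract the estimated noise power from each bin, never going below floor * noisy power."""
    cleaned = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        estimate = p_noisy - over_subtraction * p_noise
        cleaned.append(max(estimate, floor * p_noisy))
    return cleaned
```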
Step S130: according to a specified noise reduction processing order, denoising the audio signal to be denoised with each of the plurality of target audio noise reduction algorithms in turn to obtain a denoised audio signal.
In this embodiment, after the plurality of target audio noise reduction algorithms are obtained, the audio signal to be denoised may be denoised by each of the target audio noise reduction algorithms in turn, according to a specified noise reduction processing order, to obtain the denoised audio signal and improve the noise reduction effect. The specified noise reduction processing order may be a default order or an order updated based on experience, which is not limited herein.
For example, assume that the plurality of target audio noise reduction algorithms includes a first target audio noise reduction algorithm, a second target audio noise reduction algorithm, a third target audio noise reduction algorithm, ..., and an Nth target audio noise reduction algorithm, and that the specified noise reduction processing order is: first, second, third, ..., Nth. Then the audio signal to be denoised is first processed by the first target audio noise reduction algorithm to obtain a first audio signal to be denoised; the first audio signal to be denoised is processed by the second target audio noise reduction algorithm to obtain a second audio signal to be denoised; the second audio signal to be denoised is processed by the third target audio noise reduction algorithm to obtain a third audio signal to be denoised; and so on, until the (N-1)th audio signal to be denoised is processed by the Nth target audio noise reduction algorithm to obtain the denoised audio signal.
In the audio noise reduction method provided by this embodiment of the application, an audio signal to be denoised and its corresponding audio usage scene are acquired, a plurality of target audio noise reduction algorithms are selected based on the audio usage scene, and the audio signal is denoised by each of the target audio noise reduction algorithms in turn, according to a specified noise reduction processing order, to obtain the denoised audio signal. Because the algorithms are selected according to the audio usage scene of the signal to be denoised, a corresponding number of audio noise reduction algorithms is applied according to the actual voice quality requirement, which improves the audio processing effect.
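The chained processing in this example, where the output of each algorithm is the input of the next, can be sketched as a fold over the ordered list of algorithms. The names `Signal`, `Denoiser`, and `run_in_order` are hypothetical, and each algorithm is modeled simply as a function from one sample list to another.

```python
from functools import reduce
from typing import Callable, List, Sequence

Signal = List[float]
Denoiser = Callable[[Signal], Signal]


def run_in_order(signal: Signal, algorithms: Sequence[Denoiser]) -> Signal:
    """Apply each algorithm to the output of the previous one, in the given order."""
    return reduce(lambda current, algorithm: algorithm(current), algorithms, signal)
```

Because each stage sees the result of the stages before it, changing the order of `algorithms` generally changes the final signal, which is why the method fixes a specified processing order.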
Referring to FIG. 2, FIG. 2 is a schematic flowchart of an audio noise reduction method according to another embodiment of the present application. With reference to the flow shown in FIG. 2, the audio noise reduction method may specifically include the following steps:
Step S210: acquiring an audio signal to be denoised, and acquiring an audio usage scene corresponding to the audio signal to be denoised.
For detailed description of step S210, please refer to step S110, which is not described herein again.
Step S220: and acquiring the maximum audio distortion allowed by the audio use scene.
The requirements of different audio use scenes on audio distortion are different, some audio use scenes have higher requirements on audio distortion, and some audio use scenes have lower requirements on audio distortion. For example, a speech recognition scene, a voiceprint wake-up scene, and the like have high requirements for audio distortion, and a recording scene, a far-field sound pickup scene, and the like have low requirements for audio distortion. Therefore, the audio usage scenario with a high requirement for audio distortion allows a smaller maximum audio distortion, and the audio usage scenario with a low requirement for audio distortion allows a larger maximum audio distortion.
In this embodiment, after an audio usage scene corresponding to an audio signal to be denoised is obtained, the maximum audio distortion allowed by the audio usage scene may be obtained. In some embodiments, the electronic device may obtain, in advance, the maximum audio distortion allowed for each audio usage scenario, and establish and store a corresponding relationship between the audio usage scenario and the maximum audio distortion allowed for the audio usage scenario, so that after obtaining the audio usage scenario corresponding to the audio signal to be noise reduced, the electronic device may obtain, based on the locally stored corresponding relationship between the audio usage scenario and the maximum audio distortion allowed for the audio usage scenario corresponding to the audio signal to be noise reduced.
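The stored correspondence between scenes and allowed distortion can be pictured as a lookup table. The scene keys and the numeric limits below are purely illustrative placeholders; the application gives no concrete values or units, only that the speech recognition and voiceprint wake-up scenes allow less distortion than the recording and far-field pickup scenes.

```python
# Hypothetical limits: only their relative ordering reflects the text.
MAX_DISTORTION_BY_SCENE = {
    "recording": 0.20,
    "far_field_pickup": 0.20,
    "speech_recognition": 0.05,
    "voiceprint_wakeup": 0.05,
}


def max_allowed_distortion(scene: str) -> float:
    """Look up the maximum audio distortion stored for the given audio usage scene."""
    try:
        return MAX_DISTORTION_BY_SCENE[scene]
    except KeyError:
        raise ValueError(f"no distortion limit stored for scene {scene!r}") from None
```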
Step S230: selecting the plurality of target audio noise reduction algorithms based on the allowed maximum audio distortion.
In this embodiment, after the maximum audio distortion allowed by the audio usage scenario is obtained, a plurality of target audio noise reduction algorithms may be selected based on it. In some embodiments, if the maximum audio distortion allowed by the audio usage scenario is larger, indicating that the scenario has a low requirement on audio distortion, a larger number of target audio noise reduction algorithms may be selected to perform noise reduction processing on the audio signal to be noise reduced, so as to improve the noise reduction effect; if the maximum audio distortion allowed by the audio usage scenario is smaller, indicating that the scenario has a high requirement on audio distortion, a smaller number of target audio noise reduction algorithms may be selected, so as to balance the noise reduction effect against the distortion requirement.
As one mode, the electronic device may preset and store a preset audio distortion, which serves as the criterion for judging the maximum audio distortion allowed by an audio usage scenario. Therefore, in this embodiment, when the maximum audio distortion allowed by the audio usage scenario is obtained, it may be compared with the preset audio distortion to determine whether it is greater than the preset audio distortion. When the maximum audio distortion allowed by the audio usage scenario is greater than the preset audio distortion, indicating that the scenario has a low requirement on audio distortion, a first number of audio noise reduction algorithms may be selected as the plurality of target audio noise reduction algorithms. When the maximum audio distortion allowed by the audio usage scenario is not greater than the preset audio distortion, indicating that the scenario has a high requirement on audio distortion, a second number of audio noise reduction algorithms may be selected as the plurality of target audio noise reduction algorithms, wherein the first number is greater than the second number.
In some embodiments, when the audio usage scene is a recording scene or a far-field pickup scene, it may be determined that the maximum audio distortion allowed by the audio usage scene is greater than the preset audio distortion, that is, that the recording scene and the far-field pickup scene have a low requirement on audio distortion, and a first number of audio noise reduction algorithms may then be selected as the plurality of target audio noise reduction algorithms; when the audio usage scene is a voice recognition scene or a voiceprint wake-up scene, it may be determined that the maximum audio distortion allowed by the audio usage scene is not greater than the preset audio distortion, that is, that the voice recognition scene and the voiceprint wake-up scene have a high requirement on audio distortion, and a second number of audio noise reduction algorithms may then be selected as the plurality of target audio noise reduction algorithms.
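The comparison against the preset audio distortion may be sketched as follows. This is a minimal sketch under stated assumptions: the threshold value, the candidate list and the concrete values of the first and second numbers are hypothetical, and truncating an ordered candidate list is only one possible way to pick "a first number" or "a second number" of algorithms.

```python
from typing import List

# Hypothetical preset audio distortion used as the decision threshold.
PRESET_AUDIO_DISTORTION = 0.15

# Hypothetical ordered candidate list; the names are illustrative labels only.
CANDIDATE_ALGORITHMS: List[str] = [
    "beamforming",
    "blind_source_separation",
    "wiener_filter",
    "spectral_subtraction",
]

FIRST_NUMBER = 4   # used when the allowed distortion exceeds the preset value
SECOND_NUMBER = 3  # used otherwise; must be smaller than FIRST_NUMBER


def select_target_algorithms(max_allowed_distortion: float) -> List[str]:
    """Select more algorithms when the scenario tolerates more distortion."""
    if max_allowed_distortion > PRESET_AUDIO_DISTORTION:
        count = FIRST_NUMBER
    else:
        # "Not greater than" the preset value, so equality falls here too.
        count = SECOND_NUMBER
    return CANDIDATE_ALGORITHMS[:count]
```

Note that an allowed distortion exactly equal to the preset audio distortion falls in the "not greater than" branch and therefore receives the second, smaller number of algorithms.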
Step S240: according to the specified noise reduction processing order, performing noise reduction processing on the audio signal to be noise reduced sequentially through each target audio noise reduction algorithm of the plurality of target audio noise reduction algorithms, to obtain the noise-reduced audio signal.
For a detailed description of step S240, please refer to step S130, which is not described herein again.
According to the audio noise reduction method provided by another embodiment of the application, an audio signal to be subjected to noise reduction is acquired, an audio use scene corresponding to the audio signal to be subjected to noise reduction is acquired, maximum audio distortion allowed by the audio use scene is acquired, a plurality of target audio noise reduction algorithms are selected based on the allowed maximum audio distortion, and the audio signal to be subjected to noise reduction is sequentially subjected to noise reduction processing through each target audio noise reduction algorithm in the plurality of target audio noise reduction algorithms according to a specified noise reduction processing sequence, so that the audio signal subjected to noise reduction is acquired. Compared with the audio noise reduction method shown in fig. 1, in the embodiment, the maximum audio distortion allowed by the audio usage scene is further obtained, and a plurality of target noise reduction algorithms are selected based on the maximum audio distortion allowed, so that the finally obtained noise-reduced audio signal meets the distortion requirement of the audio usage scene, and the audio usage effect is improved.
Referring to fig. 3, fig. 3 is a schematic flowchart illustrating an audio denoising method according to still another embodiment of the present application. As will be described in detail with respect to the flow shown in fig. 3, the audio denoising method may specifically include the following steps:
Step S310: acquiring an audio signal to be noise reduced, and acquiring an audio usage scene corresponding to the audio signal to be noise reduced.
For detailed description of step S310, please refer to step S110, which is not described herein again.
Step S320: acquiring the maximum audio distortion allowed by the audio usage scene.
For the detailed description of step S320, please refer to step S220, which is not described herein again.
Step S330: when the audio usage scene is a recording scene or a far-field pickup scene, determining that the maximum audio distortion allowed by the audio usage scene is greater than the preset audio distortion, and selecting a beam forming algorithm, a blind source separation algorithm, a wiener filtering algorithm and a spectral subtraction algorithm as the plurality of target audio noise reduction algorithms.
In some embodiments, when the audio usage scene is a recording scene or a far-field pickup scene, it may be determined that the maximum audio distortion allowed by the audio usage scene is greater than a preset audio distortion, that is, it is determined that the recording scene and the far-field pickup scene have a low requirement for the audio distortion, a beam forming algorithm, a blind source separation algorithm, a wiener filtering algorithm, and a spectral subtraction algorithm may be selected as the multiple target audio denoising algorithms, so that more audio denoising algorithms participate in denoising processing of an audio signal to be denoised, and a denoising effect of the audio signal to be denoised is improved.
In some embodiments, a deep neural network noise reduction algorithm, a beam forming algorithm, a blind source separation algorithm, a wiener filtering algorithm, and a spectral subtraction algorithm may be selected as the multiple target audio noise reduction algorithms, which is not limited herein.
Step S340: according to the specified noise reduction processing order, performing noise reduction processing on the audio signal to be noise reduced sequentially through the beam forming algorithm, the blind source separation algorithm, the wiener filtering algorithm and the spectral subtraction algorithm, to obtain the noise-reduced audio signal.
In some embodiments, the specified noise reduction processing order may be: beam forming algorithm-blind source separation algorithm-wiener filtering algorithm-spectral subtraction algorithm. Then, in this embodiment, the audio signal to be denoised may be first denoised by the beamforming algorithm to obtain the audio signal processed by the beamforming algorithm, then denoised by the blind source separation algorithm to obtain the audio signal processed by the beamforming algorithm and the blind source separation algorithm, then denoised by the wiener filter algorithm to obtain the audio signal processed by the beamforming algorithm, the blind source separation algorithm and the wiener filter algorithm, and finally denoised by the spectral subtraction algorithm to obtain the audio signal processed by the beamforming algorithm, the blind source separation algorithm, the wiener filter algorithm and the spectral subtraction algorithm, as a noise-reduced audio signal.
Step S350: when the audio usage scene is a voice recognition scene or a voiceprint wake-up scene, determining that the maximum audio distortion allowed by the audio usage scene is not greater than the preset audio distortion, and selecting a beam forming algorithm, a blind source separation algorithm and a wiener filtering algorithm as the plurality of target audio noise reduction algorithms.
In some embodiments, when the audio usage scene is a speech recognition scene or a voiceprint wake-up scene, it may be determined that the maximum audio distortion allowed by the audio usage scene is not greater than a preset audio distortion, that is, it is determined that the requirements of the speech recognition scene and the voiceprint wake-up scene on the audio distortion are high, a beam forming algorithm, a blind source separation algorithm, and a wiener filtering algorithm may be selected as the multiple target audio noise reduction algorithms, so as to participate in noise reduction processing of an audio signal to be noise reduced through fewer audio noise reduction algorithms, and balance the noise reduction effect and the audio distortion requirement of the audio signal to be noise reduced.
Step S360: according to the specified noise reduction processing order, performing noise reduction processing on the audio signal to be noise reduced sequentially through the beam forming algorithm, the blind source separation algorithm and the wiener filtering algorithm, to obtain the noise-reduced audio signal.
In some embodiments, the specified noise reduction processing order may be: beam forming algorithm-blind source separation algorithm-wiener filtering algorithm. Then, in this embodiment, the audio signal to be noise reduced may first be subjected to noise reduction processing by the beam forming algorithm to obtain the audio signal processed by the beam forming algorithm; the audio signal processed by the beam forming algorithm is then subjected to noise reduction processing by the blind source separation algorithm to obtain the audio signal processed by the beam forming algorithm and the blind source separation algorithm; and finally the audio signal processed by the beam forming algorithm and the blind source separation algorithm is subjected to noise reduction processing by the wiener filtering algorithm to obtain the audio signal processed by the beam forming algorithm, the blind source separation algorithm and the wiener filtering algorithm, which is taken as the noise-reduced audio signal.
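The sequential processing of steps S340 and S360 amounts to feeding the output of each algorithm into the next one in the specified order. The sketch below shows only that chaining. The stage functions are pass-through placeholders that record their names, not real implementations of beam forming, blind source separation, wiener filtering or spectral subtraction, and all identifiers are hypothetical.

```python
from typing import Callable, List

Signal = List[float]
Stage = Callable[[Signal], Signal]

# Specified processing orders for the two branches of this embodiment.
LOW_REQUIREMENT_ORDER = [
    "beamforming", "blind_source_separation", "wiener_filter", "spectral_subtraction",
]
HIGH_REQUIREMENT_ORDER = LOW_REQUIREMENT_ORDER[:3]


def make_placeholder_stage(name: str, log: List[str]) -> Stage:
    """Build a stand-in stage that records its name and passes the signal through."""
    def stage(signal: Signal) -> Signal:
        log.append(name)
        return list(signal)  # a real algorithm would return a denoised signal
    return stage


def run_pipeline(signal: Signal, stages: List[Stage]) -> Signal:
    """Apply each stage in order; each stage receives the previous stage's output."""
    for stage in stages:
        signal = stage(signal)
    return signal
```

For example, building the stages from `HIGH_REQUIREMENT_ORDER` and calling `run_pipeline` invokes the three placeholders in the order beam forming, blind source separation, wiener filtering, matching step S360.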
The audio noise reduction method provided in still another embodiment of the present application acquires an audio signal to be noise reduced, acquires an audio usage scene corresponding to the audio signal to be noise reduced, and acquires the maximum audio distortion allowed by the audio usage scene. When the audio usage scene is a recording scene or a far-field pickup scene, it is determined that the maximum audio distortion allowed by the audio usage scene is greater than the preset audio distortion, a beam forming algorithm, a blind source separation algorithm, a wiener filtering algorithm and a spectral subtraction algorithm are selected as the plurality of target audio noise reduction algorithms, and the audio signal to be noise reduced is subjected to noise reduction processing sequentially through the beam forming algorithm, the blind source separation algorithm, the wiener filtering algorithm and the spectral subtraction algorithm according to the specified noise reduction processing order, to obtain the noise-reduced audio signal. When the audio usage scene is a voice recognition scene or a voiceprint wake-up scene, it is determined that the maximum audio distortion allowed by the audio usage scene is not greater than the preset audio distortion, a beam forming algorithm, a blind source separation algorithm and a wiener filtering algorithm are selected as the plurality of target audio noise reduction algorithms, and the audio signal to be noise reduced is subjected to noise reduction processing sequentially through the beam forming algorithm, the blind source separation algorithm and the wiener filtering algorithm according to the specified noise reduction processing order, to obtain the noise-reduced audio signal. Compared with the audio noise reduction method shown in fig. 1, in this embodiment, when the audio usage scene is a scene with a low distortion requirement, such as a recording scene or a far-field pickup scene, the beam forming algorithm, the blind source separation algorithm, the wiener filtering algorithm and the spectral subtraction algorithm are selected as the plurality of target audio noise reduction algorithms, and when the audio usage scene is a scene with a high distortion requirement, such as a voice recognition scene or a voiceprint wake-up scene, the beam forming algorithm, the blind source separation algorithm and the wiener filtering algorithm are selected as the plurality of target audio noise reduction algorithms, so that the selection of audio noise reduction algorithms is more reasonable.
Referring to fig. 4, fig. 4 is a schematic flowchart illustrating an audio denoising method according to another embodiment of the present application. As will be described in detail with respect to the flow shown in fig. 4, the audio denoising method may specifically include the following steps:
Step S410: acquiring an audio signal to be noise reduced, and acquiring an audio usage scene corresponding to the audio signal to be noise reduced.
For detailed description of step S410, please refer to step S110, which is not described herein again.
Step S420: acquiring the signal-to-noise ratio of the audio signal to be noise reduced.
In some embodiments, after the audio signal to be noise reduced is acquired, the signal-to-noise ratio of the audio signal to be noise reduced may be acquired. As one way, after the audio signal to be denoised is acquired, the signal-to-noise ratio of the audio signal to be denoised may be acquired by the audio analyzer.
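The application leaves the measurement to an audio analyzer and does not describe how the signal-to-noise ratio is computed. Purely as an illustration, one common estimate compares the mean power of the noisy signal with the mean power of a noise-only segment and expresses the ratio in decibels. The availability of a noise-only segment, and the function names below, are assumptions of this sketch.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Average squared amplitude of a block of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(s * s for s in samples) / len(samples)


def estimate_snr_db(noisy_signal: Sequence[float], noise_only: Sequence[float]) -> float:
    """Estimate SNR in dB as 10*log10(P_signal / P_noise).

    P_signal is approximated by subtracting the noise power from the power of
    the noisy signal, which assumes signal and noise are uncorrelated.
    """
    floor = 1e-12  # avoids log of zero and division by zero
    p_noise = max(mean_power(noise_only), floor)
    p_signal = max(mean_power(noisy_signal) - p_noise, floor)
    return 10.0 * math.log10(p_signal / p_noise)
```

With a noisy block of mean power 1.0 and a noise-only block of mean power 0.01, this estimate gives 10·log10(99), slightly under 20 dB.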
Step S430: selecting the plurality of target audio noise reduction algorithms based on the signal-to-noise ratio and the audio usage scene.
In some embodiments, after acquiring the signal-to-noise ratio of the audio signal to be noise-reduced and the audio usage scenario corresponding to the audio signal to be noise-reduced, a plurality of target audio noise reduction algorithms may be selected based on the signal-to-noise ratio and the audio usage scenario. As a mode, under the condition that the audio usage scenarios of different audio signals to be denoised are consistent, the number of target audio denoising algorithms selected for audio signals to be denoised with a higher signal-to-noise ratio is smaller than the number of target audio denoising algorithms selected for audio signals to be denoised with a lower signal-to-noise ratio. For example, under the condition that the audio usage scenarios of different audio signals to be denoised are consistent, for an audio signal to be denoised with a high signal-to-noise ratio, a beam forming algorithm, a blind source separation algorithm and a wiener filtering algorithm may be selected as a target audio denoising algorithm, and for an audio signal to be denoised with a low signal-to-noise ratio, a beam forming algorithm, a blind source separation algorithm, a wiener filtering algorithm and a spectral subtraction algorithm may be selected as a target audio denoising algorithm.
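The example above may be sketched as follows: for one and the same audio usage scene, a signal with a higher signal-to-noise ratio receives the three-algorithm chain and a signal with a lower signal-to-noise ratio receives the four-algorithm chain. The per-scene thresholds that separate "high" from "low" are hypothetical, since the application gives no numeric boundary, and the identifiers are illustrative.

```python
from typing import Dict, List

FULL_CHAIN: List[str] = [
    "beamforming", "blind_source_separation", "wiener_filter", "spectral_subtraction",
]
REDUCED_CHAIN: List[str] = FULL_CHAIN[:3]

# Hypothetical per-scene boundary (in dB) between "high" and "low" SNR.
HIGH_SNR_THRESHOLD_DB: Dict[str, float] = {
    "recording": 20.0,
    "speech_recognition": 15.0,
}


def select_by_snr_and_scene(scene: str, snr_db: float) -> List[str]:
    """Within one scene, use fewer algorithms when the input is already clean."""
    if scene not in HIGH_SNR_THRESHOLD_DB:
        raise ValueError(f"unknown audio usage scene: {scene!r}")
    if snr_db >= HIGH_SNR_THRESHOLD_DB[scene]:
        return list(REDUCED_CHAIN)
    return list(FULL_CHAIN)
```

Keying the threshold on the scene keeps the comparison between signals confined to a consistent audio usage scene, as the embodiment requires.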
Step S440: according to the specified noise reduction processing order, performing noise reduction processing on the audio signal to be noise reduced sequentially through each target audio noise reduction algorithm of the plurality of target audio noise reduction algorithms, to obtain the noise-reduced audio signal.
For detailed description of step S440, please refer to step S130, which is not described herein.
According to the audio noise reduction method provided by another embodiment of the application, an audio signal to be noise reduced is obtained, an audio usage scene corresponding to the audio signal to be noise reduced is obtained, the signal-to-noise ratio of the audio signal to be noise reduced is obtained, a plurality of target audio noise reduction algorithms are selected based on the signal-to-noise ratio and the audio usage scene, and the audio signal to be noise reduced is subjected to noise reduction processing sequentially through each target audio noise reduction algorithm of the plurality of target audio noise reduction algorithms according to the specified noise reduction processing order, so that the noise-reduced audio signal is obtained. Compared with the audio noise reduction method shown in fig. 1, in this embodiment, the signal-to-noise ratio of the audio signal to be noise reduced is further obtained, and the plurality of target audio noise reduction algorithms are selected according to the signal-to-noise ratio and the audio usage scene, so that the selected target audio noise reduction algorithms are better matched to the signal-to-noise ratio of the audio signal to be noise reduced, and the audio noise reduction effect is improved.
Referring to fig. 5, fig. 5 is a schematic flowchart illustrating an audio denoising method according to yet another embodiment of the present application. As will be described in detail with respect to the flow shown in fig. 5, the audio denoising method may specifically include the following steps:
Step S510: acquiring an audio signal to be noise reduced, and acquiring an audio usage scene corresponding to the audio signal to be noise reduced.
For detailed description of step S510, please refer to step S110, which is not described herein again.
Step S520: acquiring the noise type contained in the audio signal to be noise reduced.
In some embodiments, after the audio signal to be noise reduced is acquired, the type of noise contained in the audio signal to be noise reduced may be acquired. As one mode, after the audio signal to be noise reduced is obtained, it may be detected whether the noise contained in the audio signal includes stationary noise, whether it includes non-stationary noise, and whether it includes specific noise, so that it is determined from the detection result whether the noise types contained in the audio signal to be noise reduced include the stationary noise type, the non-stationary noise type and the specific noise type.
Step S530: selecting the plurality of target audio noise reduction algorithms based on the noise type and the audio usage scene.
In some embodiments, after the noise type contained in the audio signal to be noise reduced and the audio usage scene corresponding to the audio signal to be noise reduced are obtained, a plurality of target audio noise reduction algorithms may be selected based on the noise type and the audio usage scene. As one mode, different audio noise reduction algorithms may focus on filtering different types of noise; for example, a beam forming algorithm may focus on filtering non-stationary noise, a deep neural network noise reduction algorithm may focus on filtering specific noise, and a wiener filtering algorithm may focus on filtering stationary noise. Therefore, in this embodiment, the corresponding audio noise reduction algorithms may be selected as target audio noise reduction algorithms according to the noise type. For example, when the noise in the audio signal to be noise reduced is all non-stationary noise, the wiener filtering algorithm may be left out of the noise reduction processing; when the noise in the audio signal to be noise reduced is all stationary noise, the beam forming algorithm may be left out of the noise reduction processing, and so on, which is not limited herein.
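The noise-type-based selection may be sketched as a filter over the candidate algorithms chosen for the audio usage scene. The mapping from algorithm to the noise type it focuses on follows the examples given above. Retaining algorithms for which no focus is stated (such as blind source separation) is an assumption of this sketch, and all identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Noise type each algorithm focuses on, per the examples in this embodiment.
ALGORITHM_FOCUS: Dict[str, str] = {
    "beamforming": "non_stationary",
    "deep_neural_network": "specific",
    "wiener_filter": "stationary",
}


def filter_by_noise_type(scene_candidates: Iterable[str],
                         detected_noise_types: Set[str]) -> List[str]:
    """Drop algorithms whose focus noise type was not detected.

    Algorithms with no stated focus are kept (an assumption of this sketch).
    The relative order of the candidates is preserved.
    """
    selected = []
    for name in scene_candidates:
        focus = ALGORITHM_FOCUS.get(name)
        if focus is None or focus in detected_noise_types:
            selected.append(name)
    return selected
```

For instance, if only non-stationary noise is detected, the wiener filtering algorithm is dropped from the candidates, and if only stationary noise is detected, the beam forming algorithm is dropped, which matches the two examples given in this embodiment.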
Step S540: according to the specified noise reduction processing order, performing noise reduction processing on the audio signal to be noise reduced sequentially through each target audio noise reduction algorithm of the plurality of target audio noise reduction algorithms, to obtain the noise-reduced audio signal.
For detailed description of step S540, please refer to step S130, which is not described herein again.
In yet another embodiment of the present application, an audio denoising method obtains an audio signal to be denoised, obtains an audio usage scene corresponding to the audio signal to be denoised, obtains a noise type included in the audio signal to be denoised, selects a plurality of target audio denoising algorithms based on the noise type and the audio usage scene, and performs denoising on the audio signal to be denoised by each target audio denoising algorithm in the plurality of target audio denoising algorithms in sequence according to a specified denoising processing sequence to obtain a denoised audio signal. Compared with the audio noise reduction method shown in fig. 1, in this embodiment, the noise type included in the audio signal to be noise reduced is further obtained, and a plurality of target audio noise reduction algorithms are selected according to the noise type and the audio usage scene, so that the selected target audio noise reduction algorithms are more adaptive to the noise type of the audio signal to be noise reduced, and the audio noise reduction effect is improved.
Referring to fig. 6, fig. 6 shows a block diagram of an audio noise reduction apparatus according to an embodiment of the present application. As will be explained below with respect to the block diagram shown in fig. 6, the audio noise reduction apparatus 200 includes: an audio usage scene obtaining module 210, an audio noise reduction algorithm obtaining module 220, and an audio noise reduction processing module 230, wherein:
the audio usage scene obtaining module 210 is configured to obtain an audio signal to be denoised, and obtain an audio usage scene corresponding to the audio signal to be denoised.
The audio noise reduction algorithm obtaining module 220 is configured to select a plurality of target audio noise reduction algorithms based on the audio usage scene.
Further, the audio noise reduction algorithm obtaining module 220 includes: an allowed maximum audio distortion acquisition sub-module and a first audio noise reduction algorithm selection sub-module, wherein:
the allowed maximum audio distortion acquisition sub-module is configured to acquire the maximum audio distortion allowed by the audio usage scene.
The first audio noise reduction algorithm selection sub-module is configured to select the plurality of target audio noise reduction algorithms based on the allowed maximum audio distortion.
Further, the first audio noise reduction algorithm selection sub-module includes: a first audio noise reduction algorithm selecting unit and a second audio noise reduction algorithm selecting unit, wherein:
the first audio noise reduction algorithm selecting unit is configured to select a first number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms when the allowed maximum audio distortion is greater than the preset audio distortion.
Further, the first audio noise reduction algorithm selecting unit includes: a first audio noise reduction algorithm selecting subunit, wherein:
the first audio noise reduction algorithm selecting subunit is configured to determine, when the audio usage scene is a recording scene or a far-field pickup scene, that the maximum audio distortion allowed by the audio usage scene is greater than the preset audio distortion, and to select a first number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms.
Further, the first audio noise reduction algorithm selecting subunit includes: a first audio noise reduction algorithm selection subunit, wherein:
the first audio noise reduction algorithm selection subunit is configured to determine, when the audio usage scene is a recording scene or a far-field pickup scene, that the maximum audio distortion allowed by the audio usage scene is greater than the preset audio distortion, and to select a beam forming algorithm, a blind source separation algorithm, a wiener filtering algorithm and a spectral subtraction algorithm as the plurality of target audio noise reduction algorithms.
The second audio noise reduction algorithm selecting unit is configured to select a second number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms when the allowed maximum audio distortion is not greater than the preset audio distortion, wherein the first number is greater than the second number.
Further, the second audio noise reduction algorithm selecting unit includes: a second audio noise reduction algorithm selecting subunit, wherein:
the second audio noise reduction algorithm selecting subunit is configured to determine, when the audio usage scene is a voice recognition scene or a voiceprint wake-up scene, that the maximum audio distortion allowed by the audio usage scene is not greater than the preset audio distortion, and to select a second number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms.
Further, the second audio noise reduction algorithm selecting subunit includes: a second audio noise reduction algorithm subunit, wherein:
the second audio noise reduction algorithm subunit is configured to determine, when the audio usage scene is a voice recognition scene or a voiceprint wake-up scene, that the maximum audio distortion allowed by the audio usage scene is not greater than the preset audio distortion, and to select a beam forming algorithm, a blind source separation algorithm and a wiener filtering algorithm as the plurality of target audio noise reduction algorithms.
Further, the audio noise reduction algorithm obtaining module 220 includes: a signal-to-noise ratio acquisition sub-module and a second audio noise reduction algorithm selection sub-module, wherein:
the signal-to-noise ratio acquisition sub-module is configured to acquire the signal-to-noise ratio of the audio signal to be noise reduced.
The second audio noise reduction algorithm selection sub-module is configured to select the plurality of target audio noise reduction algorithms based on the signal-to-noise ratio and the audio usage scene.
Further, the audio noise reduction algorithm obtaining module 220 includes: a noise type acquisition sub-module and a third audio noise reduction algorithm selection sub-module, wherein:
the noise type acquisition sub-module is configured to acquire the noise type contained in the audio signal to be noise reduced.
The third audio noise reduction algorithm selection sub-module is configured to select the plurality of target audio noise reduction algorithms based on the noise type and the audio usage scene.
The audio noise reduction processing module 230 is configured to perform noise reduction processing on the audio signal to be noise reduced sequentially through each of the plurality of target audio noise reduction algorithms according to the specified noise reduction processing order, so as to obtain the noise-reduced audio signal.
Further, the audio noise reduction processing module 230 includes: a first audio noise reduction processing sub-module, wherein:
the first audio noise reduction processing sub-module is configured to perform noise reduction processing on the audio signal to be noise reduced sequentially through the beam forming algorithm, the blind source separation algorithm, the wiener filtering algorithm and the spectral subtraction algorithm according to the specified noise reduction processing order, to obtain the noise-reduced audio signal.
Further, the audio noise reduction processing module 230 includes: a second audio noise reduction processing sub-module, wherein:
the second audio noise reduction processing sub-module is configured to perform noise reduction processing on the audio signal to be noise reduced sequentially through the beam forming algorithm, the blind source separation algorithm and the wiener filtering algorithm according to the specified noise reduction processing order, to obtain the noise-reduced audio signal.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, the coupling between the modules may be electrical, mechanical or other type of coupling.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
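When the modules are realized as software functional modules, the three-module structure of fig. 6 may be sketched as follows. This is only one possible arrangement: the class name, the field names and the use of plain callables for modules 210 and 220 are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

Signal = List[float]
Algorithm = Callable[[Signal], Signal]


@dataclass
class AudioNoiseReductionApparatus:
    # Role of module 210: map the input signal to its audio usage scene.
    get_scene: Callable[[Signal], str]
    # Role of module 220: return the target algorithms, already in the
    # specified processing order, for a given scene.
    select_algorithms: Callable[[str], List[Algorithm]]

    def process(self, signal: Signal) -> Signal:
        """Role of module 230: run the selected algorithms one after another."""
        scene = self.get_scene(signal)
        for algorithm in self.select_algorithms(scene):
            signal = algorithm(signal)
        return signal
```

Passing the scene detector and the selector in as callables keeps the processing step independent of how the scene is determined and of which selection embodiment is used.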
Referring to fig. 7, a block diagram of an electronic device 100 according to an embodiment of the present application is shown. The electronic device 100 may be a smart phone, a tablet computer, an electronic book, or another electronic device capable of running an application. The electronic device 100 in the present application may include one or more of the following components: a processor 110, a memory 120, and one or more applications, wherein the one or more applications may be stored in the memory 120 and configured to be executed by the one or more processors 110, the one or more applications being configured to perform the method described in the foregoing method embodiments.
The processor 110 may include one or more processing cores. The processor 110 connects various parts within the overall electronic device 100 using various interfaces and lines, and performs various functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and calling data stored in the memory 120. Optionally, the processor 110 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 110 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs and the like; the GPU is used for rendering and drawing the content to be displayed; the modem is used to handle wireless communications. It is understood that the modem may also not be integrated into the processor 110 and may instead be implemented by a separate communication chip.
The memory 120 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 120 may be used to store instructions, programs, code sets, or instruction sets. The memory 120 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the foregoing method embodiments, and the like. The stored data area may store data created by the electronic device 100 during use (e.g., phone book, audio-video data, chat log data), and the like.
Referring to fig. 8, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable storage medium 300 has stored therein program code that can be called by a processor to execute the method described in the foregoing method embodiments.
The computer-readable storage medium 300 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. Alternatively, the computer-readable storage medium 300 includes a non-volatile computer-readable storage medium. The computer readable storage medium 300 has storage space for program code 310 for performing any of the method steps of the method described above. The program code can be read from or written to one or more computer program products. The program code 310 may be compressed, for example, in a suitable form.
In summary, the audio noise reduction method and device, electronic equipment, and storage medium provided in the embodiments of the present application acquire an audio signal to be denoised and the audio use scene corresponding to that signal, select a plurality of target audio noise reduction algorithms based on the audio use scene, and pass the audio signal to be denoised through each of the selected target audio noise reduction algorithms in turn, according to a specified noise reduction processing sequence, to obtain a noise-reduced audio signal. Because the algorithms are selected according to the audio use scene, the number of audio noise reduction algorithms applied matches the actual voice quality requirement, which improves the audio processing effect.
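The flow summarized above can be illustrated with a minimal Python sketch. It is not the implementation of the present application: the scene names, the `ORDERED_STAGES` registry, and the functions `select_stages` and `denoise` are hypothetical, and each stage is a pass-through placeholder standing in for a real noise reduction algorithm. The scene-to-algorithm mapping follows the two examples given in claims 5 and 7.

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]
Stage = Tuple[str, Callable[[Signal], Signal]]


def _placeholder(signal: Signal) -> Signal:
    """Stand-in for a real denoiser; returns the samples unchanged."""
    return list(signal)


# Hypothetical registry, listed in the specified processing order.
ORDERED_STAGES: List[Stage] = [
    ("beamforming", _placeholder),
    ("blind_source_separation", _placeholder),
    ("wiener_filter", _placeholder),
    ("spectral_subtraction", _placeholder),
]

_FOUR = ["beamforming", "blind_source_separation",
         "wiener_filter", "spectral_subtraction"]
_THREE = _FOUR[:3]

# Hypothetical scene names mapped to the algorithms named in claims 5 and 7.
SCENE_TO_ALGORITHMS: Dict[str, List[str]] = {
    "recording": _FOUR,
    "far_field_pickup": _FOUR,
    "speech_recognition": _THREE,
    "voiceprint_wakeup": _THREE,
}


def select_stages(scene: str) -> List[Stage]:
    """Pick the stages for a scene, preserving the specified order."""
    wanted = set(SCENE_TO_ALGORITHMS[scene])
    return [stage for stage in ORDERED_STAGES if stage[0] in wanted]


def denoise(signal: Signal, scene: str) -> Tuple[Signal, List[str]]:
    """Run the signal through each selected stage in sequence.

    Returns the processed signal and the names of the stages applied.
    """
    out = list(signal)
    applied: List[str] = []
    for name, stage in select_stages(scene):
        out = stage(out)  # output of one stage feeds the next
        applied.append(name)
    return out, applied
```

Calling `denoise(samples, "speech_recognition")` in this sketch applies three stages, while `"recording"` applies all four; replacing `_placeholder` with real algorithms would not change the selection or ordering logic.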
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (12)

1. A method for audio noise reduction, the method comprising:
acquiring an audio signal to be denoised, and acquiring an audio use scene corresponding to the audio signal to be denoised;
selecting a plurality of target audio noise reduction algorithms based on the audio use scene;
and sequentially performing noise reduction processing on the audio signal to be denoised through each target audio noise reduction algorithm of the plurality of target audio noise reduction algorithms, according to a specified noise reduction processing sequence, to obtain a noise-reduced audio signal.
2. The method of claim 1, wherein selecting a plurality of target audio noise reduction algorithms based on the audio usage scenario comprises:
acquiring the maximum audio distortion allowed by the audio use scene;
and selecting the plurality of target audio noise reduction algorithms based on the allowed maximum audio distortion.
3. The method of claim 2, wherein selecting the plurality of target audio noise reduction algorithms based on the maximum allowed audio distortion comprises:
when the allowed maximum audio distortion is larger than the preset audio distortion, selecting a first number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms;
and when the allowed maximum audio distortion is not larger than the preset audio distortion, selecting a second number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms, wherein the first number is larger than the second number.
4. The method of claim 3, wherein selecting a first number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms when the maximum allowed audio distortion is greater than a predetermined audio distortion comprises:
when the audio use scene is a recording scene or a far-field pickup scene, determining that the maximum audio distortion allowed by the audio use scene is larger than the preset audio distortion, and selecting a first number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms.
5. The method of claim 4, wherein when the audio usage scenario is a recording scenario or a far-field pickup scenario, determining that a maximum audio distortion allowed by the audio usage scenario is greater than the preset audio distortion, and selecting a first number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms comprises:
when the audio use scene is a recording scene or a far-field pickup scene, determining that the maximum audio distortion allowed by the audio use scene is larger than the preset audio distortion, and selecting a beamforming algorithm, a blind source separation algorithm, a Wiener filtering algorithm and a spectral subtraction algorithm as the plurality of target audio noise reduction algorithms;
and wherein sequentially performing noise reduction processing on the audio signal to be denoised through each target audio noise reduction algorithm of the plurality of target audio noise reduction algorithms, according to the specified noise reduction processing sequence, to obtain the noise-reduced audio signal comprises:
according to the specified noise reduction processing sequence, performing noise reduction processing on the audio signal to be denoised sequentially through the beamforming algorithm, the blind source separation algorithm, the Wiener filtering algorithm and the spectral subtraction algorithm to obtain the noise-reduced audio signal.
6. The method of claim 3, wherein selecting a second number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms when the maximum allowed audio distortion is not greater than the predetermined audio distortion, wherein the first number is greater than the second number comprises:
when the audio use scene is a speech recognition scene or a voiceprint wake-up scene, determining that the maximum audio distortion allowed by the audio use scene is not larger than the preset audio distortion, and selecting a second number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms.
7. The method according to claim 6, wherein when the audio usage scenario is a speech recognition scenario or a voiceprint wake-up scenario, determining that the maximum audio distortion allowed by the audio usage scenario is not greater than the preset audio distortion, and selecting a second number of audio noise reduction algorithms as the plurality of target audio noise reduction algorithms comprises:
when the audio use scene is a speech recognition scene or a voiceprint wake-up scene, determining that the maximum audio distortion allowed by the audio use scene is not larger than the preset audio distortion, and selecting a beamforming algorithm, a blind source separation algorithm and a Wiener filtering algorithm as the plurality of target audio noise reduction algorithms;
and wherein sequentially performing noise reduction processing on the audio signal to be denoised through each target audio noise reduction algorithm of the plurality of target audio noise reduction algorithms, according to the specified noise reduction processing sequence, to obtain the noise-reduced audio signal comprises:
according to the specified noise reduction processing sequence, performing noise reduction processing on the audio signal to be denoised sequentially through the beamforming algorithm, the blind source separation algorithm and the Wiener filtering algorithm to obtain the noise-reduced audio signal.
8. The method of claim 1, wherein selecting a plurality of target audio noise reduction algorithms based on the audio usage scenario comprises:
acquiring the signal-to-noise ratio of the audio signal to be denoised;
and selecting the target audio noise reduction algorithms based on the signal-to-noise ratio and the audio use scene.
9. The method of claim 1, wherein selecting a plurality of target audio noise reduction algorithms based on the audio usage scenario comprises:
acquiring the noise type contained in the audio signal to be denoised;
and selecting the target audio noise reduction algorithms based on the noise types and the audio use scenes.
10. An audio noise reduction apparatus, characterized in that the apparatus comprises:
an audio use scene acquisition module, configured to acquire an audio signal to be denoised and to acquire an audio use scene corresponding to the audio signal to be denoised;
an audio noise reduction algorithm acquisition module, configured to select a plurality of target audio noise reduction algorithms based on the audio use scene;
and an audio noise reduction processing module, configured to sequentially perform noise reduction processing on the audio signal to be denoised through each target audio noise reduction algorithm of the plurality of target audio noise reduction algorithms, according to a specified noise reduction processing sequence, to obtain a noise-reduced audio signal.
11. An electronic device comprising a memory and a processor, the memory being coupled to the processor and storing instructions that, when executed by the processor, cause the processor to perform the method of any one of claims 1 to 9.
12. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 9.
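As an illustration only, the selection rule of claims 2 to 7 can be sketched in Python as follows. The numeric distortion values, the preset threshold, and the names `PRESET_DISTORTION`, `SCENE_MAX_DISTORTION`, and `select_by_distortion` are assumptions made for this sketch; the claims do not specify any distortion scale or values. The additional inputs of claims 8 and 9 (signal-to-noise ratio and noise type) are not modeled here.

```python
from typing import Dict, Tuple

# Hypothetical preset audio distortion on an arbitrary 0..1 scale.
PRESET_DISTORTION = 0.5

# Hypothetical maximum audio distortion allowed by each audio use scene.
SCENE_MAX_DISTORTION: Dict[str, float] = {
    "recording": 0.8,
    "far_field_pickup": 0.8,
    "speech_recognition": 0.2,
    "voiceprint_wakeup": 0.2,
}

# First (larger) and second (smaller) sets, in the specified order.
FIRST_SET: Tuple[str, ...] = (
    "beamforming",
    "blind_source_separation",
    "wiener_filter",
    "spectral_subtraction",
)
SECOND_SET: Tuple[str, ...] = FIRST_SET[:3]


def select_by_distortion(
    scene: str, preset: float = PRESET_DISTORTION
) -> Tuple[str, ...]:
    """Return the algorithm names for a scene.

    A scene whose allowed maximum distortion is larger than the preset
    gets the first (larger) set; otherwise, including the equal case,
    it gets the second (smaller) set.
    """
    allowed = SCENE_MAX_DISTORTION[scene]
    return FIRST_SET if allowed > preset else SECOND_SET
```

In this sketch the comparison is strict, so a scene whose allowed distortion equals the preset falls into the "not larger than" branch and receives the smaller set.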

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011080460.1A CN112185408B (en) 2020-10-10 2020-10-10 Audio noise reduction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011080460.1A CN112185408B (en) 2020-10-10 2020-10-10 Audio noise reduction method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112185408A (en) 2021-01-05
CN112185408B CN112185408B (en) 2024-05-03

Family

ID=73947510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011080460.1A Active CN112185408B (en) 2020-10-10 2020-10-10 Audio noise reduction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112185408B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103871421A (en) * 2014-03-21 2014-06-18 厦门莱亚特医疗器械有限公司 Self-adaptive denoising method and system based on sub-band noise analysis
CN105120389A (en) * 2015-08-17 2015-12-02 惠州Tcl移动通信有限公司 A method and earphone for carrying out noise reduction processing according to scenes
US10079026B1 (en) * 2017-08-23 2018-09-18 Cirrus Logic, Inc. Spatially-controlled noise reduction for headsets with variable microphone array orientation
CN109817236A (en) * 2019-02-01 2019-05-28 安克创新科技股份有限公司 Audio defeat method, apparatus, electronic equipment and storage medium based on scene
CN110197670A (en) * 2019-06-04 2019-09-03 大众问问(北京)信息科技有限公司 Audio defeat method, apparatus and electronic equipment
CN110211598A (en) * 2019-05-17 2019-09-06 北京华控创为南京信息技术有限公司 Intelligent sound noise reduction communication means and device
CN111128214A (en) * 2019-12-19 2020-05-08 网易(杭州)网络有限公司 Audio noise reduction method and device, electronic equipment and medium
CN111696567A (en) * 2020-06-12 2020-09-22 苏州思必驰信息科技有限公司 Noise estimation method and system for far-field call

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111785288A (en) * 2020-06-30 2020-10-16 北京嘀嘀无限科技发展有限公司 Voice enhancement method, device, equipment and storage medium
CN111785288B (en) * 2020-06-30 2022-03-15 北京嘀嘀无限科技发展有限公司 Voice enhancement method, device, equipment and storage medium
CN113096677A (en) * 2021-03-31 2021-07-09 深圳市睿耳电子有限公司 Intelligent noise reduction method and related equipment
CN113096677B (en) * 2021-03-31 2024-04-26 深圳市睿耳电子有限公司 Intelligent noise reduction method and related equipment
CN112992153A (en) * 2021-04-27 2021-06-18 太平金融科技服务(上海)有限公司 Audio processing method, voiceprint recognition device and computer equipment
CN113891109A (en) * 2021-12-08 2022-01-04 深圳市北科瑞声科技股份有限公司 Adaptive noise reduction method, device, equipment and storage medium
CN113891109B (en) * 2021-12-08 2022-03-15 深圳市北科瑞声科技股份有限公司 Adaptive noise reduction method, device, equipment and storage medium
CN114255779A (en) * 2021-12-17 2022-03-29 思必驰科技股份有限公司 Audio noise reduction method for VR device, electronic device and storage medium

Also Published As

Publication number Publication date
CN112185408B (en) 2024-05-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant