CN112185407A - Dual-MIC input environmental sound suppression method and device, storage medium and equipment - Google Patents


Info

Publication number
CN112185407A
Authority
CN
China
Prior art keywords
audio data
frequency domain
external
built
domain audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011013186.6A
Other languages
Chinese (zh)
Inventor
李夏龙
罗益峰
陈龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Lango Electronic Science and Technology Co Ltd
Original Assignee
Guangzhou Lango Electronic Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Lango Electronic Science and Technology Co Ltd
Priority to CN202011013186.6A
Publication of CN112185407A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02165 Two microphones, one receiving mainly the noise signal and the other one mainly the speech signal

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a dual-MIC input ambient sound suppression method, apparatus, storage medium and device. The method comprises: connecting an external far-field microphone device to the device, and having the external microphone device and the device's built-in microphone device capture audio simultaneously to obtain external audio data and built-in audio data; performing frequency domain transformation on the external audio data and the built-in audio data respectively to obtain external frequency domain audio data and built-in frequency domain audio data; calculating the autocorrelation spectra and the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data; and performing frequency domain gain calculation using a frequency domain correlation function based on the autocorrelation spectra and the cross-correlation spectrum, thereby eliminating ambient noise in the external audio data. Embodiments of the invention suppress the ambient sound recorded by the microphone and improve speech clarity.

Description

Dual-MIC input environmental sound suppression method and device, storage medium and equipment
Technical Field
The invention relates to the technical field of all-in-one education machines, and in particular to a dual-MIC input ambient sound suppression method, apparatus, storage medium and device.
Background
As electronic equipment continues to evolve and teaching methods modernize, all-in-one education machines are becoming common in classrooms at home and abroad. Teachers lecture through a wireless microphone device connected to the education machine, which amplifies their voice and protects it from strain. External microphone devices currently on the market fall into two types: 1. devices with a digital signal processor (DSP) module, which handle problems such as noise floor and howling well but are expensive; 2. devices without a DSP module, which are relatively cheap but have a high noise floor, so the sound played through the education machine is poor. Because both types are limited to single-ended input, neither suppresses external ambient sound well; that is, it is difficult to filter out the ambient sound and keep only the voice data.
Because a microphone is generally a single-ended input, it cannot distinguish voice from external ambient sound well, and neither software nor hardware can suppress the ambient sound effectively. In particular, when a lesson is streamed live or recorded, the teacher's speech and the external noise are recorded together, which seriously degrades the speech clarity of the stream or recording.
Disclosure of Invention
The invention aims to overcome the deficiencies of the prior art by providing a dual-MIC input ambient sound suppression method, apparatus, storage medium and device that suppress the ambient sound recorded by the microphone, improve speech clarity, and improve the sound quality of online teaching and lesson recording.
To solve the above technical problem, an embodiment of the present invention provides a dual-MIC input ambient sound suppression method, comprising:
connecting an external far-field microphone device to the device, and having the external microphone device and the device's built-in microphone device capture audio simultaneously to obtain external audio data and built-in audio data;
performing frequency domain transformation on the external audio data and the built-in audio data respectively to obtain external frequency domain audio data and built-in frequency domain audio data;
calculating autocorrelation spectra and a cross-correlation spectrum based on the external frequency domain audio data and the built-in frequency domain audio data to obtain the autocorrelation spectra and the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data;
and performing frequency domain gain calculation using a frequency domain correlation function based on the autocorrelation spectra and the cross-correlation spectrum, to eliminate ambient noise in the external audio data.
Optionally, having the external microphone device and the device's built-in microphone device capture audio simultaneously comprises:
modifying the native audio policy of the device's operating system, and adding a recording interface for a virtual audio device adapted to the built-in microphone device;
when the external microphone device and the built-in microphone device capture audio, obtaining the PCM data of the built-in microphone device through the virtual audio device in a single thread.
Optionally, the operating system is an Android system;
modifying the native audio policy of the device's operating system and adding a recording interface for a virtual audio device adapted to the built-in microphone device comprises:
modifying the native audio policy of the device's Android system so that multiple AudioRecord threads can record simultaneously, and adding, in the HAL-layer AudioPolicy, a recording interface for a virtual audio device adapted to the built-in microphone device.
Optionally, the built-in microphone device of the device has its own DSP module; when the built-in microphone device captures audio, the built-in DSP module applies gain adjustment and voice-cancellation parameter processing to obtain the built-in frequency domain audio data;
the external microphone device is carried by the user and captures the user's speech to obtain the external frequency domain audio data;
the external microphone device is a wired external microphone device or a wireless external microphone device.
Optionally, performing frequency domain transformation on the external audio data and the built-in audio data respectively to obtain external frequency domain audio data and built-in frequency domain audio data comprises:
applying frequency domain windowing to the external audio data and the built-in audio data respectively to obtain windowed external audio data and windowed built-in audio data, wherein the frequency domain windowing is a Hanning window operation;
and applying a Fourier transform to the windowed external audio data and the windowed built-in audio data respectively to obtain the external frequency domain audio data and the built-in frequency domain audio data.
Optionally, calculating autocorrelation spectra and a cross-correlation spectrum based on the external frequency domain audio data and the built-in frequency domain audio data comprises:
calculating an autocorrelation spectrum for each of the external frequency domain audio data and the built-in frequency domain audio data to obtain the autocorrelation spectrum of the external frequency domain audio data and the autocorrelation spectrum of the built-in frequency domain audio data;
calculating the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data;
the autocorrelation spectrum is calculated as $\mathrm{PSD}_n = \sum |F_n|^2$, wherein n is 1 or 2; when n is 1, PSDn is the autocorrelation spectrum of the external frequency domain audio data and Fn is the external frequency domain audio data; when n is 2, PSDn is the autocorrelation spectrum of the built-in frequency domain audio data and Fn is the built-in frequency domain audio data;
the cross-correlation spectrum is calculated as $\mathrm{CPSD} = \sum \left( |F_1| \times |F_2|^{*} \right)$, wherein F1 is the external frequency domain audio data; F2 is the built-in frequency domain audio data; the symbol * denotes the complex conjugate.
Optionally, the calculation formula for performing frequency domain gain calculation by using a frequency domain correlation function based on the autocorrelation spectrum and the cross-correlation spectrum is as follows:
[Formula image BDA0002698181990000031: frequency domain gain expressed in terms of CPSD, PSD1 and PSD2]
wherein, CPSD represents the cross-correlation spectrum of the external frequency domain audio data and the internal frequency domain audio data; PSD1 represents an autocorrelation spectrum of circumscribed frequency domain audio data; the PSD2 represents an autocorrelation spectrum of the built-in frequency domain audio data.
In addition, an embodiment of the present invention further provides an ambient sound suppression apparatus with dual MIC inputs, where the apparatus includes:
an audio acquisition module, configured to connect an external far-field microphone device to the device and have the external microphone device and the device's built-in microphone device capture audio simultaneously to obtain external audio data and built-in audio data;
a frequency domain transformation module, configured to perform frequency domain transformation on the external audio data and the built-in audio data respectively to obtain external frequency domain audio data and built-in frequency domain audio data;
a correlation spectrum calculation module, configured to calculate autocorrelation spectra and a cross-correlation spectrum based on the external frequency domain audio data and the built-in frequency domain audio data to obtain the autocorrelation spectra and the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data;
a gain calculation module, configured to perform frequency domain gain calculation using a frequency domain correlation function based on the autocorrelation spectra and the cross-correlation spectrum, to eliminate ambient noise in the external audio data.
In addition, an embodiment of the present invention also provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the ambient sound suppression method as described in any one of the above.
In addition, an embodiment of the present invention further provides a terminal device, which includes:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to: performing the ambient sound suppression method of any one of the above.
In the embodiment of the invention, an external microphone device, carried by the user, is added to the microphone device already on the terminal device; the external microphone device and the built-in microphone device capture audio simultaneously, and the subsequent processing suppresses the ambient sound recorded by the microphone and improves speech clarity; the sound quality of online teaching and lesson recording is improved; and because little is required of the external microphone device, which does not need its own DSP module, the equipment cost is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of an ambient sound suppression method with dual MIC inputs according to an embodiment of the present invention;
fig. 2 is a schematic structural composition diagram of an ambient sound suppression apparatus with dual MIC inputs according to an embodiment of the present invention;
fig. 3 is a schematic structural component diagram of a terminal device in the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
Referring to fig. 1, fig. 1 is a schematic flow chart illustrating an ambient sound suppression method with dual MIC inputs according to an embodiment of the present invention.
As shown in fig. 1, a method for ambient sound suppression with dual MIC inputs, the method comprising:
s11: externally connecting a far-field microphone device on the device, and simultaneously carrying out audio acquisition work on the externally connected microphone device and the built-in microphone device of the device to obtain externally connected audio data and built-in audio data;
in the specific implementation process of the present invention, the enabling of the external microphone device and the internal microphone device to perform audio acquisition simultaneously includes: modifying a primary audio strategy on an operating system of the equipment, and adding a recording interface of a virtual audio equipment, which is adapted to the built-in microphone equipment; when the external microphone device and the built-in microphone device carry out audio acquisition work, PCM data of the built-in microphone device is obtained through the virtual audio device in a single thread.
Further, the operating system is an Android system; the modifying of the native audio policy on the running system of the device and the addition of a recording interface of a virtual audio device adapted to the built-in microphone device includes: and modifying a native audio strategy on an Android system of the equipment, and adding a recording interface of a virtual audio equipment adapted with a built-in microphone equipment on an HAL layer AudioPolicy.
Furthermore, the built-in microphone equipment of the equipment is provided with a DSP module, and when the built-in microphone equipment of the equipment carries out audio acquisition work, gain adjustment and human voice audio elimination parameter processing are carried out through the built-in DSP module to obtain built-in frequency domain audio data; the external microphone equipment is carried by a user and is used for acquiring audio data of the user during speaking to obtain external frequency domain audio data; the external microphone equipment is wired external microphone equipment or wireless external microphone equipment.
Specifically, every all-in-one machine (device terminal) is fitted with an array microphone device, which serves as the built-in microphone device; a microphone device is additionally connected to the all-in-one machine and serves as the external microphone device. The external microphone device is generally carried by the user and captures the user's speech; the built-in microphone device is generally far from the user and captures ambient audio. The external microphone device can be wired or wireless, and is generally wireless so that the user can carry it easily. The built-in microphone device has its own DSP module, which can strengthen suppression of the human voice and capture of the ambient sound; the result serves as a reference for eliminating the ambient sound in the external microphone data, thereby suppressing the ambient sound recorded by the microphone and improving speech clarity.
If the far-field microphone device is to be opened on the device at the same time, the native audio policy must be modified in the operating system built into the all-in-one machine, the operating system being an Android system, and a recording interface for a virtual audio device adapted to the built-in microphone device is added; when the external microphone device and the built-in microphone device capture audio, the PCM data of the built-in microphone device is obtained through the virtual audio device in a single thread.
The native audio policy is modified on the Android system running on the all-in-one machine so that multiple AudioRecord threads can record simultaneously, and a recording interface for a virtual audio device adapted to the built-in microphone device is added in the HAL-layer AudioPolicy.
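The idea of two capture paths running at the same time, each in its own thread, can be illustrated outside Android. The sketch below is only an analogy in Python: the two `fake_*` readers are hypothetical stand-ins for the external microphone and the virtual audio device, and none of the names correspond to Android APIs. It shows frames from both sources being collected concurrently and paired by frame index.

```python
import queue
import threading

FRAME_SAMPLES = 128  # frame length mentioned later in the description


def capture(name, read_frame, out_queue, n_frames):
    """Read n_frames frames from one source and tag each with its source name."""
    for index in range(n_frames):
        out_queue.put((name, index, read_frame(index)))


def fake_external(index):
    # Hypothetical stand-in for the external (near-speaker) microphone.
    return [1000] * FRAME_SAMPLES


def fake_builtin(index):
    # Hypothetical stand-in for the built-in microphone read via the virtual device.
    return [100] * FRAME_SAMPLES


def record_both(n_frames=4):
    """Run one capture thread per microphone and return index-aligned frame pairs."""
    frames = queue.Queue()
    workers = [
        threading.Thread(target=capture, args=("external", fake_external, frames, n_frames)),
        threading.Thread(target=capture, args=("builtin", fake_builtin, frames, n_frames)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # Pair frames by index so later stages see time-aligned data.
    by_index = {}
    while not frames.empty():
        name, index, pcm = frames.get()
        by_index.setdefault(index, {})[name] = pcm
    return [(by_index[i]["external"], by_index[i]["builtin"]) for i in range(n_frames)]
```

Pairing by frame index is a design choice of this sketch; a real capture path would more likely align the two streams by timestamp.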
The external microphone device is close to the speaker and mainly picks up the voice, while the device's built-in microphone device is far from the speaker and, with its gain and voice-cancellation parameters adjusted, focuses on picking up ambient sound; the pickup areas of the two microphones are thus separated by the physical layout of the hardware. The far-field (external) microphone records the voice, so the voice it picks up is strong and its background noise relatively low; the voice picked up by the built-in microphone on the all-in-one machine is relatively weak and its background noise relatively high.
S12: respectively carrying out frequency domain transformation processing on the external audio data and the built-in audio data to obtain external frequency domain audio data and built-in frequency domain audio data;
In a specific implementation of the present invention, performing frequency domain transformation on the external audio data and the built-in audio data respectively to obtain external frequency domain audio data and built-in frequency domain audio data comprises: applying frequency domain windowing to the external audio data and the built-in audio data respectively to obtain windowed external audio data and windowed built-in audio data, wherein the frequency domain windowing is a Hanning window operation; and applying a Fourier transform to the windowed external audio data and the windowed built-in audio data respectively to obtain the external frequency domain audio data and the built-in frequency domain audio data.
Specifically, once the external audio data and the built-in audio data are obtained, frequency domain windowing is applied to each to obtain windowed external audio data and windowed built-in audio data, the frequency domain windowing being a Hanning window operation; a Fourier transform is then applied to the windowed external audio data and the windowed built-in audio data respectively to obtain the external frequency domain audio data and the built-in frequency domain audio data.
The specific calculation formulas are $F_1 = \mathrm{FFT}(\mathrm{han\_win} \times T_1)$ and $F_2 = \mathrm{FFT}(\mathrm{han\_win} \times T_2)$, wherein T1 is the external audio data; T2 is the built-in audio data; F1 is the external frequency domain audio data; F2 is the built-in frequency domain audio data; FFT is the Fourier transform; han_win is the Hanning window coefficient. The 128 samples of each frame are multiplied by the Hanning window coefficients to reduce spectral leakage in the subsequent time-frequency conversion.
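The windowing and transform step can be sketched as follows. This is a minimal illustration, assuming a symmetric Hanning window and a 128-sample frame; a direct DFT stands in for the FFT because it produces the same values and keeps the example dependency-free. The function names are hypothetical.

```python
import cmath
import math

FRAME = 128  # samples per frame, as stated in the description


def hann_window(n):
    """Symmetric Hanning window coefficients (han_win in the formula)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]


def dft(samples):
    """Direct discrete Fourier transform; numerically equivalent to an FFT."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def frame_spectrum(frame, window):
    """Compute F = FFT(han_win x T) for one time-domain frame T."""
    return dft([w * x for w, x in zip(window, frame)])
```

Applying `frame_spectrum` to one frame from each microphone yields the F1 and F2 used in the following step. In practice a library FFT would replace `dft` for speed.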
S13: respectively carrying out autocorrelation spectrum and cross-correlation spectrum calculation on the basis of the external frequency domain audio data and the internal frequency domain audio data to obtain autocorrelation spectra and cross-correlation spectra of the external frequency domain audio data and the internal frequency domain audio data;
In a specific implementation of the present invention, calculating autocorrelation spectra and a cross-correlation spectrum based on the external frequency domain audio data and the built-in frequency domain audio data comprises: calculating an autocorrelation spectrum for each of the external frequency domain audio data and the built-in frequency domain audio data to obtain the autocorrelation spectrum of the external frequency domain audio data and the autocorrelation spectrum of the built-in frequency domain audio data; and calculating the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data. The autocorrelation spectrum is calculated as $\mathrm{PSD}_n = \sum |F_n|^2$, wherein n is 1 or 2; when n is 1, PSDn is the autocorrelation spectrum of the external frequency domain audio data and Fn is the external frequency domain audio data; when n is 2, PSDn is the autocorrelation spectrum of the built-in frequency domain audio data and Fn is the built-in frequency domain audio data. The cross-correlation spectrum is calculated as $\mathrm{CPSD} = \sum \left( |F_1| \times |F_2|^{*} \right)$, wherein F1 is the external frequency domain audio data; F2 is the built-in frequency domain audio data; the symbol * denotes the complex conjugate.
Specifically, an autocorrelation spectrum is first calculated for each of the external frequency domain audio data and the built-in frequency domain audio data, giving the autocorrelation spectrum of the external frequency domain audio data and the autocorrelation spectrum of the built-in frequency domain audio data; the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data is then calculated. The autocorrelation spectrum is calculated as $\mathrm{PSD}_n = \sum |F_n|^2$, wherein n is 1 or 2; when n is 1, PSDn is the autocorrelation spectrum of the external frequency domain audio data and Fn is the external frequency domain audio data; when n is 2, PSDn is the autocorrelation spectrum of the built-in frequency domain audio data and Fn is the built-in frequency domain audio data. The cross-correlation spectrum is calculated as $\mathrm{CPSD} = \sum \left( |F_1| \times |F_2|^{*} \right)$, wherein F1 is the external frequency domain audio data; F2 is the built-in frequency domain audio data; the symbol * denotes the complex conjugate.
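The two sums can be sketched per frequency bin as follows. This is a minimal illustration under two assumptions that the text does not settle: the sum is taken over a short list of frames for each bin, and the cross-correlation spectrum is the conventional complex cross-spectrum F1 · conj(F2). The function names are hypothetical.

```python
def auto_spectrum(frames):
    """PSDn per bin: sum of |Fn|^2 over the supplied spectra (assumed: over frames)."""
    bins = len(frames[0])
    return [sum(abs(frame[k]) ** 2 for frame in frames) for k in range(bins)]


def cross_spectrum(frames_1, frames_2):
    """CPSD per bin: sum of F1 * conj(F2) over paired spectra (assumed convention)."""
    bins = len(frames_1[0])
    return [
        sum(f1[k] * f2[k].conjugate() for f1, f2 in zip(frames_1, frames_2))
        for k in range(bins)
    ]
```

Here `frames_1` and `frames_2` are lists of complex spectra for the external and built-in microphones, one spectrum per frame, with the two lists aligned in time.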
S14: and performing frequency domain gain calculation by utilizing a frequency domain correlation function based on the autocorrelation spectrum and the cross-correlation spectrum, and eliminating environmental noise in external audio data.
In a specific implementation process of the present invention, the calculation formula for performing frequency domain gain calculation by using a frequency domain correlation function based on the autocorrelation spectrum and the cross-correlation spectrum is as follows:
[Formula image BDA0002698181990000081: frequency domain gain expressed in terms of CPSD, PSD1 and PSD2]
wherein, CPSD represents the cross-correlation spectrum of the external frequency domain audio data and the internal frequency domain audio data; PSD1 represents an autocorrelation spectrum of circumscribed frequency domain audio data; the PSD2 represents an autocorrelation spectrum of the built-in frequency domain audio data.
Specifically, a frequency domain correlation function is adopted for the suppression of the background noise, and the function is as follows:
[Formula image BDA0002698181990000082: frequency domain correlation function expressed in terms of CPSD, PSD1 and PSD2]
wherein, CPSD represents the cross-correlation spectrum of the external frequency domain audio data and the internal frequency domain audio data; PSD1 represents an autocorrelation spectrum of circumscribed frequency domain audio data; PSD2 represents an autocorrelation spectrum of the built-in frequency domain audio data; the frequency domain gain is calculated to eliminate ambient noise entrained in the data recorded by the wireless microphone.
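The gain formula itself is available here only as an image, so the sketch below does not reproduce it. As a stand-in it uses the magnitude-squared coherence, |CPSD|² / (PSD1 · PSD2), a standard quantity built from the same three spectra, together with a hypothetical mapping gain = max(floor, 1 − coherence). That mapping rests on the assumption that the built-in microphone carries mostly ambient sound, so bins strongly shared by both microphones are attenuated. The `floor` and `eps` parameters are illustrative.

```python
def coherence_gain(psd1, psd2, cpsd, floor=0.1, eps=1e-12):
    """Per-bin gain from magnitude-squared coherence (assumed stand-in, not the patent formula)."""
    gains = []
    for p1, p2, c in zip(psd1, psd2, cpsd):
        coherence = min(abs(c) ** 2 / (p1 * p2 + eps), 1.0)
        # Hypothetical mapping: high coherence with the ambient reference means low gain.
        gains.append(max(floor, 1.0 - coherence))
    return gains


def apply_gain(spectrum, gains):
    """Scale each bin of the external microphone spectrum by its gain."""
    return [g * x for g, x in zip(gains, spectrum)]
```

The scaled spectrum would then be transformed back to the time domain to give the noise-suppressed external audio; that inverse step is omitted here.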
In the embodiment of the invention, an external microphone device, carried by the user, is added to the microphone device already on the terminal device; the external microphone device and the built-in microphone device capture audio simultaneously, and the subsequent processing suppresses the ambient sound recorded by the microphone and improves speech clarity; the sound quality of online teaching and lesson recording is improved; and because little is required of the external microphone device, which does not need its own DSP module, the equipment cost is reduced.
Examples
Referring to fig. 2, fig. 2 is a schematic structural diagram of an ambient sound suppression apparatus with dual MIC inputs according to an embodiment of the present invention.
As shown in fig. 2, an ambient sound suppression apparatus for dual MIC inputs, the apparatus comprising:
the audio acquisition module 21: the device is used for externally connecting a microphone device on the device, and enabling the externally connected microphone device and the built-in microphone device of the device to simultaneously carry out audio acquisition work so as to obtain externally connected audio data and built-in audio data;
in the specific implementation process of the present invention, the enabling of the external microphone device and the internal microphone device to perform audio acquisition simultaneously includes: modifying a primary audio strategy on an operating system of the equipment, and adding a recording interface of a virtual audio equipment, which is adapted to the built-in microphone equipment; when the external microphone device and the built-in microphone device carry out audio acquisition work, PCM data of the built-in microphone device is obtained through the virtual audio device in a single thread.
Further, the operating system is an Android system; the modifying of the native audio policy on the running system of the device and the addition of a recording interface of a virtual audio device adapted to the built-in microphone device includes: and modifying a native audio strategy on an Android system of the equipment, and adding a recording interface of a virtual audio equipment adapted with a built-in microphone equipment on an HAL layer AudioPolicy.
Furthermore, the built-in microphone device of the device has a DSP module; when the built-in microphone device acquires audio, the built-in DSP module performs gain adjustment and human-voice cancellation parameter processing to obtain the built-in frequency domain audio data. The external microphone device is carried by the user and acquires the audio of the user speaking to obtain the external frequency domain audio data. The external microphone device is a wired or a wireless external microphone device.
Specifically, every all-in-one device (device terminal) is provided with an array microphone, which serves as the built-in microphone device; in addition, a microphone device is externally connected to the all-in-one device and serves as the external microphone device. The external microphone device is generally carried by the user and acquires the voice of the user; the built-in microphone device is generally far from the user and is used to acquire ambient audio. The external microphone device can be wired or wireless, and is generally wireless so that the user can carry it easily. The built-in microphone device has its own DSP module, through which suppression of the human voice and collection of the ambient sound can be strengthened; the result serves as a reference for eliminating the ambient sound in the external microphone data, thereby suppressing the ambient sound picked up by the microphone and improving speech clarity.
To record with a far-field (external) microphone device and the built-in microphone device on the device at the same time, the native audio policy of the operating system built into the all-in-one device, here an Android system, needs to be modified so that several AudioRecord threads can record simultaneously, and a recording interface of a virtual audio device adapted to the built-in microphone device is added; when the external microphone device and the built-in microphone device acquire audio, PCM data of the built-in microphone device is obtained through the virtual audio device in a separate thread.
That is, the native audio policy of the Android system running on the all-in-one device is modified, and a recording interface of a virtual audio device adapted to the built-in microphone device is added in the AudioPolicy of the HAL layer.
The external microphone device is close to the speaker and mainly picks up the voice, while the built-in microphone device of the device is far from the speaker, and its gain and voice cancellation parameters can be adjusted so that it focuses on picking up ambient sound; the pickup areas of the two microphones are thus separated by the physical arrangement of the hardware. The far-field (external) microphone records the voice, so the voice it picks up is strong and the background noise relatively low, whereas the voice picked up by the built-in microphone on the all-in-one device is relatively weak and the background noise relatively high.
Frequency domain transform module 22: the external audio data and the built-in audio data are respectively subjected to frequency domain transformation processing to obtain external frequency domain audio data and built-in frequency domain audio data;
in a specific implementation process of the present invention, performing frequency domain transformation processing on the external audio data and the built-in audio data respectively to obtain external frequency domain audio data and built-in frequency domain audio data includes: respectively performing frequency domain windowing on the external audio data and the built-in audio data to obtain windowed external audio data and windowed built-in audio data, wherein the frequency domain windowing is a Hanning window operation; and respectively performing Fourier transform processing on the windowed external audio data and the windowed built-in audio data to obtain the external frequency domain audio data and the built-in frequency domain audio data.
Specifically, when external audio data and built-in audio data are obtained, frequency domain windowing processing needs to be performed on the external audio data and the built-in audio data respectively to obtain windowed external audio data and windowed built-in audio data, wherein the frequency domain windowing processing is windowing operation of a Hanning window; and then carrying out Fourier transform processing on the windowed external audio data and the windowed internal audio data respectively to obtain external frequency domain audio data and internal frequency domain audio data.
The specific calculation formulas are $F_1 = \mathrm{FFT}(\mathrm{han\_win} \times T_1)$ and $F_2 = \mathrm{FFT}(\mathrm{han\_win} \times T_2)$, where $T_1$ is the external audio data; $T_2$ is the built-in audio data; $F_1$ is the external frequency domain audio data; $F_2$ is the built-in frequency domain audio data; FFT is the Fourier transform; and han_win is the Hanning window coefficient. The 128 samples of each frame are multiplied by the Hanning window coefficients to prevent spectral aliasing during the subsequent time-frequency conversion.
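The windowing and transform step can be sketched as follows. This is a minimal illustration only, not the patent's implementation: the function names, the periodic form of the Hanning window, and the naive DFT (standing in for an FFT routine) are all assumptions made so the example runs with the Python standard library alone.

```python
import cmath
import math

FRAME_LEN = 128  # frame size stated in the text


def hann_window(n):
    """Return n Hanning window coefficients (periodic form, an assumption)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft(frame):
    """Naive O(N^2) discrete Fourier transform; a real system would call an FFT."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def to_frequency_domain(samples):
    """Compute F = FFT(han_win * T) for one 128-sample frame T."""
    if len(samples) != FRAME_LEN:
        raise ValueError("expected a frame of 128 samples")
    window = hann_window(FRAME_LEN)
    windowed = [w * s for w, s in zip(window, samples)]
    return dft(windowed)


# Hypothetical test frame: a pure tone centred on frequency bin 8.
t1 = [math.sin(2.0 * math.pi * 8 * i / FRAME_LEN) for i in range(FRAME_LEN)]
f1 = to_frequency_domain(t1)
# Search only the first half of the spectrum (the second half mirrors it).
peak_bin = max(range(FRAME_LEN // 2), key=lambda k: abs(f1[k]))
```

The same function would be applied to each frame of both the external data (T1) and the built-in data (T2), giving F1 and F2 for the later correlation step.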
The correlation spectrum calculation module 23: the device is used for respectively carrying out autocorrelation spectrum and cross-correlation spectrum calculation on the basis of the external frequency domain audio data and the internal frequency domain audio data to obtain autocorrelation spectrum and cross-correlation spectrum of the external frequency domain audio data and the internal frequency domain audio data;
in a specific implementation process of the present invention, performing autocorrelation spectrum and cross-correlation spectrum calculation based on the external frequency domain audio data and the built-in frequency domain audio data to obtain the autocorrelation spectra and the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data includes: respectively performing autocorrelation spectrum calculation based on the external frequency domain audio data and the built-in frequency domain audio data to obtain the autocorrelation spectrum of the external frequency domain audio data and the autocorrelation spectrum of the built-in frequency domain audio data; and performing cross-correlation spectrum calculation on the external frequency domain audio data and the built-in frequency domain audio data to obtain the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data. The calculation formula of the autocorrelation spectrum is: $\mathrm{PSD}_n = \sum |F_n|^2$, where n is 1 or 2; when n is 1, $\mathrm{PSD}_n$ is the autocorrelation spectrum of the external frequency domain audio data and $F_n$ is the external frequency domain audio data; when n is 2, $\mathrm{PSD}_n$ is the autocorrelation spectrum of the built-in frequency domain audio data and $F_n$ is the built-in frequency domain audio data. The calculation formula of the cross-correlation spectrum is: $\mathrm{CPSD} = \sum (|F_1| \times |F_2|^{*})$, where $F_1$ is the external frequency domain audio data; $F_2$ is the built-in frequency domain audio data; and the symbol $*$ denotes the complex conjugate.
Specifically, autocorrelation spectrum calculation is first performed on the external frequency domain audio data and the built-in frequency domain audio data respectively, giving the autocorrelation spectrum of the external frequency domain audio data and the autocorrelation spectrum of the built-in frequency domain audio data; cross-correlation spectrum calculation is then performed on the external frequency domain audio data and the built-in frequency domain audio data, giving their cross-correlation spectrum. The calculation formula of the autocorrelation spectrum is: $\mathrm{PSD}_n = \sum |F_n|^2$, where n is 1 or 2; when n is 1, $\mathrm{PSD}_n$ is the autocorrelation spectrum of the external frequency domain audio data and $F_n$ is the external frequency domain audio data; when n is 2, $\mathrm{PSD}_n$ is the autocorrelation spectrum of the built-in frequency domain audio data and $F_n$ is the built-in frequency domain audio data. The calculation formula of the cross-correlation spectrum is: $\mathrm{CPSD} = \sum (|F_1| \times |F_2|^{*})$, where $F_1$ is the external frequency domain audio data; $F_2$ is the built-in frequency domain audio data; and the symbol $*$ denotes the complex conjugate.
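The two spectra can be sketched as below. The text does not state whether the sums run over frequency bins or over successive frames, so the hypothetical helpers here simply sum over whatever sequence of complex values they are given. Because the conjugate of a magnitude equals the magnitude itself, this sketch also assumes the conventional cross-spectrum form, F1 multiplied by the complex conjugate of F2, rather than a product of magnitudes; that reading is an assumption and not something the text confirms.

```python
def auto_spectrum(f):
    """PSDn = sum of |Fn|^2 over the supplied complex values."""
    return sum(abs(x) ** 2 for x in f)


def cross_spectrum(f1, f2):
    """Sum of F1 * conj(F2) over paired values (assumed conventional form)."""
    if len(f1) != len(f2):
        raise ValueError("both inputs must have the same length")
    return sum(a * b.conjugate() for a, b in zip(f1, f2))


# Hypothetical two-value spectra, chosen so the arithmetic is easy to follow.
f1 = [1 + 1j, 2 + 0j]   # stands in for external frequency domain data
f2 = [0 + 1j, 1 - 1j]   # stands in for built-in frequency domain data

psd1 = auto_spectrum(f1)        # |1+1j|^2 + |2|^2
psd2 = auto_spectrum(f2)        # |1j|^2 + |1-1j|^2
cpsd = cross_spectrum(f1, f2)   # (1+1j)(-1j) + 2(1+1j)
```

Keeping the cross-spectrum complex preserves the phase relationship between the two microphones; taking magnitudes first would discard it.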
The gain calculation module 24: configured to perform frequency domain gain calculation using a frequency domain correlation function based on the autocorrelation spectra and the cross-correlation spectrum, so as to eliminate the ambient noise in the external audio data.
In a specific implementation process of the present invention, the calculation formula for performing frequency domain gain calculation by using a frequency domain correlation function based on the autocorrelation spectrum and the cross-correlation spectrum is as follows:
Figure BDA0002698181990000121
wherein CPSD represents the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data; PSD1 represents the autocorrelation spectrum of the external frequency domain audio data; and PSD2 represents the autocorrelation spectrum of the built-in frequency domain audio data.
Specifically, a frequency domain correlation function is adopted for the suppression of the background noise, and the function is as follows:
Figure BDA0002698181990000122
wherein CPSD represents the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data; PSD1 represents the autocorrelation spectrum of the external frequency domain audio data; PSD2 represents the autocorrelation spectrum of the built-in frequency domain audio data; the frequency domain gain is calculated to eliminate the ambient noise carried in the data recorded by the wireless microphone.
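The gain formula itself appears only as a figure that is not reproduced in this text, so it cannot be restated here. Purely as an illustration of how a gain might be built from CPSD, PSD1 and PSD2, the sketch below uses magnitude-squared coherence, a well-known frequency domain correlation measure. This choice, the function name, and the `eps` guard are all assumptions and should not be read as the patent's formula.

```python
def coherence_gain(cpsd, psd1, psd2, eps=1e-12):
    """Illustrative gain in [0, 1]: |CPSD|^2 / (PSD1 * PSD2).

    ASSUMPTION: magnitude-squared coherence stands in for the patent's
    frequency domain correlation function, which is given only as a figure.
    """
    denom = psd1 * psd2
    if denom <= eps:
        return 0.0  # avoid dividing by zero when either channel is silent
    return min(1.0, abs(cpsd) ** 2 / denom)


# Values carried over from a small hypothetical example.
g_partial = coherence_gain(3 + 1j, 6.0, 3.0)   # partly correlated channels
g_same = coherence_gain(4 + 0j, 4.0, 4.0)      # identical channels
g_none = coherence_gain(0j, 1.0, 1.0)          # uncorrelated channels
```

How such a value would then be applied to the external spectrum (directly, inverted, or after smoothing) depends on the formula in the figure and is left open here.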
In the embodiment of the invention, an external microphone device is added to the microphone device already present on the terminal device, and the external microphone device is carried by the user. The external microphone device and the built-in microphone device acquire audio simultaneously, and the subsequent processing suppresses the ambient sound recorded by the microphone and improves speech clarity, which in turn improves the sound quality of online teaching and recorded course videos. The requirements on the external microphone device are low: it does not need its own DSP module, which reduces equipment cost.
A computer-readable storage medium is provided in an embodiment of the present invention; the computer-readable storage medium stores a computer program which, when executed by a processor, implements the ambient sound suppression method in any one of the above embodiments. The computer-readable storage medium includes, but is not limited to, any type of disk, including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks, as well as ROMs (Read-Only Memories), RAMs (Random Access Memories), EPROMs (Erasable Programmable Read-Only Memories), EEPROMs (Electrically Erasable Programmable Read-Only Memories), flash memories, magnetic cards, or optical cards. That is, a storage device includes any medium that stores or transmits information in a form readable by a device (e.g., a computer, a cellular phone), and may be a read-only memory, a magnetic or optical disk, or the like.
The embodiment of the present invention further provides a computer application program, which runs on a computer, and is configured to execute the ambient sound suppression method according to any one of the above embodiments.
In addition, fig. 3 is a schematic structural composition diagram of the terminal device in the embodiment of the present invention.
An embodiment of the present invention further provides a terminal device, as shown in fig. 3. The terminal device includes a processor 302, a memory 303, an input unit 304, a display unit 305, and the like. Those skilled in the art will appreciate that the device configuration means shown in fig. 3 do not constitute a limitation of all devices and may include more or less components than those shown, or some components in combination. The memory 303 may be used to store the application 301 and various functional modules, and the processor 302 executes the application 301 stored in the memory 303, thereby performing various functional applications of the device and data processing. The memory may be internal or external memory, or include both internal and external memory. The memory may comprise read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), flash memory, or random access memory. The external memory may include a hard disk, a floppy disk, a ZIP disk, a usb-disk, a magnetic tape, etc. The disclosed memory includes, but is not limited to, these types of memory. The disclosed memory is by way of example only and not by way of limitation.
The input unit 304 is used for receiving input of signals and receiving keywords input by a user. The input unit 304 may include a touch panel and other input devices. The touch panel can collect touch operations of a user on or near the touch panel (for example, operations of the user on or near the touch panel using any suitable object or accessory such as a finger or a stylus) and drive the corresponding connecting device according to a preset program; other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., play control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like. The display unit 305 may be used to display information input by a user or information provided to the user and various menus of the terminal device. The display unit 305 may take the form of a liquid crystal display, an organic light emitting diode, or the like. The processor 302 is the control center of the terminal device; it connects the various parts of the entire device using various interfaces and lines, and performs various functions and processes data by running or executing software programs and/or modules stored in the memory 303 and calling data stored in the memory.
As an embodiment, the terminal device includes: one or more processors 302, a memory 303, one or more applications 301, wherein the one or more applications 301 are stored in the memory 303 and configured to be executed by the one or more processors 302, the one or more applications 301 being configured to perform the ambient sound suppression method of any of the above embodiments.
In the embodiment of the invention, an external microphone device is added to the microphone device already present on the terminal device, and the external microphone device is carried by the user. The external microphone device and the built-in microphone device acquire audio simultaneously, and the subsequent processing suppresses the ambient sound recorded by the microphone and improves speech clarity, which in turn improves the sound quality of online teaching and recorded course videos. The requirements on the external microphone device are low: it does not need its own DSP module, which reduces equipment cost.
The method, apparatus, storage medium and device for dual-MIC-input ambient sound suppression provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present invention, and the above descriptions of the embodiments are only intended to help in understanding the method and core idea of the present invention; meanwhile, a person skilled in the art may, following the idea of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A method of ambient sound suppression for dual MIC inputs, the method comprising:
externally connecting a far-field microphone device on the device, and simultaneously carrying out audio acquisition work on the externally connected microphone device and the built-in microphone device of the device to obtain externally connected audio data and built-in audio data;
respectively carrying out frequency domain transformation processing on the external audio data and the built-in audio data to obtain external frequency domain audio data and built-in frequency domain audio data;
respectively carrying out autocorrelation spectrum and cross-correlation spectrum calculation on the basis of the external frequency domain audio data and the internal frequency domain audio data to obtain autocorrelation spectra and cross-correlation spectra of the external frequency domain audio data and the internal frequency domain audio data;
and performing frequency domain gain calculation by utilizing a frequency domain correlation function based on the autocorrelation spectrum and the cross-correlation spectrum, and eliminating environmental noise in external audio data.
2. The ambient sound suppression method according to claim 1, wherein the enabling of the external microphone device and the internal microphone device to perform audio acquisition simultaneously comprises:
modifying a native audio policy on an operating system of the device, and adding a recording interface of a virtual audio device adapted to the built-in microphone device;
when the external microphone device and the built-in microphone device carry out audio acquisition work, PCM data of the built-in microphone device is obtained through the virtual audio device in a separate thread.
3. The ambient sound suppression method according to claim 2, wherein the operating system is an Android system;
the modifying of the native audio policy on the operating system of the device and the adding of a recording interface of a virtual audio device adapted to the built-in microphone device includes:
modifying the native audio policy on the Android system of the device to enable simultaneous recording by a plurality of AudioRecord threads, and adding, in the AudioPolicy of the HAL layer, a recording interface of a virtual audio device adapted to the built-in microphone device.
4. The ambient sound suppression method according to claim 1 or 2, wherein the built-in microphone device of the device is provided with a DSP module, and when the built-in microphone device performs audio acquisition, the built-in DSP module performs gain adjustment and human-voice cancellation parameter processing to obtain the built-in frequency domain audio data;
the external microphone equipment is carried by a user and is used for acquiring audio data of the user during speaking to obtain external frequency domain audio data;
the external microphone equipment is wired external microphone equipment or wireless external microphone equipment.
5. The ambient sound suppression method according to claim 1, wherein the performing frequency domain transform processing on the external audio data and the internal audio data to obtain external frequency domain audio data and internal frequency domain audio data respectively comprises:
respectively carrying out frequency domain windowing on the external audio data and the built-in audio data to obtain windowed external audio data and windowed built-in audio data, wherein the frequency domain windowing is windowing operation of a Hanning window;
and respectively carrying out Fourier transform processing on the windowed external audio data and the windowed built-in audio data to obtain external frequency domain audio data and built-in frequency domain audio data.
6. The ambient sound suppression method according to claim 1, wherein the obtaining of the autocorrelation spectrum and the cross-correlation spectrum of the external frequency domain audio data and the internal frequency domain audio data by performing autocorrelation spectrum and cross-correlation spectrum calculation based on the external frequency domain audio data and the internal frequency domain audio data, respectively, comprises:
respectively carrying out autocorrelation spectrum calculation on the basis of the external frequency domain audio data and the internal frequency domain audio data to obtain an autocorrelation spectrum of the external frequency domain audio data and an autocorrelation spectrum of the internal frequency domain audio data;
respectively performing cross-correlation spectrum calculation on the external frequency domain audio data and the internal frequency domain audio data to obtain cross-correlation spectrums of the external frequency domain audio data and the internal frequency domain audio data;
the calculation formula of the autocorrelation spectrum calculation is as follows: $\mathrm{PSD}_n = \sum |F_n|^2$, wherein n is 1 or 2; when n is 1, $\mathrm{PSD}_n$ is the autocorrelation spectrum of the external frequency domain audio data and $F_n$ is the external frequency domain audio data; when n is 2, $\mathrm{PSD}_n$ is the autocorrelation spectrum of the built-in frequency domain audio data and $F_n$ is the built-in frequency domain audio data;
the calculation formula of the cross-correlation spectrum calculation is as follows: $\mathrm{CPSD} = \sum (|F_1| \times |F_2|^{*})$, wherein $F_1$ is the external frequency domain audio data; $F_2$ is the built-in frequency domain audio data; and the symbol $*$ denotes the complex conjugate.
7. The ambient sound suppression method according to claim 1, wherein the calculation formula for performing the frequency domain gain calculation using the frequency domain correlation function based on the autocorrelation spectrum and the cross-correlation spectrum is as follows:
Figure FDA0002698181980000031
wherein CPSD represents the cross-correlation spectrum of the external frequency domain audio data and the built-in frequency domain audio data; PSD1 represents the autocorrelation spectrum of the external frequency domain audio data; and PSD2 represents the autocorrelation spectrum of the built-in frequency domain audio data.
8. An ambient sound suppression apparatus for dual MIC inputs, the apparatus comprising:
the audio acquisition module: the device is used for externally connecting a far-field microphone device on the device, and enabling the externally connected microphone device and the built-in microphone device of the device to simultaneously carry out audio acquisition work so as to obtain externally connected audio data and built-in audio data;
a frequency domain transformation module: the external audio data and the built-in audio data are respectively subjected to frequency domain transformation processing to obtain external frequency domain audio data and built-in frequency domain audio data;
a correlation spectrum calculation module: the device is used for respectively carrying out autocorrelation spectrum and cross-correlation spectrum calculation on the basis of the external frequency domain audio data and the internal frequency domain audio data to obtain autocorrelation spectrum and cross-correlation spectrum of the external frequency domain audio data and the internal frequency domain audio data;
a gain calculation module: configured to perform frequency domain gain calculation using a frequency domain correlation function based on the autocorrelation spectrum and the cross-correlation spectrum, so as to eliminate the ambient noise in the external audio data.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the ambient sound suppression method according to any one of claims 1 to 7.
10. A terminal device, characterized in that it comprises:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to: performing the ambient sound suppression method according to any one of claims 1 to 7.
CN202011013186.6A 2020-09-24 2020-09-24 Dual-MIC input environmental sound suppression method and device, storage medium and equipment Pending CN112185407A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011013186.6A CN112185407A (en) 2020-09-24 2020-09-24 Dual-MIC input environmental sound suppression method and device, storage medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011013186.6A CN112185407A (en) 2020-09-24 2020-09-24 Dual-MIC input environmental sound suppression method and device, storage medium and equipment

Publications (1)

Publication Number Publication Date
CN112185407A true CN112185407A (en) 2021-01-05

Family

ID=73956131

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011013186.6A Pending CN112185407A (en) 2020-09-24 2020-09-24 Dual-MIC input environmental sound suppression method and device, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN112185407A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101916567A (en) * 2009-11-23 2010-12-15 瑞声声学科技(深圳)有限公司 Speech enhancement method applied to dual-microphone system
KR101442700B1 (en) * 2013-09-03 2014-09-23 서강대학교산학협력단 Noise cancellation method and apparatus using independent component analysis for headphones
CN104602162A (en) * 2014-12-17 2015-05-06 惠州Tcl移动通信有限公司 External noise reduction device for mobile terminal and noise reduction method of external noise reduction device
CN105577909A (en) * 2015-05-26 2016-05-11 东莞酷派软件技术有限公司 Denoising method and device
CN108269582A (en) * 2018-01-24 2018-07-10 厦门美图之家科技有限公司 A kind of orientation sound pick-up method and computing device based on two-microphone array
CN109741758A (en) * 2019-01-14 2019-05-10 杭州微纳科技股份有限公司 A kind of dual microphone voice de-noising method

Similar Documents

Publication Publication Date Title
CN107123430B (en) Echo cancel method, device, meeting plate and computer storage medium
US8219394B2 (en) Adaptive ambient sound suppression and speech tracking
CN108376548B (en) Echo cancellation method and system based on microphone array
CN110956969B (en) Live broadcast audio processing method and device, electronic equipment and storage medium
CN111951819A (en) Echo cancellation method, device and storage medium
US9232309B2 (en) Microphone array processing system
US9280984B2 (en) Noise cancellation method
CN110782914B (en) Signal processing method and device, terminal equipment and storage medium
CN110176244A (en) Echo cancel method, device, storage medium and computer equipment
CN107833579B (en) Noise elimination method, device and computer readable storage medium
CN109361995B (en) Volume adjusting method and device for electrical equipment, electrical equipment and medium
CN108447496A (en) A kind of sound enhancement method and device based on microphone array
CN109727605B (en) Method and system for processing sound signal
US20160073209A1 (en) Maintaining spatial stability utilizing common gain coefficient
CN111583950A (en) Audio processing method and device, electronic equipment and storage medium
CN107452398B (en) Echo acquisition method, electronic device and computer readable storage medium
US11380312B1 (en) Residual echo suppression for keyword detection
CN204117590U (en) Voice collecting denoising device and voice quality assessment system
US9516418B2 (en) Sound field spatial stabilizer
CN112185407A (en) Dual-MIC input environmental sound suppression method and device, storage medium and equipment
CN109584898B (en) Voice signal processing method and device, storage medium and electronic equipment
WO2020107455A1 (en) Voice processing method and apparatus, storage medium, and electronic device
CN114220451A (en) Audio denoising method, electronic device, and storage medium
CN108234792A (en) Audio signal processing method, electronic device and computer readable storage medium
CN115665642B (en) Noise elimination method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 238, room 406, No.1, Yichuang street, Huangpu District, Guangzhou, Guangdong 510000

Applicant after: Guangzhou langguo Electronic Technology Co.,Ltd.

Address before: 510000 unit a and B, zone 02, 4th floor, No. 136, Gaopu Road, high tech Development Zone, Tianhe District, Guangzhou City, Guangdong Province

Applicant before: GUANGZHOU LANGO ELECTRONIC SCIENCE & TECHNOLOGY Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20210105