CN110310655B - Microphone signal processing method, device, equipment and storage medium - Google Patents

Microphone signal processing method, device, equipment and storage medium

Info

Publication number
CN110310655B
Authority
CN
China
Prior art keywords
signal
processing
voice
module
noise reduction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910324799.2A
Other languages
Chinese (zh)
Other versions
CN110310655A (en)
Inventor
刘荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd filed Critical Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority to CN201910324799.2A priority Critical patent/CN110310655B/en
Publication of CN110310655A publication Critical patent/CN110310655A/en
Application granted granted Critical
Publication of CN110310655B publication Critical patent/CN110310655B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)

Abstract

The invention provides a microphone signal processing method, apparatus, device, and storage medium. After linear echo cancellation and beamforming, the signal is split into three paths. The first path undergoes first noise reduction, then first nonlinear echo suppression, and then voice presence detection to obtain a voice presence detection result X. The second path undergoes second noise reduction and then first automatic gain control to obtain a voice recognition signal Y for voice recognition. X and Y are combined into two sound channels for use by the voice recognition APP. The third path undergoes third noise reduction, then second nonlinear echo suppression to further suppress residual echo, and then second automatic gain control to obtain a voice application signal Z for recording or communication APPs. Because the voice recognition APP and other voice APPs have different requirements, the invention splits the signal into three paths; the structure is flexible, and the parameters and algorithms used for each part of the signal can be adjusted independently without affecting one another.

Description

Microphone signal processing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of speech signal processing, and more particularly, to a method, an apparatus, a device, and a storage medium for processing a microphone signal.
Background
In speech recognition applications, the microphone signal requires some pre-processing, such as beamforming, acoustic echo cancellation (AEC), noise reduction (NR), automatic gain control (AGC), dereverberation (DR), and voice presence detection (VAD). In an operating system, speech recognition software is usually a general-purpose APP that acquires the voice signal directly from a sound card device and performs recognition. Beamforming, echo cancellation, dereverberation, and similar processing, however, are closely tied to the hardware design and are poorly suited to being implemented independently inside application software: each application would have to implement them separately, the computation would be repeated, some of the required information may not even be available to the application, and generality is poor. Some prior art solutions therefore implement the processing in the firmware of the microphone module, with the disadvantages that the computational load is large and the module cost is high. Others implement it in the driver, with the disadvantage that resources are limited, for example floating-point operations, locks, task scheduling, and sleeping.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a microphone signal processing method, apparatus, device, and storage medium.
In a first aspect, an embodiment of the present invention provides a microphone signal processing method, including the following steps:
S1: performing linear echo cancellation (AEC) on the multi-path microphone signals together with a reference signal, to cancel the loudspeaker sound picked up by the microphones;
S2: performing beamforming on the multi-path microphone signals after the linear echo cancellation, and splitting the beamformed signal into three paths, wherein:
the first path of the signal is subjected to first noise reduction processing and then to first nonlinear echo suppression processing to further suppress residual echo, and voice presence detection (VAD) is then performed to obtain a voice presence detection result X;
the second path of the signal is subjected to second noise reduction processing and then to first automatic gain control (AGC) processing to obtain a voice recognition signal Y for voice recognition;
the voice presence detection result X and the voice recognition signal Y are combined into two sound channels and provided to the voice recognition APP;
two different noise reduction algorithms (the first and the second) are used here because, for the voice signal used for speech recognition, too much noise reduction or poorly handled noise reduction severely degrades the recognition rate, whereas noise reduction for VAD needs to be strong, otherwise the VAD does not work properly. Nonlinear echo suppression is used only on the VAD path because it degrades the speech recognition rate but greatly helps VAD detection. With the two paths separated, both the speech recognition performance and the VAD performance can be ensured, debugging and optimization are easier, and the parameters are not coupled to one another;
the third path of the signal is subjected to third noise reduction processing and then to second nonlinear echo suppression processing to further suppress residual echo, and then to second automatic gain control processing to obtain a voice application signal Z for recording or communication APPs.
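The branching described in the steps above can be pictured as three independent chains fed by the same beamformed frame. The following is only a minimal sketch under stated assumptions: the names `chain`, `tagged`, `build_paths`, and `process` are hypothetical, and each stage is a pass-through placeholder that records its name instead of performing real noise reduction, echo suppression, or gain control.

```python
from typing import Callable, List, Tuple

Frame = List[float]
Stage = Callable[[Frame], Frame]

def chain(*stages: Stage) -> Stage:
    """Compose stages left to right into one processing path."""
    def run(frame: Frame) -> Frame:
        for stage in stages:
            frame = stage(frame)
        return frame
    return run

def tagged(name: str, log: List[str]) -> Stage:
    """Placeholder stage (assumption): logs its name, returns the frame unchanged."""
    def stage(frame: Frame) -> Frame:
        log.append(name)
        return list(frame)
    return stage

def build_paths(log: List[str]) -> Tuple[Stage, Stage, Stage]:
    """Build the three paths; each has its own stages and could be tuned separately."""
    vad_path = chain(tagged("NR1", log), tagged("NLP1", log))
    asr_path = chain(tagged("NR2", log), tagged("AGC1", log))
    app_path = chain(tagged("NR3", log), tagged("NLP2", log), tagged("AGC2", log))
    return vad_path, asr_path, app_path

def process(beamformed: Frame, paths: Tuple[Stage, Stage, Stage],
            vad: Callable[[Frame], int]):
    """Feed one beamformed frame to all three paths and return (X, Y, Z)."""
    vad_path, asr_path, app_path = paths
    x = vad(vad_path(beamformed))   # voice presence detection result X
    y = asr_path(beamformed)        # voice recognition signal Y
    z = app_path(beamformed)        # voice application signal Z
    return x, y, z
```

Because each path owns its stages, replacing the placeholder for one path (for example a stronger noise reducer on the VAD path) leaves the other two untouched, which is the decoupling the text argues for.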
Preferably, in step S1, the reference signal is obtained from the loudspeaker, or from the sound card driver or the voice playback software.
Preferably, in step S1, an adaptive filter is used to perform linear echo cancellation on each microphone signal together with the reference signal.
Preferably, in step S2, when beamforming is performed on the multi-path microphone signals, the angle of arrival, i.e. the direction of arrival (DOA), needs to be known, and the DOA is calculated according to a preset DOA estimation method.
Preferably, in step S2, the voice presence detection result X and the voice recognition signal Y are combined into two sound channels as follows: the voice presence detection result X is placed alone on one of the channels and the voice recognition signal Y is placed alone on the other channel. For example, the left channel stores the voice signal and the right channel stores the VAD information, where 0 indicates no voice and a non-zero value indicates voice.
Preferably, in step S2, the voice presence detection result X and the voice recognition signal Y are combined as follows: a chosen bit of the voice recognition signal Y is used to store the voice presence detection result X. For example, the lowest bit of the voice recognition signal Y stores the result X: a lowest bit of 0 indicates no voice, and a lowest bit of 1 indicates voice. A typical voice signal is 16-bit or 24-bit, so replacing the lowest bit with 0 or 1 introduces a change that is buried in the noise, and the recognition rate is hardly affected.
Preferably, the multi-path microphone signals are acquired from the multiple microphone hardware units through the sound card driver and sent to a signal processing service program; the signal processing service program processes them according to the method above; the processed signals are stored in a virtual sound card driver; and the virtual sound card driver emulates multiple audio input ports that provide the processed microphone signals to the voice recognition APP and to other APPs, respectively. For example, the audio stream formed by combining the voice presence detection result X and the voice recognition signal Y is provided to the voice recognition APP, and the audio stream of the voice application signal Z is provided to other APPs such as recording and communication APPs.
The structure of a signal processing service program plus a virtual sound card driver is adopted for the following reasons:
1. strong generality: the upper-layer interface is uniform, no APP needs to do its own processing, and repeated computation is avoided;
2. strong independence: the whole processing method runs inside the signal processing service program, so there are few development restrictions, and the algorithms and code of the service program can be debugged, updated, and deployed independently;
3. the signal processing runs as an application-level service program, so development difficulty is low, resource limitations are few, and debugging is convenient;
4. VAD is placed together with the signal processing, so more information is available to it, such as the reference signal and various intermediate data from the signal processing, and using this information improves the VAD result.
In a second aspect, an embodiment of the present invention provides a microphone signal processing apparatus, including:
a linear echo cancellation module: configured to perform linear echo cancellation on the multi-path microphone signals together with a reference signal, to cancel the loudspeaker sound picked up by the microphones;
a beamforming module: configured to perform beamforming on the multi-path microphone signals output by the linear echo cancellation module;
a first noise reduction module: configured to perform noise reduction on one path of the beamformed signal;
a first nonlinear echo suppression module: configured to perform nonlinear echo suppression on the signal output by the first noise reduction module;
a voice presence detection module: configured to perform voice presence detection on the signal output by the first nonlinear echo suppression module to obtain a voice presence detection result X;
a second noise reduction module: configured to perform noise reduction on another path of the beamformed signal;
a first automatic gain control module: configured to perform automatic gain control on the signal output by the second noise reduction module to obtain a voice recognition signal Y for voice recognition;
a signal merging module: configured to combine the voice presence detection result X and the voice recognition signal Y into left and right sound channels provided to the voice recognition APP;
a third noise reduction module: configured to perform noise reduction on the third path of the beamformed signal;
a second nonlinear echo suppression module: configured to perform nonlinear echo suppression on the signal output by the third noise reduction module;
a second automatic gain control module: configured to perform second automatic gain control on the signal output by the second nonlinear echo suppression module to obtain a voice application signal Z for recording or communication APPs.
In a third aspect, an embodiment of the present invention provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any one of the methods described above.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, which program, when executed by a processor, performs the steps of any one of the methods described above.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
1. to meet the different requirements of the voice recognition APP and other voice APPs, the signal is split into three paths: one path is used for voice presence detection, and the other paths undergo voice signal processing, which includes noise reduction and automatic gain control and, on the voice application path, nonlinear echo suppression; the parameters and algorithms of the three signal processing paths can be adjusted independently without affecting one another;
2. the voice presence detection result X is mixed directly into the voice recognition signal Y, so no additional channel needs to be added to carry the VAD information; this is convenient to implement and requires no change to the framework or structure of the original system.
Drawings
Fig. 1 is a flowchart of a microphone signal processing method according to embodiment 1 of the present invention.
Fig. 2 is a schematic diagram of a left channel storing a voice signal and a right channel storing VAD information according to embodiment 1 of the present invention.
Fig. 3 is a schematic diagram of a microphone signal processing apparatus according to embodiment 2 of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent.
The technical solution of the present invention is further described below with reference to the accompanying drawings and embodiments.
Example 1
As shown in fig. 1, an embodiment of the present invention provides a microphone signal processing method, including the following steps:
S1: performing linear echo cancellation (AEC) on the multi-path microphone signals together with a reference signal, to cancel the loudspeaker sound picked up by the microphones;
S2: performing beamforming on the multi-path microphone signals after the linear echo cancellation, and splitting the beamformed signal into three paths, wherein:
the first path of the signal is subjected to first noise reduction processing and then to first nonlinear echo suppression processing to further suppress residual echo. The reference signal used in step S1 is also needed for the nonlinear echo suppression. Because linear echo cancellation usually cannot completely cancel the loudspeaker sound picked up by the microphones, the residual echo is suppressed further so that voice presence detection (VAD) can be performed more reliably; voice presence detection is then performed to obtain a voice presence detection result X;
the second path of the signal is subjected to second noise reduction processing and then to first automatic gain control (AGC) processing to obtain a voice recognition signal Y for voice recognition;
the voice presence detection result X and the voice recognition signal Y are combined into two sound channels and provided to the voice recognition APP;
two different noise reduction algorithms (the first and the second) are used here because, for the voice signal used for speech recognition, too much noise reduction or poorly handled noise reduction severely degrades the recognition rate, whereas noise reduction for VAD needs to be strong, otherwise the VAD does not work properly. Nonlinear echo suppression is used only on the VAD path because it degrades the speech recognition rate but greatly helps VAD detection. With the two paths separated, both the speech recognition performance and the VAD performance can be ensured, debugging and optimization are easier, and the parameters are not coupled to one another.
When a noise reduction algorithm is executed, a noise estimate is needed; it is calculated according to a preset noise estimation method. A conventional noise estimation method may be used here.
The third path of the signal is subjected to third noise reduction processing and then to second nonlinear echo suppression processing to further suppress residual echo, and then to second automatic gain control processing to obtain a voice application signal Z for recording or communication APPs.
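The text leaves the noise estimation method open ("a conventional noise estimation method may be used"). Purely as an illustration, the sketch below tracks a noise floor from per-frame energies with an asymmetric smoother and derives a subtraction-style gain from it. The function names, the `rise` and `fall` constants, and the `strength` parameter are all assumptions, not taken from the patent; `strength` is only meant to show how the VAD path could be given stronger suppression than the recognition path.

```python
from typing import List

def track_noise_floor(frame_energies: List[float],
                      rise: float = 1.05, fall: float = 0.5) -> List[float]:
    """Asymmetric tracker: follows drops quickly, rises slowly during loud bursts."""
    est = frame_energies[0]
    out = []
    for e in frame_energies:
        if e < est:
            # Energy fell below the estimate: move halfway (fall=0.5) toward it.
            est = fall * est + (1.0 - fall) * e
        else:
            # Energy above the estimate: let the floor creep up by at most 5%.
            est = min(e, est * rise)
        out.append(est)
    return out

def suppression_gain(energy: float, noise: float,
                     strength: float, floor: float = 0.1) -> float:
    """Subtraction-style gain in [floor, 1]; larger `strength` removes more."""
    if energy <= 0.0:
        return floor
    return max(floor, 1.0 - strength * noise / energy)

# Hypothetical per-path settings: mild for recognition, strong for VAD.
ASR_STRENGTH = 1.0
VAD_STRENGTH = 2.0
```

With a frame energy of 4.0 and a noise estimate of 1.0, the mild setting yields a gain of 0.75 and the strong setting 0.5, so the same estimate can drive differently tuned paths.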
In step S1, the reference signal is obtained from the loudspeaker, or from the sound card driver or the voice playback software.
In step S1, an adaptive filter is used to perform linear echo cancellation on each microphone signal together with the reference signal.
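The text does not say which adaptive filter is used. As one common choice, the sketch below uses a normalized LMS (NLMS) filter for a single microphone channel; this is an assumption made for illustration, and the function name, tap count, and step size are hypothetical. The filter learns the loudspeaker-to-microphone path from the reference signal and subtracts the estimated echo.

```python
from typing import List

def nlms_echo_cancel(mic: List[float], ref: List[float],
                     taps: int = 8, mu: float = 0.5,
                     eps: float = 1e-8) -> List[float]:
    """Return mic minus the adaptively estimated echo, sample by sample."""
    w = [0.0] * taps      # current estimate of the echo path
    buf = [0.0] * taps    # most recent reference samples, newest first
    out = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))    # estimated echo
        e = d - y                                     # echo-cancelled sample
        norm = sum(b * b for b in buf) + eps          # reference power
        g = mu * e / norm
        w = [wi + g * bi for wi, bi in zip(w, buf)]   # NLMS coefficient update
        out.append(e)
    return out
```

For multi-path microphone signals, one such filter would run per microphone, all sharing the same reference signal.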
In step S2, when beamforming is performed on the multi-path microphone signals, the angle of arrival, i.e. the direction of arrival (DOA), needs to be known, and the DOA is calculated according to a preset DOA estimation method.
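The DOA estimation method and the beamformer type are not specified in the text. As a hedged illustration for a two-microphone case, the sketch below estimates the inter-microphone delay by cross-correlation, converts it to an arrival angle under a far-field assumption, and applies a delay-and-sum beamformer. All function names, the 343 m/s speed of sound, and the integer-sample delay model are assumptions.

```python
import math
from typing import List

def estimate_delay(a: List[float], b: List[float], max_lag: int) -> int:
    """Lag in samples by which b trails a, chosen to maximise cross-correlation."""
    def corr(lag: int) -> float:
        return sum(a[n] * b[n + lag]
                   for n in range(len(a)) if 0 <= n + lag < len(b))
    return max(range(-max_lag, max_lag + 1), key=corr)

def delay_to_angle(lag: int, fs: float, spacing_m: float,
                   c: float = 343.0) -> float:
    """Arrival angle in degrees from broadside (far-field, two microphones)."""
    s = max(-1.0, min(1.0, lag * c / (fs * spacing_m)))
    return math.degrees(math.asin(s))

def delay_and_sum(a: List[float], b: List[float], lag: int) -> List[float]:
    """Advance b by `lag` samples so it lines up with a, then average the two."""
    out = []
    for n in range(len(a)):
        m = n + lag
        bm = b[m] if 0 <= m < len(b) else 0.0
        out.append(0.5 * (a[n] + bm))
    return out
```

A real array would typically use more microphones and fractional delays; the point here is only that the beamformer needs the estimated direction before it can align the channels.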
In step S2, the voice presence detection result X and the voice recognition signal Y are combined into two sound channels as follows: the voice presence detection result X is placed alone on one of the channels and the voice recognition signal Y is placed alone on the other channel. As shown in Fig. 2, the left channel stores the voice signal and the right channel stores the VAD information, where 0 indicates no voice and a non-zero value indicates voice.
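The two-channel layout just described can be sketched as an interleaved stereo buffer, with the voice sample on the left and a VAD marker on the right. The function names and the marker value 32767 are assumptions; the text only requires that 0 means no voice and any non-zero value means voice.

```python
from typing import List, Tuple

def pack_stereo(y: List[int], vad: List[int],
                vad_value: int = 32767) -> List[int]:
    """Interleave as L, R, L, R, ...: left = voice sample, right = VAD marker."""
    if len(y) != len(vad):
        raise ValueError("y and vad must have the same length")
    out = []
    for sample, flag in zip(y, vad):
        out.append(sample)                     # left channel: signal Y
        out.append(vad_value if flag else 0)   # right channel: result X
    return out

def unpack_stereo(frames: List[int]) -> Tuple[List[int], List[int]]:
    """Recover (Y samples, VAD flags); any non-zero right sample counts as voice."""
    y = frames[0::2]
    vad = [1 if r != 0 else 0 for r in frames[1::2]]
    return y, vad
```

A recognition APP reading this stream would call something like `unpack_stereo` on each captured buffer before passing the left-channel samples to the recognizer.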
In step S2, the voice presence detection result X and the voice recognition signal Y may alternatively be combined as follows: a chosen bit of the voice recognition signal Y is used to store the voice presence detection result X. For example, the lowest bit of the voice recognition signal Y stores the result X: a lowest bit of 0 indicates no voice, and a lowest bit of 1 indicates voice. A typical voice signal is 16-bit or 24-bit, so replacing the lowest bit with 0 or 1 introduces a change that is buried in the noise, and the recognition rate is hardly affected.
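The lowest-bit variant can be sketched with two bit operations on signed integer samples. The function names are hypothetical. The helper `lsb_level_dbfs` is an added illustration of why the change is small: one least significant bit of a signed 16-bit sample sits roughly 90 dB below full scale.

```python
import math

def embed_vad_lsb(sample: int, voiced: bool) -> int:
    """Overwrite the lowest bit of a signed PCM sample with the VAD flag."""
    return (sample & ~1) | (1 if voiced else 0)

def extract_vad_lsb(sample: int) -> bool:
    """Read the VAD flag back from the lowest bit."""
    return bool(sample & 1)

def lsb_level_dbfs(bits: int) -> float:
    """Level of one LSB relative to full scale for signed `bits`-bit samples."""
    return 20.0 * math.log10(1.0 / (2 ** (bits - 1)))
```

Clearing and setting the lowest bit changes a sample by at most 1 and cannot push it outside the original signed range, so no clipping check is needed.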
Preferably, the multi-path microphone signals are acquired from the multiple microphone hardware units through the sound card driver and sent to a signal processing service program; the signal processing service program processes them according to the method above; the processed signals are stored in a virtual sound card driver; and the virtual sound card driver emulates multiple audio input ports that provide the processed microphone signals to the voice recognition APP and to other APPs, respectively. For example, the audio stream formed by combining the voice presence detection result X and the voice recognition signal Y is provided to the voice recognition APP, and the audio stream of the voice application signal Z is provided to other APPs such as recording and communication APPs.
The structure of a signal processing service program plus a virtual sound card is adopted for the following reasons:
1. strong generality: the upper-layer interface is uniform, no APP needs to do its own processing, and repeated computation is avoided;
2. strong independence: the algorithms and code of the signal processing service program can be debugged, updated, and deployed independently;
3. the signal processing runs as an application-level service program, so development difficulty is low, resource limitations are few, and debugging is convenient;
4. VAD is placed together with the signal processing, so more information is available to it, such as the reference signal and various intermediate data from the signal processing, and using this information improves the VAD result.
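The service-plus-virtual-sound-card arrangement can be modelled, very loosely, as a service that writes finished frames to named input ports from which each APP reads. The sketch below is an in-process stand-in only, not a real driver: the class `VirtualSoundCard`, the port names `asr_in` and `app_in`, and `service_step` are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Optional

class VirtualSoundCard:
    """Toy model of a virtual sound card exposing several audio input ports."""

    def __init__(self, port_names: List[str]) -> None:
        self._ports: Dict[str, Deque[list]] = {n: deque() for n in port_names}

    def write(self, port: str, frame: list) -> None:
        """Called by the signal processing service to store a processed frame."""
        self._ports[port].append(frame)

    def read(self, port: str) -> Optional[list]:
        """Called by an APP; returns the oldest frame or None if the port is empty."""
        queue = self._ports[port]
        return queue.popleft() if queue else None

def service_step(card: VirtualSoundCard, xy_frame: list, z_frame: list) -> None:
    """Publish one processed frame pair: X+Y for recognition, Z for other APPs."""
    card.write("asr_in", xy_frame)
    card.write("app_in", z_frame)
```

Each APP sees an ordinary input port and needs no knowledge of the processing behind it, which is the uniform upper-layer interface named in the first reason above.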
The scheme of this embodiment can be used in a video conferencing device or a preschool education device. To support recording, remote education, voice control, and similar functions, the whole device needs a microphone input and a loudspeaker for sound output. The long pickup distance required and the interference from the loudspeaker signal can seriously degrade recording and speech recognition. A microphone signal pre-processing module is therefore needed to remove the loudspeaker echo and the ambient noise contained in the microphone signal and to adjust the signal to a suitable amplitude before it is sent to the recording software or to the voice recognition module. In addition, so that the microphone signal is not sent to the voice recognition module when no voice is present, VAD is needed to detect whether a voice signal is currently present; the microphone data is sent to the voice recognition module only when one is. The voice recognition module and the recording software can then work independently at the user application level without being concerned with the voice signal processing. This arrangement allows a very low-cost microphone module to be used (because it performs no signal processing), with the signal processing located on the main CPU of the system.
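The gating behaviour described above, where microphone data reaches the recognizer only while voice is present, can be sketched as a filter over frames and their VAD flags. The function name is hypothetical, and the `hangover` parameter (forwarding a few frames after voice ends so word endings are not cut off) is an added assumption that the text does not mention.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

def gate_frames(frames: Sequence[T], flags: Sequence[int],
                hangover: int = 2) -> List[T]:
    """Keep voiced frames plus up to `hangover` frames after voice stops."""
    remaining = 0
    kept: List[T] = []
    for frame, flag in zip(frames, flags):
        if flag:
            remaining = hangover      # voice present: reset the hangover counter
            kept.append(frame)
        elif remaining > 0:
            remaining -= 1            # just after voice: still forward the frame
            kept.append(frame)
    return kept
```

Setting `hangover=0` reproduces the strict behaviour in the text, where only frames flagged as voice are forwarded.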
Example 2
As shown in fig. 3, embodiment 2 of the present invention provides a microphone signal processing apparatus, including:
a linear echo cancellation module: configured to perform linear echo cancellation on the multi-path microphone signals together with a reference signal, to cancel the loudspeaker sound picked up by the microphones;
a beamforming module: configured to perform beamforming on the multi-path microphone signals output by the linear echo cancellation module;
a first noise reduction module: configured to perform noise reduction on one path of the beamformed signal;
a first nonlinear echo suppression module: configured to perform nonlinear echo suppression on the signal output by the first noise reduction module;
a voice presence detection module: configured to perform voice presence detection on the signal output by the first nonlinear echo suppression module to obtain a voice presence detection result X;
a second noise reduction module: configured to perform noise reduction on another path of the beamformed signal;
a first automatic gain control module: configured to perform automatic gain control on the signal output by the second noise reduction module to obtain a voice recognition signal Y for voice recognition;
a signal merging module: configured to combine the voice presence detection result X and the voice recognition signal Y into left and right sound channels provided to the voice recognition APP;
a third noise reduction module: configured to perform noise reduction on the third path of the beamformed signal;
a second nonlinear echo suppression module: configured to perform nonlinear echo suppression on the signal output by the third noise reduction module;
a second automatic gain control module: configured to perform second automatic gain control on the signal output by the second nonlinear echo suppression module to obtain a voice application signal Z for recording or communication APPs.
Example 3
Embodiment 3 of the present invention provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, it implements the steps of any one of the methods described above. In this embodiment, the processor is the control center of the computer system, and may be a processor of a physical machine or a processor of a virtual machine.
Example 4
Embodiment 4 of the present invention provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program performs the steps of any one of the methods described above. The computer-readable storage medium may include, but is not limited to, any type of disk, including floppy disks, optical disks, DVDs, CD-ROMs, microdrives, and magneto-optical disks, as well as ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of medium or device suitable for storing instructions and/or data.
It is clear to a person skilled in the art that the solution according to the embodiments of the invention can be implemented by means of software and/or hardware. The "unit" or "module" in the present specification means software and/or hardware capable of performing a specific function by itself or in cooperation with other components.
It should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the present invention clearly, and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (10)

1. A microphone signal processing method, characterized by comprising the steps of:
S1: performing linear echo cancellation processing on the multi-path microphone signals together with a reference signal, to cancel the loudspeaker sound picked up by the microphones;
S2: performing beamforming on the multi-path microphone signals after the linear echo cancellation, and splitting the beamformed signal into three paths, wherein:
the first path of the signal is subjected to first noise reduction processing and then to first nonlinear echo suppression processing to further suppress residual echo, and voice presence detection is then performed to obtain a voice presence detection result X;
the second path of the signal is subjected to second noise reduction processing and then to first automatic gain control processing to obtain a voice recognition signal Y for voice recognition;
the voice presence detection result X and the voice recognition signal Y are combined into two sound channels and provided to the voice recognition APP;
the third path of the signal is subjected to third noise reduction processing and then to second nonlinear echo suppression processing to further suppress residual echo, and then to second automatic gain control processing to obtain a voice application signal Z for recording or communication APPs.
2. The microphone signal processing method according to claim 1, wherein in step S1, the reference signal is obtained from a speaker or from sound card driver/voice playing software.
3. The microphone signal processing method according to claim 1, wherein in step S1, the adaptive filter is used to perform linear echo cancellation processing on each microphone signal and the reference signal together.
4. The method as claimed in claim 1, wherein in step S2, the arrival angle is required to be known when the multi-path microphone signal is processed by beamforming, and the arrival angle is calculated according to a predetermined arrival angle estimation method.
5. The microphone signal processing method of claim 1, wherein in step S2, the voice presence detection result X and the voice recognition signal Y are combined into two channels, specifically: the speech presence detection result X is placed solely on one of the channels and the speech recognition signal Y is placed solely on the other channel.
6. The microphone signal processing method according to claim 1, wherein in step S2, the voice presence detection result X and the voice recognition signal Y are combined into two channels, specifically: a certain bit of the voice recognition signal Y is used to store the voice presence detection result X.
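The claim does not say which bit carries X. The sketch below assumes the least significant bit of each signed 16-bit sample of Y, which changes the audio by at most one quantization step; the function names are hypothetical.

```python
from typing import List


def embed_flag(sample: int, speech_present: bool) -> int:
    """Store the detection flag in the least significant bit of a signed 16-bit sample."""
    return (sample & ~1) | int(speech_present)


def extract_flag(sample: int) -> bool:
    """Read the detection flag back from the least significant bit."""
    return bool(sample & 1)


def embed_stream(y_samples: List[int], x_flags: List[int]) -> List[int]:
    """Embed one flag per sample into the voice recognition signal Y."""
    return [embed_flag(s, bool(f)) for s, f in zip(y_samples, x_flags)]
```

Because clearing and setting the lowest bit never moves a value outside the range -32768 to 32767, the result is still valid 16-bit audio.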
7. The microphone signal processing method according to any one of claims 1 to 6, wherein the multi-path microphone signals are acquired from the multi-path microphone hardware by a sound card driver and sent to a signal processing service program; the signal processing service program processes them according to the method; the processed signals are stored in a virtual sound card driver; and the virtual sound card driver simulates multiple audio input ports that provide the processed microphone signals to the voice recognition APP and to other APPs, respectively.
8. A microphone signal processing apparatus, comprising:
a linear echo cancellation module: used for performing linear echo cancellation processing on multi-path microphone signals together with a reference signal, so as to cancel the loudspeaker sound picked up by the microphones;
a beam forming module: used for performing beamforming processing on the multi-path microphone signals output by the linear echo cancellation module;
a first noise reduction module: used for performing noise reduction processing on one path of the beamformed signal;
a first nonlinear echo suppression module: used for performing nonlinear echo suppression processing on the signal output by the first noise reduction module;
a voice presence detection module: used for performing voice presence detection on the signal output by the first nonlinear echo suppression module to obtain a voice presence detection result X;
a second noise reduction module: used for performing noise reduction processing on another path of the beamformed signal;
a first automatic gain control module: used for performing automatic gain control on the signal output by the second noise reduction module to obtain a voice recognition signal Y for voice recognition;
a signal merging module: used for combining the voice presence detection result X and the voice recognition signal Y into left and right sound channels that are provided to the voice recognition APP;
a third noise reduction module: used for performing noise reduction processing on the third path of the beamformed signal;
a second nonlinear echo suppression module: used for performing nonlinear echo suppression processing on the signal output by the third noise reduction module;
a second automatic gain control module: used for performing second automatic gain control processing on the signal output by the second nonlinear echo suppression module to obtain a voice application signal Z for a recording or communication APP.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1-7 are implemented when the program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN201910324799.2A 2019-04-22 2019-04-22 Microphone signal processing method, device, equipment and storage medium Active CN110310655B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910324799.2A CN110310655B (en) 2019-04-22 2019-04-22 Microphone signal processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910324799.2A CN110310655B (en) 2019-04-22 2019-04-22 Microphone signal processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110310655A CN110310655A (en) 2019-10-08
CN110310655B (en) 2021-10-22

Family

ID=68075394

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910324799.2A Active CN110310655B (en) 2019-04-22 2019-04-22 Microphone signal processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110310655B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233687B (en) * 2020-12-10 2021-07-16 统信软件技术有限公司 Audio noise reduction device and computing equipment
CN113823314B (en) * 2021-08-12 2022-10-28 北京荣耀终端有限公司 Voice processing method and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8239194B1 (en) * 2011-07-28 2012-08-07 Google Inc. System and method for multi-channel multi-feature speech/noise classification for noise suppression
CN104183234A (en) * 2013-05-28 2014-12-03 展讯通信(上海)有限公司 Method and device for processing voice signal and achieving multi-party conversation, and communication terminal
CN105304093A (en) * 2015-11-10 2016-02-03 百度在线网络技术(北京)有限公司 Signal front-end processing method used for voice recognition and device thereof
CN107547813A (en) * 2016-06-29 2018-01-05 深圳市巨龙科教高技术股份有限公司 A kind of system and method for acquisition process multipath audio signal
CN108766456A (en) * 2018-05-22 2018-11-06 出门问问信息科技有限公司 A kind of method of speech processing and device
CN109074816A (en) * 2016-06-15 2018-12-21 英特尔公司 Far field automatic speech recognition pretreatment
CN109660904A (en) * 2019-02-02 2019-04-19 恒玄科技(上海)有限公司 Headphone device, audio signal processing method and system

Also Published As

Publication number Publication date
CN110310655A (en) 2019-10-08

Similar Documents

Publication Publication Date Title
CN110097891B (en) Microphone signal processing method, device, equipment and storage medium
US10535362B2 (en) Speech enhancement for an electronic device
US9913022B2 (en) System and method of improving voice quality in a wireless headset with untethered earbuds of a mobile device
US10269369B2 (en) System and method of noise reduction for a mobile device
US9197974B1 (en) Directional audio capture adaptation based on alternative sensory input
US9438985B2 (en) System and method of detecting a user's voice activity using an accelerometer
US9838784B2 (en) Directional audio capture
KR101171494B1 (en) Robust two microphone noise suppression system
EP2715725B1 (en) Processing audio signals
US8712069B1 (en) Selection of system parameters based on non-acoustic sensor information
US20100290615A1 (en) Echo canceller operative in response to fluctuation on echo path
US20140126746A1 (en) Signal-separation system using a directional microphone array and method for providing same
EP2863392B1 (en) Noise reduction in multi-microphone systems
US10978086B2 (en) Echo cancellation using a subset of multiple microphones as reference channels
US20170365249A1 (en) System and method of performing automatic speech recognition using end-pointing markers generated using accelerometer-based voice activity detector
KR20160099640A (en) Systems and methods for feedback detection
KR20120101457A (en) Audio zoom
WO2014051969A1 (en) System and method of detecting a user's voice activity using an accelerometer
CN112424863A (en) Voice perception audio system and method
CN110310655B (en) Microphone signal processing method, device, equipment and storage medium
US20150371656A1 (en) Acoustic Echo Preprocessing for Speech Enhancement
KR101982812B1 (en) Headset and method for improving sound quality thereof
CN111081233B (en) Audio processing method and electronic equipment
CN114008999B (en) Acoustic echo cancellation
US9729967B2 (en) Feedback canceling system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant