CN109410975B - Voice noise reduction method, device and storage medium - Google Patents

Voice noise reduction method, device and storage medium

Info

Publication number
CN109410975B
Authority
CN
China
Prior art keywords
signal
microphone
source
sound
voice signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811288716.0A
Other languages
Chinese (zh)
Other versions
CN109410975A (en)
Inventor
张晓红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Goertek Techology Co Ltd
Original Assignee
Goertek Techology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Goertek Techology Co Ltd
Priority to CN201811288716.0A
Publication of CN109410975A
Application granted
Publication of CN109410975B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/028 Voice signal separating using properties of sound source
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02163 Only one microphone

Abstract

Embodiments of the present application provide a voice noise reduction method, a device, and a storage medium. The method comprises: acquiring a sound signal collected by a microphone; determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and the microphone; and, if a target standard voice signal matching the source signal exists among at least one standard voice signal, filtering the sound signal collected by the microphone with the target standard voice signal as a reference. In this way, various types of noise signals in the sound signal collected by the microphone can be filtered out, and the useful signal obtained after noise reduction is consistent with a standard voice signal, so that interference of noise signals with voice processing can be effectively avoided.

Description

Voice noise reduction method, device and storage medium
Technical Field
The present application relates to the field of noise reduction technologies, and in particular, to a method, device, and storage medium for voice noise reduction.
Background
During a call, a voice signal is collected by a microphone. Because the microphone is usually exposed to the external environment, noise from that environment is collected as well. As a result, the voice signal during the call is severely degraded by noise.
In the prior art, microphone array beamforming is generally adopted to optimize call quality. Beamforming picks up only the sound arriving from the direction of the sound source, which reduces interference from noise in other directions and thereby eliminates part of the noise other than the voice signal.
However, this method provides limited noise reduction, and the resulting speech quality is poor.
Disclosure of Invention
Aspects of the present disclosure provide a voice noise reduction method, apparatus, and storage medium to reduce interference of a noise signal with a voice signal.
The embodiment of the application provides a voice noise reduction method, which comprises the following steps:
acquiring a sound signal collected by a microphone;
determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone;
and if a target standard voice signal matching the source signal exists among at least one standard voice signal, filtering the sound signal collected by the microphone with the target standard voice signal as a reference.
The embodiment of the application also provides voice noise reduction equipment, which comprises a memory, a processor and a communication component;
the memory is to store one or more computer instructions;
the processor is coupled with the memory and the communication component to execute one or more computer instructions to:
acquiring, through the communication component, a sound signal collected by a microphone;
determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and the microphone;
and if a target standard voice signal matching the source signal exists among at least one standard voice signal, filtering the sound signal collected by the microphone with the target standard voice signal as a reference.
Embodiments of the present application also provide a computer-readable storage medium storing computer instructions, which when executed by one or more processors, cause the one or more processors to perform the aforementioned voice noise reduction method.
In the embodiment of the application, a source signal emitted by a sound source is reversely deduced based on a sound signal collected by a microphone and a transfer function between the sound source and the microphone, and when a target standard speech signal matched with the reversely deduced source signal exists in at least one standard speech signal, a noise signal in the sound signal collected by the microphone is filtered. In the embodiment, various types of noise signals in the sound signals collected by the microphone can be filtered, and useful signals obtained after noise reduction are consistent with standard voice signals, so that the interference of the noise signals on voice processing can be effectively avoided.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic structural diagram of a speech noise reduction system according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a speech noise reduction method according to another embodiment of the present application;
fig. 3 is a schematic structural diagram of a speech noise reduction device according to yet another embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the prior art, microphone array beamforming is generally adopted to optimize call quality. Beamforming picks up only the sound arriving from the direction of the sound source, which reduces interference from noise in other directions and thereby eliminates part of the noise other than the voice signal; however, this method provides limited noise reduction, and the resulting speech quality is poor. To solve these problems, in some embodiments of the present application, a source signal emitted by the sound source is reversely deduced based on the sound signal collected by the microphone and a transfer function between the sound source and the microphone, and, when a target standard speech signal matching the reversely deduced source signal exists among at least one standard speech signal, the noise signal in the sound signal collected by the microphone is filtered out. In this way, various types of noise signals in the sound signal collected by the microphone can be filtered out, and the useful signal obtained after noise reduction is consistent with a standard voice signal, so that interference of noise signals with voice processing can be effectively avoided.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of a speech noise reduction system according to an embodiment of the present application. As shown in fig. 1, the system includes: a microphone 10 and a speech noise reduction device 20.
In this embodiment, the voice noise reduction system can be applied to various voice processing scenarios, such as a voice call scenario or a voice analysis scenario. Depending on the scenario, the system may be carried in different voice processing devices: in a voice call scenario, in a device such as an earphone or a mobile phone; in a voice analysis scenario, in a device such as a smart speaker or a voice collector. The deployment forms of the microphone 10 and the voice noise reduction device 20 can be adapted to the voice processing device in which the system is located. For example, for a headset, the microphone of the headset may be reused as the microphone 10, and the voice noise reduction device 20 may be disposed within the body of the headset or on a server that can communicate with the headset. As another example, for a mobile phone, the microphone of the mobile phone may be reused as the microphone 10, and the CPU of the mobile phone may be reused as the voice noise reduction device 20. Of course, these are only examples, and the present embodiment is not limited thereto.
In this embodiment, the voice noise reduction device 20 may acquire the sound signal collected by the microphone 10, and may determine the source signal emitted by the sound source 30 according to the sound signal and the transfer function between the sound source 30 and the microphone 10. In this embodiment, a single microphone may be used to implement voice noise reduction, and the number of microphones is not limited in this embodiment. It should be noted that, although only one microphone 10 is shown in fig. 1, this should not limit the number of microphones in the present embodiment.
In the various application scenarios of the voice noise reduction system, the relative position between the sound source 30 and the microphone 10 is usually fixed. For example, when a voice call is made with an earphone, the person's mouth serves as the sound source 30, and the angle, the distance, and the like between the mouth and the microphone are usually fixed. The inventors found in their research that, when the relative position between the sound source and the microphone is fixed, the energy at each frequency of a speech signal undergoes a specific attenuation as the signal travels from the sound source to the microphone, and that this attenuation can be described by a fixed transfer function; how the transfer function between the sound source and the microphone is determined is described in detail later. Therefore, in the present embodiment, the source signal emitted by the sound source 30 can be reversely deduced from the sound signal collected by the microphone 10 and the transfer function between the sound source 30 and the microphone 10.
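The reverse deduction described above can be sketched as a per-frequency division of the collected spectrum by the transfer function. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `derive_source_spectrum`, the `eps` guard against near-zero bins, and the toy transfer function that halves every frequency are all hypothetical.

```python
import numpy as np

def derive_source_spectrum(sound, transfer, eps=1e-12):
    """Estimate the source spectrum as S / F, one frequency bin at a time."""
    S = np.fft.rfft(sound)
    # Assumption: bins where F is (nearly) zero are clamped to eps to avoid
    # division by zero; the source text does not say how such bins are handled.
    F = np.where(np.abs(transfer) < eps, eps, transfer)
    return S / F

# Toy example: a path that attenuates every frequency by half.
t = np.arange(256) / 256.0
source = np.sin(2 * np.pi * 8 * t)
F = np.full(129, 0.5 + 0j)                      # rfft of 256 samples has 129 bins
sound = np.fft.irfft(np.fft.rfft(source) * F, n=256)
recovered = np.fft.irfft(derive_source_spectrum(sound, F), n=256)
```

In this noise-free toy case `recovered` equals `source` up to rounding; with noise in `sound`, the division carries the noise into the estimate, which is why the matching step that follows is needed.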
Based on the reversely deduced source signal, the voice noise reduction device 20 may determine whether a target standard voice signal matching the source signal exists among the at least one standard voice signal. The voice noise reduction device 20 may store information of at least one standard voice signal in advance, or it may obtain information of the standard voice signal from a network during the noise reduction process; this embodiment does not limit the choice. The information of the standard voice signal can be obtained from public sources and is not described further here.
As mentioned above, the microphone 10 is usually exposed to the external environment, so the sound signal it collects may include a noise signal. In that case, the source signal reversely deduced from the collected sound signal includes, in addition to the voice signal emitted by the sound source 30, an interference component corresponding to the noise signal. When the proportion of noise in the collected sound signal is low, the reversely deduced source signal does not differ much from the voice signal emitted by the sound source 30. In this case, a target standard speech signal corresponding to the reversely deduced source signal can therefore be matched among the at least one standard speech signal.
In this embodiment, if a target standard speech signal matching the source signal exists in the at least one standard speech signal, the speech noise reduction device 20 may perform filtering processing on the sound signal collected by the microphone 10 by using the target standard speech signal as a reference.
When a target standard voice signal matching the source signal exists among the at least one standard voice signal, this indicates that the voice component makes up a relatively large proportion of the sound signal collected by the microphone 10 and that this voice component corresponds to the target standard voice signal. The sound signal collected by the microphone 10 can therefore be filtered with the target standard voice signal as a reference, so as to filter out the noise signal it contains.
In the embodiment of the present application, a source signal emitted by the sound source 30 is reversely deduced based on the sound signal collected by the microphone 10 and a transfer function between the sound source 30 and the microphone 10, and when a target standard speech signal matching the reversely deduced source signal exists in at least one standard speech signal, a noise signal in the sound signal collected by the microphone 10 is filtered. In this embodiment, various types of noise signals in the sound signals collected by the microphone 10 can be filtered, and useful signals obtained after noise reduction are consistent with standard speech signals, so that interference of the noise signals on speech processing can be effectively avoided.
In the above or below embodiments, the speech noise reduction apparatus 20 may perform a fourier transform on the source signal to determine a frequency spectrum of the source signal; respectively calculating correlation coefficients between the frequency spectrum of the source signal and the frequency spectrum of at least one standard voice signal; and acquiring a standard voice signal with the correlation coefficient within a preset range from at least one standard voice signal as a target standard voice signal matched with the source signal.
As described above, the voice noise reduction apparatus 20 may store information of the at least one standard voice signal or may obtain it in other ways. In this embodiment, whether a target standard speech signal matching the source signal exists among the at least one standard speech signal may be determined according to the correlation coefficient between the spectrum of the source signal and the spectrum of each standard speech signal. In some practical applications, the formula
$$\rho_n = \frac{\mathrm{Cov}(S/F,\ P_n)}{\sqrt{D(S/F)\cdot D(P_n)}}$$
can be used to calculate the correlation coefficient between the spectrum of the source signal and the spectrum of each standard voice signal, where S represents the frequency spectrum of the sound signal collected by the microphone 10, F represents the transfer function between the sound source 30 and the microphone 10 (so that S/F is the spectrum of the reversely deduced source signal), Pn represents the frequency spectrum of the nth standard voice signal, n is an integer greater than 1, Cov is a covariance function, and D is a variance function.
The voice noise reduction device 20 may calculate the correlation coefficients between the spectrum of the source signal and the spectrum of the at least one standard voice signal in at least the following two ways, to which this embodiment is not limited.
In some implementations, in order to reduce the amount of computation, as soon as a correlation coefficient within the preset range appears during the computation, the standard speech signal that produced it is taken as the target standard speech signal matching the source signal. The correlation coefficients for the remaining standard speech signals then need not be computed, which effectively reduces the amount of computation.
In other implementations, the correlation coefficient between the spectrum of the source signal and the spectrum of every standard speech signal may be calculated, and the largest correlation coefficient is used as the criterion: if the largest correlation coefficient is within the preset range, the standard speech signal corresponding to it is taken as the target standard speech signal matching the source signal. This can effectively improve the accuracy of the judgment result.
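The two matching strategies can be compared in a short sketch. It assumes that the correlation is taken between magnitude spectra and computed with NumPy's `np.corrcoef`; the function names, the toy spectra, and the range (0.8, 1) are illustrative choices, not details from the source.

```python
import numpy as np

def correlation(source_spec, standard_spec):
    """Pearson correlation between two magnitude spectra (assumed measure)."""
    return float(np.corrcoef(np.abs(source_spec), np.abs(standard_spec))[0, 1])

def match_first(source_spec, standards, low, high):
    """Early-exit strategy: index of the first coefficient inside (low, high)."""
    for i, p in enumerate(standards):
        if low < correlation(source_spec, p) < high:
            return i
    return None

def match_best(source_spec, standards, low, high):
    """Exhaustive strategy: test only the largest coefficient against (low, high)."""
    coeffs = [correlation(source_spec, p) for p in standards]
    best = int(np.argmax(coeffs))
    return best if low < coeffs[best] < high else None

# Toy spectra: the source resembles the second standard plus a little noise.
standards = [
    np.array([1.0, 0.0, 0.0, 4.0, 0.0, 0.0, 1.0, 0.0]),
    np.array([0.0, 5.0, 1.0, 0.0, 0.0, 3.0, 0.0, 0.0]),
]
source_spec = standards[1] + np.array([0.5, 0.0, 0.5, 0.0, 0.5, 0.0, 0.5, 0.0])
print(match_first(source_spec, standards, 0.8, 1.0),
      match_best(source_spec, standards, 0.8, 1.0))
```

The early-exit version stops at the first acceptable match and so may return a different standard than the exhaustive version when several standards fall inside the range.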
In this embodiment, the correlation coefficient between the spectrum of the source signal and the spectrum of each standard voice signal lies in the range [0, 1]. The preset range of the correlation coefficient used in determining the target standard speech signal may be set as (a, b), where a and b are positive rational numbers not greater than 1 and a < b. The values of a and b can be set according to actual requirements. For example, a may be set to 0.8 and b to 1, so that the preset range of the correlation coefficient is (0.8, 1). As another example, a may be set to 0.7 and b to 0.9, so that the preset range is (0.7, 0.9).
Based on the preset range (a, b), the range [0, 1] of the correlation coefficient may be divided into three intervals: [0, a], (a, b), and [b, 1]. According to these three intervals, the sound signals collected by the microphone 10 can be classified into three categories. Taking the maximum correlation coefficient between the spectrum of the source signal and the spectrum of each standard voice signal as the criterion:
When the maximum correlation coefficient lies in the interval [0, a], this indicates that noise makes up a large proportion of the sound signal collected by the microphone 10 and that no target standard voice signal matching the source signal exists among the at least one standard voice signal. The speech noise reduction device 20 may discard the sound signal collected by the microphone 10 and continue to acquire the sound signal collected by the microphone 10 at the next moment.
When the maximum correlation coefficient lies in the interval (a, b), this indicates that noise makes up a small proportion of the sound signal collected by the microphone 10 and that a target standard voice signal matching the source signal exists among the at least one standard voice signal. The speech noise reduction device 20 may filter the sound signal with the target standard voice signal as a reference, so as to filter out the noise signal it contains and restore the useful signal.
When the maximum correlation coefficient lies in the interval [b, 1], this indicates that the noise in the sound signal collected by the microphone 10 is very small; in particular, when the maximum correlation coefficient equals 1, the sound signal contains no noise. The voice noise reduction device 20 may then leave the sound signal collected by the microphone 10 unprocessed and use it directly as the useful signal.
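The three-way decision can be written as a small function. The names `classify`, `"discard"`, `"filter"`, and `"pass_through"` are hypothetical labels, and the defaults a = 0.7 and b = 0.9 are taken from one of the examples in the text.

```python
def classify(max_coeff, a=0.7, b=0.9):
    """Map the largest correlation coefficient to one of three actions."""
    if not 0.0 <= max_coeff <= 1.0:
        raise ValueError("correlation coefficient is expected in [0, 1]")
    if max_coeff <= a:        # interval [0, a]: mostly noise, drop the frame
        return "discard"
    if max_coeff < b:         # interval (a, b): filter against the target
        return "filter"
    return "pass_through"     # interval [b, 1]: use the signal as it is

for c in (0.3, 0.8, 0.95):
    print(c, classify(c))
```

The boundary values a and b fall into the outer intervals, matching the closed brackets of [0, a] and [b, 1].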
In the above or the following embodiments, if there is a target standard voice signal matching the source signal in the at least one standard voice signal, the voice noise reduction device 20 may obtain, from the frequency spectrum of the target standard voice signal, a frequency band whose amplitude value does not satisfy the reservation condition as a frequency band to be filtered out; and according to the frequency band to be filtered, filtering signals on the frequency band to be filtered out in the sound signals collected by the microphone 10.
Because the speech signal contains abundant harmonic components, peaks exist in the frequency spectrum of the target standard speech signal in both the fundamental frequency band and the harmonic frequency band. In some practical applications, for each peak, in the fundamental frequency band or the harmonic frequency band where the peak is located, a frequency band in which an amplitude difference with an amplitude value of the peak is greater than a certain preset threshold may be used as a frequency band to be filtered, and a frequency band in the target standard voice signal other than the frequency band to be filtered may be used as a reserved frequency band. In other practical applications, for each peak, in the fundamental frequency band or the harmonic frequency band where the peak is located, a frequency band with a preset width around the frequency point where the peak is located may be used as a reserved frequency band, and a frequency band other than the reserved frequency band in the target standard voice signal may be used as a frequency band to be filtered. Of course, this is only an example, and in this embodiment, the frequency band to be filtered and the reserved frequency band in the target standard speech signal may also be determined according to other reserved conditions, which is not exhaustive here.
According to the determined frequency band to be filtered and the reserved frequency band of the target standard voice signal, the sound signal collected by the microphone 10 can be filtered: the voice noise reduction device 20 filters out the signals in the frequency band to be filtered from the sound signal collected by the microphone 10 and keeps the signals in the reserved frequency band unchanged.
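As a rough illustration of the second reservation condition (a band of preset width around each peak), the sketch below marks the reserved bins of a target magnitude spectrum and zeroes everything else in the collected spectrum. The simple local-maximum peak test, the `half_width` and `min_peak` parameters, and the function names are assumptions made for this example.

```python
import numpy as np

def reserved_mask(target_mag, half_width=1, min_peak=0.0):
    """Mark bins within half_width of each local peak of the target spectrum."""
    mask = np.zeros(len(target_mag), dtype=bool)
    for k in range(1, len(target_mag) - 1):
        is_peak = target_mag[k] > target_mag[k - 1] and target_mag[k] >= target_mag[k + 1]
        if is_peak and target_mag[k] > min_peak:
            mask[max(0, k - half_width):k + half_width + 1] = True
    return mask

def filter_sound(sound_spec, mask):
    """Zero the bins to be filtered out; reserved bins are kept unchanged."""
    return np.where(mask, sound_spec, 0)

# Toy target spectrum with peaks at bins 2 and 7.
target_mag = np.array([0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0])
mask = reserved_mask(target_mag, half_width=1)
filtered = filter_sound(np.ones(10, dtype=complex), mask)
```

Here bins 1 to 3 and 6 to 8 are reserved and the remaining bins are set to zero. A practical system would more likely apply a smooth band-pass response than a hard mask, which can introduce audible artifacts.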
As mentioned above, the sound signal collected by the microphone 10 contains noise signals, and thus, the signal in the reserved frequency band also contains a part of the noise signals. In this embodiment, the voice noise reduction device 20 may adjust the amplitude value of the signal in the reserved frequency band in the sound signal collected by the microphone 10 by using an amplitude adjustment coefficient. Wherein the amplitude adjustment coefficient may be a positive rational number smaller than 1. The amplitude value of the signal in the reserved frequency band in the sound signal collected by the microphone 10 can be reduced based on the amplitude adjustment coefficient to reduce the influence of the noise signal.
In some practical applications, the voice noise reduction device 20 may use a correlation coefficient between the target standard voice signal and the source signal as an amplitude adjustment coefficient corresponding to the reserved frequency band; the amplitude value of the signal in the reserved frequency band in the sound signal collected by the microphone 10 is adjusted according to the amplitude adjustment coefficient.
In this embodiment, the correlation coefficient between the target standard speech signal and the source signal is used as the amplitude adjustment coefficient for the reserved frequency band, and the amplitude of the signal in the reserved frequency band of the sound signal collected by the microphone 10 is reduced accordingly. This reduces the contribution of noise to the amplitude in the reserved frequency band, so that the adjusted amplitude is closer to that of the speech signal actually emitted by the sound source 30. The quality of the speech processing can thus be further improved, and the impact of noise signals on speech processing further reduced.
It should be noted that the present embodiment does not limit the order in which the reserved frequency band and the frequency band to be filtered are processed; the amplitude adjustment and the filtering may be executed synchronously or in any order.
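The amplitude adjustment can be illustrated as a scaling of the reserved bins by the correlation coefficient. This is a sketch only: `scale_reserved` is a hypothetical name, and the check that the coefficient lies in (0, 1] is an assumption based on the ranges discussed in the text.

```python
import numpy as np

def scale_reserved(sound_spec, mask, coeff):
    """Multiply reserved-band bins by coeff; all other bins are left as they are."""
    if not 0.0 < coeff <= 1.0:
        raise ValueError("amplitude adjustment coefficient expected in (0, 1]")
    out = np.array(sound_spec, dtype=complex)   # copy, so the input is untouched
    out[mask] *= coeff
    return out

spec = np.full(4, 2.0 + 0j)
mask = np.array([True, False, True, False])
adjusted = scale_reserved(spec, mask, 0.9)      # reserved bins become 1.8
```

Because the coefficient is close to 1 when the match is good, a cleaner signal is attenuated less than a noisier one.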
In the above or below described embodiments, the voice noise reduction apparatus 20 may determine the transfer function between the sound source 30 and the microphone 10 in advance. Specifically, in a noise-free environment, a test sound source 30 may be used to emit a test source signal to the microphone 10; the test sound signal collected by the microphone 10 is then acquired; and the ratio between the test sound signal and the test source signal is taken as the transfer function between the sound source 30 and the microphone 10.
In a noiseless environment, the relative position between the test sound source 30 and the microphone 10 may simulate the relative position between the sound source 30 and the microphone 10 during application of the speech noise reduction system. The test sound source 30 is used for simulating the sound source 30 to emit a test source signal, wherein the sound emitting direction of the test sound source 30 can simulate the sound emitting direction of the sound source 30 in the application process of the voice noise reduction system, and the microphone 10 can acquire the test source signal to obtain a test sound signal. Since both the test sound source 30 and the microphone 10 are in a noise-free environment, which ensures that no other interfering signals are present in the test sound signal collected by the microphone 10, the ratio between the test sound signal and the test source signal can be used as a transfer function between the sound source 30 and the microphone 10.
The transfer function thus determined may be stored in the speech noise reduction apparatus 20 in advance, so that the apparatus can call it directly when reversely deducing the source signal emitted by the sound source 30.
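One way to picture this calibration is as a ratio of spectra, computed bin by bin. The sketch assumes the ratio is taken in the frequency domain and that bins the test signal does not excite are simply reported as unknown; `estimate_transfer`, `eps`, and the impulse test signal are illustrative choices.

```python
import numpy as np

def estimate_transfer(test_source, test_sound, eps=1e-12):
    """Return (F, excited): F = Y / X on bins where the test source has energy."""
    X = np.fft.rfft(test_source)
    Y = np.fft.rfft(test_sound)
    excited = np.abs(X) > eps          # bins where the ratio is well defined
    F = np.zeros_like(Y)
    F[excited] = Y[excited] / X[excited]
    return F, excited

# Toy calibration: an impulse excites every bin; the path halves the amplitude.
test_source = np.zeros(64)
test_source[0] = 1.0
test_sound = 0.5 * test_source
F, excited = estimate_transfer(test_source, test_sound)
```

A broadband test signal such as an impulse or a sweep is convenient here because it makes the ratio defined at every frequency of interest.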
Fig. 2 is a flowchart illustrating a speech noise reduction method according to another embodiment of the present application. As shown in fig. 2, the method includes:
200. acquiring a sound signal collected by a microphone;
201. determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone;
202. and if a target standard voice signal matching the source signal exists among the at least one standard voice signal, filtering the sound signal collected by the microphone with the target standard voice signal as a reference.
In this embodiment, a source signal emitted by a sound source is reversely deduced based on a sound signal collected by a microphone and a transfer function between the sound source and the microphone, and when a target standard speech signal matched with the reversely deduced source signal exists in at least one standard speech signal, a noise signal in the sound signal collected by the microphone is filtered. In the embodiment, various types of noise signals in the sound signals collected by the microphone can be filtered, and useful signals obtained after noise reduction are consistent with standard voice signals, so that the interference of the noise signals on voice processing can be effectively avoided.
In an optional embodiment, the method further comprises:
performing a fourier transform on the source signal to determine a frequency spectrum of the source signal;
respectively calculating correlation coefficients between the frequency spectrum of the source signal and the frequency spectrum of at least one standard voice signal;
and acquiring a standard voice signal with the correlation coefficient within a preset range from at least one standard voice signal as a target standard voice signal matched with the source signal.
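The matching steps above can be illustrated with a minimal sketch. Several points are assumptions: the Pearson coefficient over magnitude spectra is used as the correlation coefficient, the preset range is taken as [0.8, 1.0], and when several standard signals fall in range the one with the highest coefficient is chosen. The function names are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(x):
    """Magnitudes of the non-negative-frequency DFT bins of a real signal."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def pearson(a, b):
    """Pearson correlation coefficient, clamped to [-1, 1] against rounding error."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    if va == 0 or vb == 0:
        return 0.0
    return max(-1.0, min(1.0, cov / math.sqrt(va * vb)))

def match_standard(source, standards, low=0.8, high=1.0):
    """Return (name, coefficient) of the best in-range standard signal, or (None, None)."""
    src = magnitude_spectrum(source)
    best_name, best_r = None, None
    for name, signal in standards.items():
        r = pearson(src, magnitude_spectrum(signal))
        if low <= r <= high and (best_r is None or r > best_r):
            best_name, best_r = name, r
    return best_name, best_r

# Two toy "standard voice signals": pure tones in different frequency bins.
tone_a = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
tone_b = [math.sin(2 * math.pi * 5 * t / 16) for t in range(16)]
source = [0.7 * v for v in tone_a]  # attenuated copy of tone_a
name, r = match_standard(source, {"tone_a": tone_a, "tone_b": tone_b})
```

Because the correlation coefficient is insensitive to overall scaling, the attenuated source still matches `tone_a`, while `tone_b` falls outside the assumed range.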
In an alternative embodiment, step 202 comprises:
acquiring a frequency band of which the amplitude value does not meet the reservation condition from the frequency spectrum of the target standard voice signal as a frequency band to be filtered;
and according to the frequency band to be filtered, filtering the signals on the frequency band to be filtered in the sound signals collected by the microphone.
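A minimal sketch of this band selection and filtering follows. It assumes the reservation condition is "amplitude at or above a fixed threshold" and that filtering means zeroing the corresponding bins of the sound spectrum; both are illustrative assumptions, and the function names are hypothetical.

```python
def bands_to_filter(standard_magnitudes, threshold):
    """Indices of bins whose amplitude in the target standard spectrum is below the threshold."""
    return [k for k, m in enumerate(standard_magnitudes) if m < threshold]

def filter_sound_spectrum(sound_spectrum, standard_magnitudes, threshold):
    """Zero the bins of the sound spectrum that fall in the bands to be filtered."""
    drop = set(bands_to_filter(standard_magnitudes, threshold))
    return [0j if k in drop else v for k, v in enumerate(sound_spectrum)]

# Toy spectra: bins 0 and 2 carry almost no energy in the target standard signal.
standard_magnitudes = [0.0, 8.0, 0.1, 6.0]
sound_spectrum = [1 + 0j, 4 + 2j, 3 + 0j, 2 - 1j]
filtered = filter_sound_spectrum(sound_spectrum, standard_magnitudes, threshold=1.0)
```

Here bins 0 and 2 of the sound spectrum are treated as noise and removed, while bins 1 and 3 are kept unchanged.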
In an optional embodiment, the method further comprises:
taking a correlation coefficient between the target standard voice signal and the source signal as an amplitude adjustment coefficient corresponding to a reserved frequency band, wherein the reserved frequency band is a frequency band of the target standard voice signal in which the amplitude value meets the reservation condition;
and adjusting the amplitude value of the signal on the reserved frequency band in the sound signal collected by the microphone according to the amplitude adjustment coefficient.
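The amplitude adjustment can be sketched as below. The text does not state how the coefficient is applied, so this sketch assumes the bins of the reserved frequency band are simply multiplied by the correlation coefficient; the threshold-based reservation condition and the name `adjust_reserved_bands` are likewise hypothetical.

```python
def adjust_reserved_bands(sound_spectrum, standard_magnitudes, threshold, coefficient):
    """Scale bins in the reserved band (standard amplitude >= threshold) by the coefficient."""
    return [v * coefficient if m >= threshold else v
            for v, m in zip(sound_spectrum, standard_magnitudes)]

# Toy spectra: bins 1 and 3 form the reserved band of the target standard signal.
standard_magnitudes = [0.0, 8.0, 0.1, 6.0]
sound_spectrum = [1 + 0j, 4 + 2j, 3 + 0j, 2 - 1j]
adjusted = adjust_reserved_bands(sound_spectrum, standard_magnitudes,
                                 threshold=1.0, coefficient=0.5)
```

Only the reserved bins change; bins outside the reserved band are left for the separate filtering step.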
In an optional embodiment, before step 201, further comprising:
in a noise-free environment, sending a test source signal to a microphone by using a test sound source;
acquiring a test sound signal acquired by a microphone;
and taking the ratio between the test sound signal and the test source signal as the transfer function between the sound source and the microphone.
In an optional embodiment, the method further comprises:
and if no target standard voice signal matching the source signal exists in the at least one standard voice signal, discarding the sound signal and continuing to acquire the sound signal collected by the microphone at the next moment.
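The discard-and-continue behaviour can be illustrated as a simple loop over successive sound signals. The callables `find_target` and `filter_frame` are hypothetical stand-ins for the matching and filtering steps, and the toy matching rule is invented purely for the example.

```python
def denoise_stream(frames, find_target, filter_frame):
    """Yield filtered frames; frames without a matching target standard signal are discarded."""
    for frame in frames:
        target = find_target(frame)
        if target is None:
            continue  # discard this sound signal and move on to the next moment
        yield filter_frame(frame, target)

def find_target(frame):
    """Toy matcher: treat a frame as matched when its peak reaches 1.0."""
    return "std" if max(frame) >= 1.0 else None

def filter_frame(frame, target):
    """Toy filter: return the frame tagged with the matched standard signal."""
    return (target, frame)

frames = [[0.0, 0.1], [1.0, 2.0], [0.0, 0.0], [3.0, 4.0]]
kept = list(denoise_stream(frames, find_target, filter_frame))
```

Only the second and fourth frames survive in this example; the other two are dropped without further processing.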
Fig. 3 is a schematic structural diagram of a speech noise reduction device according to yet another embodiment of the present application. As shown in fig. 3, the voice noise reduction apparatus includes: memory 30, processor 31, and communications component 32.
The memory 30 is used for storing a computer program and may be configured to store various other data to support operations on the voice noise reduction apparatus. Examples of such data include instructions for any application or method operating on the voice noise reduction apparatus, contact data, phonebook data, messages, pictures, videos, and the like.
The memory 30 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The processor 31 is coupled to the memory 30 and the communication component 32 and executes the computer program in the memory 30 for:
acquiring sound signals collected by a microphone through the communication component 32;
determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone;
and if a target standard voice signal matching the source signal exists in the at least one standard voice signal, filtering the sound signal collected by the microphone by taking the target standard voice signal as a reference.
In this embodiment, a source signal emitted by a sound source is reversely deduced based on a sound signal collected by a microphone and a transfer function between the sound source and the microphone, and when a target standard speech signal matched with the reversely deduced source signal exists in at least one standard speech signal, a noise signal in the sound signal collected by the microphone is filtered. In the embodiment, various types of noise signals in the sound signals collected by the microphone can be filtered, and useful signals obtained after noise reduction are consistent with standard voice signals, so that the interference of the noise signals on voice processing can be effectively avoided.
In an alternative embodiment, the processor 31 is further configured to:
performing a fourier transform on the source signal to determine a frequency spectrum of the source signal;
respectively calculating correlation coefficients between the frequency spectrum of the source signal and the frequency spectrum of at least one standard voice signal;
and acquiring a standard voice signal with the correlation coefficient within a preset range from at least one standard voice signal as a target standard voice signal matched with the source signal.
In an alternative embodiment, the processor 31, when performing filtering processing on the sound signal collected by the microphone with the target standard speech signal as a reference, is configured to:
acquiring a frequency band of which the amplitude value does not meet the reservation condition from the frequency spectrum of the target standard voice signal as a frequency band to be filtered;
and according to the frequency band to be filtered, filtering the signals on the frequency band to be filtered in the sound signals collected by the microphone.
In an alternative embodiment, the processor 31 is further configured to:
taking a correlation coefficient between the target standard voice signal and the source signal as an amplitude adjustment coefficient corresponding to a reserved frequency band, wherein the reserved frequency band is a frequency band of the target standard voice signal in which the amplitude value meets the reservation condition;
and adjusting the amplitude value of the signal on the reserved frequency band in the sound signal collected by the microphone according to the amplitude adjustment coefficient.
In an alternative embodiment, the processor 31 is further configured to, before determining the source signal emitted by the sound source based on the sound signal and the transfer function between the sound source and the microphone:
in a noise-free environment, sending a test source signal to a microphone by using a test sound source;
acquiring a test sound signal acquired by a microphone;
and taking the ratio between the test sound signal and the test source signal as the transfer function between the sound source and the microphone.
In an alternative embodiment, the processor 31 is further configured to:
and if no target standard voice signal matching the source signal exists in the at least one standard voice signal, discarding the sound signal and continuing to acquire the sound signal collected by the microphone at the next moment.
Further, as shown in fig. 3, the voice noise reduction apparatus also includes other components such as a power supply component 33. Only some of the components are schematically shown in fig. 3, which does not mean that the voice noise reduction apparatus includes only the components shown in fig. 3.
Wherein the communication component 32 is configured to facilitate wired or wireless communication between the device in which the communication component is located and other devices. The device in which the communication component is located may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component may be implemented based on Near Field Communication (NFC) technology, Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, or other technology to facilitate short-range communications.
The power supply component 33 supplies power to the various components of the device in which it is installed. The power supply component may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device in which it is located.
Accordingly, the present application further provides a computer-readable storage medium storing a computer program which, when executed, is capable of implementing the steps that can be executed by the voice noise reduction device in the foregoing method embodiments.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (12)

1. A method for speech noise reduction, comprising:
acquiring a sound signal collected by a microphone;
determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone; a transfer function between the sound source and the microphone is predetermined; when the proportion corresponding to the noise signal contained in the sound signal is lower, the source signal is close to the voice signal sent by the sound source, and at least one standard voice signal has a target standard voice signal matched with the source signal;
and if a target standard voice signal matched with the source signal exists in the at least one standard voice signal, filtering the voice signal collected by the microphone by taking the target standard voice signal as a reference.
2. The method of claim 1, further comprising:
fourier transforming the source signal to determine a frequency spectrum of the source signal;
calculating correlation coefficients between the frequency spectrum of the source signal and the frequency spectrum of the at least one standard voice signal respectively;
and acquiring a standard voice signal with the correlation coefficient within a preset range from at least one standard voice signal as a target standard voice signal matched with the source signal.
3. The method according to claim 2, wherein the filtering the sound signal collected by the microphone with the target standard speech signal as a reference comprises:
acquiring a frequency band of which the amplitude value does not meet the reservation condition from the frequency spectrum of the target standard voice signal as a frequency band to be filtered;
and according to the frequency band to be filtered, filtering signals on the frequency band to be filtered in the sound signals collected by the microphone.
4. The method of claim 3, further comprising:
taking a correlation coefficient between the target standard voice signal and the source signal as an amplitude adjustment coefficient corresponding to a reserved frequency band, wherein the reserved frequency band is a frequency band in which an amplitude value in the target standard voice signal meets a reserved condition;
and adjusting the amplitude value of the signal on the reserved frequency band in the sound signal collected by the microphone according to the amplitude adjustment coefficient.
5. The method of claim 1, wherein prior to determining the source signal from the sound source based on the sound signal and a transfer function between the sound source and a microphone, further comprising:
in a noise-free environment, sending a test source signal to the microphone by using a test sound source;
acquiring a test sound signal acquired by the microphone;
the ratio between the test sound signal and the test source signal is taken as a transfer function between the sound source and the microphone.
6. The method according to any one of claims 1 to 5, further comprising:
and if the target standard voice signal matched with the source signal does not exist in the at least one standard voice signal, discarding the voice signal, and continuously acquiring the voice signal collected by the microphone at the next moment.
7. A voice noise reduction device comprising a memory, a processor, and a communication component;
the memory is to store one or more computer instructions;
the processor is coupled with the memory and the communication component to execute one or more computer instructions to:
acquiring a sound signal collected by a microphone through the communication assembly;
determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone; a transfer function between the sound source and the microphone is predetermined; when the proportion corresponding to the noise signal contained in the sound signal is lower, the source signal is close to the voice signal sent by the sound source, and at least one standard voice signal has a target standard voice signal matched with the source signal;
and if a target standard voice signal matched with the source signal exists in the at least one standard voice signal, filtering the voice signal collected by the microphone by taking the target standard voice signal as a reference.
8. The device of claim 7, wherein the processor is further configured to:
fourier transforming the source signal to determine a frequency spectrum of the source signal;
calculating correlation coefficients between the frequency spectrum of the source signal and the frequency spectrum of the at least one standard voice signal respectively;
and acquiring a standard voice signal with the correlation coefficient within a preset range from at least one standard voice signal as a target standard voice signal matched with the source signal.
9. The apparatus of claim 8, wherein the processor, when filtering the sound signal collected by the microphone with the target standard speech signal as a reference, is configured to:
acquiring a frequency band of which the amplitude value does not meet the reservation condition from the frequency spectrum of the target standard voice signal as a frequency band to be filtered;
and according to the frequency band to be filtered, filtering signals on the frequency band to be filtered in the sound signals collected by the microphone.
10. The device of claim 9, wherein the processor is further configured to:
taking a correlation coefficient between the target standard voice signal and the source signal as an amplitude adjustment coefficient corresponding to a reserved frequency band, wherein the reserved frequency band is a frequency band in which an amplitude value in the target standard voice signal meets a reserved condition;
and adjusting the amplitude value of the signal on the reserved frequency band in the sound signal collected by the microphone according to the amplitude adjustment coefficient.
11. The apparatus of any of claims 7 to 10, wherein the processor is further configured to:
and if the target standard voice signal matched with the source signal does not exist in the at least one standard voice signal, discarding the voice signal, and continuously acquiring the voice signal collected by the microphone at the next moment.
12. A computer-readable storage medium storing computer instructions, which when executed by one or more processors, cause the one or more processors to perform the method of speech noise reduction of any of claims 1-6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811288716.0A CN109410975B (en) 2018-10-31 2018-10-31 Voice noise reduction method, device and storage medium

Publications (2)

Publication Number Publication Date
CN109410975A CN109410975A (en) 2019-03-01
CN109410975B true CN109410975B (en) 2021-03-09

Family

ID=65470924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811288716.0A Active CN109410975B (en) 2018-10-31 2018-10-31 Voice noise reduction method, device and storage medium

Country Status (1)

Country Link
CN (1) CN109410975B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110267160B (en) * 2019-05-31 2020-09-22 潍坊歌尔电子有限公司 Sound signal processing method, device and equipment
CN111050264A (en) * 2019-11-13 2020-04-21 歌尔股份有限公司 Noise test system and test method for simulating single-ended microphone
CN113160840B (en) * 2020-01-07 2022-10-25 北京地平线机器人技术研发有限公司 Noise filtering method, device, mobile equipment and computer readable storage medium
CN112037825B (en) * 2020-08-10 2022-09-27 北京小米松果电子有限公司 Audio signal processing method and device and storage medium
CN112420066A (en) * 2020-11-05 2021-02-26 深圳市卓翼科技股份有限公司 Noise reduction method, noise reduction device, computer equipment and computer readable storage medium
CN112565531B (en) * 2020-12-12 2021-08-13 深圳波导智慧科技有限公司 Recording method and device applied to multi-person voice conference
CN113409809B (en) * 2021-07-07 2023-04-07 上海新氦类脑智能科技有限公司 Voice noise reduction method, device and equipment
CN115810361A (en) * 2021-09-14 2023-03-17 中兴通讯股份有限公司 Echo cancellation method, terminal device and storage medium
CN117174101A (en) * 2022-05-25 2023-12-05 青岛海尔科技有限公司 Noise signal processing method and device, storage medium and electronic device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107045874A (en) * 2016-02-05 2017-08-15 深圳市潮流网络技术有限公司 A kind of Non-linear Speech Enhancement Method based on correlation
CN107204192A (en) * 2017-06-05 2017-09-26 歌尔科技有限公司 Tone testing method, sound enhancement method and device
CN107945815A (en) * 2017-11-27 2018-04-20 歌尔科技有限公司 Voice signal noise-reduction method and equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6511897B2 (en) * 2015-03-24 2019-05-15 株式会社Jvcケンウッド Noise reduction device, noise reduction method and program
JP6677662B2 (en) * 2017-02-14 2020-04-08 株式会社東芝 Sound processing device, sound processing method and program

Also Published As

Publication number Publication date
CN109410975A (en) 2019-03-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant