CN109410975B - Voice noise reduction method, device and storage medium
- Publication number
- CN109410975B CN201811288716.0A
- Authority
- CN
- China
- Prior art keywords
- signal
- microphone
- source
- sound
- voice signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
- G10L21/028—Voice signal separating using properties of sound source
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02163—Only one microphone
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
The embodiment of the application provides a voice noise reduction method, device and storage medium, wherein the method comprises the following steps: acquiring a sound signal collected by a microphone; determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and the microphone; and if a target standard voice signal matched with the source signal exists in at least one standard voice signal, filtering the sound signal collected by the microphone by taking the target standard voice signal as a reference. In the embodiment, various types of noise signals in the sound signal collected by the microphone can be filtered out, and the useful signal obtained after noise reduction is consistent with a standard voice signal, so that the interference of noise signals with voice processing can be effectively avoided.
Description
Technical Field
The present application relates to the field of noise reduction technologies, and in particular, to a method, device, and storage medium for voice noise reduction.
Background
During a call, a voice signal is collected by a microphone. Since the microphone that collects the voice signal is usually exposed to the external environment, noise signals in the external environment are also collected by the microphone. As a result, the voice signal during the call is severely interfered with by the noise signals.
In the prior art, microphone array beamforming is generally adopted to optimize call quality. With microphone array beamforming, only the sound signal arriving from the sound source direction is picked up, which reduces interference from noise signals in other directions and thereby eliminates a part of the noise signals other than the voice signal.
However, this method removes only a part of the noise signals, so its noise reduction effect is poor and the resulting speech quality is poor.
Disclosure of Invention
Aspects of the present disclosure provide a voice noise reduction method, apparatus, and storage medium to reduce interference of a noise signal with a voice signal.
The embodiment of the application provides a voice noise reduction method, which comprises the following steps:
acquiring a sound signal collected by a microphone;
determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone;
and if a target standard voice signal matched with the source signal exists in at least one standard voice signal, filtering the sound signal collected by the microphone by taking the target standard voice signal as a reference.
The embodiment of the application also provides voice noise reduction equipment, which comprises a memory, a processor and a communication component;
the memory is to store one or more computer instructions;
the processor is coupled with the memory and the communication component to execute one or more computer instructions to:
acquiring a sound signal collected by a microphone through the communication component;
determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone;
and if a target standard voice signal matched with the source signal exists in at least one standard voice signal, filtering the sound signal collected by the microphone by taking the target standard voice signal as a reference.
Embodiments of the present application also provide a computer-readable storage medium storing computer instructions, which when executed by one or more processors, cause the one or more processors to perform the aforementioned voice noise reduction method.
In the embodiment of the application, a source signal emitted by a sound source is reversely deduced based on a sound signal collected by a microphone and a transfer function between the sound source and the microphone, and when a target standard speech signal matched with the reversely deduced source signal exists in at least one standard speech signal, a noise signal in the sound signal collected by the microphone is filtered. In the embodiment, various types of noise signals in the sound signals collected by the microphone can be filtered, and useful signals obtained after noise reduction are consistent with standard voice signals, so that the interference of the noise signals on voice processing can be effectively avoided.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic structural diagram of a speech noise reduction system according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a speech noise reduction method according to another embodiment of the present application;
fig. 3 is a schematic structural diagram of a speech noise reduction device according to yet another embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the prior art, microphone array beamforming is generally adopted to optimize call quality. With microphone array beamforming, only the sound signal arriving from the sound source direction is picked up, which reduces noise signal interference from other directions and thereby eliminates a part of the noise signals other than the voice signal; however, this method has a poor noise reduction effect and yields poor speech quality. To solve these problems of the prior art, in some embodiments of the present application, a source signal emitted by the sound source is reversely deduced based on the sound signal collected by the microphone and a transfer function between the sound source and the microphone, and when a target standard voice signal matched with the reversely deduced source signal exists in at least one standard voice signal, the noise signal in the sound signal collected by the microphone is filtered out. In the embodiment, various types of noise signals in the sound signal collected by the microphone can be filtered out, and the useful signal obtained after noise reduction is consistent with a standard voice signal, so that the interference of noise signals with voice processing can be effectively avoided.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of a speech noise reduction system according to an embodiment of the present application. As shown in fig. 1, the system includes: a microphone 10 and a speech noise reduction device 20.
In this embodiment, the voice noise reduction system can be applied to various voice processing scenarios, such as a voice call scenario, a voice analysis scenario, and the like. Depending on the voice processing scenario, the voice noise reduction system may be carried in different voice processing devices. For example, in a voice call scenario, the voice noise reduction system may be carried in a voice processing device such as an earphone or a mobile phone; in a voice analysis scenario, it may be carried in a voice processing device such as a smart speaker or a voice collector. Depending on the voice processing device in which the voice noise reduction system is located, the deployment forms of the microphone 10 and the voice noise reduction device 20 can be adjusted accordingly. For example, for an earphone, the microphone of the earphone may be reused as the microphone 10 of the voice noise reduction system, and the voice noise reduction device 20 may be disposed within the body of the earphone or on a server that can communicate with the earphone. For another example, for a mobile phone, the microphone of the mobile phone may be reused as the microphone 10 of the voice noise reduction system, and the CPU of the mobile phone may be reused as the voice noise reduction device 20. Of course, these are only examples, and the present embodiment is not limited thereto.
In this embodiment, the voice noise reduction device 20 may acquire the sound signal collected by the microphone 10, and may determine the source signal emitted by the sound source 30 according to the sound signal and the transfer function between the sound source 30 and the microphone 10. In this embodiment, a single microphone may be used to implement voice noise reduction, and the number of microphones is not limited in this embodiment. It should be noted that, although only one microphone 10 is shown in fig. 1, this should not limit the number of microphones in the present embodiment.
In various application scenarios of the voice noise reduction system, the relative position between the sound source 30 and the microphone 10 is usually fixed. For example, in a voice call scenario, when a voice call is made using an earphone, the person's mouth serves as the sound source 30, and the angle, the distance, and the like between the mouth and the microphone of the earphone are usually fixed. The inventors have found in their research that, when the relative position between the sound source and the microphone is fixed, the energy at each frequency of a voice signal undergoes a specific attenuation as the signal travels from the sound source to the microphone, and this attenuation can be described by a fixed transfer function. The determination of the transfer function between the sound source and the microphone is described in detail later. Therefore, in the present embodiment, the source signal emitted by the sound source 30 can be reversely deduced from the sound signal collected by the microphone 10 and the transfer function between the sound source 30 and the microphone 10.
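As a minimal sketch of this reverse deduction, assuming the transfer function is available as one complex value per frequency bin and that the source spectrum is obtained by dividing the collected spectrum by it (the function name, the `eps` safeguard, and the toy signal are illustrative assumptions, not part of the source):

```python
import numpy as np

def estimate_source_spectrum(sound, transfer, eps=1e-12):
    """Back-derive the source spectrum as S / F, bin by bin."""
    S = np.fft.rfft(np.asarray(sound, dtype=float))
    F = np.asarray(transfer, dtype=complex)
    if F.shape != S.shape:
        raise ValueError("transfer function must have one value per rfft bin")
    # Hypothetical safeguard: avoid dividing by near-zero transfer values.
    safe_F = np.where(np.abs(F) < eps, eps, F)
    return S / safe_F

# Toy example: a path that simply halves every frequency component.
source = np.sin(2 * np.pi * 5 * np.arange(64) / 64)
transfer = np.full(33, 0.5)  # 64 samples -> 33 rfft bins
sound = np.fft.irfft(np.fft.rfft(source) * transfer, n=64)
recovered = np.fft.irfft(estimate_source_spectrum(sound, transfer), n=64)
```

In this noise-free toy case the recovered signal equals the original source; with noise present, the division also rescales the noise, which is why the subsequent matching step is needed.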
The voice noise reduction device 20 may determine, according to the reversely derived source signal, whether a target standard voice signal matching the source signal exists in the at least one standard voice signal. The voice noise reduction device 20 may store information of the at least one standard voice signal in advance; alternatively, it may obtain information of the standard voice signals from a network during the voice noise reduction process, which is not limited in this embodiment. The information of the standard voice signals can be obtained from public sources and is not described further here.
As mentioned above, the microphone 10 is usually exposed to the external environment, so the sound signal collected by the microphone 10 may include a noise signal. When the sound signal collected by the microphone 10 includes a noise signal, the source signal reversely derived from it includes, in addition to the voice signal emitted by the sound source 30, an interference component corresponding to that noise signal. When the proportion of the noise signal in the sound signal collected by the microphone 10 is low, the reversely derived source signal does not differ much from the voice signal emitted by the sound source 30. Therefore, in this case, a target standard voice signal corresponding to the reversely derived source signal can be matched among the at least one standard voice signal.
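One way to make this argument concrete is the following sketch, which assumes an additive noise model (an assumption for illustration; the source does not state a noise model). Here X denotes the spectrum of the voice signal emitted by the sound source, N the spectrum of the noise picked up by the microphone, S the spectrum of the collected sound signal, and F the transfer function; X and N are symbols introduced here only for this illustration.

```latex
S(f) = F(f)\,X(f) + N(f)
\quad\Longrightarrow\quad
\frac{S(f)}{F(f)} = X(f) + \frac{N(f)}{F(f)}
```

Under this assumption, the reversely derived source signal S/F equals the true source spectrum plus an interference term N/F, which is small whenever the noise makes up a small proportion of the collected signal.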
In this embodiment, if a target standard speech signal matching the source signal exists in the at least one standard speech signal, the speech noise reduction device 20 may perform filtering processing on the sound signal collected by the microphone 10 by using the target standard speech signal as a reference.
When a target standard voice signal matched with the source signal exists in the at least one standard voice signal, this indicates that the voice signal makes up a relatively large proportion of the sound signal collected by the microphone 10 and that this voice signal corresponds to the target standard voice signal. Therefore, the sound signal collected by the microphone 10 can be filtered with the target standard voice signal as a reference, so as to filter out the noise signal contained in the sound signal collected by the microphone 10.
In the embodiment of the present application, a source signal emitted by the sound source 30 is reversely deduced based on the sound signal collected by the microphone 10 and a transfer function between the sound source 30 and the microphone 10, and when a target standard speech signal matching the reversely deduced source signal exists in at least one standard speech signal, a noise signal in the sound signal collected by the microphone 10 is filtered. In this embodiment, various types of noise signals in the sound signals collected by the microphone 10 can be filtered, and useful signals obtained after noise reduction are consistent with standard speech signals, so that interference of the noise signals on speech processing can be effectively avoided.
In the above or below embodiments, the speech noise reduction apparatus 20 may perform a fourier transform on the source signal to determine a frequency spectrum of the source signal; respectively calculating correlation coefficients between the frequency spectrum of the source signal and the frequency spectrum of at least one standard voice signal; and acquiring a standard voice signal with the correlation coefficient within a preset range from at least one standard voice signal as a target standard voice signal matched with the source signal.
As described above, the voice noise reduction apparatus 20 may store information of the at least one standard voice signal or may obtain it in other ways. In this embodiment, whether a target standard voice signal matching the source signal exists in the at least one standard voice signal may be determined according to the correlation coefficient between the spectrum of the source signal and the spectrum of each standard voice signal. In some practical applications, the correlation coefficient between the spectrum of the source signal and the spectrum of each standard voice signal can be calculated using the formula $\rho_n = \mathrm{Cov}(S/F, P_n) / \sqrt{D(S/F)\, D(P_n)}$, where S represents the frequency spectrum of the sound signal collected by the microphone 10, F represents the transfer function between the sound source 30 and the microphone 10, Pn represents the frequency spectrum of the nth standard voice signal, n is an integer greater than 1, Cov is a covariance function, and D is a variance function.
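The correlation between two spectra can be sketched as follows. This is an illustrative Pearson-style coefficient built from a covariance and two variances; comparing magnitude spectra is an assumption made here, and the function name is hypothetical. Note that a Pearson coefficient can in general be negative, and this sketch does not clamp it.

```python
import numpy as np

def spectrum_correlation(source_spectrum, standard_spectrum):
    """Cov(x, y) / sqrt(D(x) * D(y)) over the magnitude spectra."""
    x = np.abs(np.asarray(source_spectrum))
    y = np.abs(np.asarray(standard_spectrum))
    if x.shape != y.shape:
        raise ValueError("spectra must have the same number of bins")
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    denom = np.sqrt(x.var() * y.var())
    # A flat spectrum has zero variance; treat it as uncorrelated.
    return 0.0 if denom == 0 else float(cov / denom)

# Toy spectra: the second is a scaled copy, the third is unrelated.
reference = np.array([0.0, 1.0, 4.0, 1.0, 0.0, 2.0])
scaled_copy = 3.0 * reference
flat = np.ones(6)
same_shape = spectrum_correlation(reference, scaled_copy)
no_info = spectrum_correlation(reference, flat)
```

A scaled copy of a spectrum correlates perfectly with the original, which matches the idea that the match depends on spectral shape rather than overall level.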
The voice noise reduction device 20 may calculate the correlation coefficient between the spectrum of the source signal and the spectrum of the at least one standard voice signal in at least the following two implementations, which is not limited to this embodiment.
In some implementations, in order to reduce the amount of computation, as soon as a correlation coefficient within the preset range is obtained during the computation, the standard voice signal corresponding to that correlation coefficient is taken as the target standard voice signal matched with the source signal. In this way, the correlation coefficients between the spectrum of the source signal and the spectra of the remaining standard voice signals do not need to be computed, which effectively reduces the amount of computation.
In other implementations, correlation coefficients between the spectrum of the source signal and the spectrum of each standard speech signal may be calculated, and with a largest correlation coefficient as a determination criterion, it is determined whether the largest correlation coefficient is within a preset range, and if so, the standard speech signal corresponding to the largest correlation coefficient is taken as a target standard speech signal matched with the source signal. This can effectively improve the accuracy of the judgment result.
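The two strategies above can be contrasted in a short sketch. The function names and the toy coefficient values are hypothetical; the coefficients are passed as an iterable so that the early-exit variant only consumes as many as it needs.

```python
def first_match(coefficients, low, high):
    """Early exit: index of the first coefficient inside (low, high)."""
    for index, coeff in enumerate(coefficients):
        if low < coeff < high:
            return index
    return None

def best_match(coefficients, low, high):
    """Exhaustive: index of the largest coefficient, if inside (low, high)."""
    coeffs = list(coefficients)
    if not coeffs:
        return None
    index = max(range(len(coeffs)), key=coeffs.__getitem__)
    return index if low < coeffs[index] < high else None

# Track how many coefficients the early-exit strategy actually evaluates.
calls = []
def lazy(values):
    for value in values:
        calls.append(value)
        yield value

values = [0.2, 0.85, 0.95]
first = first_match(lazy(values), 0.8, 1.0)  # stops once 0.85 is seen
best = best_match(values, 0.8, 1.0)          # inspects every coefficient
```

The early-exit variant trades accuracy for speed: it may settle on a standard voice signal that is within range but not the closest match.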
In this embodiment, the correlation coefficient between the spectrum of the source signal and the spectrum of each standard voice signal lies in the range [0, 1]. The preset range of the correlation coefficient used in determining the target standard voice signal may be set to (a, b), where a and b are positive rational numbers not greater than 1, and a < b. The values of a and b can be set according to actual requirements. For example, a may be set to 0.8 and b may be set to 1, in which case the preset range of the correlation coefficient is (0.8, 1). For another example, a may be set to 0.7 and b may be set to 0.9, in which case the preset range of the correlation coefficient is (0.7, 0.9).
Based on the preset range (a, b), the range [0, 1] of the correlation coefficient between the spectrum of the source signal and the spectrum of each standard voice signal may be divided into three intervals, i.e., [0, a], (a, b), and [b, 1]. According to these three intervals, the sound signals collected by the microphone 10 can be classified into three categories. Taking the maximum correlation coefficient between the spectrum of the source signal and the spectrum of each standard voice signal as the judgment criterion:
When the maximum correlation coefficient is located in the interval [0, a], this indicates that the noise signal makes up a large proportion of the sound signal collected by the microphone 10 and that no target standard voice signal matching the source signal exists in the at least one standard voice signal. The voice noise reduction device 20 may discard the sound signal collected by the microphone 10 and continue to acquire the sound signal collected by the microphone 10 at the next time.
When the maximum correlation coefficient is located in the interval (a, b), this indicates that the noise signal makes up a small proportion of the sound signal collected by the microphone 10 and that a target standard voice signal matching the source signal exists in the at least one standard voice signal. The voice noise reduction device 20 may filter the sound signal with the target standard voice signal as a reference, so as to filter out the noise signal contained in the sound signal and restore the useful signal.
When the maximum correlation coefficient is located in the interval [b, 1], this indicates that the noise signal in the sound signal collected by the microphone 10 is very small; in particular, when the maximum correlation coefficient is equal to 1, the sound signal collected by the microphone 10 contains no noise signal. The voice noise reduction device 20 may leave the sound signal collected by the microphone 10 unprocessed and directly use it as the useful signal.
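The three cases can be summarised in a small decision function. This is a sketch: the function name and the returned labels are hypothetical, and the boundary handling follows the intervals [0, a], (a, b) and [b, 1] described in the text.

```python
def classify(max_coeff, a, b):
    """Map the maximum correlation coefficient to one of three actions."""
    if not 0.0 < a < b <= 1.0:
        raise ValueError("expected 0 < a < b <= 1")
    if max_coeff <= a:
        return "discard"        # noise dominates; wait for the next signal
    if max_coeff < b:
        return "filter"         # filter against the target standard signal
    return "pass_through"       # negligible noise; use the signal as is

decisions = [classify(c, 0.7, 0.9) for c in (0.5, 0.7, 0.8, 0.9, 1.0)]
```

With a = 0.7 and b = 0.9, the boundary values 0.7 and 0.9 fall into the "discard" and "pass_through" categories respectively, since the middle interval is open.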
In the above or the following embodiments, if there is a target standard voice signal matching the source signal in the at least one standard voice signal, the voice noise reduction device 20 may obtain, from the frequency spectrum of the target standard voice signal, a frequency band whose amplitude value does not satisfy the reservation condition as a frequency band to be filtered out; and according to the frequency band to be filtered, filtering signals on the frequency band to be filtered out in the sound signals collected by the microphone 10.
Because the speech signal contains abundant harmonic components, peaks exist in the frequency spectrum of the target standard speech signal in both the fundamental frequency band and the harmonic frequency band. In some practical applications, for each peak, in the fundamental frequency band or the harmonic frequency band where the peak is located, a frequency band in which an amplitude difference with an amplitude value of the peak is greater than a certain preset threshold may be used as a frequency band to be filtered, and a frequency band in the target standard voice signal other than the frequency band to be filtered may be used as a reserved frequency band. In other practical applications, for each peak, in the fundamental frequency band or the harmonic frequency band where the peak is located, a frequency band with a preset width around the frequency point where the peak is located may be used as a reserved frequency band, and a frequency band other than the reserved frequency band in the target standard voice signal may be used as a frequency band to be filtered. Of course, this is only an example, and in this embodiment, the frequency band to be filtered and the reserved frequency band in the target standard speech signal may also be determined according to other reserved conditions, which is not exhaustive here.
According to the determined frequency band to be filtered and the reserved frequency band of the target standard voice signal, filtering processing can be performed on the voice signal collected by the microphone 10. The voice noise reduction device 20 can filter the signals in the frequency band to be filtered out from the sound signals collected by the microphone 10, and keep the signals in the reserved frequency band unchanged.
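A minimal sketch of the second variant described above (keeping a band of preset width around each peak of the target standard spectrum and removing everything else). The simple local-maximum peak test, the `min_peak_ratio` floor, and the parameter names are assumptions chosen for illustration, not the patent's stated procedure.

```python
import numpy as np

def reserved_mask(standard_magnitude, half_width=1, min_peak_ratio=0.1):
    """Mark bins within half_width of each significant spectral peak."""
    mag = np.asarray(standard_magnitude, dtype=float)
    mask = np.zeros(mag.shape, dtype=bool)
    floor = min_peak_ratio * mag.max()  # ignore tiny numerical ripples
    for k in range(1, len(mag) - 1):
        is_peak = mag[k] > mag[k - 1] and mag[k] >= mag[k + 1] and mag[k] >= floor
        if is_peak:
            mask[max(0, k - half_width):k + half_width + 1] = True
    return mask

def filter_sound(sound, mask):
    """Zero the bins to be filtered and keep the reserved bins unchanged."""
    spectrum = np.fft.rfft(np.asarray(sound, dtype=float))
    spectrum[~mask] = 0.0
    return np.fft.irfft(spectrum, n=len(sound))

# Toy example: the standard signal is a tone at bin 5, the noise a tone at bin 20.
t = np.arange(64) / 64
standard = np.sin(2 * np.pi * 5 * t)
sound = standard + 0.5 * np.sin(2 * np.pi * 20 * t)
mask = reserved_mask(np.abs(np.fft.rfft(standard)))
cleaned = filter_sound(sound, mask)
```

In this toy case the noise tone lies entirely outside the reserved band, so filtering recovers the standard signal; noise that falls inside the reserved band is not removed by this step.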
As mentioned above, the sound signal collected by the microphone 10 contains noise signals, and thus, the signal in the reserved frequency band also contains a part of the noise signals. In this embodiment, the voice noise reduction device 20 may adjust the amplitude value of the signal in the reserved frequency band in the sound signal collected by the microphone 10 by using an amplitude adjustment coefficient. Wherein the amplitude adjustment coefficient may be a positive rational number smaller than 1. The amplitude value of the signal in the reserved frequency band in the sound signal collected by the microphone 10 can be reduced based on the amplitude adjustment coefficient to reduce the influence of the noise signal.
In some practical applications, the voice noise reduction device 20 may use a correlation coefficient between the target standard voice signal and the source signal as an amplitude adjustment coefficient corresponding to the reserved frequency band; the amplitude value of the signal in the reserved frequency band in the sound signal collected by the microphone 10 is adjusted according to the amplitude adjustment coefficient.
In this embodiment, the correlation coefficient between the target standard voice signal and the source signal is used as the amplitude adjustment coefficient corresponding to the reserved frequency band, and the amplitude value of the signal in the reserved frequency band of the sound signal collected by the microphone 10 is reduced accordingly. This reduces the influence of the noise signal on the amplitude value of the signal in the reserved frequency band, so that the adjusted amplitude value is closer to that of the voice signal actually emitted by the sound source 30. This can further improve the quality of voice processing and further reduce the influence of noise signals on voice processing.
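The amplitude adjustment can be sketched as a per-bin scaling of the reserved band. The function name, the boolean-mask representation of the reserved band, and the toy values are assumptions made for illustration.

```python
import numpy as np

def scale_reserved_band(spectrum, mask, coefficient):
    """Multiply the reserved bins by the amplitude adjustment coefficient."""
    if not 0.0 < coefficient <= 1.0:
        raise ValueError("coefficient is expected to lie in (0, 1]")
    adjusted = np.array(spectrum, dtype=complex)  # copy; input is untouched
    adjusted[np.asarray(mask, dtype=bool)] *= coefficient
    return adjusted

# Toy spectrum with the two middle bins reserved and a coefficient of 0.8.
spectrum = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([False, True, True, False])
adjusted = scale_reserved_band(spectrum, mask, 0.8)
```

Because the coefficient is a real number, only the magnitude of each reserved bin changes; its phase is preserved.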
It should be noted that, when adjusting the amplitude value of the signal in the reserved frequency band of the sound signal collected by the microphone 10, the present embodiment does not limit the order in which the reserved frequency band and the frequency band to be filtered are processed; the two processes may be executed simultaneously or in any order.
In the above or below described embodiments, the voice noise reduction apparatus 20 may determine the transfer function between the sound source 30 and the microphone 10 in advance. The voice noise reduction apparatus 20 may, in a noise-free environment, use the test sound source 30 to emit a test source signal to the microphone 10; acquire the test sound signal collected by the microphone 10; and take the ratio between the test sound signal and the test source signal as the transfer function between the sound source 30 and the microphone 10.
In a noiseless environment, the relative position between the test sound source 30 and the microphone 10 may simulate the relative position between the sound source 30 and the microphone 10 during application of the speech noise reduction system. The test sound source 30 is used for simulating the sound source 30 to emit a test source signal, wherein the sound emitting direction of the test sound source 30 can simulate the sound emitting direction of the sound source 30 in the application process of the voice noise reduction system, and the microphone 10 can acquire the test source signal to obtain a test sound signal. Since both the test sound source 30 and the microphone 10 are in a noise-free environment, which ensures that no other interfering signals are present in the test sound signal collected by the microphone 10, the ratio between the test sound signal and the test source signal can be used as a transfer function between the sound source 30 and the microphone 10.
The transfer function thus determined may be stored in the voice noise reduction apparatus 20 in advance, and the voice noise reduction apparatus 20 may directly call the transfer function when it reversely deduces the source signal emitted by the sound source 30.
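The calibration step can be sketched as a per-bin ratio of two spectra. Taking the ratio in the frequency domain, the placeholder value for bins without test energy, and the simulated acoustic path are all assumptions made for illustration.

```python
import numpy as np

def estimate_transfer_function(test_source, test_sound, eps=1e-12):
    """Per-bin ratio of the test sound spectrum to the test source spectrum."""
    X = np.fft.rfft(np.asarray(test_source, dtype=float))
    Y = np.fft.rfft(np.asarray(test_sound, dtype=float))
    # Assumption: bins where the test source has no energy carry no
    # information, so they are left at a neutral placeholder of 1.
    F = np.ones_like(Y)
    valid = np.abs(X) > eps
    F[valid] = Y[valid] / X[valid]
    return F

# Toy calibration: an impulse excites every bin; the simulated path
# attenuates the signal and adds a one-sample (circular) echo.
n = 64
test_source = np.zeros(n)
test_source[0] = 1.0
test_sound = 0.5 * test_source + 0.25 * np.roll(test_source, 1)
F = estimate_transfer_function(test_source, test_sound)
k = np.arange(n // 2 + 1)
expected = 0.5 + 0.25 * np.exp(-2j * np.pi * k / n)
```

A broadband test signal such as an impulse or a sweep is a natural choice here, because the ratio is only meaningful at frequencies that the test source actually emits.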
Fig. 2 is a flowchart illustrating a speech noise reduction method according to another embodiment of the present application. As shown in fig. 2, the method includes:
200. acquiring a sound signal collected by a microphone;
201. determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone;
202. if a target standard voice signal matching the source signal exists among the at least one standard voice signal, filtering the sound signal collected by the microphone with the target standard voice signal as a reference.
In this embodiment, a source signal emitted by a sound source is reversely deduced based on a sound signal collected by a microphone and a transfer function between the sound source and the microphone, and when a target standard speech signal matched with the reversely deduced source signal exists in at least one standard speech signal, a noise signal in the sound signal collected by the microphone is filtered. In the embodiment, various types of noise signals in the sound signals collected by the microphone can be filtered, and useful signals obtained after noise reduction are consistent with standard voice signals, so that the interference of the noise signals on voice processing can be effectively avoided.
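As an illustration of the back-derivation in step 201, the sketch below divides the spectrum of the collected sound signal by a stored transfer function and transforms the result back to the time domain. The frequency-domain division, the name `recover_source`, and the constant 0.5-gain transfer function are assumptions made for this example only.

```python
import numpy as np

def recover_source(sound, transfer, eps=1e-12):
    """Estimate the source signal as the inverse transform of Y(f) / H(f)."""
    y_spec = np.fft.rfft(sound)
    # Guard against (near-)zero transfer-function bins (hypothetical safeguard).
    safe = np.where(np.abs(transfer) < eps, eps, transfer)
    return np.fft.irfft(y_spec / safe, n=len(sound))

# Hypothetical setup: the microphone path halves the amplitude at every frequency.
rng = np.random.default_rng(1)
speech = rng.standard_normal(512)
transfer = np.full(512 // 2 + 1, 0.5 + 0j)
sound = np.fft.irfft(np.fft.rfft(speech) * transfer, n=512)

source = recover_source(sound, transfer)
```

In this noise-free toy case the recovered `source` coincides with `speech`; with noise present, the recovered signal would deviate from the emitted speech, which is what the later matching step checks.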
In an optional embodiment, the method further comprises:
performing a fourier transform on the source signal to determine a frequency spectrum of the source signal;
respectively calculating correlation coefficients between the frequency spectrum of the source signal and the frequency spectrum of at least one standard voice signal;
and acquiring a standard voice signal with the correlation coefficient within a preset range from at least one standard voice signal as a target standard voice signal matched with the source signal.
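The matching steps above might look like the following sketch. It assumes the Pearson correlation of magnitude spectra as the correlation coefficient and a preset range of [0.8, 1.0]; both choices, and the names `magnitude_spectrum` and `find_target_standard`, are hypothetical, since the text does not fix them.

```python
import numpy as np

def magnitude_spectrum(signal):
    """Magnitude of the one-sided Fourier spectrum."""
    return np.abs(np.fft.rfft(signal))

def find_target_standard(source, standards, low=0.8, high=1.0):
    """Return (index, correlation) of the best standard signal in [low, high], or (None, None)."""
    src = magnitude_spectrum(source)
    best_index, best_corr = None, None
    for i, std in enumerate(standards):
        corr = np.corrcoef(src, magnitude_spectrum(std))[0, 1]
        corr = float(np.clip(corr, -1.0, 1.0))  # absorb floating-point overshoot
        if low <= corr <= high and (best_corr is None or corr > best_corr):
            best_index, best_corr = i, corr
    return best_index, best_corr

# Hypothetical standard voice signals: two pure tones sampled at 8 kHz.
fs, n = 8000, 1000
t = np.arange(n) / fs
standards = [np.sin(2 * np.pi * 200 * t), np.sin(2 * np.pi * 1200 * t)]

# Source signal: an attenuated copy of the second standard plus weak noise.
rng = np.random.default_rng(2)
source = 0.7 * standards[1] + 0.01 * rng.standard_normal(n)
index, corr = find_target_standard(source, standards)
```

Here the second standard signal is selected because its spectrum peaks at the same frequency as the source signal, while the first peaks elsewhere and falls outside the preset range.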
In an alternative embodiment, step 202 comprises:
acquiring a frequency band of which the amplitude value does not meet the reservation condition from the frequency spectrum of the target standard voice signal as a frequency band to be filtered;
and according to the frequency band to be filtered, filtering the signals on the frequency band to be filtered in the sound signals collected by the microphone.
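One way to read this filtering step is sketched below. The reservation condition is assumed here to be "amplitude at least 10% of the peak amplitude of the target standard voice signal"; this threshold, the name `filter_by_standard`, and the tone-based test signals are hypothetical.

```python
import numpy as np

def filter_by_standard(sound, standard, keep_ratio=0.1):
    """Zero the bins of `sound` where the standard spectrum fails the reservation condition."""
    std_mag = np.abs(np.fft.rfft(standard, n=len(sound)))
    # Assumed reservation condition: amplitude >= keep_ratio * peak amplitude.
    keep = std_mag >= keep_ratio * std_mag.max()
    spec = np.fft.rfft(sound)
    spec[~keep] = 0.0  # bands to be filtered
    return np.fft.irfft(spec, n=len(sound)), keep

# Hypothetical signals: the standard is a 1200 Hz tone, the noise a 3000 Hz tone.
fs, n = 8000, 1000
t = np.arange(n) / fs
standard = np.sin(2 * np.pi * 1200 * t)
sound = standard + 0.5 * np.sin(2 * np.pi * 3000 * t)

filtered, keep = filter_by_standard(sound, standard)
```

Because the 3000 Hz component lies in a band where the standard signal has negligible amplitude, it is removed, and `filtered` reduces to the 1200 Hz tone.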
In an optional embodiment, the method further comprises:
taking a correlation coefficient between the target standard voice signal and the source signal as an amplitude adjustment coefficient corresponding to a reserved frequency band, wherein the reserved frequency band is a frequency band of the target standard voice signal in which the amplitude value meets the reservation condition;
and adjusting the amplitude value of the signal on the reserved frequency band in the sound signal collected by the microphone according to the amplitude adjustment coefficient.
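The amplitude adjustment can be illustrated as below. The sketch assumes that "adjusting according to the amplitude adjustment coefficient" means multiplying the reserved bins by the correlation coefficient; that interpretation, the name `adjust_reserved_bands`, and the example values are assumptions.

```python
import numpy as np

def adjust_reserved_bands(sound, keep_mask, corr):
    """Scale the reserved frequency bins of `sound` by the coefficient `corr`."""
    spec = np.fft.rfft(sound)
    spec[keep_mask] *= corr  # assumed meaning of "adjusting" in this sketch
    return np.fft.irfft(spec, n=len(sound))

# Hypothetical input: a 1200 Hz speech tone (bin 150) plus a 3000 Hz tone (bin 375).
fs, n = 8000, 1000
t = np.arange(n) / fs
speech_tone = np.sin(2 * np.pi * 1200 * t)
other_tone = 0.5 * np.sin(2 * np.pi * 3000 * t)

keep_mask = np.zeros(n // 2 + 1, dtype=bool)
keep_mask[150] = True  # reserved band: the bin holding the 1200 Hz tone
adjusted = adjust_reserved_bands(speech_tone + other_tone, keep_mask, corr=0.9)
```

Only the reserved bin is rescaled; components outside the reserved band are left for the separate filtering step.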
In an optional embodiment, before step 201, further comprising:
in a noise-free environment, sending a test source signal to a microphone by using a test sound source;
acquiring a test sound signal acquired by a microphone;
the ratio between the test sound signal and the test source signal is taken as a transfer function between the sound source and the microphone.
In an optional embodiment, the method further comprises:
if no target standard voice signal matching the source signal exists among the at least one standard voice signal, discarding the sound signal and continuing to acquire the sound signal collected by the microphone at the next moment.
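The discard-and-continue behaviour can be expressed as a simple loop over successive sound signals. In this sketch the callables `match` and `denoise` stand in for the matching and filtering steps; they, the name `process_frames`, and the toy data are hypothetical placeholders rather than components named in the application.

```python
from typing import Any, Callable, Iterable, Iterator, Optional

def process_frames(
    frames: Iterable[Any],
    match: Callable[[Any], Optional[Any]],
    denoise: Callable[[Any, Any], Any],
) -> Iterator[Any]:
    """Yield denoised frames; frames without a matching standard signal are dropped."""
    for frame in frames:
        target = match(frame)
        if target is None:
            continue  # discard this sound signal and move on to the next one
        yield denoise(frame, target)

# Toy stand-ins: a frame "matches" only if its peak amplitude exceeds 0.5.
frames = [[0.0, 0.1], [0.9, -0.8], [0.2, 0.0]]
match = lambda f: "standard-0" if max(abs(v) for v in f) > 0.5 else None
denoise = lambda f, target: (target, f)

results = list(process_frames(frames, match, denoise))
```

Only the second frame survives in this example, since it is the only one for which the placeholder matcher returns a target.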
Fig. 3 is a schematic structural diagram of a speech noise reduction device according to yet another embodiment of the present application. As shown in fig. 3, the voice noise reduction apparatus includes: memory 30, processor 31, and communications component 32.
The memory 30 is used for storing a computer program and may be configured to store various other data to support operations on the voice noise reduction apparatus. Examples of such data include instructions for any application or method operating on the voice noise reduction device, contact data, phonebook data, messages, pictures, videos, and the like.
The memory 30 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
A processor 31, coupled to the memory 30 and the communication component 32, for executing computer programs in the memory for:
acquiring sound signals collected by a microphone through the communication component 32;
determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone;
if a target standard voice signal matching the source signal exists among the at least one standard voice signal, filtering the sound signal collected by the microphone with the target standard voice signal as a reference.
In this embodiment, a source signal emitted by a sound source is reversely deduced based on a sound signal collected by a microphone and a transfer function between the sound source and the microphone, and when a target standard speech signal matched with the reversely deduced source signal exists in at least one standard speech signal, a noise signal in the sound signal collected by the microphone is filtered. In the embodiment, various types of noise signals in the sound signals collected by the microphone can be filtered, and useful signals obtained after noise reduction are consistent with standard voice signals, so that the interference of the noise signals on voice processing can be effectively avoided.
In an alternative embodiment, the processor 31 is further configured to:
performing a fourier transform on the source signal to determine a frequency spectrum of the source signal;
respectively calculating correlation coefficients between the frequency spectrum of the source signal and the frequency spectrum of at least one standard voice signal;
and acquiring a standard voice signal with the correlation coefficient within a preset range from at least one standard voice signal as a target standard voice signal matched with the source signal.
In an alternative embodiment, the processor 31, when performing filtering processing on the sound signal collected by the microphone with the target standard speech signal as a reference, is configured to:
acquiring a frequency band of which the amplitude value does not meet the reservation condition from the frequency spectrum of the target standard voice signal as a frequency band to be filtered;
and according to the frequency band to be filtered, filtering the signals on the frequency band to be filtered in the sound signals collected by the microphone.
In an alternative embodiment, the processor 31 is further configured to:
taking a correlation coefficient between the target standard voice signal and the source signal as an amplitude adjustment coefficient corresponding to a reserved frequency band, wherein the reserved frequency band is a frequency band of the target standard voice signal in which the amplitude value meets the reservation condition;
and adjusting the amplitude value of the signal on the reserved frequency band in the sound signal collected by the microphone according to the amplitude adjustment coefficient.
In an alternative embodiment, the processor 31 is further configured to, before determining the source signal emitted by the sound source based on the sound signal and the transfer function between the sound source and the microphone:
in a noise-free environment, sending a test source signal to a microphone by using a test sound source;
acquiring a test sound signal acquired by a microphone;
the ratio between the test sound signal and the test source signal is taken as a transfer function between the sound source and the microphone.
In an alternative embodiment, the processor 31 is further configured to:
if no target standard voice signal matching the source signal exists among the at least one standard voice signal, discarding the sound signal and continuing to acquire the sound signal collected by the microphone at the next moment.
Further, as shown in fig. 3, the voice noise reduction apparatus further includes: power supply components 33, and the like. Only some of the components are schematically shown in fig. 3, and it is not meant that the speech noise reduction apparatus includes only the components shown in fig. 3.
The communication component 32 is configured to facilitate wired or wireless communication between the device in which the communication component is located and other devices. The device in which the communication component is located may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component may be implemented based on Near Field Communication (NFC) technology, Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, or other technology to facilitate short-range communications.
The power supply component 33 supplies power to the various components of the device in which it is installed. The power supply component may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device in which the power supply component is located.
Accordingly, the present application further provides a computer readable storage medium storing a computer program, where the computer program, when executed, is capable of implementing the steps that can be performed by the voice noise reduction device in the foregoing method embodiments.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.
Claims (12)
1. A method for speech noise reduction, comprising:
acquiring a sound signal collected by a microphone;
determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone; a transfer function between the sound source and the microphone is predetermined; when the proportion corresponding to the noise signal contained in the sound signal is lower, the source signal is close to the voice signal sent by the sound source, and at least one standard voice signal has a target standard voice signal matched with the source signal;
and if a target standard voice signal matched with the source signal exists in the at least one standard voice signal, filtering the sound signal collected by the microphone by taking the target standard voice signal as a reference.
2. The method of claim 1, further comprising:
fourier transforming the source signal to determine a frequency spectrum of the source signal;
calculating correlation coefficients between the frequency spectrum of the source signal and the frequency spectrum of the at least one standard voice signal respectively;
and acquiring a standard voice signal with the correlation coefficient within a preset range from at least one standard voice signal as a target standard voice signal matched with the source signal.
3. The method according to claim 2, wherein the filtering the sound signal collected by the microphone with the target standard speech signal as a reference comprises:
acquiring a frequency band of which the amplitude value does not meet the reservation condition from the frequency spectrum of the target standard voice signal as a frequency band to be filtered;
and according to the frequency band to be filtered, filtering signals on the frequency band to be filtered in the sound signals collected by the microphone.
4. The method of claim 3, further comprising:
taking a correlation coefficient between the target standard voice signal and the source signal as an amplitude adjustment coefficient corresponding to a reserved frequency band, wherein the reserved frequency band is a frequency band in which an amplitude value in the target standard voice signal meets a reserved condition;
and adjusting the amplitude value of the signal on the reserved frequency band in the sound signal collected by the microphone according to the amplitude adjustment coefficient.
5. The method of claim 1, wherein prior to determining the source signal from the sound source based on the sound signal and a transfer function between the sound source and a microphone, further comprising:
in a noise-free environment, sending a test source signal to the microphone by using a test sound source;
acquiring a test sound signal acquired by the microphone;
the ratio between the test sound signal and the test source signal is taken as a transfer function between the sound source and the microphone.
6. The method according to any one of claims 1 to 5, further comprising:
and if the target standard voice signal matched with the source signal does not exist in the at least one standard voice signal, discarding the sound signal, and continuing to acquire the sound signal collected by the microphone at the next moment.
7. A voice noise reduction device comprising a memory, a processor, and a communication component;
the memory is to store one or more computer instructions;
the processor is coupled with the memory and the communication component to execute one or more computer instructions to:
acquiring a sound signal collected by a microphone through the communication assembly;
determining a source signal emitted by a sound source according to the sound signal and a transfer function between the sound source and a microphone; a transfer function between the sound source and the microphone is predetermined; when the proportion corresponding to the noise signal contained in the sound signal is lower, the source signal is close to the voice signal sent by the sound source, and at least one standard voice signal has a target standard voice signal matched with the source signal;
and if a target standard voice signal matched with the source signal exists in the at least one standard voice signal, filtering the sound signal collected by the microphone by taking the target standard voice signal as a reference.
8. The device of claim 7, wherein the processor is further configured to:
fourier transforming the source signal to determine a frequency spectrum of the source signal;
calculating correlation coefficients between the frequency spectrum of the source signal and the frequency spectrum of the at least one standard voice signal respectively;
and acquiring a standard voice signal with the correlation coefficient within a preset range from at least one standard voice signal as a target standard voice signal matched with the source signal.
9. The apparatus of claim 8, wherein the processor, when filtering the sound signal collected by the microphone with the target standard speech signal as a reference, is configured to:
acquiring a frequency band of which the amplitude value does not meet the reservation condition from the frequency spectrum of the target standard voice signal as a frequency band to be filtered;
and according to the frequency band to be filtered, filtering signals on the frequency band to be filtered in the sound signals collected by the microphone.
10. The device of claim 9, wherein the processor is further configured to:
taking a correlation coefficient between the target standard voice signal and the source signal as an amplitude adjustment coefficient corresponding to a reserved frequency band, wherein the reserved frequency band is a frequency band in which an amplitude value in the target standard voice signal meets a reserved condition;
and adjusting the amplitude value of the signal on the reserved frequency band in the sound signal collected by the microphone according to the amplitude adjustment coefficient.
11. The apparatus of any of claims 7 to 10, wherein the processor is further configured to:
and if the target standard voice signal matched with the source signal does not exist in the at least one standard voice signal, discarding the sound signal, and continuing to acquire the sound signal collected by the microphone at the next moment.
12. A computer-readable storage medium storing computer instructions, which when executed by one or more processors, cause the one or more processors to perform the method of speech noise reduction of any of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811288716.0A CN109410975B (en) | 2018-10-31 | 2018-10-31 | Voice noise reduction method, device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109410975A (en) | 2019-03-01 |
CN109410975B (en) | 2021-03-09 |
Family
ID=65470924
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811288716.0A Active CN109410975B (en) | 2018-10-31 | 2018-10-31 | Voice noise reduction method, device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109410975B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110267160B (en) * | 2019-05-31 | 2020-09-22 | 潍坊歌尔电子有限公司 | Sound signal processing method, device and equipment |
CN111050264A (en) * | 2019-11-13 | 2020-04-21 | 歌尔股份有限公司 | Noise test system and test method for simulating single-ended microphone |
CN111179907B (en) * | 2019-12-31 | 2024-08-20 | 深圳Tcl新技术有限公司 | Speech recognition test method, device, equipment and computer readable storage medium |
CN113160840B (en) * | 2020-01-07 | 2022-10-25 | 北京地平线机器人技术研发有限公司 | Noise filtering method, device, mobile equipment and computer readable storage medium |
CN112037825B (en) * | 2020-08-10 | 2022-09-27 | 北京小米松果电子有限公司 | Audio signal processing method and device and storage medium |
CN112420066B (en) * | 2020-11-05 | 2024-05-14 | 深圳市卓翼科技股份有限公司 | Noise reduction method, device, computer equipment and computer readable storage medium |
CN112565531B (en) * | 2020-12-12 | 2021-08-13 | 深圳波导智慧科技有限公司 | Recording method and device applied to multi-person voice conference |
CN113409809B (en) * | 2021-07-07 | 2023-04-07 | 上海新氦类脑智能科技有限公司 | Voice noise reduction method, device and equipment |
CN115810361A (en) * | 2021-09-14 | 2023-03-17 | 中兴通讯股份有限公司 | Echo cancellation method, terminal device and storage medium |
CN117174101A (en) * | 2022-05-25 | 2023-12-05 | 青岛海尔科技有限公司 | Noise signal processing method and device, storage medium and electronic device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107045874A (en) * | 2016-02-05 | 2017-08-15 | 深圳市潮流网络技术有限公司 | A kind of Non-linear Speech Enhancement Method based on correlation |
CN107204192A (en) * | 2017-06-05 | 2017-09-26 | 歌尔科技有限公司 | Tone testing method, sound enhancement method and device |
CN107945815A (en) * | 2017-11-27 | 2018-04-20 | 歌尔科技有限公司 | Voice signal noise-reduction method and equipment |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6511897B2 (en) * | 2015-03-24 | 2019-05-15 | 株式会社Jvcケンウッド | Noise reduction device, noise reduction method and program |
JP6677662B2 (en) * | 2017-02-14 | 2020-04-08 | 株式会社東芝 | Sound processing device, sound processing method and program |
Also Published As
Publication number | Publication date |
---|---|
CN109410975A (en) | 2019-03-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |