WO2017063516A1 - Noise signal determining method, voice denoising method, and device - Google Patents
Noise signal determining method, voice denoising method, and device
- Publication number
- WO2017063516A1 (application PCT/CN2016/101444)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- signal
- variance
- segment
- determining
- speech
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
- G10L21/0324—Details of processing therefor
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02168—Noise filtering characterised by the method used for estimating noise the estimation exclusively taking place during speech pauses
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L2025/783—Detection of presence or absence of voice signals based on threshold decision
Definitions
- The present application relates to the field of voice denoising technology, and in particular to a noise signal determining method, a voice denoising method, and a device.
- Speech denoising is a technique that improves speech quality by removing ambient noise in speech signals. In the process of speech denoising, it is first necessary to determine the power spectrum of the noise signal in the speech signal, and then denoise according to the determined power spectrum of the noise signal.
- In the prior art, the power spectrum of the noise signal in a voice signal is generally determined as follows: the first N frames of the voice signal are assumed to be a noise signal (i.e., to contain no human voice), and these first N frames are analyzed to obtain the power spectrum of the noise signal in the voice signal.
- Because the prior art merely assumes that the first N frames of the voice signal are a noise signal, the frames obtained in this way may not match the actual noise signal, which affects the accuracy of the acquired power spectrum of the noise signal.
- The purpose of the embodiments of the present application is to provide a noise signal determining method, a voice denoising method, and a device, so as to solve the prior-art problem that the assumed first N frames do not match the actual noise signal, which affects the accuracy of the acquired power spectrum of the noise signal.
- The noise signal determining method, the voice denoising method, and the device provided by the embodiments of the present application are implemented as follows:
- a method for determining a noise signal comprising:
- a speech denoising method comprising:
- a noise signal determining device includes:
- a power spectrum acquisition unit configured to perform Fourier transform on each frame signal in the speech signal segment to be analyzed, to obtain a power spectrum of each frame signal in the speech signal segment;
- a variance determining unit configured to determine, according to a power spectrum of the frame signal, a variance of each frame signal in the voice signal segment with respect to a power value at each frequency
- a noise determining unit configured to determine, according to the variance, whether each frame signal in the segment of the voice signal is a noise signal.
- a speech denoising device comprising:
- a segment determining unit configured to determine a segment of the speech signal to be analyzed included in the to-be-processed speech
- a power spectrum acquisition unit configured to perform Fourier transform on each frame signal in the speech signal segment to be analyzed, to obtain a power spectrum of each frame signal in the speech signal segment;
- a variance determining unit configured to determine, according to a power spectrum of the frame signal, a variance of each frame signal in the voice signal segment with respect to a power value at each frequency
- a noise determining unit configured to determine, according to the variance, whether each frame signal in the voice signal segment is a noise signal, and obtain a plurality of noise frames included in the voice signal segment;
- a voice denoising unit configured to determine a power average corresponding to the plurality of noise frames included in the voice signal segment, and perform voice denoising processing of the to-be-processed voice according to the power average of the noise frame.
- As can be seen from the technical solutions above, the noise signal determining method, the voice denoising method, and the device provided by the embodiments of the present application perform Fourier transform on the speech signal segment to be analyzed to obtain the power spectrum of each frame signal, determine the variance of the power values of each frame signal across frequencies, and finally determine from that variance whether the frame signal is a noise signal, thereby accurately obtaining the noise frames contained in the segment.
- During voice denoising, the to-be-processed voice is denoised according to the power average of the noise frames so determined, thereby improving the voice denoising effect.
- FIG. 1 is a flowchart of a method for determining a noise signal according to an embodiment of the present application
- FIG. 2 is a flowchart of a step of determining whether a frame signal is a noise signal in an embodiment of the present application
- FIG. 3 is a flowchart of a step of determining a variance of a power value of a frame signal at each sampling point in the embodiment of the present application;
- FIG. 5 is a flowchart of a voice denoising method according to an embodiment of the present application.
- FIG. 6 is a block diagram of a noise signal determining apparatus according to an embodiment of the present application.
- FIG. 7 is a block diagram of a voice denoising device according to an embodiment of the present application.
- FIG. 8 is a schematic structural diagram of hardware implementation of the apparatus provided by the present application.
- the noise signal determining method of the embodiment includes the following steps:
- S101: Perform Fourier transform on each frame signal in the speech signal segment to be analyzed to obtain the power spectrum of each frame signal in the segment.
- the segment of the speech signal to be analyzed may be intercepted from the speech to be processed by certain rules.
- the segment of the speech signal to be analyzed may be a "suspected noise frame segment" that may initially contain more noise frames.
- The method further includes:
- determining, according to the amplitude change of the time-domain signal of the to-be-processed voice, that a speech signal segment of the to-be-processed voice whose amplitude variation is less than a preset threshold is the speech signal segment to be analyzed.
- A noise signal is usually a segment with a small or relatively uniform amplitude, whereas a segment containing human speech usually fluctuates strongly.
- A preset threshold can therefore be set for identifying the "suspected noise frame segment" contained in the speech to be processed (i.e., the speech to be denoised).
- A segment of the to-be-processed speech whose amplitude variation is less than the preset threshold may then be determined as the speech signal segment to be analyzed.
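The segment selection described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how "amplitude variation" is measured, so the sketch uses the peak-to-peak amplitude inside fixed-size windows, and the function name `low_variation_segments`, the window size, and the threshold value are all hypothetical.

```python
from typing import List, Tuple


def low_variation_segments(
    samples: List[float], window: int, threshold: float
) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges whose amplitude variation is small.

    Assumption: "amplitude variation" is taken as the peak-to-peak value
    (max - min) inside each non-overlapping window of `window` samples.
    """
    segments: List[Tuple[int, int]] = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        if max(chunk) - min(chunk) < threshold:
            if segments and segments[-1][1] == start:
                # Extend the previous segment when the windows are adjacent.
                segments[-1] = (segments[-1][0], start + window)
            else:
                segments.append((start, start + window))
    return segments


quiet = [0.01, -0.01, 0.02, -0.02]
loud = [0.5, -0.6, 0.7, -0.4]
signal = quiet + quiet + loud + quiet
print(low_variation_segments(signal, window=4, threshold=0.1))
```

Each returned range is a candidate "suspected noise frame segment" that would then be passed to step S101.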
- The speech signal is first divided into frames.
- A frame signal refers to a single frame of the speech signal,
- and a speech signal segment includes several frame signals.
- A frame signal may include several sampling points, for example 1024 sampling points, and two adjacent frame signals may overlap each other (for example, with an overlap of 50%).
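A minimal sketch of the framing step, assuming that the overlap is expressed as a fraction of the frame length and that trailing samples that do not fill a whole frame are dropped. The helper name `split_into_frames` is hypothetical; the defaults of 1024 samples and 50% overlap are taken from the example values in the text.

```python
from typing import List


def split_into_frames(
    samples: List[float], frame_len: int = 1024, overlap: float = 0.5
) -> List[List[float]]:
    """Split a time-domain signal into overlapping frames.

    Assumption: incomplete trailing frames are discarded rather than padded.
    """
    hop = max(1, int(frame_len * (1.0 - overlap)))  # step between frame starts
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


frames = split_into_frames([float(i) for i in range(10)], frame_len=4, overlap=0.5)
print(len(frames), frames[1])
```

With a frame length of 4 and 50% overlap, consecutive frames start 2 samples apart, so each frame shares its second half with the first half of the next frame.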
- The power spectrum (frequency domain) of the speech signal can be obtained by performing a short-time Fourier transform (STFT) on the speech signal in the time domain.
- The power spectrum contains a plurality of power values corresponding to different frequencies, for example 1024 power values.
- In a voice signal that includes a human voice, the signal during a period of time (e.g., 1.5 s) before the person starts speaking may be a noise signal (ambient noise).
- The embodiment of the present application may therefore take the first N frames of a voice signal as the voice signal to be analyzed. For example, the voice signal to be analyzed is the first 1.5 seconds of the voice signal, $\{f'_1, f'_2, \ldots, f'_n\}$, where $f'_1, f'_2, \ldots, f'_n$ denote the respective frame signals contained in that voice signal.
- the purpose of the embodiment of the present application is to determine which of the analyzed speech signals are noise signals.
- a plurality of power values corresponding to each frame signal can be calculated.
- Suppose the Fourier transform of a certain frame signal at a certain frequency is $a + bi$, where $a$ is the real part and $b$ is the imaginary part (together they determine the amplitude and the phase at that frequency).
- The power value of the frame signal at that frequency is then $a^2 + b^2$.
- If each frame signal in $\{f'_1, f'_2, \ldots, f'_n\}$ contains 1024 sampling points,
- 1024 power values of each frame signal at different frequencies can be obtained according to the power spectrum.
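The power computation above can be illustrated with a direct discrete Fourier transform. This is a sketch under stated assumptions: a naive O(n²) DFT stands in for the FFT/STFT a real implementation would likely use, no window function is applied, and the function name `power_spectrum` is hypothetical.

```python
import cmath
from typing import List


def power_spectrum(frame: List[float]) -> List[float]:
    """Return one power value a^2 + b^2 per DFT bin of a single frame.

    Assumption: a plain DFT without windowing; a frame of n samples
    yields n power values (e.g. 1024 values for 1024 samples).
    """
    n = len(frame)
    powers: List[float] = []
    for k in range(n):
        # DFT coefficient a + bi at bin k.
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame)
        )
        powers.append(coeff.real ** 2 + coeff.imag ** 2)
    return powers


frame = [1.0, 0.0, -1.0, 0.0]  # one cosine cycle over four samples
print([round(p, 6) for p in power_spectrum(frame)])
```

For this four-sample cosine the energy sits in bins 1 and 3 (the positive and mirrored negative frequency), while bins 0 and 2 are empty.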
- the power value corresponding to the frame signal $f'_1$ is:
- the power value corresponding to the frame signal $f'_2$ is:
- the power value corresponding to the frame signal $f'_n$ is:
- S102: Determine, according to the power spectrum of the frame signal, the variance of each frame signal in the voice signal segment with respect to the power values at the respective frequencies.
- From the power values of the frame signals $\{f'_1, f'_2, \ldots, f'_n\}$ at the respective frequencies, the variances of the power values, $\{\mathrm{Var}(f'_1), \mathrm{Var}(f'_2), \ldots, \mathrm{Var}(f'_n)\}$, can be calculated for the respective frame signals according to the variance formula.
- Here $\mathrm{Var}(f'_1)$ is the variance of the power values of $f'_1$, $\mathrm{Var}(f'_2)$ is the variance of the power values of $f'_2$, ..., and $\mathrm{Var}(f'_n)$ is the variance of the power values of $f'_n$.
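As a sketch of step S102, the variance of each frame's power values can be computed with the standard library. The text does not say whether the population or the sample variance is meant; the population variance (`statistics.pvariance`) is assumed here, and the name `frame_variances` is hypothetical.

```python
from statistics import pvariance
from typing import List


def frame_variances(power_spectra: List[List[float]]) -> List[float]:
    """Return Var(f'_i) for each frame: the variance of its power values.

    Assumption: population variance over all frequency bins of the frame.
    """
    return [pvariance(powers) for powers in power_spectra]


spectra = [
    [1.0, 1.0, 1.0, 1.0],  # flat spectrum: energy spread evenly over frequency
    [0.0, 4.0, 0.0, 4.0],  # peaky spectrum: energy concentrated in a few bins
]
print(frame_variances(spectra))
```

A flat spectrum gives a variance near zero, while a spectrum with energy concentrated at a few frequencies gives a larger variance.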
- S103 Determine, according to the variance, whether each frame signal in the voice signal segment is a noise signal.
- The energy (i.e., power value) of a frame signal that contains speech varies strongly across frequency bands.
- The energy of a frame signal that does not contain speech (i.e., a noise signal) varies relatively little across frequency bands, and its distribution is relatively uniform. Therefore, whether a frame signal is a noise signal can be determined according to the variance of its power values.
- Step S103 may include:
- S1031: Determine whether the variance of the frame signal with respect to the power value is greater than a first threshold $T_1$.
- If the variance of a certain frame signal with respect to the power value exceeds the first threshold $T_1$, the variation of the energy (i.e., power value) of that frame signal across frequency bands exceeds $T_1$, so it can be determined that the frame signal is not a noise signal.
- If the variance of a certain frame signal with respect to the power value does not exceed the first threshold $T_1$, the variation of the energy (i.e., power value) of that frame signal across frequency bands does not exceed $T_1$, so it can be determined that the frame signal is a noise signal.
- Proceeding in this way, the frame signals of the speech signal to be analyzed, $\{f'_1, f'_2, \ldots, f'_n\}$, can be divided into those that belong to the noise signal, $\{f'_1, f'_2, \ldots, f'_m\}$, and those that do not, $\{f'_{m+1}, f'_{m+2}, \ldots, f'_n\}$, so that the noise signals contained in a piece of speech signal can be determined and speech denoising can be performed according to these noise signals $\{f'_1, f'_2, \ldots, f'_m\}$.
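The threshold decision of step S1031 can be sketched as follows. The function name `split_by_threshold` and the threshold value are hypothetical; the text only states that a variance above $T_1$ marks a non-noise frame and that thresholds are obtained from test samples.

```python
from typing import List, Tuple


def split_by_threshold(
    variances: List[float], t1: float
) -> Tuple[List[int], List[int]]:
    """Return (noise frame indices, non-noise frame indices).

    A frame whose variance exceeds t1 is treated as non-noise;
    otherwise it is treated as a noise frame.
    """
    noise: List[int] = []
    non_noise: List[int] = []
    for index, variance in enumerate(variances):
        (non_noise if variance > t1 else noise).append(index)
    return noise, non_noise


noise_idx, speech_idx = split_by_threshold([0.2, 0.4, 3.5, 0.1], t1=1.0)
print(noise_idx, speech_idx)
```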
- Step S102 may specifically include:
- Variance statistics are computed for each frame signal in the frequency domain. Since non-noise signals are generally concentrated in the low and middle frequency bands, whereas noise signals are generally distributed uniformly over all frequency bands, the variance is calculated separately over at least two different frequency bands (i.e., frequency intervals) of the power values of each frame signal.
- The first frequency interval may be 0 to 2000 Hz (low frequency band), and the second frequency interval may be 2000 to 4000 Hz (high frequency band).
- The 1024 power values corresponding to each frame signal are classified, according to the frequency interval in which each frequency lies, into the first power value set A corresponding to 0 to 2000 Hz and the second power value set B corresponding to 2000 to 4000 Hz.
- For a given frame signal, the power values that fall into the first power value set A (and likewise into the second power value set B) are obtained from its 1024 power values according to the frequency interval, and so on for the other frame signals.
- More than two frequency bands may also be used, with the variance of the signal power values computed separately for each band.
- S1022: Determine a first variance of the power values included in the first power value set.
- From the power values included in the first power value set A, the first variance $\mathrm{Var}_{low}(f'_1)$ can be calculated according to the variance formula.
- S1023: Determine a second variance of the power values included in the second power value set.
- From the power values included in the second power value set B, the second variance $\mathrm{Var}_{high}(f'_1)$ can be calculated according to the variance formula.
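The band-wise variance can be sketched as below. Several details are assumptions rather than statements from the text: bin k of an n-point transform is mapped to the frequency k * sample_rate / n, only bins up to the Nyquist frequency are used, the sample rate of 8000 Hz is chosen so that the Nyquist frequency matches the 4000 Hz upper band edge, and the name `band_variances` is hypothetical.

```python
from statistics import pvariance
from typing import List, Tuple


def band_variances(
    power: List[float],
    sample_rate: float,
    split_hz: float = 2000.0,
    max_hz: float = 4000.0,
) -> Tuple[float, float]:
    """Return (first variance, second variance) for one frame.

    Set A holds power values below split_hz (low band); set B holds the
    values from split_hz up to max_hz (high band). Both sets must be
    non-empty, otherwise pvariance raises StatisticsError.
    """
    n = len(power)
    set_a: List[float] = []
    set_b: List[float] = []
    for k in range(n // 2 + 1):  # bins from 0 Hz up to the Nyquist frequency
        freq = k * sample_rate / n
        if freq < split_hz:
            set_a.append(power[k])
        elif freq <= max_hz:
            set_b.append(power[k])
    return pvariance(set_a), pvariance(set_b)


# Eight bins at 8000 Hz: bins 0..4 sit at 0, 1000, 2000, 3000, 4000 Hz.
example_power = [2.0, 4.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
var_low, var_high = band_variances(example_power, sample_rate=8000.0)
print(var_low, var_high)
```

In the example the low band varies while the high band is flat, which is the pattern the text associates with a frame that contains speech.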
- FIG. 4 is a schematic diagram of the variance curves in the embodiment of the present application.
- The horizontal axis represents the frame number of the frame signal,
- and the vertical axis represents the magnitude of the variance.
- The first variance curve shows the trend of the first variance of each of the above frame signals, and the second variance curve shows the trend of the second variance of each of the above frame signals.
- Step S1031 may specifically include:
- determining whether the first variance of the frame signal with respect to the power value is greater than the first threshold $T_1$. If not, it is determined that the frame signal is a noise signal. Taking the frame signal $f'_1$ as an example, it is determined whether the first variance $\mathrm{Var}_{low}(f'_1)$ is greater than the first threshold $T_1$.
- step S103 may further include:
- the frame signal is determined to be a noise signal.
- the difference between the first variance and the second variance is:
- In this way, it can be determined in sequence which frame signals of the speech signal to be analyzed, $\{f'_1, f'_2, \ldots, f'_n\}$, are noise signals.
- Between step S102 and step S103, the method further includes:
- sorting the frame signals in the speech signal segment to be analyzed according to the size of the variance.
- Determining, according to the variance, whether each frame signal in the voice signal segment is a noise signal then comprises: determining, according to the sorted variances of the power values of the frame signals, whether each frame signal in the voice signal segment is a noise signal.
- The present embodiment can determine, for the frame signals $\{f'_1, f'_2, \ldots, f'_n\}$, the respective variances of the power values $\{\mathrm{Var}(f'_1), \mathrm{Var}(f'_2), \ldots, \mathrm{Var}(f'_n)\}$.
- The frame signals are sorted by the variance of their power values from small to large. The smaller the variance, the more likely the frame is a noise signal. Sorting therefore moves the frame signals that belong to the noise signal to the front of the speech signal to be analyzed.
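The sorting step can be sketched as an ascending sort that keeps track of the original frame indices. The function name `sort_frames_by_variance` is hypothetical.

```python
from typing import List


def sort_frames_by_variance(variances: List[float]) -> List[int]:
    """Return frame indices ordered from smallest to largest variance."""
    return sorted(range(len(variances)), key=lambda i: variances[i])


order = sort_frames_by_variance([0.9, 0.1, 0.5])
print(order)
```

Returning indices rather than the sorted variances makes it possible to look up the power spectrum of each frame again once the noise frames have been identified.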
- When the variances of the low frequency band (for example, 0 to 2000 Hz) and the high frequency band (for example, 2000 to 4000 Hz) are computed separately, the power value of each frame signal in $\{f'_1, f'_2, \ldots, f'_n\}$ at each frequency is classified, according to the frequency interval in which that frequency lies, into the first power value set A corresponding to the first frequency interval (for example, 0 to 2000 Hz) and the second power value set B corresponding to the second frequency interval (for example, 2000 to 4000 Hz).
- The first variances $\{\mathrm{Var}_{low}(f'_1), \mathrm{Var}_{low}(f'_2), \ldots, \mathrm{Var}_{low}(f'_n)\}$ of the power values in the first power value sets of the frame signals $\{f'_1, f'_2, \ldots, f'_n\}$ are then determined, and likewise the second variances $\{\mathrm{Var}_{high}(f'_1), \mathrm{Var}_{high}(f'_2), \ldots, \mathrm{Var}_{high}(f'_n)\}$ of the power values in the second power value sets.
- The above step S104 may determine the noise signal included in the speech signal to be analyzed (which may be a speech signal sorted by variance) as follows:
- For each frame signal $f'_i$ in turn, it can be determined whether the difference $\mathrm{Var}_{high}(f'_{i+1}) - \mathrm{Var}_{high}(f'_{i-1})$ between the second variance of the following frame signal $f'_{i+1}$ and the second variance of the preceding frame signal $f'_{i-1}$ is greater than the third threshold $T_3$; if not, the frame signal $f'_i$ is determined to be a noise frame signal. The set of noise frame signals so determined is taken as the noise signal.
- Likewise, for each frame signal $f'_i$ in turn, it can be determined whether the difference $\mathrm{Var}_{low}(f'_{i+1}) - \mathrm{Var}_{low}(f'_{i-1})$ between the first variance of the following frame signal $f'_{i+1}$ and the first variance of the preceding frame signal $f'_{i-1}$ is greater than the fourth threshold $T_4$; if not, the frame signal $f'_i$ is determined to be a noise frame signal. The set of noise frame signals so determined is taken as the noise signal.
- The noise frames included in the speech signal to be analyzed may be identified by the above formulas (1) to (4). That is, for any frame signal $f'_i$, if it satisfies any one of the above formulas (1) to (4), it can be determined that the frame signal is a non-noise signal (noise cutoff frame). In other words, for any frame signal $f'_i$, if none of the above formulas (1) to (4) is satisfied, it can be determined that the frame signal is a noise signal.
- If the noise cutoff frame is $f'_m$, then the noise frames comprise $\{f'_1, f'_2, \ldots, f'_{m-1}\}$.
- The noise cutoff frame can also be determined by only some of the above formulas (1) to (4), such as formula (1) and formula (2), or formula (2) and formula (3). Furthermore, the formulas for determining the noise cutoff frame in the embodiment of the present application are not limited to those listed above.
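The search for the noise cutoff frame can be sketched as follows. Formulas (1) to (4) are not reproduced in this text, so the four conditions below are assumptions pieced together from the surrounding description: the first variance against $T_1$, the absolute difference between the first and second variance against $T_2$, and the neighbour differences of the second and first variances against $T_3$ and $T_4$. The function name `noise_cutoff_index`, the handling of the first and last frame, and all threshold values are hypothetical.

```python
from typing import List


def noise_cutoff_index(
    var_low: List[float],
    var_high: List[float],
    t1: float,
    t2: float,
    t3: float,
    t4: float,
) -> int:
    """Return the index m of the first frame that meets any cutoff condition.

    Frames 0..m-1 are then treated as noise frames. If no frame meets a
    condition, len(var_low) is returned (every frame is a noise frame).
    Assumption: the lists are already sorted by variance, and the edge
    frames reuse themselves as the missing neighbour.
    """
    n = len(var_low)
    for i in range(n):
        prev_i, next_i = max(i - 1, 0), min(i + 1, n - 1)
        is_cutoff = (
            var_low[i] > t1
            or abs(var_low[i] - var_high[i]) > t2
            or var_high[next_i] - var_high[prev_i] > t3
            or var_low[next_i] - var_low[prev_i] > t4
        )
        if is_cutoff:
            return i
    return n


low = [0.1, 0.2, 0.3, 5.0, 6.0]
high = [0.1, 0.1, 0.2, 0.3, 0.4]
m = noise_cutoff_index(low, high, t1=1.0, t2=1.0, t3=10.0, t4=10.0)
print(m, list(range(m)))
```

Using only a subset of the conditions, as the text allows, amounts to dropping terms from the `is_cutoff` expression.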
- the above thresholds T 1 , T 2 , T 3 , and T 4 are all obtained by counting a large number of test samples.
- FIG. 5 is a flowchart of a voice denoising method according to an embodiment of the present application, including:
- S201 Determine a segment of the speech signal to be analyzed included in the to-be-processed speech.
- S202 Perform Fourier transform on each frame signal in the segment of the speech signal to be analyzed to obtain a power spectrum of each frame signal in the segment of the speech signal.
- S203 Determine, according to a power spectrum of the frame signal, a variance of each frame signal in the voice signal segment with respect to a power value at each frequency.
- S204 Determine, according to the variance, whether each frame signal in the voice signal segment is a noise signal, and obtain a plurality of noise frames included in the voice signal segment.
- S205 Determine a power average corresponding to the plurality of noise frames included in the voice signal segment, and perform voice denoising processing on the to-be-processed voice according to the power average of the noise frame.
- The speech denoising process can then be performed. Since denoising methods are well known in the art, they are not described in detail herein.
- The step of sorting the frame signals according to the variance may also be omitted, in which case each frame of the original signal is examined directly to determine which frames are noise frames.
- Usually only a part of the noise frames is taken to calculate the power spectrum estimate $P_{noise}$. For example, if the determined noise signal comprises 50 frames, the first 30 frames are taken to calculate the power spectrum estimate $P_{noise}$, which improves the accuracy of the power spectrum estimate.
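The noise power estimate can be sketched as a per-frequency average over the first few noise frames. The text leaves the actual denoising method unspecified, so the second function uses plain power spectral subtraction with a floor at zero purely as an illustrative stand-in; both function names are hypothetical, and the default of 30 frames is the example value from the text.

```python
from typing import List


def average_noise_power(
    noise_spectra: List[List[float]], max_frames: int = 30
) -> List[float]:
    """Return P_noise: the per-frequency mean power of the first max_frames noise frames.

    Assumption: noise_spectra is non-empty and all frames have the same length.
    """
    used = noise_spectra[:max_frames]
    return [sum(column) / len(used) for column in zip(*used)]


def subtract_noise(power: List[float], p_noise: List[float]) -> List[float]:
    """Illustrative spectral subtraction: remove P_noise and clamp at zero."""
    return [max(p - n, 0.0) for p, n in zip(power, p_noise)]


noise_frames = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
p_noise = average_noise_power(noise_frames, max_frames=2)
print(p_noise, subtract_noise([5.0, 1.0], p_noise))
```

Limiting the average to the lowest-variance frames keeps borderline frames, such as the third one in the example, out of the estimate.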
- the embodiment of the present application further provides a noise signal determining device.
- The device can be implemented by software, by hardware, or by a combination of hardware and software.
- Taking software implementation as an example, the device in the logical sense is formed by the CPU (Central Processing Unit) of the server reading the corresponding computer program instructions into the memory.
- a hardware structure of the device can be seen in FIG.
- FIG. 6 is a block diagram of a noise signal determining apparatus according to an embodiment of the present application.
- the functions of the units in the device may correspond to the functions in the steps of the noise signal determining method.
- the noise signal determining apparatus 100 includes:
- the power spectrum acquisition unit 101 is configured to perform Fourier transform on each frame signal in the speech signal segment to be analyzed, to obtain a power spectrum of each frame signal in the speech signal segment;
- the variance determining unit 102 is configured to determine, according to a power spectrum of the frame signal, a variance of each frame signal in the voice signal segment with respect to a power value at each frequency;
- the noise determining unit 103 is configured to determine, according to the variance, whether each frame signal in the voice signal segment is a noise signal.
- the device further includes: a segment obtaining unit, configured to:
- a segment of the speech signal included in the to-be-processed speech that has a magnitude change less than a preset threshold is the segment of the speech signal to be analyzed
- the noise determining unit 103 is configured to:
- the frame signal is determined to be a noise signal.
- the variance determination unit 102 is configured to:
- the noise determining unit 103 is configured to:
- the frame signal is determined to be a noise signal.
- the variance determining unit 102 is specifically configured to:
- the power value of the frame signal at each frequency is classified into the first power value set corresponding to the first frequency interval, and the second a second power value set corresponding to the frequency interval; wherein the first frequency interval is smaller than the second frequency interval;
- the noise determining unit 103 is configured to:
- the frame signal is determined to be a noise signal.
- the embodiment of the present application further provides a voice denoising device.
- The device can be implemented by software, by hardware, or by a combination of hardware and software.
- Taking software implementation as an example, the device in the logical sense is formed by the CPU (Central Processing Unit) of the server reading the corresponding computer program instructions into the memory.
- a hardware structure of the device can be seen in FIG.
- FIG. 7 is a block diagram of a speech denoising apparatus according to an embodiment of the present application.
- The functions of the units in the device correspond to the steps of the speech denoising method.
- The speech denoising apparatus 200 includes:
- a segment determining unit 201, configured to determine the speech signal segment to be analyzed included in the to-be-processed speech;
- a power spectrum acquisition unit 202, configured to perform a Fourier transform on each frame signal in the speech signal segment to be analyzed, to obtain a power spectrum of each frame signal in the speech signal segment;
- a variance determining unit 203, configured to determine, according to the power spectrum of the frame signal, a variance of each frame signal in the speech signal segment with respect to the power values at each frequency;
- a noise determining unit 205, configured to determine, according to the variance, whether each frame signal in the speech signal segment is a noise signal, and to obtain the noise frames included in the speech signal segment;
- a voice denoising unit 10, configured to determine a power average corresponding to the noise frames included in the speech signal segment, and to perform speech denoising processing on the to-be-processed speech according to the power average of the noise frames.
- The apparatus further comprises a sorting unit 204, configured to:
- sort the frame signals in the speech signal segment to be analyzed according to the magnitude of the variance.
- The noise determining unit 205 is then specifically configured to:
- determine, based on the sorted variances of the frame signals with respect to the power values at each frequency, whether each frame signal in the speech signal segment is a noise signal.
- The noise signal determining method, the speech denoising method, and the apparatuses described above perform a Fourier transform on each frame signal in the speech signal segment to be analyzed to obtain the power spectrum of each frame signal, determine the variance of each frame signal with respect to the power values at each frequency, and finally determine, according to that variance, whether the frame signal is a noise signal, thereby accurately obtaining the noise frames included in the speech signal segment to be analyzed. During speech denoising, the to-be-processed speech can then be denoised according to the power average of the noise frames so determined, which improves the speech denoising effect.
- Embodiments of the present invention can be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the invention may take the form of a computer program product embodied on one or more computer-usable storage media containing computer-usable program code, including but not limited to disk storage, CD-ROM, optical storage, and the like.
- The computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus. The instruction apparatus implements the functions specified in one or more flows of the flowchart and/or in one or more blocks of the block diagram.
- These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or in one or more blocks of the block diagram.
- Embodiments of the present application can be provided as a method, a system, or a computer program product.
- The present application can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware.
- The application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
- The application can be described in the general context of computer-executable instructions executed by a computer, such as program modules.
- Program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types.
- The present application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network.
- In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
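The flow summarized in this description (Fourier transform of each frame, variance of the power values across frequencies, comparison with a first threshold, then averaging the power of the noise frames) can be illustrated with a minimal Python sketch. It is not the applicant's implementation. All function names, the naive DFT, and the use of the population variance are assumptions made for illustration. The final `subtract_noise_power` step is an assumed power-domain spectral subtraction; the description only states that denoising is performed according to the power average of the noise frames.

```python
import cmath
import math
from statistics import fmean, pvariance


def power_spectrum(frame):
    """Power |X[k]|^2 of a real frame at each non-negative frequency bin (naive DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(frame))
        spectrum.append(abs(x_k) ** 2)
    return spectrum


def find_noise_frames(frames, first_threshold):
    """Indices of frames whose variance of power values is NOT greater than the threshold."""
    noise = []
    for i, frame in enumerate(frames):
        if not pvariance(power_spectrum(frame)) > first_threshold:
            noise.append(i)
    return noise


def noise_power_mean(frames, noise_indices):
    """Per-frequency power average over the frames classified as noise."""
    spectra = [power_spectrum(frames[i]) for i in noise_indices]
    return [fmean(bin_values) for bin_values in zip(*spectra)]


def subtract_noise_power(spectrum, noise_mean):
    """Assumed denoising step: subtract the noise power per bin, floored at zero."""
    return [max(p - n, 0.0) for p, n in zip(spectrum, noise_mean)]


# Hypothetical 32-sample frames: an impulse (flat spectrum) and a pure tone (peaked spectrum).
impulse = [1.0] + [0.0] * 31
tone = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
frames = [impulse, tone]

noise_indices = find_noise_frames(frames, first_threshold=1.0)
noise_mean = noise_power_mean(frames, noise_indices)
cleaned_tone = subtract_noise_power(power_spectrum(tone), noise_mean)
```

The flat-spectrum frame has near-zero variance and is kept as a noise frame, while the tone concentrates its power in one bin and is rejected. In practice the threshold value and the frame length would have to be tuned to the signal at hand.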
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Circuit For Audible Band Transducer (AREA)
- Telephone Function (AREA)
- Telephonic Communication Services (AREA)
- Noise Elimination (AREA)
- Mobile Radio Communication Systems (AREA)
Claims (18)
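- A noise signal determining method, characterized by comprising: performing a Fourier transform on each frame signal in a speech signal segment to be analyzed, to obtain a power spectrum of each frame signal in the speech signal segment; determining, according to the power spectrum of the frame signal, a variance of each frame signal in the speech signal segment with respect to the power values at the respective frequencies; and determining, according to the variance, whether each frame signal in the speech signal segment is a noise signal.
- The method according to claim 1, characterized in that, before performing the Fourier transform on each frame signal in the speech signal segment to be analyzed to obtain the power spectrum of each frame signal in the speech signal segment, the method further comprises: determining, according to the amplitude change of the time-domain signal of to-be-processed speech, that a speech signal segment included in the to-be-processed speech whose amplitude change is less than a preset threshold is the speech signal segment to be analyzed; or intercepting the first N frames of speech signal in the to-be-processed speech as the speech signal segment to be analyzed.
- The method according to claim 1, characterized in that determining, according to the variance, whether each frame signal in the speech signal segment is a noise signal comprises: determining whether the variance corresponding to each frame signal in the speech signal segment is greater than a first threshold; and if not, determining the frame signal to be a noise signal.
- The method according to claim 3, characterized in that determining, according to the power spectrum of the frame signal, the variance of each frame signal in the speech signal segment with respect to the power values at the respective frequencies comprises: classifying, according to the frequency interval in which the frequencies corresponding to the power spectrum lie, at least the power values of the frame signal at the respective frequencies into a first power value set corresponding to a first frequency interval; and determining a first variance of the power values contained in the first power value set; and correspondingly, determining whether the variance is greater than the first threshold comprises: determining whether the first variance is greater than the first threshold.
- The method according to claim 1, characterized in that determining, according to the power spectrum of the frame signal, the variance of each frame signal in the speech signal segment with respect to the power values at the respective frequencies comprises: classifying, according to the frequency interval in which the frequency corresponding to each power value of each frame signal lies, at least the power values of the frame signal at the respective frequencies into a first power value set corresponding to a first frequency interval and a second power value set corresponding to a second frequency interval, wherein the first frequency interval is lower than the second frequency interval; determining a first variance of the power values contained in the first power value set; and determining a second variance of the power values contained in the second power value set; and correspondingly, determining, according to the variance, whether each frame signal in the speech signal segment is a noise signal comprises: determining whether the difference between the first variance and the second variance corresponding to each frame signal is greater than a second threshold; and if not, determining the frame signal to be a noise signal.
- The method according to claim 1, characterized in that, after determining, according to the power spectrum of the frame signal, the variance of each frame signal in the speech signal segment with respect to the power values at the respective frequencies, and before determining, according to the variance, whether each frame signal in the speech signal segment is a noise signal, the method further comprises: sorting the frame signals in the speech signal segment to be analyzed according to the magnitude of the variance; and correspondingly, determining, according to the variance, whether each frame signal in the speech signal segment is a noise signal comprises: determining, based on the sorted variances of the frame signals with respect to the power values at the respective frequencies, whether each frame signal in the speech signal segment is a noise signal.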
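- A speech denoising method, characterized by comprising: determining a speech signal segment to be analyzed included in to-be-processed speech; performing a Fourier transform on each frame signal in the speech signal segment to be analyzed, to obtain a power spectrum of each frame signal in the speech signal segment; determining, according to the power spectrum of the frame signal, a variance of each frame signal in the speech signal segment with respect to the power values at the respective frequencies; determining, according to the variance, whether each frame signal in the speech signal segment is a noise signal, to obtain a plurality of noise frames included in the speech signal segment; and determining a power average corresponding to the plurality of noise frames included in the speech signal segment, and performing speech denoising processing on the to-be-processed speech according to the power average of the noise frames.
- The method according to claim 7, characterized in that determining the speech signal segment to be analyzed included in the to-be-processed speech comprises: determining, according to the amplitude change of the time-domain signal of the to-be-processed speech, that a speech signal segment included in the to-be-processed speech whose amplitude change is less than a preset threshold is the speech signal segment to be analyzed; or intercepting the first N frames of speech signal in the to-be-processed speech as the speech signal segment to be analyzed.
- The method according to claim 7, characterized in that determining, according to the variance, whether each frame signal in the speech signal segment is a noise signal comprises: determining whether the variance corresponding to each frame signal in the speech signal segment is greater than a first threshold; and if not, determining the frame signal to be a noise signal.
- The method according to claim 9, characterized in that determining, according to the power spectrum of the frame signal, the variance of each frame signal in the speech signal segment with respect to the power values at the respective frequencies comprises: classifying, according to the frequency interval in which the frequencies corresponding to the power spectrum lie, at least the power values of the frame signal at the respective frequencies into a first power value set corresponding to a first frequency interval; and determining a first variance of the power values contained in the first power value set; and correspondingly, determining whether the variance is greater than the first threshold comprises: determining whether the first variance is greater than the first threshold.
- The method according to claim 7, characterized in that determining, according to the power spectrum of the frame signal, the variance of each frame signal in the speech signal segment with respect to the power values at the respective frequencies comprises: classifying, according to the frequency interval in which the frequency corresponding to each power value of each frame signal lies, at least the power values of the frame signal at the respective frequencies into a first power value set corresponding to a first frequency interval and a second power value set corresponding to a second frequency interval, wherein the first frequency interval is lower than the second frequency interval; determining a first variance of the power values contained in the first power value set; and determining a second variance of the power values contained in the second power value set; and correspondingly, determining, according to the variance, whether each frame signal in the speech signal segment is a noise signal comprises: determining whether the difference between the first variance and the second variance corresponding to each frame signal is greater than a second threshold; and if not, determining the frame signal to be a noise signal.
- The method according to claim 7, characterized in that, after determining, according to the power spectrum of the frame signal, the variance of each frame signal in the speech signal segment with respect to the power values at the respective frequencies, and before determining, according to the variance, whether each frame signal in the speech signal segment is a noise signal, the method further comprises: sorting the frame signals in the speech signal segment to be analyzed according to the magnitude of the variance; and correspondingly, determining, according to the variance, whether each frame signal in the speech signal segment is a noise signal comprises: determining, based on the sorted variances of the frame signals with respect to the power values at the respective frequencies, whether each frame signal in the speech signal segment is a noise signal.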
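- A noise signal determining apparatus, characterized by comprising: a power spectrum acquisition unit, configured to perform a Fourier transform on each frame signal in a speech signal segment to be analyzed, to obtain a power spectrum of each frame signal in the speech signal segment; a variance determining unit, configured to determine, according to the power spectrum of the frame signal, a variance of each frame signal in the speech signal segment with respect to the power values at the respective frequencies; and a noise determining unit, configured to determine, according to the variance, whether each frame signal in the speech signal segment is a noise signal.
- The apparatus according to claim 13, characterized in that the apparatus further comprises a segment obtaining unit, configured to: determine, according to the amplitude change of the time-domain signal of to-be-processed speech, that a speech signal segment included in the to-be-processed speech whose amplitude change is less than a preset threshold is the speech signal segment to be analyzed; or intercept the first N frames of speech signal in the to-be-processed speech as the speech signal segment to be analyzed.
- The apparatus according to claim 13, characterized in that the noise determining unit is configured to: determine whether the variance corresponding to each frame signal in the speech signal segment is greater than a first threshold; and if not, determine the frame signal to be a noise signal.
- The apparatus according to claim 13, characterized in that the variance determining unit is configured to: classify, according to the frequency interval in which the frequencies corresponding to the power spectrum lie, at least the power values of the frame signal at the respective frequencies into a first power value set corresponding to a first frequency interval; and determine a first variance of the power values contained in the first power value set; and correspondingly, the noise determining unit is configured to: determine whether the first variance is greater than a first threshold; and if not, determine the frame signal to be a noise signal.
- The apparatus according to claim 13, characterized in that the variance determining unit is specifically configured to: classify, according to the frequency interval in which the frequency corresponding to each power value of each frame signal lies, at least the power values of the frame signal at the respective frequencies into a first power value set corresponding to a first frequency interval and a second power value set corresponding to a second frequency interval, wherein the first frequency interval is lower than the second frequency interval; determine a first variance of the power values contained in the first power value set; and determine a second variance of the power values contained in the second power value set; and correspondingly, the noise determining unit is configured to: determine whether the difference between the first variance and the second variance corresponding to each frame signal is greater than a second threshold; and if not, determine the frame signal to be a noise signal.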
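- A speech denoising apparatus, characterized by comprising: a segment determining unit, configured to determine a speech signal segment to be analyzed included in to-be-processed speech; a power spectrum acquisition unit, configured to perform a Fourier transform on each frame signal in the speech signal segment to be analyzed, to obtain a power spectrum of each frame signal in the speech signal segment; a variance determining unit, configured to determine, according to the power spectrum of the frame signal, a variance of each frame signal in the speech signal segment with respect to the power values at the respective frequencies; a noise determining unit, configured to determine, according to the variance, whether each frame signal in the speech signal segment is a noise signal, to obtain a plurality of noise frames included in the speech signal segment; and a speech denoising unit, configured to determine a power average corresponding to the plurality of noise frames included in the speech signal segment, and to perform speech denoising processing on the to-be-processed speech according to the power average of the noise frames.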
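The two-band test recited in the claims above (a first and a second power value set, their variances, and a second threshold) can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the claimed implementation. The band edges, the threshold value, the function names, and the use of a signed difference (first variance minus second variance) are all hypothetical, since the claims fix none of them.

```python
from statistics import pvariance


def band_variance(power, freqs, band):
    """Variance of the power values whose frequency lies in [band[0], band[1])."""
    low_edge, high_edge = band
    values = [p for p, f in zip(power, freqs) if low_edge <= f < high_edge]
    return pvariance(values)


def is_noise_frame(power, freqs, first_band, second_band, second_threshold):
    """Noise if (first variance - second variance) is NOT greater than the threshold."""
    first_var = band_variance(power, freqs, first_band)
    second_var = band_variance(power, freqs, second_band)
    return not (first_var - second_var > second_threshold)


# Hypothetical 8-bin power spectra sampled every 500 Hz.
freqs = [0, 500, 1000, 1500, 2000, 2500, 3000, 3500]
speech_like = [0.0, 100.0, 0.0, 100.0, 1.0, 1.0, 1.0, 1.0]  # strong variation in the low band
flat = [1.0] * 8                                            # same power in every bin

speech_is_noise = is_noise_frame(speech_like, freqs, (0, 2000), (2000, 4000), 10.0)
flat_is_noise = is_noise_frame(flat, freqs, (0, 2000), (2000, 4000), 10.0)
```

With these assumed values, the frame whose low band fluctuates strongly is not classified as noise, while the flat frame is.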
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018519388A JP6784758B2 (ja) | 2015-10-13 | 2016-10-08 | ノイズ信号判定方法及び装置並びに音声ノイズ除去方法及び装置 |
ES16854895T ES2807529T3 (es) | 2015-10-13 | 2016-10-08 | Método para la determinación de señal de ruido y aparato del mismo |
EP16854895.6A EP3364413B1 (en) | 2015-10-13 | 2016-10-08 | Method of determining noise signal and apparatus thereof |
SG11201803004YA SG11201803004YA (en) | 2015-10-13 | 2016-10-08 | Noise signal determining method and apparatus and voice denoising method and apparatus |
KR1020187013177A KR102208855B1 (ko) | 2015-10-13 | 2016-10-08 | 노이즈 신호 결정 방법과 장치, 및 음성 노이즈 제거 방법과 장치 |
PL16854895T PL3364413T3 (pl) | 2015-10-13 | 2016-10-08 | Sposób określania sygnału szumu i przeznaczone do tego urządzenie |
US15/951,928 US10796713B2 (en) | 2015-10-13 | 2018-04-12 | Identification of noise signal for voice denoising device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510670697.8 | 2015-10-13 | ||
CN201510670697.8A CN106571146B (zh) | 2015-10-13 | 2015-10-13 | 噪音信号确定方法、语音去噪方法及装置 |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/951,928 Continuation US10796713B2 (en) | 2015-10-13 | 2018-04-12 | Identification of noise signal for voice denoising device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017063516A1 (zh) | 2017-04-20 |
Family
ID=58508605
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2016/101444 WO2017063516A1 (zh) | 2015-10-13 | 2016-10-08 | 噪音信号确定方法、语音去噪方法及装置 |
Country Status (9)
Country | Link |
---|---|
US (1) | US10796713B2 (zh) |
EP (1) | EP3364413B1 (zh) |
JP (1) | JP6784758B2 (zh) |
KR (1) | KR102208855B1 (zh) |
CN (1) | CN106571146B (zh) |
ES (1) | ES2807529T3 (zh) |
PL (1) | PL3364413T3 (zh) |
SG (2) | SG11201803004YA (zh) |
WO (1) | WO2017063516A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108986839A (zh) * | 2017-06-01 | 2018-12-11 | 瑟恩森知识产权控股有限公司 | 减少音频信号中的噪声 |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102096533B1 (ko) * | 2018-09-03 | 2020-04-02 | 국방과학연구소 | 음성 구간을 검출하는 방법 및 장치 |
CN110689901B (zh) * | 2019-09-09 | 2022-06-28 | 苏州臻迪智能科技有限公司 | 语音降噪的方法、装置、电子设备及可读存储介质 |
JP7331588B2 (ja) * | 2019-09-26 | 2023-08-23 | ヤマハ株式会社 | 情報処理方法、推定モデル構築方法、情報処理装置、推定モデル構築装置およびプログラム |
KR20220018271A (ko) | 2020-08-06 | 2022-02-15 | 라인플러스 주식회사 | 딥러닝을 이용한 시간 및 주파수 분석 기반의 노이즈 제거 방법 및 장치 |
EP4273860A1 (en) * | 2020-12-31 | 2023-11-08 | Shenzhen Shokz Co., Ltd. | Audio generation method and system |
CN112967738A (zh) * | 2021-02-01 | 2021-06-15 | 腾讯音乐娱乐科技(深圳)有限公司 | 人声检测方法、装置及电子设备和计算机可读存储介质 |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH03180900A (ja) * | 1989-12-11 | 1991-08-06 | Sanyo Electric Co Ltd | 音声認識装置の雑音除去システム |
EP2031583A1 (en) * | 2007-08-31 | 2009-03-04 | Harman Becker Automotive Systems GmbH | Fast estimation of spectral noise power density for speech signal enhancement |
JP2009216733A (ja) * | 2008-03-06 | 2009-09-24 | Nippon Telegr & Teleph Corp <Ntt> | フィルタ推定装置、信号強調装置、フィルタ推定方法、信号強調方法、プログラム、記録媒体 |
CN101853661A (zh) * | 2010-05-14 | 2010-10-06 | 中国科学院声学研究所 | 基于非监督学习的噪声谱估计与语音活动度检测方法 |
CN101968957A (zh) * | 2010-10-28 | 2011-02-09 | 哈尔滨工程大学 | 一种噪声条件下的语音检测方法 |
CN102314883A (zh) * | 2010-06-30 | 2012-01-11 | 比亚迪股份有限公司 | 一种判断音乐噪声的方法以及语音消噪方法 |
CN102800322A (zh) * | 2011-05-27 | 2012-11-28 | 中国科学院声学研究所 | 一种噪声功率谱估计与语音活动性检测方法 |
CN103489446A (zh) * | 2013-10-10 | 2014-01-01 | 福州大学 | 复杂环境下基于自适应能量检测的鸟鸣识别方法 |
CN103632677A (zh) * | 2013-11-27 | 2014-03-12 | 腾讯科技(成都)有限公司 | 带噪语音信号处理方法、装置及服务器 |
CN103903629A (zh) * | 2012-12-28 | 2014-07-02 | 联芯科技有限公司 | 基于隐马尔科夫链模型的噪声估计方法和装置 |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0836400A (ja) * | 1994-07-25 | 1996-02-06 | Kokusai Electric Co Ltd | 音声状態判定回路 |
US6529868B1 (en) * | 2000-03-28 | 2003-03-04 | Tellabs Operations, Inc. | Communication system noise cancellation power signal calculation techniques |
US7299173B2 (en) * | 2002-01-30 | 2007-11-20 | Motorola Inc. | Method and apparatus for speech detection using time-frequency variance |
CN101197130B (zh) | 2006-12-07 | 2011-05-18 | 华为技术有限公司 | 声音活动检测方法和声音活动检测器 |
US9047874B2 (en) | 2007-03-06 | 2015-06-02 | Nec Corporation | Noise suppression method, device, and program |
JP4327886B1 (ja) | 2008-05-30 | 2009-09-09 | 株式会社東芝 | 音質補正装置、音質補正方法及び音質補正用プログラム |
US8989403B2 (en) | 2010-03-09 | 2015-03-24 | Mitsubishi Electric Corporation | Noise suppression device |
JP4937393B2 (ja) | 2010-09-17 | 2012-05-23 | 株式会社東芝 | 音質補正装置及び音声補正方法 |
2015
- 2015-10-13 CN CN201510670697.8A patent/CN106571146B/zh active Active
2016
- 2016-10-08 ES ES16854895T patent/ES2807529T3/es active Active
- 2016-10-08 EP EP16854895.6A patent/EP3364413B1/en active Active
- 2016-10-08 KR KR1020187013177A patent/KR102208855B1/ko active IP Right Grant
- 2016-10-08 SG SG11201803004YA patent/SG11201803004YA/en unknown
- 2016-10-08 WO PCT/CN2016/101444 patent/WO2017063516A1/zh active Application Filing
- 2016-10-08 SG SG10202005490WA patent/SG10202005490WA/en unknown
- 2016-10-08 PL PL16854895T patent/PL3364413T3/pl unknown
- 2016-10-08 JP JP2018519388A patent/JP6784758B2/ja active Active
2018
- 2018-04-12 US US15/951,928 patent/US10796713B2/en active Active
Non-Patent Citations (1)
Title |
---|
See also references of EP3364413A4 * |
Also Published As
Publication number | Publication date |
---|---|
SG10202005490WA (en) | 2020-07-29 |
US20180293997A1 (en) | 2018-10-11 |
EP3364413B1 (en) | 2020-06-10 |
PL3364413T3 (pl) | 2020-10-19 |
US10796713B2 (en) | 2020-10-06 |
KR102208855B1 (ko) | 2021-01-29 |
JP2018534618A (ja) | 2018-11-22 |
EP3364413A4 (en) | 2019-06-26 |
KR20180067608A (ko) | 2018-06-20 |
ES2807529T3 (es) | 2021-02-23 |
JP6784758B2 (ja) | 2020-11-11 |
SG11201803004YA (en) | 2018-05-30 |
CN106571146B (zh) | 2019-10-15 |
EP3364413A1 (en) | 2018-08-22 |
CN106571146A (zh) | 2017-04-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017063516A1 (zh) | 噪音信号确定方法、语音去噪方法及装置 | |
WO2016095626A1 (zh) | 监控进程的方法和装置 | |
CN106850511B (zh) | 识别访问攻击的方法及装置 | |
WO2016015461A1 (zh) | 异常帧检测方法和装置 | |
US9997168B2 (en) | Method and apparatus for signal extraction of audio signal | |
AU2014386442B2 (en) | Method for detecting audio signal and apparatus | |
US20190311297A1 (en) | Anomaly detection and processing for seasonal data | |
CN108847253B (zh) | 车辆型号识别方法、装置、计算机设备及存储介质 | |
WO2021000498A1 (zh) | 复合语音识别方法、装置、设备及计算机可读存储介质 | |
WO2017045429A1 (zh) | 一种音频数据的检测方法、系统及存储介质 | |
JP2018534618A5 (zh) | ||
CN106034240A (zh) | 视频检测方法及装置 | |
EP3292819B1 (en) | Noisy signal identification from non-stationary audio signals | |
US20180091390A1 (en) | Data validation across monitoring systems | |
WO2015074493A1 (zh) | 一种低频点击的过滤方法、装置、计算机程序以及计算机可读介质 | |
CN117076941A (zh) | 一种光缆鸟害监测方法、系统、电子设备及可读存储介质 | |
JP2016191788A (ja) | 音響処理装置、音響処理方法、及び、プログラム | |
CN113421590B (zh) | 异常行为检测方法、装置、设备及存储介质 | |
US10109298B2 (en) | Information processing apparatus, computer readable storage medium, and information processing method | |
CN107229621B (zh) | 差异数据的清洗方法及装置 | |
CN110543965B (zh) | 基线预测方法、基线预测装置、电子设备和介质 | |
CN112863548A (zh) | 训练音频检测模型的方法、音频检测方法及其装置 | |
US9069849B1 (en) | Methods for enforcing time alignment for speed resistant audio matching | |
Gao et al. | A Method Using EEMD and L-Kurtosis to detect faults in roller bearings | |
TW202030641A (zh) | 服裝的計件方法、裝置及設備 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 16854895; Country of ref document: EP; Kind code of ref document: A1 |
 | WWE | WIPO information: entry into national phase | Ref document number: 11201803004Y; Country of ref document: SG |
 | WWE | WIPO information: entry into national phase | Ref document number: 2018519388; Country of ref document: JP |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | ENP | Entry into the national phase | Ref document number: 20187013177; Country of ref document: KR; Kind code of ref document: A |
 | WWE | WIPO information: entry into national phase | Ref document number: 2016854895; Country of ref document: EP |