CN108766454A - Voice noise suppression method and device - Google Patents
- Publication number
- CN108766454A CN108766454A CN201810692665.1A CN201810692665A CN108766454A CN 108766454 A CN108766454 A CN 108766454A CN 201810692665 A CN201810692665 A CN 201810692665A CN 108766454 A CN108766454 A CN 108766454A
- Authority
- CN
- China
- Prior art keywords
- noise
- voice
- frequency domain
- signal
- speech signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 68
- 230000000694 effects Effects 0.000 claims abstract description 37
- 230000000873 masking effect Effects 0.000 claims description 31
- 238000001514 detection method Methods 0.000 claims description 30
- 238000012545 processing Methods 0.000 claims description 21
- 230000000452 restraining effect Effects 0.000 claims description 16
- 238000003860 storage Methods 0.000 claims description 11
- 238000004458 analytical method Methods 0.000 claims description 8
- 238000004364 calculation method Methods 0.000 claims description 3
- 238000004590 computer program Methods 0.000 claims description 3
- 230000005764 inhibitory process Effects 0.000 claims description 2
- 230000008569 process Effects 0.000 abstract description 11
- 238000005516 engineering process Methods 0.000 abstract description 4
- 230000015654 memory Effects 0.000 description 18
- 238000004422 calculation algorithm Methods 0.000 description 14
- 230000006870 function Effects 0.000 description 13
- 238000010586 diagram Methods 0.000 description 9
- 238000001228 spectrum Methods 0.000 description 9
- 230000003595 spectral effect Effects 0.000 description 8
- 230000015572 biosynthetic process Effects 0.000 description 5
- 230000002708 enhancing effect Effects 0.000 description 5
- 230000002093 peripheral effect Effects 0.000 description 5
- 230000001629 suppression Effects 0.000 description 5
- 238000003786 synthesis reaction Methods 0.000 description 5
- 230000009467 reduction Effects 0.000 description 4
- 230000005236 sound signal Effects 0.000 description 4
- 238000001914 filtration Methods 0.000 description 3
- 230000005534 acoustic noise Effects 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 230000001965 increasing effect Effects 0.000 description 2
- 238000004088 simulation Methods 0.000 description 2
- 230000009471 action Effects 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 238000007405 data analysis Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000012880 independent component analysis Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 238000012067 mathematical method Methods 0.000 description 1
- 238000003062 neural network model Methods 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 238000011946 reduction process Methods 0.000 description 1
- 230000008439 repair process Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 238000005316 response function Methods 0.000 description 1
- 230000000638 stimulation Effects 0.000 description 1
- 238000012706 support-vector machine Methods 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
- 238000005303 weighing Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02082—Noise filtering the noise being echo, reverberation of the speech
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
The present invention provides a voice noise suppression method and device, relating to the field of speech processing technology. The voice noise suppression method first determines an acoustic scene corresponding to a frequency-domain speech signal according to the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal, adjusts the parameters of a noise processing model according to the acoustic scene, and then performs speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model. By adjusting the noise processing model for a variety of acoustic scenes before enhancement, the method makes noise suppression scene-specific, improving both noise processing speed and speech enhancement quality.
Description
Technical field
The present invention relates to the field of speech processing technology, and in particular to a voice noise suppression method and device.
Background technology
With the popularization of electronic devices, more and more operations and inputs rely on voice functions, and many electronic devices place increasingly high demands on the precision of voice input. Since voice input is typically performed in noisy scenes, in practical applications the target speech is often disturbed by factors such as the ambient noise, so that its clarity, intelligibility, and comfort are substantially reduced, seriously affecting both human auditory perception and the device's analysis of the speech. The recorded noisy speech signal therefore generally needs to undergo noise reduction and speech enhancement before being output or otherwise processed.
However, existing voice denoising methods process speech recorded under various noise environments with one identical pipeline. They lack scene specificity and suffer from low denoising efficiency and poor noise suppression.
Summary of the invention
In view of this, embodiments of the present invention aim to provide a voice noise suppression method and device, to solve the problems of low denoising efficiency and poor noise suppression in the existing speech enhancement methods described above.
In a first aspect, an embodiment of the present invention provides a voice noise suppression method, including: determining an acoustic scene corresponding to a frequency-domain speech signal according to the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal; adjusting the parameters of a noise processing model according to the acoustic scene; and performing speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model.
With reference to the first aspect, before determining the acoustic scene corresponding to the frequency-domain speech signal according to its noise estimation result and signal-to-noise ratio, the voice noise suppression method further includes: converting the collected original time-domain speech signal into the frequency-domain speech signal through a simulated human-ear filter.
With reference to the first aspect, after converting the collected original time-domain speech signal into the frequency-domain speech signal through the simulated human-ear filter, and before determining the acoustic scene corresponding to the frequency-domain speech signal according to its noise estimation result and signal-to-noise ratio, the voice noise suppression method further includes: obtaining a voice activity detection result of the frequency-domain speech signal; and calculating the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal according to the voice activity detection result.
With reference to the first aspect, obtaining the voice activity detection result of the frequency-domain speech signal includes: performing voice activity detection on the frequency-domain speech signal so as to divide it into voiced segments and unvoiced segments, and taking the voiced segments and the unvoiced segments as the voice activity detection result, where a voiced segment is a frequency range containing both the speech signal and the noise signal, and an unvoiced segment is a frequency range containing only the noise signal.
With reference to the first aspect, calculating the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal according to the voice activity detection result includes: comparing the energy features of the voiced segments and the unvoiced segments to obtain the noise estimation result and the signal-to-noise ratio.
With reference to the first aspect, the noise processing model includes a noise suppression submodel and a human-ear acoustic masking submodel, and adjusting the parameters of the noise processing model according to the acoustic scene and performing speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model includes: determining an estimated masking threshold of the human-ear acoustic masking submodel according to the acoustic scene, and filtering out an auditory-perception frequency-domain speech signal from the frequency-domain speech signal using the human-ear acoustic masking submodel; and performing spectral-subtraction-based noise suppression on the auditory-perception frequency-domain speech signal using the noise suppression submodel to obtain a speech enhancement output signal.
With reference to the first aspect, after performing speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model, the voice noise suppression method further includes: converting the speech enhancement output signal into a time-domain speech signal; and amplifying the time-domain speech signal with a power amplifier and outputting it through a loudspeaker.
In a second aspect, an embodiment of the present invention provides a voice noise suppression device, including an acoustic scene determining module, a parameter adjustment module, and a noise processing module. The acoustic scene determining module is configured to determine the acoustic scene corresponding to a frequency-domain speech signal according to its noise estimation result and signal-to-noise ratio. The parameter adjustment module is configured to adjust the parameters of a noise processing model according to the acoustic scene. The noise processing module is configured to perform speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model.
With reference to the second aspect, the voice noise suppression device further includes a voice activity detection module and a noise analysis module. The voice activity detection module is configured to obtain the voice activity detection result of the frequency-domain speech signal. The noise analysis module is configured to calculate the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal according to the voice activity detection result.
In a third aspect, an embodiment of the present invention further provides a storage medium applied in a computer. The storage medium stores a plurality of instructions configured to cause the computer to execute the above method.
The advantageous effects provided by the invention are as follows:
The present invention provides a voice noise suppression method and device. The method operates on the frequency-domain speech signal, which is more amenable to analysis and processing by the device and improves the speed and precision of speech signal processing. Meanwhile, the method adjusts the noise processing model for different acoustic scenes, so that it adapts to the acoustic scene more accurately and achieves more targeted and effective noise suppression. Furthermore, judging the acoustic scene from the noise estimation result and signal-to-noise ratio increases both the accuracy and the speed of the scene decision, improving the effectiveness and efficiency of voice noise suppression.
Other features and advantages of the present invention will be set forth in the following description, and in part will become apparent from the description or be understood by implementing the embodiments of the present invention. The objectives and other advantages of the invention may be realized and attained by the structures particularly pointed out in the written description, claims, and drawings.
Description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings illustrate only certain embodiments of the present invention and are therefore not to be construed as limiting its scope; those of ordinary skill in the art can derive other relevant drawings from these drawings without creative effort.
Fig. 1 is a flowchart of a voice noise suppression method provided by the first embodiment of the present invention;
Fig. 2 is a flowchart of voice input and processing steps provided by the first embodiment of the present invention;
Fig. 3 is a schematic flow diagram of a noise suppression mode provided by the first embodiment of the present invention;
Fig. 4 is a module diagram of a voice noise suppression device provided by the second embodiment of the present invention;
Fig. 5 is a structural block diagram of an electronic device applicable to the embodiments of the present application, provided by the third embodiment of the present invention.
Reference numerals: 100 - voice noise suppression device; 101 - voice activity detection module; 102 - noise analysis module; 110 - acoustic scene determining module; 120 - parameter adjustment module; 130 - noise processing module; 140 - speech signal output module; 200 - electronic device; 201 - memory; 202 - storage controller; 203 - processor; 204 - peripheral interface; 205 - input/output unit; 206 - audio unit; 207 - display unit.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention will be described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments, as generally described and illustrated in the drawings herein, can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative work shall fall within the protection scope of the present invention.
It should be noted that similar labels and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined and explained in subsequent drawings. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are only used to distinguish the description and are not to be understood as indicating or implying relative importance.
Some terms involved in the embodiments of the present invention are first explained below:
The masking effect refers to the phenomenon that, due to the presence of multiple stimuli of the same category (such as sounds or images), the subject cannot fully receive the information of all the stimuli. Visual masking effects include brightness masking and pattern masking, with influencing factors mainly in the spatial, temporal, and color domains; auditory masking effects mainly include noise masking, human-ear masking, frequency-domain masking, time-domain masking, and temporal masking. When a masking effect occurs, sounds of different natures, such as pure tones, complex tones, or noise, are generally used as the masking sound. Research has also found that masking can occur even when the masking sound and the masked sound do not arrive simultaneously; this occlusion is known as non-simultaneous masking. Masking exerted by a masking sound acting before the masked sound is called forward masking, and masking exerted after the masked sound is called backward masking. The auditory masking effect is usually represented by the new hearing threshold curve in the presence of the masking sound, so the masked sound referred to here generally means a pure tone. The hearing threshold in the presence of the masking sound is called the masking threshold, and the estimated masking threshold corresponds to this masking threshold.
First embodiment
The applicant has found through research that, in real-life voice input, voice information is disturbed by noise from various scenes, whereas traditional noise reduction schemes typically operate on a single general noise reduction model and do not adjust the algorithm parameters of that model according to the actual acoustic scene of the voice recording. Their noise reduction is therefore poor and may leave much residual noise.
To solve the above problems, the first embodiment of the present invention provides a voice noise suppression method.
Please refer to Fig. 1, a flowchart of the voice noise suppression method provided by the first embodiment of the present invention. The voice noise suppression method is applied to an electronic device that performs any kind of speech signal processing, and its steps can be as follows:
Step S10: Determine an acoustic scene corresponding to a frequency-domain speech signal according to the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal.
Step S20: Adjust the parameters of a noise processing model according to the acoustic scene.
Step S30: Perform speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model.
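The three steps can be sketched as a minimal pipeline. This is an illustrative sketch only: the function names, SNR thresholds, scene labels, and parameter presets are invented for the example, not taken from the patent.

```python
import numpy as np

def determine_scene(snr_db):
    # Illustrative scene decision (step S10); a real decision would also use
    # the noise estimation result. The thresholds here are assumptions.
    if snr_db > 18.0:
        return "office"
    if snr_db > 8.0:
        return "street"
    return "wind"

def adjust_model(scene):
    # Map each scene to noise-model parameters (step S20); values are placeholders.
    presets = {"office": {"alpha": 3.0},
               "street": {"alpha": 4.5},
               "wind":   {"alpha": 6.0}}
    return presets[scene]

def enhance(spec, noise_psd, params):
    # Spectral-subtraction style enhancement with the adjusted parameters (step S30).
    power = np.maximum(np.abs(spec) ** 2 - params["alpha"] * noise_psd, 0.0)
    return np.sqrt(power) * np.exp(1j * np.angle(spec))

# Steps S10-S30 on one dummy frame
rng = np.random.default_rng(0)
spec = np.fft.rfft(rng.standard_normal(256))
noise_psd = np.full(spec.shape, 0.5)
scene = determine_scene(snr_db=12.0)              # -> "street"
out = enhance(spec, noise_psd, adjust_model(scene))
```

The point of the structure is that only `adjust_model` changes between scenes; the enhancement routine itself stays the same.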
For step S10, the acoustic scenes are various scenes differing in noise type, and may include an office noise scene, a street noise scene, a wind noise scene, and so on; the value ranges of the energy, amplitude, and noise ratio of the noise acquired in the various acoustic scenes differ. This embodiment performs acoustic scene recognition on the frequency-domain speech signal because the energy and amplitude of the speech signal allow the noise class, and hence the corresponding acoustic scene, to be determined more quickly and accurately, and because the extraction of features such as short-time energy and short-time average magnitude from the frequency-domain speech signal is mature and fast. The noise estimation is usually performed by short-time analysis of the noisy speech signal, estimating the power spectrum of the noise with mathematical tools such as random processes and probability statistics, so as to learn how the noise power is distributed over frequency. The signal-to-noise ratio is the ratio of signal to noise in the frequency-domain speech signal; different acoustic scenes typically have different signal-to-noise ratios, and scenes whose noise sources lie at different distances from the voice input position also often differ in signal-to-noise ratio, so selecting the acoustic scene with the signal-to-noise ratio taken into account improves the accuracy of the scene decision.
For step S20, namely adjusting the parameters of the noise processing model according to the acoustic scene: the noise processing model mainly applies noise reduction to the background environmental sound, i.e., the additive acoustic noise, picked up by the microphone while recording speech. For such additive acoustic noise, the noise processing model may include a human-ear acoustic masking submodel. Optionally, the step of adjusting the parameters of the noise processing model according to the acoustic scene may include: determining the estimated masking threshold of the human-ear acoustic masking submodel according to the acoustic scene, and filtering out an auditory-perception frequency-domain speech signal from the frequency-domain speech signal using the human-ear acoustic masking submodel. The auditory-perception frequency-domain speech signal is the intersection, in the auditory-perception frequency domain, of the estimated masking threshold and the masking threshold corresponding to the frequency-domain speech signal.
For step S30, namely performing speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model: the noise processing model may include a noise suppression submodel, and the denoising of the frequency-domain speech signal is mainly carried out by this noise suppression submodel. The basic algorithm of the noise suppression submodel can be a speech enhancement algorithm based on spectral subtraction, wavelet analysis, Kalman filtering, signal subspace, the auditory masking effect, independent component analysis, or a neural network; optionally, this embodiment uses the speech enhancement algorithm based on spectral subtraction.
Through the above steps S10-S30, this embodiment first makes an acoustic scene decision when enhancing the speech signal, and then adjusts the parameters of the noise processing model according to the different acoustic scenes. By targeting the noise characteristics of the actual voice input scene, it adapts more precisely to the different noises of different acoustic scenes, achieves more targeted noise suppression, and improves the efficiency and effect of noise suppression.
As an alternative embodiment, before the step S10 of determining the acoustic scene corresponding to the frequency-domain speech signal according to its noise estimation result and signal-to-noise ratio, this embodiment also needs to obtain the frequency-domain speech signal and perform noise estimation and signal-to-noise ratio calculation on it. Please refer to Fig. 2, a flowchart of the voice input and processing steps provided by the first embodiment of the present invention.
Optionally, the voice input step S1 is: converting the collected original time-domain speech signal into a frequency-domain speech signal through a simulated human-ear filter, thereby completing the fast Fourier transform (FFT). The simulated human-ear filter (the first human-ear analog filter, the second human-ear analog filter) is a band-pass filter bank that simulates the filtering and band division performed by the human ear. When a 128-channel gammatone band-pass filter bank is used, the impulse response of the i-th filter is as follows:
g_i(t) = t^3 exp(−2π b_i t) cos(2π f_i t + φ_i), if t ≥ 0
g_i(t) = 0, otherwise
where b_i represents the attenuation rate of the impulse response, which is related to the bandwidth of the filter, f_i represents the center frequency of the filter, and φ_i represents the phase (taken as 0). b_i is calculated as follows:
ERB(f_i) = 24.7(4.37 f_i/1000 + 1)
b_i = 1.019 ERB(f_i)
where ERB is the equivalent rectangular bandwidth, a scale that models psychoacoustic response, and the center frequencies f_i are distributed uniformly on the ERB scale from 80 Hz to 5 kHz. After this conversion, subsequent stages can process the frequency domain more finely, improving the precision of speech signal processing.
For example, the noisy speech signal can be split into 128 frequency-band units after filtering by the first human-ear filter, then windowed and processed frame by frame, yielding 128 speech T-F units (also called speech time-frequency units) in each frame.
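The ERB formula and impulse response above can be sketched as follows. The ERB-rate scale used to place the 128 center frequencies uniformly between 80 Hz and 5 kHz is the standard Glasberg-Moore form (21.4 log10(4.37f/1000 + 1)), an assumption since the patent text does not spell it out; the sampling rate and impulse-response length are also assumed.

```python
import numpy as np

def erb(f):
    # Equivalent rectangular bandwidth at centre frequency f (Hz): 24.7(4.37f/1000 + 1)
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def erb_scale(f):
    # ERB-rate value (Glasberg-Moore) used to space centre frequencies uniformly
    return 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)

def center_freqs(n=128, fmin=80.0, fmax=5000.0):
    # n centre frequencies uniformly spaced on the ERB scale from fmin to fmax
    pts = np.linspace(erb_scale(fmin), erb_scale(fmax), n)
    return (10.0 ** (pts / 21.4) - 1.0) * 1000.0 / 4.37

def gammatone_ir(fc, fs=16000, dur=0.025, phi=0.0):
    # g_i(t) = t^3 exp(-2*pi*b_i*t) cos(2*pi*f_i*t + phi) for t >= 0
    t = np.arange(int(dur * fs)) / fs
    b = 1.019 * erb(fc)
    return t ** 3 * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc * t + phi)

fcs = center_freqs()
ir = gammatone_ir(fcs[0])
```

Filtering the input with each of the 128 impulse responses yields the per-band signals that are then windowed into the T-F units described above.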
Optionally, the following steps are executed after step S1:
Step S2: Obtain the voice activity detection result of the frequency-domain speech signal.
Step S3: Calculate the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal according to the voice activity detection result.
For step S2, voice activity detection refers to detecting the presence or absence of speech in a noisy environment. Optionally, the voice activity detection can divide the frequency-domain speech signal into voiced segments and unvoiced segments, where a voiced segment is a frequency range containing both the speech signal and the noise signal, and an unvoiced segment is a frequency range containing only the noise signal.
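A toy energy-threshold detector illustrates the voiced/unvoiced split; the threshold and the synthetic signals are assumptions, and practical detectors use richer features than frame energy alone.

```python
import numpy as np

def vad(frames, threshold_db=0.0):
    # Label frames whose log-energy exceeds the threshold as voiced
    # (speech + noise); the rest are unvoiced (noise only).
    energy_db = 10.0 * np.log10(np.sum(frames ** 2, axis=1) + 1e-12)
    return energy_db > threshold_db

rng = np.random.default_rng(0)
noise_frames = 0.01 * rng.standard_normal((4, 160))           # noise only
tone = np.sin(2.0 * np.pi * 300.0 / 8000.0 * np.arange(160))  # a "speech" frame
speech_frame = tone[None, :] + 0.01 * rng.standard_normal((1, 160))
labels = vad(np.vstack([noise_frames, speech_frame]))
# labels -> [False, False, False, False, True]
```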
For step S3, the step of calculating the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal according to the voice activity detection result may include: comparing the energy features of the voiced segments and the unvoiced segments to obtain the noise estimation result and the signal-to-noise ratio. Taking the time-recursive averaging noise estimation algorithm as an example, it determines the probability that speech is present at frequency point k from the division into voiced and unvoiced segments. With this probability introduced, the noise power spectral density can be obtained by weighting the noise power spectral density under the speech-absent condition and the noise power spectral density under the speech-present condition by the conditional probabilities of speech absence and speech presence at frequency point k of the noisy speech, respectively, and then summing; this noise power spectral density is the noise estimation result.
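A minimal sketch of such a time-recursive update, in the common form where the per-bin smoothing factor grows with the speech-presence probability (the smoothing constant 0.85 is an assumption, not a value from the patent):

```python
import numpy as np

def update_noise_psd(noise_psd, frame_psd, p_speech, alpha=0.85):
    # Per bin: when speech is probably present the old noise estimate is kept;
    # when probably absent the estimate tracks the current frame energy.
    a = alpha + (1.0 - alpha) * p_speech
    return a * noise_psd + (1.0 - a) * frame_psd

noise = np.full(4, 1.0)
frame = np.array([2.0, 2.0, 9.0, 9.0])   # bins 2-3 carry speech energy
p = np.array([0.0, 0.0, 1.0, 1.0])       # speech-presence probability per bin
noise = update_noise_psd(noise, frame, p)
# noise-only bins: 0.85*1 + 0.15*2 = 1.15; speech bins stay near 1.0
```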
For step S10, i.e., determining the acoustic scene corresponding to the frequency-domain speech signal according to the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal, optional scene decision procedures include: matching the noise power spectral density in the noise estimation result against the power spectral densities of various acoustic scenes, matching the signal-to-noise ratio against the signal-to-noise ratios of those scenes, and selecting the acoustic scene with the highest mean match rate as the scene corresponding to the frequency-domain speech signal; or selecting the frequency points whose signal-to-noise ratio in the noise estimation result falls below a preset threshold for noise power spectral density matching, and taking the acoustic scene with the highest such match among the various acoustic scenes as the scene corresponding to the frequency-domain speech signal; or obtaining from the noise estimation result, by a formula, an estimated noise energy value s_in(i) of the noisy speech signal f(i) in dB SPL, generating a normalization function of the estimated noise energy value and the signal-to-noise ratio based on computer-simulated big-data analysis results for different acoustic scenes (such as quiet, office, in-car, meeting room and concert hall), and judging the acoustic scene corresponding to the frequency-domain speech signal from the value of this normalization function. It should be understood that the acoustic scene decision can also be made by a neural network model, a support vector machine or other decision procedures.
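A minimal sketch of the first decision procedure (highest mean match rate). The per-scene reference statistics in `SCENE_PROFILES`, the cosine/SNR scoring and all numeric values are hypothetical illustrations, not values from the patent:

```python
import numpy as np

# Hypothetical per-scene reference statistics: (noise PSD template, typical SNR in dB).
SCENE_PROFILES = {
    "quiet":   (np.array([0.1, 0.1, 0.1, 0.1]), 30.0),
    "office":  (np.array([0.8, 0.5, 0.3, 0.2]), 20.0),
    "in_car":  (np.array([2.0, 1.2, 0.6, 0.3]), 10.0),
    "concert": (np.array([1.5, 1.5, 1.4, 1.3]),  5.0),
}

def classify_scene(noise_psd, snr_db):
    """Pick the scene whose reference PSD shape and SNR best match the estimates."""
    best, best_score = None, -np.inf
    for name, (ref_psd, ref_snr) in SCENE_PROFILES.items():
        # cosine similarity between the PSD shapes
        psd_match = np.dot(noise_psd, ref_psd) / (
            np.linalg.norm(noise_psd) * np.linalg.norm(ref_psd))
        # SNR closeness mapped into (0, 1]
        snr_match = 1.0 / (1.0 + abs(snr_db - ref_snr))
        score = 0.5 * (psd_match + snr_match)  # mean match rate
        if score > best_score:
            best, best_score = name, score
    return best
```

A neural network or SVM classifier, as the text notes, would simply replace this hand-built scoring with a learned decision function over the same (noise estimate, SNR) features.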
For step S30, after the step of "determining the estimation masking threshold of the human-ear acoustic masking sub-model according to the acoustic scene, and filtering out the auditory-perceptible frequency-domain speech signal from the frequency-domain speech signal using the human-ear acoustic masking sub-model" described in step S20 is completed, the step of "performing speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model" may include: performing spectral-subtraction-based noise suppression processing on the auditory-perceptible frequency-domain speech signal using the noise suppression sub-model, to obtain a speech-enhanced output signal.
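A minimal sketch of masking-threshold filtering, under the assumption that bins whose energy lies below the estimated masking threshold are simply zeroed (the function name and the hard thresholding rule are illustrative, not the patent's exact procedure):

```python
import numpy as np

def perceptual_filter(spectrum, masking_threshold):
    """Keep only the auditory-perceptible part of a spectrum.

    Bins whose energy falls below the estimated masking threshold are
    inaudible to the human ear and are zeroed, so the later noise
    suppression stage only has to process perceptible components.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    audible = np.abs(spectrum) ** 2 >= masking_threshold
    return np.where(audible, spectrum, 0.0)
```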
Referring to FIG. 3, FIG. 3 is a flow diagram of a noise suppression mode provided by the first embodiment of the present invention. The noise suppression processing of the spectral subtraction can be performed as follows:

|Ŷ_i(k)|² = |X̄_i(k)|² − α_i · δ_i · |D̂_i(k)|²,  n ≤ k ≤ m  (1)

where k denotes the k-th frequency point, n and m denote the lower and upper limits of the i-th frequency band respectively, |Ŷ_i(k)|² denotes the enhanced speech signal energy, |X̄_i(k)|² denotes the smoothed speech energy to be processed, |D̂_i(k)|² denotes the estimated noise energy, α_i denotes the over-subtraction coefficient of the i-th subband, and δ_i denotes the additional subband subtraction factor of the i-th subband.
As can be seen from the figure above, the multi-subband spectral subtraction noise suppression method first separates the amplitude information and the phase of the input speech signal X(k): the amplitude information is used for the subsequent processing, while the phase information is used to recombine with the enhanced amplitude information to obtain the enhanced speech signal Y(k). Then the amplitude of the noisy speech is pre-processed according to formula (2); the effect of this pre-processing is to reduce large fluctuations of the noisy speech amplitude, reduce residual noise and improve speech quality. In formula (2), |X̄_j(k)| denotes the pre-processed speech amplitude of the current frame, i.e. the j-th frame, |X_(j−m)(k)| denotes the speech amplitudes of the current input frame and the n frames preceding it, and W denotes the pre-processing spectral gain control coefficient. After the noisy speech spectrum has been pre-processed, it can be divided into subbands according to the noise and speech spectra, and the over-subtraction coefficient of each subband is calculated separately.
The over-subtraction coefficient of the i-th subband is calculated by formula (3), in which the signal-to-noise ratio SNR_i of each subband is obtained by formula (4). The subband subtraction factor δ_i is calculated as in formula (5), which mainly accounts for the different amounts of speech information carried at different frequencies.
The spectral subtraction described above is a relatively early and well-established speech denoising algorithm. It exploits the fact that additive noise is uncorrelated with speech: under the assumption that the noise is statistically stationary, the noise spectrum estimated during speech-free gaps is taken to stand in for the noise spectrum during speech, and is subtracted from the noisy speech spectrum to obtain an estimate of the speech spectrum. Spectral subtraction is algorithmically simple, computationally light and easy to implement for fast processing, and tends to achieve a high output signal-to-noise ratio, which makes the voice noise suppression of this embodiment faster and more accurate.
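The multiband procedure above can be sketched in the style of the well-known Kamath–Loizou multiband spectral subtraction algorithm. Since formulas (3)–(5) are not reproduced in the text, the coefficient choices below (the bounds on `alpha_i`, the `delta_i` values, and the spectral floor `beta`) are assumptions for illustration, not the patent's values:

```python
import numpy as np

def multiband_spectral_subtraction(noisy_mag, noise_mag, band_edges, beta=0.002):
    """One-frame multiband spectral subtraction sketch.

    noisy_mag  : smoothed noisy magnitude spectrum |X̄(k)|
    noise_mag  : estimated noise magnitude spectrum |D̂(k)|
    band_edges : list of (n, m) bin index ranges, one per subband
    beta       : spectral floor, limits musical noise (assumed value)
    """
    out = np.empty_like(noisy_mag)
    for i, (n, m) in enumerate(band_edges):
        x2 = noisy_mag[n:m] ** 2
        d2 = noise_mag[n:m] ** 2
        # subband SNR in dB, cf. formula (4)
        snr_i = 10.0 * np.log10(np.sum(x2) / np.sum(d2))
        # over-subtraction factor alpha_i, larger at low SNR, cf. formula (3)
        alpha_i = np.clip(4.0 - 0.15 * snr_i, 1.0, 5.0)
        # additional subband factor delta_i: subtract less in the lowest
        # band, which carries more speech information, cf. formula (5)
        delta_i = 1.0 if i == 0 else 2.5
        y2 = x2 - alpha_i * delta_i * d2
        # where subtraction over-shoots, keep a small fraction of the noisy spectrum
        out[n:m] = np.sqrt(np.maximum(y2, beta * x2))
    return out
```

The enhanced magnitudes returned here would then be recombined with the original phase, as the text describes, to form Y(k).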
As an implementation, in order to enable the user or related personnel to obtain the speech-enhanced speech signal more conveniently and quickly, the voice noise suppression method further includes, after the step of "performing speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model": converting the speech-enhanced output signal into a time-domain speech signal; and amplifying the time-domain speech signal with a power amplifier before outputting it through a loudspeaker.

It should be understood that this embodiment can perform speech enhancement only on the frequency-domain speech signal of a single frequency band, or can apply the steps of this embodiment separately to the frequency-domain speech signals of multiple frequency bands of a section of speech, and then merge and output the enhancement results.
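The conversion back to the time domain described above, recombining the enhanced magnitude with the original noisy phase and inverse-transforming, might look like the following single-frame sketch (no overlap-add across frames; the function name is illustrative):

```python
import numpy as np

def to_time_domain(enhanced_mag, noisy_phase):
    """Recombine enhanced magnitude with the original noisy phase and
    inverse-transform one frame back to a real time-domain signal."""
    spectrum = enhanced_mag * np.exp(1j * noisy_phase)
    return np.fft.irfft(spectrum)
```

In a streaming implementation, successive frames would be windowed and overlap-added before being handed to the power amplifier and loudspeaker.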
Second embodiment
In order to better implement the voice noise suppression method provided by the first embodiment of the present invention, the second embodiment of the present invention further provides a voice noise suppression device 100.
Referring to FIG. 4, FIG. 4 is a block diagram of a voice noise suppression device provided by the second embodiment of the present invention.

The voice noise suppression device 100 includes an acoustic scene determining module 110, a parameter adjustment module 120 and a noise processing module 130.
Optionally, the voice noise suppression device 100 further includes a voice activity detection module 101, a noise analysis module 102 and a speech signal output module 140.

The voice activity detection module 101 is configured to obtain the voice activity detection result of the frequency-domain speech signal.

The noise analysis module 102 is configured to calculate the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal according to the voice activity detection result.

The acoustic scene determining module 110 is configured to determine the acoustic scene corresponding to the frequency-domain speech signal according to the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal.

The parameter adjustment module 120 is configured to adjust the parameters of the noise processing model according to the acoustic scene.

The noise processing module 130 is configured to perform speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model.

The speech signal output module 140 is configured to convert the speech-enhanced output signal into a time-domain speech signal, amplify the time-domain speech signal with a power amplifier, and output it through a loudspeaker.
It is apparent to those skilled in the art that, for convenience and brevity of description, for the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method, which is not repeated here.
Third embodiment
Please refer to FIG. 5. FIG. 5 is a structural block diagram of an electronic device applicable to the embodiments of the present application, provided by the third embodiment of the present invention.
The electronic device 200 may include the voice noise suppression device 100, a memory 201, a storage controller 202, a processor 203, a peripheral interface 204, an input-output unit 205, an audio unit 206 and a display unit 207.

The memory 201, storage controller 202, processor 203, peripheral interface 204, input-output unit 205, audio unit 206 and display unit 207 are electrically connected to one another, directly or indirectly, to realize the transmission or interaction of data. For example, these elements can be electrically connected to one another through one or more communication buses or signal lines. The voice noise suppression device 100 includes at least one software function module that can be stored in the memory 201 in the form of software or firmware, or solidified in the operating system (OS) of the voice noise suppression device 100. The processor 203 is configured to execute the executable modules stored in the memory 201, such as the software function modules or computer programs included in the voice noise suppression device 100.
The memory 201 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), etc. The memory 201 is configured to store a program, and the processor 203 executes the program after receiving an execution instruction. The method performed by the server defined by the flow disclosed in any of the foregoing embodiments of the present invention can be applied to, or implemented by, the processor 203.
The processor 203 can be an integrated circuit chip with signal processing capability. The processor 203 can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the present invention. The general-purpose processor can be a microprocessor, or the processor 203 can be any conventional processor.
The peripheral interface 204 couples various input/output devices to the processor 203 and the memory 201. In some embodiments, the peripheral interface 204, the processor 203 and the storage controller 202 can be implemented in a single chip; in other examples, they can each be implemented by an independent chip.

The input-output unit 205 is configured to provide input data to the user so as to realize the interaction between the user and the server (or local terminal). The input-output unit 205 may be, but is not limited to, a mouse, a keyboard and the like.

The audio unit 206 provides an audio interface to the user and may include one or more microphones, one or more loudspeakers and an audio circuit.

The display unit 207 provides an interactive interface (for example a user operation interface) between the electronic device 200 and the user, or displays image data for the user's reference. In this embodiment, the display unit 207 can be a liquid crystal display or a touch display. If it is a touch display, it can be a capacitive or resistive touch screen supporting single-point and multi-point touch operations. Supporting single-point and multi-point touch operations means that the touch display can sense touch operations generated simultaneously at one or more positions on the touch display, and hand the sensed touch operations over to the processor 203 for calculation and processing.
It can be understood that the structure shown in FIG. 5 is only schematic; the electronic device 200 may include more or fewer components than shown in FIG. 5, or have a configuration different from that shown in FIG. 5. Each component shown in FIG. 5 can be implemented in hardware, software or a combination thereof.

It is apparent to those skilled in the art that, for convenience and brevity of description, for the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method, which is not repeated here.
In conclusion, the embodiments of the present invention provide a voice noise suppression method and device. The voice noise suppression method operates on frequency-domain speech signals, which are easier for a processing device to analyze and handle, improving the speed and accuracy of speech signal processing. Meanwhile, the voice noise suppression method adjusts the noise processing model for different acoustic scenes, so that the method adapts to acoustic scenes more accurately, realizes more targeted noise suppression, and improves the noise suppression effect. Further, the acoustic scene decision is made from the noise estimation result and the signal-to-noise ratio, which increases both the accuracy and the speed of the decision, thereby improving the effect and efficiency of voice noise suppression.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method can also be realized in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architectures, functions and operations of the devices, methods and computer program products of multiple embodiments of the present invention. In this regard, each box in a flowchart or block diagram can represent a module, a program segment or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two consecutive boxes can in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the function modules in the embodiments of the present invention can be integrated to form an independent part, or each module can exist separately, or two or more modules can be integrated to form an independent part.

If the functions are realized in the form of software function modules and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which can be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the invention; for those skilled in the art, the invention may be variously modified and varied. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included in the protection scope of the present invention. It should be noted that similar labels and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined and explained in subsequent drawings.

The above description is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any person familiar with the art can easily think of changes or replacements within the technical scope disclosed by the present invention, which shall all be covered within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
Claims (10)
1. A voice noise suppression method, characterized in that the voice noise suppression method comprises:
determining an acoustic scene corresponding to a frequency-domain speech signal according to a noise estimation result and a signal-to-noise ratio of the frequency-domain speech signal;
adjusting parameters of a noise processing model according to the acoustic scene;
performing speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model.
2. The voice noise suppression method according to claim 1, characterized in that, before determining the acoustic scene corresponding to the frequency-domain speech signal according to the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal, the voice noise suppression method further comprises:
converting a collected original time-domain speech signal into the frequency-domain speech signal through a simulated human-ear filter.
3. The voice noise suppression method according to claim 2, characterized in that, after converting the collected original time-domain speech signal into the frequency-domain speech signal through the simulated human-ear filter and before determining the acoustic scene corresponding to the frequency-domain speech signal according to the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal, the voice noise suppression method further comprises:
obtaining a voice activity detection result of the frequency-domain speech signal;
calculating the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal according to the voice activity detection result.
4. The voice noise suppression method according to claim 3, characterized in that obtaining the voice activity detection result of the frequency-domain speech signal comprises:
performing voice activity detection on the frequency-domain speech signal so as to divide the frequency-domain speech signal into voiced segments and unvoiced segments, and taking the voiced segments and the unvoiced segments as the voice activity detection result of the frequency-domain speech signal, wherein a voiced segment is a frequency range containing both a speech signal and a noise signal, and an unvoiced segment is a frequency range containing only a noise signal.
5. The voice noise suppression method according to claim 4, characterized in that calculating the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal according to the voice activity detection result comprises:
comparing and calculating the energy features of the voiced segments and the unvoiced segments to obtain the noise estimation result and the signal-to-noise ratio.
6. The voice noise suppression method according to any one of claims 1-5, characterized in that the noise processing model comprises a noise suppression sub-model and a human-ear acoustic masking sub-model, and adjusting the parameters of the noise processing model according to the acoustic scene and performing speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model comprises:
determining an estimation masking threshold of the human-ear acoustic masking sub-model according to the acoustic scene, and filtering out an auditory-perceptible frequency-domain speech signal from the frequency-domain speech signal using the human-ear acoustic masking sub-model;
performing spectral-subtraction-based noise suppression processing on the auditory-perceptible frequency-domain speech signal using the noise suppression sub-model, to obtain a speech-enhanced output signal.
7. The voice noise suppression method according to claim 6, characterized in that, after performing speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model, the voice noise suppression method further comprises:
converting the speech-enhanced output signal into a time-domain speech signal;
amplifying the time-domain speech signal with a power amplifier and then outputting it through a loudspeaker.
8. A voice noise suppression device, characterized in that the voice noise suppression device comprises:
an acoustic scene determining module, configured to determine an acoustic scene corresponding to a frequency-domain speech signal according to a noise estimation result and a signal-to-noise ratio of the frequency-domain speech signal;
a parameter adjustment module, configured to adjust parameters of a noise processing model according to the acoustic scene;
a noise processing module, configured to perform speech enhancement on the frequency-domain speech signal according to the adjusted noise processing model.
9. The voice noise suppression device according to claim 8, characterized in that the voice noise suppression device further comprises:
a voice activity detection module, configured to obtain a voice activity detection result of the frequency-domain speech signal;
a noise analysis module, configured to calculate the noise estimation result and signal-to-noise ratio of the frequency-domain speech signal according to the voice activity detection result.
10. A storage medium, characterized in that computer program instructions are stored in the storage medium, and when the computer program instructions are read and run by a processor, the steps of the method according to any one of claims 1-7 are performed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810692665.1A CN108766454A (en) | 2018-06-28 | 2018-06-28 | A kind of voice noise suppressing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108766454A true CN108766454A (en) | 2018-11-06 |
Family
ID=63974574
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810692665.1A Pending CN108766454A (en) | 2018-06-28 | 2018-06-28 | A kind of voice noise suppressing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108766454A (en) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109671446A (en) * | 2019-02-20 | 2019-04-23 | 西华大学 | A kind of deep learning sound enhancement method based on absolute hearing threshold |
CN110197670A (en) * | 2019-06-04 | 2019-09-03 | 大众问问(北京)信息科技有限公司 | Audio defeat method, apparatus and electronic equipment |
CN110544468A (en) * | 2019-08-23 | 2019-12-06 | Oppo广东移动通信有限公司 | Application awakening method and device, storage medium and electronic equipment |
WO2020097820A1 (en) * | 2018-11-14 | 2020-05-22 | 深圳市大疆创新科技有限公司 | Wind noise processing method, device, and system employing multiple microphones, and storage medium |
CN111261183A (en) * | 2018-12-03 | 2020-06-09 | 珠海格力电器股份有限公司 | Method and device for denoising voice |
CN111477241A (en) * | 2020-04-15 | 2020-07-31 | 南京邮电大学 | Layered self-adaptive denoising method and system for household noise environment |
CN111564161A (en) * | 2020-04-28 | 2020-08-21 | 长沙世邦通信技术有限公司 | Sound processing device and method for intelligently suppressing noise, terminal equipment and readable medium |
CN111796790A (en) * | 2019-04-09 | 2020-10-20 | 深圳市冠旭电子股份有限公司 | Sound effect adjusting method and device, readable storage medium and terminal equipment |
CN112165590A (en) * | 2020-09-30 | 2021-01-01 | 联想(北京)有限公司 | Video recording implementation method and device and electronic equipment |
CN112185410A (en) * | 2020-10-21 | 2021-01-05 | 北京猿力未来科技有限公司 | Audio processing method and device |
CN112309418A (en) * | 2020-10-30 | 2021-02-02 | 出门问问(苏州)信息科技有限公司 | Method and device for inhibiting wind noise |
CN112349291A (en) * | 2020-09-29 | 2021-02-09 | 成都千立网络科技有限公司 | Sound amplification system and method based on AI noise reduction model |
CN112420073A (en) * | 2020-10-12 | 2021-02-26 | 北京百度网讯科技有限公司 | Voice signal processing method, device, electronic equipment and storage medium |
CN112951259A (en) * | 2021-03-01 | 2021-06-11 | 杭州网易云音乐科技有限公司 | Audio noise reduction method and device, electronic equipment and computer readable storage medium |
CN112992153A (en) * | 2021-04-27 | 2021-06-18 | 太平金融科技服务(上海)有限公司 | Audio processing method, voiceprint recognition device and computer equipment |
CN113257272A (en) * | 2021-06-29 | 2021-08-13 | 深圳小米通讯技术有限公司 | Voice signal processing method and device, electronic equipment and storage medium |
WO2023138252A1 (en) * | 2022-01-24 | 2023-07-27 | Oppo广东移动通信有限公司 | Audio signal processing method and apparatus, earphone device, and storage medium |
WO2024041512A1 (en) * | 2022-08-25 | 2024-02-29 | 维沃移动通信有限公司 | Audio noise reduction method and apparatus, and electronic device and readable storage medium |
CN117746828A (en) * | 2024-02-20 | 2024-03-22 | 华侨大学 | Noise masking control method, device, equipment and medium for open office |
CN113949955B (en) * | 2020-07-16 | 2024-04-09 | Oppo广东移动通信有限公司 | Noise reduction processing method and device, electronic equipment, earphone and storage medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101777349A (en) * | 2009-12-08 | 2010-07-14 | 中国科学院自动化研究所 | Auditory perception property-based signal subspace microphone array voice enhancement method |
CN102014205A (en) * | 2010-11-19 | 2011-04-13 | 中兴通讯股份有限公司 | Method and device for treating voice call quality |
CN103077725A (en) * | 2012-12-31 | 2013-05-01 | 东莞宇龙通信科技有限公司 | Speech processing method and device |
CN103456301A (en) * | 2012-05-28 | 2013-12-18 | 中兴通讯股份有限公司 | Ambient sound based scene recognition method and device and mobile terminal |
CN103617797A (en) * | 2013-12-09 | 2014-03-05 | 腾讯科技(深圳)有限公司 | Voice processing method and device |
CN104575511A (en) * | 2013-10-22 | 2015-04-29 | 陈卓 | Voice enhancement method and device |
CN106128451A (en) * | 2016-07-01 | 2016-11-16 | 北京地平线机器人技术研发有限公司 | Method for voice recognition and device |
CN107910011A (en) * | 2017-12-28 | 2018-04-13 | 科大讯飞股份有限公司 | A kind of voice de-noising method, device, server and storage medium |
US10021507B2 (en) * | 2013-05-24 | 2018-07-10 | Barco Nv | Arrangement and method for reproducing audio data of an acoustic scene |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108766454A (en) | A kind of voice noise suppressing method and device | |
CN106486131B (en) | A kind of method and device of speech de-noising | |
US9601119B2 (en) | Systems and methods for segmenting and/or classifying an audio signal from transformed audio information | |
US9666183B2 (en) | Deep neural net based filter prediction for audio event classification and extraction | |
CN103827965B (en) | Adaptive voice intelligibility processor | |
EP1973104B1 (en) | Method and apparatus for estimating noise by using harmonics of a voice signal | |
CN109313909B (en) | Method, device, apparatus and system for evaluating consistency of microphone array | |
CN105261359B (en) | The noise-canceling system and noise-eliminating method of mobile microphone | |
CN103026407A (en) | A bandwidth extender | |
TR201810466T4 (en) | Apparatus and method for processing an audio signal to improve speech using feature extraction. | |
CN103903634B (en) | The detection of activation sound and the method and apparatus for activating sound detection | |
CN111128214A (en) | Audio noise reduction method and device, electronic equipment and medium | |
EP3316256A1 (en) | Voice activity modification frame acquiring method, and voice activity detection method and apparatus | |
KR20220062598A (en) | Systems and methods for generating audio signals | |
CN104916292B (en) | Method and apparatus for detecting audio signals | |
US20140321655A1 (en) | Sensitivity Calibration Method and Audio Device | |
CN110390947B (en) | Method, system, device and storage medium for determining sound source position | |
CN109979476A (en) | A kind of method and device of speech dereverbcration | |
Mack et al. | Single-Channel Dereverberation Using Direct MMSE Optimization and Bidirectional LSTM Networks. | |
Tian et al. | Spoofing detection under noisy conditions: a preliminary investigation and an initial database | |
Fraile et al. | Mfcc-based remote pathology detection on speech transmitted through the telephone channel-impact of linear distortions: Band limitation, frequency response and noise | |
Shankar et al. | Noise dependent super gaussian-coherence based dual microphone speech enhancement for hearing aid application using smartphone | |
CN113593604A (en) | Method, device and storage medium for detecting audio quality | |
Chen et al. | Neuromorphic pitch based noise reduction for monosyllable hearing aid system application | |
Dai et al. | An improved model of masking effects for robust speech recognition system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
20200617 | TA01 | Transfer of patent application right | Effective date of registration: 20200617. Address after: 317028 Zhang Village No., Cang Town, Taizhou City, Zhejiang Province. Applicant after: Taizhou Zhige Electronic Technology Co., Ltd. Address before: 318000 Zhangjia Du 2-110, Gugang Town, Linghai City, Taizhou, Zhejiang. Applicant before: ZHEJIANG FEIGE ELECTRONIC TECHNOLOGY Co., Ltd. |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20181106 |