CN103824562A - Psychological acoustic model-based voice post-perception filter


Info

Publication number: CN103824562A (application publication); granted version CN103824562B
Application number: CN201410046572.3A
Authority: CN (China)
Inventors: 贾海蓉, 李鸿燕, 武奕峰, 张雪英
Assignee (original and current): Taiyuan University of Technology
Original language: Chinese (zh)
Legal status: Granted; Expired - Fee Related

Abstract

The invention relates to a speech post-perceptual filter based on a psychoacoustic model. The perceptual filter does not need to be fused into each enhancement algorithm, so it does not affect the algorithms' complexity, yet it achieves the same improvement in auditory perception. Because it acts as a re-processing stage after speech enhancement, it further improves the auditory quality of the enhanced speech; even when residual noise remains and the signal-to-noise ratio does not improve, the post-filter still improves perceived quality. The filter is designed so that speech-signal distortion is minimal under the constraint that residual noise is, as far as possible, inaudible to the human ear. Its gain is obtained by constructing, under this constraint, a cost function that contains the masking threshold, and it is further optimized by a perceptual normalization factor built from the masking threshold. This avoids over-attenuating the signal and guarantees minimal perceptual distortion of the enhanced speech.

Description

Speech post-perceptual filter based on a psychoacoustic model
Technical field
The present invention relates to a speech post-perceptual filter based on a psychoacoustic model.
Background art
Current speech-enhancement algorithms remove noise to varying degrees, but they more or less leave residual noise and musical noise that degrade speech quality and therefore call for further suppression. Moreover, since the final evaluation of speech rests on human auditory perception, speech-enhancement research should exploit the perceptual characteristics of the human auditory system, namely the masking effect of the ear, which has a natural capacity to suppress unwanted noise. Doing so lets the enhanced speech minimize listening fatigue, improve auditory perception, and thus raise overall speech quality. Combining the masking effect of human hearing with speech enhancement therefore plays a very important role in enhancement performance.
In recent years, many researchers have studied speech enhancement based on the masking effect of the human ear and obtained useful results. However, these algorithms are all built by fusing a masking model into some other algorithm, which makes the original algorithm more complex because of the added masking-model computation, sometimes to the point of precluding real-time implementation. To address this problem, the present invention proposes a post-perceptual filter based on the masking effect and applies it to speech enhancement.
Summary of the invention
Aiming at the problem that enhanced speech still contains residual noise, which degrades auditory perception, the present invention proposes a post-perceptual filter based on a psychoacoustic model and applies it to speech enhancement. First, the perceptual filter does not need to be fused into each algorithm, so it does not affect the algorithms' complexity, yet it achieves the same improvement in auditory perception. Second, because it targets the re-processing of already-enhanced speech, it further improves the auditory quality of the enhanced speech; even when noise remains and the signal-to-noise ratio does not improve, the post-filter still improves perceived quality. The post-perceptual filter is designed so that speech-signal distortion is minimal under the constraint that residual noise is, as far as possible, inaudible to the human ear. The filter gain is obtained by constructing, under this constraint, a cost function that contains the masking threshold, and it is further optimized by a perceptual normalization factor built from the masking threshold; this avoids over-attenuating the signal and guarantees minimal perceptual distortion of the enhanced speech.
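The patent states only that the gain is obtained by minimizing this cost function. For a real-valued per-bin gain G, the standard calculation (our reconstruction, not text from the patent) is:

```latex
% Reconstructed derivation: minimize the cost function over the real gain G
% for frequency bin k (symbols as in the patent's cost function).
\begin{align*}
J &= |G-1|^2\,E[|S_k|^2] + \mu\left(|G|^2\,E[|N_k|^2] - E[T_k]\right) \\
\frac{\partial J}{\partial G} &= 2(G-1)\,E[|S_k|^2] + 2\mu G\,E[|N_k|^2] = 0 \\
\Rightarrow\quad G &= \frac{E[|S_k|^2]}{E[|S_k|^2] + \mu\,E[|N_k|^2]}
\end{align*}
```

The result is a Wiener-type gain in which the weight μ, tied to the masking threshold through the constraint that the residual-noise power stay below E[T_k], trades speech distortion against residual-noise audibility.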
As shown in Fig. 1, the concrete scheme is:
1) The noisy speech is enhanced by spectral subtraction (other methods may be substituted), then divided into frames, and the masking threshold of each frame is computed according to the psychoacoustic model.
2) The masking threshold obtained in the first step is used to build a cost function whose objective is to keep the speech-signal distortion minimal under the condition that residual noise is, as far as possible, inaudible to the human ear:
J = P(ε_s) + μ(P(ε_r) - E[T_k]) = |G-1|^2 E[|S_k|^2] + μ(|G|^2 E[|N_k|^2] - E[T_k])
where ε_s = S_k(G-1) is the speech distortion and ε_r = N_k·G is the residual noise. Because speech and noise are uncorrelated, E[N_k S_k] = 0; P(ε_s) is the power of the speech distortion and P(ε_r) the power of the residual noise.
3) The cost function is minimized (its derivative with respect to the gain is set to zero) to solve for the gain of the perceptual filter.
4) To avoid over-attenuating the signal, the perceptual filter is then corrected by a perceptual normalization factor, which guarantees minimal perceptual distortion of the enhanced speech.
The perceptual normalization factor is:
[formula given as an image in the source (Figure BDA0000464756540000021) and not recoverable here]
where T_min(l) is the minimum masking threshold in the l-th frame and T_max(l) is the maximum in the l-th frame. The gain G_k of the final perceptual filter is then:
G_k = 1 / max(θ·|N_k|^2 / T_k, 1)
5) Finally, the enhanced speech is obtained.
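The gain computation of steps 2-4 can be sketched in code. The following Python/NumPy sketch (the patent's experiments used MATLAB) computes the post-filter gain for one frame; the masking threshold T_k is taken as a given input, since computing it requires a full psychoacoustic model, and because the patent's normalization-factor formula survives only as an image, the min-max form of θ below is our assumption, as are the function and variable names.

```python
import numpy as np

def perceptual_postfilter_gain(noise_power, masking_threshold):
    """Per-bin gain of the post-perceptual filter (illustrative sketch).

    noise_power       -- |N_k|^2, estimated residual-noise power per frequency bin
    masking_threshold -- T_k, masking threshold per bin from a psychoacoustic model
    Both are 1-D arrays over the K bins of one frame.
    """
    T = np.asarray(masking_threshold, dtype=float)
    N2 = np.asarray(noise_power, dtype=float)

    # Perceptual normalization factor theta: the patent builds it from the
    # frame's masking-threshold extrema T_min(l) and T_max(l); the exact
    # formula is an image in the source, so a min-max normalization is
    # assumed here.
    t_min, t_max = T.min(), T.max()
    theta = (T - t_min) / (t_max - t_min + 1e-12)

    # Gain G_k = 1 / max(theta * |N_k|^2 / T_k, 1): attenuate only where the
    # weighted noise power exceeds the masking threshold; bins whose residual
    # noise is already masked keep unity gain.
    return 1.0 / np.maximum(theta * N2 / T, 1.0)
```

By construction the gain never exceeds 1, so the filter only attenuates; bins where the residual noise sits below the masking threshold pass through unchanged, which is exactly the "inaudible noise is left alone" behavior the description argues for.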
Brief description of the drawings
The above and other aspects and advantages of the present invention will become clearer from the following description of exemplary embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the speech post-perceptual filter based on a psychoacoustic model according to the present invention;
Fig. 2 compares the results of SS and WF before and after adding the perceptual filter under a white-noise background;
Fig. 3 compares the results of SS and WF before and after adding the perceptual filter under a train-noise background.
Embodiment
Hereinafter, the present invention is described more fully with reference to the accompanying drawings, in which various embodiments are shown. The invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art.
Exemplary embodiments of the present invention are described below in more detail with reference to the accompanying drawings.
MATLAB was used to run simulation experiments in which the post-perceptual filter was added after speech enhanced by spectral subtraction (SS) and by Wiener filtering (WF). The speech is an English male utterance from the 863 corpus: "The birch canoe slid on the smooth planks."; the sampling rate is 8 kHz, the frame length K is 160, and the frame overlap is 50%. The noises are white Gaussian noise and train noise from the NOISEX-92 database, added to the clean speech to form the noisy speech. The SNRs of the speech corrupted by white Gaussian noise are -10 dB, -5 dB, 0 dB, 5 dB and 10 dB; the SNRs of the speech corrupted by train noise are 0 dB, 5 dB, 10 dB and 15 dB. The aim of the simulation is to compare the SNR (signal-to-noise ratio) and PESQ (Perceptual Evaluation of Speech Quality) of spectral subtraction and Wiener filtering before and after adding the post-perceptual filter; the experimental results are shown in Figs. 2 and 3.
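The simulation protocol of mixing noise at a prescribed SNR and measuring the SNR of the processed output can be illustrated with a small Python sketch (the original experiments used MATLAB; PESQ scoring requires an external implementation, so only the SNR bookkeeping is shown, and the function names are ours):

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db_target):
    """Scale `noise` so that clean + noise has the requested SNR in dB."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    # Choose scale so that p_clean / (scale^2 * p_noise) = 10^(SNR/10).
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db_target / 10.0)))
    return clean + scale * noise

def snr_db(clean, processed):
    """Output SNR: clean-signal power over residual (processed - clean) power."""
    clean = np.asarray(clean, dtype=float)
    err = np.asarray(processed, dtype=float) - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(err ** 2))
```

With these two helpers, an experiment run would enhance `mix_at_snr(clean, noise, snr)` with SS or WF, optionally apply the post-filter, and compare `snr_db` (and externally computed PESQ) before and after, mirroring the comparison plotted in Figs. 2 and 3.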
As Fig. 2 shows, under the white-noise background the SNR after adding the perceptual filter to spectral subtraction and Wiener filtering is slightly higher or lower than before, but the PESQ score improves overall, for example at 10 dB. This confirms the design idea of the perceptual filter: noise may remain, and the SNR may even decrease, yet auditory perception improves. The train-noise results in Fig. 3 are essentially the same. Regardless of the noise background or the enhancement algorithm, the design therefore meets the requirements of a perceptual filter and of the human auditory system; it also demonstrates the effectiveness of the newly proposed perceptual filter and its applicability to speech enhancement.
The foregoing are merely embodiments of the invention and do not limit it. The invention admits various suitable changes and variations; any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (1)

1. A speech post-perceptual filter based on a psychoacoustic model, characterized in that:
In said filter,
1) the noisy speech is enhanced by continuously differentiable spectral subtraction, then divided into frames, and the masking threshold of each frame is computed according to the psychoacoustic model;
2) the masking threshold obtained in the first step is used to build the cost function:
J = P(ε_s) + μ(P(ε_r) - E[T_k]) = |G-1|^2 E[|S_k|^2] + μ(|G|^2 E[|N_k|^2] - E[T_k])
where ε_s = S_k(G-1) is the speech distortion and ε_r = N_k·G is the residual noise; E[N_k S_k] = 0; P(ε_s) is the power of the speech distortion and P(ε_r) the power of the residual noise;
3) the cost function is minimized to solve for the gain of the perceptual filter;
4) the perceptual filter is then corrected by a perceptual normalization factor,
the perceptual normalization factor being: [formula given as an image in the source and not recoverable here] where T_min(l) is the minimum masking threshold in the l-th frame and T_max(l) is the maximum in the l-th frame; the gain G_k of the final perceptual filter is then:
G_k = 1 / max(θ·|N_k|^2 / T_k, 1)
5) finally, the enhanced speech is obtained.
CN201410046572.3A, filed 2014-02-10, granted as CN103824562B, status Expired - Fee Related: Speech post-perceptual filter based on a psychoacoustic model

Priority Applications (1)

CN201410046572.3A (priority and filing date 2014-02-10): Speech post-perceptual filter based on a psychoacoustic model

Publications (2)

CN103824562A (application publication), published 2014-05-28
CN103824562B (granted publication), published 2016-08-17

Family ID: 50759584

Family Applications (1)

CN201410046572.3A (filed 2014-02-10, Expired - Fee Related): Speech post-perceptual filter based on a psychoacoustic model

Country Status (1)

CN: CN103824562B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
CN105869649A * (priority 2015-01-21, published 2016-08-17, 北京大学深圳研究院): Perceptual filtering method and perceptual filter
CN109036466A * (priority 2018-08-01, published 2018-12-18, 太原理工大学): Emotion dimension PAD prediction method for emotional speech recognition
CN109979478A * (priority 2019-04-08, published 2019-07-05, 网易(杭州)网络有限公司): Speech denoising method and device, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
US6477489B1 * (priority 1997-09-18, published 2002-11-05, Matra Nortel Communications): Method for suppressing noise in a digital speech signal
EP1619793A1 * (priority 2004-07-20, published 2006-01-25, Harman Becker Automotive Systems GmbH): Audio enhancement system and method
CN101505447A * (priority 2008-02-07, published 2009-08-12, 奥迪康有限公司): Method of estimating weighting function of audio signals in a hearing aid
CN101636648A * (priority 2007-03-19, published 2010-01-27, 杜比实验室特许公司): Speech enhancement employing a perceptual model



Also Published As

CN103824562B, published 2016-08-17


Legal Events

C06 / PB01: Publication
C10 / SE01: Entry into force of request for substantive examination
C14 / GR01: Patent grant
CF01: Termination of patent right due to non-payment of annual fee (granted publication date: 2016-08-17; termination date: 2018-02-10)