CN112037759A - Anti-noise perception sensitivity curve establishing and voice synthesizing method - Google Patents
- Publication number: CN112037759A (application CN202010686375.3A)
- Authority
- CN
- China
- Prior art keywords
- noise
- critical
- sensitivity curve
- voice
- perception
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/69—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for evaluating synthetic or decoded voice signals
Abstract
The invention provides a method for establishing an anti-noise perception sensitivity curve and synthesizing speech. The method comprises: dividing noise by band-pass filtering according to the critical frequency bands perceived by the human ear to obtain a plurality of critical-band noises; for each critical-band noise, recording a corresponding anti-noise voice sequence at different noise decibel levels; determining a perception threshold based on the SII objective test index and performing a noise decibel-level perception test on each critical band to obtain updated critical decibels; generating an anti-noise perception sensitivity curve from the updated critical decibels; and obtaining critical decibel values from the anti-noise perception sensitivity curve, selecting anti-noise voices at different critical decibel values, training an anti-noise voice feature mapping model, and performing speech synthesis with the mapped anti-noise voice features. By exploiting human hearing characteristics in noisy environments, the method is better suited to the practical application scenarios of anti-noise voice conversion.
Description
Technical Field
The invention belongs to the technical field of acoustics, and particularly relates to an anti-noise perception sensitivity curve establishing and voice synthesizing method.
Background
An equal-loudness curve plots the sound pressure level against frequency for pure tones that a typical listener perceives as equally loud. Among the equal-loudness curves of binaural audiometry, the curve with the lowest threshold, the minimum audible field for pure tones, serves as the hearing threshold curve. Loudness is mainly determined by sound intensity: increasing the intensity correspondingly raises the loudness level. However, loudness is not determined by intensity alone; it also depends on frequency, and pure tones of different frequencies grow in loudness at different rates, with low-frequency pure tones growing faster than mid-frequency ones.
Thus, similarly to the equal-loudness curve, speakers perceive ambient noise differently at different frequencies and noise levels, and accordingly trigger different anti-noise sound-production patterns. By determining a speaker's discrimination threshold curve for changes in the decibel level of environmental noise, one can build an anti-noise sound-production model based on the Lombard effect, trigger the corresponding anti-noise voice conversion at the right time, and keep the converted anti-noise speech consistent with the various real noise scenes. However, the prior art focuses on the acoustic features changed by the Lombard effect and their importance for improving the intelligibility of anti-noise speech. Lacking guidance from anti-noise perception sensitivity, the converted anti-noise speech does not match the real scene, which degrades subsequent speech applications.
To make full use of human perceptual characteristics in different noise environments, the invention studies the anti-noise vocalization mechanism from the perspective of auditory perception, establishes a speaker's perception sensitivity curve for environmental noise, and thereby addresses the current disconnect between anti-noise voice conversion and real scenes caused by the lack of an auditory perception model guiding anti-noise speech production.
Disclosure of Invention
The invention provides an anti-noise perception sensitivity curve establishing and voice synthesizing method, aiming to solve the problem that existing anti-noise voice production lacks the guidance of an auditory perception model, and to reduce the detail differences across frequency.
The technical scheme adopted by the invention is that the method for establishing the anti-noise perception sensitivity curve comprises the following steps,
step 1, dividing noise according to critical frequency bands sensed by human ears by using band-pass filtering to obtain a plurality of critical frequency band noises;
step 2, recording corresponding anti-noise voice sequences according to different noise decibels aiming at each critical frequency band noise in the step 1;
step 3, determining a perception threshold value based on the SII objective test index, and performing noise decibel level perception test on each critical frequency band to obtain an updated critical decibel;
and 4, generating an anti-noise perception sensitivity curve according to the updated critical decibels obtained in the step 3.
In step 1, white noise is used as the noise.
In step 1, Bark band or Mel band is used as the critical band of human ear perception.
Moreover, step 2 is implemented as follows: firstly, for each critical-band noise obtained in step 1, data are collected through an artificial head, each critical-band noise is adjusted according to a preset signal-to-noise ratio, and its decibel level is calibrated; then, for each critical-band noise, voice sequences are recorded at the different decibel levels.
Given a preset lower limit MIN, upper limit MAX, and step size d of the signal-to-noise-ratio range, recordings are made at signal-to-noise ratios of MIN, MIN + d, MIN + 2d, …, MAX to obtain the corresponding voice sequences.
In step 3, the noise decibel level sensing test for each critical frequency band is realized by using the MUSHRA standard.
The invention also provides a speech synthesis method based on the anti-noise perception sensitivity curve, which comprises the following steps,
step 1, dividing noise according to critical frequency bands sensed by human ears by using band-pass filtering to obtain a plurality of critical frequency band noises;
step 2, recording corresponding anti-noise voice sequences according to different noise decibels aiming at each critical frequency band noise in the step 1;
step 3, determining a perception threshold value based on the SII objective test index, and performing noise decibel level perception test on each critical frequency band to obtain an updated critical decibel;
step 4, generating an anti-noise perception sensitivity curve according to the updated critical decibels obtained in the step 3;
and 5, obtaining critical decibel values from the anti-noise perception sensitivity curve obtained in the step 4, selecting anti-noise voices with different critical decibel values, training an anti-noise voice feature mapping model, and performing voice synthesis by using the mapped anti-noise voice features.
Furthermore, in step 5, the WORLD vocoder is used to extract acoustic features, including the fundamental frequency and the spectral envelope.
In step 5, the anti-noise speech feature mapping model is obtained by training a Gaussian mixture model on the spectral envelope using the EM method.
Moreover, speech synthesis is performed by combining the fundamental frequency feature with the spectral envelope feature conversion result obtained by the anti-noise voice feature mapping model.
By exploiting human hearing characteristics and the special vocal mechanism in noisy environments, the method of the invention provides an anti-noise perception sensitivity curve establishing and voice synthesizing method that is better suited to practical anti-noise voice-conversion scenarios, with high accuracy and broad application prospects; for example, large anti-noise voice data sets are needed in practical applications such as speech separation and conference transcription.
Detailed Description
To facilitate understanding and practice of the invention by those of ordinary skill in the art, the present invention is described in further detail below with reference to examples; it should be understood that the examples are illustrative and are not intended to limit the invention.
The method provided by the invention can be implemented with computer software and corresponding hardware; the process of the invention is explained in detail below.
Example one
The embodiment of the invention provides a speech synthesis method based on the establishment of an anti-noise perception sensitivity curve, with the following specific implementation steps:
step 1: dividing the noise according to the critical frequency band sensed by the human ear by using band-pass filtering to obtain a plurality of critical frequency band noises;
the noise used in the embodiment is white noise, a Bark band is used as a critical frequency band of human ear perception, and the white noise is divided according to the Bark band by using band-pass filtering.
Step 2: recording a corresponding anti-noise voice sequence according to different noise decibels aiming at each critical frequency band noise obtained in the step 1;
for step 2, this embodiment may be implemented by the following steps:
step 2.1: and (3) aiming at each Bark band noise in the step (1), acquiring data through a manual head, correspondingly adjusting each Bark band noise according to a preset signal-to-noise ratio, and calibrating the decibel level.
Considering that common scene noise is about 35 dB and the pain threshold of the human ear is 85 dB, the preset signal-to-noise-ratio range in this embodiment is 40–85 dB, i.e. MIN = 40, MAX = 85, with step size d = 5 dB. For each Bark-band noise, recordings are made at signal-to-noise ratios of 40 dB, 45 dB, …, 80 dB and 85 dB to obtain the corresponding voice data.
The preferred recording materials and specific settings used in the examples are as follows:
The embodiment uses an artificial-head device for recording, such as a G.R.A.S. KEMAR 45BA 1/2-inch low-noise ear simulator system, which includes a highly realistic extended ear canal. To avoid wall reflections and other extraneous noise, the various environmental noises are played through headphones worn by the artificial head, so that an accurate signal-to-noise ratio can be obtained from the artificial-head recording.
The signal-to-noise ratio is calculated as follows:

$$\mathrm{SNR} = 10\log_{10}\frac{p_s}{p_d},\qquad p_s = \frac{1}{N}\sum_{n=1}^{N} s^2(n),\qquad p_d = \frac{1}{N}\sum_{n=1}^{N} d^2(n)$$

where s(n) is the speech signal, d(n) is the noise signal, p_s is the speech signal power, p_d is the noise signal power, n indexes the samples, and N is the number of samples.
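The SNR computation and the noise calibration of step 2.1 can be sketched in a few lines of numpy; the function names here are illustrative, not from the patent:

```python
import numpy as np

def snr_db(s, d):
    """SNR = 10*log10(p_s / p_d), the mean-power ratio of speech to noise."""
    return 10.0 * np.log10(np.mean(s ** 2) / np.mean(d ** 2))

def scale_noise_to_snr(s, d, target_db):
    """Scale the noise d so that mixing it with speech s gives the target SNR."""
    gain = 10.0 ** ((snr_db(s, d) - target_db) / 20.0)
    return d * gain

# toy calibration: drive an arbitrary noise to a 40 dB SNR against "speech"
rng = np.random.default_rng(1)
speech = rng.standard_normal(16000)
noise = 0.3 * rng.standard_normal(16000)
calibrated = scale_noise_to_snr(speech, noise, 40.0)
print(round(snr_db(speech, calibrated), 6))  # 40.0
```

Stepping `target_db` through MIN, MIN + d, …, MAX reproduces the calibration grid described above.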
Step 2.2: and respectively recording voice sequences for different decibel levels according to the noise of each Bark band.
In specific implementation, each speaker wears headphones playing the noise calibrated in step 2.1, and a voice sequence is recorded for each speaker at the different decibel levels of each Bark-band noise. The corresponding experiment of this embodiment was carried out in an anechoic chamber at Wuhan University, using a high-fidelity microphone to record the voice data at the corresponding decibel levels.
Specifically, step 1 and step 2 may be performed in advance as input data.
And step 3: determining a perception threshold value based on a Speech Intelligibility Index (SII) objective Test index, and then performing noise decibel level perception Test on each critical frequency band by using a MUSHRA (Multi-Stimulus Test with high Reference and Anchor) standard to obtain an updated critical decibel level;
in specific implementation, other objective test indexes can be adopted, for example; other criteria may also be used for testing, such as the clarity Index (AI)
For step 3, this embodiment may be implemented by the following steps:
step 3.1: the improvement is carried out based on a definition index SII, the SII depends on the audible proportion of a listener in the spectrum information, the step uses a definition formula of the SII, and the critical decibel is calculated under the condition of a determined SII score, and the definition formula of the SII is as follows:
wherein, SII score is 0-1, and 0.35 is taken for determining decibel threshold value in the embodiment; n isf20 for the total number of frequency bands; wfA human ear perception weight representing the frequency band f; l isfA variable element representing a speech level distortion; efAnd DfDecibels representing speech and interference noise, respectively;representing the audible threshold for that band.
By the formula, while the speech intelligibility is ensured, the noise signal-to-noise ratio (critical decibel) corresponding to the anti-noise speech is obtained, namely Ef-Df。
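A minimal numerical sketch of this inversion, assuming the ANSI S3.5-style band audibility A_f = clip((E_f − D_f + 15)/30, 0, 1) with the level-distortion factor L_f fixed to 1 and equal weights summing to 1 (assumptions; the patent states only that the critical decibel is solved for a fixed SII score of 0.35):

```python
import numpy as np

def sii(e_db, d_db, w):
    """Simplified SII: sum of weighted band audibilities (ANSI S3.5 style,
    with the level-distortion factor L_f taken as 1 -- an assumption)."""
    e, d, w = (np.asarray(a, dtype=float) for a in (e_db, d_db, w))
    a_f = np.clip((e - d + 15.0) / 30.0, 0.0, 1.0)
    return float(np.sum(w * a_f))

def critical_snr(target_sii=0.35):
    """Invert the simplified SII for E_f - D_f under equal weights summing
    to 1 and a flat speech/noise spectrum."""
    return 30.0 * target_sii - 15.0

n_f = 20
w = np.full(n_f, 1.0 / n_f)
snr = critical_snr(0.35)
print(snr, sii(np.full(n_f, snr), np.zeros(n_f), w))  # -4.5 and 0.35
```

With frequency-dependent weights W_f the inversion would instead be solved numerically per band, which is what motivates the per-band critical decibels of the sensitivity curve.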
Step 3.2: fine-tuning the critical decibel value in step 3.1: noise decibel level perception experiments were performed on each Bark band noise, where hearing perception tests were performed using the MUSHRA standard, and Word Error Rate (WER) was calculated. In order to keep the recognized word sequence consistent with the standard sequence, some words are replaced, deleted, or inserted, and the total number of words is divided by the total number of words in the standard sequence, multiplied by a percentage. The final word error rate calculation formula is as follows:
the obtained error rate is a score, and the statistical significance is required to be taken as a reference, and the average score of each voice sequence is calculated firstly
Wherein, scoreijkAnd (4) representing the score of the ith listener on the kth voice under the jth signal-to-noise ratio level, wherein N is the total number of listeners in the subjective experiment. Confidence intervals for each average score were then calculated:
the confidence coefficient is 95%, and non-repeated boundary values are found by comparing confidence intervals of different signal-to-noise ratios, and the critical decibel is updated.
Step 4: Generate an anti-noise perception sensitivity curve from the test result of step 3 (the updated critical decibels obtained in step 3.2).
In the present embodiment Bark bands are used, so the sensitivity curve is plotted with the Bark band on the horizontal axis and the Bark-band noise decibel level on the vertical axis. In specific implementation, other frequency bands, such as Mel bands, may be used to generate the corresponding curve.
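Once the curve exists, reading off a critical decibel for a given frequency might look like the sketch below; the Zwicker Bark-scale formula and the per-band values are illustrative assumptions:

```python
import math

def hz_to_bark(f_hz):
    """Zwicker's approximation of the Bark scale (an assumption; the patent
    does not specify which Bark formula it uses)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def critical_db(curve, f_hz):
    """Look up the critical decibel for a frequency from a per-Bark-band curve."""
    band = min(int(hz_to_bark(f_hz)), len(curve) - 1)
    return curve[band]

# hypothetical sensitivity curve: one critical-dB value per Bark band
curve = [62, 61, 60, 58, 57, 55, 54, 52, 51, 50,
         49, 49, 48, 48, 47, 47, 46, 46, 45, 45, 44]
print(critical_db(curve, 1000.0))  # band ~8.5 -> curve[8]
```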
And 5: and 4, obtaining critical decibel values from the anti-noise perception sensitivity curve in the step 4, selecting anti-noise voices with different critical decibel values, training an anti-noise voice feature mapping model, and performing voice synthesis by using the mapped anti-noise voice features.
For step 5, this embodiment may be implemented by the following steps:
step 5.1: and selecting anti-noise voices with different critical decibel values and corresponding common voices in the anti-noise perception sensitivity curve, and extracting acoustic features such as fundamental frequency (f0) and spectral envelope (spec).
In this embodiment, the method of extracting acoustic features by using the WORLD vocoder includes:
f0=DIO(x,fs)
spec=CheapTrick(x,fs,f0)
where x is the input speech signal, fs is the sampling rate, and DIO and CheapTrick are prior art in the WORLD vocoder, which the present invention does not describe in detail.
Step 5.2: and (5) training an anti-noise voice feature mapping model by using the acoustic features extracted in the step (5.1), and performing feature conversion by using the feature mapping model.
The anti-noise speech feature mapping model used in this embodiment is a Gaussian Mixture Model (GMM), trained with the Expectation-Maximization (EM) algorithm on the spec features of step 5.1, where the spec feature is 24-dimensional. The GMM is prior art and is not described in detail.
In this embodiment, the GMM is used as the feature mapping model, and neural network models such as CycleGAN and StarGAN may also be used.
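The GMM mapping can be illustrated in its single-component (K = 1) special case, where the EM-trained mixture reduces to one joint Gaussian over stacked source/target features and the conversion is the closed-form MMSE regression y' = μ_y + Σ_yx Σ_xx⁻¹ (x − μ_x). This numpy sketch is a deliberate simplification of the patent's 24-dimensional multi-component GMM:

```python
import numpy as np

def fit_joint_gaussian(x, y):
    """Fit one joint Gaussian over stacked [x; y] features -- the K = 1
    special case of the GMM mapping (an illustrative simplification)."""
    z = np.hstack([x, y])
    mu = z.mean(axis=0)
    cov = np.cov(z, rowvar=False)
    d = x.shape[1]
    return mu[:d], mu[d:], cov[:d, :d], cov[d:, :d]

def convert(x, mu_x, mu_y, cov_xx, cov_yx):
    """MMSE conversion: y' = mu_y + Cov_yx Cov_xx^{-1} (x - mu_x)."""
    return mu_y + (x - mu_x) @ np.linalg.solve(cov_xx, cov_yx.T)

# toy data: the target is a known linear map of the source plus small noise
rng = np.random.default_rng(0)
x = rng.standard_normal((2000, 3))
A = np.array([[2.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, -1.0]])
y = x @ A + 0.01 * rng.standard_normal((2000, 3))
params = fit_joint_gaussian(x, y)
y_hat = convert(x, *params)
print(np.allclose(y_hat, y, atol=0.1))  # True: the mapping recovers the relation
```

A full K-component GMM replaces the single regression with a posterior-weighted sum of such per-component regressions, which is what the EM training of the embodiment provides.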
Step 5.3: and converting the spec characteristic into spe' by using the mapping model in the step 5.2, and combining other characteristics in the step 5.1 for voice synthesis.
This step adopts WORLD vocoder to carry out speech synthesis, includes:
source=Platinum(x,f0,spec)
y=SynthesisByWORLD(source,spec')
where y is the synthesized voice; Platinum and SynthesisByWORLD are prior art of the WORLD vocoder and are not repeated in the present invention.
In this embodiment, the WORLD vocoder is preferably used for speech analysis and synthesis; alternatively, a STRAIGHT vocoder or the like can be used for analysis, and neural network models such as WaveNet and WaveGAN can be used for synthesis.
Example two
The second embodiment of the invention fully utilizes the auditory characteristic of people in a noise environment, provides an anti-noise perception sensitivity curve establishing method, and can provide key guidance for anti-noise voice conversion in practical application. In specific implementation, the steps 1 to 4 in the first embodiment are implemented.
In specific implementation, the method provided by the technical scheme of the invention can be used for realizing an automatic operation process by a person skilled in the art by adopting a computer software technology to carry out operations such as generating an anti-noise perception sensitivity curve, synthesizing voice and the like. The system device for operating the method, such as a computer readable storage medium storing the corresponding computer program of the technical solution of the present invention and a computer apparatus including the corresponding computer program, should also be within the scope of the present invention.
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. An anti-noise perception sensitivity curve establishing method, characterized by comprising the following steps:
step 1, dividing noise according to critical frequency bands sensed by human ears by using band-pass filtering to obtain a plurality of critical frequency band noises;
step 2, recording corresponding anti-noise voice sequences according to different noise decibels aiming at each critical frequency band noise in the step 1;
step 3, determining a perception threshold value based on the objective test index, and performing noise decibel level perception test on each critical frequency band to obtain updated critical decibels;
and 4, generating an anti-noise perception sensitivity curve according to the updated critical decibels obtained in the step 3.
2. The anti-noise perceptual sensitivity curve creation method of claim 1, wherein: in step 1, the noise is white noise.
3. The anti-noise perceptual sensitivity curve creation method of claim 1, wherein: in step 1, Bark band or Mel band is used as the critical band of human ear perception.
4. The anti-noise perceptual sensitivity curve creation method of claim 1, wherein: step 2 is implemented by firstly, for each critical-band noise obtained in step 1, collecting data through an artificial head, adjusting each critical-band noise according to a preset signal-to-noise ratio, and calibrating the decibel level; and then recording voice sequences at the different decibel levels for each critical-band noise.
5. The anti-noise perceptual sensitivity curve creation method of claim 4, wherein: and recording according to the preset lower limit MIN, the preset upper limit MAX and the preset step length d of the signal-to-noise ratio range and the signal-to-noise ratio of MIN, MIN + d, MIN +2d, … and MAX respectively to obtain a corresponding voice sequence.
6. The anti-noise perceptual sensitivity curve creation method of claim 1, wherein: in step 3, a perception threshold value is determined based on the SII objective test index, and a noise decibel level perception test is carried out on each critical frequency band by adopting the MUSHRA standard.
7. A speech synthesis method based on anti-noise perception sensitivity curve establishment, characterized by comprising the following steps:
step 1, dividing noise according to critical frequency bands sensed by human ears by using band-pass filtering to obtain a plurality of critical frequency band noises;
step 2, recording corresponding anti-noise voice sequences according to different noise decibels aiming at each critical frequency band noise in the step 1;
step 3, determining a perception threshold value based on the objective test index, and performing noise decibel level perception test on each critical frequency band to obtain updated critical decibels;
step 4, generating an anti-noise perception sensitivity curve according to the updated critical decibels obtained in the step 3;
and 5, obtaining critical decibel values from the anti-noise perception sensitivity curve obtained in the step 4, selecting anti-noise voices with different critical decibel values, training an anti-noise voice feature mapping model, and performing voice synthesis by using the mapped anti-noise voice features.
8. The method of speech synthesis based on antinoise perceptual sensitivity curve creation as defined in claim 7, wherein: in step 5, a WORLD vocoder is used to extract acoustic features, including fundamental frequency and spectral envelope.
9. The speech synthesis method based on an anti-noise perceptual sensitivity curve according to claim 8, wherein: in step 5, the anti-noise voice feature mapping model is obtained by training a Gaussian mixture model on the spectral envelope using the EM (Expectation-Maximization) method.
10. The speech synthesis method based on an anti-noise perceptual sensitivity curve according to claim 9, wherein: speech synthesis is performed by combining the fundamental frequency features with the spectral envelope feature conversion result obtained by the anti-noise voice feature mapping model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010686375.3A CN112037759B (en) | 2020-07-16 | 2020-07-16 | Anti-noise perception sensitivity curve establishment and voice synthesis method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112037759A true CN112037759A (en) | 2020-12-04 |
CN112037759B CN112037759B (en) | 2022-08-30 |
Family
ID=73579514
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010686375.3A Active CN112037759B (en) | 2020-07-16 | 2020-07-16 | Anti-noise perception sensitivity curve establishment and voice synthesis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112037759B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113450780A (en) * | 2021-06-16 | 2021-09-28 | 武汉大学 | Lombard effect classification method for auditory perception loudness space |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1460992A (en) * | 2003-07-01 | 2003-12-10 | 北京阜国数字技术有限公司 | Low-time-delay adaptive multi-resolution filter group for perception voice coding/decoding |
US20040024591A1 (en) * | 2001-10-22 | 2004-02-05 | Boillot Marc A. | Method and apparatus for enhancing loudness of an audio signal |
US20110178799A1 (en) * | 2008-07-25 | 2011-07-21 | The Board Of Trustees Of The University Of Illinois | Methods and systems for identifying speech sounds using multi-dimensional analysis |
CN103165136A (en) * | 2011-12-15 | 2013-06-19 | 杜比实验室特许公司 | Audio processing method and audio processing device |
CN103390408A (en) * | 2012-05-09 | 2013-11-13 | 奥迪康有限公司 | Method and apparatus for processing audio signal |
CN105869652A (en) * | 2015-01-21 | 2016-08-17 | 北京大学深圳研究院 | Psychological acoustic model calculation method and device |
US20190156855A1 (en) * | 2016-05-11 | 2019-05-23 | Nuance Communications, Inc. | Enhanced De-Esser For In-Car Communication Systems |
CN110085245A (en) * | 2019-04-09 | 2019-08-02 | 武汉大学 | A kind of speech intelligibility Enhancement Method based on acoustic feature conversion |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040024591A1 (en) * | 2001-10-22 | 2004-02-05 | Boillot Marc A. | Method and apparatus for enhancing loudness of an audio signal |
CN1460992A (en) * | 2003-07-01 | 2003-12-10 | 北京阜国数字技术有限公司 | Low-time-delay adaptive multi-resolution filter group for perception voice coding/decoding |
US20110178799A1 (en) * | 2008-07-25 | 2011-07-21 | The Board Of Trustees Of The University Of Illinois | Methods and systems for identifying speech sounds using multi-dimensional analysis |
CN103165136A (en) * | 2011-12-15 | 2013-06-19 | 杜比实验室特许公司 | Audio processing method and audio processing device |
CN103390408A (en) * | 2012-05-09 | 2013-11-13 | 奥迪康有限公司 | Method and apparatus for processing audio signal |
CN105869652A (en) * | 2015-01-21 | 2016-08-17 | 北京大学深圳研究院 | Psychological acoustic model calculation method and device |
US20190156855A1 (en) * | 2016-05-11 | 2019-05-23 | Nuance Communications, Inc. | Enhanced De-Esser For In-Car Communication Systems |
CN110085245A (en) * | 2019-04-09 | 2019-08-02 | 武汉大学 | A kind of speech intelligibility Enhancement Method based on acoustic feature conversion |
Non-Patent Citations (5)
Title |
---|
G. LI et al.: "Normal-To-Lombard Speech Conversion by LSTM Network and BGMM for Intelligibility Enhancement of Telephone Speech", 2020 IEEE International Conference on Multimedia and Expo (ICME) *
S. SESHADRI et al.: "Vocal Effort Based Speaking Style Conversion Using Vocoder Features and Parallel Learning", IEEE Access *
TIAN Bin et al.: "A noisy Lombard and Loud speech compensation method for speech recognition in strong noise environments", Acta Acustica (Chinese edition) *
TIAN Bin et al.: "A noisy Lombard and Loud speech compensation method for speech recognition in strong noise environments", Acta Acustica *
CHEN Sheng et al.: "A subspace speech enhancement algorithm based on the perceptual masking effect of the human ear", Electronics Quality *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113450780A (en) * | 2021-06-16 | 2021-09-28 | 武汉大学 | Lombard effect classification method for auditory perception loudness space |
CN113450780B (en) * | 2021-06-16 | 2023-02-24 | 武汉大学 | Lombard effect classification method for auditory perception loudness space |
Also Published As
Publication number | Publication date |
---|---|
CN112037759B (en) | 2022-08-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5737719A (en) | Method and apparatus for enhancement of telephonic speech signals | |
US9943253B2 (en) | System and method for improved audio perception | |
Humes et al. | Application of the Articulation Index and the Speech Transmission Index to the recognition of speech by normal-hearing and hearing-impaired listeners | |
US8369549B2 (en) | Hearing aid system adapted to selectively amplify audio signals | |
US8867764B1 (en) | Calibrated hearing aid tuning appliance | |
CN109246515B (en) | A kind of intelligent earphone and method promoting personalized sound quality function | |
CN107293286B (en) | Voice sample collection method based on network dubbing game | |
US20140309549A1 (en) | Methods for testing hearing | |
Boothroyd et al. | The hearing aid input: A phonemic approach to assessing the spectral distribution of speech | |
Marzinzik | Noise reduction schemes for digital hearing aids and their use for the hearing impaired | |
US6956955B1 (en) | Speech-based auditory distance display | |
Kates et al. | The hearing-aid audio quality index (HAAQI) | |
Monson et al. | The maximum audible low-pass cutoff frequency for speech | |
CN112037759B (en) | Anti-noise perception sensitivity curve establishment and voice synthesis method | |
WO2022240346A1 (en) | Voice optimization in noisy environments | |
KR100888049B1 (en) | A method for reinforcing speech using partial masking effect | |
DK2584795T3 (en) | Method for determining a compression characteristic | |
Herzke et al. | Effects of instantaneous multiband dynamic compression on speech intelligibility | |
CN113450780B (en) | Lombard effect classification method for auditory perception loudness space | |
Salehi et al. | Electroacoustic assessment of wireless remote microphone systems | |
CN114205724B (en) | Hearing aid earphone debugging method, device and equipment | |
Bouserhal et al. | On the potential for artificial bandwidth extension of bone and tissue conducted speech: A mutual information study | |
Patel et al. | Frequency-based multi-band adaptive compression for hearing aid application | |
JP7404664B2 (en) | Audio processing device and audio processing method | |
RU2589298C1 (en) | Method of increasing legible and informative audio signals in the noise situation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |