EP1168306A2 - Method and apparatus for improving the intelligibility of digitally compressed speech - Google Patents

Method and apparatus for improving the intelligibility of digitally compressed speech

Info

Publication number
EP1168306A2
Authority
EP
European Patent Office
Prior art keywords
frame
frames
amplitude
sound
speech signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP01304339A
Other languages
German (de)
English (en)
Other versions
EP1168306A3 (fr)
Inventor
Paul Roller Michaelis
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Avaya Technology LLC
Original Assignee
Avaya Technology LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Avaya Technology LLC filed Critical Avaya Technology LLC
Publication of EP1168306A2
Publication of EP1168306A3
Legal status: Withdrawn

Links

Images

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0364 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0264 Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques

Definitions

  • the invention relates generally to speech processing and, more particularly, to techniques for enhancing the intelligibility of processed speech.
  • Human speech generally has a relatively large dynamic range.
  • The amplitudes of some consonant sounds (e.g., the unvoiced consonants P, T, S, and F) are often 30 dB lower than the amplitudes of vowel sounds in the same spoken sentence. The consonant sounds will therefore sometimes drop below a listener's speech detection threshold, compromising the intelligibility of the speech. This problem is exacerbated when the listener is hard of hearing, is located in a noisy environment, or is in an area that receives low signal strength.
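As an aside on the arithmetic: a 30 dB gap is large in linear terms. A short sketch (standard decibel conversion, not part of the patent text):

```python
def db_to_amplitude_ratio(db):
    """Convert a level difference in decibels to a linear amplitude ratio
    (20 dB corresponds to a factor of 10 in amplitude)."""
    return 10 ** (db / 20)

# A consonant 30 dB below a vowel has only about 1/31.6 of its amplitude,
# which is why it can fall below a listener's detection threshold.
consonant_to_vowel = 1 / db_to_amplitude_ratio(30)
```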
  • One prior approach to this problem was to perform amplitude compression on the signal: the amplitude peaks of the speech signal were clipped, and the resulting signal was amplified so that the difference between the peaks and the low portions of the new signal was reduced while the signal's original loudness was maintained.
  • Amplitude compression often leads to other forms of distortion within the resultant signal, such as the harmonic distortion resulting from flattening out the high amplitude components of the signal.
  • In addition, amplitude compression techniques tend to amplify some undesired low-level signal components (e.g., background noise) in an inappropriate manner, thus compromising the quality of the resultant signal.
  • the present invention relates to a system that is capable of significantly enhancing the intelligibility of processed speech.
  • the system first divides the speech signal into frames or segments as is commonly performed in certain low bit rate speech encoding algorithms, such as Linear Predictive Coding (LPC) and Code Excited Linear Prediction (CELP).
  • the system analyzes the spectral content of each frame to determine a sound type associated with that frame.
  • the analysis of each frame will typically be performed in the context of one or more other frames surrounding the frame of interest. The analysis may determine, for example, whether the sound associated with the frame is a vowel sound, a voiced fricative, or an unvoiced plosive.
  • the system will then modify the frame if it is believed that such modification will enhance intelligibility. For example, it is known that unvoiced plosive sounds commonly have lower amplitudes than other sounds within human speech. The amplitudes of frames identified as including unvoiced plosives are therefore boosted with respect to other frames.
  • the system may also modify frames surrounding that particular frame based on the sound type associated with the frame.
  • If a frame of interest is identified as including an unvoiced plosive, the amplitude of the frame preceding it can be reduced to ensure that the plosive isn't mistaken for a spectrally similar fricative.
  • the system determines a sound type associated with individual frames of a speech signal and modifies those frames based on the corresponding sound type.
  • the inventive principles are implemented as an enhancement to well-known speech encoding algorithms, such as the LPC and CELP algorithms, that perform frame-based speech digitization.
  • the system is capable of improving the intelligibility of speech signals without generating the distortions often associated with prior art amplitude clipping techniques.
  • the inventive principles can be used in a variety of speech applications including, for example, messaging systems, IVR applications, and wireless telephone systems.
  • the inventive principles can also be implemented in devices designed to aid the hard of hearing such as, for example, hearing aids and cochlear implants.
  • Fig. 1 is a block diagram illustrating a speech processing system 10 in accordance with one embodiment of the present invention.
  • the speech processing system 10 receives an analog speech signal at an input port 12 and converts this signal to a compressed digital speech signal which is output at an output port 14. In addition to performing signal compression and analog to digital conversion functions on the input signal, the system 10 also enhances the intelligibility of the input signal for later playback.
  • the speech processing system 10 includes: an analog to digital (A/D) converter 16, a frame separation unit 18, a frame analysis unit 20, a frame modification unit 22, and a compression unit 24.
  • the blocks illustrated in Fig. 1 are functional in nature and do not necessarily correspond to discrete hardware elements. In one embodiment, for example, the speech processing system 10 is implemented within a single digital processing device. Hardware implementations, however, are also possible.
  • the analog speech signal received at port 12 is first sampled and digitized within the A/D converter 16 to generate a digital waveform for delivery to the frame separation unit 18.
  • the frame separation unit 18 is operative for dividing the digital waveform into individual time-based frames. In a preferred approach, these frames are each about 20 to 25 milliseconds in length.
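Frame separation of this kind can be sketched in a few lines of Python. The 8 kHz sampling rate below is an illustrative assumption (a common telephony rate), not a value from the text:

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=20):
    """Divide a digitized waveform into consecutive time-based frames
    of roughly frame_ms milliseconds each (the text prefers 20-25 ms)."""
    frame_len = int(sample_rate_hz * frame_ms // 1000)
    return [samples[i:i + frame_len]
            for i in range(0, len(samples), frame_len)]

# 100 ms of silence at an assumed 8 kHz rate -> five 20 ms frames of 160 samples
frames = split_into_frames([0.0] * 800, sample_rate_hz=8000)
```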
  • the frame analysis unit 20 receives the frames from the frame separation unit 18 and performs a spectral analysis on each individual frame to determine a spectral content of the frame.
  • the frame analysis unit 20 then transfers each frame's spectral information to the frame modification unit 22.
  • the frame modification unit 22 uses the results of the spectral analysis to determine a sound type (or type of speech) associated with each individual frame.
  • the frame modification unit 22 modifies selected frames based on the identified sound types.
  • the frame modification unit 22 will normally analyze the spectral information corresponding to a frame of interest and also the spectral information corresponding to one or more frames surrounding the frame of interest to determine a sound type associated with the frame of interest.
  • the frame modification unit 22 includes a set of rules for modifying selected frames based on the sound type associated therewith.
  • the frame modification unit 22 also includes rules for modifying frames surrounding a frame of interest based on the sound type associated with the frame of interest.
  • the rules used by the frame modification unit 22 are designed to increase the intelligibility of the output signal generated by the system 10. Thus, the modifications are intended to emphasize the characteristics of particular sounds that allow those sounds to be distinguished from other similar sounds by the human ear. Many of the frames may remain unmodified by the frame modification unit 22 depending upon the specific rules programmed therein.
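Such a rule set might be organized as a table keyed by sound type. The sketch below is hypothetical: the sound-type labels, rule structure, and gain values are ours, since the patent does not publish concrete numbers:

```python
# Hypothetical per-sound-type rules: gain (in dB) applied to the identified
# frame and to the frame immediately preceding it.
RULES = {
    "unvoiced_plosive":   {"frame_db": 6.0, "preceding_db": -6.0},
    "voiced_plosive":     {"frame_db": 0.0, "preceding_db": -6.0},
    "unvoiced_fricative": {"frame_db": 6.0, "preceding_db": 0.0},
}

def apply_rules(amplitudes, sound_types):
    """Return frame amplitudes adjusted per the rule table; frames whose
    sound type has no rule pass through unmodified."""
    out = list(amplitudes)
    for i, sound in enumerate(sound_types):
        rule = RULES.get(sound)
        if rule is None:
            continue
        out[i] *= 10 ** (rule["frame_db"] / 20)          # boost/cut the frame
        if i > 0:
            out[i - 1] *= 10 ** (rule["preceding_db"] / 20)  # adjust its neighbor
    return out
```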
  • the modified and unmodified frame information is next transferred to the data assembly unit 24 which assembles the spectral information for all of the frames to generate the compressed output signal at output port 14.
  • the compressed output signal can then be transferred to a remote location via a communication medium or stored for later decoding and playback. It should be appreciated that the intelligibility enhancement functions of the frame modification unit 22 of Fig. 1 can alternatively (or additionally) be performed as part of the decoding process during signal playback.
  • the inventive principles are implemented as an enhancement to certain well-known speech encoding and/or decoding algorithms, such as the Linear Predictive Coding (LPC) algorithm and the Code-Excited Linear Prediction (CELP) algorithm.
  • the inventive principles can be used in conjunction with virtually any encoding or decoding algorithm that is based upon frame-based speech digitization (i.e., breaking up speech into individual time-based frames and then capturing the spectral content of each frame to generate a digital representation of the speech).
  • these algorithms utilize a mathematical model of human vocal tract physiology to describe each frame's spectral content in terms of human speech mechanism analogs, such as overall amplitude, whether the frame's sound is voiced or unvoiced, and, if the sound is voiced, the pitch of the sound. This spectral information is then assembled into a compressed digital speech signal.
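The per-frame parameter bundle such coders track can be pictured as a small record. The field names below are illustrative, not taken from any codec specification (real LPC/CELP codecs store quantized codebook indices):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class FrameParams:
    """Illustrative bundle of the per-frame quantities an LPC-style coder
    carries: overall amplitude, a voiced/unvoiced flag, pitch when voiced,
    and a spectral-filter description of the vocal tract."""
    amplitude: float
    voiced: bool
    pitch_hz: Optional[float]        # only meaningful when voiced is True
    reflection_coeffs: List[float]   # spectral filtering parameters

frame = FrameParams(amplitude=0.4, voiced=True, pitch_hz=120.0,
                    reflection_coeffs=[0.2] * 10)
```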
  • A detailed discussion of speech digitization algorithms that can be modified in accordance with the present invention can be found in the paper "Speech Digitization and Compression" by Paul Michaelis, International Encyclopedia of Ergonomics and Human Factors, edited by Waldemar Karwowski, published by Taylor & Francis, London, 2000.
  • the spectral information generated within such algorithms is used to determine a sound type associated with each frame. Knowledge about which sound types are important for intelligibility and are typically harder to hear is then used to develop rules for modifying the frame information in a manner that increases intelligibility. The rules are then used to modify the frame information of selected frames based on the determined sound type. The spectral information for each of the frames, whether modified or unmodified, is then used to develop the compressed speech signal in a conventional manner (e.g., the manner typically used by the LPC, CELP, or other similar algorithms).
  • Fig. 2 is a flowchart illustrating a method for processing an analog speech signal in accordance with one embodiment of the present invention.
  • the speech signal is digitized and separated into individual frames (step 30).
  • a spectral analysis is then performed on each individual frame to determine a spectral content of the frame (step 32).
  • spectral parameters such as amplitude, voicing, and pitch (if any) of sounds will be measured during the spectral analysis.
  • the spectral content of the frames is next analyzed to determine a sound type associated with each frame (step 34). To determine the sound type associated with a particular frame, the spectral content of other frames surrounding the particular frame will often be considered.
  • information corresponding to the frame may be modified to improve the intelligibility of the output signal (step 36).
  • Information corresponding to frames surrounding a frame of interest may also be modified based on the sound type of the frame of interest.
  • the modification of the frame information will include boosting or reducing the amplitude of the corresponding frame.
  • Other modification techniques are also possible; for example, the reflection coefficients that govern spectral filtering can be modified in accordance with the present invention.
  • the spectral information corresponding to the frames, whether modified or unmodified, is then assembled into a compressed speech signal (step 38). This compressed speech signal can later be decoded to generate an audible speech signal having enhanced intelligibility.
  • Figs. 3 and 4 are portions of a flowchart illustrating a method for use in enhancing the intelligibility of speech signals in accordance with one embodiment of the present invention.
  • the method is operative for identifying unvoiced fricatives and voiced and unvoiced plosives within a speech signal and for adjusting the amplitudes of corresponding frames of the speech signal to enhance intelligibility.
  • Unvoiced fricatives and unvoiced plosives are sounds that are typically lower in volume in a speech signal than other sounds in the signal. In addition, these sounds are usually very important to the intelligibility of the underlying speech.
  • a voiced speech sound is one that is produced by tensing the vocal cords while exhaling, thus giving the sound a specific pitch caused by vocal cord vibration.
  • the spectrum of a voiced speech sound therefore includes a fundamental pitch and harmonics thereof.
  • An unvoiced speech sound is one that is produced by audible turbulence in the vocal tract and for which the vocal cords remain relaxed.
  • The spectrum of an unvoiced speech sound is typically similar to that of white noise.
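That spectral contrast suggests a crude voiced/unvoiced test: a noise-like unvoiced frame changes sign far more often than a pitched, harmonic one. A sketch using zero-crossing rate (the 0.25 threshold is an arbitrary illustrative value, not from the patent):

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    flips = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return flips / max(len(frame) - 1, 1)

def is_voiced(frame, zcr_threshold=0.25):
    """Noise-like unvoiced frames cross zero far more often than
    harmonic, pitched voiced frames."""
    return zero_crossing_rate(frame) < zcr_threshold

# 100 Hz tone sampled at 8 kHz over one 20 ms frame: few crossings -> voiced
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(160)]
# Sign-alternating stand-in for turbulence noise: crossings everywhere -> unvoiced
buzz = [(-1) ** n for n in range(160)]
```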
  • an analog speech signal is first received (step 50) and then digitized (step 52).
  • the digital waveform is then separated into individual frames (step 54). In a preferred approach, these frames are each about 20 to 25 milliseconds in length.
  • a frame-by-frame analysis is then performed to extract and encode data from the frames, such as amplitude, voicing, pitch, and spectral filtering data (step 56).
  • If a frame is identified as including an unvoiced fricative, the amplitude of that frame is increased in a manner that is designed to increase the likelihood that the loudness of the sound in a resulting speech signal exceeds a listener's detection threshold (step 58).
  • the amplitude of the frame can be increased, for example, by a predetermined gain value, to a predetermined amplitude value, or the amplitude can be increased by an amount that depends upon the amplitudes of the other frames within the same speech signal.
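Those three boosting options might look like the following; the function names, default gain, target, and fraction are all illustrative assumptions:

```python
def boost_by_fixed_gain(amp, gain_db=6.0):
    """Apply a predetermined gain value (in dB) to the frame amplitude."""
    return amp * 10 ** (gain_db / 20)

def boost_to_target(amp, target_amp=0.3):
    """Raise the frame to a predetermined amplitude value, never lowering it."""
    return max(amp, target_amp)

def boost_relative(amp, all_frame_amps, fraction=0.25):
    """Raise the frame to at least a fraction of the signal's peak frame
    amplitude, so the boost depends on the other frames in the signal."""
    return max(amp, fraction * max(all_frame_amps))
```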
  • A fricative sound is produced by forcing air from the lungs through a constriction in the vocal tract that generates audible turbulence. Examples of unvoiced fricatives include the "f" in fat, the "s" in sat, and the "ch" in chat. Fricative sounds are characterized by a relatively constant amplitude over multiple sample periods. Thus, an unvoiced fricative can be identified by comparing the amplitudes of multiple successive frames after a decision has been made that the frames correspond to unvoiced sounds.
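The constant-amplitude cue suggests a simple detector over a run of frames already classified as unvoiced. The 20% spread tolerance below is our assumption, not a value from the patent:

```python
def looks_like_fricative(frame_amplitudes, max_relative_spread=0.2):
    """True when successive (already-classified-unvoiced) frames hold a
    roughly constant amplitude, the signature of a sustained fricative."""
    mean = sum(frame_amplitudes) / len(frame_amplitudes)
    if mean <= 0:
        return False  # silence: nothing to classify
    return all(abs(a - mean) / mean <= max_relative_spread
               for a in frame_amplitudes)
```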
  • If a frame is identified as including a voiced plosive, the amplitude of the frame preceding the voiced plosive is reduced (step 60).
  • a plosive is a sound that is produced by the complete stoppage and then sudden release of the breath. Plosive sounds are thus characterized by a sudden drop in amplitude followed by a sudden rise in amplitude within a speech signal.
  • Examples of voiced plosives include the "b" in bait, the "d" in date, and the "g" in gate. Plosives are identified within a speech signal by comparing the amplitudes of adjacent frames in the signal. By decreasing the amplitude of the frame preceding the voiced plosive, the amplitude "spike" that characterizes plosive sounds is accentuated, resulting in enhanced intelligibility.
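The drop-then-rise signature described here suggests a detector along these lines; the threshold ratios are illustrative assumptions, not values from the patent:

```python
def find_plosive_bursts(frame_amplitudes, drop_ratio=0.5, rise_ratio=2.0):
    """Return indices of frames that look like plosive releases: a sudden
    amplitude drop (the stoppage) followed by a sudden rise (the burst)."""
    bursts = []
    for i in range(1, len(frame_amplitudes) - 1):
        prev, cur, nxt = (frame_amplitudes[i - 1], frame_amplitudes[i],
                          frame_amplitudes[i + 1])
        if cur < prev * drop_ratio and nxt > cur * rise_ratio:
            bursts.append(i + 1)  # the release lands in the next frame
    return bursts

# steady vowel, sudden closure at frame 2, burst at frame 3, steady again
bursts = find_plosive_bursts([0.8, 0.8, 0.1, 0.9, 0.8])
```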
  • If a frame is identified as including an unvoiced plosive, the amplitude of the frame preceding the unvoiced plosive is decreased and the amplitude of the frame including the unvoiced plosive is increased (step 62).
  • the amplitude of the frame preceding the unvoiced plosive is decreased to emphasize the amplitude "spike" of the plosive as described above.
  • the amplitude of the frame including the initial component of the unvoiced plosive is increased to increase the likelihood that the loudness of the sound in a resulting speech signal exceeds a listener's detection threshold.
  • a frame-by-frame reconstruction of the digital waveform is next performed using, for example, the amplitude, voicing, pitch, and spectral filtering data (step 64).
  • the individual frames are then concatenated into a complete digital sequence (step 66).
  • a digital to analog conversion is then performed to generate an analog output signal (step 68).
  • the method illustrated in Figs. 3 and 4 can be performed all at one time as part of a real-time intelligibility enhancement procedure or it can be performed in multiple sub-procedures at different times. For example, if the method is implemented within a hearing aid, the entire method will be used to transform an input analog speech signal into an enhanced output analog speech signal for detection by a user of the hearing aid.
  • Alternatively, steps 50 through 62 may be performed as part of a speech signal encoding procedure while steps 64 through 68 are performed as part of a subsequent speech signal decoding procedure.
  • In another embodiment, steps 50 through 56 are performed as part of a speech signal encoding procedure while steps 58 through 68 are performed as part of a subsequent speech signal decoding procedure.
  • the speech signal can be stored within a memory unit or be transferred between remote locations via a communication channel.
  • Steps 50 through 56 are preferably performed using well-known LPC or CELP encoding techniques, and steps 64 through 68 are preferably performed using well-known LPC or CELP decoding techniques.
  • The inventive principles can also be used to enhance the intelligibility of other sound types. Where a particular type of sound presents an intelligibility problem, the modification will typically include a simple boosting of the amplitude of the corresponding frame, although other types of frame modification are also possible in accordance with the present invention (e.g., modifications to the reflection coefficients that govern spectral filtering).
  • Compressed speech signals generated using the inventive principles can usually be decoded using conventional decoders (e.g., LPC or CELP decoders) that have not been modified in accordance with the invention.
  • decoders that have been modified in accordance with the present invention can also be used to decode compressed speech signals that were generated without using the principles of the present invention.
  • systems using the inventive techniques can be upgraded piecemeal in an economical fashion without concern about widespread signal incompatibility within the system.
EP01304339A 2000-06-01 2001-05-16 Method and apparatus for improving the intelligibility of digitally compressed speech Withdrawn EP1168306A3 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/586,183 US6889186B1 (en) 2000-06-01 2000-06-01 Method and apparatus for improving the intelligibility of digitally compressed speech
US586183 2000-06-01

Publications (2)

Publication Number Publication Date
EP1168306A2 true EP1168306A2 (fr) 2002-01-02
EP1168306A3 EP1168306A3 (fr) 2002-10-02

Family

ID=24344649

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01304339A Withdrawn EP1168306A3 (fr) Method and apparatus for improving the intelligibility of digitally compressed speech

Country Status (4)

Country Link
US (1) US6889186B1 (fr)
EP (1) EP1168306A3 (fr)
JP (1) JP3875513B2 (fr)
CA (1) CA2343661C (fr)


Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7454331B2 (en) * 2002-08-30 2008-11-18 Dolby Laboratories Licensing Corporation Controlling loudness of speech in signals that contain speech and other types of audio material
JP4178319B2 (ja) * 2002-09-13 2008-11-12 インターナショナル・ビジネス・マシーンズ・コーポレーション Phase alignment in speech processing
JP2004297273A (ja) * 2003-03-26 2004-10-21 Kenwood Corp Speech signal noise removal apparatus, speech signal noise removal method, and program
EP1629463B1 (fr) * 2003-05-28 2007-08-22 Dolby Laboratories Licensing Corporation Method, apparatus and computer program for calculating and adjusting the perceived loudness of an audio signal
US7539614B2 (en) * 2003-11-14 2009-05-26 Nxp B.V. System and method for audio signal processing using different gain factors for voiced and unvoiced phonemes
US7660715B1 (en) 2004-01-12 2010-02-09 Avaya Inc. Transparent monitoring and intervention to improve automatic adaptation of speech models
CN101048935B (zh) 2004-10-26 2011-03-23 杜比实验室特许公司 Method and device for controlling the specific loudness or partial specific loudness of an audio signal
US8199933B2 (en) 2004-10-26 2012-06-12 Dolby Laboratories Licensing Corporation Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal
US7892648B2 (en) * 2005-01-21 2011-02-22 International Business Machines Corporation SiCOH dielectric material with improved toughness and improved Si-C bonding
JP4644876B2 (ja) * 2005-01-28 2011-03-09 株式会社国際電気通信基礎技術研究所 Speech processing device
AU2006237133B2 (en) * 2005-04-18 2012-01-19 Basf Se Preparation containing at least one conazole fungicide a further fungicide and a stabilising copolymer
US7529670B1 (en) 2005-05-16 2009-05-05 Avaya Inc. Automatic speech recognition system for people with speech-affecting disabilities
US7653543B1 (en) 2006-03-24 2010-01-26 Avaya Inc. Automatic signal adjustment based on intelligibility
EP2002426B1 (fr) * 2006-04-04 2009-09-02 Dolby Laboratories Licensing Corporation Measurement and modification of the loudness of an audio signal in the MDCT domain
TWI517562B (zh) 2006-04-04 2016-01-11 杜比實驗室特許公司 用於將多聲道音訊信號之全面感知響度縮放一期望量的方法、裝置及電腦程式
ATE493794T1 (de) 2006-04-27 2011-01-15 Dolby Lab Licensing Corp Tonverstärkungsregelung mit erfassung von publikumsereignissen auf der basis von spezifischer lautstärke
US8185383B2 (en) * 2006-07-24 2012-05-22 The Regents Of The University Of California Methods and apparatus for adapting speech coders to improve cochlear implant performance
US8725499B2 (en) * 2006-07-31 2014-05-13 Qualcomm Incorporated Systems, methods, and apparatus for signal change detection
US7962342B1 (en) 2006-08-22 2011-06-14 Avaya Inc. Dynamic user interface for the temporarily impaired based on automatic analysis for speech patterns
US7925508B1 (en) 2006-08-22 2011-04-12 Avaya Inc. Detection of extreme hypoglycemia or hyperglycemia based on automatic analysis of speech patterns
JP4940308B2 (ja) 2006-10-20 2012-05-30 ドルビー ラボラトリーズ ライセンシング コーポレイション Audio dynamics processing using a reset
US8521314B2 (en) * 2006-11-01 2013-08-27 Dolby Laboratories Licensing Corporation Hierarchical control path with constraints for audio dynamics processing
US7675411B1 (en) 2007-02-20 2010-03-09 Avaya Inc. Enhancing presence information through the addition of one or more of biotelemetry data and environmental data
US8041344B1 (en) 2007-06-26 2011-10-18 Avaya Inc. Cooling off period prior to sending dependent on user's state
WO2009011827A1 (fr) 2007-07-13 2009-01-22 Dolby Laboratories Licensing Corporation Audio processing using auditory scene analysis and spectral skewness
US20090282228A1 (en) 2008-05-06 2009-11-12 Avaya Inc. Automated Selection of Computer Options
JP5239594B2 (ja) * 2008-07-30 2013-07-17 富士通株式会社 Clip detection apparatus and method
US8401856B2 (en) 2010-05-17 2013-03-19 Avaya Inc. Automatic normalization of spoken syllable duration
US9082414B2 (en) * 2011-09-27 2015-07-14 General Motors Llc Correcting unintelligible synthesized speech
US9161136B2 (en) 2012-08-08 2015-10-13 Avaya Inc. Telecommunications methods and systems providing user specific audio optimization
US9031836B2 (en) 2012-08-08 2015-05-12 Avaya Inc. Method and apparatus for automatic communications system intelligibility testing and optimization
US10176824B2 (en) 2014-03-04 2019-01-08 Indian Institute Of Technology Bombay Method and system for consonant-vowel ratio modification for improving speech perception
JP6481271B2 (ja) * 2014-07-07 2019-03-13 沖電気工業株式会社 Speech decoding device, speech decoding method, speech decoding program, and communication equipment
JP6144719B2 (ja) * 2015-05-12 2017-06-07 株式会社日立製作所 Ultrasonic diagnostic apparatus
KR20210072384A (ko) * 2019-12-09 2021-06-17 삼성전자주식회사 Electronic apparatus and control method thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0076687A1 (fr) * 1981-10-05 1983-04-13 Signatron, Inc. Method and apparatus for improving speech intelligibility
US4468804A (en) * 1982-02-26 1984-08-28 Signatron, Inc. Speech enhancement techniques
EP0140249A1 (fr) * 1983-10-13 1985-05-08 Texas Instruments Incorporated Speech analysis and synthesis with energy normalization
EP0360265A2 (fr) * 1988-09-21 1990-03-28 Nec Corporation Transmission system capable of modifying speech quality by classifying speech signals

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4696039A (en) * 1983-10-13 1987-09-22 Texas Instruments Incorporated Speech analysis/synthesis system with silence suppression
US4852170A (en) * 1986-12-18 1989-07-25 R & D Associates Real time computer speech recognition system
JPH075898A (ja) * 1992-04-28 1995-01-10 Technol Res Assoc Of Medical & Welfare Apparatus Speech signal processing device and plosive extraction device
JPH10124089A (ja) * 1996-10-24 1998-05-15 Sony Corp Speech signal processing device and method, and speech bandwidth extension device and method


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101023469B (zh) * 2004-07-28 2011-08-31 日本福年株式会社 Digital filtering method and device
EP1901286A2 (fr) * 2006-09-13 2008-03-19 Fujitsu Limited Speech enhancement apparatus, speech recording apparatus, speech enhancement program, speech recording program, speech enhancement method, and speech recording method
EP1901286A3 (fr) * 2006-09-13 2008-07-30 Fujitsu Limited Speech enhancement apparatus, speech recording apparatus, speech enhancement program, speech recording program, speech enhancement method, and speech recording method
CN101145346B (zh) * 2006-09-13 2010-10-13 富士通株式会社 Speech enhancement device and speech recording device and method
US8190432B2 (en) 2006-09-13 2012-05-29 Fujitsu Limited Speech enhancement apparatus, speech recording apparatus, speech enhancement program, speech recording program, speech enhancing method, and speech recording method
GB2514662A (en) * 2013-09-18 2014-12-03 Imagination Tech Ltd Voice data transmission with adaptive redundancy
GB2514662B (en) * 2013-09-18 2015-08-05 Imagination Tech Ltd Voice data transmission with adaptive redundancy
US11502973B2 (en) 2013-09-18 2022-11-15 Imagination Technologies Limited Voice data transmission with adaptive redundancy
EP3038106A1 (fr) * 2014-12-24 2016-06-29 Nxp B.V. Enhancement of an audio signal
US20160189707A1 (en) * 2014-12-24 2016-06-30 Nxp B.V. Speech processing
US9779721B2 (en) * 2014-12-24 2017-10-03 Nxp B.V. Speech processing using identified phoneme classes and ambient noise

Also Published As

Publication number Publication date
JP3875513B2 (ja) 2007-01-31
CA2343661C (fr) 2009-01-06
US6889186B1 (en) 2005-05-03
CA2343661A1 (fr) 2001-12-01
JP2002014689A (ja) 2002-01-18
EP1168306A3 (fr) 2002-10-02

Similar Documents

Publication Publication Date Title
US6889186B1 (en) Method and apparatus for improving the intelligibility of digitally compressed speech
CN111179954B (zh) Device and method for reducing quantization noise in a time-domain decoder
US8140326B2 (en) Systems and methods for reducing speech intelligibility while preserving environmental sounds
KR101046147B1 (ko) System and method for providing high-quality expansion and compression of a digital audio signal
JP4222951B2 (ja) Voice communication system and method for handling lost frames
US8401856B2 (en) Automatic normalization of spoken syllable duration
EP0993670B1 (fr) Procede et appareil d'amelioration de qualite de son vocal dans un systeme de communication par son vocal
KR100905585B1 (ko) Method and apparatus for controlling bandwidth extension of a speech signal
DE69730779T2 (de) Improvements in or relating to speech coding
US8326610B2 (en) Producing phonitos based on feature vectors
WO2002065457A2 (fr) Systeme de codage vocal comportant un classifieur musical
US6983242B1 (en) Method for robust classification in speech coding
US6240381B1 (en) Apparatus and methods for detecting onset of a signal
EP0140249B1 (fr) Analyse et synthèse de la parole avec normalisation de l'énergie
EP1609134A1 (fr) Systeme sonore ameliorant l'intelligibilite de la parole
JP3354252B2 (ja) Speech recognition device
US5897614A (en) Method and apparatus for sibilant classification in a speech recognition system
WO2009055718A1 (fr) Production de phonitos basée sur des vecteurs de particularité
GB2343822A (en) Using LSP to alter frequency characteristics of speech
Kaiser et al. Impact of the GSM AMR codec on automatic vowel formant measurement in Praat and VoiceSauce
Fulop et al. Signal Processing in Speech and Hearing Technology
Xie Removing redundancy in speech by modeling forward masking
Viswanathan et al. Medium and low bit rate speech transmission
JPH0619499A (ja) Voiced/unvoiced decision circuit

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

AKX Designation fees paid

Designated state(s): DE FR GB

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20030403