JP4551215B2 - Method for performing auditory-articulatory analysis of speech - Google Patents
- Publication number
- JP4551215B2 (application JP2004517988A)
- Authority
- JP
- Japan
- Prior art keywords
- power
- speech
- pronunciation
- frequency
- clear
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/69—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for evaluating synthetic or decoded voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/60—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Quality & Reliability (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
- Telephone Function (AREA)
- Electrically Operated Instructional Devices (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
- Monitoring And Testing Of Transmission In General (AREA)
Description
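The features, aspects, and advantages of the present invention will be better understood with regard to the following description, appended claims, and accompanying drawings.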
Claims (15)
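- A method for performing auditory-articulatory analysis of speech, comprising:
comparing an articulation power (P_A) and a non-articulation power (P_NA) for a speech signal s(t), wherein the articulation power and the non-articulation power are the powers associated with articulation frequencies and non-articulation frequencies of the speech signal, respectively, and wherein the articulation frequencies and the non-articulation frequencies each correspond to frequencies (f) of a modulation spectrum (A_i(m,f)) generated by performing a Fourier transform on each frame (m) of a plurality of envelopes (a_i(t)) obtained from a plurality of critical band signals, the critical band signals being obtained by filtering and processing the speech signal (s(t)); and
assessing speech quality based on the comparison between the articulation power and the non-articulation power,
wherein the step of assessing speech quality comprises
determining a local speech quality (LSQ(m)) for each of a plurality of frames m in the speech signal using the comparison between the articulation power and the non-articulation power,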
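- The method of claim 1, wherein the articulation frequencies are approximately 2 to 12.5 Hz.
- The method of claim 1, wherein the articulation frequencies correspond to the speed of human articulation.
- The method of claim 1, wherein the non-articulation frequencies are higher than the articulation frequencies.
- The method of claim 1, wherein the comparison between the articulation power and the non-articulation power is a ratio (ANR(m, i)) of the articulation power to the non-articulation power.
- The method of claim 1, wherein the comparison between the articulation power and the non-articulation power is a difference between the articulation power and the non-articulation power.
- The method of claim 9, wherein the overall speech quality is further determined using a log power P_s.
- The method of claim 1, wherein an overall speech quality is determined using a log power P_s.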
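- The method of claim 1, wherein the comparing step comprises:
performing a Fourier transform on each frame m of each of a plurality of envelopes a_i(t) obtained from a plurality of critical band signals, to generate a modulation spectrum (A_i(m,f)), where f is frequency.
- The method of claim 1, wherein the comparing step comprises:
filtering the speech signal (s(t)) to obtain a plurality of critical band signals (s_i(t)) for respective channels i, the critical band signal s_i(t) being equal to s(t) × h_i(t), where h_i(t) is a cochlear filter.
- The method of claim 13, wherein the comparing step comprises:
performing envelope analysis on the plurality of critical band signals to obtain a plurality of modulation spectra (A_i(m,f)), where f is frequency.
- The method of claim 14, wherein the comparing step comprises:
performing a Fourier transform on each of the plurality of modulation spectra.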
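The comparison recited in the claims can be illustrated with a minimal sketch for a single channel i. It assumes the envelope a_i(t) has already been extracted from a critical band signal and is available as a list of samples. The envelope sampling rate `fs`, the frame length, the use of non-overlapping frames, the treatment of everything above 12.5 Hz (up to the Nyquist frequency) as non-articulation frequencies, and the `eps` guard are all illustrative assumptions, and the function names are hypothetical. The mapping from the ratio to LSQ(m) is not given in this excerpt, so it is not sketched here.

```python
import cmath
import math

# Articulation band from the dependent claim (about 2 to 12.5 Hz).
ARTICULATION_BAND_HZ = (2.0, 12.5)


def modulation_spectrum(frame, fs):
    """Power of the one-sided DFT of one envelope frame, as (frequency_hz, power) pairs.

    The DC bin is skipped, so a constant envelope offset is not counted.
    """
    n = len(frame)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append((k * fs / n, abs(coeff) ** 2))
    return spectrum


def articulation_powers(frame, fs, band=ARTICULATION_BAND_HZ):
    """Return (P_A, P_NA) for one frame: power inside the band and power above it."""
    low, high = band
    p_a = p_na = 0.0
    for freq, power in modulation_spectrum(frame, fs):
        if low <= freq <= high:
            p_a += power
        elif freq > high:
            p_na += power
    return p_a, p_na


def anr_per_frame(envelope, fs, frame_len, eps=1e-12):
    """Articulation-to-non-articulation ratio for each non-overlapping frame m.

    eps is a hypothetical guard against division by zero, not taken from the claims.
    """
    ratios = []
    for start in range(0, len(envelope) - frame_len + 1, frame_len):
        p_a, p_na = articulation_powers(envelope[start:start + frame_len], fs)
        ratios.append((p_a + eps) / (p_na + eps))
    return ratios


if __name__ == "__main__":
    fs = 100  # assumed envelope sampling rate in Hz
    # Synthetic envelopes: one modulated at 4 Hz, one modulated at 30 Hz.
    slow = [1 + 0.5 * math.sin(2 * math.pi * 4 * t / fs) for t in range(200)]
    fast = [1 + 0.5 * math.sin(2 * math.pi * 30 * t / fs) for t in range(200)]
    print(anr_per_frame(slow, fs, frame_len=100))
    print(anr_per_frame(fast, fs, frame_len=100))
```

With these synthetic inputs, the envelope modulated at 4 Hz concentrates its power inside the articulation band, so its ratio is far above 1, while the envelope modulated at 30 Hz gives a ratio far below 1. The difference-based comparison in the other dependent claim could be obtained from the same `articulation_powers` output by subtracting instead of dividing.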
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/186,840 US7165025B2 (en) | 2002-07-01 | 2002-07-01 | Auditory-articulatory analysis for speech quality assessment |
PCT/US2003/020355 WO2004003889A1 (en) | 2002-07-01 | 2003-06-27 | Auditory-articulatory analysis for speech quality assessment |
Publications (3)
Publication Number | Publication Date |
---|---|
JP2005531811A | 2005-10-20 |
JP2005531811A5 | 2006-05-25 |
JP4551215B2 | 2010-09-22 |
Family ID: 29779948
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
JP2004517988A | Expired - Fee Related | JP4551215B2 | 2002-07-01 | 2003-06-27 | Method for performing auditory-articulatory analysis of speech |
Country Status (7)
Country | Link |
---|---|
US (1) | US7165025B2 (ja) |
EP (1) | EP1518223A1 (ja) |
JP (1) | JP4551215B2 (ja) |
KR (1) | KR101048278B1 (ja) |
CN (1) | CN1550001A (ja) |
AU (1) | AU2003253743A1 (ja) |
WO (1) | WO2004003889A1 (ja) |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7308403B2 (en) * | 2002-07-01 | 2007-12-11 | Lucent Technologies Inc. | Compensation for utterance dependent articulation for speech quality assessment |
US20040167774A1 (en) * | 2002-11-27 | 2004-08-26 | University Of Florida | Audio-based method, system, and apparatus for measurement of voice quality |
US7327985B2 (en) * | 2003-01-21 | 2008-02-05 | Telefonaktiebolaget Lm Ericsson (Publ) | Mapping objective voice quality metrics to a MOS domain for field measurements |
DE60305306T2 (de) * | 2003-06-25 | 2007-01-18 | Psytechnics Ltd. | Apparatus and method for binaural quality assessment |
US7305341B2 (en) * | 2003-06-25 | 2007-12-04 | Lucent Technologies Inc. | Method of reflecting time/language distortion in objective speech quality assessment |
US20050228655A1 (en) * | 2004-04-05 | 2005-10-13 | Lucent Technologies, Inc. | Real-time objective voice analyzer |
US7742914B2 (en) * | 2005-03-07 | 2010-06-22 | Daniel A. Kosek | Audio spectral noise reduction method and apparatus |
US7515966B1 (en) | 2005-03-14 | 2009-04-07 | Advanced Bionics, Llc | Sound processing and stimulation systems and methods for use with cochlear implant devices |
US7426414B1 (en) * | 2005-03-14 | 2008-09-16 | Advanced Bionics, Llc | Sound processing and stimulation systems and methods for use with cochlear implant devices |
US7856355B2 (en) * | 2005-07-05 | 2010-12-21 | Alcatel-Lucent Usa Inc. | Speech quality assessment method and system |
WO2007043971A1 (en) * | 2005-10-10 | 2007-04-19 | Olympus Technologies Singapore Pte Ltd | Handheld electronic processing apparatus and an energy storage accessory fixable thereto |
US8296131B2 (en) * | 2008-12-30 | 2012-10-23 | Audiocodes Ltd. | Method and apparatus of providing a quality measure for an output voice signal generated to reproduce an input voice signal |
CN101996628A (zh) * | 2009-08-21 | 2011-03-30 | 索尼株式会社 | Method and apparatus for extracting prosodic features of a speech signal |
CN109496334B (zh) | 2016-08-09 | 2022-03-11 | 华为技术有限公司 | Apparatus and method for evaluating speech quality |
CN106782610B (zh) * | 2016-11-15 | 2019-09-20 | 福建星网智慧科技股份有限公司 | Sound quality test method for audio conferencing |
CN106653004B (zh) * | 2016-12-26 | 2019-07-26 | 苏州大学 | Speaker recognition feature extraction method using perceptual-spectrogram-warped cochlear filter coefficients |
EP3961624A1 (de) * | 2020-08-28 | 2022-03-02 | Sivantos Pte. Ltd. | Method for operating a hearing device as a function of a speech signal |
DE102020210919A1 (de) * | 2020-08-28 | 2022-03-03 | Sivantos Pte. Ltd. | Method for evaluating the speech quality of a speech signal by means of a hearing device |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3971034A (en) * | 1971-02-09 | 1976-07-20 | Dektor Counterintelligence And Security, Inc. | Physiological response analysis method and apparatus |
JPH078080B2 (ja) * | 1989-06-29 | 1995-01-30 | 松下電器産業株式会社 | Sound quality evaluation device |
WO1992015090A1 (en) * | 1991-02-22 | 1992-09-03 | Seaway Technologies, Inc. | Acoustic method and apparatus for identifying human sonic sources |
US5454375A (en) * | 1993-10-21 | 1995-10-03 | Glottal Enterprises | Pneumotachograph mask or mouthpiece coupling element for airflow measurement during speech or singing |
GB9604315D0 (en) * | 1996-02-29 | 1996-05-01 | British Telecomm | Training process |
MX9800434A (es) * | 1995-07-27 | 1998-04-30 | British Telecomm | Signal quality assessment. |
US6052662A (en) * | 1997-01-30 | 2000-04-18 | Regents Of The University Of California | Speech processing using maximum likelihood continuity mapping |
US6246978B1 (en) * | 1999-05-18 | 2001-06-12 | Mci Worldcom, Inc. | Method and system for measurement of speech distortion from samples of telephonic voice signals |
JP4463905B2 (ja) * | 1999-09-28 | 2010-05-19 | 隆行 荒井 | Speech processing method, apparatus, and public address system |
US7308403B2 (en) * | 2002-07-01 | 2007-12-11 | Lucent Technologies Inc. | Compensation for utterance dependent articulation for speech quality assessment |
US7305341B2 (en) * | 2003-06-25 | 2007-12-04 | Lucent Technologies Inc. | Method of reflecting time/language distortion in objective speech quality assessment |
2002
- 2002-07-01: US application US10/186,840, publication US7165025B2; status: Active
2003
- 2003-06-27: AU application AU2003253743A, publication AU2003253743A1; status: Abandoned
- 2003-06-27: JP application JP2004517988A, publication JP4551215B2; status: Expired - Fee Related
- 2003-06-27: CN application CNA038009382, publication CN1550001A; status: Pending
- 2003-06-27: WO application PCT/US2003/020355, publication WO2004003889A1; status: Application Filing
- 2003-06-27: EP application EP03762155A, publication EP1518223A1; status: Ceased
- 2003-06-27: KR application KR1020047003129A, publication KR101048278B1; status: IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
KR20050012711A (ko) | 2005-02-02 |
AU2003253743A1 (en) | 2004-01-19 |
KR101048278B1 (ko) | 2011-07-13 |
EP1518223A1 (en) | 2005-03-30 |
WO2004003889A1 (en) | 2004-01-08 |
US7165025B2 (en) | 2007-01-16 |
CN1550001A (zh) | 2004-11-24 |
US20040002852A1 (en) | 2004-01-01 |
JP2005531811A (ja) | 2005-10-20 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP4551215B2 (ja) | Method for performing auditory-articulatory analysis of speech | |
US7158933B2 (en) | Multi-channel speech enhancement system and method based on psychoacoustic masking effects | |
EP3598441B1 (en) | Systems and methods for modifying an audio signal using custom psychoacoustic models | |
EP3899936B1 (en) | Source separation using an estimation and control of sound quality | |
US8744846B2 (en) | Procedure for processing noisy speech signals, and apparatus and computer program therefor | |
Relaño-Iborra et al. | A speech-based computational auditory signal processing and perception model | |
JP4301514B2 (ja) | Method for evaluating speech quality | |
Crochiere et al. | An interpretation of the log likelihood ratio as a measure of waveform coder performance | |
US10319394B2 (en) | Apparatus and method for improving speech intelligibility in background noise by amplification and compression | |
Huber et al. | Objective assessment of a speech enhancement scheme with an automatic speech recognition-based system | |
US20090161882A1 (en) | Method of Measuring an Audio Signal Perceived Quality Degraded by a Noise Presence | |
Steeneken et al. | Basics of the STI measuring method | |
Senoussaoui et al. | SRMR variants for improved blind room acoustics characterization | |
Cosentino et al. | Towards objective measures of speech intelligibility for cochlear implant users in reverberant environments | |
Chanda et al. | Speech intelligibility enhancement using tunable equalization filter | |
US20240071411A1 (en) | Determining dialog quality metrics of a mixed audio signal | |
Rosca et al. | Multichannel voice detection in adverse environments | |
EP3718476B1 (en) | Systems and methods for evaluating hearing health | |
CN116686047A (zh) | Determining dialog quality metrics of a mixed audio signal | |
Pourmand et al. | Computational auditory models in predicting noise reduction performance for wideband telephony applications | |
Nikhil et al. | Impact of ERB and bark scales on perceptual distortion based near-end speech enhancement | |
Roßbach et al. | Prediction of speech intelligibility based on deep machine listening: Influence of training data and simulation of hearing impairment | |
Chetan et al. | Lower and higher critical band enhancement to attain intelligibility improvement in noisy environment | |
Kressner | Auditory models for evaluating algorithms | |
Speech Transmission and Music Acoustics | PREDICTED SPEECH INTELLIGIBILITY AND LOUDNESS IN MODEL-BASED PRELIMINARY HEARING-AID FITTING |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
2006-03-30 | A521 | Written amendment | JAPANESE INTERMEDIATE CODE: A523 |
2006-03-30 | A621 | Written request for application examination | JAPANESE INTERMEDIATE CODE: A621 |
2009-05-07 | A131 | Notification of reasons for refusal | JAPANESE INTERMEDIATE CODE: A131 |
2009-08-07 | A601 | Written request for extension of time | JAPANESE INTERMEDIATE CODE: A601 |
2009-08-14 | A602 | Written permission of extension of time | JAPANESE INTERMEDIATE CODE: A602 |
2009-09-07 | A601 | Written request for extension of time | JAPANESE INTERMEDIATE CODE: A601 |
2009-09-14 | A602 | Written permission of extension of time | JAPANESE INTERMEDIATE CODE: A602 |
2009-10-06 | A521 | Written amendment | JAPANESE INTERMEDIATE CODE: A523 |
2009-11-09 | A02 | Decision of refusal | JAPANESE INTERMEDIATE CODE: A02 |
2010-03-09 | A521 | Written amendment | JAPANESE INTERMEDIATE CODE: A523 |
2010-03-10 | A521 | Written amendment | JAPANESE INTERMEDIATE CODE: A821 |
2010-05-25 | A911 | Transfer to examiner for re-examination before appeal (zenchi) | JAPANESE INTERMEDIATE CODE: A911 |
 | TRDD | Decision of grant or rejection written | |
2010-06-16 | A01 | Written decision to grant a patent or to grant a registration (utility model) | JAPANESE INTERMEDIATE CODE: A01 |
 | A01 | Written decision to grant a patent or to grant a registration (utility model) | JAPANESE INTERMEDIATE CODE: A01 |
2010-07-09 | A61 | First payment of annual fees (during grant procedure) | JAPANESE INTERMEDIATE CODE: A61 |
 | R150 | Certificate of patent or registration of utility model | JAPANESE INTERMEDIATE CODE: R150 |
 | FPAY | Renewal fee payment (event date is renewal date of database) | PAYMENT UNTIL: 2013-07-16; Year of fee payment: 3 |
 | R250 | Receipt of annual fees | JAPANESE INTERMEDIATE CODE: R250 |
 | R250 | Receipt of annual fees | JAPANESE INTERMEDIATE CODE: R250 |
 | R250 | Receipt of annual fees | JAPANESE INTERMEDIATE CODE: R250 |
 | R250 | Receipt of annual fees | JAPANESE INTERMEDIATE CODE: R250 |
 | R250 | Receipt of annual fees | JAPANESE INTERMEDIATE CODE: R250 |
 | LAPS | Cancellation because of no payment of annual fees | |