TWI328798B - Method and apparatus for estimating degree of similarity between voices - Google Patents

Method and apparatus for estimating degree of similarity between voices

Info

Publication number
TWI328798B
Authority
TW
Taiwan
Prior art keywords
matrix
doc
voices
similarity
odd
Prior art date
Application number
TW096109552A
Other languages
English (en)
Other versions
TW200805252A (en)
Inventor
Mikio Tohyama
Michiko Kazama
Satoru Goto
Takehiko Kawahara
Yasuo Yoshioka
Original Assignee
Yamaha Corp
Univ Waseda
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yamaha Corp, Univ Waseda
Publication of TW200805252A
Application granted
Publication of TWI328798B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/08 Use of distortion metrics or a particular distance between probe pattern and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Computational Linguistics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Traffic Control Systems (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Complex Calculations (AREA)

Description

Patent Application No. 096109552, replacement sheets of the Chinese-language drawings (October 2010)
XI. Drawings:
[Drawing sheets 114818-fig-991008.doc; only the labels below survive text extraction.]
Unnumbered sheet: registration information generator; ID input unit.
Unnumbered sheet: matrix elements {a_jk} (j = 1 to N, k = 1 to N).
Fig. 4: steps S16, S17.
Fig. 7: no legible labels.
Fig. 8: inter-band correlation matrix of the input voice (2N×2N matrix); inter-band correlation matrix read from the template DB (2N×2N matrix); small matrices (N×N matrices) labeled [odd, odd], [even, even], [odd, even] and [even, odd]; comparison; decision.
Final two sheets: axis labeled in percent (%).
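The Fig. 8 labels suggest that each 2N×2N inter-band correlation matrix is broken into four N×N small matrices by odd and even index, and that matching small matrices from the input voice and the template DB are then compared. A minimal Python sketch of that reading follows. The function names are hypothetical, and the cosine-similarity measure is an assumption, since the comparison formula is not legible in this text.

```python
def split_by_parity(matrix):
    """Split a 2N x 2N matrix into four N x N small matrices by index parity.

    Indices are counted from 1, as in the figure labels, so "odd" rows and
    columns are Python indices 0, 2, 4, ...
    """
    size = len(matrix)
    if size % 2 != 0 or any(len(row) != size for row in matrix):
        raise ValueError("expected a square matrix with an even side length")
    picks = {"odd": range(0, size, 2), "even": range(1, size, 2)}
    return {
        (r, c): [[matrix[i][j] for j in picks[c]] for i in picks[r]]
        for r in picks
        for c in picks
    }


def cosine_similarity(a, b):
    """Assumed comparison measure: cosine similarity of the flattened matrices."""
    flat_a = [x for row in a for x in row]
    flat_b = [x for row in b for x in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm = (sum(x * x for x in flat_a) * sum(y * y for y in flat_b)) ** 0.5
    return dot / norm if norm else 0.0


def compare_small_matrices(input_matrix, template_matrix):
    """Compare each parity block of the input with the same block of the template."""
    ours = split_by_parity(input_matrix)
    theirs = split_by_parity(template_matrix)
    return {key: cosine_similarity(ours[key], theirs[key]) for key in ours}
```

Calling `compare_small_matrices(input_matrix, template_matrix)` returns one score per parity pair, keyed by `("odd", "odd")`, `("odd", "even")`, `("even", "odd")` and `("even", "even")`. How those scores would feed the final decision step is not recoverable from the drawing labels.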
TW096109552A 2006-03-24 2007-03-20 Method and apparatus for estimating degree of similarity between voices TWI328798B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2006081853A JP4527679B2 (ja) 2006-03-24 Method and apparatus for evaluating the degree of similarity of voices

Publications (2)

Publication Number Publication Date
TW200805252A TW200805252A (en) 2008-01-16
TWI328798B TWI328798B (en) 2010-08-11

Family

ID=38191379

Family Applications (1)

Application Number Title Priority Date Filing Date
TW096109552A TWI328798B (en) 2006-03-24 2007-03-20 Method and apparatus for estimating degree of similarity between voices

Country Status (6)

Country Link
US (1) US7996213B2 (zh)
EP (1) EP1837863B1 (zh)
JP (1) JP4527679B2 (zh)
KR (1) KR100919546B1 (zh)
CN (1) CN101042870B (zh)
TW (1) TWI328798B (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8655655B2 (en) 2010-12-03 2014-02-18 Industrial Technology Research Institute Sound event detecting module for a sound event recognition system and method thereof

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8140331B2 (en) * 2007-07-06 2012-03-20 Xia Lou Feature extraction for identification and classification of audio signals
CN101221760B (zh) * 2008-01-30 2010-12-22 中国科学院计算技术研究所 Audio matching method and system
CN102956238B (zh) * 2011-08-19 2016-02-10 杜比实验室特许公司 Method and device for detecting repeating patterns in a sequence of audio frames
US20140095161A1 (en) * 2012-09-28 2014-04-03 At&T Intellectual Property I, L.P. System and method for channel equalization using characteristics of an unknown signal
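CN104580754B (zh) * 2014-12-03 2018-08-17 贵阳朗玛信息技术股份有限公司 IVR system and IVR-based chat speed-matching method
CN105590632B (zh) * 2015-12-16 2019-01-29 广东德诚科教有限公司 S-T teaching process analysis method based on voice similarity recognition
CN105679324B (zh) * 2015-12-29 2019-03-22 福建星网视易信息系统有限公司 Method and apparatus for voiceprint recognition similarity scoring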

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4720863A (en) * 1982-11-03 1988-01-19 Itt Defense Communications Method and apparatus for text-independent speaker recognition
JPS60158498A (ja) * 1984-01-27 1985-08-19 株式会社リコー Pattern matching device
JPH01103759A (ja) * 1987-10-16 1989-04-20 Nec Corp Password detection device
JPH03266898A (ja) * 1990-03-16 1991-11-27 Fujitsu Ltd Large-vocabulary speech recognition processing system
US5583961A (en) * 1993-03-25 1996-12-10 British Telecommunications Public Limited Company Speaker recognition using spectral coefficients normalized with respect to unequal frequency bands
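KR100484210B1 (ko) * 1996-05-03 2006-07-25 위니베르시떼 피에르 에 마리 퀴리 Speaker voice recognition method using a predictive model, particularly for access control applications
JP2000330590A (ja) * 1999-05-21 2000-11-30 Ricoh Co Ltd Speaker verification method and speaker verification system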
US7260226B1 (en) * 1999-08-26 2007-08-21 Sony Corporation Information retrieving method, information retrieving device, information storing method and information storage device
US7024359B2 (en) 2001-01-31 2006-04-04 Qualcomm Incorporated Distributed voice recognition system using acoustic feature vector modification
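JP3699912B2 (ja) * 2001-07-26 2005-09-28 株式会社東芝 Speech feature extraction method, apparatus, and program
JP3969079B2 (ja) 2001-12-12 2007-08-29 ソニー株式会社 Speech recognition apparatus and method, recording medium, and program
JP4314016B2 (ja) * 2002-11-01 2009-08-12 株式会社東芝 Person recognition apparatus and passage control apparatus
JP4510539B2 (ja) * 2004-07-26 2010-07-28 日本放送協会 Specific-speaker voice output apparatus and specific-speaker determination program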

Also Published As

Publication number Publication date
JP2007256689A (ja) 2007-10-04
TW200805252A (en) 2008-01-16
US7996213B2 (en) 2011-08-09
CN101042870B (zh) 2010-12-29
EP1837863A3 (en) 2011-11-16
KR20070096913A (ko) 2007-10-02
JP4527679B2 (ja) 2010-08-18
US20070225979A1 (en) 2007-09-27
CN101042870A (zh) 2007-09-26
EP1837863B1 (en) 2013-02-27
KR100919546B1 (ko) 2009-10-01
EP1837863A2 (en) 2007-09-26

Similar Documents

Publication Publication Date Title
TWI328798B (en) Method and apparatus for estimating degree of similarity between voices
Herrera Viedma Global trends in coronavirus research at the time of Covid-19: A general bibliometric approach and content analysis using SciMAT
Suryawanshi et al. Limited cross-variant immunity from SARS-CoV-2 Omicron without vaccination
Llanes et al. Betacoronavirus genomes: how genomic information has been used to deal with past outbreaks and the COVID-19 pandemic
JP2011508310A5 (zh)
Song et al. Etiologic studies of epidemic hemorrhagic fever (hemorrhagic fever with renal syndrome)
WO2004107322A3 (en) Systems and methods utilizing natural language medical records
WO2007121292A3 (en) Portable media player enabled to obtain media previews
JPWO2019229543A5 (zh)
Tse et al. Effects of vocal fold epithelium removal on vibration in an excised human larynx model
Vaatainen Rumbasta rampaan: Vammaisen naistanssijan ruumiillisuus pyoratuolikilpatanssissa.
Hernández-García et al. Laryngotracheal complications after intubation for COVID-19: a multicenter study
Khan et al. Saliva for the diagnosis of COVID-19
Wu et al. Measurement of the sound transmission characteristics of normal neck tissue using a reflectionless uniform tube
Inan et al. A stethoscope for the knee: Investigating joint acoustical emissions as novel biomarkers for wearable joint health assessment
Kwong et al. Vaccine design reaches the atomic level
Paroni Human Beatboxing: pushing the boundaries of human voice production
DK1257170T3 (da) Method for anonymous registration, storage and use of body material and/or information derived therefrom
Park et al. Voice Onset Time in Patients with Bilateral Vocal Nodules
Loney et al. Innate and Adaptive Immune Genes Associated with MERS-CoV Infection in Dromedaries
Top Advancing the Science of Vaccine Safety: Introduction to the International Network of Special Immunisation Services
Onishchenko et al. EPIDEMICS OF SEVERE ACUTE RESPIRATORY SYNDROME (SARS) IN THE WORLD
Chandrasena et al. Effect of genotyping on the severity of rotavirus Gastroenteritis
Muhammad Reconciling Differences Pertaining to the Origin of SARS-CoV-2
Won et al. Virtual reality based assessment and education tool for auditory hallucination symptoms

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees