ES2152411T3 - Method and device for adapting speech recognition equipment to the dialectal variants of a language - Google Patents
Method and device for adapting speech recognition equipment to the dialectal variants of a language
Info
- Publication number
- ES2152411T3
- Authority
- ES
- Spain
- Prior art keywords
- speech
- model
- fundamental tone
- language
- recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1807—Speech classification or search using natural language modelling using prosody or stress
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/90—Pitch determination of speech signals
Landscapes
- Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Abstract
The present invention relates to a method and a device for recognizing the dialectal variants of a language from given speech. On the one hand, a speech recognition procedure is carried out on the input speech; on the other hand, the fundamental tone curve is extracted. From the speech recognition, an allophone string is created which, together with the fundamental tone curve, is used to detect the maximum and minimum values of the fundamental tone. The recognized speech is compared with a lexicon containing orthography and transcription in order to find suitable word candidates. The word candidates are then analyzed syntactically. The lexical and syntactic information found in this way is used to create a model of the speech. The fundamental tone contour of the model and the fundamental tone of the input speech are compared, the minimum and maximum values of the fundamental tones are noted, and a difference between the model and the speech is obtained. This difference then influences the model, which is brought into correspondence with the input speech. The model produced in this way is then used for speech recognition, which increases the possibility of understanding the different dialects of a language by artificial means.
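The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the fundamental tone (F0) contour is already available as one value per frame, it finds extrema by simple neighbour comparison, and it pairs model and speech extrema in order of occurrence. The function names `find_extrema` and `extrema_difference` and the sample contours are hypothetical. The role of the allophone string in locating the extrema is left out.

```python
from typing import List, Tuple

# (frame index, F0 value, "max" or "min"); this layout is an assumption for illustration
Extremum = Tuple[int, float, str]


def find_extrema(f0: List[float]) -> List[Extremum]:
    """Return strict local maxima and minima of an F0 contour."""
    extrema: List[Extremum] = []
    for i in range(1, len(f0) - 1):
        if f0[i] > f0[i - 1] and f0[i] > f0[i + 1]:
            extrema.append((i, f0[i], "max"))
        elif f0[i] < f0[i - 1] and f0[i] < f0[i + 1]:
            extrema.append((i, f0[i], "min"))
    return extrema


def extrema_difference(
    model: List[Extremum], speech: List[Extremum]
) -> List[Tuple[int, float]]:
    """Pair extrema in order and return (frame offset, F0 offset) for
    each pair of the same kind; pairs of different kinds are skipped."""
    diffs: List[Tuple[int, float]] = []
    for (m_idx, m_f0, m_kind), (s_idx, s_f0, s_kind) in zip(model, speech):
        if m_kind == s_kind:
            diffs.append((s_idx - m_idx, s_f0 - m_f0))
    return diffs


# Hypothetical contours in Hz: the model's peak and valley occur one frame
# earlier than in the input speech.
model_f0 = [100, 120, 110, 90, 105, 100]
speech_f0 = [100, 110, 125, 95, 85, 100]

differences = extrema_difference(find_extrema(model_f0), find_extrema(speech_f0))
```

In this sketch, a consistent frame offset across the paired extrema is the kind of model-versus-speech difference that could then be fed back to adjust the model toward the speaker's dialect.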
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SE9402284A SE504177C2 (sv) | 1994-06-29 | 1994-06-29 | Metod och anordning att adaptera en taligenkänningsutrustning för dialektala variationer i ett språk |
Publications (1)
Publication Number | Publication Date |
---|---|
ES2152411T3 (es) | 2001-02-01 |
Family
ID=20394556
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
ES95925191T Expired - Lifetime ES2152411T3 (es) | 1994-06-29 | 1995-06-13 | Method and device for adapting speech recognition equipment to the dialectal variants of a language. |
Country Status (7)
Country | Link |
---|---|
US (1) | US5694520A (es) |
EP (1) | EP0767950B1 (es) |
JP (1) | JPH10504404A (es) |
DE (1) | DE69519229T2 (es) |
ES (1) | ES2152411T3 (es) |
SE (1) | SE504177C2 (es) |
WO (1) | WO1996000962A2 (es) |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
SE516526C2 (sv) * | 1993-11-03 | 2002-01-22 | Telia Ab | Metod och anordning vid automatisk extrahering av prosodisk information |
SE514684C2 (sv) * | 1995-06-16 | 2001-04-02 | Telia Ab | Metod vid tal-till-textomvandling |
SE9601811L (sv) * | 1996-05-13 | 1997-11-03 | Telia Ab | Metod och system för tal-till-tal-omvandling med extrahering av prosodiinformation |
SE519273C2 (sv) * | 1996-05-13 | 2003-02-11 | Telia Ab | Förbättringar av , eller med avseende på, tal-till-tal- omvandling |
CN1120469C (zh) * | 1998-02-03 | 2003-09-03 | 西门子公司 | 传输语音数据的方法 |
US6343270B1 (en) * | 1998-12-09 | 2002-01-29 | International Business Machines Corporation | Method for increasing dialect precision and usability in speech recognition and text-to-speech systems |
DE60019229T2 (de) | 1999-10-29 | 2006-03-09 | Matsushita Electric Industrial Co., Ltd., Kadoma | Normalisierung der Grundfrequenz zur Spracherkennung |
CN1159702C (zh) | 2001-04-11 | 2004-07-28 | 国际商业机器公司 | 具有情感的语音-语音翻译系统和方法 |
US20040266337A1 (en) * | 2003-06-25 | 2004-12-30 | Microsoft Corporation | Method and apparatus for synchronizing lyrics |
US7940897B2 (en) | 2005-06-24 | 2011-05-10 | American Express Travel Related Services Company, Inc. | Word recognition system and method for customer and employee assessment |
JP4264841B2 (ja) * | 2006-12-01 | 2009-05-20 | ソニー株式会社 | 音声認識装置および音声認識方法、並びに、プログラム |
JP4882899B2 (ja) * | 2007-07-25 | 2012-02-22 | ソニー株式会社 | 音声解析装置、および音声解析方法、並びにコンピュータ・プログラム |
US8077836B2 (en) | 2008-07-30 | 2011-12-13 | At&T Intellectual Property, I, L.P. | Transparent voice registration and verification method and system |
JP2015087649A (ja) * | 2013-10-31 | 2015-05-07 | シャープ株式会社 | 発話制御装置、方法、発話システム、プログラム、及び発話装置 |
CN104464423A (zh) * | 2014-12-19 | 2015-03-25 | 科大讯飞股份有限公司 | 一种口语考试评测的校标优化方法及系统 |
CN107170454B (zh) * | 2017-05-31 | 2022-04-05 | Oppo广东移动通信有限公司 | 语音识别方法及相关产品 |
US11545132B2 (en) | 2019-08-28 | 2023-01-03 | International Business Machines Corporation | Speech characterization using a synthesized reference audio signal |
CN110716523A (zh) * | 2019-11-06 | 2020-01-21 | 中水三立数据技术股份有限公司 | 一种基于语音识别的泵站智能决策系统及方法 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
SE13680C1 (es) | 1902-02-01 | |||
SE12386C1 (es) | 1901-05-04 | |||
US5268990A (en) * | 1991-01-31 | 1993-12-07 | Sri International | Method for recognizing speech using linguistically-motivated hidden Markov models |
SE516526C2 (sv) * | 1993-11-03 | 2002-01-22 | Telia Ab | Metod och anordning vid automatisk extrahering av prosodisk information |
JP3450411B2 (ja) * | 1994-03-22 | 2003-09-22 | キヤノン株式会社 | 音声情報処理方法及び装置 |
-
1994
- 1994-06-29 SE SE9402284A patent/SE504177C2/sv unknown
-
1995
- 1995-06-13 DE DE69519229T patent/DE69519229T2/de not_active Expired - Fee Related
- 1995-06-13 US US08/532,823 patent/US5694520A/en not_active Expired - Lifetime
- 1995-06-13 WO PCT/SE1995/000710 patent/WO1996000962A2/en active IP Right Grant
- 1995-06-13 ES ES95925191T patent/ES2152411T3/es not_active Expired - Lifetime
- 1995-06-13 EP EP95925191A patent/EP0767950B1/en not_active Expired - Lifetime
- 1995-06-13 JP JP8503055A patent/JPH10504404A/ja not_active Ceased
Also Published As
Publication number | Publication date |
---|---|
EP0767950A2 (en) | 1997-04-16 |
WO1996000962A2 (en) | 1996-01-11 |
SE9402284L (sv) | 1995-12-30 |
DE69519229D1 (de) | 2000-11-30 |
EP0767950B1 (en) | 2000-10-25 |
US5694520A (en) | 1997-12-02 |
SE504177C2 (sv) | 1996-12-02 |
WO1996000962A3 (en) | 1996-02-22 |
SE9402284D0 (sv) | 1994-06-29 |
JPH10504404A (ja) | 1998-04-28 |
DE69519229T2 (de) | 2001-05-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
ES2152411T3 (es) | Method and device for adapting speech recognition equipment to the dialectal variants of a language. | |
ES2153021T3 (es) | Procedimiento y disposicion para la conversion del habla a texto. | |
WO2018121757A1 (zh) | 文本语音播报方法及系统 | |
DE69827667D1 (de) | Vokoder basierter spracherkenner | |
ATE183010T1 (de) | Auf mikrosegmenten basierendes sprachsyntheseverfahren | |
ATE276568T1 (de) | Hierarchische sprachmodelle | |
CN103928023A (zh) | 一种语音评分方法及系统 | |
EP1629464A4 (en) | LANGUAGE RECOGNITION SYSTEM AND PHONETIC BASIC PROCEDURE | |
SE9502202D0 (sv) | Metod vid tal-till-textomvandling | |
Li et al. | Acoustical F0 analysis of continuous Cantonese speech | |
Petrushin et al. | Whispered speech prosody modeling for TTS synthesis | |
Wang et al. | Speaker adaptation of speaking rate-dependent hierarchical prosodic model for Mandarin TTS | |
Renovalles et al. | Text-to-speech systems for filipino using unit selection and deep learning | |
JPH0580791A (ja) | 音声規則合成装置および方法 | |
KR20030033628A (ko) | 판별 및 회귀 트리를 이용한 끊어읽기 강도 자동 레이블링방법 | |
Sarma et al. | A study on detection of intonation events of Assamese speech required for tilt model | |
Anil et al. | Expressive speech synthesis using prosodic modification for Marathi language | |
Hoffmann et al. | Employing Sentence Structure: Syntax Trees as Prosody Generators. | |
Maghbouleh | A logistic regression model for detecting prominences | |
Mertens | Transcription of tonal aspects in speech and a system for automatic tonal annotation | |
Apopei et al. | Towards prosodic phrasing of spontaneous and reading speech for Romanian corpora | |
Chellam et al. | Prosodic modification of speech to incorporate happy and sad emotions | |
JPS4949241B1 (es) | ||
Bartošek et al. | Foot detection in Czech using pitch information and HMM | |
Chan et al. | Prosodic features for a maximum entropy language model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FC2A | Grant refused | Effective date: 19980707 |
FG2A | Definitive protection | Ref document number: 767950 Country of ref document: ES |