NO316847B1 - Method and arrangement for speech-to-text conversion
- Publication number
- NO316847B1
- Authority
- NO
- Norway
- Prior art keywords
- speech
- model
- accent
- words
- information
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1807—Speech classification or search using natural language modelling using prosody or stress
Landscapes
- Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SE9502202A SE514684C2 (sv) | 1995-06-16 | 1995-06-16 | Method for speech-to-text conversion |
Publications (3)
Publication Number | Publication Date |
---|---|
NO962463D0 (no) | 1996-06-12 |
NO962463L (no) | 1996-12-17 |
NO316847B1 (no) | 2004-06-01 |
Family ID: 20398649
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NO19962463A NO316847B1 (no) | 1995-06-16 | 1996-06-12 | Method and arrangement for speech-to-text conversion |
Country Status (7)
Country | Link |
---|---|
US (1) | US5806033A (en) |
EP (1) | EP0749109B1 (en) |
JP (1) | JPH0922297A (ja) |
DE (1) | DE69618503T2 (de) |
DK (1) | DK0749109T3 (da) |
NO (1) | NO316847B1 (no) |
SE (1) | SE514684C2 (sv) |
Families Citing this family (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH1039895A (ja) * | 1996-07-25 | 1998-02-13 | Matsushita Electric Ind Co Ltd | Speech synthesis method and apparatus |
KR100238189B1 (ko) * | 1997-10-16 | 2000-01-15 | 윤종용 | Multilingual TTS apparatus and multilingual TTS processing method |
JP4267101B2 (ja) | 1997-11-17 | 2009-05-27 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Speech identification apparatus, pronunciation correction apparatus, and methods therefor |
US6377927B1 (en) | 1998-10-07 | 2002-04-23 | Masoud Loghmani | Voice-optimized database system and method of using same |
US7283973B1 (en) | 1998-10-07 | 2007-10-16 | Logic Tree Corporation | Multi-modal voice-enabled content access and delivery system |
US6941273B1 (en) | 1998-10-07 | 2005-09-06 | Masoud Loghmani | Telephony-data application interface apparatus and method for multi-modal access to data applications |
WO2001003112A1 (en) * | 1999-07-06 | 2001-01-11 | James Quest | Speech recognition system and method |
AU763362B2 (en) * | 1999-07-06 | 2003-07-17 | James Quest | Speech recognition system and method |
US6526382B1 (en) | 1999-12-07 | 2003-02-25 | Comverse, Inc. | Language-oriented user interfaces for voice activated services |
US20080147404A1 (en) * | 2000-05-15 | 2008-06-19 | Nusuara Technologies Sdn Bhd | System and methods for accent classification and adaptation |
US6948129B1 (en) | 2001-02-08 | 2005-09-20 | Masoud S Loghmani | Multi-modal, multi-path user interface for simultaneous access to internet data over multiple media |
US8000320B2 (en) * | 2001-02-08 | 2011-08-16 | Logic Tree Corporation | System for providing multi-phased, multi-modal access to content through voice and data devices |
US7200142B1 (en) | 2001-02-08 | 2007-04-03 | Logic Tree Corporation | System for providing multi-phased, multi-modal access to content through voice and data devices |
EP1298647B1 (en) * | 2001-09-28 | 2005-11-16 | Alcatel | A communication device and a method for transmitting and receiving of natural speech, comprising a speech recognition module coupled to an encoder |
GB2388738B (en) | 2001-11-03 | 2004-06-02 | Dremedia Ltd | Time ordered indexing of audio data |
GB2381688B (en) | 2001-11-03 | 2004-09-22 | Dremedia Ltd | Time ordered indexing of audio-visual data |
US20030115169A1 (en) * | 2001-12-17 | 2003-06-19 | Hongzhuan Ye | System and method for management of transcribed documents |
US6990445B2 (en) * | 2001-12-17 | 2006-01-24 | Xl8 Systems, Inc. | System and method for speech recognition and transcription |
US7280968B2 (en) * | 2003-03-25 | 2007-10-09 | International Business Machines Corporation | Synthetically generated speech responses including prosodic characteristics of speech inputs |
US20050055197A1 (en) * | 2003-08-14 | 2005-03-10 | Sviatoslav Karavansky | Linguographic method of compiling word dictionaries and lexicons for the memories of electronic speech-recognition devices |
JP4264841B2 (ja) | 2006-12-01 | 2009-05-20 | ソニー株式会社 | Speech recognition apparatus, speech recognition method, and program |
US8315870B2 (en) * | 2007-08-22 | 2012-11-20 | Nec Corporation | Rescoring speech recognition hypothesis using prosodic likelihood |
US8401856B2 (en) * | 2010-05-17 | 2013-03-19 | Avaya Inc. | Automatic normalization of spoken syllable duration |
US9009049B2 (en) * | 2012-06-06 | 2015-04-14 | Spansion Llc | Recognition of speech with different accents |
US9966064B2 (en) | 2012-07-18 | 2018-05-08 | International Business Machines Corporation | Dialect-specific acoustic language modeling and speech recognition |
KR102084646B1 (ko) * | 2013-07-04 | 2020-04-14 | 삼성전자주식회사 | Speech recognition apparatus and speech recognition method |
US10468050B2 (en) | 2017-03-29 | 2019-11-05 | Microsoft Technology Licensing, Llc | Voice synthesized participatory rhyming chat bot |
US11809958B2 (en) * | 2020-06-10 | 2023-11-07 | Capital One Services, Llc | Systems and methods for automatic decision-making with user-configured criteria using multi-channel data inputs |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0029048B1 (en) * | 1979-05-28 | 1985-05-29 | The University Of Melbourne | Speech processor |
JPH05197389A (ja) * | 1991-08-13 | 1993-08-06 | Toshiba Corp | Speech recognition apparatus |
SE500277C2 (sv) * | 1993-05-10 | 1994-05-24 | Televerket | Arrangement for increasing speech comprehension when translating speech from a first language to a second language |
SE516526C2 (sv) * | 1993-11-03 | 2002-01-22 | Telia Ab | Method and arrangement for automatic extraction of prosodic information |
SE504177C2 (sv) * | 1994-06-29 | 1996-12-02 | Telia Ab | Method and arrangement for adapting speech recognition equipment to dialectal variations in a language |
1995
- 1995-06-16: SE application SE9502202A, patent SE514684C2 (sv), status unknown
1996
- 1996-06-04: DE application DE69618503T, patent DE69618503T2 (de), not active, expired (fee related)
- 1996-06-04: EP application EP96850108A, patent EP0749109B1 (en), not active, expired (lifetime)
- 1996-06-04: DK application DK96850108T, patent DK0749109T3 (da), active
- 1996-06-12: NO application NO19962463A, patent NO316847B1 (no), status unknown
- 1996-06-14: JP application JP8175484A, patent JPH0922297A (ja), active, pending
- 1996-06-17: US application US08/665,728, patent US5806033A (en), not active, expired (lifetime)
Also Published As
Publication number | Publication date |
---|---|
SE9502202L (sv) | 1996-12-17 |
SE9502202D0 (sv) | 1995-06-16 |
EP0749109A3 (en) | 1998-04-29 |
NO962463L (no) | 1996-12-17 |
DE69618503D1 (de) | 2002-02-21 |
DK0749109T3 (da) | 2002-03-25 |
DE69618503T2 (de) | 2002-08-29 |
NO962463D0 (no) | 1996-06-12 |
SE514684C2 (sv) | 2001-04-02 |
JPH0922297A (ja) | 1997-01-21 |
EP0749109B1 (en) | 2002-01-16 |
EP0749109A2 (en) | 1996-12-18 |
US5806033A (en) | 1998-09-08 |
Similar Documents
Publication | Title |
---|---|
NO316847B1 (no) | Method and arrangement for speech-to-text conversion |
US5752227A (en) | Method and arrangement for speech to text conversion |
US6085160A (en) | Language independent speech recognition |
US7937262B2 (en) | Method, apparatus, and computer program product for machine translation |
Moberg | Contributions to Multilingual Low-Footprint TTS System for Hand-Held Devices |
US6792407B2 (en) | Text selection and recording by feedback and adaptation for development of personalized text-to-speech systems |
US5758023A (en) | Multi-language speech recognition system |
US6910012B2 (en) | Method and system for speech recognition using phonetically similar word alternatives |
KR20190085879A (ko) | Multilingual text-to-speech synthesis method |
CN106782603B (zh) | Intelligent speech evaluation method and system |
US20090138266A1 (en) | Apparatus, method, and computer program product for recognizing speech |
CN115116428B (zh) | Prosodic boundary annotation method, apparatus, device, medium, and program product |
Batliner et al. | Prosodic models, automatic speech understanding, and speech synthesis: Towards the common ground? |
KR20100068965A (ko) | Automatic interpretation apparatus and method |
US11817079B1 (en) | GAN-based speech synthesis model and training method |
RU2386178C2 (ru) | Method for text preprocessing |
EP0919052B1 (en) | A method and a system for speech-to-speech conversion |
Karaali et al. | A high quality text-to-speech system composed of multiple neural networks |
NO318112B1 (no) | System and method for speech-to-speech conversion |
Wu | English Pronunciation Error Detection Based on Multimedia Data |
Apel et al. | Have a break! Modelling pauses in German speech |
Ng | Survey of data-driven approaches to Speech Synthesis |
Tepperman et al. | Robust recognition and assessment of nonnative speech variability |
Dannélls | Disfluency detection in a dialogue system |
Karhila | Cross-lingual acoustic model adaptation for speaker-independent speech recognition |