KR102731583B1 - Transliteration for speech recognition training and scoring - Google Patents
Transliteration for speech recognition training and scoring
- Publication number
- KR102731583B1 (application number KR1020217017741A)
- Authority
- KR
- South Korea
- Prior art keywords
- script
- speech recognition
- computers
- words
- language
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Machine Translation (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201862778431P | 2018-12-12 | 2018-12-12 | |
| US62/778,431 | 2018-12-12 | | |
| PCT/US2019/017258 WO2020122974A1 (en) | 2018-12-12 | 2019-02-08 | Transliteration for speech recognition training and scoring |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| KR20210076163A (ko) | 2021-06-23 |
| KR102731583B1 (ko) | 2024-11-15 |
Family
ID=65520451
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| KR1020217017741A (Active), KR102731583B1 (ko) | Transliteration for speech recognition training and scoring | 2018-12-12 | 2019-02-08 |
Country Status (5)
| Country | Link |
|---|---|
| EP (1) | EP3877973B1 |
| JP (1) | JP7208399B2 |
| KR (1) | KR102731583B1 |
| CN (1) | CN113396455B |
| WO (1) | WO2020122974A1 |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
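| CN114420159B (zh) * | 2020-10-12 | 2025-04-18 | 苏州声通信息科技有限公司 | Audio evaluation method and device, and non-transitory storage medium |
| US11568858B2 (en) * | 2020-10-17 | 2023-01-31 | International Business Machines Corporation | Transliteration based data augmentation for training multilingual ASR acoustic models in low resource settings |
| CN113626563A (zh) * | 2021-08-30 | 2021-11-09 | 京东方科技集团股份有限公司 | Method for training a natural language processing model, natural language processing method, and electronic device |
| CN113889105B (zh) * | 2021-09-29 | 2025-07-04 | 北京搜狗科技发展有限公司 | Speech translation method and apparatus, and apparatus for speech translation |
| CN114118108A (zh) * | 2021-11-11 | 2022-03-01 | 支付宝(杭州)信息技术有限公司 | Method for establishing a transliteration model, transliteration method, and corresponding apparatus |
| CN114299930B (zh) * | 2021-12-21 | 2025-03-14 | 广州虎牙科技有限公司 | End-to-end speech recognition model processing method, speech recognition method, and related apparatus |
| CN114520001B (zh) * | 2022-03-22 | 2025-08-01 | 科大讯飞股份有限公司 | Speech recognition method, apparatus, device, and storage medium |
| KR102616598B1 (ko) * | 2023-05-30 | 2023-12-22 | 주식회사 엘솔루 | Method for generating parallel original-language subtitle data using translated subtitles |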
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20060041427A1 (en) | 2004-08-20 | 2006-02-23 | Girija Yegnanarayanan | Document transcription system training |
| US20090248395A1 (en) | 2008-03-31 | 2009-10-01 | Neal Alewine | Systems and methods for building a native language phoneme lexicon having native pronunciations of non-native words derived from non-native pronunciations |
| WO2009129315A1 (en) | 2008-04-15 | 2009-10-22 | Mobile Technologies, Llc | System and methods for maintaining speech-to-speech translation in the field |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20080221866A1 (en) * | 2007-03-06 | 2008-09-11 | Lalitesh Katragadda | Machine Learning For Transliteration |
| JP2009157888A (ja) | 2007-12-28 | 2009-07-16 | National Institute Of Information & Communication Technology | Transliteration model creation device, transliteration device, and computer program therefor |
| US9176936B2 (en) * | 2012-09-28 | 2015-11-03 | International Business Machines Corporation | Transliteration pair matching |
| US10540957B2 (en) * | 2014-12-15 | 2020-01-21 | Baidu Usa Llc | Systems and methods for speech transcription |
| JP2018028848A (ja) | 2016-08-19 | 2018-02-22 | 日本放送協会 | Conversion processing device, transliteration processing device, and program |
| US10255909B2 (en) * | 2017-06-29 | 2019-04-09 | Intel IP Corporation | Statistical-analysis-based reset of recurrent neural networks for automatic speech recognition |
2019
- 2019-02-08: KR application KR1020217017741A, published as KR102731583B1 (Active)
- 2019-02-08: EP application EP19707226.7A, published as EP3877973B1 (Active)
- 2019-02-08: CN application CN201980082043.XA, published as CN113396455B (Active)
- 2019-02-08: JP application JP2021533448A, published as JP7208399B2 (Active)
- 2019-02-08: WO application PCT/US2019/017258, published as WO2020122974A1 (Ceased)
Also Published As
| Publication number | Publication date |
|---|---|
| JP2022515048A (ja) | 2022-02-17 |
| JP7208399B2 (ja) | 2023-01-18 |
| CN113396455B (zh) | 2025-04-15 |
| CN113396455A (zh) | 2021-09-14 |
| WO2020122974A1 (en) | 2020-06-18 |
| KR20210076163A (ko) | 2021-06-23 |
| EP3877973A1 (en) | 2021-09-15 |
| EP3877973B1 (en) | 2025-09-10 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| KR102731583B1 (ko) | | Transliteration for speech recognition training and scoring |
| US11417322B2 (en) | | Transliteration for speech recognition training and scoring |
| US11942076B2 (en) | | Phoneme-based contextualization for cross-lingual speech recognition in end-to-end models |
| Stolcke et al. | | Recent innovations in speech-to-text transcription at SRI-ICSI-UW |
| Karpov et al. | | Large vocabulary Russian speech recognition using syntactico-statistical language modeling |
| Emond et al. | | Transliteration based approaches to improve code-switched speech recognition performance |
| TW201517016A (zh) | | Speech recognition method and electronic device |
| KR102794379B1 (ko) | | Method and apparatus for correcting training data using an ensemble score |
| Li et al. | | Improving text normalization using character-blocks based models and system combination |
| Raval et al. | | Improving deep learning based automatic speech recognition for Gujarati |
| Avram et al. | | Towards a Romanian end-to-end automatic speech recognition based on DeepSpeech2 |
| Arısoy et al. | | A unified language model for large vocabulary continuous speech recognition of Turkish |
| Anoop et al. | | Suitability of syllable-based modeling units for end-to-end speech recognition in Sanskrit and other Indian languages |
| Alsharhan et al. | | Evaluating the effect of using different transcription schemes in building a speech recognition system for Arabic |
| Ablimit et al. | | Lexicon optimization based on discriminative learning for automatic speech recognition of agglutinative language |
| Hanzlíček et al. | | Using LSTM neural networks for cross-lingual phonetic speech segmentation with an iterative correction procedure |
| Antonova et al. | | SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings |
| Zhang et al. | | Knowledge prompt for Whisper: An ASR entity correction approach with knowledge base |
| Pellegrini et al. | | Automatic word decompounding for ASR in a morphologically rich language: Application to Amharic |
| Núñez et al. | | Phonetic normalization for machine translation of user generated content |
| Pakoci et al. | | Overcoming data sparsity in automatic transcription of dictated medical findings |
| Adda-Decker et al. | | A first LVCSR system for Luxembourgish, a low-resourced European language |
| Pushpakumara | | Applicability of Transfer Learning on End-to-End Sinhala Speech Recognition |
| Oba et al. | | Efficient training of discriminative language models by sample selection |
| Wong et al. | | Goodness-of-pronunciation without phoneme time alignment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PA0105 | International application | Patent event date: 20210609; Patent event code: PA01051R01D; Comment text: International Patent Application |
| | PA0201 | Request for examination | |
| | PG1501 | Laying open of application | |
| | E902 | Notification of reason for refusal | |
| | PE0902 | Notice of grounds for rejection | Comment text: Notification of reason for refusal; Patent event date: 20240531; Patent event code: PE09021S01D |
| | GRNT | Written decision to grant | |
| | PR0701 | Registration of establishment | Comment text: Registration of Establishment; Patent event date: 20241113; Patent event code: PR07011E01D |
| | PR1002 | Payment of registration fee | Payment date: 20241113; End annual number: 3; Start annual number: 1 |
| | PG1601 | Publication of registration | |