KR102731583B1 - Transliteration for speech recognition training and scoring - Google Patents

Transliteration for speech recognition training and scoring

Info

Publication number
KR102731583B1
Authority
KR
South Korea
Prior art keywords
script
speech recognition
computers
words
language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
KR1020217017741A
Other languages
English (en)
Korean (ko)
Other versions
KR20210076163A (ko)
Inventor
Bhuvana Ramabhadran
Min Ma
Pedro J. Moreno Mengibar
Jesse Emond
Brian E. Roark
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Publication of KR20210076163A
Application granted
Publication of KR102731583B1
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)
KR1020217017741A 2018-12-12 2019-02-08 Transliteration for speech recognition training and scoring Active KR102731583B1 (ko)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201862778431P 2018-12-12 2018-12-12
US62/778,431 2018-12-12
PCT/US2019/017258 WO2020122974A1 (en) 2018-12-12 2019-02-08 Transliteration for speech recognition training and scoring

Publications (2)

Publication Number Publication Date
KR20210076163A (ko) 2021-06-23
KR102731583B1 (ko) 2024-11-15

Family

ID=65520451

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020217017741A Active KR102731583B1 (ko) 2018-12-12 2019-02-08 Transliteration for speech recognition training and scoring

Country Status (5)

Country Link
EP (1) EP3877973B1
JP (1) JP7208399B2
KR (1) KR102731583B1
CN (1) CN113396455B
WO (1) WO2020122974A1

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114420159B (zh) * 2020-10-12 2025-04-18 苏州声通信息科技有限公司 Audio evaluation method and apparatus, and non-transitory storage medium
US11568858B2 (en) * 2020-10-17 2023-01-31 International Business Machines Corporation Transliteration based data augmentation for training multilingual ASR acoustic models in low resource settings
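CN113626563A (zh) * 2021-08-30 2021-11-09 京东方科技集团股份有限公司 Method for training a natural language processing model, natural language processing method, and electronic device
CN113889105B (zh) * 2021-09-29 2025-07-04 北京搜狗科技发展有限公司 Speech translation method and apparatus, and device for speech translation
CN114118108A (zh) * 2021-11-11 2022-03-01 支付宝(杭州)信息技术有限公司 Method for building a translation (转译) model, translation method, and corresponding apparatus
CN114299930B (zh) * 2021-12-21 2025-03-14 广州虎牙科技有限公司 End-to-end speech recognition model processing method, speech recognition method, and related apparatus
CN114520001B (zh) * 2022-03-22 2025-08-01 科大讯飞股份有限公司 Speech recognition method, apparatus, device, and storage medium
KR102616598B1 (ko) * 2023-05-30 2023-12-22 주식회사 엘솔루 Method for generating parallel data of original-text subtitles using translated subtitles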

Citations (3)

Publication number Priority date Publication date Assignee Title
US20060041427A1 (en) 2004-08-20 2006-02-23 Girija Yegnanarayanan Document transcription system training
US20090248395A1 2008-03-31 2009-10-01 Neal Alewine Systems and methods for building a native language phoneme lexicon having native pronunciations of non-native words derived from non-native pronunciations
WO2009129315A1 (en) 2008-04-15 2009-10-22 Mobile Technologies, Llc System and methods for maintaining speech-to-speech translation in the field

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US20080221866A1 (en) * 2007-03-06 2008-09-11 Lalitesh Katragadda Machine Learning For Transliteration
JP2009157888A (ja) 2007-12-28 2009-07-16 National Institute Of Information & Communication Technology Transliteration model creation device, transliteration device, and computer program therefor
US9176936B2 (en) * 2012-09-28 2015-11-03 International Business Machines Corporation Transliteration pair matching
US10540957B2 (en) * 2014-12-15 2020-01-21 Baidu Usa Llc Systems and methods for speech transcription
JP2018028848A (ja) 2016-08-19 2018-02-22 日本放送協会 Conversion processing device, transliteration processing device, and program
US10255909B2 (en) * 2017-06-29 2019-04-09 Intel IP Corporation Statistical-analysis-based reset of recurrent neural networks for automatic speech recognition

Also Published As

Publication number Publication date
JP2022515048A (ja) 2022-02-17
JP7208399B2 (ja) 2023-01-18
CN113396455B (zh) 2025-04-15
CN113396455A (zh) 2021-09-14
WO2020122974A1 (en) 2020-06-18
KR20210076163A (ko) 2021-06-23
EP3877973A1 (en) 2021-09-15
EP3877973B1 (en) 2025-09-10

Similar Documents

Publication Publication Date Title
KR102731583B1 (ko) Transliteration for speech recognition training and scoring
US11417322B2 (en) Transliteration for speech recognition training and scoring
US11942076B2 (en) Phoneme-based contextualization for cross-lingual speech recognition in end-to-end models
Stolcke et al. Recent innovations in speech-to-text transcription at SRI-ICSI-UW
Karpov et al. Large vocabulary Russian speech recognition using syntactico-statistical language modeling
Emond et al. Transliteration based approaches to improve code-switched speech recognition performance
TW201517016A (zh) Speech recognition method and electronic device
KR102794379B1 (ko) Method and apparatus for correcting training data using an ensemble score
Li et al. Improving text normalization using character-blocks based models and system combination
Raval et al. Improving deep learning based automatic speech recognition for Gujarati
Avram et al. Towards a Romanian end-to-end automatic speech recognition based on DeepSpeech2
Arısoy et al. A unified language model for large vocabulary continuous speech recognition of Turkish
Anoop et al. Suitability of syllable-based modeling units for end-to-end speech recognition in Sanskrit and other Indian languages
Alsharhan et al. Evaluating the effect of using different transcription schemes in building a speech recognition system for Arabic
Ablimit et al. Lexicon optimization based on discriminative learning for automatic speech recognition of agglutinative language
Hanzlíček et al. Using LSTM neural networks for cross‐lingual phonetic speech segmentation with an iterative correction procedure
Antonova et al. Spellmapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings
Zhang et al. Knowledge prompt for Whisper: An ASR entity correction approach with knowledge base
Pellegrini et al. Automatic word decompounding for ASR in a morphologically rich language: Application to Amharic
Núñez et al. Phonetic normalization for machine translation of user generated content
Pakoci et al. Overcoming data sparsity in automatic transcription of dictated medical findings
Adda-Decker et al. A first LVCSR system for Luxembourgish, a low-resourced European language
Pushpakumara Applicability of Transfer Learning on End-to-End Sinhala Speech Recognition
Oba et al. Efficient training of discriminative language models by sample selection
Wong et al. Goodness-of-pronunciation without phoneme time alignment

Legal Events

Date Code Title Description
PA0105 International application

Patent event date: 20210609

Patent event code: PA01051R01D

Comment text: International Patent Application

PA0201 Request for examination
PG1501 Laying open of application
E902 Notification of reason for refusal
PE0902 Notice of grounds for rejection

Comment text: Notification of reason for refusal

Patent event date: 20240531

Patent event code: PE09021S01D

GRNT Written decision to grant
PR0701 Registration of establishment

Comment text: Registration of Establishment

Patent event date: 20241113

Patent event code: PR07011E01D

PR1002 Payment of registration fee

Payment date: 20241113

End annual number: 3

Start annual number: 1

PG1601 Publication of registration