JP7208399B2 - Transliteration for speech recognition training and scoring - Google Patents

Transliteration for speech recognition training and scoring

Info

Publication number
JP7208399B2
Authority
JP
Japan
Prior art keywords
script
words
speech recognition
language
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2021533448A
Other languages
English (en)
Japanese (ja)
Other versions
JP2022515048A (ja)
JP2022515048A5
JPWO2020122974A5
Inventor
ラマバドラン、ブバナ
マー、ミン
ジェイ.モレノ メンヒバル、ペドロ
エモンド、ジェシー
イー. ロアーク、ブライアン
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Publication of JP2022515048A
Publication of JP2022515048A5
Publication of JPWO2020122974A5
Application granted
Publication of JP7208399B2
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)
JP2021533448A 2018-12-12 2019-02-08 Transliteration for speech recognition training and scoring Active JP7208399B2 (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201862778431P 2018-12-12 2018-12-12
US62/778,431 2018-12-12
PCT/US2019/017258 WO2020122974A1 (en) 2018-12-12 2019-02-08 Transliteration for speech recognition training and scoring

Publications (4)

Publication Number Publication Date
JP2022515048A (ja) 2022-02-17
JP2022515048A5 2022-08-25
JPWO2020122974A5 2022-08-25
JP7208399B2 (ja) 2023-01-18

Family

ID=65520451

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2021533448A Active JP7208399B2 (ja) Transliteration for speech recognition training and scoring 2018-12-12 2019-02-08

Country Status (5)

Country Link
EP (1) EP3877973B1
JP (1) JP7208399B2
KR (1) KR102731583B1
CN (1) CN113396455B
WO (1) WO2020122974A1

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114420159B (zh) * 2020-10-12 2025-04-18 苏州声通信息科技有限公司 Audio evaluation method and device, and non-transitory storage medium
US11568858B2 (en) * 2020-10-17 2023-01-31 International Business Machines Corporation Transliteration based data augmentation for training multilingual ASR acoustic models in low resource settings
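CN113626563A (zh) * 2021-08-30 2021-11-09 京东方科技集团股份有限公司 Method for training a natural language processing model, natural language processing method, and electronic device
CN113889105B (zh) * 2021-09-29 2025-07-04 北京搜狗科技发展有限公司 Speech translation method and device, and device for speech translation
CN114118108A (zh) * 2021-11-11 2022-03-01 支付宝(杭州)信息技术有限公司 Method for establishing a translation model, translation method, and corresponding devices
CN114299930B (zh) * 2021-12-21 2025-03-14 广州虎牙科技有限公司 End-to-end speech recognition model processing method, speech recognition method, and related devices
CN114520001B (zh) * 2022-03-22 2025-08-01 科大讯飞股份有限公司 Speech recognition method, apparatus, device, and storage medium
KR102616598B1 (ko) * 2023-05-30 2023-12-22 주식회사 엘솔루 Method for generating parallel data of original-language subtitles using translated subtitles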

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
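JP2009157888A (ja) 2007-12-28 2009-07-16 National Institute Of Information & Communication Technology Transliteration model creation device, transliteration device, and computer programs therefor
JP2018028848A (ja) 2016-08-19 2018-02-22 日本放送協会 Conversion processing device, transliteration processing device, and program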

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8335688B2 (en) * 2004-08-20 2012-12-18 Multimodal Technologies, Llc Document transcription system training
US20080221866A1 (en) * 2007-03-06 2008-09-11 Lalitesh Katragadda Machine Learning For Transliteration
US7472061B1 (en) * 2008-03-31 2008-12-30 International Business Machines Corporation Systems and methods for building a native language phoneme lexicon having native pronunciations of non-native words derived from non-native pronunciations
WO2009129315A1 (en) 2008-04-15 2009-10-22 Mobile Technologies, Llc System and methods for maintaining speech-to-speech translation in the field
US9176936B2 (en) * 2012-09-28 2015-11-03 International Business Machines Corporation Transliteration pair matching
US10540957B2 (en) * 2014-12-15 2020-01-21 Baidu Usa Llc Systems and methods for speech transcription
US10255909B2 (en) * 2017-06-29 2019-04-09 Intel IP Corporation Statistical-analysis-based reset of recurrent neural networks for automatic speech recognition

Also Published As

Publication number Publication date
JP2022515048A (ja) 2022-02-17
CN113396455B (zh) 2025-04-15
KR102731583B1 (ko) 2024-11-15
CN113396455A (zh) 2021-09-14
WO2020122974A1 (en) 2020-06-18
KR20210076163A (ko) 2021-06-23
EP3877973A1 (en) 2021-09-15
EP3877973B1 (en) 2025-09-10

Similar Documents

Publication Publication Date Title
JP7208399B2 (ja) Transliteration for speech recognition training and scoring
US11417322B2 (en) Transliteration for speech recognition training and scoring
US11942076B2 (en) Phoneme-based contextualization for cross-lingual speech recognition in end-to-end models
Lee et al. Massively multilingual pronunciation modeling with WikiPron
Karpov et al. Large vocabulary Russian speech recognition using syntactico-statistical language modeling
Emond et al. Transliteration based approaches to improve code-switched speech recognition performance
Stolcke et al. Recent innovations in speech-to-text transcription at SRI-ICSI-UW
Abandah et al. Accurate and fast recurrent neural network solution for the automatic diacritization of Arabic text
Sullivan et al. Improving automatic speech recognition for non-native english with transfer learning and language model decoding
Sagae et al. Hallucinated n-best lists for discriminative language modeling
Kirov et al. Context-aware transliteration of romanized South Asian languages
Srivastava et al. Homophone Identification and Merging for Code-switched Speech Recognition.
Anoop et al. Suitability of syllable-based modeling units for end-to-end speech recognition in Sanskrit and other Indian languages
Ablimit et al. Lexicon optimization based on discriminative learning for automatic speech recognition of agglutinative language
Juhár et al. Recent progress in development of language model for Slovak large vocabulary continuous speech recognition
US20220391588A1 (en) Systems and methods for generating locale-specific phonetic spelling variations
Zhang et al. Knowledge prompt for whisper: An asr entity correction approach with knowledge base
Pellegrini et al. Automatic word decompounding for asr in a morphologically rich language: Application to amharic
Zhang et al. Semantic-weighted word error rate based on BERT for evaluating automatic speech recognition models
Duan et al. Pinyin as a feature of neural machine translation for chinese speech recognition error correction
Pakoci et al. Overcoming data sparsity in automatic transcription of dictated medical findings
Saychum et al. Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling.
Ulasik et al. ZHAW-CAI: ensemble method for Swiss German speech to standard german text
Adda-Decker et al. A first LVCSR system for luxembourgish, a low-resourced european language
Xie et al. NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

Legal Events

Date Code Title Description
2021-07-21 A621 Written request for application examination. Free format text: JAPANESE INTERMEDIATE CODE: A621
2022-07-20 A977 Report on retrieval. Free format text: JAPANESE INTERMEDIATE CODE: A971007
2022-08-17 A521 Request for written amendment filed. Free format text: JAPANESE INTERMEDIATE CODE: A523
2022-08-17 A871 Explanation of circumstances concerning accelerated examination. Free format text: JAPANESE INTERMEDIATE CODE: A871
2022-08-30 A131 Notification of reasons for refusal. Free format text: JAPANESE INTERMEDIATE CODE: A131
2022-10-07 A521 Request for written amendment filed. Free format text: JAPANESE INTERMEDIATE CODE: A523
TRDD Decision of grant or rejection written
2022-12-13 A01 Written decision to grant a patent or to grant a registration (utility model). Free format text: JAPANESE INTERMEDIATE CODE: A01
2023-01-05 A61 First payment of annual fees (during grant procedure). Free format text: JAPANESE INTERMEDIATE CODE: A61
R150 Certificate of patent or registration of utility model. Ref document number: 7208399. Country of ref document: JP. Free format text: JAPANESE INTERMEDIATE CODE: R150
R250 Receipt of annual fees. Free format text: JAPANESE INTERMEDIATE CODE: R250