JP7691027B2 - Speech recognition system, speech recognition method, and recording medium - Google Patents

Speech recognition system, speech recognition method, and recording medium

Info

Publication number
JP7691027B2
JP7691027B2 JP2024504041A JP2024504041A JP7691027B2 JP 7691027 B2 JP7691027 B2 JP 7691027B2 JP 2024504041 A JP2024504041 A JP 2024504041A JP 2024504041 A JP2024504041 A JP 2024504041A JP 7691027 B2 JP7691027 B2 JP 7691027B2
Authority
JP
Japan
Prior art keywords
speech
data
voice
real
synthetic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2024504041A
Other languages
English (en)
Japanese (ja)
Other versions
JPWO2023166557A5
JPWO2023166557A1
Inventor
レイ カク
仁 山本
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Publication of JPWO2023166557A1
Publication of JPWO2023166557A5
Application granted
Publication of JP7691027B2
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)
JP2024504041A 2022-03-01 2022-03-01 Speech recognition system, speech recognition method, and recording medium Active JP7691027B2 (ja)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/008597 WO2023166557A1 (ja) 2022-03-01 2022-03-01 Speech recognition system, speech recognition method, and recording medium

Publications (3)

Publication Number Publication Date
JPWO2023166557A1 2023-09-07
JPWO2023166557A5 2024-10-23
JP7691027B2 (ja) 2025-06-11

Family

ID=87883147

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2024504041A Active JP7691027B2 (ja) 2022-03-01 2022-03-01 Speech recognition system, speech recognition method, and recording medium

Country Status (3)

Country Link
US (1) US20250061884A1
JP (1) JP7691027B2
WO (1) WO2023166557A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230386446A1 (en) * 2022-05-25 2023-11-30 AuthenticVoice Inc. Modifying an audio signal to incorporate a natural-sounding intonation

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
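JP2003522978A (ja) 2000-02-10 2003-07-29 Koninklijke Philips Electronics N.V. Method and apparatus for converting sign language into speech
JP2019008120A (ja) 2017-06-23 2019-01-17 Hitachi, Ltd. Voice quality conversion system, voice quality conversion method, and voice quality conversion program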

Also Published As

Publication number Publication date
US20250061884A1 (en) 2025-02-20
WO2023166557A1 (ja) 2023-09-07
JPWO2023166557A1 2023-09-07

Similar Documents

Publication Publication Date Title
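KR102754124B1 (ko) End-to-end automatic speech recognition for digit sequences
CN106469552B (zh) Speech recognition apparatus and method
JP2023509234A (ja) Consistency prediction for streaming sequence models
CN110599998B (zh) Speech data generation method and apparatus
CN112735371B (zh) Method and apparatus for generating speaker video based on text information
CN107705782B (zh) Method and apparatus for determining phoneme pronunciation duration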
US12062363B2 (en) Tied and reduced RNN-T
CN103098124B (zh) Method and system for text-to-speech conversion
CN117355840A (zh) Regularized word segmentation
JP2020187340A (ja) Speech recognition method and apparatus
CN102314778A (zh) Electronic reader
WO2024182112A1 (en) Using text-injection to recognize speech without transcription
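JP7691027B2 (ja) Speech recognition system, speech recognition method, and recording medium
CN112383721B (zh) Method, apparatus, device, and medium for generating video
CN112381926B (zh) Method and apparatus for generating video
CN114255737B (zh) Speech generation method, apparatus, and electronic device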
US8781835B2 (en) Methods and apparatuses for facilitating speech synthesis
US20080243510A1 (en) Overlapping screen reading of non-sequential text
CN111862933A (zh) Method, apparatus, device, and medium for generating synthesized speech
Mukherjee et al. A Bengali speech synthesizer on Android OS
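CN114999450B (zh) Homograph recognition method and apparatus, electronic device, and storage medium
CN117711372A (zh) Speech synthesis method, apparatus, computer device, and storage medium
CN117894293A (zh) Speech synthesis method, apparatus, computer device, and storage medium
CN113096639B (zh) Method and apparatus for generating voice stickers
JP5881157B2 (ja) Information processing apparatus and program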

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20240806

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20240806

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20250430

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20250513

R150 Certificate of patent or registration of utility model

Ref document number: 7691027

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150