JP7603948B2 - Speech synthesis device, speech synthesis method, and speech synthesis program - Google Patents

Speech synthesis device, speech synthesis method, and speech synthesis program

Info

Publication number
JP7603948B2
JP7603948B2 JP2023542446A JP2023542446A JP7603948B2 JP 7603948 B2 JP7603948 B2 JP 7603948B2 JP 2023542446 A JP2023542446 A JP 2023542446A JP 2023542446 A JP2023542446 A JP 2023542446A JP 7603948 B2 JP7603948 B2 JP 7603948B2
Authority
JP
Japan
Prior art keywords
speech
information
book
data
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2023542446A
Other languages
English (en)
Japanese (ja)
Other versions
JPWO2023022206A5
JPWO2023022206A1
Inventor
勇祐 井島
知樹 郡山
慎之介 高道
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Tokyo NUC
NTT Inc
NTT Inc USA
Original Assignee
Nippon Telegraph and Telephone Corp
University of Tokyo NUC
NTT Inc USA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp, University of Tokyo NUC, NTT Inc USA
Publication of JPWO2023022206A1
Publication of JPWO2023022206A5
Application granted
Publication of JP7603948B2
Active legal status
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L13/10: Prosody rules derived from text; Stress or intonation

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Processing Or Creating Images (AREA)
JP2023542446A 2021-08-18 2022-08-18 Speech synthesis device, speech synthesis method, and speech synthesis program Active JP7603948B2 (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2021133713 2021-08-18
JP2021133713 2021-08-18
PCT/JP2022/031276 WO2023022206A1 (ja) 2021-08-18 2022-08-18 Speech synthesis device, speech synthesis method, and speech synthesis program

Publications (3)

Publication Number Publication Date
JPWO2023022206A1 2023-02-23
JPWO2023022206A5 2024-05-13
JP7603948B2 (ja) 2024-12-23

Family

ID=85240853

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2023542446A Active JP7603948B2 (ja) 2021-08-18 2022-08-18 Speech synthesis device, speech synthesis method, and speech synthesis program

Country Status (3)

Country Link
US (1) US20240347039A1
JP (1) JP7603948B2
WO (1) WO2023022206A1

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240203418A1 (en) * 2022-12-20 2024-06-20 Jpmorgan Chase Bank, N.A. Method and system for automatically visualizing a transcript
US12548589B1 (en) 2025-09-24 2026-02-10 CNTXT FZCo Systems and methods for generating audio descriptions

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
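JP2005249880A (ja) 2004-03-01 2005-09-15 Xing Inc Digital picture book system using a portable communication terminal
JP2005321706A (ja) 2004-05-11 2005-11-17 Nippon Telegr & Teleph Corp <Ntt> Method and device for reproducing electronic books
WO2020235696A1 (ko) 2019-05-17 2020-11-26 엘지전자 주식회사 Artificial intelligence device for mutually converting text and speech in consideration of style, and method therefor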

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003044072A (ja) * 2001-07-30 2003-02-14 Seiko Epson Corp Speech reading setting device, speech reading device, speech reading setting method, speech reading setting program, and recording medium
US20080070199A1 (en) * 2006-08-28 2008-03-20 Sommer Sandra R Coloring book composed of digital images converted to black and white outlines
WO2016103652A1 (ja) * 2014-12-24 2016-06-30 日本電気株式会社 Speech processing device, speech processing method, and recording medium
US20180133900A1 (en) * 2016-11-15 2018-05-17 JIBO, Inc. Embodied dialog and embodied speech authoring tools for use with an expressive social robot
CN108885614B (zh) * 2017-02-06 2020-12-15 华为技术有限公司 Method and terminal for processing text and speech information
US10607595B2 (en) * 2017-08-07 2020-03-31 Lenovo (Singapore) Pte. Ltd. Generating audio rendering from textual content based on character models
US10540445B2 (en) * 2017-11-03 2020-01-21 International Business Machines Corporation Intelligent integration of graphical elements into context for screen reader applications
US11226673B2 (en) * 2018-01-26 2022-01-18 Institute Of Software Chinese Academy Of Sciences Affective interaction systems, devices, and methods based on affective computing user interface
KR20210011844A (ko) * 2019-07-23 2021-02-02 삼성전자주식회사 Electronic device and control method thereof
US11270684B2 (en) * 2019-09-11 2022-03-08 Artificial Intelligence Foundation, Inc. Generation of speech with a prosodic characteristic
CN110717498A (zh) * 2019-09-16 2020-01-21 腾讯科技(深圳)有限公司 Image description generation method, apparatus, and electronic device
JP7339151B2 (ja) * 2019-12-23 2023-09-05 株式会社 ディー・エヌ・エー Speech synthesis device, speech synthesis program, and speech synthesis method
US20220269870A1 (en) * 2021-02-18 2022-08-25 Meta Platforms, Inc. Readout of Communication Content Comprising Non-Latin or Non-Parsable Content Items for Assistant Systems
JP2024516664A (ja) * 2021-04-27 2024-04-16 フラウンホッファー-ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ Decoder

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
百武恭汰 et al., "An experimental study on context label design for picture-book read-aloud style speech synthesis," IEICE Technical Report, Vol. 115, No. 523, pp. 255-260

Also Published As

Publication number Publication date
WO2023022206A1 (ja) 2023-02-23
US20240347039A1 (en) 2024-10-17
JPWO2023022206A1 2023-02-23

Similar Documents

Publication Publication Date Title
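JP7745022B2 (ja) Speech recognition using unspoken text and speech synthesis
JP7280386B2 (ja) Multilingual speech synthesis and cross-language voice cloning
JP7791934B2 (ja) Using speech recognition to improve cross-language speech synthesis
JP7228998B2 (ja) Speech synthesis device and program
JP7753567B2 (ja) Using non-parallel voice conversion for training a speech recognition model
JP2022107032A (ja) Text-to-speech synthesis method and device using machine learning, and computer-readable storage medium
JP7257593B2 (ja) Training speech synthesis to generate distinguishable speech sounds
CN111954903A (zh) Multi-speaker neural text-to-speech synthesis
CN116783647A (zh) Generating diverse and natural text-to-speech samples
CN114387946A (zh) Training method for a speech synthesis model and speech synthesis method
JP7799037B2 (ja) Improving speech recognition with speech synthesis-based model adaptation
JP7603948B2 (ja) Speech synthesis device, speech synthesis method, and speech synthesis program
CN112599113A (zh) Dialect speech synthesis method and apparatus, electronic device, and readable storage medium
CN114255735B (zh) Speech synthesis method and system
CN120153418A (zh) Massively multilingual speech-text joint semi-supervised learning for text-to-speech
De et al. Making social platforms accessible: Emotion-aware speech generation with integrated text analysis
KR102382191B1 (ko) Method and apparatus for iterative learning of speech emotion recognition and synthesis
KR102418465B1 (ko) Server, method, and computer program for providing a fairy tale reading service
KR102426020B1 (ko) Method and apparatus for synthesizing speech with emotional prosody from a small amount of speech data of a single speaker
JP7357518B2 (ja) Speech synthesis device and program
JP6475572B2 (ja) Speech rhythm conversion device, method, and program
HK40071491B (zh) Audio synthesis method and apparatus, computer device, and storage medium
CN120954387A (zh) Voice conversion method and apparatus
JP2023171025A (ja) Learning device, learning method, and learning program
HK40099450A (zh) Facial animation generation method, apparatus, device, medium, and program product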

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20231120

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A821

Effective date: 20231120

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20240425

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20241105

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20241203

R150 Certificate of patent or registration of utility model

Ref document number: 7603948

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

S533 Written request for registration of change of name

Free format text: JAPANESE INTERMEDIATE CODE: R313533

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350