JP7603948B2 - Speech synthesis device, speech synthesis method, and speech synthesis program - Google Patents
Speech synthesis device, speech synthesis method, and speech synthesis program
- Publication number
- JP7603948B2
- Authority
- JP
- Japan
- Prior art keywords
- speech
- information
- book
- data
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Electrically Operated Instructional Devices (AREA)
- Processing Or Creating Images (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021133713 | 2021-08-18 | | |
| JP2021133713 | 2021-08-18 | | |
| PCT/JP2022/031276 WO2023022206A1 (ja) | 2021-08-18 | 2022-08-18 | Speech synthesis device, speech synthesis method, and speech synthesis program |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JPWO2023022206A1 | 2023-02-23 |
| JPWO2023022206A5 | 2024-05-13 |
| JP7603948B2 (ja) | 2024-12-23 |
Family
ID=85240853
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023542446A, Active, JP7603948B2 (ja) | 2021-08-18 | 2022-08-18 | Speech synthesis device, speech synthesis method, and speech synthesis program |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20240347039A1 |
| JP (1) | JP7603948B2 |
| WO (1) | WO2023022206A1 |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20240203418A1 (en) * | 2022-12-20 | 2024-06-20 | Jpmorgan Chase Bank, N.A. | Method and system for automatically visualizing a transcript |
| US12548589B1 (en) | 2025-09-24 | 2026-02-10 | CNTXT FZCo | Systems and methods for generating audio descriptions |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
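| JP2005249880A (ja) | 2004-03-01 | 2005-09-15 | Xing Inc | Digital picture book system using a portable communication terminal |
| JP2005321706A (ja) | 2004-05-11 | 2005-11-17 | Nippon Telegr & Teleph Corp <Ntt> | Electronic book playback method and device therefor |
| WO2020235696A1 (ko) | 2019-05-17 | 2020-11-26 | LG Electronics Inc. | Artificial intelligence device for mutually converting text and speech in consideration of style, and method therefor |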
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2003044072A (ja) * | 2001-07-30 | 2003-02-14 | Seiko Epson Corp | Voice reading setting device, voice reading device, voice reading setting method, voice reading setting program, and recording medium |
| US20080070199A1 (en) * | 2006-08-28 | 2008-03-20 | Sommer Sandra R | Coloring book composed of digital images converted to black and white outlines |
| WO2016103652A1 (ja) * | 2014-12-24 | 2016-06-30 | NEC Corporation | Speech processing device, speech processing method, and recording medium |
| US20180133900A1 (en) * | 2016-11-15 | 2018-05-17 | JIBO, Inc. | Embodied dialog and embodied speech authoring tools for use with an expressive social robot |
| CN108885614B (zh) * | 2017-02-06 | 2020-12-15 | Huawei Technologies Co., Ltd. | Method and terminal for processing text and speech information |
| US10607595B2 (en) * | 2017-08-07 | 2020-03-31 | Lenovo (Singapore) Pte. Ltd. | Generating audio rendering from textual content based on character models |
| US10540445B2 (en) * | 2017-11-03 | 2020-01-21 | International Business Machines Corporation | Intelligent integration of graphical elements into context for screen reader applications |
| US11226673B2 (en) * | 2018-01-26 | 2022-01-18 | Institute Of Software Chinese Academy Of Sciences | Affective interaction systems, devices, and methods based on affective computing user interface |
| KR20210011844A (ko) * | 2019-07-23 | 2021-02-02 | Samsung Electronics Co., Ltd. | Electronic device and control method thereof |
| US11270684B2 (en) * | 2019-09-11 | 2022-03-08 | Artificial Intelligence Foundation, Inc. | Generation of speech with a prosodic characteristic |
| CN110717498A (zh) * | 2019-09-16 | 2020-01-21 | Tencent Technology (Shenzhen) Co., Ltd. | Image description generation method, apparatus, and electronic device |
| JP7339151B2 (ja) * | 2019-12-23 | 2023-09-05 | DeNA Co., Ltd. | Speech synthesis device, speech synthesis program, and speech synthesis method |
| US20220269870A1 (en) * | 2021-02-18 | 2022-08-25 | Meta Platforms, Inc. | Readout of Communication Content Comprising Non-Latin or Non-Parsable Content Items for Assistant Systems |
| JP2024516664A (ja) * | 2021-04-27 | 2024-04-16 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder |
2022
- 2022-08-18 JP JP2023542446A, published as JP7603948B2 (ja): Active
- 2022-08-18 WO PCT/JP2022/031276, published as WO2023022206A1 (ja): Ceased
- 2022-08-18 US US18/683,786, published as US20240347039A1 (en): Pending
Non-Patent Citations (1)
| Title |
|---|
| 百武恭汰 et al., "Experimental study on context label design for picture-book read-aloud style speech synthesis", IEICE Technical Report, Vol. 115, No. 523, pp. 255-260 |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2023022206A1 (ja) | 2023-02-23 |
| US20240347039A1 (en) | 2024-10-17 |
| JPWO2023022206A1 | 2023-02-23 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
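| JP7745022B2 (ja) | | Speech recognition using unspoken text and speech synthesis |
| JP7280386B2 (ja) | | Multilingual speech synthesis and cross-language voice cloning |
| JP7791934B2 (ja) | | Using speech recognition to improve cross-language speech synthesis |
| JP7228998B2 (ja) | | Speech synthesis device and program |
| JP7753567B2 (ja) | | Using non-parallel voice conversion for training a speech recognition model |
| JP2022107032A (ja) | | Text-to-speech synthesis method using machine learning, device, and computer-readable storage medium |
| JP7257593B2 (ja) | | Training speech synthesis to generate distinguishable speech sounds |
| CN111954903A (zh) | | Multi-speaker neural text-to-speech synthesis |
| CN116783647A (zh) | | Generating diverse and natural text-to-speech samples |
| CN114387946A (zh) | | Training method for a speech synthesis model and speech synthesis method |
| JP7799037B2 (ja) | | Improving speech recognition with speech synthesis-based model adaptation |
| JP7603948B2 (ja) | | Speech synthesis device, speech synthesis method, and speech synthesis program |
| CN112599113A (zh) | | Dialect speech synthesis method, apparatus, electronic device, and readable storage medium |
| CN114255735B (zh) | | Speech synthesis method and system |
| CN120153418A (zh) | | Large-scale multilingual speech-text joint semi-supervised learning for text-to-speech |
| De et al. | | Making social platforms accessible: Emotion-aware speech generation with integrated text analysis |
| KR102382191B1 (ko) | | Method and apparatus for iterative learning of speech emotion recognition and synthesis |
| KR102418465B1 (ko) | | Server, method, and computer program for providing a fairy tale reading service |
| KR102426020B1 (ko) | | Method and apparatus for synthesizing speech with emotional prosody from a small amount of speech data of a single speaker |
| JP7357518B2 (ja) | | Speech synthesis device and program |
| JP6475572B2 (ja) | | Speech rhythm conversion device, method, and program |
| HK40071491B (zh) | | Audio synthesis method, apparatus, computer device, and storage medium |
| CN120954387A (zh) | | Voice conversion method and apparatus |
| JP2023171025A (ja) | | Learning device, learning method, and learning program |
| HK40099450A (zh) | | Facial animation generation method, apparatus, device, medium, and program product |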
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20231120 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A821; Effective date: 20231120 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20240425 |
| | TRDD | Decision of grant or rejection written | |
| | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20241105 |
| | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20241203 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7603948; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |
| | S533 | Written request for registration of change of name | Free format text: JAPANESE INTERMEDIATE CODE: R313533 |
| | R350 | Written notification of registration of transfer | Free format text: JAPANESE INTERMEDIATE CODE: R350 |