KR102057927B1 - Speech synthesis apparatus and method thereof - Google Patents

Speech synthesis apparatus and method thereof

Info

Publication number
KR102057927B1
Authority
KR
South Korea
Prior art keywords
emotion
speech synthesis
neural network
vector
embedding
Prior art date
Application number
KR1020190030905A
Other languages
English (en)
Korean (ko)
Inventor
이자룡
박중배
Original Assignee
휴멜로 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 휴멜로 주식회사
Priority to KR1020190030905A
Application granted
Publication of KR102057927B1
Priority to PCT/KR2020/003768

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Child & Adolescent Psychology (AREA)
  • General Health & Medical Sciences (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Telephonic Communication Services (AREA)
KR1020190030905A 2019-03-19 2019-03-19 Speech synthesis apparatus and method thereof KR102057927B1 (ko)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020190030905A KR102057927B1 (ko) 2019-03-19 2019-03-19 Speech synthesis apparatus and method thereof
PCT/KR2020/003768 WO2020190054A1 (fr) 2019-03-19 2020-03-19 Speech synthesis apparatus and associated method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020190030905A KR102057927B1 (ko) 2019-03-19 2019-03-19 Speech synthesis apparatus and method thereof

Related Child Applications (1)

Application Number Title Priority Date Filing Date
KR1020190167464A Division KR20200111609A (ko) 2019-12-16 2019-12-16 Speech synthesis apparatus and method thereof

Publications (1)

Publication Number Publication Date
KR102057927B1 (ko) 2019-12-20

Family

ID=69062875

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020190030905A KR102057927B1 (ko) 2019-03-19 2019-03-19 Speech synthesis apparatus and method thereof

Country Status (2)

Country Link
KR (1) KR102057927B1 (fr)
WO (1) WO2020190054A1 (fr)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111402923A (zh) * 2020-03-27 2020-07-10 中南大学 WaveNet-based emotional voice conversion method
CN111627420A (zh) * 2020-04-21 2020-09-04 升智信息科技(南京)有限公司 Method and device for speaker-specific emotional speech synthesis under extremely low resources
CN111667812A (zh) * 2020-05-29 2020-09-15 北京声智科技有限公司 Speech synthesis method, apparatus, device, and storage medium
WO2020190054A1 (fr) * 2019-03-19 2020-09-24 휴멜로 주식회사 Speech synthesis apparatus and associated method
CN111973178A (zh) * 2020-08-14 2020-11-24 中国科学院上海微系统与信息技术研究所 Electroencephalogram signal recognition system and method
KR102277205B1 (ko) * 2020-03-18 2021-07-15 휴멜로 주식회사 Audio conversion apparatus and method
KR20220004272A (ko) * 2020-07-03 2022-01-11 한국과학기술원 Method and apparatus for iterative learning of speech emotion recognition and synthesis
KR20220041448A (ko) * 2020-09-25 2022-04-01 주식회사 딥브레인에이아이 Text-based speech synthesis method and apparatus
KR20220071525A (ko) * 2020-11-24 2022-05-31 주식회사 자이냅스 Method for evaluating spectrogram quality using an attention alignment score, and speech synthesis system
KR20220134247A (ko) * 2021-03-26 2022-10-05 주식회사 엔씨소프트 Apparatus and method for training a timbre embedding model
US11769482B2 (en) 2020-11-11 2023-09-26 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus of synthesizing speech, method and apparatus of training speech synthesis model, electronic device, and storage medium

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11241574B2 (en) 2019-09-11 2022-02-08 Bose Corporation Systems and methods for providing and coordinating vagus nerve stimulation with audio therapy
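CN112489621B (zh) * 2020-11-20 2022-07-12 北京有竹居网络技术有限公司 Speech synthesis method and apparatus, readable medium, and electronic device
CN112633364B (zh) * 2020-12-21 2024-04-05 上海海事大学 Multimodal emotion recognition method based on a Transformer-ESIM attention mechanism
CN112992177B (zh) * 2021-02-20 2023-10-17 平安科技(深圳)有限公司 Training method, apparatus, device, and storage medium for a speech style transfer model
CN113257218B (zh) * 2021-05-13 2024-01-30 北京有竹居网络技术有限公司 Speech synthesis method and apparatus, electronic device, and storage medium
CN113421546B (zh) * 2021-06-30 2024-03-01 平安科技(深圳)有限公司 Cross-subject multimodal speech synthesis method and related device
CN114299915A (zh) * 2021-11-09 2022-04-08 腾讯科技(深圳)有限公司 Speech synthesis method and related device
CN117423327B (zh) * 2023-10-12 2024-03-19 北京家瑞科技有限公司 Speech synthesis method and apparatus based on a GPT neural network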

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
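JP2006084967A (ja) * 2004-09-17 2006-03-30 Advanced Telecommunication Research Institute International Method for creating a prediction model and computer program
KR101954447B1 (ko) * 2018-03-12 2019-03-05 박기수 Method for providing a telemarketing service based on interworking between a mobile terminal and a fixed terminal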

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130091364A (ko) * 2011-12-26 2013-08-19 한국생산기술연구원 Learnable emotion generation apparatus and emotion generation method for a robot
KR102137523B1 (ko) * 2017-08-09 2020-07-24 한국과학기술원 Text-to-speech conversion method and system
KR102057927B1 (ko) * 2019-03-19 2019-12-20 휴멜로 주식회사 Speech synthesis apparatus and method thereof

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
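JP2006084967A (ja) * 2004-09-17 2006-03-30 Advanced Telecommunication Research Institute International Method for creating a prediction model and computer program
KR101954447B1 (ko) * 2018-03-12 2019-03-05 박기수 Method for providing a telemarketing service based on interworking between a mobile terminal and a fixed terminal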

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
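WO2020190054A1 (fr) * 2019-03-19 2020-09-24 휴멜로 주식회사 Speech synthesis apparatus and associated method
KR102277205B1 (ko) * 2020-03-18 2021-07-15 휴멜로 주식회사 Audio conversion apparatus and method
CN111402923A (zh) * 2020-03-27 2020-07-10 中南大学 WaveNet-based emotional voice conversion method
CN111402923B (zh) * 2020-03-27 2023-11-03 中南大学 WaveNet-based emotional voice conversion method
CN111627420A (zh) * 2020-04-21 2020-09-04 升智信息科技(南京)有限公司 Method and device for speaker-specific emotional speech synthesis under extremely low resources
CN111627420B (zh) * 2020-04-21 2023-12-08 升智信息科技(南京)有限公司 Method and device for speaker-specific emotional speech synthesis under extremely low resources
CN111667812B (zh) * 2020-05-29 2023-07-18 北京声智科技有限公司 Speech synthesis method, apparatus, device, and storage medium
CN111667812A (zh) * 2020-05-29 2020-09-15 北京声智科技有限公司 Speech synthesis method, apparatus, device, and storage medium
KR20220004272A (ko) * 2020-07-03 2022-01-11 한국과학기술원 Method and apparatus for iterative learning of speech emotion recognition and synthesis
KR102382191B1 (ko) * 2020-07-03 2022-04-04 한국과학기술원 Method and apparatus for iterative learning of speech emotion recognition and synthesis
CN111973178A (zh) * 2020-08-14 2020-11-24 中国科学院上海微系统与信息技术研究所 Electroencephalogram signal recognition system and method
KR102392904B1 (ko) * 2020-09-25 2022-05-02 주식회사 딥브레인에이아이 Text-based speech synthesis method and apparatus
KR20220041448A (ko) * 2020-09-25 2022-04-01 주식회사 딥브레인에이아이 Text-based speech synthesis method and apparatus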
US11769482B2 (en) 2020-11-11 2023-09-26 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus of synthesizing speech, method and apparatus of training speech synthesis model, electronic device, and storage medium
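KR102503066B1 (ko) 2020-11-24 2023-03-02 주식회사 자이냅스 Method for evaluating spectrogram quality using an attention alignment score, and speech synthesis system
KR20220071525A (ko) * 2020-11-24 2022-05-31 주식회사 자이냅스 Method for evaluating spectrogram quality using an attention alignment score, and speech synthesis system
KR20220134247A (ko) * 2021-03-26 2022-10-05 주식회사 엔씨소프트 Apparatus and method for training a timbre embedding model
KR102576606B1 (ko) 2021-03-26 2023-09-08 주식회사 엔씨소프트 Apparatus and method for training a timbre embedding model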

Also Published As

Publication number Publication date
WO2020190054A1 (fr) 2020-09-24

Similar Documents

Publication Publication Date Title
KR102057927B1 (ko) Speech synthesis apparatus and method thereof
KR102057926B1 (ko) Speech synthesis apparatus and method thereof
JP7204989B2 (ja) Expressivity control in end-to-end speech synthesis systems
US11990118B2 (en) Text-to-speech (TTS) processing
EP3614376B1 (fr) Speech synthesis method, server, and storage medium
US20210209315A1 (en) Direct Speech-to-Speech Translation via Machine Learning
KR20200143659A (ko) Multilingual text-to-speech synthesis method
KR20200111609A (ko) Speech synthesis apparatus and method thereof
US11763797B2 (en) Text-to-speech (TTS) processing
US20200410981A1 (en) Text-to-speech (tts) processing
US11289068B2 (en) Method, device, and computer-readable storage medium for speech synthesis in parallel
JP7379756B2 (ja) Prediction of parametric vocoder parameters from prosodic features
US20230169953A1 (en) Phrase-based end-to-end text-to-speech (tts) synthesis
JP2024505076A (ja) Generating diverse and natural text-to-speech samples
KR20200111608A (ko) Speech synthesis apparatus and method thereof
KR102277205B1 (ko) Audio conversion apparatus and method
JP7504188B2 (ja) Expressivity control in end-to-end speech synthesis systems
KR20240035548A (ko) Two-level text-to-speech conversion system using synthetic training data
Oralbekova et al. Current advances and algorithmic solutions in speech generation
Saleh et al. Arabic Text-to-Speech Service with Syrian Dialect
Zhu et al. Control Emotion Intensity for LSTM-Based Expressive Speech Synthesis
CN115346510A (zh) Speech synthesis method and apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
A107 Divisional application of patent
GRNT Written decision to grant