Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium

Info

Publication number
SG11202100900QA
Authority
SG
Singapore
Prior art keywords
computer
text
storage medium
readable storage
synthesis method
Prior art date
Application number
SG11202100900QA
Other languages
English (en)
Inventor
Minchuan Chen
Jun Ma
Shaojun Wang
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-01-17
Filing date
2019-11-13
Publication date
2021-03-30
Application filed by Ping An Technology Shenzhen Co Ltd
Publication of SG11202100900QA

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G10L13/047 Architecture of speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)
SG11202100900QA 2019-01-17 2019-11-13 Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium SG11202100900QA (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
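CN201910042827.1A CN109754778B (zh) 2019-01-17 2019-01-17 Text-based speech synthesis method, device, and computer device
PCT/CN2019/117775 WO2020147404A1 (zh) 2019-01-17 2019-11-13 Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium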

Publications (1)

Publication Number Publication Date
SG11202100900QA (en) 2021-03-30

Family

ID=66405768

Family Applications (1)

Application Number Priority Date Filing Date Title
SG11202100900QA SG11202100900QA (en) 2019-01-17 2019-11-13 Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium

Country Status (4)

Country Link
US (1) US11620980B2 (zh)
CN (1) CN109754778B (zh)
SG (1) SG11202100900QA (zh)
WO (1) WO2020147404A1 (zh)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
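CN109754778B (zh) * 2019-01-17 2023-05-30 平安科技(深圳)有限公司 Text-based speech synthesis method, device, and computer device
CN110310619A (zh) * 2019-05-16 2019-10-08 平安科技(深圳)有限公司 Polyphonic character prediction method, apparatus, device, and computer-readable storage medium
CN109979429A (zh) * 2019-05-29 2019-07-05 南京硅基智能科技有限公司 TTS method and system
CN110379409B (zh) * 2019-06-14 2024-04-16 平安科技(深圳)有限公司 Speech synthesis method, system, terminal device, and readable storage medium
CN110335587B (zh) * 2019-06-14 2023-11-10 平安科技(深圳)有限公司 Speech synthesis method, system, terminal device, and readable storage medium
CN112447165B (zh) * 2019-08-15 2024-08-02 阿里巴巴集团控股有限公司 Information processing, model training and construction methods, electronic device, and smart speaker
CN111508466A (zh) * 2019-09-12 2020-08-07 马上消费金融股份有限公司 Text processing method, apparatus, device, and computer-readable storage medium
CN112562637B (zh) * 2019-09-25 2024-02-06 北京中关村科金技术有限公司 Method, apparatus, and storage medium for splicing speech audio
CN110808027B (zh) * 2019-11-05 2020-12-08 腾讯科技(深圳)有限公司 Speech synthesis method and apparatus, and news broadcasting method and system
CN112786000B (zh) * 2019-11-11 2022-06-03 亿度慧达教育科技(北京)有限公司 Speech synthesis method, system, device, and storage medium
CN113066472B (zh) * 2019-12-13 2024-05-31 科大讯飞股份有限公司 Synthesized speech processing method and related apparatus
WO2021127811A1 (zh) * 2019-12-23 2021-07-01 深圳市优必选科技股份有限公司 Speech synthesis method, apparatus, smart terminal, and readable medium
CN111316352B (zh) * 2019-12-24 2023-10-10 深圳市优必选科技股份有限公司 Speech synthesis method, apparatus, computer device, and storage medium
CN111312210B (zh) * 2020-03-05 2023-03-21 云知声智能科技股份有限公司 Speech synthesis method and apparatus fusing images and text
CN113450756A (zh) * 2020-03-13 2021-09-28 Tcl科技集团股份有限公司 Training method for a speech synthesis model, and speech synthesis method
CN111369968B (zh) * 2020-03-19 2023-10-13 北京字节跳动网络技术有限公司 Speech synthesis method, apparatus, readable medium, and electronic device
CN111524500B (zh) * 2020-04-17 2023-03-31 浙江同花顺智能科技有限公司 Speech synthesis method, apparatus, device, and storage medium
CN111653261A (zh) * 2020-06-29 2020-09-11 北京字节跳动网络技术有限公司 Speech synthesis method, apparatus, readable storage medium, and electronic device
CN113971947A (zh) * 2020-07-24 2022-01-25 北京有限元科技有限公司 Speech synthesis method, apparatus, and storage medium
CN112002305B (zh) * 2020-07-29 2024-06-18 北京大米科技有限公司 Speech synthesis method, apparatus, storage medium, and electronic device
CN111986646B (zh) * 2020-08-17 2023-12-15 云知声智能科技股份有限公司 Dialect synthesis method and system based on a small corpus
CN112289299B (zh) * 2020-10-21 2024-05-14 北京大米科技有限公司 Training method and apparatus for a speech synthesis model, storage medium, and electronic device
CN112712789B (zh) * 2020-12-21 2024-05-03 深圳市优必选科技股份有限公司 Cross-language audio conversion method, apparatus, computer device, and storage medium
CN112885328B (zh) * 2021-01-22 2024-06-28 华为技术有限公司 Text data processing method and apparatus
CN112908293B (zh) * 2021-03-11 2022-08-02 浙江工业大学 Polyphonic character pronunciation error correction method and apparatus based on a semantic attention mechanism
CN113380231B (zh) * 2021-06-15 2023-01-24 北京一起教育科技有限责任公司 Voice conversion method, apparatus, and electronic device
CN113838448B (zh) * 2021-06-16 2024-03-15 腾讯科技(深圳)有限公司 Speech synthesis method, apparatus, device, and computer-readable storage medium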
US20220405524A1 (en) * 2021-06-17 2022-12-22 International Business Machines Corporation Optical character recognition training with semantic constraints
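CN113409761B (zh) * 2021-07-12 2022-11-01 上海喜马拉雅科技有限公司 Speech synthesis method, apparatus, electronic device, and computer-readable storage medium
CN113539239B (zh) * 2021-07-12 2024-05-28 网易(杭州)网络有限公司 Voice conversion method, apparatus, storage medium, and electronic device
CN114203151A (zh) * 2021-10-29 2022-03-18 广州虎牙科技有限公司 Methods related to training a speech synthesis model, and related apparatus and device
CN114783407B (zh) * 2022-06-21 2022-10-21 平安科技(深圳)有限公司 Speech synthesis model training method, apparatus, computer device, and storage medium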

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6978239B2 (en) * 2000-12-04 2005-12-20 Microsoft Corporation Method and apparatus for speech synthesis without prosody modification
US8005677B2 (en) * 2003-05-09 2011-08-23 Cisco Technology, Inc. Source-dependent text-to-speech system
US7590533B2 (en) * 2004-03-10 2009-09-15 Microsoft Corporation New-word pronunciation learning using a pronunciation graph
US9542927B2 (en) * 2014-11-13 2017-01-10 Google Inc. Method and system for building text-to-speech voice from diverse recordings
CN105654939B (zh) * 2016-01-04 2019-09-13 极限元(杭州)智能科技股份有限公司 Speech synthesis method based on sound-vector text features
US9934775B2 (en) * 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
CA3179080A1 (en) * 2016-09-19 2018-03-22 Pindrop Security, Inc. Channel-compensated low-level features for speaker recognition
US10395654B2 (en) * 2017-05-11 2019-08-27 Apple Inc. Text normalization based on a data-driven learning network
US10896669B2 (en) * 2017-05-19 2021-01-19 Baidu Usa Llc Systems and methods for multi-speaker neural text-to-speech
CN109716326A (zh) * 2017-06-21 2019-05-03 微软技术许可有限责任公司 Providing personalized songs in automated chatting
CN107564511B (zh) * 2017-09-25 2018-09-11 平安科技(深圳)有限公司 Electronic device, speech synthesis method, and computer-readable storage medium
US11017761B2 (en) * 2017-10-19 2021-05-25 Baidu Usa Llc Parallel neural text-to-speech
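KR102535411B1 (ko) * 2017-11-16 2023-05-23 삼성전자주식회사 Apparatus related to metric-learning-based data classification and method thereof
KR102401512B1 (ko) * 2018-01-11 2022-05-25 네오사피엔스 주식회사 Text-to-speech synthesis method and apparatus using machine learning, and computer-readable storage medium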
GB201804073D0 (en) * 2018-03-14 2018-04-25 Papercup Tech Limited A speech processing system and a method of processing a speech signal
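CN108492818B (zh) * 2018-03-22 2020-10-30 百度在线网络技术(北京)有限公司 Text-to-speech conversion method, apparatus, and computer device
CN109036375B (zh) * 2018-07-25 2023-03-24 腾讯科技(深圳)有限公司 Speech synthesis method, model training method, apparatus, and computer device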
US10971170B2 (en) * 2018-08-08 2021-04-06 Google Llc Synthesizing speech from text using neural networks
CN109754778B (zh) * 2019-01-17 2023-05-30 平安科技(深圳)有限公司 Text-based speech synthesis method, device, and computer device

Also Published As

Publication number Publication date
WO2020147404A1 (zh) 2020-07-23
CN109754778A (zh) 2019-05-14
US20210174781A1 (en) 2021-06-10
US11620980B2 (en) 2023-04-04
CN109754778B (zh) 2023-05-30

Similar Documents

Publication Publication Date Title
SG11202100900QA (en) Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium
SG11202001627XA (en) Speech recognition method, apparatus, and computer readable storage medium
EP3748629C0 (en) IDENTIFICATION METHOD FOR LANGUAGE KEYWORDS, COMPUTER READABLE STORAGE MEDIUM AND COMPUTER DEVICE
SG11202112456YA (en) Text classification method, apparatus and computer-readable storage medium
EP3806089A4 (en) METHOD AND DEVICE FOR MIXED SPEECH RECOGNITION AND COMPUTER-READABLE STORAGE MEDIUM
EP3770905C0 (en) SPEECH RECOGNITION METHOD, DEVICE AND APPARATUS AND STORAGE MEDIUM
EP3937165A4 (en) SPEECH SYNTHESIS METHOD AND APPARATUS, AND COMPUTER READABLE STORAGE MEDIUM
EP3588490A4 (en) LANGUAGE CONVERSION METHOD, COMPUTER DEVICE AND STORAGE MEDIUM
EP3648099A4 (en) VOICE RECOGNITION METHOD, DEVICE, DEVICE AND STORAGE MEDIUM
EP3758364A4 (en) METHOD FOR DYNAMIC EMOTICON GENERATION, COMPUTER-READABLE STORAGE MEDIUM AND COMPUTER DEVICE
EP3819821A4 (en) USER CHARACTERISTIC PRODUCTION PROCESS, DEVICE AND APPARATUS, AND COMPUTER READABLE STORAGE MEDIA
EP3477495A4 (en) APPARATUS AND METHOD FOR USER KEYWORD EXTRACTION AND COMPUTER-READABLE MEMORY MEDIUM
SG11202010916SA (en) Text recognition method and apparatus, electronic device and storage medium
EP3605537A4 (en) LANGUAGE MOTION DETECTION METHOD AND DEVICE, COMPUTER DEVICE AND STORAGE MEDIUM
EP3584786A4 (en) VOICE RECOGNITION METHOD, ELECTRONIC DEVICE, AND COMPUTER STORAGE MEDIUM
EP3591930A4 (en) INFORMATION STORAGE METHOD, COMPUTER-READABLE RECORDING MEDIUM AND DEVICE
EP3828885C0 (en) METHOD AND DEVICE FOR SPEAKING, COMPUTER DEVICE AND COMPUTER-READABLE STORAGE MEDIUM
SG11202105174XA (en) Text sequence recognition method and apparatus, electronic device, and storage medium
EP3848730A4 (en) POSITIONING PROCESS, APPARATUS AND DEVICE, AND COMPUTER READABLE STORAGE MEDIA
SG11202001873SA (en) Semantic recognition method, electronic device , and computer-readable storage medium
EP3619709A4 (en) MICROPHONE, VOICE TRAINING DEVICE WITH THE MICROPHONE AND VOICE ANALYZER, VOICE TRAINING PROCEDURE AND TRANSITIONAL, CONCRETE COMPUTER-READABLE STORAGE MEDIUM
EP3605407A4 (en) INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD AND COMPUTER READABLE STORAGE MEDIUM
EP3647725A4 (en) REAL SCENE NAVIGATION METHOD AND DEVICE, DEVICE AND COMPUTER READABLE STORAGE MEDIUM
EP3605400A4 (en) INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD AND COMPUTER READABLE STORAGE MEDIUM
EP3594940A4 (en) TRAINING PROCEDURE FOR VOICE DATA SET, COMPUTER DEVICE AND COMPUTER READABLE STORAGE MEDIUM