SG11202100900QA - Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium

Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium

Info

Publication number
SG11202100900QA
Authority
SG
Singapore
Prior art keywords
computer
text
storage medium
readable storage
synthesis method
Prior art date
Application number
SG11202100900QA
Inventor
Minchuan Chen
Jun Ma
Shaojun Wang
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Publication of SG11202100900QA

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G10L13/047 Architecture of speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)
SG11202100900QA 2019-01-17 2019-11-13 Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium SG11202100900QA (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910042827.1A CN109754778B (en) 2019-01-17 2019-01-17 Text speech synthesis method and device and computer equipment
PCT/CN2019/117775 WO2020147404A1 (en) 2019-01-17 2019-11-13 Text-to-speech synthesis method, device, computer apparatus, and non-volatile computer readable storage medium

Publications (1)

Publication Number Publication Date
SG11202100900QA (en) 2021-03-30

Family

ID=66405768

Family Applications (1)

Application Number Priority Date Filing Date Title
SG11202100900QA SG11202100900QA (en) 2019-01-17 2019-11-13 Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium

Country Status (4)

Country Link
US (1) US11620980B2 (en)
CN (1) CN109754778B (en)
SG (1) SG11202100900QA (en)
WO (1) WO2020147404A1 (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109754778B (en) * 2019-01-17 2023-05-30 平安科技(深圳)有限公司 Text speech synthesis method and device and computer equipment
CN110310619A (en) * 2019-05-16 2019-10-08 平安科技(深圳)有限公司 Polyphone prediction technique, device, equipment and computer readable storage medium
CN109979429A (en) * 2019-05-29 2019-07-05 南京硅基智能科技有限公司 A kind of method and system of TTS
CN110379409B (en) * 2019-06-14 2024-04-16 平安科技(深圳)有限公司 Speech synthesis method, system, terminal device and readable storage medium
CN110335587B (en) * 2019-06-14 2023-11-10 平安科技(深圳)有限公司 Speech synthesis method, system, terminal device and readable storage medium
CN111508466A (en) * 2019-09-12 2020-08-07 马上消费金融股份有限公司 Text processing method, device and equipment and computer readable storage medium
CN112562637B (en) * 2019-09-25 2024-02-06 北京中关村科金技术有限公司 Method, device and storage medium for splicing voice audios
CN110808027B (en) * 2019-11-05 2020-12-08 腾讯科技(深圳)有限公司 Voice synthesis method and device and news broadcasting method and system
CN112786000B (en) * 2019-11-11 2022-06-03 亿度慧达教育科技(北京)有限公司 Speech synthesis method, system, device and storage medium
CN113066472A (en) * 2019-12-13 2021-07-02 科大讯飞股份有限公司 Synthetic speech processing method and related device
CN111133507B (en) * 2019-12-23 2023-05-23 深圳市优必选科技股份有限公司 Speech synthesis method, device, intelligent terminal and readable medium
WO2021127978A1 (en) * 2019-12-24 2021-07-01 深圳市优必选科技股份有限公司 Speech synthesis method and apparatus, computer device and storage medium
CN111312210B (en) * 2020-03-05 2023-03-21 云知声智能科技股份有限公司 Text-text fused voice synthesis method and device
CN113450756A (en) * 2020-03-13 2021-09-28 Tcl科技集团股份有限公司 Training method of voice synthesis model and voice synthesis method
CN111369968B (en) * 2020-03-19 2023-10-13 北京字节跳动网络技术有限公司 Speech synthesis method and device, readable medium and electronic equipment
CN111524500B (en) * 2020-04-17 2023-03-31 浙江同花顺智能科技有限公司 Speech synthesis method, apparatus, device and storage medium
CN111653261A (en) * 2020-06-29 2020-09-11 北京字节跳动网络技术有限公司 Speech synthesis method, speech synthesis device, readable storage medium and electronic equipment
CN112002305A (en) * 2020-07-29 2020-11-27 北京大米科技有限公司 Speech synthesis method, speech synthesis device, storage medium and electronic equipment
CN111986646B (en) * 2020-08-17 2023-12-15 云知声智能科技股份有限公司 Dialect synthesis method and system based on small corpus
CN112712789B (en) * 2020-12-21 2024-05-03 深圳市优必选科技股份有限公司 Cross-language audio conversion method, device, computer equipment and storage medium
CN112885328A (en) * 2021-01-22 2021-06-01 华为技术有限公司 Text data processing method and device
CN112908293B (en) * 2021-03-11 2022-08-02 浙江工业大学 Method and device for correcting pronunciations of polyphones based on semantic attention mechanism
CN113380231B (en) * 2021-06-15 2023-01-24 北京一起教育科技有限责任公司 Voice conversion method and device and electronic equipment
CN113838448B (en) * 2021-06-16 2024-03-15 腾讯科技(深圳)有限公司 Speech synthesis method, device, equipment and computer readable storage medium
CN113539239A (en) * 2021-07-12 2021-10-22 网易(杭州)网络有限公司 Voice conversion method, device, storage medium and electronic equipment
CN113409761B (en) * 2021-07-12 2022-11-01 上海喜马拉雅科技有限公司 Speech synthesis method, speech synthesis device, electronic device, and computer-readable storage medium
CN114783407B (en) * 2022-06-21 2022-10-21 平安科技(深圳)有限公司 Speech synthesis model training method, device, computer equipment and storage medium

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6978239B2 (en) * 2000-12-04 2005-12-20 Microsoft Corporation Method and apparatus for speech synthesis without prosody modification
US8005677B2 (en) * 2003-05-09 2011-08-23 Cisco Technology, Inc. Source-dependent text-to-speech system
US7590533B2 (en) * 2004-03-10 2009-09-15 Microsoft Corporation New-word pronunciation learning using a pronunciation graph
US9542927B2 (en) * 2014-11-13 2017-01-10 Google Inc. Method and system for building text-to-speech voice from diverse recordings
CN105654939B (en) * 2016-01-04 2019-09-13 极限元(杭州)智能科技股份有限公司 A kind of phoneme synthesizing method based on sound vector text feature
US9934775B2 (en) * 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
WO2018053518A1 (en) * 2016-09-19 2018-03-22 Pindrop Security, Inc. Channel-compensated low-level features for speaker recognition
US10395654B2 (en) * 2017-05-11 2019-08-27 Apple Inc. Text normalization based on a data-driven learning network
US10896669B2 (en) * 2017-05-19 2021-01-19 Baidu Usa Llc Systems and methods for multi-speaker neural text-to-speech
CN109716326A (en) * 2017-06-21 2019-05-03 微软技术许可有限责任公司 Personalized song is provided in automatic chatting
CN107564511B (en) * 2017-09-25 2018-09-11 平安科技(深圳)有限公司 Electronic device, phoneme synthesizing method and computer readable storage medium
US11017761B2 (en) * 2017-10-19 2021-05-25 Baidu Usa Llc Parallel neural text-to-speech
KR102535411B1 (en) * 2017-11-16 2023-05-23 삼성전자주식회사 Apparatus and method related to metric learning based data classification
EP3739572A4 (en) * 2018-01-11 2021-09-08 Neosapience, Inc. Text-to-speech synthesis method and apparatus using machine learning, and computer-readable storage medium
GB201804073D0 (en) * 2018-03-14 2018-04-25 Papercup Tech Limited A speech processing system and a method of processing a speech signal
CN108492818B (en) * 2018-03-22 2020-10-30 百度在线网络技术(北京)有限公司 Text-to-speech conversion method and device and computer equipment
CN109036375B (en) * 2018-07-25 2023-03-24 腾讯科技(深圳)有限公司 Speech synthesis method, model training device and computer equipment
US10971170B2 (en) * 2018-08-08 2021-04-06 Google Llc Synthesizing speech from text using neural networks
CN109754778B (en) * 2019-01-17 2023-05-30 平安科技(深圳)有限公司 Text speech synthesis method and device and computer equipment

Also Published As

Publication number Publication date
WO2020147404A1 (en) 2020-07-23
CN109754778B (en) 2023-05-30
CN109754778A (en) 2019-05-14
US20210174781A1 (en) 2021-06-10
US11620980B2 (en) 2023-04-04

Similar Documents

Publication Publication Date Title
SG11202100900QA (en) Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium
SG11202001627XA (en) Speech recognition method, apparatus, and computer readable storage medium
EP3748629C0 (en) Identification method for voice keywords, computer-readable storage medium, and computer device
EP3806089A4 (en) Mixed speech recognition method and apparatus, and computer readable storage medium
EP3937165A4 (en) Speech synthesis method and apparatus, and computer-readable storage medium
SG11202112456YA (en) Text classification method, apparatus and computer-readable storage medium
EP3588490A4 (en) Speech conversion method, computer device, and storage medium
EP3758364A4 (en) Dynamic emoticon-generating method, computer-readable storage medium and computer device
EP3648099A4 (en) Voice recognition method, device, apparatus, and storage medium
EP3819821A4 (en) User feature generating method, device, and apparatus, and computer-readable storage medium
EP3477495A4 (en) Apparatus and method for extracting user keyword, and computer-readable storage medium
EP3591930A4 (en) Information storage method, device, and computer-readable storage medium
EP3605537A4 (en) Speech emotion detection method and apparatus, computer device, and storage medium
SG11202010916SA (en) Text recognition method and apparatus, electronic device and storage medium
EP3584786A4 (en) Voice recognition method, electronic device, and computer storage medium
EP3828885C0 (en) Voice denoising method and apparatus, computing device and computer readable storage medium
SG11202101614VA (en) Association recommendation method and device, computer equipment and storage medium
SG11202105174XA (en) Text sequence recognition method and apparatus, electronic device, and storage medium
EP3848730A4 (en) Positioning method, apparatus and device, and computer-readable storage medium
SG11202001873SA (en) Semantic recognition method, electronic device , and computer-readable storage medium
EP3605407A4 (en) Information processing device, information processing method, and computer-readable storage medium
EP3619709A4 (en) Microphone, vocal training apparatus comprising microphone and vocal analyzer, vocal training method, and non-transitory tangible computer-readable storage medium
EP3640825A4 (en) Conversion method, apparatus, computer device, and storage medium
EP3647725A4 (en) Real-scene navigation method and apparatus, device, and computer-readable storage medium
EP3605400A4 (en) Information processing device, information processing method, and computer-readable storage medium