WO2017082717A3 - Method and system for text to speech synthesis

Publication number
WO2017082717A3
Authority
WO
WIPO (PCT)
Prior art keywords
malay
speech
text
model
language
Prior art date
Application number
PCT/MY2016/050076
Other languages
French (fr)
Other versions
WO2017082717A2 (en)
Inventor
Mumtaz Begum PEER MUSTAFA
Mansoor Ali MOHAMED YUSOOF
Original Assignee
Universiti Malaya
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Universiti Malaya
Publication of WO2017082717A2
Publication of WO2017082717A3

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

The present invention provides an HMM (Hidden Markov Model) based Malay text to Malay speech synthesis system (100) comprising a plurality of software modules (105, 110) in the form of computer readable instructions residing in memory (22) of a computer (20) and executed by a central processing unit (CPU) (21) of said computer (20). The system (100) comprises a training module (105), which trains a plurality of HMM statistical predictive models residing in a statistical parameter model database (113) to generate an acoustic model of Malay speech, and a synthesizing module (110), which synthesizes Malay speech from a Malay text input string (111) using context dependent labels generated from the Malay text input string (111) together with the acoustic model of Malay speech. The system is characterized in that the synthesizing module (110) includes a context dependent label generation unit (112) that automatically generates context dependent labels for the Malay language from the Malay text input string (111), the labels being derived in part from syllabification rules that are exclusive to the Malay language.
PCT/MY2016/050076 2015-11-09 2016-11-09 Method and system for text to speech synthesis WO2017082717A2 (en)
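
The label generation step named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not disclose the actual syllabification rules or the label format, so the digraph list, the "last consonant before a vowel starts the next syllable" rule, the function names, and the label layout below are all hypothetical. The sketch also treats each letter unit as one phone and does not handle Malay diphthongs such as "ai" or "au".

```python
# Illustrative only: simplified rules, NOT the rules claimed in the patent.
VOWELS = set("aeiou")
DIGRAPHS = ("ng", "ny", "sy", "kh", "gh")  # assumed consonant digraphs


def to_units(word):
    """Split a word into letter units, keeping digraphs together."""
    units, i = [], 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            units.append(word[i:i + 2])
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def syllable_units(word):
    """Group units into syllables, one vowel nucleus per syllable."""
    units = to_units(word.lower())
    nuclei = [i for i, u in enumerate(units) if u in VOWELS]
    if not nuclei:
        return [units] if units else []
    groups, start = [], 0
    for k, n in enumerate(nuclei):
        if k + 1 < len(nuclei):
            nxt = nuclei[k + 1]
            # If consonants separate the two vowels, the last one becomes
            # the onset of the next syllable; the rest stay here as coda.
            end = nxt - 1 if nxt - n > 1 else nxt
        else:
            end = len(units)  # trailing consonants close the last syllable
        groups.append(units[start:end])
        start = end
    return groups


def syllabify(word):
    """Return the syllables of a word as strings."""
    return ["".join(g) for g in syllable_units(word)]


def context_labels(word):
    """Build one hypothetical context label per unit.

    Layout: prev-cur+next/P:<pos in syllable>_<syllable length>
            /S:<syllable index>_<syllable count>
    """
    groups = syllable_units(word)
    phones = [
        (u, s_idx, p_idx, len(g))
        for s_idx, g in enumerate(groups, start=1)
        for p_idx, u in enumerate(g, start=1)
    ]
    labels = []
    for i, (u, s_idx, p_idx, s_len) in enumerate(phones):
        prev = phones[i - 1][0] if i > 0 else "sil"
        nxt = phones[i + 1][0] if i + 1 < len(phones) else "sil"
        labels.append(
            f"{prev}-{u}+{nxt}/P:{p_idx}_{s_len}/S:{s_idx}_{len(groups)}"
        )
    return labels


if __name__ == "__main__":
    for w in ("makan", "bunga", "tempat"):
        print(w, syllabify(w))
    print(context_labels("makan"))
```

In a statistical parametric synthesizer of the kind the abstract describes, labels like these would be the keys used to select context dependent HMM states at synthesis time; a production system would typically carry much richer context (stress, word and phrase position) than this sketch does.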

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2015704052 2015-11-09
MYPI2015704052 2015-11-09

Publications (2)

Publication Number Publication Date
WO2017082717A2 (en) 2017-05-18
WO2017082717A3 (en) 2018-02-15

Family

ID=58695826

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2016/050076 WO2017082717A2 (en) 2015-11-09 2016-11-09 Method and system for text to speech synthesis

Country Status (1)

Country Link
WO (1) WO2017082717A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818089B (en) * 2021-02-23 2022-06-03 掌阅科技股份有限公司 Text phonetic notation method, electronic equipment and storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108073679B (en) * 2017-11-10 2021-09-28 中国科学院信息工程研究所 Random pattern string set generation method and device in string matching scene and readable storage medium
KR102152902B1 (en) * 2020-02-11 2020-09-07 주식회사 엘솔루 Method for converting voice data into text data and speech-to-text device performing method

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
CHEE-MING THING ET AL.: "Automatic Phonetic Segmentation of Malay Speech Database", 2007 6TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATIONS & SIGNAL PROCESSING, 10 December 2007 (2007-12-10), XP031229382 *
HAFIZ MUSA ET AL.: "Syllabification Algorithm based on Syllable Rules Matching for Malay Language", 10TH WSEAS INTERNATIONAL CONFERENCE ON APPLIED COMPUTER AND APPLIED COMPUTATIONAL SCIENCE, 8 March 2011 (2011-03-08), pages 279 - 286, XP055464035 *
MUMTAZ B. MUSTAFA ET AL.: "Context-Dependent Labels for an HMM-Based Speech Synthesis System for Malay", ORIENTAL COCOSDA HELD JOINTLY WITH 2013 CONFERENCE ON ASIAN SPOKEN LANGUAGE RESEARCH AND EVALUATION (O-COCOSDA/CASLRE), 25 November 2013 (2013-11-25), XP032545480 *
THANH-SON PHAN ET AL.: "Improvement of Naturalness for an HMM-based Vietnamese Speech Synthesis using the Prosodic information", 2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES - RESEARCH, INNOVATION, AND VISION FOR THE FUTURE (RIVF), 10 November 2013 (2013-11-10), pages 276 - 281, XP032555125 *
TIEN-PING TAN ET AL.: "Malay Grapheme to Phoneme Tool for Automatic Speech Recognition", WORKSHOP OF MALAYSIA AND INDONESIA LANGUAGE ENGINEERING (MALINDO), 2009, XP055464038 *

Also Published As

Publication number Publication date
WO2017082717A2 (en) 2017-05-18

Similar Documents

Publication Publication Date Title
JP6328260B2 (en) Intention estimation device and intention estimation method
CN109686361B (en) Speech synthesis method, device, computing equipment and computer storage medium
KR102423302B1 (en) Apparatus and method for calculating acoustic score in speech recognition, apparatus and method for learning acoustic model
WO2014197334A3 (en) System and method for user-specified pronunciation of words for speech synthesis and recognition
US10643032B2 (en) Output sentence generation apparatus, output sentence generation method, and output sentence generation program
WO2017176356A3 (en) Partitioned machine learning architecture
JP7051919B2 (en) Speech recognition and decoding methods based on streaming attention models, devices, equipment and computer readable storage media
MX2016013019A (en) Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method.
GB2507674A (en) Statistical enhancement of speech output from statistical text-to-speech synthesis system
GB201212783D0 (en) A speech processing system
RU2017108533A (en) A LOOK FOR UNDERSTANDING SPEAKING LANGUAGES IN MULTIMODAL DIALOGUE INTERACTIONS
ATE457510T1 (en) LANGUAGE RECOGNITION SYSTEM WITH HUGE VOCABULARY
WO2017072754A3 (en) A system and method for computer-assisted instruction of a music language
WO2006107586A3 (en) Method and system for interpreting verbal inputs in a multimodal dialog system
GB2484615A (en) A text to speech method and system
WO2017082717A3 (en) Method and system for text to speech synthesis
WO2018118492A3 (en) Linguistic modeling using sets of base phonetics
JP2014066779A5 (en)
WO2015147706A3 (en) Method for converting a structured data array
US10867525B1 (en) Systems and methods for generating recitation items
US10157608B2 (en) Device for predicting voice conversion model, method of predicting voice conversion model, and computer program product
JP2017049535A5 (en)
WO2016029045A3 (en) Lexical dialect analysis system
JP2012226651A5 (en)
TW201719633A (en) A multilingual automatic speed recognition device and method thereof

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16864641; Country of ref document: EP; Kind code of ref document: A2)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 16864641; Country of ref document: EP; Kind code of ref document: A2)