CN103632663A - HMM-based method of Mongolian speech synthesis and front-end processing - Google Patents

HMM-based method of Mongolian speech synthesis and front-end processing

Info

Publication number
CN103632663A
Authority
CN
China
Prior art keywords
hmm
speech
mongolian
training
mongol
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310595871.8A
Other languages
Chinese (zh)
Other versions
CN103632663B (en)
Inventor
飞龙
高光来
赵建东
张学良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia University
Original Assignee
Individual
Priority date
2013-11-25
Filing date
2013-11-25
Publication date
2014-03-12
Application filed by Individual
Priority to CN201310595871.8A
Publication of CN103632663A
Application granted
Publication of CN103632663B
Current legal status: Expired - Fee Related

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an HMM-based method of Mongolian speech synthesis and front-end processing. The synthesis method comprises the following steps: analyzing the speech data in a speech corpus; designing a context attribute set and a question set for decision-tree clustering; performing HMM training and obtaining the HMM training results; obtaining the corresponding clustered state models and forming a clustered state model sequence; and generating a target acoustic parameter sequence using the dynamic features of the parameters and obtaining the synthesized speech through a STRAIGHT synthesizer. The front-end processing method comprises the following steps: processing special characters in the input Mongolian text; transliterating Mongolian script to Latin; tagging Mongolian parts of speech; converting the Latin transliteration to phonemes with a G2P model; segmenting syllables by rule; assembling the data in the format required for prosody prediction; performing prosody prediction; and producing the text format required for synthesis. With the method, the duration and pitch period parameters of the synthesized speech can be adjusted freely, the naturalness and fluency of the synthesized speech are maintained, and the cost of synthesis is reduced.

Description

An HMM-based method of Mongolian speech synthesis and front-end processing
Technical field
The invention belongs to the technical field of speech synthesis, and in particular relates to an HMM-based method of Mongolian speech synthesis and front-end processing.
Background
Speech synthesis is a key technology for spoken communication between humans and machines. It draws on several disciplines, including acoustics, linguistics, digital signal processing and computer science, and is a frontier technology in the field of information processing. The main problem it solves is how to convert text into audible sound, that is, how to make a machine speak like a person.
The current mainstream speech synthesis method is HMM-based speech synthesis. Compared with waveform concatenation based on a large corpus, it can construct a new system automatically in a short time with essentially no manual intervention, and the whole training process is largely independent of factors such as speaker, speaking style and emotion. Many practical speech synthesis systems of this kind already exist for Chinese, English, Japanese and other languages. For Mongolian speech synthesis, earlier work has studied waveform concatenation based on a large corpus. That work is significant for the development of Mongolian speech synthesis, but research on HMM-based Mongolian speech synthesis is still in its infancy. Studying HMM-based Mongolian speech synthesis technology and building an HMM-based Mongolian speech synthesis system is of great importance for education, transportation, communications and office automation in ethnic minority regions.
Prior art 1, a Mongolian speech synthesis method based on stems and affixes:
First, the text analysis module formats the input text, recording the words and punctuation marks to be pronounced and filtering out characters that are not pronounced. Each word is then looked up in the stem table. If the base form of the word is found, the corresponding speech data is extracted. If it is not found, the word is segmented into stem and affixes to find the stem and affixes corresponding to the word, and the speech data for them is extracted. Prosodic features are then recorded, and the PSOLA algorithm is used to modify the prosody. Finally, the selected speech segments are concatenated according to the concatenation rules to produce the final synthesized speech.
In this scheme, the text analysis module has to handle text formatting and stem-affix segmentation. When the system encounters symbols or words written in English, it handles them only in the simplest way, reading out each English letter in turn, with no further processing. Each word obtained from text analysis is first looked up in the stem table; if it is found, the speech data is extracted, and if not, stem-affix segmentation is performed. A Mongolian word may yield its stem and affixes after a single cut, but sometimes the stem left after a cut must be cut a second time, or several more times, which is why the concept of multi-level stem-affix segmentation was proposed.
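The lookup-then-cut loop described above can be sketched as follows. This is a minimal illustration only: the stem table, the affix table and the greedy longest-suffix strategy are assumptions made for the example, and the table entries are placeholders in Latin transliteration rather than real lexicon data from the prior-art system.

```python
# Hypothetical tables; placeholder entries, not real lexicon data.
STEM_TABLE = {"nom", "ger"}
AFFIX_TABLE = {"ud", "tei", "un"}


def segment_word(word):
    """Return [stem, affix, ...] for `word`, or None if no stem is reached.

    Greedy sketch of multi-level segmentation: strip the longest matching
    suffix, then look the remainder up in the stem table again.
    """
    affixes = []
    remainder = word
    while remainder not in STEM_TABLE:
        for affix in sorted(AFFIX_TABLE, key=len, reverse=True):
            if remainder.endswith(affix) and len(remainder) > len(affix):
                affixes.insert(0, affix)          # keep affixes in surface order
                remainder = remainder[: -len(affix)]
                break
        else:
            return None  # no affix matched and the remainder is not a stem
    return [remainder] + affixes


segments = segment_word("nomudtei")  # needs two cuts before a stem is found
```

A greedy strategy never backtracks, so a real system would likely need a search over alternative cuts; the sketch only shows why more than one cut per word can be necessary.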
The prosodic features of phrases and sentences are represented by recording the prosodic features of each word in the text analysis module. The changes in duration, pitch and amplitude of a word are recorded to express the stress of the word, the changes in duration and stress of each word within a phrase, and the changes in sentence intonation (including stress, duration and fundamental frequency). After text analysis, the prosodic variation of each word is known; the system applies TD-PSOLA and FD-PSOLA to modify the speech and obtains the modified speech segments.
Three algorithms are most commonly used to concatenate speech synthesis units: diphone concatenation, hard concatenation and soft concatenation. Hard concatenation simply places two speech segments one after the other. In soft concatenation the join is likewise located at the boundary between two segments, but the transient characteristics of natural speech are introduced to smooth the transition at the join, so a certain overlap between the two speech units may be needed. If only hard concatenation is used, the synthesized speech sounds uneven in tempo, jitters noticeably and lacks continuity. This scheme therefore combines hard and soft concatenation, using head-tail overlapping soft concatenation for the transitions: the tail of the preceding speech unit and the head of the following speech unit are superimposed over a waveform of a certain length. The choice of overlap length is critical. Experience showed that an overlap of about one eighth of the minimum length of the two units to be superimposed is appropriate. This keeps the speech between words coherent and smooth after synthesis while keeping the pronunciation clear, which improves the naturalness of the synthesized speech.
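Head-tail overlapping soft concatenation can be illustrated with a short sketch. The one-eighth overlap ratio comes from the text above; the linear cross-fade weights and the function name `soft_concatenate` are assumptions, since the source only says that the two waveforms are superimposed.

```python
def soft_concatenate(prev_unit, next_unit, overlap_ratio=1 / 8):
    """Join two sample sequences, cross-fading the tail of the first
    into the head of the second over the overlap region."""
    overlap = int(min(len(prev_unit), len(next_unit)) * overlap_ratio)
    if overlap == 0:
        # Units too short to overlap: fall back to hard concatenation.
        return list(prev_unit) + list(next_unit)

    head = list(prev_unit[:-overlap])   # untouched part of the first unit
    tail = list(next_unit[overlap:])    # untouched part of the second unit
    mixed = []
    for i in range(overlap):
        # Assumed linear weights: the second unit fades in as the first fades out.
        w = (i + 1) / (overlap + 1)
        a = prev_unit[len(prev_unit) - overlap + i]
        b = next_unit[i]
        mixed.append((1 - w) * a + w * b)
    return head + mixed + tail


joined = soft_concatenate([1.0] * 16, [0.0] * 8)
```

The result is shorter than the two inputs combined by exactly the overlap length, which is the trade-off the passage describes: too little overlap leaves an audible seam, too much blurs the pronunciation.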
Shortcomings of the prior art: waveform concatenation based on a large database generally requires a very large speech database, which hinders its use on mobile or embedded devices. In addition, the voice produced by waveform concatenation is fixed; changing characteristics such as the gender or age of the synthesized voice requires building a new speech database, at considerable cost. Finally, although many concatenation algorithms support prosody adjustment, the range over which pitch period and duration can be adjusted is limited, and larger adjustments noticeably degrade the naturalness of the synthesized speech.
Summary of the invention
The object of the embodiments of the present invention is to provide an HMM-based method of Mongolian speech synthesis and front-end processing, intended to solve the problems of existing Mongolian speech synthesis methods: high synthesis cost and a limited range of pitch period and duration adjustment.
The embodiments of the present invention are realized as follows. An HMM-based method of Mongolian speech synthesis comprises the following steps:
Step 1: analyze the speech data in the speech database and extract the corresponding speech parameters;
Step 2: according to the extracted speech parameters, divide the HMM observation vector into a spectrum part and a fundamental frequency part, and design the context attribute set and the question set used for decision-tree clustering;
Step 3: train the HMMs in the following order: model initialization, initial/final HMM training, extended context-dependent model training, post-clustering model training, and duration model training; the training result comprises the clustered HMMs for the spectrum, fundamental frequency and duration parameters, together with their respective decision trees;
Step 4: convert the input text into a context-dependent unit sequence through text analysis, then use the trained decision trees to make a decision for each unit, obtain the corresponding clustered state model, and form a clustered state model sequence;
Step 5: according to the parameter generation algorithm, use the dynamic features of the parameters to generate the target acoustic parameter sequence, and obtain the final synthesized speech through the STRAIGHT synthesizer.
Further, the HMM-based Mongolian speech synthesis method is divided into a training stage and a synthesis stage.
Further, in step 2, the spectrum parameters are modeled with continuous probability distribution HMMs, and the fundamental frequency is modeled with multi-space probability distribution HMMs (MSD-HMM).
Another object of the embodiments of the present invention is to provide a front-end processing method for HMM-based Mongolian speech synthesis, comprising the following steps:
First step: process the special characters in the input Mongolian text;
Second step: convert the Mongolian script to Latin transliteration by rule;
Third step: tag Mongolian parts of speech automatically using a history-based model;
Fourth step: convert the Latin transliteration to phonemes using a G2P model;
Fifth step: segment syllables by rule and count the syllables in each word;
Sixth step: combine the part-of-speech and syllable information into data in the format required for prosody prediction;
Seventh step: perform prosody prediction using a method based on conditional random fields (CRF);
Eighth step: process the result into the text format required for synthesis.
Further, in the first step, the special characters processed in the Mongolian text include digits and book-title marks.
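The patent does not say how digits and book-title marks are handled, so the following is only a sketch of what a special-character pass could look like. It assumes that title marks are dropped and that each ASCII digit is read out individually; the `DIGIT_READINGS` table uses English digit names as stand-ins for Mongolian number words, and the function name is hypothetical.

```python
import re

# Placeholder readings: English names stand in for Mongolian number words.
DIGIT_READINGS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}
TITLE_MARKS = "\u300a\u300b"  # the book-title marks 《 and 》


def process_special_characters(text):
    """Drop book-title marks and spell out ASCII digits one by one."""
    # Replace each title mark with a space so neighbouring words stay apart.
    text = text.translate({ord(mark): " " for mark in TITLE_MARKS})
    # Assumed policy: digit-by-digit reading, not full number verbalization.
    text = re.sub(r"[0-9]", lambda m: " " + DIGIT_READINGS[m.group()] + " ", text)
    return " ".join(text.split())  # collapse the extra whitespace


cleaned = process_special_characters("\u300aabc\u300b 25")
```

A production front end would more likely read "25" as a single number and would need rules for dates, ordinals and similar cases; those are outside what the source describes.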
In the HMM-based method of Mongolian speech synthesis and front-end processing provided by the invention, a corpus is built for the HMM-based speech synthesis system, the context attributes relevant to Mongolian are selected, the question set for clustering is determined, and the front end of the speech synthesis system is implemented. The front-end processing covers Mongolian special-character processing, Mongolian-to-Latin transliteration, Latin-to-phoneme conversion, Mongolian syllable segmentation, automatic Mongolian part-of-speech tagging and Mongolian prosody prediction. The invention can freely adjust parameters of the synthesized speech such as duration and pitch period while keeping the synthesized speech natural and fluent. The parameters can be transformed with algorithms such as MLLR and MAP to synthesize voices of different timbres without building an additional speech database, which reduces the cost of synthesis.
Brief description of the drawings
Fig. 1 is a flowchart of the HMM-based Mongolian speech synthesis method provided by an embodiment of the present invention;
Fig. 2 is a flowchart of the front-end processing method for HMM-based Mongolian speech synthesis provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
The application principle of the present invention is further described below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the HMM-based Mongolian speech synthesis method of the embodiment of the present invention comprises the following steps:
S101: analyze the speech data in the speech database and extract the corresponding speech parameters;
S102: according to the extracted speech parameters, divide the HMM observation vector into a spectrum part and a fundamental frequency part, and design the context attribute set and the question set used for decision-tree clustering;
S103: train the HMMs in the following order: model initialization, initial/final HMM training, extended context-dependent model training, post-clustering model training, and duration model training; the training result comprises the clustered HMMs for the spectrum, fundamental frequency and duration parameters, together with their respective decision trees;
S104: convert the input text into a context-dependent unit sequence through text analysis, then use the trained decision trees to make a decision for each unit, obtain the corresponding clustered state model, and form a clustered state model sequence;
S105: according to the parameter generation algorithm, use the dynamic features of the parameters to generate the target acoustic parameter sequence, and obtain the final synthesized speech through the STRAIGHT synthesizer.
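Step S104 amounts to walking a binary decision tree with the context attributes of each unit until a leaf, the clustered state model, is reached. The sketch below illustrates that traversal. The context attributes (`phone`, `syllables_in_word`), the questions and the leaf names are all hypothetical; the patent does not list its actual context attribute set or question set.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TreeNode:
    question: Optional[Callable[[dict], bool]] = None  # None on leaves
    yes: Optional["TreeNode"] = None
    no: Optional["TreeNode"] = None
    state_model: Optional[str] = None  # set on leaves only


def select_state_model(root, context):
    """Answer the questions from the root down and return the leaf's model."""
    node = root
    while node.state_model is None:
        node = node.yes if node.question(context) else node.no
    return node.state_model


# Hypothetical tree for one stream (for example, spectrum).
VOWELS = {"a", "e", "i", "o", "u"}
SPECTRUM_TREE = TreeNode(
    question=lambda c: c["phone"] in VOWELS,
    yes=TreeNode(
        question=lambda c: c["syllables_in_word"] >= 3,
        yes=TreeNode(state_model="vowel_long_word"),
        no=TreeNode(state_model="vowel_short_word"),
    ),
    no=TreeNode(state_model="consonant"),
)

# A context-dependent unit sequence becomes a clustered state model sequence.
units = [
    {"phone": "s", "syllables_in_word": 1},
    {"phone": "a", "syllables_in_word": 1},
]
state_sequence = [select_state_model(SPECTRUM_TREE, u) for u in units]
```

Because every context reaches some leaf, units never seen in training still receive a model, which is the practical reason for clustering context-dependent models with decision trees.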
The specific steps of the HMM-based Mongolian speech synthesis method of the embodiment of the present invention are as follows.
The method is divided into two stages: a training stage and a synthesis stage.
The training stage mainly comprises preprocessing and HMM training. In the preprocessing stage, the speech data in the speech database is first analyzed and the corresponding speech parameters are extracted. According to the extracted speech parameters, the HMM observation vector is divided into a spectrum part and a fundamental frequency part: the spectrum parameters are modeled with continuous probability distribution HMMs, and the fundamental frequency is modeled with multi-space probability distribution HMMs (MSD-HMM). Another important task before model training is to design the context attribute set and the question set used for decision-tree clustering: based on prior knowledge, context attributes that are relevant to Mongolian and have some influence on the acoustic parameters (spectrum, fundamental frequency and duration) are selected, and a corresponding question set is designed for clustering the context-dependent models.
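The reason the fundamental frequency needs a multi-space distribution is that it is undefined in unvoiced frames, so a single continuous density cannot cover the whole track. A minimal sketch of how an F0 track might be turned into multi-space observations is shown below; the convention that 0 Hz marks an unvoiced frame, the use of log F0 and the function name are assumptions, not details taken from the patent.

```python
import math


def to_msd_observations(f0_track):
    """Map an F0 track in Hz (0 = unvoiced, by assumption) to (space, value) pairs.

    Voiced frames fall in a one-dimensional continuous space (log F0);
    unvoiced frames fall in a zero-dimensional space that carries no value.
    """
    observations = []
    for f0 in f0_track:
        if f0 > 0:
            observations.append(("voiced", math.log(f0)))
        else:
            observations.append(("unvoiced", None))
    return observations


def voiced_fraction(observations):
    """Share of frames in the voiced space, a rough estimate of its weight."""
    if not observations:
        return 0.0
    voiced = sum(1 for space, _ in observations if space == "voiced")
    return voiced / len(observations)


obs = to_msd_observations([0.0, 120.0, 125.0, 0.0])
```

In an MSD-HMM each state then holds a weight for each space plus a density for the voiced space, so voicing decisions and F0 values are modeled together.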
After preprocessing comes the HMM training itself. The training steps are, in order: model initialization, initial/final HMM training, extended context-dependent model training, post-clustering model training and duration model training. The final training result comprises the clustered HMMs for the spectrum, fundamental frequency and duration parameters, together with their respective decision trees.
In the synthesis stage, the input text is first converted into a context-dependent unit sequence through text analysis. The trained decision trees are then used to make a decision for each unit, giving the corresponding clustered state model, and a clustered state model sequence is formed. Finally, according to the parameter generation algorithm, the dynamic features of the parameters are used to generate the target acoustic parameter sequence, and the final synthesized speech is obtained through the STRAIGHT synthesizer.
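The role of the dynamic features in parameter generation can be shown with a deliberately simplified sketch. It assumes a single one-dimensional parameter stream, per-frame Gaussian means and variances for the static value and for a first-order delta defined as c[t] - c[t-1], and diagonal covariances. Under those assumptions the most likely trajectory solves a tridiagonal linear system, handled here with the Thomas algorithm. The patent does not specify its delta windows or solver, so none of this should be read as the authors' implementation.

```python
def generate_trajectory(static_mean, static_var, delta_mean, delta_var):
    """Most likely static trajectory given static and delta statistics.

    delta[t] is assumed to be c[t] - c[t-1]; delta_mean[0] and
    delta_var[0] are ignored because frame 0 has no predecessor.
    """
    n = len(static_mean)
    if n == 0:
        return []

    # Normal equations of the weighted least-squares problem.
    diag = [1.0 / static_var[t] for t in range(n)]
    rhs = [static_mean[t] / static_var[t] for t in range(n)]
    off = [0.0] * n  # off[t] couples c[t-1] and c[t]; off[0] is unused
    for t in range(1, n):
        w = 1.0 / delta_var[t]
        diag[t] += w
        diag[t - 1] += w
        off[t] = -w
        rhs[t] += w * delta_mean[t]
        rhs[t - 1] -= w * delta_mean[t]

    # Thomas algorithm: forward elimination, then back substitution.
    for t in range(1, n):
        m = off[t] / diag[t - 1]
        diag[t] -= m * off[t]
        rhs[t] -= m * rhs[t - 1]
    c = [0.0] * n
    c[-1] = rhs[-1] / diag[-1]
    for t in range(n - 2, -1, -1):
        c[t] = (rhs[t] - off[t + 1] * c[t + 1]) / diag[t]
    return c


# A step in the static means is smoothed when the delta means ask for no change.
smooth = generate_trajectory([0.0, 0.0, 1.0, 1.0], [1.0] * 4, [0.0] * 4, [1.0] * 4)
```

Without the delta terms the output would simply copy the per-state static means and jump at every state boundary; the delta constraints are what make the generated trajectory continuous.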
As shown in Fig. 2, the front-end processing method for HMM-based Mongolian speech synthesis of the embodiment of the present invention comprises the following steps:
S201: process the special characters in the input Mongolian text;
S202: convert the Mongolian script to Latin transliteration by rule;
S203: tag Mongolian parts of speech automatically using a history-based model;
S204: convert the Latin transliteration to phonemes using a G2P model;
S205: segment syllables by rule and count the syllables in each word;
S206: combine the part-of-speech and syllable information into data in the format required for prosody prediction;
S207: perform prosody prediction using a method based on conditional random fields (CRF);
S208: process the result into the text format required for synthesis.
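Step S205 can be illustrated with a toy rule set. The patent does not disclose its syllabification rules, so the sketch below assumes a simple scheme on Latin-transliterated, lowercase input: every maximal run of vowels is one syllable nucleus, and the last consonant before a nucleus opens the next syllable. The vowel inventory is also an assumption.

```python
import re

VOWEL_RUN = re.compile(r"[aeiou]+")  # assumed vowel inventory


def split_syllables(word):
    """Split a lowercase Latin-transliterated word into syllables."""
    nuclei = [m.span() for m in VOWEL_RUN.finditer(word)]
    if len(nuclei) <= 1:
        return [word] if word else []

    pieces, start = [], 0
    for next_start, _ in nuclei[1:]:
        # Vowel runs are maximal, so at least one consonant precedes each
        # later nucleus; that consonant starts the new syllable.
        boundary = next_start - 1
        pieces.append(word[start:boundary])
        start = boundary
    pieces.append(word[start:])
    return pieces


def count_syllables(word):
    return len(split_syllables(word))


syllables = split_syllables("mongol")
```

The syllable count per word is the quantity the later prosody prediction step consumes, so in practice the split itself matters less than getting the count right.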
The specific steps of the front-end processing method for HMM-based Mongolian speech synthesis of the present invention are:
One: process the special characters in the input Mongolian text (including characters such as digits and book-title marks);
Two: convert the Mongolian script to Latin transliteration by rule;
Three: tag Mongolian parts of speech automatically using a history-based model;
Four: convert the Latin transliteration to phonemes using a G2P model;
Five: segment syllables by rule and count the syllables in each word;
Six: combine the parts of speech from step three and the syllable information from step five into data in the format required for prosody prediction;
Seven: perform prosody prediction using a method based on conditional random fields (CRF);
Eight: process the result into the text format required for synthesis.
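Step six, combining the part-of-speech tags and syllable counts into the input for prosody prediction, might look like the sketch below. It assumes a tab-separated, one-token-per-line layout with a blank line closing each sentence, which is a common input convention for CRF toolkits; the patent does not name the toolkit or the exact column layout, and the tag names are placeholders.

```python
def to_crf_rows(words, pos_tags, syllable_counts):
    """Build one tab-separated row per word: word, POS tag, syllable count."""
    if not (len(words) == len(pos_tags) == len(syllable_counts)):
        raise ValueError("inputs must have one entry per word")
    rows = [
        f"{word}\t{pos}\t{count}"
        for word, pos, count in zip(words, pos_tags, syllable_counts)
    ]
    return "\n".join(rows) + "\n\n"  # the blank line ends the sentence


# Placeholder tags; the patent's tag set is not given.
crf_input = to_crf_rows(["sain", "mongol"], ["ADJ", "N"], [1, 2])
```

For training, a further column holding the prosodic boundary label would be appended to each row; at prediction time the CRF fills that column in.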
The HMM-based Mongolian speech synthesis method of the present invention can freely adjust parameters of the synthesized speech such as duration and pitch period while keeping the synthesized speech natural and fluent. The parameters can be transformed with algorithms such as MLLR and MAP to synthesize voices of different timbres. Unlike waveform concatenation, this transformation does not require building an additional speech database; only a small amount of training data is needed.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (5)

1. An HMM-based method of Mongolian speech synthesis, characterized in that the method comprises the following steps:
Step 1: analyze the speech data in the speech database and extract the corresponding speech parameters;
Step 2: according to the extracted speech parameters, divide the HMM observation vector into a spectrum part and a fundamental frequency part, and design the context attribute set and the question set used for decision-tree clustering;
Step 3: train the HMMs in the following order: model initialization, initial/final HMM training, extended context-dependent model training, post-clustering model training, and duration model training; the training result comprises the clustered HMMs for the spectrum, fundamental frequency and duration parameters, together with their respective decision trees;
Step 4: convert the input text into a context-dependent unit sequence through text analysis, then use the trained decision trees to make a decision for each unit, obtain the corresponding clustered state model, and form a clustered state model sequence;
Step 5: according to the parameter generation algorithm, use the dynamic features of the parameters to generate the target acoustic parameter sequence, and obtain the final synthesized speech through the STRAIGHT synthesizer.
2. The HMM-based method of Mongolian speech synthesis according to claim 1, characterized in that the method is divided into a training stage and a synthesis stage.
3. The HMM-based method of Mongolian speech synthesis according to claim 1, characterized in that, in step 2, the spectrum parameters are modeled with continuous probability distribution HMMs, and the fundamental frequency is modeled with multi-space probability distribution HMMs (MSD-HMM).
4. A front-end processing method for HMM-based Mongolian speech synthesis, characterized in that the method comprises the following steps:
First step: process the special characters in the input Mongolian text;
Second step: convert the Mongolian script to Latin transliteration by rule;
Third step: tag Mongolian parts of speech automatically using a history-based model;
Fourth step: convert the Latin transliteration to phonemes using a G2P model;
Fifth step: segment syllables by rule and count the syllables in each word;
Sixth step: combine the part-of-speech and syllable information into data in the format required for prosody prediction;
Seventh step: perform prosody prediction using a method based on conditional random fields (CRF);
Eighth step: process the result into the text format required for synthesis.
5. The front-end processing method for HMM-based Mongolian speech synthesis according to claim 4, characterized in that, in the first step, the special characters processed in the Mongolian text include digits and book-title marks.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201310595871.8A (CN103632663B) | 2013-11-25 | 2013-11-25 | A front-end processing method for HMM-based Mongolian speech synthesis

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201310595871.8A (CN103632663B) | 2013-11-25 | 2013-11-25 | A front-end processing method for HMM-based Mongolian speech synthesis

Publications (2)

Publication Number | Publication Date
CN103632663A | 2014-03-12
CN103632663B | 2016-08-17

Family

ID=50213641

Family Applications (1)

Application Number | Status | Title | Priority Date | Filing Date
CN201310595871.8A (CN103632663B) | Expired - Fee Related | A front-end processing method for HMM-based Mongolian speech synthesis | 2013-11-25 | 2013-11-25

Country Status (1)

Country | Link
CN (1) | CN103632663B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101178896A (en) * 2007-12-06 2008-05-14 安徽科大讯飞信息科技股份有限公司 Unit selection voice synthetic method based on acoustics statistical model
CN102063897A (en) * 2010-12-09 2011-05-18 北京宇音天下科技有限公司 Sound library compression for embedded type voice synthesis system and use method thereof
CN103226946A (en) * 2013-03-26 2013-07-31 中国科学技术大学 Voice synthesis method based on limited Boltzmann machine

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
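吴义坚 et al.: "基于HMM的可训练中文语音合成" (HMM-based trainable Chinese speech synthesis), 《中文信息学报》, vol. 20, no. 4, 30 July 2006 *
董远 et al.: "条件随机场模型在韵律结构预测中的应用" (Application of conditional random field models to prosodic structure prediction), 《北京邮电大学学报》, vol. 32, no. 5, 31 October 2009, pages 36-40 *
赵建东 et al.: "基于历史模型的蒙古文自动词性标注研究" (Research on automatic part-of-speech tagging of Mongolian based on a history model), 《中文信息学报》, vol. 27, no. 5, 30 September 2013 *
飞龙 et al.: "蒙古文字母到音素转换方法的研究" (Research on Mongolian letter-to-phoneme conversion methods), 《计算机应用研究》, vol. 30, no. 6, 30 June 2013, pages 1696-1700 *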

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105427855A (en) * 2015-11-09 2016-03-23 上海语知义信息技术有限公司 Voice broadcast system and voice broadcast method of intelligent software
CN105679308A (en) * 2016-03-03 2016-06-15 百度在线网络技术(北京)有限公司 Method and device for generating g2p model based on artificial intelligence and method and device for synthesizing English speech based on artificial intelligence
CN105957518A (en) * 2016-06-16 2016-09-21 内蒙古大学 Mongolian large vocabulary continuous speech recognition method
CN105957518B (en) * 2016-06-16 2019-05-31 内蒙古大学 A kind of method of Mongol large vocabulary continuous speech recognition
CN108334502A (en) * 2017-12-29 2018-07-27 内蒙古蒙科立蒙古文化股份有限公司 A kind of method for mutually conversing of tradition Mongolian and Cyrillic Mongolian
CN108766413A (en) * 2018-05-25 2018-11-06 北京云知声信息技术有限公司 Phoneme synthesizing method and system
CN110349564A (en) * 2019-07-22 2019-10-18 苏州思必驰信息科技有限公司 Across the language voice recognition methods of one kind and device
CN110349564B (en) * 2019-07-22 2021-09-24 思必驰科技股份有限公司 Cross-language voice recognition method and device
CN112562637A (en) * 2019-09-25 2021-03-26 北京中关村科金技术有限公司 Method, device and storage medium for splicing voice and audio
CN112562637B (en) * 2019-09-25 2024-02-06 北京中关村科金技术有限公司 Method, device and storage medium for splicing voice audios
CN114936555A (en) * 2022-05-24 2022-08-23 内蒙古自治区公安厅 Method and system for carrying out AI intelligent labeling on Mongolian

Also Published As

Publication number Publication date
CN103632663B (en) 2016-08-17

Legal Events

Code | Title | Description
PB01 | Publication |
C10 | Entry into substantive examination |
SE01 | Entry into force of request for substantive examination |
C41 | Transfer of patent application or patent right or utility model |
TA01 | Transfer of patent application right | Effective date of registration: 20160615. Address after: 010021 Hohhot West Road, the Inner Mongolia Autonomous Region, No. 235. Applicant after: Inner Mongolia University. Address before: 010021 Hohhot West Road, the Inner Mongolia Autonomous Region, No. 235. Applicant before: 飞龙
C14 | Grant of patent or utility model |
GR01 | Patent grant |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160817