CN102568472A - Voice synthesis system with speaker selection and realization method thereof - Google Patents

Voice synthesis system with speaker selection and realization method thereof

Publication number
CN102568472A
Authority
CN
China
Prior art keywords
target speaker
text
speech
speaker
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010105891201A
Other languages
Chinese (zh)
Inventor
吴悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shengle Information Technolpogy Shanghai Co Ltd
Original Assignee
Shengle Information Technolpogy Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2010-12-15
Filing date
2010-12-15
Publication date
2012-07-11
Application filed by Shengle Information Technolpogy Shanghai Co Ltd
Priority to CN2010105891201A
Publication of CN102568472A

Landscapes

  • Telephonic Communication Services (AREA)
  • Telephone Function (AREA)

Abstract

The invention discloses a speech synthesis system with a selectable speaker and a method for implementing the system. The system comprises a target speaker data extraction device, a model adaptation device, and a target speaker speech synthesis device. The method comprises the following steps: (A) the target speaker data extraction device collects speech data of a target speaker; (B) the model adaptation device generates a target speaker model from that speech data and stores it in a target speaker model library; and (C) after the user activates the system, the target speaker speech synthesis device performs speech synthesis. In its mobile-phone embedded version, the system lets the user choose a designated target speaker to read short messages and other text on the phone aloud, according to the user's preference. This extends the phone's functions and makes obtaining information on the phone more entertaining and interactive. The system can also be applied to platforms other than mobile phones.

Description

Speech synthesis system with a selectable speaker and method for implementing it
Technical field
The present invention relates to a speech synthesis system, in particular a speech synthesis system in which the speaker is selectable. The invention also relates to a method for implementing this speech synthesis system.
Background art
Current mobile phone platforms generally present the content of short messages and other text as written characters only. This form is monotonous, lacks appeal, and offers little interactivity. Speech synthesis technology can address this problem to some extent: it converts the written text into audio and reads the text on the device aloud to the user. However, most existing speech synthesis systems offer little variety; a system usually includes only one or two speakers and cannot satisfy users' varied emotional needs. A user who dislikes the voice shipped with the system may even become averse to using the system at all.
Existing technology addresses these problems only partially. For example, Chinese patent No. 200480010899.X, entitled "Text-to-speech system that depends on the source", describes a method of generating speech from a text message. The method comprises determining a speech feature vector of the voice associated with the source of the text message and comparing this speech feature vector with a plurality of speaker models. Its shortcoming is that the speaker models are supplied and fixed by the system, so it adapts poorly to user requirements.
Chinese patent No. 01116305.4, entitled "Method for generating personalized speech from text", introduces a concrete method of generating an adaptive model, but does not describe a concrete method for obtaining the target speaker's speech data.
In addition, apart from the mobile phone platform mentioned above, there is currently no speech synthesis system with a good user experience for other platforms.
Summary of the invention
The technical problem to be solved by the present invention is to provide a speech synthesis system with a selectable speaker that is engaging and expressive. It not only makes communication between users more enjoyable (for example, short-message exchanges between mobile phone users) but also improves the user's reading experience.
To solve the above technical problem, the speech synthesis system with a selectable speaker of the present invention comprises:
A target speaker data extraction device, used to extract the target speaker's speech data, which comprises audio data and the corresponding text data. This device comprises: a recording module, used to record the target speaker's speech; a text library with phoneme features, which provides text for the target speaker to read aloud; and a speech recognition module, used to convert the recorded speech (audio data) of the target speaker into the corresponding text data. The sound sources from which the recording module records the target speaker's speech include ambient sound and telephone call speech;
A model adaptation device, used to generate and select the model of a designated target speaker. This device comprises: a speaker conversion module, used to generate a target speaker model from the target speaker's speech data; and a target speaker model library, used to store target speaker models;
A target speaker speech synthesis device, used to generate synthetic speech of the target speaker reading text aloud. This device comprises: a text analysis module, used to analyze the text to be read aloud; and a speech synthesis module, used to generate the synthetic speech of the designated target speaker reading the specified text aloud.
The speech synthesis system with a selectable speaker of the present invention can be applied to platforms including mobile phone platforms, e-mail platforms, and voice broadcast platforms.
Another technical problem to be solved by the present invention is to provide a method for implementing the above speech synthesis system.
To solve this technical problem, the method for implementing the speech synthesis system with a selectable speaker of the present invention comprises the steps of:
(A) The target speaker data extraction device collects the target speaker's speech data;
(B) The model adaptation device generates a target speaker model from the target speaker's speech data and stores it in the target speaker model library;
(C) After the user activates the speech synthesis system, the target speaker speech synthesis device performs speech synthesis according to the following steps:
(1) The user specifies the text and a name;
In a speech synthesis system applied to a mobile phone platform, the user can specify the text and name in either of the following ways:
1. The target speaker models in the speech synthesis system are bound to names in the phone's address book; a short message whose sender is a bound name serves as the specified text, and the sender's name serves as the specified name;
2. Text stored on the phone serves as the specified text, and the user specifies the name manually;
(2) The text analysis module analyzes the text;
(3) The speech synthesis module retrieves the corresponding model from the target speaker model library according to the name and, using the analysis result of the text analysis module, generates synthetic speech of the target speaker reading the text aloud;
(4) The synthesized speech is played.
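Steps (1) to (4) of step (C) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `Sms`, `specify_text_and_name`, and `run_synthesis` are hypothetical, and the analysis, synthesis, and playback functions are passed in as stand-ins because the source does not specify them.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class Sms:
    """A received short message (hypothetical structure)."""
    sender: str  # address-book name of the sender
    body: str    # message text


def specify_text_and_name(
    sms: Optional[Sms] = None,
    stored_text: Optional[str] = None,
    manual_name: Optional[str] = None,
) -> Tuple[str, str]:
    """Step (1): return (text, name) for either way of specifying them."""
    if sms is not None:
        # Way 1: the message is the text, its sender is the name.
        return sms.body, sms.sender
    if stored_text is not None and manual_name is not None:
        # Way 2: text stored on the phone, name chosen by the user.
        return stored_text, manual_name
    raise ValueError("need an SMS, or stored text plus a manually chosen name")


def run_synthesis(
    text: str,
    name: str,
    model_library: Dict[str, Any],
    analyze: Callable[[str], Any],
    synthesize: Callable[[Any, Any], Any],
    play: Callable[[Any], None],
) -> Any:
    """Steps (2) to (4): analyze, look up the model by name, synthesize, play."""
    analysis = analyze(text)             # step (2)
    model = model_library[name]          # step (3): model retrieved by name
    audio = synthesize(model, analysis)  # step (3): synthetic speech
    play(audio)                          # step (4)
    return audio
```

If the name has no model in the library, the dictionary lookup raises `KeyError`; a real system would need to decide how to handle that case.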
In step (A), the user may choose any one of the following modes for the target speaker data extraction device to extract the target speaker's speech data:
(1) The target speaker reads aloud the text with phoneme features specified by the target speaker data extraction device while the recording module records; the specified text with phoneme features serves as the text data and the recorded speech as the audio data;
The Chinese characters in the specified text with phoneme features should cover all syllables;
(2) The target speaker reads aloud any free text while the recording module records; the speech recognition module then converts the recorded speech into text, which serves as the text data, and the recorded speech serves as the audio data;
(3) The recording module records the target speaker's speech during a telephone call; the speech recognition module then converts the recorded speech into text, which serves as the text data, and the recorded speech serves as the audio data.
The recording length in modes (2) and (3) must meet the duration specified by the target speaker data extraction device. If a single recording is too short, further recordings are needed until the total audio duration meets the specified requirement, and the combined audio that meets the requirement serves as the target speaker's audio data.
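The duration rule for modes (2) and (3) amounts to summing the lengths of the individual recordings and comparing the total with the required duration. The sketch below illustrates this; the function names are hypothetical, and the source does not state the required duration, so `required_s` is a caller-supplied assumption.

```python
from typing import Iterable


def remaining_duration(recordings_s: Iterable[float], required_s: float) -> float:
    """Return how many more seconds of audio are needed (0.0 if enough)."""
    total = sum(recordings_s)
    return max(0.0, required_s - total)


def is_enough(recordings_s: Iterable[float], required_s: float) -> bool:
    """True when the combined recordings meet the required duration."""
    return remaining_duration(recordings_s, required_s) == 0.0
```

For example, with an assumed requirement of 120 seconds, two recordings of 40 and 50 seconds leave 30 seconds still to record, so the device would prompt for another recording.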
To improve the quality of the synthesized target speaker speech, that is, to obtain a target speaker model whose parameters match the speaker closely, the speech synthesis system of the present invention includes text containing complete phoneme features for the target speaker to read aloud and record. If the user dislikes this way of collecting data, the target speaker may instead read aloud text of any length while being recorded, or the user may record a telephone call with the target speaker; the text content is then obtained by speech recognition, and the recording must meet the specified duration.
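The requirement that the prompt text cover all syllables can be checked mechanically. The sketch below is only an illustration: `missing_syllables` is a hypothetical helper, the character-to-syllable table and the required syllable inventory are toy data supplied by the caller, and characters with more than one reading are ignored for simplicity.

```python
from typing import Dict, Set


def missing_syllables(
    text: str,
    char_to_syllable: Dict[str, str],
    required: Set[str],
) -> Set[str]:
    """Return the required syllables that no character in `text` provides."""
    # Characters absent from the table (punctuation, unknown characters) are skipped.
    covered = {char_to_syllable[ch] for ch in text if ch in char_to_syllable}
    return required - covered
```

An empty result means the prompt text covers the whole inventory; otherwise the returned set names the syllables for which more text would have to be added.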
The speech synthesis system of the present invention applied to a mobile phone platform combines two functions: reading short messages aloud and reading text stored on the phone aloud. The user can bind the system to names in the phone's address book and have short messages read in the target speaker's voice, or specify any passage of text on the phone to be read in a target speaker's voice. When the target speaker model library is bound to names in the address book and a short message from a target speaker arrives, the user can have the message read in that person's voice. For other text stored on the phone, the system lets the user designate a target speaker to read it aloud.
The speech synthesis system of the present invention is therefore engaging and expressive, makes communication between users more enjoyable, and provides a varied reading experience. It can also be applied to platforms other than mobile phones, such as e-mail platforms and voice broadcast platforms.
Description of drawings
The present invention is described in further detail below with reference to the accompanying drawings and an embodiment:
Fig. 1 is a module diagram of the speech synthesis system of the present invention;
Fig. 2 is a flow diagram of the operation of the system of the present invention;
Fig. 3 is a flow diagram of the collection of target speaker data in the present invention.
Embodiment
For a more concrete understanding of the technical content, features, and effects of the present invention, the speech synthesis system with a selectable speaker on a mobile phone platform is taken as an example and described in detail below with reference to the illustrated embodiment:
The speech synthesis system with a selectable speaker on a mobile phone platform is an embedded version developed for the phone's operating system. It can synthesize the target speaker's voice reading that speaker's short messages aloud, or read specified text on the phone in a target speaker's voice. The system comprises a target speaker data extraction device, a model adaptation device, and a target speaker speech synthesis device. Fig. 1 shows the module diagram of the system.
The target speaker data extraction device is used to extract the target speaker's speech data, which comprises audio data and the corresponding text data. This device comprises:
A recording module, used to record the target speaker's speech; the recording module can record from ambient sound or from telephone call speech;
A text library with phoneme features, which provides text for the target speaker to read aloud;
A speech recognition module, used to convert the recorded speech of the target speaker into the corresponding text data.
To accommodate different user preferences, the user can choose any one of the following three modes for the target speaker data extraction device to extract the target speaker's speech data (shown in Fig. 3):
(1) The target speaker reads aloud the text with phoneme features that the target speaker data extraction device takes from the text library with phoneme features, while the recording module records; the specified text with phoneme features serves as the text data and the recorded speech as the audio data;
The Chinese characters in the specified text with phoneme features cover all syllables;
(2) If the user finds reading the specified text tedious, the target speaker can instead read aloud any free text while the recording module records; the speech recognition module then converts the recorded speech into text, which serves as the text data, and the recorded speech serves as the audio data;
(3) The user can use the recording module to record a mobile phone call with the target speaker; the speech recognition module then converts the recorded speech into text, which serves as the text data, and the recorded speech serves as the audio data. This mode is not limited by distance, but the audio quality is further reduced.
Mode (1) yields the highest-quality data. For modes (2) and (3), the recording time must meet the duration specified by the target speaker data extraction device. If a single recording is too short, further recordings are needed until the total audio duration meets the specified requirement, and the combined audio that meets the requirement serves as the target speaker's audio data.
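The three modes differ mainly in where the text data comes from: in mode (1) it is the prompt text that was read, and in modes (2) and (3) it is the output of speech recognition. A minimal sketch of that distinction follows; `Mode`, `SpeechData`, and `build_speech_data` are hypothetical names, and `recognize` stands in for the speech recognition module, whose interface the source does not specify.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Mode(Enum):
    PROMPT_TEXT = 1  # mode (1): read the system-specified text
    FREE_TEXT = 2    # mode (2): read any free text
    PHONE_CALL = 3   # mode (3): record a telephone call


@dataclass
class SpeechData:
    audio: bytes  # recorded speech (audio data)
    text: str     # corresponding text data


def build_speech_data(
    mode: Mode,
    audio: bytes,
    recognize: Callable[[bytes], str],
    prompt_text: Optional[str] = None,
) -> SpeechData:
    """Pair the recorded audio with its text according to the chosen mode."""
    if mode is Mode.PROMPT_TEXT:
        if prompt_text is None:
            raise ValueError("mode (1) needs the prompt text that was read")
        # The known prompt is used directly; no recognition step is needed.
        return SpeechData(audio=audio, text=prompt_text)
    # Modes (2) and (3): the text is recovered from the audio by recognition.
    return SpeechData(audio=audio, text=recognize(audio))
```

Because mode (1) never depends on recognition output, its text data cannot contain recognition errors, which is consistent with the statement that it yields the highest-quality data.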
The model adaptation device is used to generate and select the model of a designated target speaker. This device comprises:
A speaker conversion module, used to generate a target speaker model from the target speaker's speech data;
A target speaker model library, used to store target speaker models.
After the speaker conversion module of the model adaptation device obtains the target speaker data, it can use existing adaptation techniques to map the model parameters of a source speaker model using the target speaker data, obtaining the target speaker model, and then stores the resulting model in the target speaker model library. At the user's request, a model in the library can be bound to a name in the phone's address book and used to read aloud short messages sent by that speaker.
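The storage and binding behaviour of the target speaker model library can be sketched as two dictionaries. This is an assumption-laden illustration: the class and method names are hypothetical, the model is treated as an opaque object, and the adaptation step itself is not shown because the source refers only to "existing adaptation techniques".

```python
from typing import Any, Dict, Optional


class TargetSpeakerModelLibrary:
    """Stores target speaker models and their address-book bindings."""

    def __init__(self) -> None:
        self._models: Dict[str, Any] = {}    # speaker id -> model
        self._bindings: Dict[str, str] = {}  # contact name -> speaker id

    def store(self, speaker_id: str, model: Any) -> None:
        """Store a model; storing again under the same id overwrites it."""
        self._models[speaker_id] = model

    def bind(self, contact_name: str, speaker_id: str) -> None:
        """Bind an address-book name to a stored model."""
        if speaker_id not in self._models:
            raise KeyError(f"no model stored for {speaker_id!r}")
        self._bindings[contact_name] = speaker_id

    def model_for_contact(self, contact_name: str) -> Optional[Any]:
        """Return the model bound to a contact, or None if nothing is bound."""
        speaker_id = self._bindings.get(contact_name)
        return None if speaker_id is None else self._models[speaker_id]
```

In this sketch a binding points at a speaker id rather than at a model object, so overwriting a model keeps the binding valid. That is a design choice of the sketch; the source describes binding again after a model is overwritten.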
The target speaker speech synthesis device is used to generate synthetic speech of the target speaker from the content of the text specified by the user. This device comprises:
A front-end text analysis module, used to analyze the text to be read aloud, for example how each character in the text is pronounced and where to pause;
A back-end speech synthesis module, used to generate the synthetic speech of the designated target speaker reading the specified text aloud.
In the target speaker speech synthesis device, the user can specify the text content in two ways: by receiving a short message from a target speaker, or by selecting existing text on the phone. The target speaker is determined correspondingly: in the first case the speaker bound in the address book is selected automatically, and in the second the user designates the speaker manually. The analysis result from the front-end text analysis module is passed to the back-end speech synthesis module, which generates the synthetic speech of the text being read aloud.
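One part of the front-end analysis described above, deciding where to pause, can be approximated by splitting the text at punctuation. The sketch below shows only that part and is an assumption, not the module's actual method: `split_at_pauses` is a hypothetical name, the set of punctuation marks is chosen for illustration, and pronunciation analysis is not covered.

```python
import re
from typing import List

# Common Chinese and Western punctuation treated as pause points (an assumption).
_PAUSE_MARKS = r"[,。!?;、,.!?;]+"


def split_at_pauses(text: str) -> List[str]:
    """Split text into phrases at punctuation, dropping empty pieces."""
    parts = re.split(_PAUSE_MARKS, text)
    return [p.strip() for p in parts if p.strip()]
```

A rule this simple also splits at the period inside a number such as "3.5", so a practical front end would need additional handling for such cases.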
The method for implementing the speech synthesis system of the present invention is described in further detail below. As shown in Fig. 2, the method comprises the following steps:
(A) The target speaker data extraction device collects the target speaker's speech data:
The user invites the desired target speaker to take part in data collection. The target speaker can choose whichever of the three modes described above (shown in Fig. 3) he or she prefers. Note that for the modes that do not involve reading the system-specified text, the recording time must meet the duration required by the system.
For example, user Zhang San wants Li Si's speech data in preparation for model adaptation. If Li Si is present, Zhang San invites Li Si to read the text with phoneme features provided by the target speaker data extraction device while the recording module records. When recording ends, the system saves the audio data and uses the specified text with phoneme features as the text data. If Li Si dislikes reading the text, Zhang San asks him to say whatever he likes, that is, arbitrary text, while Zhang San records him with the recording module. If the recording time does not meet the duration specified by the target speaker data extraction device, the device gives a corresponding prompt, and Zhang San can record Li Si one or more further times. When recording ends, the speech recognition module recognizes the corresponding text from the audio and uses it as the text data. If Li Si is not present, Zhang San telephones Li Si and records Li Si's side of the conversation with the recording module during the call. If the recording time does not meet the specified duration, the device gives a corresponding prompt, and Zhang San can call Li Si and record one or more further times. When recording ends, the speech recognition module recognizes the corresponding text from the audio and uses it as the text data.
(B) The model adaptation device generates a target speaker model from the target speaker's speech data and stores it in the target speaker model library:
Once the target speaker data meets the system's requirements, the system asks the user whether to perform model adaptation immediately. If the user chooses yes, the model adaptation device starts the speaker conversion module and begins model adaptation. If the user chooses no, model adaptation can be performed later, or after more speaker data has been collected.
The resulting target speaker model is stored in the target speaker model library. After a model has been obtained, the user can also train a new model with new data and overwrite the model in the library. After a target speaker model is saved, the system asks the user whether to bind it to the phone's address book. If the user chooses yes, the system opens the address book for the user to select a name, and the binding is completed once the user selects the corresponding name.
For example, suppose Zhang San has obtained Li Si's speech data. He selects Li Si's data in the system, obtains Li Si's speech model by adaptation with the speaker conversion module, and saves it in the target speaker model library. The system then asks whether to bind the model to a name in the address book; Zhang San chooses yes and selects Li Si in the address book to complete the binding. Some time later, Zhang San obtains a new segment of Li Si's speech data. He uses the new data to obtain a new speech model for Li Si by adaptation, overwrites Li Si's original speech model in the target speaker model library, and binds it to the name "Li Si" in the address book again.
(C) After the user activates the speech synthesis system, the target speaker speech synthesis device performs speech synthesis:
Once the user has activated the system and bound a model to a name, whenever a short message from that person is subsequently received the system asks whether to read the message aloud. If the user chooses yes, the target speaker speech synthesis device synthesizes speech reading the message aloud and plays it.
For example, Zhang San has bound Li Si's speech model to "Li Si" in the address book. When Zhang San receives a short message from Li Si, the system asks whether to read it aloud; Zhang San chooses yes, and the target speaker speech synthesis device synthesizes Li Si's voice reading the message aloud and plays it.
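The decision made when a short message arrives can be sketched as a small handler. The names here are hypothetical and the sketch makes two assumptions: bindings are given as a plain mapping from address-book name to model, and the user prompt is represented by a `confirm` callback.

```python
from typing import Any, Callable, Dict, Optional, Tuple


def on_sms_received(
    sender: str,
    body: str,
    bound_models: Dict[str, Any],
    confirm: Callable[[str], bool],
) -> Optional[Tuple[Any, str]]:
    """Return (model, text) to synthesize, or None if nothing should be read."""
    model = bound_models.get(sender)
    if model is None:
        # No model is bound to this sender, so there is no voice to read with.
        return None
    if not confirm(f"Read the message from {sender} aloud?"):
        # The user declined the prompt.
        return None
    return model, body
```

The returned pair is what the target speaker speech synthesis device would need in order to synthesize and play the message.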
In addition, whether or not a target speaker model is bound to the address book, it can be used to read other text on the phone. The user opens the system, selects and opens a text document at a given path, and then manually selects a model from the target speaker model library. After confirmation, the target speaker speech synthesis device synthesizes the target speaker's reading of the current page of the document and plays it.
For example, Zhang San wants a text document read aloud in Wang Wu's voice. Wang Wu's speech model is not bound to the address book, but Zhang San can still select it manually in the system and, after confirming, have the target speaker speech synthesis device synthesize Wang Wu's voice reading the text.
The mobile-phone embedded version of the speech synthesis system described above lets the user choose a designated target speaker to read short messages and text on the phone aloud, according to the user's preference. It extends the phone's functions and makes obtaining information on the phone more entertaining and interactive.
Although only the application to a mobile phone platform has been described, the present invention can, as stated above, equally be applied to other platforms such as e-mail platforms and voice broadcast platforms.
In the speech synthesis system with a selectable speaker and its implementation method of the present invention, the target speaker's speech model is generated by adaptation from the target speaker's speech data, so the model library is dynamic and matches the speakers the user wants more closely. The invention also adopts concrete speech data collection methods that suit different scenarios and aims to make the speech data contain more complete phoneme features, which makes it possible to obtain a target speaker model with a closer parameter match. The present invention therefore makes obtaining information more entertaining and interactive for the user.

Claims (11)

1. A speech synthesis system with a selectable speaker, characterized in that the speech synthesis system comprises:
A target speaker data extraction device, used to extract the target speaker's speech data;
A model adaptation device, used to generate and select the model of a designated target speaker;
A target speaker speech synthesis device, used to generate synthetic speech of the target speaker reading text aloud.
2. The speech synthesis system with a selectable speaker according to claim 1, characterized in that the target speaker data extraction device comprises: a recording module, used to record the target speaker's speech; a text library with phoneme features, which provides text for the target speaker to read aloud; and a speech recognition module, used to convert the recorded speech of the target speaker into the corresponding text data;
The model adaptation device comprises: a speaker conversion module, used to generate a target speaker model from the target speaker's speech data; and a target speaker model library, used to store target speaker models;
The target speaker speech synthesis device comprises: a text analysis module, used to analyze the text to be read aloud; and a speech synthesis module, used to generate the synthetic speech of the designated target speaker reading the specified text aloud.
3. The speech synthesis system with a selectable speaker according to claim 1, characterized in that the speech synthesis system is applied to platforms including mobile phone platforms, e-mail platforms, and voice broadcast platforms.
4. The speech synthesis system with a selectable speaker according to claim 1, characterized in that, in the target speaker data extraction device, the target speaker's speech data comprises audio data and the corresponding text data.
5. The speech synthesis system with a selectable speaker according to claim 2, characterized in that the sound sources from which the recording module records the target speaker's speech include ambient sound and telephone call speech.
6. A method for implementing the speech synthesis system with a selectable speaker according to any one of claims 1 to 5, comprising the steps of:
(A) The target speaker data extraction device collects the target speaker's speech data;
(B) The model adaptation device generates a target speaker model from the target speaker's speech data and stores it in the target speaker model library;
(C) After the user activates the speech synthesis system, the target speaker speech synthesis device performs speech synthesis.
7. The method for implementing the speech synthesis system with a selectable speaker according to claim 6, characterized in that, in step (A), the target speaker data extraction device extracts the target speaker's speech data in any one of the following modes:
(1) The target speaker reads aloud the text with phoneme features specified by the target speaker data extraction device while the recording module records; the specified text with phoneme features serves as the text data and the recorded speech as the audio data;
(2) The target speaker reads aloud any free text while the recording module records; the speech recognition module then converts the recorded speech into text, which serves as the text data, and the recorded speech serves as the audio data;
(3) The recording module records the target speaker's speech during a telephone call; the speech recognition module then converts the recorded speech into text, which serves as the text data, and the recorded speech serves as the audio data.
8. The method for implementing the speech synthesis system with a selectable speaker according to claim 6, characterized in that, in step (C), the target speaker speech synthesis device performs speech synthesis according to the following steps:
(1) The user specifies the text and a name;
(2) The text analysis module analyzes the text;
(3) The speech synthesis module retrieves the corresponding model from the target speaker model library according to the name and, using the analysis result of the text analysis module, generates synthetic speech of the target speaker reading the text aloud;
(4) The synthesized speech is played.
9. The method for implementing the speech synthesis system with a selectable speaker according to claim 8, characterized in that, in step (1), in a speech synthesis system applied to a mobile phone platform, the user specifies the text and name in either of the following ways:
1. The target speaker models in the speech synthesis system are bound to names in the phone's address book; a short message whose sender is a bound name serves as the specified text, and the sender's name serves as the specified name;
2. Text stored on the phone serves as the specified text, and the user specifies the name manually.
10. The method for implementing the speech synthesis system with a selectable speaker according to claim 7, characterized in that the Chinese characters in the specified text with phoneme features in mode (1) cover all syllables.
11. The method for implementing the speech synthesis system with a selectable speaker according to claim 7, characterized in that the recording length in modes (2) and (3) must meet the duration specified by the target speaker data extraction device; if a single recording is too short, further recordings are needed until the total audio duration meets the requirement specified by the target speaker data extraction device, and the combined audio that meets the requirement serves as the target speaker's audio data.
CN2010105891201A 2010-12-15 2010-12-15 Voice synthesis system with speaker selection and realization method thereof Pending CN102568472A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010105891201A CN102568472A (en) 2010-12-15 2010-12-15 Voice synthesis system with speaker selection and realization method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010105891201A CN102568472A (en) 2010-12-15 2010-12-15 Voice synthesis system with speaker selection and realization method thereof

Publications (1)

Publication Number Publication Date
CN102568472A true CN102568472A (en) 2012-07-11

Family

ID=46413729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010105891201A Pending CN102568472A (en) 2010-12-15 2010-12-15 Voice synthesis system with speaker selection and realization method thereof

Country Status (1)

Country Link
CN (1) CN102568472A (en)

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103730117A (en) * 2012-10-12 2014-04-16 中兴通讯股份有限公司 Self-adaptation intelligent voice device and method
WO2015085542A1 (en) * 2013-12-12 2015-06-18 Intel Corporation Voice personalization for machine reading
US10176796B2 (en) 2013-12-12 2019-01-08 Intel Corporation Voice personalization for machine reading
CN104123857A (en) * 2014-07-16 2014-10-29 北京网梯科技发展有限公司 Device and method for achieving individualized touch reading
CN104123857B (en) * 2014-07-16 2016-08-17 北京网梯科技发展有限公司 Device and method for achieving individualized touch reading
CN104123932A (en) * 2014-07-29 2014-10-29 科大讯飞股份有限公司 Voice conversion system and method
CN104464716B (en) * 2014-11-20 2018-01-12 北京云知声信息技术有限公司 Voice broadcasting system and method
CN104464716A (en) * 2014-11-20 2015-03-25 北京云知声信息技术有限公司 Voice broadcasting system and method
CN104485100A (en) * 2014-12-18 2015-04-01 天津讯飞信息科技有限公司 Speech synthesis speaker adaptation method and system
CN104485100B (en) * 2014-12-18 2018-06-15 天津讯飞信息科技有限公司 Speech synthesis speaker adaptation method and system
CN105208194A (en) * 2015-08-17 2015-12-30 努比亚技术有限公司 Voice broadcast device and method
CN105609096A (en) * 2015-12-30 2016-05-25 小米科技有限责任公司 Text data output method and device
CN105654941A (en) * 2016-01-20 2016-06-08 华南理工大学 Voice change method and device based on specific target person voice change ratio parameter
CN105702246A (en) * 2016-03-17 2016-06-22 广东小天才科技有限公司 Method and device for assisting user for dictation
CN107154263A (en) * 2017-05-25 2017-09-12 宇龙计算机通信科技(深圳)有限公司 Sound processing method, device and electronic equipment
CN107331388A (en) * 2017-06-15 2017-11-07 重庆柚瓣科技有限公司 Dialect collection system based on an elderly-care robot
CN107293284A (en) * 2017-07-27 2017-10-24 上海传英信息技术有限公司 Speech synthesis method and speech synthesis system based on an intelligent terminal
CN107634898A (en) * 2017-08-18 2018-01-26 上海云从企业发展有限公司 Real human voice information communication realized through a chat tool on electronic communication equipment
CN109935225A (en) * 2017-12-15 2019-06-25 富泰华工业(深圳)有限公司 Character information processor and method, computer storage medium and mobile terminal
CN108364638A (en) * 2018-01-12 2018-08-03 咪咕音乐有限公司 Voice data processing method, device, electronic equipment and storage medium
CN110415678A (en) * 2019-06-13 2019-11-05 百度时代网络技术(北京)有限公司 Customized voice broadcast client, server, system and method
CN110856023A (en) * 2019-11-15 2020-02-28 四川长虹电器股份有限公司 System and method for realizing customized broadcast of smart television based on TTS
CN111429878A (en) * 2020-03-11 2020-07-17 云知声智能科技股份有限公司 Self-adaptive speech synthesis method and device
CN111681638A (en) * 2020-04-20 2020-09-18 深圳奥尼电子股份有限公司 Vehicle-mounted intelligent voice control method and system
CN112802448A (en) * 2021-01-05 2021-05-14 杭州一知智能科技有限公司 Speech synthesis method and system for generating new tone
CN112802448B (en) * 2021-01-05 2022-10-11 杭州一知智能科技有限公司 Speech synthesis method and system for generating new tone
CN112885371A (en) * 2021-01-13 2021-06-01 北京爱数智慧科技有限公司 Method, apparatus, electronic device and readable storage medium for audio desensitization
CN113362805A (en) * 2021-06-18 2021-09-07 四川启睿克科技有限公司 Chinese and English speech synthesis method and device with controllable tone and accent
CN113362805B (en) * 2021-06-18 2022-06-21 四川启睿克科技有限公司 Chinese and English speech synthesis method and device with controllable tone and accent
CN113823293A (en) * 2021-09-28 2021-12-21 武汉理工大学 Speaker recognition method and system based on voice enhancement
CN113823293B (en) * 2021-09-28 2024-04-26 武汉理工大学 Speaker recognition method and system based on voice enhancement

Similar Documents

Publication Publication Date Title
CN102568472A (en) Voice synthesis system with speaker selection and realization method thereof
CN104464716B (en) Voice broadcasting system and method
CN101295504A (en) Entertainment audio only for text application
CN101095287B (en) Voice service over short message service
US9812120B2 (en) Speech synthesis apparatus, speech synthesis method, speech synthesis program, portable information terminal, and speech synthesis system
US20090198497A1 (en) Method and apparatus for speech synthesis of text message
CN105264872B (en) The control method of voice emoticon in portable terminal
JP2003521750A (en) Speech system
CN109951743A (en) Barrage information processing method, system and computer equipment
CN1692403A (en) Speech synthesis apparatus with personalized speech segments
CN102324231A (en) Game dialogue voice synthesizing method and system
CN101699879A (en) Method for transmitting voice message by mobile terminal
CN109346057A (en) Speech processing system for an intelligent children's toy
US20080161057A1 (en) Voice conversion in ring tones and other features for a communication device
CN101621594A (en) Method and device for playing background sound of voice message
CN108364638A (en) Voice data processing method, device, electronic equipment and storage medium
CN102056093A (en) Method for converting text message into voice message
CN1972478A (en) A novel method for mobile phone reading short message
KR100819740B1 (en) System and method for synthesizing music and voice, and service system and method thereof
US8768406B2 (en) Background sound removal for privacy and personalization use
CN102902506B (en) Voice input method and device in a mobile terminal
KR20080037402A (en) Method for making of conference record file in mobile terminal
CN101378424A (en) Voicemail system for a handheld device
CN201075286Y (en) Apparatus for speech voice identification
JP6170604B1 (en) Speech generator

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20120711