CN105280179A - Text-to-speech processing method and system - Google Patents
Text-to-speech processing method and system
- Publication number
- CN105280179A CN105280179A CN201510741753.2A CN201510741753A CN105280179A CN 105280179 A CN105280179 A CN 105280179A CN 201510741753 A CN201510741753 A CN 201510741753A CN 105280179 A CN105280179 A CN 105280179A
- Authority
- CN
- China
- Prior art keywords
- sound
- tone color
- feature
- user
- word message
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- User Interface Of Digital Computer (AREA)
Abstract
The invention discloses a text-to-speech processing method and system. The method comprises the steps of: acquiring text information input by a user; converting the text information into speech; acquiring the emotional characteristics of the text information and reading pre-stored feature values corresponding to those characteristics; and adjusting the speech with the feature values to obtain output speech. Acquiring the emotional characteristics of the text information comprises either identifying keywords in the text information and acquiring the emotional characteristics corresponding to the keywords, or acquiring emotional characteristics, corresponding to the text information, that the user inputs directly. By applying the feature values corresponding to the emotional characteristics, the method converts input text into speech information that carries the intended emotion, enriches the characteristics of the output speech, restores the emotional characteristics the user wants to express, and improves the user experience.
Description
Technical field
The present invention relates to the field of text processing, and in particular to a text-to-speech processing method and system.
Background technology
In daily life it often happens that a sender cannot conveniently speak and can only send text, while the recipient can only receive voice messages. In that case the user can convert the edited text into voice information with text-to-speech technology and send it. However, the voice produced by current text-to-speech processing methods is merely simple speech synthesis that pieces sounds together: it cannot convey the emotion in a speaker's voice, the resulting speech sounds very stiff, and the emotional characteristics the user wants to express are not shown. The present invention uses feature values corresponding to emotional characteristics to convert input text into sound information with those characteristics, enriching the characteristics of the output speech, restoring the emotion the user wants to express, and improving the user experience.
Summary of the invention
The present invention proposes a text-to-speech processing method and system. The method uses feature values corresponding to emotional characteristics to convert input text into sound information with those characteristics, restoring the emotion the user wants to express.
To achieve this object, the present invention adopts the following technical solutions.
In a first aspect, the present invention proposes a text-to-speech processing method, comprising:
acquiring text information input by a user;
converting the text information into speech;
acquiring the emotional characteristics of the text information, and reading pre-stored feature values corresponding to the emotional characteristics;
adjusting the speech with the feature values to obtain output speech.
Acquiring the emotional characteristics of the text information comprises:
identifying keywords in the text information and acquiring the emotional characteristics corresponding to the keywords; or
acquiring emotional characteristics, corresponding to the text information, that the user inputs directly.
After the text information is converted into speech and before the speech is adjusted with the feature values, the method further comprises performing timbre processing on the speech.
Performing timbre processing on the speech comprises: acquiring information about the user, reading pre-stored voice data of the user, extracting timbre characteristics from the voice data, and using the timbre characteristics to perform timbre processing on the speech.
Before the timbre characteristics are used to perform timbre processing on the speech, the method further comprises storing the voice data of the user.
Before the feature values corresponding to the emotional characteristics are read, the method further comprises storing those feature values.
The emotional characteristics comprise sadness, anger, affection, and happiness; the feature values comprise voice frequency, pitch, speech rate, softness, and stress.
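The steps of the first aspect can be sketched in a few lines of Python. This is only an illustrative interpretation of the claims, not the patent's implementation: the emotion labels, the feature-value fields (`pitch_scale`, `rate_scale`, `volume`), and all numbers are assumptions.

```python
# Pre-stored feature values per emotional characteristic (the "read
# the pre-stored feature values" step). Values are illustrative.
EMOTION_FEATURES = {
    "happy":  {"pitch_scale": 1.15, "rate_scale": 1.10, "volume": 1.05},
    "sad":    {"pitch_scale": 0.90, "rate_scale": 0.85, "volume": 0.90},
    "angry":  {"pitch_scale": 1.10, "rate_scale": 1.20, "volume": 1.20},
    "loving": {"pitch_scale": 1.05, "rate_scale": 0.95, "volume": 0.95},
}

def text_to_emotional_speech(text, emotion):
    """Convert text to a speech descriptor, then adjust it with the
    feature values for the given emotion (a stand-in for a real TTS
    engine plus emotion processing)."""
    # Plain conversion: a neutral speech descriptor for the text.
    sound = {"text": text, "pitch_scale": 1.0, "rate_scale": 1.0, "volume": 1.0}
    # Adjustment: overwrite neutral values with the emotion's feature values.
    sound.update(EMOTION_FEATURES.get(emotion, {}))
    return sound
```

An unknown emotion leaves the descriptor neutral, mirroring the patent's fallback of plain synthesis when no emotional characteristic is found.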
In a second aspect, the present invention proposes a text-to-speech processing system, comprising:
a first acquiring unit, for acquiring text information input by a user;
a converting unit, for converting the text information into speech;
a second acquiring unit, for acquiring the emotional characteristics of the text information and reading pre-stored feature values corresponding to the emotional characteristics;
an emotion processing unit, for adjusting the speech with the feature values to obtain output speech.
The second acquiring unit comprises:
a recognition acquiring unit, for identifying keywords in the text information and acquiring the emotional characteristics corresponding to the keywords; and
a direct acquiring unit, for acquiring emotional characteristics, corresponding to the text information, that the user inputs directly.
The system further comprises a timbre processing unit, for performing timbre processing on the speech after the converting unit converts the text information into speech.
The timbre processing unit comprises:
a third acquiring unit, for acquiring information about the user, reading pre-stored voice data of the user, and extracting timbre characteristics from the voice data; and
a processing unit, for using the timbre characteristics to perform timbre processing on the speech.
The system further comprises a storage unit, for storing the voice data of the user and the feature values corresponding to the emotional characteristics.
Beneficial effects of the present invention: by acquiring the emotional characteristics of the text information and applying the corresponding feature values, the present invention converts input text into sound information with the intended emotion, enriching the characteristics of the output speech. In addition, the present invention performs timbre processing on the converted voice information: timbre characteristics are extracted from the user's voice data and used to process the speech, so that the conversion can be personalized for each user. The emotion the user wants to express is restored to a much greater degree, and the user experience is better.
Accompanying drawing explanation
Fig. 1 is a flowchart of embodiment one of the text-to-speech processing method provided by the present invention.
Fig. 2 is a flowchart of embodiment two of the text-to-speech processing method provided by the present invention.
Fig. 3 is a functional block diagram of the text-to-speech processing system provided by the present invention.
Fig. 4 is a functional block diagram of another text-to-speech processing system provided by the present invention.
Embodiment
The technical solutions of the present invention are further illustrated below through specific embodiments, in conjunction with the accompanying drawings.
Embodiment one
Referring to Fig. 1, a text-to-speech processing method comprises:
S101, acquiring text information input by a user: automatically acquiring the text message the user has edited and wants to send.
S102, converting the text information into speech.
This step mainly uses existing text-to-speech (TTS) technology to convert the text into sound. On its own it only converts words into sound in a simple way, piecing plain speech together.
S103, acquiring the emotional characteristics of the text information, and reading the pre-stored feature values corresponding to those characteristics.
Keyword recognition is performed on the text, and the emotional characteristic of the message is identified from the keywords; emotional characteristics include sadness, anger, affection, happiness, and so on. The pre-stored feature values corresponding to the identified characteristic are then read from a database; these feature values describe the voice frequency, speech rate, pitch, softness, and stress of speech under that emotion.
For example, if keyword recognition on the text finds keywords related to happiness, the emotion the user wants to express is judged to be happiness, and the corresponding pre-stored feature values for happiness, such as voice frequency, speech rate, pitch, softness, and stress, are read from the database.
To improve the accuracy of the emotional characteristics, in addition to recognizing them from keywords in the text, the user can also input the intended emotion manually.
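Step S103 can be illustrated with a small keyword lookup that prefers the user's manual input over recognition, as the paragraph above describes. The keyword lists, emotion labels, and the whitespace tokenization are assumptions made for the sketch.

```python
# Hypothetical keyword-to-emotion table; real systems would use a
# richer lexicon or a classifier.
EMOTION_KEYWORDS = {
    "happy": {"great", "wonderful", "congratulations", "yay"},
    "sad":   {"sorry", "miss", "unfortunately"},
    "angry": {"furious", "unacceptable", "outrageous"},
}

def detect_emotion(text, manual_override=None):
    """Return the emotional characteristic of a message: prefer the
    user's manual input, otherwise scan for keywords; default to
    'neutral' when nothing matches."""
    if manual_override:
        return manual_override
    words = set(text.lower().split())
    for emotion, keywords in EMOTION_KEYWORDS.items():
        if words & keywords:  # any keyword present in the message
            return emotion
    return "neutral"
```

The returned label would then index the database of pre-stored feature values, as in S103.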
S104, adjusting the speech with the feature values to obtain output speech.
The feature values under each emotional characteristic, such as voice frequency, speech rate, pitch, softness, and stress, can be obtained from the database and used to apply emotion processing to the sound obtained from the plain conversion, so that the final output speech carries the corresponding emotion and the user's emotional characteristics are conveyed to the recipient.
For example, if the previous step determined that the emotion the user wants to express is happiness, the feature values extracted from the database describe the voice frequency, speech rate, pitch, softness, and stress of happy speech; applying them to the plainly converted sound makes the converted speech express a happy emotional state.
By acquiring the emotional characteristics of the text information and applying the corresponding feature values, this method converts input text into sound information with the intended emotion, enriches the characteristics of the output speech, restores the emotion the user wants to express, and improves the user experience.
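The adjustment in S104 can be shown with a deliberately crude example: applying a rate scale and a volume scale to a list of audio samples. Real systems would use proper time-stretching (e.g. PSOLA) rather than sample dropping; this sketch only shows where the pre-stored feature values plug in.

```python
def apply_features(samples, rate_scale, volume):
    """Adjust plainly synthesized audio with two feature values:
    speed it up/slow it down by resampling at a stride of
    `rate_scale`, and scale the amplitude by `volume`."""
    out = []
    pos = 0.0
    while pos < len(samples):
        out.append(samples[int(pos)] * volume)  # amplitude adjustment
        pos += rate_scale                        # >1.0 -> faster speech
    return out
```

With `rate_scale=2.0` the output has half as many samples (faster, as for an excited emotion); with `rate_scale=0.5` it has twice as many (slower, as for sadness).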
Embodiment two
Referring to Fig. 2, this embodiment provides another text-to-speech processing method, comprising:
S201, acquiring text information input by a user: automatically acquiring the text message the user has edited and wants to send.
S202, converting the text information into speech.
This step mainly uses existing text-to-speech (TTS) technology to convert the text into sound. On its own it only converts words into sound in a simple way, piecing plain speech together.
S203, acquiring the emotional characteristics of the text information, and reading the pre-stored feature values corresponding to those characteristics. This step is identical to S103 and is not repeated here.
S204, acquiring information about the user, reading the pre-stored voice data of the user, extracting timbre characteristics from the voice data, and using the timbre characteristics to perform timbre processing on the speech.
The voice data contains the user's timbre characteristics, i.e. the distinctive qualities of the user's voice: one person's voice is low and deep, another's is clear; one person speaks in a gentle tone, another in an irritable one. Different people have different timbres, and the timbre the same user exhibits also differs depending on whom they are talking to. The user's timbre characteristics therefore need to be collected and stored, preserving not only the user's own timbre but also the timbre the user shows in dialogue with different people. Using these timbre characteristics to process the plainly converted sound makes the voice conversion more targeted and personalized, tailoring the timbre processing to the style of each user.
For example, when a mother sends a message to her child, the system can learn that the text is from the mother to the child, take as reference the stored timbre characteristics the mother previously used when talking with the child, extract those characteristics, and apply them to the plainly converted sound to obtain speech with the mother's timbre.
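The per-(sender, recipient) storage that S204 describes can be sketched as a keyed lookup with fallbacks. The store keys, the profile fields (`base_pitch_hz`, `warmth`), and all values are illustrative assumptions, not the patent's data model.

```python
# Hypothetical store of timbre profiles, keyed by (sender, recipient)
# because the same user's timbre differs per conversation partner.
TIMBRE_STORE = {
    ("mother", "child"):    {"base_pitch_hz": 220.0, "warmth": 0.9},
    ("mother", "coworker"): {"base_pitch_hz": 200.0, "warmth": 0.5},
}

def lookup_timbre(sender, recipient):
    """Prefer the timbre this sender uses with this recipient; fall
    back to any stored profile of the sender; else a generic default."""
    if (sender, recipient) in TIMBRE_STORE:
        return TIMBRE_STORE[(sender, recipient)]
    for (s, _), profile in TIMBRE_STORE.items():
        if s == sender:  # sender known, but not this recipient
            return profile
    return {"base_pitch_hz": 180.0, "warmth": 0.5}
```

The mother-to-child example above corresponds to the first, exact-match branch.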
S205, adjusting the speech with the feature values to obtain output speech.
The previous step performed timbre processing on the converted sound, so that it carries the user's timbre characteristics. According to the feature values obtained in S203, which describe the voice frequency, speech rate, pitch, softness, and stress of speech under the corresponding emotion, emotion processing is then applied to the timbre-processed sound to obtain the final output speech. The output speech carries both the user's timbre and the emotion the user wants to express: the sent text is faithfully restored to the user's own voice and conveyed to the recipient with the corresponding emotional characteristics.
For example, when a mother sends a message to her child, keyword recognition on the text, or the mother's manual input, determines that the emotion to be expressed is happiness, and the feature values for happiness are extracted from the database. The system then learns that the text is from the mother to the child, takes as reference the stored timbre the mother previously used when talking with the child, extracts the corresponding timbre characteristics, and applies voice-conversion processing to the plainly converted sound to obtain speech with the mother's timbre. Finally, the feature values for happiness are applied to the speech with the mother's timbre, so that the input text is faithfully restored to the mother's happy voice and sent to the child as output speech.
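Embodiment two as a whole — plain TTS, then timbre processing, then emotion processing — can be sketched end to end under the same illustrative assumptions as the earlier snippets. Every name, field, and number here is a hypothetical stand-in.

```python
def embodiment_two(text, sender, recipient, emotion):
    """S202-S205 as one pipeline over a simple speech descriptor."""
    # S202: plain conversion to a neutral speech descriptor.
    sound = {"text": text, "pitch_scale": 1.0, "rate_scale": 1.0}
    # S203: read pre-stored feature values for the emotion.
    features = {"happy": {"pitch_scale": 1.15, "rate_scale": 1.1}}.get(emotion, {})
    # S204: timbre processing keyed on (sender, recipient), with a default.
    timbre = {("mother", "child"): {"base_pitch_hz": 220.0}}.get(
        (sender, recipient), {"base_pitch_hz": 180.0})
    sound.update(timbre)
    # S205: emotion processing applied on top of the timbre-processed sound.
    sound.update(features)
    return sound
```

The ordering matters: timbre fields are merged first, then emotion fields, matching the patent's sequence of S204 before S205.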
By acquiring the emotional characteristics of the text information and applying the corresponding feature values, the present invention converts input text into sound information with the intended emotion, enriching the characteristics of the output speech. In addition, the present invention performs timbre processing on the converted voice information: timbre characteristics are extracted from the user's voice data and used to process the speech, so that the conversion can be personalized for each user. The emotion the user wants to express is restored to a much greater degree, and the user experience is better.
Embodiment three
Referring to Fig. 3, this embodiment provides a text-to-speech processing system, comprising:
101, a first acquiring unit, for acquiring text information input by a user: the first acquiring unit automatically acquires the text message the user has edited and wants to send.
102, a converting unit, for converting the text information into speech, mainly using existing text-to-speech (TTS) technology. On its own this step only converts words into sound in a simple way, piecing plain speech together.
103, a second acquiring unit, for acquiring the emotional characteristics of the text information and reading the pre-stored feature values corresponding to those characteristics. The second acquiring unit 103 comprises:
1031, a recognition acquiring unit, for identifying keywords in the text information and acquiring the emotional characteristics corresponding to the keywords;
1032, a direct acquiring unit, for acquiring emotional characteristics, corresponding to the text information, that the user inputs directly.
The recognition acquiring unit performs keyword recognition on the text and identifies the emotional characteristic from the keywords. To improve accuracy, the direct acquiring unit can also directly acquire an emotional characteristic that the user inputs manually. Emotional characteristics include sadness, anger, affection, happiness, and so on. The pre-stored feature values corresponding to the identified characteristic, describing the voice frequency, speech rate, pitch, softness, and stress of speech under that emotion, are then read from a database.
For example, if the recognition acquiring unit finds keywords related to happiness in the text, the emotion the user wants to express is judged to be happiness, and the pre-stored feature values for happy speech, such as voice frequency, speech rate, pitch, softness, and stress, are read from the database.
104, an emotion processing unit, for adjusting the speech with the feature values to obtain output speech.
The feature values under each emotional characteristic, such as voice frequency, speech rate, pitch, softness, and stress, can be obtained from the database and used to apply emotion processing to the plainly converted sound, so that the final output speech carries the corresponding emotion and the user's emotional characteristics are conveyed to the recipient.
For example, when the second acquiring unit extracts from the database the feature values for happiness, such as voice frequency, speech rate, pitch, softness, and stress, applying them to the plainly converted sound makes the converted speech express a happy emotional state.
By acquiring the emotional characteristics of the text information and applying the corresponding feature values, this system converts input text into sound information with the intended emotion, enriches the characteristics of the output speech, restores the emotion the user wants to express, and improves the user experience.
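The data flow between the units of Fig. 3 can be mirrored with one small class per unit. Only the wiring follows the patent's description; the placeholder logic inside each class (the keyword check, the feature values) is an assumption for illustration.

```python
class Converter:
    """Converting unit 102: plain text-to-speech stand-in."""
    def convert(self, text):
        return {"text": text, "pitch_scale": 1.0, "rate_scale": 1.0}

class EmotionAcquirer:
    """Second acquiring unit 103: keyword recognition plus a
    pre-stored feature-value table (values are illustrative)."""
    FEATURES = {"happy": {"pitch_scale": 1.15, "rate_scale": 1.1}}
    def acquire(self, text):
        emotion = "happy" if "congratulations" in text.lower() else "neutral"
        return self.FEATURES.get(emotion, {})

class EmotionProcessor:
    """Emotion processing unit 104: adjust the sound with feature values."""
    def process(self, sound, features):
        return {**sound, **features}

def run_system(text):
    # First acquiring unit 101 (fetching the user's message) is
    # elided; `text` stands in for its output.
    sound = Converter().convert(text)
    features = EmotionAcquirer().acquire(text)
    return EmotionProcessor().process(sound, features)
```

A neutral message passes through unchanged; a message with a recognized keyword comes out with the emotion's feature values applied.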
Embodiment four
Referring to Fig. 4, this embodiment provides another text-to-speech processing system, comprising:
201, a first acquiring unit, for acquiring text information input by a user: the first acquiring unit automatically acquires the text message the user has edited and wants to send.
202, a converting unit, for converting the text information into speech, mainly using existing text-to-speech (TTS) technology. On its own this step only converts words into sound in a simple way, piecing plain speech together.
203, a second acquiring unit, for acquiring the emotional characteristics of the text information and reading the pre-stored feature values corresponding to those characteristics. The second acquiring unit 203 comprises:
2031, a recognition acquiring unit, for identifying keywords in the text information and acquiring the emotional characteristics corresponding to the keywords;
2032, a direct acquiring unit, for acquiring emotional characteristics, corresponding to the text information, that the user inputs directly.
The recognition acquiring unit performs keyword recognition on the text and identifies the emotional characteristic from the keywords; emotional characteristics include sadness, anger, affection, happiness, and so on. The pre-stored feature values corresponding to the identified characteristic, describing the voice frequency, speech rate, pitch, softness, and stress of speech under that emotion, are then read from a database.
For example, if the recognition acquiring unit finds keywords related to happiness in the text, the emotion the user wants to express is judged to be happiness, and the pre-stored feature values for happy speech, such as voice frequency, speech rate, pitch, softness, and stress, are read from the database.
To improve the accuracy of the emotional characteristics, the direct acquiring unit can also directly acquire an emotional characteristic that the user inputs manually.
204, a timbre processing unit, for performing timbre processing on the speech. The timbre processing unit 204 comprises:
2041, a third acquiring unit, for acquiring information about the user, reading the pre-stored voice data of the user, and extracting timbre characteristics from the voice data.
The voice data contains the user's timbre characteristics, i.e. the distinctive qualities of the user's voice: one person's voice is low and deep, another's is clear; one person speaks in a gentle tone, another in an irritable one. Different people have different timbres, and the timbre of the same user also differs depending on whom they are talking to. The user's timbre characteristics therefore need to be collected and stored, preserving not only the user's own timbre but also the timbre the user shows in dialogue with different people.
2042, a processing unit, for using the timbre characteristics to perform timbre processing on the speech.
Using the timbre characteristics to refine the plainly converted sound makes the voice conversion more targeted and personalized, tailoring the processing to the style of each user.
For example, when a mother sends a message to her child, the system can learn that the text is from the mother to the child, take as reference the stored timbre the mother previously used when talking with the child, extract the corresponding timbre characteristics, and apply voice-conversion processing to the plainly converted sound to obtain speech with the mother's timbre.
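The processing unit 2042 can be sketched as a merge of a stored timbre profile into the speech descriptor, leaving any emotion fields for the later emotion processing unit. The profile fields (`base_pitch_hz`, `warmth`) and defaults are assumptions for the sketch.

```python
def apply_timbre(sound, profile):
    """Processing-unit stand-in: return a copy of the speech
    descriptor with the speaker's timbre fields merged in. Fields
    not in the profile fall back to generic defaults."""
    out = dict(sound)  # do not mutate the caller's descriptor
    out["base_pitch_hz"] = profile.get("base_pitch_hz", 180.0)
    out["warmth"] = profile.get("warmth", 0.5)
    return out
```

Because the function copies its input, the same plainly converted sound could be re-rendered with different timbre profiles for different recipients.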
205, an emotion processing unit, for adjusting the speech with the feature values to obtain output speech.
After the timbre processing unit, the converted sound carries the user's timbre characteristics. According to the feature values obtained by the second acquiring unit, which describe the voice frequency, speech rate, pitch, softness, and stress of speech under the corresponding emotion, emotion processing is then applied to the timbre-processed sound to obtain the final output speech. The output speech carries both the user's timbre and the emotion the user wants to express: the sent text is faithfully restored to the user's own voice and conveyed to the recipient with the corresponding emotional characteristics.
For example, when a mother sends a message to her child, keyword recognition on the text, or the mother's manual input, determines that the emotion to be expressed is happiness, and the feature values for happiness are extracted from the database. The system then learns that the text is from the mother to the child, takes as reference the stored timbre the mother previously used when talking with the child, extracts the corresponding timbre characteristics, and applies voice-conversion processing to the plainly converted sound to obtain speech with the mother's timbre. Finally, the feature values for happiness are applied to the speech with the mother's timbre, so that the input text is faithfully restored to the mother's happy voice and sent to the child as output speech.
206, a storage unit, for storing the voice data of the user and the feature values corresponding to the emotional characteristics.
The present invention uses the feature values corresponding to emotional characteristics to convert input text into sound information with those characteristics, enriching the characteristics of the output speech. In addition, the present invention performs timbre processing on the converted voice information: timbre characteristics are extracted from the user's voice data and used for further processing, so that the processing can be personalized for each user. The emotion the user wants to express is restored to a much greater degree, and the user experience is better.
The technical principles of the embodiments of the present invention have been described above in conjunction with specific embodiments. These descriptions are intended only to explain the principles of the embodiments and shall not be construed in any way as limiting their scope of protection. Other embodiments that those skilled in the art can derive from the embodiments of the present invention without creative effort shall all fall within the scope of protection of the present invention.
Claims (10)
1. A text-to-speech processing method, characterized by comprising:
acquiring text information input by a user;
converting the text information into speech;
acquiring emotional characteristics of the text information, and reading pre-stored feature values corresponding to the emotional characteristics;
adjusting the speech with the feature values to obtain output speech;
wherein acquiring the emotional characteristics of the text information comprises:
identifying keywords in the text information and acquiring emotional characteristics corresponding to the keywords; or
acquiring emotional characteristics, corresponding to the text information, input by the user.
2. The processing method according to claim 1, characterized in that after the text information is converted into speech and before the speech is adjusted with the feature values, the method further comprises performing timbre processing on the speech.
3. The processing method according to claim 2, characterized in that performing timbre processing on the speech comprises: acquiring information about the user, reading pre-stored voice data of the user, extracting timbre characteristics from the voice data, and using the timbre characteristics to perform timbre processing on the speech.
4. The processing method according to claim 3, characterized in that before the timbre characteristics are used to perform timbre processing on the speech, the method further comprises storing the voice data of the user;
and before the feature values corresponding to the emotional characteristics are read, the method further comprises storing the feature values corresponding to the emotional characteristics.
5. The processing method according to claim 4, characterized in that the emotional characteristics comprise sadness, anger, affection, and happiness;
and the feature values comprise voice frequency, pitch, speech rate, softness, and stress.
6. A text-to-speech processing system, characterized by comprising:
a first acquiring unit, for acquiring text information input by a user;
a converting unit, for converting the text information into speech;
a second acquiring unit, for acquiring emotional characteristics of the text information and reading pre-stored feature values corresponding to the emotional characteristics; and
an emotion processing unit, for adjusting the speech with the feature values to obtain output speech.
7. The processing system according to claim 6, characterized in that the second acquiring unit comprises:
a recognition acquiring unit, for identifying keywords in the text information and acquiring emotional characteristics corresponding to the keywords; and
a direct acquiring unit, for acquiring emotional characteristics, corresponding to the text information, input by the user.
8. The processing system according to claim 6, characterized by further comprising a timbre processing unit, for performing timbre processing on the speech after the converting unit converts the text information into speech.
9. The processing system according to claim 8, characterized in that the timbre processing unit comprises:
a third acquiring unit, for acquiring information about the user, reading pre-stored voice data of the user, and extracting timbre characteristics from the voice data; and
a processing unit, for using the timbre characteristics to perform timbre processing on the speech.
10. The processing system according to claim 9, characterized by further comprising:
a storage unit, for storing the voice data of the user and the feature values corresponding to the emotional characteristics.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510741753.2A CN105280179A (en) | 2015-11-02 | 2015-11-02 | Text-to-speech processing method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510741753.2A CN105280179A (en) | 2015-11-02 | 2015-11-02 | Text-to-speech processing method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105280179A true CN105280179A (en) | 2016-01-27 |
Family
ID=55149072
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510741753.2A Pending CN105280179A (en) | 2015-11-02 | 2015-11-02 | Text-to-speech processing method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105280179A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH04199098A (en) * | 1990-11-29 | 1992-07-20 | Meidensha Corp | Rule-based speech synthesis device |
JP2007011308A (en) * | 2005-05-30 | 2007-01-18 | Kyocera Corp | Document display device and document reading method |
CN101064104A (en) * | 2006-04-24 | 2007-10-31 | 中国科学院自动化研究所 | Emotional speech generation method based on voice conversion |
CN102385858A (en) * | 2010-08-31 | 2012-03-21 | 国际商业机器公司 | Emotional voice synthesis method and system |
CN103543979A (en) * | 2012-07-17 | 2014-01-29 | 联想(北京)有限公司 | Voice outputting method, voice interaction method and electronic device |
US20140067397A1 (en) * | 2012-08-29 | 2014-03-06 | Nuance Communications, Inc. | Using emoticons for contextual text-to-speech expressivity |
CN103761963A (en) * | 2014-02-18 | 2014-04-30 | 大陆汽车投资(上海)有限公司 | Method for processing text containing emotion information |
CN104900226A (en) * | 2014-03-03 | 2015-09-09 | 联想(北京)有限公司 | Information processing method and device |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106571136A (en) * | 2016-10-28 | 2017-04-19 | 努比亚技术有限公司 | Voice output device and method |
CN106791913A (en) * | 2016-12-30 | 2017-05-31 | 深圳市九洲电器有限公司 | Digital television program simultaneous interpretation output method and system |
CN109417504A (en) * | 2017-04-07 | 2019-03-01 | 微软技术许可有限责任公司 | Voice forwarding in automatic chatting |
US11233756B2 (en) | 2017-04-07 | 2022-01-25 | Microsoft Technology Licensing, Llc | Voice forwarding in automated chatting |
CN107705783A (en) * | 2017-11-27 | 2018-02-16 | 北京搜狗科技发展有限公司 | Speech synthesis method and device |
CN108364658A (en) * | 2018-03-21 | 2018-08-03 | 冯键能 | Network chat method and server |
WO2019218773A1 (en) * | 2018-05-15 | 2019-11-21 | 中兴通讯股份有限公司 | Voice synthesis method and device, storage medium, and electronic device |
CN110867177A (en) * | 2018-08-16 | 2020-03-06 | 林其禹 | Voice playing system with selectable timbre, playing method thereof and readable recording medium |
CN109634552A (en) * | 2018-12-17 | 2019-04-16 | 广东小天才科技有限公司 | Input control method applied to dictation, and terminal device |
CN109712604A (en) * | 2018-12-26 | 2019-05-03 | 广州灵聚信息科技有限公司 | Emotional speech synthesis control method and device |
CN109754779A (en) * | 2019-01-14 | 2019-05-14 | 出门问问信息科技有限公司 | Controllable emotional speech synthesis method and device, electronic device, and readable storage medium |
CN109658917A (en) * | 2019-01-17 | 2019-04-19 | 深圳壹账通智能科技有限公司 | E-book read-aloud method and apparatus, computer device, and storage medium |
CN111883098A (en) * | 2020-07-15 | 2020-11-03 | 青岛海尔科技有限公司 | Voice processing method and device, computer readable storage medium and electronic device |
CN111883098B (en) * | 2020-07-15 | 2023-10-24 | 青岛海尔科技有限公司 | Speech processing method and device, computer readable storage medium and electronic device |
CN111966257A (en) * | 2020-08-25 | 2020-11-20 | 维沃移动通信有限公司 | Information processing method and device and electronic equipment |
CN114420086A (en) * | 2022-03-30 | 2022-04-29 | 北京沃丰时代数据科技有限公司 | Speech synthesis method and device |
CN114420086B (en) * | 2022-03-30 | 2022-06-17 | 北京沃丰时代数据科技有限公司 | Speech synthesis method and device |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN105280179A (en) | Text-to-speech processing method and system | |
KR101513888B1 (en) | Apparatus and method for generating multimedia email | |
CN111477216B (en) | Training method and system for voice and meaning understanding model of conversation robot | |
CN101694772B (en) | Method for converting text into rap music and device thereof | |
CN108364632B (en) | Emotional Chinese text voice synthesis method | |
CN110751943A (en) | Voice emotion recognition method and device and related equipment | |
CN102903361A (en) | Instant call translation system and instant call translation method | |
KR20090085376A (en) | Service method and apparatus for using speech synthesis of text message | |
KR101628050B1 (en) | Animation system for reproducing text-based data as animation | |
CN104202455A (en) | Intelligent voice dialing method and intelligent voice dialing device | |
CN108536655A (en) | Display-based read-aloud audio production method and system using a handheld intelligent terminal | |
CN103761963A (en) | Method for processing text containing emotion information | |
CN104142936A (en) | Audio and video match method and audio and video match device | |
KR20150017662A (en) | Method, apparatus and storing medium for text to speech conversion | |
US20020169610A1 (en) | Method and system for automatically converting text messages into voice messages | |
CN109346057A (en) | Speech processing system for an intelligent children's toy | |
CN109492126B (en) | Intelligent interaction method and device | |
CN110767233A (en) | Voice conversion system and method | |
CN112349266A (en) | Voice editing method and related equipment | |
EP3113175A1 (en) | Method for converting text to individual speech, and apparatus for converting text to individual speech | |
US20170221481A1 (en) | Data structure, interactive voice response device, and electronic device | |
CN106708789A (en) | Text processing method and device | |
KR20080037402A (en) | Method for creating a conference record file in a mobile terminal | |
CN103516582A (en) | Method and system for conducting information prompt in instant messaging | |
CN113724690B (en) | PPG feature output method, target audio output method and device |
Legal Events
Date | Code | Title | Description
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20160127 |