CN103383844B - Phoneme synthesizing method and system - Google Patents

Publication number
CN103383844B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210138028.2A
Other languages
Chinese (zh)
Other versions
CN103383844A (en)
Inventor
王玉平
翟鲁峰
戴林
高羽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI GEAK ELECTRONICS Co.,Ltd.
Original Assignee
SHANGHAI GUOKE ELECTRONIC CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI GUOKE ELECTRONIC CO Ltd filed Critical SHANGHAI GUOKE ELECTRONIC CO Ltd
Priority to CN201210138028.2A priority Critical patent/CN103383844B/en
Publication of CN103383844A publication Critical patent/CN103383844A/en
Application granted granted Critical
Publication of CN103383844B publication Critical patent/CN103383844B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The present invention relates to a speech synthesis method and system. The method comprises: presetting speech synthesis tasks; an external speech synthesis application calling and submitting a speech synthesis task; scheduling the submitted speech synthesis tasks and generating an ordered list of pending speech synthesis tasks; selecting the task at the front of the pending list and performing the current speech synthesis and broadcast for it; and repeating these steps until no speech synthesis task is submitted and the pending list is empty. The invention hands the speech synthesis needs of different external speech synthesis applications to a single independent speech synthesis application, thereby unifying speech synthesis. An external speech synthesis application that needs speech synthesis can invoke the speech synthesis function of that application through a remote call, which reduces resource redundancy and waste and, to some extent, simplifies the development of external speech synthesis applications.

Description

Speech synthesis method and system
Technical field
The present invention relates to a speech synthesis method and system.
Background
With the rapid development of computer technology, speech synthesis systems have made great progress. The intelligibility and naturalness of synthesized speech now meet everyday needs, and many products related to speech synthesis have appeared, such as news reading, novel reading, weather reports, text message broadcast, and e-book read-aloud. However, every application that involves speech synthesis necessarily contains its own module dedicated to speech synthesis, which leads to serious redundancy and waste of resources: ten applications that use speech synthesis require ten duplicate speech synthesis modules.
Most current speech synthesis related applications (also called external speech synthesis applications) fall into one of two application scenarios. In the first scenario, each application has its own speech synthesis module that other speech synthesis related applications cannot use, so several different applications require several copies of the module, which seriously wastes storage and development resources. In the second scenario, each application calls, through a network interface, a unified interface published by a third party; this requires network access whenever the speech synthesis function is used, and downloading the synthesized speech generates considerable data traffic.
Summary of the invention
The purpose of the present invention is to provide a speech synthesis method and system that hand the speech synthesis needs of different external speech synthesis applications to a single independent speech synthesis application, that is, that unify speech synthesis. An external speech synthesis application that needs speech synthesis can invoke the speech synthesis function of the speech synthesis application through a remote call. This greatly reduces resource redundancy and waste and, to some extent, simplifies the development of external speech synthesis applications.
To solve the above problems, the present invention provides a speech synthesis method, comprising:
Step 1: presetting speech synthesis tasks;
Step 2: an external speech synthesis application calling and submitting a speech synthesis task;
Step 3: scheduling the submitted speech synthesis tasks and generating an ordered list of pending speech synthesis tasks; and
Step 4: selecting the task at the front of the pending speech synthesis task list and performing the current speech synthesis and broadcast for it.
Further, in the above method, after step 4 the method further comprises repeating steps 1 to 4 until no speech synthesis task is submitted and the pending speech synthesis task list is empty.
Further, in the above method, while the task at the front of the pending speech synthesis task list is selected and synthesized, the method further comprises monitoring phone call status: when a call is detected, the current speech synthesis and broadcast are stopped; when the end of the call is detected, the current speech synthesis and broadcast are restarted from the beginning.
Further, in the above method, while the task at the front of the pending speech synthesis task list is selected and synthesized, the method further comprises monitoring phone call status: when a call is detected, the current speech synthesis and broadcast are paused; when the end of the call is detected, the current speech synthesis and broadcast resume from the point of the pause.
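The difference between the two call-handling variants above (stop and restart versus pause and resume) can be illustrated with a minimal Python sketch. The class `Broadcast`, its fields, and the idea of tracking progress as a character offset are assumptions made for illustration only; the patent does not specify how playback position is represented.

```python
from dataclasses import dataclass


@dataclass
class Broadcast:
    """Hypothetical playback state for the task currently being broadcast."""
    text: str
    position: int = 0          # index of the next character to speak (assumed model)
    interrupted: bool = False  # True while a phone call is in progress

    def speak(self, n_chars: int) -> str:
        """Speak up to n_chars characters; produce nothing during a call."""
        if self.interrupted:
            return ""
        chunk = self.text[self.position:self.position + n_chars]
        self.position += len(chunk)
        return chunk

    def on_call_started(self, mode: str) -> None:
        """mode='stop' discards progress; mode='pause' keeps it."""
        if mode not in ("stop", "pause"):
            raise ValueError(f"unknown mode: {mode}")
        self.interrupted = True
        if mode == "stop":
            self.position = 0

    def on_call_ended(self) -> None:
        """Allow speaking again, from wherever `position` now points."""
        self.interrupted = False


# Stop variant: the broadcast starts over after the call.
stopped = Broadcast("hello world")
stopped.speak(6)
stopped.on_call_started("stop")
silent_during_call = stopped.speak(5)
stopped.on_call_ended()
restart_output = stopped.speak(5)

# Pause variant: the broadcast continues where it left off.
paused = Broadcast("hello world")
paused.speak(6)
paused.on_call_started("pause")
paused.on_call_ended()
resume_output = paused.speak(5)
```

In this sketch the only difference between the variants is whether the saved position is reset when the call begins, which is one simple way to read the distinction the text draws.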
Further, in the above method, after the step of stopping or pausing the current speech synthesis and broadcast, the method further comprises sending the task processing status (for example, broadcast started or broadcast ended) to the external speech synthesis application so that it can update its own logic state.
Further, in the above method, after the step of restarting the current speech synthesis and broadcast or resuming them from the point of the pause, the method further comprises sending the task processing status to the external speech synthesis application so that it can update its own logic state.
Further, in the above method, while the task at the front of the pending speech synthesis task list is selected and synthesized, the task processing status is sent to the external speech synthesis application so that it can update its own logic state.
Further, in the above method, the speech synthesis tasks include a permission activation task for the external speech synthesis application, a submit broadcast task, a stop broadcast task, a pause broadcast task, and a task that deletes all of the application's broadcast tasks.
Further, in the above method, the current speech synthesis and broadcast are performed using an HMM-based parametric speech synthesis method.
Further, in the above method, the current speech synthesis is performed using a formant-based speech synthesis method or a waveform concatenation speech synthesis method based on a large corpus.
Further, in the above method, the pending speech synthesis task list is ordered using a first-in-first-out queuing mechanism, in which tasks that arrive later are processed later.
Further, in the above method, the pending speech synthesis task list is ordered using a priority queuing mechanism, in which more important tasks come first and less important tasks come later.
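The two queuing mechanisms named above can be sketched side by side. This is an illustrative Python sketch, not the patented implementation: the class names, the `submit`/`pop_next` methods, and the convention that a lower number means a more important task are all assumptions.

```python
import heapq
from collections import deque
from itertools import count


class FifoTaskList:
    """First-in-first-out ordering: tasks leave in submission order."""

    def __init__(self):
        self._tasks = deque()

    def submit(self, task, priority=0):
        # The priority argument is accepted but ignored in FIFO mode.
        self._tasks.append(task)

    def pop_next(self):
        return self._tasks.popleft()

    def __len__(self):
        return len(self._tasks)


class PriorityTaskList:
    """Importance ordering: lower priority number is served first (assumed)."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within one priority

    def submit(self, task, priority=0):
        heapq.heappush(self._heap, (priority, next(self._seq), task))

    def pop_next(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def drain(task_list, submissions):
    """Submit every (task, priority) pair, then pop tasks until empty."""
    for task, priority in submissions:
        task_list.submit(task, priority)
    return [task_list.pop_next() for _ in range(len(task_list))]


submissions = [("weather", 2), ("sms", 0), ("novel", 2), ("news", 1)]
fifo_order = drain(FifoTaskList(), submissions)
priority_order = drain(PriorityTaskList(), submissions)
```

With these hypothetical inputs the FIFO list preserves submission order, while the priority list moves the text message ahead of everything else and keeps the two equally important tasks in their arrival order.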
Further, in the above method, the current speech synthesis and broadcast include intonation adjustment, speaking rate adjustment, timbre adjustment using voice changing, reverberation added using an echo method, or sound improvement using an equalizer.
According to another aspect of the present invention, a mobile terminal is provided that performs speech synthesis using the above speech synthesis method.
Further, in the above mobile terminal, the speech synthesis application runs on an operating system including symbian, android, ios, linux, or mtk.
According to another aspect of the present invention, a speech synthesis system is provided, comprising:
an interface module, used to preset speech synthesis tasks and to allow an external speech synthesis application to call and submit the speech synthesis tasks;
a task processing module, used to schedule the submitted speech synthesis tasks and generate an ordered list of pending speech synthesis tasks; and
a synthesis engine, used to select the task at the front of the pending speech synthesis task list, perform the current speech synthesis and broadcast for it, and report the broadcast status to the external speech synthesis application.
Further, in the above system, the speech synthesis tasks include a permission activation task for the external speech synthesis application, a submit broadcast task, a stop broadcast task, a pause broadcast task, and a task that deletes all of the application's broadcast tasks.
Further, the above system also comprises a monitoring module used to monitor phone call status: when a call is detected, it notifies the synthesis engine to stop the current speech synthesis and broadcast; when the end of the call is detected, it notifies the synthesis engine to restart the current speech synthesis and broadcast.
Further, in the above system, the monitoring module is also used to monitor phone call status such that, when a call is detected, it notifies the synthesis engine to pause the current speech synthesis and broadcast, and when the end of the call is detected, it notifies the synthesis engine to resume the current speech synthesis and broadcast from the point of the pause.
Further, in the above system, the monitoring module is also used to send the task processing status to the external speech synthesis application so that it can update its own logic state.
Further, in the above system, the synthesis engine is also used to stop the current speech synthesis and broadcast, restart the current speech synthesis and broadcast, pause the current speech synthesis and broadcast, or resume the current speech synthesis and broadcast from the point of the pause.
According to another aspect of the present invention, a mobile terminal is provided that includes the above speech synthesis system for performing speech synthesis.
Compared with the prior art, the present invention presets speech synthesis tasks; lets external speech synthesis applications call and submit those tasks; schedules the submitted tasks and generates an ordered list of pending speech synthesis tasks; selects the task at the front of the pending list and performs the current speech synthesis and broadcast for it; and repeats these steps until no speech synthesis task is submitted and the pending list is empty. In this way the speech synthesis needs of different external speech synthesis applications are handed to a single independent speech synthesis application, that is, speech synthesis is unified. An external speech synthesis application that needs speech synthesis can invoke the speech synthesis function of the speech synthesis application through a remote call. This greatly reduces resource redundancy and waste and, to some extent, simplifies the development of external speech synthesis applications.
In addition, when the speech synthesis system of the present invention runs on a mobile terminal, the user does not need network access and no data traffic is generated, which helps advance speech synthesis technology.
In addition, because speech synthesis tasks are preset for external speech synthesis applications to call and submit, the speech synthesis application can be integrated into a unified application for different mobile terminal devices. Unlike traditional speech synthesis applications, a mobile terminal device only needs to have the unified speech synthesis application of the present invention installed. An external speech synthesis application that needs speech synthesis can obtain it by calling the unified speech synthesis tasks, which effectively reduces storage and development resources.
Brief description of the drawings
Fig. 1 is a flow chart of the speech synthesis method of Embodiment one of the present invention;
Fig. 2 is a flow chart of the speech synthesis method of Embodiment two of the present invention;
Fig. 3 is a schematic diagram of external speech synthesis in Embodiment two of the present invention;
Fig. 4 is a functional block diagram of the speech synthesis system of Embodiment four of the present invention;
Fig. 5 is a processing flow diagram of the speech synthesis system of Embodiment four of the present invention;
Fig. 6 is a functional block diagram of the speech synthesis system of Embodiment five of the present invention;
Fig. 7 is a processing flow diagram of the speech synthesis system of Embodiment five of the present invention.
Detailed description of the embodiments
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Embodiment one
As shown in Fig. 1, the present invention provides a speech synthesis method, comprising:
Step S11: preset speech synthesis tasks. The speech synthesis tasks include a permission activation task for the external speech synthesis application, a submit broadcast task, a stop broadcast task, a pause broadcast task, and a delete all broadcast tasks task.
Step S12: an external speech synthesis application calls and submits a speech synthesis task. Specifically, because speech synthesis tasks are preset for external speech synthesis applications to call and submit, the speech synthesis application can be integrated into a unified application for different terminal devices. Unlike traditional speech synthesis applications, a terminal device only needs to have the unified speech synthesis application of the present invention installed. An external speech synthesis application that needs speech synthesis can obtain it by calling the preset speech synthesis tasks, which effectively reduces storage and development resources.
Step S13: schedule the submitted speech synthesis tasks and generate an ordered list of pending speech synthesis tasks, so that speech synthesis on the terminal device proceeds in an orderly manner. The pending list may be ordered using a first-in-first-out queuing mechanism, or using a priority queuing mechanism in which more important tasks come first and less important tasks come later.
Step S14: select the task at the front of the pending speech synthesis task list and perform the current speech synthesis and broadcast for it. An HMM-based parametric speech synthesis method may be used. Its storage and computation requirements are modest, and the intelligibility and naturalness of the synthesized speech are high, so it is well suited to speech synthesis applications on various mobile terminals. Because HMM-based parametric speech synthesis is already widely used, it is not described in detail here. Alternatively, a formant-based speech synthesis method or a waveform concatenation speech synthesis method based on a large corpus may be used. The formant-based algorithm is weaker in intelligibility and naturalness and is suitable only where the requirements on speech synthesis are low; the waveform concatenation algorithm based on a large corpus has relatively high computation and storage requirements and can be used on terminal devices with higher processing capability. To enrich the speech synthesis effect, the current speech synthesis and broadcast may include intonation adjustment, speaking rate adjustment, timbre adjustment using voice changing, reverberation added using an echo method, or sound improvement using an equalizer. While this step is executed, the task processing status, such as broadcast started or broadcast ended, may also be sent to the external speech synthesis application so that it can update its own logic state.
Step S15: determine whether no speech synthesis task has been submitted and the pending speech synthesis task list is empty. If so, execute step S16; if not, repeat steps S11 to S15.
Step S16: end and exit.
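The control flow of steps S12 to S16 can be sketched as a small loop. This is a minimal Python sketch under stated assumptions: submissions are modeled as an iterable of batches (one batch per pass through the loop), FIFO ordering is chosen for step S13, and `run_synthesis_loop` and `synthesize_and_broadcast` are hypothetical names. The presetting of task types in step S11 is omitted.

```python
from collections import deque


def run_synthesis_loop(incoming_batches, synthesize_and_broadcast):
    """Repeat S12-S15 until nothing is submitted and nothing is pending.

    incoming_batches: iterable of lists; each list holds the tasks that
        external applications submit during one pass (assumed model).
    synthesize_and_broadcast: callable standing in for the synthesis engine.
    """
    pending = deque()
    batches = iter(incoming_batches)
    results = []
    while True:
        new_tasks = next(batches, [])      # S12: tasks submitted this pass
        pending.extend(new_tasks)          # S13: FIFO ordering of the list
        if pending:
            task = pending.popleft()       # S14: take the front task
            results.append(synthesize_and_broadcast(task))
        if not new_tasks and not pending:  # S15: exit condition
            break                          # S16: end
    return results


spoken = run_synthesis_loop(
    [["sms", "news"], ["weather"]],
    lambda task: f"spoken:{task}",
)
```

Note that the loop exits only when both conditions of step S15 hold in the same pass, so tasks already in the list are still processed after submissions stop.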
This embodiment hands the speech synthesis needs of different external speech synthesis applications to a single independent speech synthesis application, that is, it unifies speech synthesis. An external speech synthesis application that needs speech synthesis can invoke the speech synthesis function of the speech synthesis application through a remote call. This greatly reduces resource redundancy and waste and, to some extent, simplifies the development of external speech synthesis applications.
Embodiment two
This embodiment differs from Embodiment one in that, while the task at the front of the pending speech synthesis task list is selected and the current speech synthesis and broadcast are performed, a step of monitoring phone call status and handling it accordingly is added. This makes the speech synthesis method of the present invention suitable for mobile terminal devices that need to handle calls, such as mobile phones, and ensures that the telephone function works normally during speech synthesis.
As shown in Fig. 2, the present invention provides another speech synthesis method, comprising:
Step S21: preset speech synthesis tasks. Specifically, the speech synthesis tasks include a permission activation task for the external speech synthesis application, a submit broadcast task, a stop broadcast task, a pause broadcast task, and a delete all broadcast tasks task.
Step S22: an external speech synthesis application calls and submits a speech synthesis task.
Step S23: schedule the submitted speech synthesis tasks and generate an ordered list of pending speech synthesis tasks, so that speech synthesis on the terminal device proceeds in an orderly manner. Optionally, the pending list is ordered using a first-in-first-out queuing mechanism, or using a priority queuing mechanism in which more important tasks come first and less important tasks come later.
Step S24: select the task at the front of the pending speech synthesis task list, perform the current speech synthesis and broadcast for it, and monitor phone call status. When a call is detected, stop the current speech synthesis and broadcast; when the end of the call is detected, restart the current speech synthesis and broadcast. Alternatively, when a call is detected, pause the current speech synthesis and broadcast; when the end of the call is detected, resume the current speech synthesis and broadcast from the point of the pause.
Step S25: send the task processing status to the external speech synthesis application so that it can update its own logic state. Specifically, the external speech synthesis application implements its own logic according to the task processing statuses that arise during speech synthesis, such as broadcast started, broadcast stopped, broadcast paused, broadcast restarted, and broadcast resumed from the point of the pause; for example, the external speech synthesis application updates its own broadcast state.
Step S26: determine whether no speech synthesis task has been submitted or the pending speech synthesis task list is empty. If so, execute step S27; if not, repeat steps S22 to S26. If the preset speech synthesis tasks do not need to be reset, execution can start directly from step S22 and continue to step S26, skipping step S21.
Step S27: end and exit.
As shown in Fig. 3, the present invention yields a unified speech synthesis platform that can serve the speech synthesis needs of various external speech synthesis applications, such as listening to text messages, answering calls, listening to novels, listening to news, and listening to weather reports.
This embodiment hands the speech synthesis needs of different external speech synthesis applications to a single independent speech synthesis application, that is, it unifies speech synthesis. An external speech synthesis application that needs speech synthesis can invoke the speech synthesis function of the speech synthesis application through a remote call. This greatly reduces resource redundancy and waste and, to some extent, simplifies the development of external speech synthesis applications. In addition, this embodiment ensures that the telephone function works normally during speech synthesis.
Embodiment three
The present invention also provides a mobile terminal that performs speech synthesis using the speech synthesis method described in Embodiment one or Embodiment two. Optionally, the speech synthesis application runs on an operating system including symbian, android, ios, linux, or mtk.
In this embodiment, the speech synthesis application can be integrated into a standalone application product for different mobile terminal devices. Mobile terminal systems include but are not limited to symbian, android, ios, linux, and mtk. Once the unified speech synthesis application is installed on these terminal devices, the external speech synthesis applications on them no longer need their own independent speech synthesis components; they obtain speech synthesis by directly calling the unified speech synthesis application of the present invention. The result is a unified speech synthesis platform for terminal devices.
Embodiment four
As shown in Fig. 4, the present invention also provides a speech synthesis system 1. The most important parts of the unified speech synthesis system are the interface module 11, the task processing module 12, and the synthesis engine 13.
The interface module 11 is used to preset speech synthesis tasks and to allow the external speech synthesis application 14 to call and submit the speech synthesis tasks. Optionally, the speech synthesis tasks include a permission activation task for the external speech synthesis application, a submit broadcast task, a stop broadcast task, a pause broadcast task, and a delete all broadcast tasks task. Specifically, the interface module 11 is mainly responsible for the control input of external speech synthesis applications and is called by them, for example, by calling the submit broadcast task to start a broadcast or the stop broadcast task to end a broadcast. With the interface module 11, the speech synthesis system 1 can be integrated into a unified application system for different terminal devices. Unlike traditional speech synthesis systems, a terminal device only needs to have the unified speech synthesis system of the present invention installed. An external speech synthesis application that needs speech synthesis can obtain it by calling the unified interface module 11, which effectively reduces storage and development resources.
The tasks are as follows. The permission activation task (activate) activates the speech synthesis related application so that it obtains permission to use the synthesis engine. The submit broadcast task (speak) submits a speech synthesis task. The stop broadcast task (stop) stops the broadcast task of the external speech synthesis application that is currently being broadcast; the external speech synthesis application cannot stop other broadcast tasks. The delete all broadcast tasks task (stopAll) deletes all tasks submitted by the current external speech synthesis application. The external speech synthesis application 14 invokes speech synthesis tasks by calling the above interface module 11.
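The four interface tasks named above (activate, speak, stop, stopAll) can be sketched as methods of a small class. This is a hedged Python illustration, not the patent's interface: the class `SynthesisInterface`, the use of string application ids, the helper `start_next`, and the choice that `stop_all` also stops the caller's current broadcast are assumptions. The pause broadcast task is omitted for brevity.

```python
class SynthesisInterface:
    """Hypothetical interface module exposing activate/speak/stop/stopAll."""

    def __init__(self):
        self._activated = set()  # ids of applications allowed to use the engine
        self._pending = []       # (app_id, text) pairs in submission order
        self.current = None      # (app_id, text) being broadcast, if any

    def activate(self, app_id):
        """Permission activation task: grant the caller access to the engine."""
        self._activated.add(app_id)

    def speak(self, app_id, text):
        """Submit broadcast task: queue text for synthesis and broadcast."""
        if app_id not in self._activated:
            raise PermissionError(f"{app_id} has not called activate()")
        self._pending.append((app_id, text))

    def start_next(self):
        """Engine-side helper (assumed): make the front task the current one."""
        self.current = self._pending.pop(0) if self._pending else None
        return self.current

    def stop(self, app_id):
        """Stop broadcast task: only affects the caller's own current task."""
        if self.current is not None and self.current[0] == app_id:
            self.current = None
            return True
        return False

    def stop_all(self, app_id):
        """stopAll: delete every task the caller submitted.

        Assumption: the caller's current broadcast is stopped as well.
        Returns the number of pending tasks removed.
        """
        before = len(self._pending)
        self._pending = [t for t in self._pending if t[0] != app_id]
        self.stop(app_id)
        return before - len(self._pending)

    def pending(self):
        return list(self._pending)


api = SynthesisInterface()
api.activate("sms_app")
api.activate("news_app")
api.speak("sms_app", "You have one new message")
api.speak("news_app", "Headline one")
api.speak("sms_app", "Second message")
api.start_next()                         # sms_app's first task is now current
news_could_stop = api.stop("news_app")   # not the owner of the current task
removed = api.stop_all("sms_app")        # drops sms_app's remaining task
```

Keying every call on the caller's id is one straightforward way to enforce the rule that an application can only stop or delete its own tasks.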
The task processing module 12 is used to schedule the submitted speech synthesis tasks and generate an ordered list of pending speech synthesis tasks. The task processing module 12 is mainly the mechanism for handling the various speech synthesis tasks, and it ensures that speech synthesis on the terminal device proceeds in order. Because the unified speech synthesis system must handle speech synthesis tasks sent by different external speech synthesis applications, the task processing module 12 must guarantee the order of the synthesis tasks. The task processing module 12 may use a queuing mechanism in which tasks that arrive first are placed at the front and tasks that arrive later are placed at the back, so that the synthesis engine 13 processes the pending speech synthesis tasks in order.
The synthesis engine 13 is used to select the task at the front of the pending speech synthesis task list, perform the current speech synthesis and broadcast for it, and report the broadcast status to the external speech synthesis application. The synthesis engine 13 is mainly responsible for synthesizing speech from the input text and broadcasting it. When a synthesis task starts to broadcast and when it finishes, the synthesis engine 13 notifies the external speech synthesis application so that the caller of the task can respond accordingly. The synthesis engine 13 is also used to send the task processing status, such as broadcast started or broadcast ended, to the external speech synthesis application so that it can update its own logic state.
As shown in Fig. 5, suppose that three pending speech synthesis tasks have been uploaded to the unified speech synthesis platform and that the platform orders them by upload time. In Fig. 5, the pending task of external speech synthesis application two ranks first. The synthesis engine 13 takes the first pending speech synthesis task from the pending task list of the task processing module 12 as its current task. When processing starts, the synthesis engine 13 sends a broadcast started message to external speech synthesis application two, which can use this message to implement its own logic, such as updating its broadcast state. The engine then broadcasts the content of the task. When the broadcast ends, a processing ended message is likewise sent to the external speech synthesis application that owns the current task. The synthesis engine 13 then fetches the second pending speech synthesis task from the pending task list and processes it in the same way.
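The Fig. 5 flow described above, in which the engine notifies the owning application when each broadcast starts and ends, can be sketched as follows. This Python sketch assumes a callback-style notification; the class `SynthesisEngine`, the status strings, and the application ids are hypothetical, and actual synthesis and audio output are replaced by appending the text to a list.

```python
from collections import deque


class SynthesisEngine:
    """Hypothetical engine that processes tasks in order and reports status."""

    def __init__(self, notify):
        self._notify = notify  # callable(app_id, status, text), assumed shape
        self._queue = deque()  # pending tasks ordered by upload time

    def submit(self, app_id, text):
        self._queue.append((app_id, text))

    def run(self):
        """Process every pending task, notifying its owner before and after."""
        spoken = []
        while self._queue:
            app_id, text = self._queue.popleft()
            self._notify(app_id, "broadcast_started", text)
            spoken.append(text)  # stand-in for synthesis and audio playback
            self._notify(app_id, "broadcast_finished", text)
        return spoken


events = []
engine = SynthesisEngine(lambda app, status, text: events.append((app, status)))
engine.submit("app_two", "task A")    # uploaded first, so it ranks first
engine.submit("app_one", "task B")
engine.submit("app_three", "task C")
spoken = engine.run()
```

Each application receives only the messages for its own task, which is what lets it update its own broadcast state without knowing about the other callers.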
This embodiment hands the speech synthesis needs of different external speech synthesis applications to a single independent speech synthesis system, that is, it unifies speech synthesis. An external speech synthesis application that needs speech synthesis can invoke the speech synthesis function of the speech synthesis system through a remote call. This greatly reduces resource redundancy and waste and, to some extent, simplifies the development of external speech synthesis applications.
Embodiment five
As shown in fig. 6, the present invention also provides another speech synthesis system 2, the difference of the present embodiment and example IV exists Module 24 is monitored in increasing, so that speech synthesis system of the invention is suitable for the mobile terminal that the needs such as mobile phone receive calls Equipment, telephony feature reaches normal use during guaranteeing speech synthesis.
Institute's predicate is called and submitted to interface module 21 using 25 for presetting speech synthesis task, and for external speech synthesis Sound synthesizes task, and optionally, the speech synthesis task includes the permission activation task of external speech synthesis application, submits casting Task stops casting task, pause casting task and deletes all casting tasks.
Task processing module 22 is for deploying various speech synthesis tasks and generating the language to be processed by sequence Sound synthesizes task list.Task processing module 22 is mainly to cope with the treatment mechanism of various synthesis tasks, is guaranteed in terminal device On speech synthesis it is orderly because normalized speech synthesis system needs to handle what different external connection speech synthesis application was sent Synthesis task, so this module will guarantee that the task row arrived first can be used in the order of synthesis task, task processing module 22 In front, arriving afterwards for task comes subsequent queuing mechanism, handles speech synthesis to be processed in sequence for Compositing Engine 23 and appoints Business, and specially treated is also needed for mobile phone, phone period needs to suspend other all speech synthesis tasks to guarantee phone It can normal use.
Compositing Engine 23 comes most preceding voice to be processed for selecting from the speech synthesis task list to be processed Synthesis task carries out current speech synthesis and casting and broadcasts situation to the external speech synthesis application report.In order to it is described It monitors module 24 to cooperate, the Compositing Engine 23 is also used to stop current speech synthesis and casting, restarts current speech conjunction At with casting, pause current speech synthesis and casting or since pause place start current speech synthesis and broadcast when.The conjunction It is also used to send task processing status at engine 23 to apply to the external speech synthesis so that it modifies the logic state of oneself.
The monitoring module 24 is used to monitor the phone status. When it detects a phone call, it notifies the synthesis engine to stop the current speech synthesis and broadcast; when it detects that the call has ended, it notifies the synthesis engine to restart the current speech synthesis and broadcast. Alternatively, the monitoring module 24 may notify the synthesis engine to pause the current speech synthesis and broadcast when it detects a call and, when the call has ended, to resume the current speech synthesis and broadcast from the point at which it was paused. On a mobile terminal such as a mobile phone, the normalized speech synthesis system therefore needs a phone monitoring module 24 to detect whether a call is in progress. If there is a call, the current synthesis task is interrupted and a signal indicating the interruption by the call is sent to the external speech synthesis application 25 that owns the current task. After the call has ended, the next pending speech synthesis task is taken from the pending speech synthesis task list and processed.
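The difference between the stop-and-restart behaviour and the pause-and-resume behaviour can be made concrete with a small sketch. It assumes, purely for illustration, that a broadcast advances one word at a time; `PausableEngine` and `CallMonitor` are hypothetical names, and the call events would come from the platform's telephony service in a real system.

```python
class PausableEngine:
    """Broadcasts a text word by word and can be stopped or paused."""

    def __init__(self, text):
        self.words = text.split()
        self.position = 0      # index of the next word to broadcast
        self.blocked = False
        self.spoken = []

    def step(self):
        """Broadcast one word; return False if blocked or finished."""
        if self.blocked or self.position >= len(self.words):
            return False
        self.spoken.append(self.words[self.position])
        self.position += 1
        return True

    def stop(self):
        self.blocked = True
        self.position = 0      # a restart begins from the start of the task

    def pause(self):
        self.blocked = True    # position is kept for a later resume

    def restart(self):
        self.blocked = False

    def resume(self):
        self.blocked = False


class CallMonitor:
    """Forwards call events to the engine in one of the two described modes."""

    def __init__(self, engine, resume_from_pause):
        self.engine = engine
        self.resume_from_pause = resume_from_pause

    def on_call_started(self):
        if self.resume_from_pause:
            self.engine.pause()
        else:
            self.engine.stop()

    def on_call_ended(self):
        if self.resume_from_pause:
            self.engine.resume()
        else:
            self.engine.restart()
```

With `resume_from_pause=False` the interrupted task is broadcast again from its beginning after the call; with `resume_from_pause=True` it continues from the word at which it was paused.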
As shown in Fig. 7, a phone monitor is added as the monitoring module 24. When the phone monitor detects a phone call, it can interrupt speech synthesis and broadcast through the synthesis engine 23. That is, when an incoming or outgoing call is detected, all speech synthesis tasks of the synthesis engine 23 are blocked and the broadcast is stopped. After the call has ended, the tasks are started again and the synthesis engine 23 processes the remaining pending speech synthesis tasks in order. The normalized speech synthesis platform also needs an easy-to-use way to stop the broadcast, so that users can stop it conveniently.
To enrich the effects of the speech synthesis platform, some extensions can be built on top of the basic speech synthesis engine 23, such as adjusting intonation and speaking rate, adjusting timbre with a voice-changing method, adding a reverberation effect with an echo method, and improving sound quality with an equalizer. The extensions include, but are not limited to, those listed above.
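As one example of such an extension, an echo can be added to synthesized audio by mixing a delayed, attenuated copy of the signal back into it. The sketch below is a deliberately simple single-tap echo, y[n] = x[n] + decay * x[n - delay], and is not the effect chain of the patented system; the function name and parameters are hypothetical.

```python
def add_echo(samples, delay, decay):
    """Mix a copy of the signal, delayed by `delay` samples and scaled by
    `decay`, into the original signal. Returns a new list of samples."""
    if delay < 0:
        raise ValueError("delay must be non-negative")
    out = list(samples)
    for n in range(delay, len(samples)):
        out[n] += decay * samples[n - delay]
    return out
```

A fuller reverberation effect would typically combine several such delays with feedback, but the single tap is enough to show where an effect would sit between synthesis and broadcast.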
This embodiment hands the various speech synthesis demands of different external speech synthesis applications to one independent speech synthesis system, that is, it normalizes speech synthesis. An external speech synthesis application that needs speech synthesis can invoke the speech synthesis function of the speech synthesis system through a remote call. This greatly reduces resource redundancy and waste and, to some extent, simplifies the development of the external speech synthesis applications. It also ensures that the telephone function can be used normally while speech synthesis is in progress.
Embodiment six
The present invention also provides a mobile terminal that includes the speech synthesis system described in Embodiment four or Embodiment five for performing speech synthesis.
In this embodiment, the speech synthesis system can be integrated into a single application product. For different mobile terminal devices, the same speech synthesis engine is packaged into a speech synthesis application that runs on the respective system; the mobile terminal systems include, but are not limited to, Symbian, Android, iOS, Linux, and MTK. Once the unified speech synthesis system is installed on these terminal devices, the external speech synthesis applications on them no longer need their own speech synthesis processing modules; they can implement speech synthesis by directly calling the interface module of the unified speech synthesis system. The result is a unified speech synthesis platform for terminal devices that effectively reduces storage and development resources. Because the speech synthesis engine runs locally, the user does not need to go online and no data traffic is generated.
In the present invention, speech synthesis tasks are preset; external speech synthesis applications call and submit the speech synthesis tasks; the various speech synthesis tasks are then scheduled into an ordered list of pending speech synthesis tasks; the first pending speech synthesis task is selected from the list for the current speech synthesis and broadcast; and these steps are repeated until no speech synthesis task is submitted and the pending speech synthesis task list is empty. In this way, the various speech synthesis demands of different external speech synthesis applications are handed to one independent speech synthesis application, which normalizes speech synthesis. An external speech synthesis application that needs speech synthesis can invoke the speech synthesis function of the speech synthesis application through a remote call, which greatly reduces resource redundancy and waste and, to some extent, simplifies the development of the external speech synthesis applications.
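The overall loop summarized above can be sketched in a few lines. This is an illustrative model only: submissions are represented as a sequence of batches (one batch per round, possibly empty), `synthesize` stands in for the real synthesis and broadcast step, and the function name `run_synthesis_loop` is hypothetical.

```python
from collections import deque


def run_synthesis_loop(incoming_batches, synthesize):
    """Accept submitted tasks, keep them in arrival order, and broadcast the
    front task each round until nothing is submitted and nothing is pending."""
    pending = deque()
    broadcast_log = []
    batches = iter(incoming_batches)
    while True:
        batch = next(batches, None)      # None means no further submissions
        if batch:
            pending.extend(batch)        # later tasks queue behind earlier ones
        if batch is None and not pending:
            break                        # termination condition of the method
        if pending:
            broadcast_log.append(synthesize(pending.popleft()))
    return broadcast_log
```

For example, `run_synthesis_loop([["a", "b"], [], ["c"]], str.upper)` processes the tasks strictly in submission order and stops once the last task has been broadcast.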
In addition, when the speech synthesis system of the invention runs on a mobile terminal, the user does not need to go online and no data traffic is generated, which helps advance speech synthesis technology.
In addition, by presetting speech synthesis tasks so that external speech synthesis applications can call and submit them, the speech synthesis system can be integrated into a unified application for different mobile terminal devices. Unlike traditional speech synthesis applications, a mobile terminal device only needs to install the unified speech synthesis application of the invention; an external speech synthesis application that needs speech synthesis can obtain it by calling the unified speech synthesis task interface, which effectively reduces storage and development resources.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for related details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present invention.
Obviously, those skilled in the art can make various modifications and variations to the invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their equivalents, the invention is intended to include them as well.

Claims (18)

1. A speech synthesis method, characterized by comprising:
Step 1: presetting speech synthesis tasks;
Step 2: an external speech synthesis application calling and submitting the speech synthesis tasks;
Step 3: scheduling the various speech synthesis tasks, sorting the various speech synthesis tasks using a first-in-first-out, last-in-last-out queuing mechanism, and generating an ordered list of pending speech synthesis tasks; and
Step 4: selecting the first pending speech synthesis task from the pending speech synthesis task list, and performing the current speech synthesis and broadcast using an HMM-based parametric speech synthesis method, a formant-based speech synthesis method, or a waveform-concatenation speech synthesis method based on a large corpus;
Step 5: repeating Steps 1 to 4 until no speech synthesis task is submitted and the pending speech synthesis task list is empty.
2. The speech synthesis method of claim 1, characterized in that, while the first pending speech synthesis task is selected from the pending speech synthesis task list and synthesized, the method further comprises monitoring the phone status: when a phone call is detected, stopping the current speech synthesis and broadcast; and when the end of the call is detected, restarting the current speech synthesis and broadcast.
3. The speech synthesis method of claim 1, characterized in that, while the first pending speech synthesis task is selected from the pending speech synthesis task list and synthesized, the method further comprises monitoring the phone status: when a phone call is detected, pausing the current speech synthesis and broadcast; and when the end of the call is detected, resuming the current speech synthesis and broadcast from the point at which it was paused.
4. The speech synthesis method of claim 2 or 3, characterized in that, after the step of stopping the current speech synthesis and broadcast or pausing the current speech synthesis and broadcast, the method further comprises sending the task processing status to the external speech synthesis application so that it can update its own logic state.
5. The speech synthesis method of claim 2 or 3, characterized in that, after the step of restarting the current speech synthesis and broadcast or resuming the current speech synthesis and broadcast from the point at which it was paused, the method further comprises sending the task processing status to the external speech synthesis application so that it can update its own logic state.
6. The speech synthesis method of claim 1, characterized in that, while the first pending speech synthesis task is selected from the pending speech synthesis task list and synthesized, the task processing status is sent to the external speech synthesis application so that it can update its own logic state.
7. The speech synthesis method of claim 1, characterized in that the speech synthesis tasks include a permission-activation task for the external speech synthesis application, a submit-broadcast task, a stop-broadcast task, a pause-broadcast task, and a task of deleting all of its broadcast tasks.
8. The speech synthesis method of claim 1, characterized in that the pending speech synthesis task list is sorted using a queuing mechanism in which more important tasks come first and less important tasks come later.
9. The speech synthesis method of claim 1, characterized in that the current speech synthesis and broadcast includes intonation adjustment, speaking-rate adjustment, timbre adjustment by a voice-changing method, adding a reverberation effect by an echo method, or improving sound quality by an equalizer method.
10. A mobile terminal, characterized in that it performs speech synthesis using the speech synthesis method of any one of claims 1 to 9.
11. The mobile terminal of claim 10, characterized in that the speech synthesis application runs on an operating system including Symbian, Android, iOS, Linux, or MTK.
12. A speech synthesis system, characterized by comprising:
an interface module for presetting speech synthesis tasks and for an external speech synthesis application to call and submit the speech synthesis tasks;
a task processing module for scheduling the various speech synthesis tasks, sorting the various speech synthesis tasks using a first-in-first-out, last-in-last-out queuing mechanism, and generating an ordered list of pending speech synthesis tasks; and
a synthesis engine for selecting the first pending speech synthesis task from the pending speech synthesis task list, performing the current speech synthesis and broadcast, and reporting the broadcast status to the external speech synthesis application.
13. The speech synthesis system of claim 12, characterized in that the speech synthesis tasks include a permission-activation task for the external speech synthesis application, a submit-broadcast task, a stop-broadcast task, a pause-broadcast task, and a task of deleting all of its broadcast tasks.
14. The speech synthesis system of claim 12, characterized by further comprising a monitoring module for monitoring the phone status, notifying the synthesis engine to stop the current speech synthesis and broadcast when a phone call is detected, and notifying the synthesis engine to restart the current speech synthesis and broadcast when the end of the call is detected.
15. The speech synthesis system of claim 14, characterized in that the monitoring module is also used to monitor the phone status, notify the synthesis engine to pause the current speech synthesis and broadcast when a phone call is detected, and notify the synthesis engine to resume the current speech synthesis and broadcast from the point at which it was paused when the end of the call is detected.
16. The speech synthesis system of claim 14 or 15, characterized in that the synthesis engine is also used to send the task processing status to the external speech synthesis application so that it can update its own logic state.
17. The speech synthesis system of claim 14 or 15, characterized in that the synthesis engine is also used to stop the current speech synthesis and broadcast, restart the current speech synthesis and broadcast, pause the current speech synthesis and broadcast, or resume the current speech synthesis and broadcast from the point at which it was paused.
18. A mobile terminal, characterized by comprising the speech synthesis system of any one of claims 12 to 17 for performing speech synthesis.
CN201210138028.2A 2012-05-04 2012-05-04 Phoneme synthesizing method and system Active CN103383844B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210138028.2A CN103383844B (en) 2012-05-04 2012-05-04 Phoneme synthesizing method and system

Publications (2)

Publication Number Publication Date
CN103383844A CN103383844A (en) 2013-11-06
CN103383844B true CN103383844B (en) 2019-01-01

Family

ID=49491618

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210138028.2A Active CN103383844B (en) 2012-05-04 2012-05-04 Phoneme synthesizing method and system

Country Status (1)

Country Link
CN (1) CN103383844B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104575487A (en) * 2014-12-11 2015-04-29 百度在线网络技术(北京)有限公司 Voice signal processing method and device
CN107342084A (en) * 2017-07-10 2017-11-10 绵阳美菱软件技术有限公司 A kind of intelligent refrigerator and communication means and system based on intelligent refrigerator

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6661889B1 (en) * 2000-01-18 2003-12-09 Avaya Technology Corp. Methods and apparatus for multi-variable work assignment in a call center
CN1438626A (en) * 2002-02-15 2003-08-27 佳能株式会社 Information processing apparatus with speech-sound synthesizing function and method thereof
CN1455386A (en) * 2002-11-01 2003-11-12 中国科学院声学研究所 Imbedded voice synthesis method and system
CN1719513A (en) * 2005-08-08 2006-01-11 北京中星微电子有限公司 Audio frequency sequence device and sound document treatment method
CN101046956A (en) * 2006-03-28 2007-10-03 国际商业机器公司 Interactive audio effect generating method and system
CN101192203A (en) * 2006-11-30 2008-06-04 中兴通讯股份有限公司 Mobile phones audio frequency playing method
CN102360543A (en) * 2007-08-20 2012-02-22 微软公司 HMM-based bilingual (mandarin-english) TTS techniques
CN101266554A (en) * 2008-04-22 2008-09-17 中兴通讯股份有限公司 Embedded terminal multimedia application processing method and embedded terminal
CN101299332A (en) * 2008-06-13 2008-11-05 嘉兴闻泰通讯科技有限公司 Method for implementing speech synthesis function by GSM mobile phone
CN101355766A (en) * 2008-09-11 2009-01-28 青岛海信移动通信技术股份有限公司 Mobile terminal and control method for playing multimedia thereof
CN101778158A (en) * 2009-12-29 2010-07-14 闻泰集团有限公司 Method for processing audio conflict of mobile phone
CN102117221A (en) * 2009-12-31 2011-07-06 上海博泰悦臻电子设备制造有限公司 Audio frequency application conflict management method and manager
CN102262879A (en) * 2010-05-24 2011-11-30 乐金电子(中国)研究开发中心有限公司 Voice command competition processing method and device as well as voice remote controller and digital television
CN102402457A (en) * 2010-09-17 2012-04-04 希姆通信息技术(上海)有限公司 Method for processing mobile phone application program alterative events

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
《基于HMM的可训练中文语音合成》;吴义坚等;《中文信息学报》;20060731;第20卷(第4期);第75-81页
《基于共振峰过渡的协同发音语音合成算法》;康广玉等;《天津大学学报》;20100930;第43卷(第9期);第810-814页
《基于波形拼接的语音合成技术研究》;苏珊珊;《福建电脑》;20081031(第10期);第104-105页
《基于语音生成和发音模型的语音合成新方法的探讨》;俞振利;《声学学报》;20000930;第25卷(第5期);第455-462页
《改进的HMM系统在英语语音合成中的研究》;张雪英等;《太原理工大学学报》;20120131;第43卷(第1期);第16-19页
《语音合成技术及其研究进展》;阿日木扎等;《内蒙古科技与经济》;20100930(第18期);第31-33页

Also Published As

Publication number Publication date
CN103383844A (en) 2013-11-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
ASS Succession or assignment of patent right

Owner name: SHANGHAI GUOKE ELECTRONIC CO., LTD.

Free format text: FORMER OWNER: SHENGYUE INFORMATION TECHNOLOGY (SHANGHAI) CO., LTD.

Effective date: 20140919

C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20140919

Address after: 201203, room 1, building 380, 108 Yin Yin Road, Shanghai, Pudong New Area

Applicant after: Shanghai Guoke Electronic Co., Ltd.

Address before: 201203 Shanghai Guo Shou Jing Road, Zhangjiang High Tech Park of Pudong New Area No. 356 building 3 Room 102

Applicant before: Shengyue Information Technology (Shanghai) Co., Ltd.

C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP03 Change of name, title or address
CP03 Change of name, title or address

Address after: Room 127, building 3, 356 GuoShouJing Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai, 200120

Patentee after: SHANGHAI GEAK ELECTRONICS Co.,Ltd.

Address before: Room 108, building 1, 380 Yinbei Road, Pudong New Area, Shanghai 201203

Patentee before: Shanghai Guoke Electronic Co., Ltd.