CN106056207B - Natural-language-based robot deep interaction and inference method and device - Google Patents

Natural-language-based robot deep interaction and inference method and device

Info

Publication number
CN106056207B
CN106056207B (application CN201610302605.5A)
Authority
CN
China
Prior art keywords
case
user
attribute
text
robot
Prior art date
Legal status
Active
Application number
CN201610302605.5A
Other languages
Chinese (zh)
Other versions
CN106056207A (en)
Inventor
闵华松
李潇
齐诗萌
林云汉
周昊天
Current Assignee
Wuhan University of Science and Engineering WUSE
Original Assignee
Wuhan University of Science and Engineering WUSE
Priority date
Filing date
Publication date
Application filed by Wuhan University of Science and Engineering WUSE
Priority to CN201610302605.5A
Publication of CN106056207A
Application granted
Publication of CN106056207B
Legal status: Active (current)
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/004 - Artificial life, i.e. computing arrangements simulating life
    • G06N 3/008 - Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 - Computing arrangements using knowledge-based models
    • G06N 5/04 - Inference or reasoning models
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/10 - Terrestrial scenes
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 13/00 - Speech synthesis; Text to speech systems
    • G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/08 - Speech classification or search
    • G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L 15/142 - Hidden Markov Models [HMMs]
    • G10L 15/144 - Training of HMMs
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/08 - Speech classification or search
    • G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L 15/142 - Hidden Markov Models [HMMs]
    • G10L 15/148 - Duration modelling in HMMs, e.g. semi HMM, segmental models or transition probabilities
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/08 - Speech classification or search
    • G10L 15/18 - Speech classification or search using natural language modelling
    • G10L 15/1822 - Parsing for meaning understanding
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/26 - Speech to text systems
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L 21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L 21/0208 - Noise filtering
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L 25/54 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval

Abstract

Disclosed are a natural-language-based robot deep interaction and inference method and device. The method comprises the following steps: 1) speech recognition: receive the user's speech input, process the input signal, and obtain text information; 2) case attribute acquisition: perform word segmentation on the text obtained in step 1), then match the segmented text against the cases in a case library by similarity to extract the case attributes; 3) deep dialogue and three-dimensional scene interaction: if the user intention obtained from the extracted case attributes is incomplete, repeatedly guide the user with the real-time map file obtained by a Kinect sensor until a complete intention is obtained, then generate a solution for the job task expressed by the complete intention; 4) speech synthesis: present the obtained solution in text form, synthesize speech, and feed it back to the user through an audio device. Throughout the interaction process of the present invention, both the user and the robot use natural language.

Description

Natural-language-based robot deep interaction and inference method and device
Technical field
The present invention relates to artificial intelligence technology, and more particularly to a natural-language-based robot deep interaction and inference method and device.
Background art
In recent years, with the rapid development of intelligent robots, it is desirable to let robots complete various job tasks in complex environments by way of dialogue. Communicating with machines in natural language has long been pursued: people could operate robots in the language they are most accustomed to, without spending time and effort learning various complex computer languages.
To this end, an intelligent robot system must understand natural language, understand what the user expects, and possess an inference mechanism for reasoning about, solving and learning from problems in real time. Among current research results, representative inference mechanisms include rule-based reasoning (RBR), procedural reasoning (Procedural Reasoning System, PRS) and case-based reasoning (CBR). Rule-based reasoning is not widely used because inference rules are difficult to obtain in certain fields; procedural reasoning shortens inference time, but suffers from shortcomings such as the limitations of the plan library and the inability to learn and store newly generated plans; case-based reasoning obtains the solution of the current case by retrieving source cases from a case library, and thus has a certain learning ability as well as high practicability.
However, case-based reasoning lacks analysis capability: it cannot analyze an unclear user purpose or provide guiding feedback, and it lacks autonomy. Against this background, the method introduces the BDI (belief-desire-intention) model. BDI is a behavioral cognition framework whose essence is to determine the goals of an agent and how the agent achieves them. Combining case-based reasoning with the BDI model both increases the autonomy of the inference system and remedies the BDI model's lack of learning ability. Meanwhile, deep dialogue and three-dimensional scene reasoning are also introduced, combining reasoning with the actual scene and improving the intelligence of the robot.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the defects in the prior art, to provide a natural-language-based robot deep interaction and inference method and device, which realize deep interaction and reasoning between the user and the robot through natural language and improve the intelligence and autonomy of the robot.
The technical solution adopted by the present invention to solve the technical problem is a natural-language-based robot deep interaction and inference method, comprising the following steps:
1) Speech recognition: receive the user's speech input, process the input signal, and obtain text information;
2) Case attribute acquisition: perform word segmentation on the text obtained in step 1), then match the segmented text against the cases in a case library by similarity to extract the attributes of the current case;
the case library is used to store cases designed in advance according to the actual scene; each case has the following essential attribute values: the attribute set of the case and the solution of the case;
3) Deep dialogue and three-dimensional scene interaction: if the user intention obtained from the case attributes extracted in step 2) is incomplete, repeatedly guide the user with the real-time map information obtained by a Kinect sensor until a complete intention is obtained, then generate a solution for the job task expressed by the complete intention;
4) Speech synthesis: the inference engine presents the obtained solution in text form, synthesizes speech with TTS technology, and feeds it back to the user through the audio device.
According to the above scheme, the speech recognition process of step 1) specifically comprises the following steps:
1.1) Preprocessing: collect the user's speech information through a microphone array, process the raw input speech signal, filter out irrelevant information and background noise, and perform endpoint detection, framing and pre-emphasis of the speech signal;
1.2) Feature extraction: extract the key characteristic parameters reflecting the features of the speech signal to form a feature vector sequence;
1.3) Acoustic modeling: model the acoustics with hidden Markov models (HMMs); during recognition, match the speech to be recognized against the acoustic model to obtain the recognition result;
1.4) Language modeling: perform grammatical and semantic analysis on a training text database, and train an N-gram language model based on a statistical model, narrowing the search space and thereby improving the recognition rate (a minimal sketch of such a model follows this list);
1.5) Decoding: for the input speech signal, build a recognition network from the trained HMM acoustic model, the language model and the dictionary, and find the best path through this network with a search algorithm; this path outputs the word string of the speech signal with maximum probability, thereby determining the words contained in the speech sample.
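As referenced in step 1.4), the following is a minimal sketch of how such an N-gram (here bigram) language model can be estimated from a segmented training corpus; it is illustrative only and not part of the original disclosure (real recognizers add smoothing, e.g. Kneser-Ney):

```python
from collections import Counter

def train_bigram_lm(sentences):
    """Maximum-likelihood bigram model P(w2 | w1) from tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        padded = ["<s>"] + words + ["</s>"]       # sentence boundary markers
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return {(w1, w2): c / unigrams[w1] for (w1, w2), c in bigrams.items()}

corpus = [["grab", "one", "apple"], ["grab", "one", "red", "apple"]]
lm = train_bigram_lm(corpus)
print(lm[("one", "apple")])   # 0.5: "one" is followed by "apple" in 1 of 2 cases
```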
According to the above scheme, in step 2) the attributes of the current case are extracted by matching the segmented text against the cases in the case library with text similarity based on the vector space model.
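A minimal sketch of the vector-space-model text similarity described above, using bag-of-words term frequencies and cosine similarity (identifier names are illustrative; the patent does not disclose its exact weighting):

```python
import math
from collections import Counter

def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two term-frequency vectors."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va)
    def norm(v):
        return math.sqrt(sum(c * c for c in v.values()))
    na, nb = norm(va), norm(vb)
    return dot / (na * nb) if na and nb else 0.0

query = ["grab", "one", "apple"]                       # segmented user input
cases = {"sorting": ["grab", "one", "red", "apple"],
         "query": ["where", "is", "the", "apple"]}
best = max(cases, key=lambda name: cosine_similarity(query, cases[name]))
print(best)   # "sorting": its attributes are then extracted as the case attributes
```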
According to the above scheme, the case library in step 2) is built with the following steps:
Conversation topics are designed according to demand, and a subject tree is designed according to the conversation topics. The subject tree is divided into theme nodes, required-attribute nodes and leaf nodes; leaf nodes are subordinate to required-attribute nodes, and required-attribute nodes are subordinate to theme nodes. Each node carries a binary validity flag; leaf nodes stand in an OR relationship to one another, while required-attribute nodes stand in an AND relationship;
Dialogue generating functions are written for the nodes of the subject tree, and the set of these functions constitutes the guiding library. Under different system states, calling these functions produces different response outputs; each dialogue generating function is responsible only for the response of its corresponding node, so the functions do not interfere with one another in design or modification.
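A sketch of one possible representation of such a subject tree and its guiding library, under the stated node relations (leaves OR-related, required attributes AND-related); all names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Subject-tree node with the binary validity flag described above."""
    name: str
    kind: str                               # "theme" | "attribute" | "leaf"
    children: list = field(default_factory=list)
    valid: bool = False

def attribute_satisfied(attr: Node) -> bool:
    return any(leaf.valid for leaf in attr.children)            # OR over leaf nodes

def theme_complete(theme: Node) -> bool:
    return all(attribute_satisfied(a) for a in theme.children)  # AND over attributes

# Guiding library: one dialogue-generating function per node, kept independent
# so each function can be designed and modified without affecting the others.
guiding_library = {
    "destination": lambda state: "Which basket should it be put into?",
    "name":        lambda state: "Which object do you want me to grab?",
}
```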
According to the above scheme, the case attribute acquisition process in step 2) specifically comprises the following steps:
2.1) Perform word segmentation on the text obtained in step 1), i.e. split the text into individual words;
2.2) Match the segmented text against the cases in the case library; since each case contains the attribute set of a task, the task attributes of the most similar case are extracted when it is retrieved.
According to the above scheme, the deep dialogue and three-dimensional scene interaction process of step 3) specifically comprises the following steps:
3.1) When the inference engine receives a speech input, the robot judges the input speech against the map information obtained by the Kinect sensor. If the input is unrelated to the current map information, the robot guides the user; if the user input is related to the current map information, the robot matches the user input against the cases in the case library, and if a similar case exists, matches the user input information against the map information obtained by the Kinect sensor, judges whether the user's expectation can be satisfied, and feeds the result back to the user;
3.2) After case retrieval and map matching, the inference engine has obtained the corresponding task attributes and matching degree, and analyzes this information to obtain the user's expectation. If the computed expectation is complete, no further guiding is needed and the process moves to step 3.4); if the expectation is incomplete, further user guiding is needed and the process moves to step 3.3);
3.3) A guiding case library is built with an XML file; this guiding library contains the guiding schemes addressed to the user for each missing attribute when the user's expectation is incomplete. Each attribute of the user's expectation is compared one by one with the attributes of the guiding cases, scoring 1 where they are identical and 0 where they differ; the scores are summed, the case with the maximum total is the best case, and its guiding scheme is used to guide the user, until a complete user expectation is obtained (a scoring sketch follows this list);
3.4) The solution corresponding to this complete expectation is called from the case library and reused after matching with the real-time three-dimensional environment information, generating a sequence of executable actions (the Intention), thereby accomplishing the specified job task.
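The attribute scoring of step 3.3) can be sketched as follows, as a literal rendering of the 1/0 comparison-and-sum rule (dictionary keys and case names are illustrative):

```python
def select_guiding_case(user_attrs: dict, guiding_cases: dict) -> str:
    """Return the name of the best guiding case: score +1 for each attribute
    identical to the user's, 0 otherwise, and take the maximum total."""
    def score(case_attrs: dict) -> int:
        return sum(1 if user_attrs.get(k) == v else 0 for k, v in case_attrs.items())
    return max(guiding_cases, key=lambda name: score(guiding_cases[name]))

user = {"name": "apple", "quantity": "one", "destination": None}
guides = {"ask_destination": {"name": "apple", "quantity": "one"},
          "ask_quantity": {"name": "apple"}}
print(select_guiding_case(user, guides))   # "ask_destination" (score 2 vs 1)
```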
A natural-language-based robot deep interaction and inference device comprises:
a point cloud acquisition module, which fuses the map depth information and color information collected by the Kinect to generate three-dimensional point cloud data (PCD) and, after preprocessing, keypoint extraction and descriptor extraction, performs feature matching against an object feature database to obtain a three-dimensional scene semantic map description file;
a speech recognition module, which denoises the speech signal input by the user and collected by the microphone array, extracts features with the MFCC algorithm, and then converts the speech signal into text with a speech decoding search algorithm combining the HMM acoustic model and the N-gram language model;
a deep dialogue and three-dimensional scene interaction module, which retrieves the most similar case in the case library for the received text, performs map matching, expectation analysis and guiding with the map file obtained by the object recognition node so as to complete the user's expectation and generate a solution, and sends the answers and guidance information for the user in text form to the speech synthesis node;
a speech synthesis module, which uses TTS technology to turn the text obtained from the human-computer interaction into the corresponding speech signal through the three steps of text analysis, prosody modeling and speech synthesis, and feeds it back to the user;
a case library, a knowledge base of real-world experience built with XML files that borrows from the human pattern of remembering experience; cases are designed according to the actual scene, and each case contains the following essential attribute values: the attribute set and the solution of the case.
The beneficial effects produced by the present invention are:
1. In the interaction process of the present invention, both the user and the robot use natural language; the robot can autonomously guide the user to obtain the user's complete expectation, and obtains a solution by matching against the case library so as to execute the task.
2. The present invention adopts a deep interaction and inference mechanism oriented to Chinese speech, adding a deep dialogue and three-dimensional scene interaction module on the basis of the traditional CBR-BDI approach, and realizes interaction and reasoning both when the intention expressed by the user does not match the actual scene and when the intention expressed by the user is incomplete. Since the method supplements the unknown attributes of an intention through human-computer interaction, it is more accurate, flexible and practical than reasoning based on common sense. At the same time, with CBR-BDI as the basis of the inference mechanism, existing problems can be solved with past experience, feedback can be given on problems, and goals can be pursued autonomously, so the method has good market application prospects and development potential.
Brief description of the drawings
The present invention will be further explained below with reference to the accompanying drawings and embodiments, in which:
Fig. 1 is the hardware system architecture diagram of the robot deep interaction and inference device in an embodiment of the present invention;
Fig. 2 is the program flow chart of the robot deep interaction and inference method in an embodiment of the present invention;
Fig. 3 is the flow chart of the deep interaction and inference mechanism in an embodiment of the present invention;
Fig. 4 is the reasoning flow chart of the deep dialogue and three-dimensional scene interaction module in an embodiment of the present invention.
Detailed description of the embodiments
In order to make the purpose, technical scheme and advantages of the present invention clearer, the present invention is further elaborated below with reference to the embodiments. It should be appreciated that the specific embodiments described here serve only to explain the present invention and do not limit it.
As shown in Fig. 1, Fig. 1 is the hardware system architecture of the natural-language-based robot deep interaction and inference method proposed by the present invention when used for a robot sorting system. Speech is input through the microphone array, and text information is obtained by the speech recognition module; the text information is fed into the human-machine deep interaction and inference module, and the map file obtained by Kinect camera recognition is also sent to the deep interaction and inference module. The complete user expectation is obtained through the improved CBR-BDI inference mechanism, the coordinate position of the target is obtained from the map file, and a solution is generated. The system platform used in the present invention is the Ubuntu (version 12.04) embedded platform.
Fig. 2 is the program flow chart of the natural-language-based robot deep interaction and inference method implemented by the present invention, which proceeds as follows. The speech recognition process converts the user's input into text; after word segmentation, case retrieval is performed in the case library and the case attributes obtained by the retrieval are analyzed. If the number of case attributes is greater than 0, map matching is performed; if the number of initial-state attributes in the case is 0, the user input is invalid, and guiding cases are extracted from the guiding library to guide the user until the number of state attributes exceeds 0, after which map matching is performed. If the quantity of objects desired by the user matches the quantity of objects in the map, expectation analysis is carried out; if not, guiding is needed until the desired quantity and the quantity in the map match. Finally, the attribute values of the case and of the map match are combined into the user's expectation, and expectation analysis checks whether the expectation is complete: the expectation is complete if none of the required attributes obtained for the current case is empty, and incomplete otherwise (that is, completeness is judged by whether the attributes obtained for the current case include all the attributes of the matched case). If the expectation is incomplete, further guiding is needed until it is complete; once complete, the required information is extracted and a solution, i.e. the user intention, is generated.
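The control flow of Fig. 2 can be summarized by the sketch below; the module callables bundled in `pipeline` are a hypothetical decomposition of the components described in this document, not the patent's actual API:

```python
def interact(pipeline, speech_input, kinect_map):
    """Top-level loop of Fig. 2: recognize, retrieve, guide until complete, reuse."""
    p = pipeline
    case = p.retrieve_case(p.segment(p.recognize(speech_input)))
    while p.count_attributes(case) == 0:              # invalid input: guide actively
        case = p.retrieve_case(p.segment(p.guide_user("invalid_input")))
    while not p.quantities_match(case, kinect_map):   # map matching on object counts
        p.update_case(case, p.guide_user("quantity_mismatch"))
    while not p.expectation_complete(case):           # all required attributes non-empty?
        p.update_case(case, p.guide_user("missing_attribute"))
    plan = p.reuse_solution(case, kinect_map)         # bind solution to map coordinates
    p.speak(p.render_text(plan))                      # feed the answer back via TTS
    return plan                                       # the executable action sequence
```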
Fig. 3 is the flow chart of the natural-language-based robot deep interaction and inference method, which mainly comprises five parts: speech recognition, case storage, case attribute acquisition, deep dialogue with three-dimensional scene interaction, and speech synthesis.
The specific implementation of the present invention is as follows:
S1: Speech recognition
S11: The user inputs speech information through the microphone array; the raw input speech signal is processed, irrelevant information and background noise are filtered out, and endpoint detection, framing and pre-emphasis of the speech signal are performed.
S12: Feature extraction is performed with the Mel-frequency cepstral coefficient (MFCC) algorithm. The speech waveform is split into frames of roughly 10 ms each, and from each frame 39 numbers representing that frame of speech are extracted; these 39 numbers are the MFCC features of the frame and are represented as a feature vector.
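A sketch of the 39-dimensional per-frame feature described above, assuming 13 MFCCs plus their first- and second-order deltas and the librosa library (the patent names no toolkit, and the exact composition of the 39 numbers is an assumption):

```python
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=16000)    # hypothetical recorded input
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                            hop_length=160)        # 160 samples = 10 ms at 16 kHz
delta1 = librosa.feature.delta(mfcc)               # first-order differences
delta2 = librosa.feature.delta(mfcc, order=2)      # second-order differences
features = np.vstack([mfcc, delta1, delta2])       # shape (39, n_frames)
```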
S13: Acoustic modeling is performed with hidden Markov models (HMMs). A statistical model is established for the time-series structure of the speech signal, and a Markov chain with a finite number of states simulates the statistical variation of the speech signal.
S14: Language modeling is performed with the N-gram model, which describes the relationships between words. This technical scheme obtains the N-gram language model with the CMUCLMTK training tool provided by CMU.
S15: Using the Viterbi algorithm based on dynamic programming, the posterior probability of each decoding state sequence with respect to the observation sequence is computed at every time point and every state; the maximum-probability path is retained, and the corresponding state information is recorded at each node so that the word decoding sequence can finally be read out in reverse.
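A self-contained sketch of the dynamic-programming Viterbi search described in S15, in the log domain over a generic HMM (the real decoder searches a word-level recognition network built from the acoustic model, language model and dictionary):

```python
import numpy as np

def viterbi(obs_loglik, log_trans, log_init):
    """Most probable state path. obs_loglik: (T, N) log P(o_t | state);
    log_trans: (N, N) log transition matrix; log_init: (N,) log initial probs."""
    T, N = obs_loglik.shape
    delta = log_init + obs_loglik[0]              # best log-prob ending in each state
    back = np.zeros((T, N), dtype=int)            # backpointers recorded per node
    for t in range(1, T):
        scores = delta[:, None] + log_trans       # scores[i, j]: from state i to j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + obs_loglik[t]
    path = [int(delta.argmax())]                  # read the path out in reverse
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```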
S2: Case storage
In this technical scheme the case library is stored as XML files. The case library contains cases 1 to n designed according to the actual scene, and each case has the following essential attribute values: the attribute set of the case, the solution of the case (a sequence of robot actions), and the final attribute set generated after interaction with the environment and reasoning.
For each newly generated case, an initial attribute set is obtained by similarity matching; this initial attribute set changes continuously with the interactive reasoning process, and once a complete intention has finally been generated, the end state is stored in the final attribute set.
In the present design, cases are divided into a query theme and a sorting theme. The attribute set of a case includes: object quantity, object name, object position, object color, object size, and the name of the destination where the object is placed. For example, for the case "grab one red big apple and place it in the left basket", the attributes are assigned as: object quantity: "one"; object name: "apple"; object position: empty; object color: "red"; object size: "big"; destination name: "left basket".
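One possible XML layout for the example case above, parsed with the standard library; the element names are illustrative, since the patent does not disclose the exact schema:

```python
import xml.etree.ElementTree as ET

case_xml = """<case theme="sorting">
  <attributes>
    <quantity>one</quantity>
    <name>apple</name>
    <position/>
    <color>red</color>
    <size>big</size>
    <destination>left basket</destination>
  </attributes>
  <solution>locate; approach; grasp; move; place</solution>
</case>"""

case = ET.fromstring(case_xml)
# Attributes still empty (like <position/>) are filled later by guiding.
missing = [e.tag for e in case.find("attributes")
           if not (e.text and e.text.strip())]
print(missing)   # ['position']
```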
S3: Case attribute acquisition
S31: The text obtained in S1 is segmented with a segmenter. Example 1: the user's speech is converted to the text "grab one apple" (抓取一个苹果); after segmentation the result is: "grab / one / apple" (抓取/一个/苹果).
S32: Each word after segmentation is matched against the case library. If no similar case is retrieved, a new case is established; if a similar case is retrieved, it is returned and the number of initial case attributes is counted. Example 2: for the case "grab one apple", the initial case attributes are object quantity and object name, so the initial attribute count is 2. When the initial attribute count is greater than 0, map matching is performed; when it equals 0, the input is invalid and the robot actively guides the user.
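The patent does not name the segmenter used in S31; as an illustration, the open-source jieba segmenter reproduces the segmentation of Example 1 (the exact output depends on jieba's dictionary):

```python
import jieba

text = "抓取一个苹果"              # Example 1: "grab one apple", from speech recognition
tokens = list(jieba.cut(text))     # word segmentation
print("/".join(tokens))            # expected: 抓取/一个/苹果
```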
S4: Deep dialogue and three-dimensional scene interaction; the detailed process is shown in Fig. 4:
S41: Map matching
S411: The system needs to obtain a high-quality semantic map of the operating environment through 3D visual environment perception. In this design scheme, 3D point cloud images are extracted by the Kinect and CSHOT object models are established for feature matching in the scene. The Point Cloud Library (PCL) is called, and real-time recognition and understanding of common everyday rigid objects is realized with a method based on local surface feature descriptors. Object detection is realized by a region-growing segmentation algorithm and the ISS feature points of the scene are extracted; CSHOT feature description vectors are computed at the keypoints; candidate models are generated by 3D feature matching based on a distance threshold; transformation hypotheses are generated by the random sample consensus algorithm and verified by the iterative closest point algorithm, producing a solution that remains globally consistent with the scene; and the coordinate information of the objects is converted into the robot coordinate system by coordinate transformation. The identifiers and geometric information of the objects are written into the XML semantic map file.
The object attributes in the XML map file include: the number of objects in the scene map; the name of the object, such as apple or orange; the color of the object; the shape of the object, such as cylinder or cube; and the size of the object, i.e. length * width * height, (π * base radius²) * height, and so on.
S412: The XML map is searched for the objects the user desires and their quantity is counted, and the match between the user's desired object quantity and the map is computed. Four match cases can occur here: (1) there is no qualifying object in the scene; (2) the quantity of objects in the scene is less than the quantity the user desires; (3) the two quantities are exactly equal; (4) the quantity of objects in the scene is greater than the quantity the user desires.
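The four match cases of S412 reduce to a comparison of the desired and observed object counts, as in this sketch (the return codes are illustrative):

```python
def map_match(desired_count: int, scene_count: int) -> int:
    """Classify the match between the user's desired quantity and the scene."""
    if scene_count == 0:
        return 1    # (1) no qualifying object in the scene
    if scene_count < desired_count:
        return 2    # (2) fewer objects in the scene than desired
    if scene_count == desired_count:
        return 3    # (3) exact match: proceed to expectation analysis (S42)
    return 4        # (4) more objects than desired: guiding needed to disambiguate
```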
S42: Expectation analysis
Case (3) of S412 indicates that the user's expectation is valid and determinate, and the next step, expectation analysis, can proceed; cases (1), (2) and (4) of S412 require user guiding with the method of S43.
In expectation analysis, the attribute values of the case and of the map match are combined into the user's expectation and analyzed. If the expectation is complete, case reuse can proceed and an intention, i.e. a robot action sequence, is generated; otherwise the expectation is incomplete and the guiding case library is called to guide the user.
Example 3: there is one red big apple in the map file, and the user says to the robot: "grab one apple". Map matching yields case (3): the quantity of objects in the scene equals the quantity requested by the user. During expectation analysis, however, the destination-name attribute value is empty, so the expectation is incomplete and the user must be guided with the method of S43 (a completeness-check sketch follows).
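The completeness check referenced above can be sketched as follows; the required-attribute names are illustrative (Example 3 fails because the destination is still empty):

```python
def expectation_complete(attributes: dict) -> bool:
    """Complete iff every required attribute has a non-empty value."""
    required = ("quantity", "name", "color", "size", "destination")
    return all(attributes.get(k) for k in required)

attrs = {"quantity": "one", "name": "apple", "color": "red",
         "size": "big", "destination": ""}     # destination still unknown
assert not expectation_complete(attrs)         # incomplete: guide the user (S43)
```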
S43: User guiding
When the expectation obtained by expectation analysis is incomplete, user guiding is carried out. Every attribute node in the case storage has a corresponding dialogue generating function, and the set of these functions constitutes the guiding library. In Example 3 the expectation is incomplete, so the guiding library is searched and the robot asks the user about the missing attribute: "Which basket should it be put into?"; the case attributes are then completed according to the information fed back by the user.
S44: Case reuse and complete intention generation
When the intention is incomplete, a complete expectation is generated through one or more rounds of guiding (all required attributes are non-empty, though not every attribute needs a value). The solution corresponding to this complete expectation is called from the case library and reused after matching with the real-time three-dimensional environment information, generating a sequence of executable actions and thereby accomplishing the specified job task.
S5: Speech synthesis
S51: Text analysis
The input text is normalized and possible user splicing errors are handled; nonstandard or unpronounceable characters are filtered out. The boundaries of the words and phrases in the text are analyzed and the pronunciation of each word is determined, including the pronunciation of the numbers, surnames, special characters and various polyphonic characters that appear in the text, as well as the stress pattern to use when characters share a pronunciation. Finally, the input text is converted into internal parameters that the computer can process, for subsequent modules to process further and generate the corresponding information.
S52: Prosody modeling
Suprasegmental features are planned for the synthesized speech; the prosodic parameters include fundamental frequency, duration and intensity, so that the synthesized speech expresses the meaning correctly and sounds more natural.
S53: Speech synthesis
According to the result of prosody modeling, the text is converted into speech output with the pitch-synchronous overlap-add (PSOLA) method. The speech units corresponding to the individual characters or phrases of the processed text are extracted from the speech synthesis library, their prosodic characteristics are adjusted and modified with the specific speech synthesis technique, and the satisfactory speech is finally synthesized and fed back to the user through the audio device.
It should be understood that those of ordinary skill in the art can make improvements or changes based on the above description, and all such improvements and changes shall fall within the protection scope of the appended claims of the present invention.

Claims (5)

1. A natural-language-based robot deep interaction and inference method, characterized by comprising the following steps:
1) speech recognition: receiving the user's speech input, processing the input signal, and obtaining text information;
2) case attribute acquisition: performing word segmentation on the text obtained in step 1), then matching the segmented text against the cases in a case library with text similarity based on the vector space model to extract the attributes of the case;
the case library is used to store cases designed in advance according to the actual scene; each case has three essential attribute values, comprising: the initial attribute set of the case, the solution of the case, and the final attribute set generated after interaction with the environment and reasoning;
3) deep dialogue and three-dimensional scene interaction: if the user intention obtained from the case attributes extracted in step 2) is incomplete, repeatedly guiding the user with the real-time map file obtained by a Kinect sensor until a complete intention is obtained, then generating a solution for the job task expressed by the complete intention;
the deep dialogue and three-dimensional scene interaction process of step 3) specifically comprises the following steps:
3.1) when the inference engine receives a speech input, the robot judges the input speech against the map information obtained by the Kinect sensor; if it is unrelated to the current map information, the robot guides the user; if the user input is related to the current map information, the robot matches the user input against the cases in the case library, and if a similar case exists, matches the user input information against the map information obtained by the Kinect sensor, judges whether the user's expectation can be satisfied, and feeds the result back to the user;
3.2) after case retrieval and map matching, the inference engine has obtained the corresponding task attributes and matching degree, and analyzes this information to obtain the user's expectation; if the computed expectation is complete, no further guiding is needed and the process moves to step 3.4); if the expectation is incomplete, further user guiding is needed and the process moves to step 3.3);
3.3) a guiding case library is built with an XML file, the guiding library containing the guiding schemes addressed to the user for the missing attributes when the user's expectation is incomplete; each attribute of the user's expectation is compared one by one with the case attributes of the guiding case library, scoring 1 where they are identical and 0 where they differ; the scores are summed, the case with the maximum total is the best case, and its guiding scheme is used to guide the user, until a complete user expectation is obtained;
3.4) the solution corresponding to this complete expectation is called from the case library and reused after matching with the real-time three-dimensional environment information, generating a sequence of executable actions, thereby accomplishing the specified job task;
4) speech synthesis: the inference engine presents the obtained solution in text form and sends it to the user as speech.
2. The natural-language-based robot deep interaction and inference method according to claim 1, characterized in that the speech recognition process of step 1) specifically comprises the following steps:
1.1) preprocessing: collecting the user's speech information through a microphone array, processing the raw input speech signal, filtering out irrelevant information and background noise, and performing endpoint detection, framing and pre-emphasis of the speech signal;
1.2) feature extraction: extracting the key characteristic parameters reflecting the features of the speech signal to form a feature vector sequence;
1.3) performing acoustic modeling with hidden Markov models, and during recognition matching the speech to be recognized against the acoustic model to obtain the recognition result;
1.4) performing grammatical and semantic analysis on a training text database, and training an N-gram language model based on a statistical model, narrowing the search space and thereby improving the recognition rate;
1.5) for the input speech signal, building a recognition network from the trained HMM acoustic model, the language model and the dictionary, and finding the best path through this network with a search algorithm; this path outputs the word string of the speech signal with maximum probability, thereby determining the words contained in the speech sample.
3. The natural-language-based robot deep interaction and inference method according to claim 1, characterized in that the case library in step 2) is built with the following steps:
conversation topics are designed according to demand, and a subject tree is designed according to the conversation topics; the subject tree is divided into theme nodes, required-attribute nodes and leaf nodes, and each node carries a binary validity flag;
dialogue generating functions are written for the nodes of the subject tree, and the set of these functions constitutes the guiding library; under different system states, calling these functions produces different response outputs; each dialogue generating function is responsible only for the response of its corresponding node, so the functions do not interfere with one another in design or modification.
4. The natural-language-based robot deep interaction and inference method according to claim 1, characterized in that the case attribute acquisition process in step 2) specifically comprises the following steps:
2.1) performing word segmentation on the text obtained in step 1), i.e. splitting the text into individual words;
2.2) matching the segmented text against the cases in the case library; since each case contains the features of a problem and the corresponding task attributes, the task attributes of the most similar case are extracted when it is retrieved.
5. A natural-language-based robot deep interaction and inference device, characterized by comprising:
a point cloud acquisition module, for fusing the map depth information and color information collected by the Kinect to generate three-dimensional point cloud data, and performing feature matching against an object feature database after preprocessing, keypoint extraction and descriptor extraction, to obtain a three-dimensional scene semantic map description file;
a speech recognition module, for denoising the speech signal input by the user and collected by the microphone array, extracting features with the MFCC algorithm, and then converting the speech signal into text with a speech decoding search algorithm combining the HMM acoustic model and the N-gram language model;
a deep dialogue and three-dimensional scene interaction module, for retrieving the most similar case in the case library for the received text, performing map matching, expectation analysis and guiding with the map file obtained by the object recognition node so as to complete the user's expectation and generate a solution, and sending the answers and guidance information for the user in text form to the speech synthesis node;
a speech synthesis module, for using TTS technology to turn the text obtained from the human-computer interaction into the corresponding speech signal through the three steps of text analysis, prosody modeling and speech synthesis, and feeding it back to the user;
a case library, for storing the cases designed in advance according to the actual scene, each case containing the following essential attribute values: the attribute set and the solution of the case.
CN201610302605.5A 2016-05-09 2016-05-09 Natural-language-based robot deep interaction and inference method and device Active CN106056207B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610302605.5A CN106056207B (en) Natural-language-based robot deep interaction and inference method and device


Publications (2)

Publication Number Publication Date
CN106056207A (en) 2016-10-26
CN106056207B (en) 2018-10-23

Family

ID=57176186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610302605.5A Active CN106056207B (en) Natural-language-based robot deep interaction and inference method and device

Country Status (1)

Country Link
CN (1) CN106056207B (en)

Families Citing this family (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6439806B2 (en) * 2017-01-11 2018-12-19 富士ゼロックス株式会社 Robot apparatus and program
CN106847271A (en) * 2016-12-12 2017-06-13 北京光年无限科技有限公司 A kind of data processing method and device for talking with interactive system
CN107066444B (en) * 2017-03-27 2020-11-03 上海奔影网络科技有限公司 Corpus generation method and apparatus based on multi-round interaction
CN106997243B (en) * 2017-03-28 2019-11-08 北京光年无限科技有限公司 Speech scene monitoring method and device based on intelligent robot
CN107423398B (en) * 2017-07-26 2023-04-18 腾讯科技(上海)有限公司 Interaction method, interaction device, storage medium and computer equipment
CN109522531B (en) * 2017-09-18 2023-04-07 腾讯科技(北京)有限公司 Document generation method and device, storage medium and electronic device
CN107622523B (en) * 2017-09-21 2018-08-21 石器时代(内蒙古)智能机器人科技有限公司 A kind of intelligent robot
CN107919126A (en) * 2017-11-24 2018-04-17 合肥博焱智能科技有限公司 A kind of intelligent speech interactive system
CN108009285B (en) * 2017-12-22 2019-04-26 重庆邮电大学 Forest Ecology man-machine interaction method based on natural language processing
CN107993651B (en) * 2017-12-29 2021-01-19 深圳和而泰数据资源与云技术有限公司 Voice recognition method and device, electronic equipment and storage medium
CN108114469A (en) * 2018-01-29 2018-06-05 北京神州泰岳软件股份有限公司 Game interaction method, apparatus, terminal and game interaction model based on dialogue
CN110399471A (en) * 2018-04-25 2019-11-01 北京快乐智慧科技有限责任公司 A kind of guiding situational dialogues method and system
CN111755009A (en) * 2018-06-26 2020-10-09 苏州思必驰信息科技有限公司 Voice service method, system, electronic device and storage medium
CN110750626B (en) * 2018-07-06 2022-05-06 中国移动通信有限公司研究院 Scene-based task-driven multi-turn dialogue method and system
CN109063840A (en) * 2018-07-10 2018-12-21 广州极天信息技术股份有限公司 A kind of Interactive Dynamic inference method and device
CN110853674A (en) * 2018-07-24 2020-02-28 中兴通讯股份有限公司 Text collation method, apparatus, and computer-readable storage medium
CN109119064A (en) * 2018-09-05 2019-01-01 东南大学 A kind of implementation method suitable for overturning the Oral English Teaching system in classroom
CN109243451A (en) * 2018-10-22 2019-01-18 武汉科技大学 A kind of network marketing method and system based on robot voice interaction
CN109766072B (en) * 2018-12-17 2022-02-01 深圳壹账通智能科技有限公司 Information verification input method and device, computer equipment and storage medium
CN109724603A (en) * 2019-01-08 2019-05-07 北京航空航天大学 A kind of Indoor Robot air navigation aid based on environmental characteristic detection
CN110047480A (en) * 2019-04-22 2019-07-23 哈尔滨理工大学 Added Management robot head device and control for the inquiry of department, community hospital
CN110096707B (en) * 2019-04-29 2020-09-29 北京三快在线科技有限公司 Method, device and equipment for generating natural language and readable storage medium
CN111935348A (en) * 2019-05-13 2020-11-13 阿里巴巴集团控股有限公司 Method and device for providing call processing service
WO2020258082A1 (en) * 2019-06-26 2020-12-30 深圳市欢太科技有限公司 Information recommendation method and apparatus, electronic device and storage medium
CN110310620B (en) * 2019-07-23 2021-07-13 苏州派维斯信息科技有限公司 Speech fusion method based on native pronunciation reinforcement learning
CN110784603A (en) * 2019-10-18 2020-02-11 深圳供电局有限公司 Intelligent voice analysis method and system for offline quality inspection
CN110955675B (en) * 2019-10-30 2023-12-19 中国银联股份有限公司 Robot dialogue method, apparatus, device and computer readable storage medium
CN110928302A (en) * 2019-11-29 2020-03-27 华中科技大学 Man-machine cooperative natural language space navigation method and system
CN110956958A (en) * 2019-12-04 2020-04-03 深圳追一科技有限公司 Searching method, searching device, terminal equipment and storage medium
CN112233666A (en) * 2020-10-22 2021-01-15 中国科学院信息工程研究所 Method and system for storing and retrieving Chinese voice ciphertext in cloud storage environment
CN112100338B (en) * 2020-11-02 2022-02-25 北京淇瑀信息科技有限公司 Dialog theme extension method, device and system for intelligent robot
CN112435658A (en) * 2020-12-18 2021-03-02 中国南方电网有限责任公司 Human-computer interaction system for natural language processing dialogue exchange based on corpus
CN112732743B (en) * 2021-01-12 2023-09-22 北京久其软件股份有限公司 Data analysis method and device based on Chinese natural language
CN113034592B (en) * 2021-03-08 2021-08-31 西安电子科技大学 Three-dimensional scene target detection modeling and detection method based on natural language description
CN114265920B (en) * 2021-12-27 2022-07-01 北京易聊科技有限公司 Intelligent robot conversation method and system based on signals and scenes
CN115527538B (en) * 2022-11-30 2023-04-07 广汽埃安新能源汽车股份有限公司 Dialogue voice generation method and device
CN116804691B (en) * 2023-06-28 2024-02-13 国网安徽省电力有限公司青阳县供电公司 Fault monitoring method for dispatching automation equipment of power system


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101551947A (en) * 2008-06-11 2009-10-07 俞凯 Computer system for assisting spoken language learning
CN101510222A (en) * 2009-02-20 2009-08-19 北京大学 Multilayer index voice document searching method and system thereof
CN101604204A (en) * 2009-07-09 2009-12-16 北京科技大学 Distributed cognitive technology for intelligent emotional robot

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
* 程志强: Design and Implementation of a Shopping-Guide Robot Based on Speech Recognition and Text Understanding (基于语音识别与文字理解的导购机器人设计与实现); China Master's Theses Full-text Database, Information Science and Technology; 2015-03-31; pp. I140-462 *

Also Published As

Publication number Publication date
CN106056207A (en) 2016-10-26

Similar Documents

Publication Publication Date Title
CN106056207B (en) Natural-language-based robot deep interaction and inference method and device
Pandey et al. Deep learning techniques for speech emotion recognition: A review
US20180203946A1 (en) Computer generated emulation of a subject
CN112037754B (en) Method for generating speech synthesis training data and related equipment
CN107851434A (en) Use the speech recognition system and method for auto-adaptive increment learning method
CN106971709A (en) Statistic parameter model method for building up and device, phoneme synthesizing method and device
CN115329779B (en) Multi-person dialogue emotion recognition method
Bhosale et al. End-to-End Spoken Language Understanding: Bootstrapping in Low Resource Scenarios.
CN111862952B (en) Dereverberation model training method and device
CN116863038A (en) Method for generating digital human voice and facial animation by text
KR20200084443A (en) System and method for voice conversion
CN111653270B (en) Voice processing method and device, computer readable storage medium and electronic equipment
CN104538025A (en) Method and device for converting gestures to Chinese and Tibetan bilingual voices
Basak et al. Challenges and Limitations in Speech Recognition Technology: A Critical Review of Speech Signal Processing Algorithms, Tools and Systems.
Ling An acoustic model for English speech recognition based on deep learning
Asadiabadi et al. Multimodal speech driven facial shape animation using deep neural networks
CN116758451A (en) Audio-visual emotion recognition method and system based on multi-scale and global cross attention
CN107123420A (en) Voice recognition system and interaction method thereof
Vlasenko et al. Fusion of acoustic and linguistic information using supervised autoencoder for improved emotion recognition
CN108629024A (en) A kind of teaching Work attendance method based on voice recognition
CN113257225A (en) Emotional voice synthesis method and system fusing vocabulary and phoneme pronunciation characteristics
Mahavidyalaya Phoneme and viseme based approach for lip synchronization
CN113538645A (en) Method and device for matching body movement and language factor of virtual image
CN115424616A (en) Audio data screening method, device, equipment and computer readable medium
CN110910904A (en) Method for establishing voice emotion recognition model and voice emotion recognition method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant