CN106227721B - Chinese Prosodic Hierarchy forecasting system - Google Patents
- Publication number
- CN106227721B (application CN201610642956.0A)
- Authority
- CN
- China
- Prior art keywords
- module
- text
- word
- feature
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Abstract
The invention discloses a Chinese prosodic hierarchy prediction system. The system includes: a text analysis module that outputs analyzed text data; a text feature parameterization module that outputs parameterized text features; a character-word vector joint training module that receives the analyzed text data produced by the text analysis module and outputs a character-enhanced word vector representation model of the text; a word vector generation module that uses the character-enhanced word vector representation model to output the character-enhanced word vectors of the analyzed text data; a first single-classifier training module that outputs a first mapping model; a second single-classifier training module that outputs a second mapping model; a feature importance ranking module that outputs text parameter features with predetermined classification performance; and a model fusion module that outputs the result of the prosodic hierarchy structure prediction. Embodiments of the present invention improve the accuracy of prosodic hierarchy prediction.
Description
Technical field
Embodiments of the present invention relate to the field of speech synthesis technology for human-computer interaction, and more particularly to a Chinese prosodic hierarchy structure prediction system.
Background art
Accurate description of prosodic hierarchy, and the prediction of prosodic hierarchy structure from text, have always been a key step in speech synthesis; they are an important part of improving the naturalness and expressiveness of synthesized speech and of building harmonious human-computer interaction technology. A prosodic structure model captures the cadence and emphasis of speech, and thereby improves the expressiveness and naturalness of synthesized speech. Prosodic structure modeling and prediction are of great significance to the development of speech synthesis, human-computer interaction, and related fields.
Although much research has been done in this field, prosodic hierarchy prediction still faces several problems that have not been well solved. They are mainly the following:
1. The description of text features is inaccurate. Most existing work describes the text using only a few traditional features such as part of speech and word length; few works consider character vector features in the description of text features, and they account neither for the fact that the same character may have different meanings in different words, nor for the possible influence that the characters inside a word may exert on the word vector under different contexts.
2. The accuracy of a single model's prediction generally falls short of the ideal, so the naturalness of synthesized speech is significantly degraded, which in turn harms the listening experience.
In view of this, the present invention is specifically proposed.
Summary of the invention
To solve the above technical problems, embodiments of the present invention provide a Chinese prosodic hierarchy prediction system that improves the accuracy of prosodic hierarchy prediction.
To achieve the above goal, according to one aspect of the invention, the following technical scheme is provided:
A Chinese prosodic hierarchy prediction system, the prediction system including:
a text analysis module, configured to receive text data to be analyzed and output analyzed text data;
a text feature parameterization module, connected to the text analysis module and configured to receive the analyzed text data and output parameterized text features;
a character-word vector joint training module, connected to the text analysis module and configured to receive the analyzed text data produced by the text analysis module, jointly train a language model based on character vectors and word vectors, and output a character-enhanced word vector representation model of the text;
a word vector generation module, configured to output, based on the analyzed text data produced by the text analysis module and using the character-enhanced word vector representation model, the character-enhanced word vectors of the analyzed text data;
a first single-classifier training module, connected to the text feature parameterization module and configured to train a first mapping model from the parameterized text features output by the text feature parameterization module to the prosodic hierarchy structure;
a second single-classifier training module, connected to the word vector generation module and configured to train a second mapping model from the character-enhanced word vectors output by the word vector generation module to the prosodic hierarchy structure;
a feature importance ranking module, connected to the first single-classifier training module and configured to output the text parameter features with predetermined classification performance;
a model fusion module, connected to the first single-classifier training module, the second single-classifier training module, and the feature importance ranking module, and configured to receive the first mapping model and the second mapping model output by the first and second single-classifier training modules and the text parameter features with predetermined classification performance output by the feature importance ranking module, and to fuse the first single-classifier training module and the second single-classifier training module at the decision level using an ensemble learning approach, so as to output the result of the prosodic hierarchy structure prediction.
Further, the text analysis module is specifically configured to regularize the text data to be analyzed, correct polyphone pronunciation errors, normalize numbers, and output the analyzed text data.
Further, the analyzed text data includes symbolic features and numeric features; the text feature parameterization module is specifically configured to process the symbolic features output by the text analysis module using one-hot representation, and to retain the numeric features output by the text analysis module, so as to output the parameterized text features.
Further, the analyzed text data also includes word segmentation features;
the character-word vector joint training module specifically includes:
a character position extraction module, configured to cluster each character according to the word segmentation features output by the text analysis module and the position in which the character appears within a word, and to extract character position information;
a character context clustering module, configured to cluster characters, based on the character position information extracted by the character position extraction module, according to their different contexts, and to represent the same character with multiple vectors;
a non-decomposable word list building module, configured to build a word list of non-decomposable words based on the analyzed text data output by the text analysis module;
a specific character-word vector joint training module, configured to output the character-enhanced word vector representation of the text according to the word list of non-decomposable words output by the non-decomposable word list building module and the output of the character context clustering module.
Further, the analyzed text data also includes word segmentation features;
the word vector generation module is specifically configured to construct a joint character-word vector model based on the word segmentation features output by the text analysis module and the character-enhanced word vector representation model of the text output by the character-word vector joint training module, combining the semantic information carried by the word and by the characters inside the word, and to output the character-enhanced word vectors of the analyzed text data through the mapping of the joint character-word vector model.
Further, the first single-classifier training module is specifically configured to use a conditional random field method to establish the first mapping model between the parameterized text features and the prosodic hierarchy structure.
Further, the second single-classifier training module is specifically configured to use a bidirectional long short-term memory recurrent neural network to establish the second mapping model between the character-enhanced word vectors and the prosodic hierarchy structure.
Further, the feature importance ranking module specifically includes:
a text feature set extraction module, configured to extract text feature sets from the parameterized text features by enumeration;
an F-score lift computing module, configured to compute, for each feature set extracted by the text feature set extraction module, the lift in F-score on the validation set when that feature set is used as the input of the first single-classifier training module;
a feature importance ranking output module, configured to rank the F-score lift values obtained by the F-score lift computing module and to output the text parameter features with predetermined classification performance.
Further, the model fusion module may specifically include:
a first single-classifier output module, connected to the first single-classifier training module and configured to determine, according to the first mapping model, the first probabilities that the prosodic hierarchy prediction is a pause or a non-pause;
a second single-classifier output module, connected to the second single-classifier training module and configured to determine, according to the second mapping model, the second probabilities that the prosodic hierarchy prediction is a pause or a non-pause;
an important feature generation module, connected to the feature importance ranking module and configured to output the important features of the prosodic hierarchy by computing the contribution of the text parameter features with predetermined classification performance to the F-score;
a fusion prediction module, configured to fuse, using iterated decision trees, the first probabilities output by the first single-classifier output module, the second probabilities output by the second single-classifier output module, and the important features output by the important feature generation module, so as to determine the prediction result of the prosodic boundaries of the prosodic hierarchy structure.
As can be seen from the above technical scheme, in embodiments of the present invention the text analysis module receives the text data to be analyzed and outputs the analyzed text data. The text is then described from two different aspects: the word vectors obtained by the character-word vector joint training module, and the traditional text features obtained by the text feature parameterization module; this allows the text features to be described more finely. The first single-classifier training module is connected to the text feature parameterization module and trains the first mapping model from the parameterized text features output by the text feature parameterization module to the prosodic hierarchy structure; the second single-classifier training module is connected to the word vector generation module and trains the second mapping model from the character-enhanced word vectors output by the word vector generation module to the prosodic hierarchy structure. For different prosodic levels, different text feature combinations and feature window lengths are used, which increases the accuracy of the model prediction. Embodiments of the present invention also generate more accurate character-enhanced word vectors through the character-word vector joint training module, and output text parameter features with predetermined classification performance through the feature importance ranking module connected to the first single-classifier training module, thereby improving the performance of prosody prediction. Finally, the model fusion module introduces the iterated decision tree algorithm from ensemble learning and fuses the outputs of the two single classifiers with the important features produced by the feature importance ranking module, which greatly improves the performance of prosody prediction, so that the naturalness and expressiveness of the synthesized speech are better.
Other features and advantages of the present invention will be set forth in the following description, and will in part become apparent from the description or be understood through the practice of the invention. The objectives and other advantages of the present invention can be realized and obtained by the means specifically pointed out in the written description, the claims, and the accompanying drawings.
Brief description of the drawings
Through the following detailed description in conjunction with the accompanying drawings, the above and other aspects, features, and advantages of the present invention will become more apparent, in which:
Fig. 1 is a structural schematic diagram of the Chinese prosodic hierarchy prediction system according to an exemplary embodiment;
Fig. 2 is a structural schematic diagram of the character-word vector joint training module according to another exemplary embodiment;
Fig. 3 is a structural schematic diagram of the feature importance ranking module according to an exemplary embodiment;
Fig. 4 is a structural schematic diagram of the model fusion module according to an exemplary embodiment.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in more detail below in conjunction with specific embodiments and with reference to the accompanying drawings. It should be noted that, in the absence of explicit limitations or conflicts, the embodiments of the invention and the technical features therein may be combined with one another to form technical solutions. In the drawings and the description, similar or identical parts use the same reference numbers, and the drawings may be simplified for convenience of labeling. Implementations not depicted or described in the drawings are in forms known to a person of ordinary skill in the art. In addition, although examples of parameters with particular values may be provided herein, it should be understood that a parameter need not be exactly equal to the corresponding value, but may approximate it within acceptable error tolerances or design constraints.
The general idea of the embodiments of the present invention is to train the word vector model jointly with character vectors, to model with a bidirectional long short-term memory recurrent neural network based on character-enhanced word vectors, and to fuse the results of the different classifiers using ensemble learning. Specifically, through the text analysis and parameterization modules, a traditional feature representation of the text is obtained; for predicting prosodic structures at different levels, the text is described using different text feature combinations and feature window lengths, and the text features are then used as the input of a conditional random field to build the first single classifier. Through the joint character-word training model, a character-enhanced vectorized representation of the text features is built, so that the text features are described by a set of vectorized parameters obtained statistically, which serve as the input of the bidirectional long short-term memory recurrent neural network to build the second single classifier. Finally, the feature importance ranking module generates important features that are beneficial to classification, and these features, together with the outputs of the first and second single classifiers, serve as the input of model fusion, where the iterated decision tree algorithm is used for fusion.
Fig. 1 is a structural schematic diagram of the Chinese prosodic hierarchy prediction system of an embodiment of the present invention. As shown in Fig. 1, the prediction system may include: a text analysis module 1, a text feature parameterization module 2, a character-word vector joint training module 3, a word vector generation module 4, a first single-classifier training module 5, a second single-classifier training module 6, a feature importance ranking module 7, and a model fusion module 8. The text analysis module 1 receives the text data to be analyzed and outputs the analyzed text data. The text feature parameterization module 2 is connected to the text analysis module 1, receives the analyzed text data, and outputs the parameterized text features. The character-word vector joint training module 3 is connected to the text analysis module 1, receives the analyzed text data produced by the text analysis module 1, jointly trains a language model based on character vectors and word vectors, and outputs the character-enhanced word vector representation model of the text. The word vector generation module 4 uses the character-enhanced word vector representation model output by the character-word vector joint training module 3 to output, based on the analyzed text data output by the text analysis module, the character-enhanced word vectors of the analyzed text data. The first single-classifier training module 5 is connected to the text feature parameterization module 2 and trains the first mapping model from the parameterized text features output by the text feature parameterization module 2 to the prosodic hierarchy structure. The second single-classifier training module 6 is connected to the word vector generation module 4 and trains the second mapping model from the character-enhanced word vectors output by the word vector generation module 4 to the prosodic hierarchy structure. The feature importance ranking module 7 is connected to the first single-classifier training module 5 and outputs the text parameter features with predetermined classification performance. The model fusion module 8 is connected to the first single-classifier training module 5, the second single-classifier training module 6, and the feature importance ranking module 7; it receives the first mapping model and the second mapping model output by the first and second single-classifier training modules and the text parameter features with predetermined classification performance output by the feature importance ranking module 7, and fuses the first single-classifier training module 5 and the second single-classifier training module 6 at the decision level using an ensemble learning approach, so as to output the result of the prosodic hierarchy structure prediction.
In the above embodiment, the text analysis module 1 may specifically be configured to regularize the text data to be analyzed, correct polyphone pronunciation errors, normalize numbers, and output the analyzed text data. The analyzed text data includes symbolic features and numeric features, where the symbolic features include word segmentation features.
When the text analysis module 1 regularizes the text data to be analyzed, rule-based methods are used to remove extraneous symbols from the text data. The analyzed text data output by the text analysis module 1 may include, but is not limited to, the word segmentation of the text, the part of speech, the word length, and the number of syllables in the sentence.
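The regularization step above can be sketched with simple rules; the symbol set, the digit-by-digit number reading, and the example sentence are illustrative assumptions, not the patent's actual rule base.

```python
# Hedged sketch of rule-based text regularization: strip extraneous symbols
# and expand digits into Chinese number words (illustrative rules only).
import re

DIGITS = "零一二三四五六七八九"

def normalize_number(match):
    """Read a digit string out digit-by-digit (one simple regularization rule)."""
    return "".join(DIGITS[int(d)] for d in match.group())

def regularize(text):
    text = re.sub(r"[#*@^~]+", "", text)           # remove extraneous symbols
    text = re.sub(r"\d+", normalize_number, text)  # normalize numbers
    return text

print(regularize("今天气温25度##"))  # → 今天气温二五度
```

A production front end would also need context-dependent number reading (e.g. quantities vs. years) and a polyphone disambiguation dictionary, which are omitted here.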
In the above embodiment, the text feature parameterization module 2 may specifically be configured to process the symbolic features output by the text analysis module 1 using one-hot representation, and to retain the numeric features output by the text analysis module 1, thereby outputting the parameterized text features.
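The parameterization just described can be sketched as follows; the part-of-speech tag set and the concrete feature layout are illustrative assumptions.

```python
# Hedged sketch of feature parameterization: symbolic features (e.g. a POS
# tag) become one-hot vectors, numeric features (e.g. word length) pass
# through unchanged. The tag vocabulary below is illustrative.

POS_TAGS = ["n", "v", "a", "d", "p"]  # assumed symbolic feature vocabulary

def parameterize(pos_tag, word_length):
    one_hot = [1.0 if t == pos_tag else 0.0 for t in POS_TAGS]
    return one_hot + [float(word_length)]  # numeric feature retained as-is

print(parameterize("v", 2))  # → [0.0, 1.0, 0.0, 0.0, 0.0, 2.0]
```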
In the above embodiment, when the character-word vector joint training module 3 trains the character-enhanced word vectors, the influence of the character vectors of the characters contained inside a word is considered during word vector training.
Fig. 2 schematically illustrates the structure of the character-word vector joint training module. As shown in Fig. 2, the character-word vector joint training module 3 may specifically include: a character position extraction module 31, a character context clustering module 32, a non-decomposable word list building module 33, and a specific character-word vector joint training module 34.
The character position extraction module 31 clusters each character according to the word segmentation features output by the text analysis module 1 and the position in which the character appears within a word, and extracts the character position information.
As an example, the character position extraction module 31 may cluster each character into three different classes according to whether it appears at the beginning, in the middle, or at the end of a word. Through the character position extraction module 31, the position-aware character-enhanced word vector representation of the text can be:

$$X_j = \frac{1}{2}\left(w_j + \frac{1}{N_j}\left(c_1^{B} + \sum_{k=2}^{N_j-1} c_k^{M} + c_{N_j}^{E}\right)\right)$$

where $c_1^{B}$ denotes the first character of word $x_j$; $c_k^{M}$ denotes the k-th character of word $x_j$, excluding the first and the last character; $c_{N_j}^{E}$ denotes the last character of word $x_j$; $N_j$ denotes the number of characters in word $x_j$; k denotes the character index; and j takes positive integer values.
By providing the character position extraction module 31, the embodiment of the present invention takes into account the position information of the character within the word, thereby eliminating character position ambiguity.
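The begin/middle/end clustering can be sketched as follows; treating a single-character word as word-initial is an assumption made for illustration.

```python
# Hedged sketch of character position extraction: each character of a
# segmented word is assigned one of three position classes (Begin / Middle /
# End), so that position-specific character vectors can be trained.

def position_classes(word):
    """Map each character of a word to its position class B, M, or E."""
    n = len(word)
    if n == 1:
        return [(word, "B")]  # assumption: a lone character counts as word-initial
    return [(ch, "B" if i == 0 else "E" if i == n - 1 else "M")
            for i, ch in enumerate(word)]

print(position_classes("图书馆"))  # → [('图', 'B'), ('书', 'M'), ('馆', 'E')]
```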
The character context clustering module 32 clusters characters, based on the character position information extracted by the character position extraction module 31, according to their different contexts, and represents the same character with multiple vectors.
For example, for a word $x_j = \{c_1, \ldots, c_{N_j}\}$, the context-clustered character-enhanced word vector representation of the text can be:

$$X_j = \frac{1}{2}\left(w_j + \frac{1}{N_j}\sum_{k=1}^{N_j} c_k^{most}\right), \qquad c_k^{most} = \arg\max_{c_k^{u}} S\!\left(c_k^{u}, \frac{1}{2K}\sum_{t=j-K}^{j+K} x_t\right)$$

where $X_j$ denotes the character-enhanced word vector representation of the text; $S(\cdot)$ denotes the cosine similarity function; K denotes the number of context words considered, i.e., the window length, preferably K = 5; $c_k^{most}$ denotes the character vector most frequently selected for word $x_j$ during training; $c_k^{u}$ denotes a candidate character vector selected for word $x_j$ during training; $w_j$ denotes the word vector of word $x_j$; $w_t$ denotes a plain word vector; k denotes the character index; and $x_t$ denotes the character-enhanced word representation used for the context.
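The prototype selection via cosine similarity can be sketched as follows; the toy 3-dimensional vectors and the two-prototype setup are illustrative assumptions.

```python
# Hedged sketch of character context clustering: among several vector
# prototypes of the same character, pick the one most similar (by cosine
# similarity, the S(.) of the formula) to the averaged context vector.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def select_prototype(prototypes, context_vectors):
    """Return the prototype with the highest cosine similarity to the mean context."""
    k = len(context_vectors)
    mean_ctx = [sum(v[i] for v in context_vectors) / k
                for i in range(len(context_vectors[0]))]
    return max(prototypes, key=lambda p: cosine(p, mean_ctx))

protos = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # two senses of one character
ctx = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]     # window of context vectors
print(select_prototype(protos, ctx))         # → [1.0, 0.0, 0.0]
```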
The non-decomposable word list building module 33 builds a word list of non-decomposable words based on the analyzed text data output by the text analysis module 1.
In practical applications, non-decomposable words are common in Mandarin Chinese, such as "sofa", "chocolate", and "hovering". In these words, the characters inside the word contribute essentially nothing to the semantics of the whole word. Therefore, to eliminate the influence of the characters inside non-decomposable words on the word vectors in Mandarin Chinese, the influence of the characters in the structure of a non-decomposable word must be ignored when training its word vector. When the embodiment of the present invention uses the non-decomposable word list building module 33, the influence of the characters inside the word on the word vector is not considered during training, and a word list of the non-decomposable words is built.
In order to keep the word vector dimensions consistent between decomposable and non-decomposable words, $X_j$ in the formula needs to be multiplied by 1/2.
The specific character-word vector joint training module 34 outputs the character-enhanced word vector representation model of the text according to the word list of non-decomposable words output by the non-decomposable word list building module 33 and the output of the character context clustering module 32.
Specifically, the character-word vector joint training module 34 considers the influence of the character vectors contained inside the word during word vector training; the character-enhanced word vector representation model $X_j$ of word $x_j$ can then be expressed as:

$$X_j = \frac{1}{2}\left(w_j + \frac{1}{N_j}\sum_{k=1}^{N_j} c_k\right)$$

where $w_j$ denotes the word vector of word $x_j$; $N_j$ denotes the number of characters in word $x_j$; $c_k$ denotes the character vector of the k-th character of word $x_j$; and k denotes the character index.
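The averaging formula, together with the non-decomposable word list, can be sketched as follows; the toy 2-dimensional vectors and the two-entry word list are illustrative assumptions, and returning the bare word vector for non-decomposable words is one simple reading of "ignoring the inner characters".

```python
# Hedged sketch of the character-enhanced word vector
# X_j = 1/2 * (w_j + (1/N_j) * sum_k c_k): average the character vectors,
# add the word vector, halve; words on the non-decomposable list skip the
# character term entirely.

NON_DECOMPOSABLE = {"沙发", "巧克力"}  # illustrative non-decomposable word list

def enhanced_word_vector(word, word_vec, char_vecs):
    if word in NON_DECOMPOSABLE:
        return word_vec  # ignore the characters inside the word entirely
    n = len(char_vecs)
    char_mean = [sum(c[i] for c in char_vecs) / n for i in range(len(word_vec))]
    return [0.5 * (w + m) for w, m in zip(word_vec, char_mean)]

print(enhanced_word_vector("词典", [1.0, 1.0], [[0.5, 0.0], [0.5, 1.0]]))
# → [0.75, 0.75]
```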
Through the character-word vector joint training module, the embodiment of the present invention can produce more accurate character-enhanced word vectors. The character-word vector joint training module considers the influence of the characters inside a word on the word vector. Moreover, for the character vectors, it takes into account factors such as the different positions a character occupies within different words and the different contexts in which a word occurs, so that one character is represented by different vectors, and applies this in the joint character-word training model. In addition, for the words in Chinese that cannot be decomposed, the influence of the characters inside the word on the word vector is not considered during training.
In the embodiments of the present invention, the trained character-enhanced word vector representation model of the text obtained by the character-word vector joint training module and the traditional parameterized text features obtained by the text feature parameterization module describe the text data to be analyzed from two different aspects, so that text features can be described more finely.
In the above embodiment, the word vector generation module 4 may specifically be configured to construct the joint character-word vector model based on the word segmentation features output by the text analysis module 1 and the character-enhanced word vector representation model of the text output by the character-word vector joint training module 3, combining the semantic information carried by the word and by the characters inside the word, and to output the character-enhanced word vectors of the analyzed text data through the mapping of the joint character-word vector model.
During training, the word vector generation module 4 comprehensively considers the semantic information carried by the word and by the characters inside the word. Once the trained joint character-word vector model is obtained, the word vector description of the input text can be obtained through the mapping of the model.
In the above embodiment, the first single-classifier training module 5 may specifically be configured to use a conditional random field method to establish the first mapping model between the parameterized text features and the prosodic hierarchy structure.
The first mapping model reflects the probability that each word is or is not followed by a pause at the prosodic level in question. Here, the prosodic hierarchy structure may include prosodic words, prosodic phrases, and intonation phrases.
In the prosodic structure prediction based on the first single-classifier training module 5, different text feature combinations and feature window lengths are adopted for different prosodic levels, which increases the accuracy of the model prediction.
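The per-level choice of window length can be sketched as feature extraction for a linear-chain CRF; the window sizes, feature names, and feature-dictionary format are illustrative assumptions, not the patent's tuned configuration.

```python
# Hedged sketch of building CRF input features with per-level window lengths:
# for each word position, collect the features of neighbouring words inside a
# window whose length depends on the prosodic level (sizes are illustrative).

WINDOW = {"prosodic_word": 1, "prosodic_phrase": 2, "intonation_phrase": 3}

def window_features(words, pos_tags, i, level):
    """Feature dict for position i, e.g. fed to a linear-chain CRF toolkit."""
    w = WINDOW[level]
    feats = {"bias": 1.0}
    for off in range(-w, w + 1):
        j = i + off
        if 0 <= j < len(words):
            feats[f"word[{off}]"] = words[j]
            feats[f"pos[{off}]"] = pos_tags[j]
    return feats

f = window_features(["我们", "去", "图书馆"], ["r", "v", "n"], 1, "prosodic_word")
print(sorted(f))
# → ['bias', 'pos[-1]', 'pos[0]', 'pos[1]', 'word[-1]', 'word[0]', 'word[1]']
```

Larger windows for higher prosodic levels let the classifier see the broader context that intonation-phrase boundaries depend on.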
In the above embodiment, the second single-classifier training module 6 may specifically be configured to use a bidirectional long short-term memory recurrent neural network to establish the second mapping model between the character-enhanced word vectors and the prosodic hierarchy structure.
The second mapping model reflects the probability that each word is or is not followed by a pause at the prosodic level in question.
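The recurrent building block can be sketched with scalar weights so the gate arithmetic is visible; hidden size 1, the shared toy weights, and the example sequence are illustrative assumptions, while a real model is vectorized and trained.

```python
# Hedged sketch of one LSTM cell step and a bidirectional pass over a short
# sequence, the building block behind the second mapping model.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    """Input, forget, and output gates plus candidate, all from x and h_prev."""
    i = sigmoid(W["i"][0] * x + W["i"][1] * h_prev + W["i"][2])
    f = sigmoid(W["f"][0] * x + W["f"][1] * h_prev + W["f"][2])
    o = sigmoid(W["o"][0] * x + W["o"][1] * h_prev + W["o"][2])
    g = math.tanh(W["g"][0] * x + W["g"][1] * h_prev + W["g"][2])
    c = f * c_prev + i * g                 # new cell state
    return math.tanh(c) * o, c             # new hidden state, cell state

W = {k: (1.0, 0.5, 0.0) for k in "ifog"}   # illustrative shared toy weights

def run(seq, reverse=False):
    h = c = 0.0
    out = []
    for x in (reversed(seq) if reverse else seq):
        h, c = lstm_step(x, h, c, W)
        out.append(h)
    return out[::-1] if reverse else out    # align outputs to original order

seq = [0.2, -0.4, 0.9]
forward, backward = run(seq), run(seq, reverse=True)
bidir = list(zip(forward, backward))        # per-word bidirectional states
print(all(-1.0 < f < 1.0 and -1.0 < b < 1.0 for f, b in bidir))  # → True
```

The concatenated forward/backward states per word would feed a softmax over pause vs. non-pause in a full model.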
Fig. 3 schematically illustrates the structure of the feature importance ranking module. As shown in Fig. 3, the feature importance ranking module 7 may specifically include: a text feature set extraction module 71, an F-Score lift computing module 72 and a feature importance ranking output module 73. The text feature set extraction module 71 extracts a text feature set from the parameterized text features by enumeration; that is, it enumerates the possible feature combinations that serve as the input of the feature importance ranking module 7. The F-Score lift computing module 72 computes, for each feature extracted by the text feature set extraction module 71, the lift in F-Score obtained on a validation set when that feature is used as input to the first single-classifier training module 5. The feature importance ranking output module 73 ranks the F-Score lift values obtained by the F-Score lift computing module 72 and outputs the parameterized text features with a predetermined classification performance.
The text features with the predetermined classification performance can be obtained by selecting, from the ranked results in descending order, a predetermined number of features as the importance features.
By providing the feature importance ranking module 7, the embodiment of the present invention improves the performance of prosody prediction.
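The ranking step can be sketched as follows (the F-Score here is the usual harmonic mean of precision and recall, and the per-feature scores are toy values, not measured results):

```python
# Illustrative sketch of the feature-importance ranking step: for each
# candidate feature, compare the F-Score of the classifier with and without
# the feature on a validation set, and rank features by the resulting lift.
# The per-feature F-Scores below are toy values, not measured results.

def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def rank_by_lift(baseline_f, f_with_feature):
    """Rank features by F-Score lift over the baseline, descending."""
    lifts = {name: f - baseline_f for name, f in f_with_feature.items()}
    return sorted(lifts.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    baseline = f_score(0.80, 0.80)
    with_feature = {
        "pos_tag":  f_score(0.85, 0.83),
        "word_len": f_score(0.81, 0.80),
        "punct":    f_score(0.84, 0.86),
    }
    for name, lift in rank_by_lift(baseline, with_feature):
        print(f"{name}: +{lift:.4f}")
```

Taking the top-N features of this ranking corresponds to selecting the predetermined number of importance features described above.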
Fig. 4 schematically illustrates the structure of the model fusion module. As shown in Fig. 4, the model fusion module 8 may specifically include: a first single-classifier output module 81, a second single-classifier output module 82, an important feature generation module 83 and a fusion prediction module 84. The first single-classifier output module 81 is connected to the first single-classifier training module 5 and determines, according to the first mapping model, the first probabilities of a pause and of no pause in the prosodic hierarchy prediction. The second single-classifier output module 82 is connected to the second single-classifier training module 6 and determines, according to the second mapping model, the second probabilities of a pause and of no pause in the prosodic hierarchy prediction. The important feature generation module 83 is connected to the feature importance ranking module 7 and outputs the important features of the prosodic hierarchy by computing the contribution to the F-Score of the text features with the predetermined classification performance. The fusion prediction module 84 fuses, by iterated decision trees, the first probabilities output by the first single-classifier output module 81, the second probabilities output by the second single-classifier output module 82 and the important features output by the important feature generation module 83, so as to determine the prediction result for the prosodic boundaries of the prosodic hierarchy structure.
The fusion prediction module 84 thus jointly considers the influence of the first single-classifier training module 5, the second single-classifier training module 6 and the feature importance ranking module 7 on the final result, and generates the prediction of the prosodic boundaries (pauses) at the current level; this result is then used as an input feature for the prediction of the next level.
By providing the model fusion module, the embodiment of the present invention introduces the iterated decision tree algorithm from ensemble learning to fuse the outputs of the two single classifiers with the importance features generated by the feature ranking module, which substantially improves the performance of prosody prediction and yields synthesized speech with better naturalness and expressiveness.
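The fusion can be sketched with a toy gradient-boosting loop over one-level decision stumps (an assumption about how "iterated decision tree fusion" might be realized in practice; the data, the learning rate and the squared-error objective are all invented for the example):

```python
# Sketch of the decision-level fusion, assumed here to behave like
# gradient-boosted decision trees from ensemble learning: this toy version
# boosts one-level decision stumps against squared-error residuals.
# Each sample is [p1, p2, ...]: the two single-classifier pause
# probabilities (plus, optionally, importance features); the label is
# pause (1) / no pause (0).

def fit_stump(X, residuals):
    """Best single-feature threshold split minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, thr, lv, rv)
    return best[1:]

def boost(X, y, rounds=20, lr=0.5):
    """Fit stumps iteratively, each one on the residuals of the current sum."""
    pred = [0.0] * len(X)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, thr, lv, rv = fit_stump(X, residuals)
        stumps.append((j, thr, lv, rv))
        pred = [p + lr * (lv if row[j] <= thr else rv) for row, p in zip(X, pred)]
    return stumps

def predict(stumps, row, lr=0.5):
    return sum(lr * (lv if row[j] <= thr else rv) for j, thr, lv, rv in stumps)

if __name__ == "__main__":
    # [p1, p2] pause probabilities from the two classifiers; label = pause?
    X = [[0.9, 0.8], [0.7, 0.9], [0.2, 0.3], [0.1, 0.4], [0.6, 0.2], [0.3, 0.7]]
    y = [1, 1, 0, 0, 1, 0]
    stumps = boost(X, y)
    print([round(predict(stumps, row)) for row in X])  # → [1, 1, 0, 0, 1, 0]
```

Because the boosted trees see both classifiers' probabilities as input features, the fusion can learn when to trust the CRF and when to trust the BLSTM, which is the point of combining them at the decision level.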
Therefore, through the text analysis module 1, the text feature parameterization module 2, the character-word vector joint training module 3, the word vector generation module 4, the first single-classifier training module 5, the second single-classifier training module 6, the feature importance ranking module 7 and the model fusion module 8, the embodiment of the present invention can predict, for any text, the prosodic hierarchy structure at the three levels of prosodic word, prosodic phrase and intonation phrase, which guides the back end of a speech synthesis system and thereby improves the naturalness and expressiveness of the synthesized speech.
It should be noted that when the Chinese prosodic hierarchy prediction system provided by the above embodiment predicts the prosodic hierarchy structure of Chinese text, the division into the above functional modules is only an example. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the modules in the embodiment of the present invention may be further decomposed or combined. For example, the modules of the above embodiment may be merged into one module, or further split into multiple sub-modules, to accomplish all or part of the functions described above. The names of the modules involved in the embodiment of the present invention are used only to distinguish the modules from one another and are not intended to improperly limit the present invention.
As used herein, the term "module" may refer to a software object or routine executed on a computing system (which may be implemented in a language such as C). The various modules described herein may be implemented as objects or processes executed on a computing system (for example, as separate threads). While the systems and methods described herein are preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.
The embodiment of the present invention may run on platforms such as Windows and Linux.
It will be understood by those skilled in the art that the above Chinese prosodic hierarchy prediction system may also include other well-known structures, such as processors, controllers and memories, where the memories include but are not limited to random access memory, flash memory, read-only memory, programmable read-only memory, volatile memory, non-volatile memory, serial memory, parallel memory, registers and the like, and the processors include but are not limited to CPLD/FPGA, DSP, ARM processors, MIPS processors and the like. In order not to unnecessarily obscure the embodiments of the present disclosure, these well-known structures are not shown in Fig. 1.
It should be understood that the number of each module in Fig. 1 is only schematic. Each module may be present in any number according to actual needs.
The technical solutions provided by the embodiments of the present invention have been described in detail above. Although specific examples are used herein to explain the principles and implementations of the present invention, the description of the above embodiments is only intended to help in understanding the principles of the embodiments of the present invention; meanwhile, those skilled in the art may make changes to the specific implementations and the scope of application according to the embodiments of the present invention.
It should be noted that the block diagrams referred to herein are not limited to the forms shown herein, and may be divided and/or combined in other ways.
It should be noted that the reference signs and text in the drawings are only intended to illustrate the present invention more clearly and are not intended to improperly limit the scope of the present invention.
It should also be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects, not to describe or indicate a particular order or precedence. It should be understood that data so used are interchangeable in appropriate circumstances, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described herein.
The term "comprising" or any other similar term is intended to cover a non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device.
The specific embodiments described above further explain the objectives, technical solutions and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (9)
1. A Chinese prosodic hierarchy prediction system, characterized in that the prediction system comprises:
a text analysis module, configured to receive text data to be analyzed and to output analyzed text data;
a text feature parameterization module, connected to the text analysis module and configured to receive the analyzed text data and to output parameterized text features;
a character-word vector joint training module, connected to the text analysis module and configured to receive the analyzed text data generated by the text analysis module, to jointly train a language model based on character vectors and word vectors, and to output a character-vector-enhanced word vector representation model of the text;
a word vector generation module, configured to use the character-vector-enhanced word vector representation model, based on the analyzed text data output by the text analysis module, to output the character-vector-enhanced word vectors of the analyzed text data;
a first single-classifier training module, connected to the text feature parameterization module and configured to train a first mapping model from the parameterized text features output by the text feature parameterization module to a prosodic hierarchy structure;
a second single-classifier training module, connected to the word vector generation module and configured to train a second mapping model from the character-vector-enhanced word vectors output by the word vector generation module to the prosodic hierarchy structure;
a feature importance ranking module, connected to the first single-classifier training module and configured to output text features with a predetermined classification performance;
a model fusion module, connected to the first single-classifier training module, the second single-classifier training module and the feature importance ranking module, and configured to receive the first mapping model and the second mapping model output by the first single-classifier training module and the second single-classifier training module and the text features with the predetermined classification performance output by the feature importance ranking module, and to fuse the first single-classifier training module and the second single-classifier training module at the decision level using an ensemble learning method, so as to output a result of the prosodic hierarchy structure prediction.
2. The prediction system according to claim 1, characterized in that the text analysis module is specifically configured to perform normalization processing on the text data to be analyzed, correct polyphonic-character pronunciation errors and normalize digits, and to output the analyzed text data.
3. The prediction system according to claim 1, characterized in that the analyzed text data includes symbolic features and numeric features;
the text feature parameterization module is specifically configured to process the symbolic features output by the text analysis module using a one-hot representation and to retain the numeric features output by the text analysis module, so as to output the parameterized text features.
4. The prediction system according to claim 1, characterized in that the analyzed text data further includes word segmentation features;
the character-word vector joint training module specifically includes:
a character position extraction module, configured to cluster each character according to the positions at which the character appears within words, based on the word segmentation features output by the text analysis module, and to extract character position information;
a character context clustering module, configured to cluster the different contexts of a character based on the character position information extracted by the character position extraction module, so that the same character is represented by multiple vectors;
a non-decomposable word list building module, configured to build a word list of non-decomposable words based on the analyzed text data output by the text analysis module;
a specific character-word vector joint training module, configured to output the character-vector-enhanced word vector representation of the text according to the word list of non-decomposable words output by the non-decomposable word list building module and the results output by the character context clustering module.
5. The prediction system according to claim 1, characterized in that the analyzed text data further includes word segmentation features;
the word vector generation module is specifically configured to combine the semantic information carried by words and by the characters within them, based on the word segmentation features output by the text analysis module and the character-vector-enhanced word vector representation model of the text output by the character-word vector joint training module, to construct a joint character-word vector model, and to map the text through the joint character-word vector model to output the character-vector-enhanced word vectors of the analyzed text data.
6. The prediction system according to claim 1, characterized in that the first single-classifier training module is specifically configured to establish, using a conditional random field method, the first mapping model that maps the relationship between the parameterized text features and the prosodic hierarchy structure.
7. The prediction system according to claim 1, characterized in that the second single-classifier training module is specifically configured to establish, using a bidirectional long short-term memory recurrent neural network, the second mapping model that maps the relationship between the character-vector-enhanced word vectors and the prosodic hierarchy structure.
8. The prediction system according to claim 1, characterized in that the feature importance ranking module specifically includes:
a text feature set extraction module, configured to extract a text feature set from the parameterized text features by enumeration;
an F-Score lift computing module, configured to compute, for each feature extracted by the text feature set extraction module, the lift in F-Score obtained on a validation set when that feature is used as input to the first single-classifier training module;
a feature importance ranking output module, configured to rank the F-Score lift values obtained by the F-Score lift computing module and to output the text features with the predetermined classification performance.
9. The prediction system according to claim 8, characterized in that the model fusion module specifically includes:
a first single-classifier output module, connected to the first single-classifier training module and configured to determine, according to the first mapping model, first probabilities of a pause and of no pause in the prosodic hierarchy prediction;
a second single-classifier output module, connected to the second single-classifier training module and configured to determine, according to the second mapping model, second probabilities of a pause and of no pause in the prosodic hierarchy prediction;
an important feature generation module, connected to the feature importance ranking module and configured to output important features of the prosodic hierarchy by computing the contribution to the F-Score of the text features with the predetermined classification performance;
a fusion prediction module, configured to fuse, by iterated decision trees, the first probabilities output by the first single-classifier output module, the second probabilities output by the second single-classifier output module and the important features output by the important feature generation module, so as to determine the prediction result for the prosodic boundaries of the prosodic hierarchy structure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610642956.0A CN106227721B (en) | 2016-08-08 | 2016-08-08 | Chinese Prosodic Hierarchy forecasting system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610642956.0A CN106227721B (en) | 2016-08-08 | 2016-08-08 | Chinese Prosodic Hierarchy forecasting system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106227721A CN106227721A (en) | 2016-12-14 |
CN106227721B true CN106227721B (en) | 2019-02-01 |
Family
ID=57547688
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610642956.0A Active CN106227721B (en) | 2016-08-08 | 2016-08-08 | Chinese Prosodic Hierarchy forecasting system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106227721B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106682217A (en) * | 2016-12-31 | 2017-05-17 | 成都数联铭品科技有限公司 | Method for enterprise second-grade industry classification based on automatic screening and learning of information |
CN106652995A (en) * | 2016-12-31 | 2017-05-10 | 深圳市优必选科技有限公司 | Voice broadcasting method and system for text |
CN108628868B (en) * | 2017-03-16 | 2021-08-10 | 北京京东尚科信息技术有限公司 | Text classification method and device |
CN107423284B (en) * | 2017-06-14 | 2020-03-06 | 中国科学院自动化研究所 | Method and system for constructing sentence representation fusing internal structure information of Chinese words |
CN107451115B (en) * | 2017-07-11 | 2020-03-06 | 中国科学院自动化研究所 | Method and system for constructing end-to-end Chinese prosody hierarchical structure prediction model |
CN107995428B (en) * | 2017-12-21 | 2020-02-07 | Oppo广东移动通信有限公司 | Image processing method, image processing device, storage medium and mobile terminal |
CN108595416A (en) * | 2018-03-27 | 2018-09-28 | 义语智能科技(上海)有限公司 | Character string processing method and equipment |
CN108549850B (en) * | 2018-03-27 | 2021-07-16 | 联想(北京)有限公司 | Image identification method and electronic equipment |
CN108595590A (en) * | 2018-04-19 | 2018-09-28 | 中国科学院电子学研究所苏州研究院 | A kind of Chinese Text Categorization based on fusion attention model |
CN108763487B (en) * | 2018-05-30 | 2021-08-10 | 华南理工大学 | Mean Shift-based word representation method fusing part-of-speech and sentence information |
CN110427608B (en) * | 2019-06-24 | 2021-06-08 | 浙江大学 | Chinese word vector representation learning method introducing layered shape-sound characteristics |
CN111178046A (en) * | 2019-12-16 | 2020-05-19 | 山东众阳健康科技集团有限公司 | Word vector training method based on sorting |
CN111226275A (en) * | 2019-12-31 | 2020-06-02 | 深圳市优必选科技股份有限公司 | Voice synthesis method, device, terminal and medium based on rhythm characteristic prediction |
CN111738360B (en) * | 2020-07-24 | 2020-11-27 | 支付宝(杭州)信息技术有限公司 | Two-party decision tree training method and system |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101452699A (en) * | 2007-12-04 | 2009-06-10 | 株式会社东芝 | Rhythm self-adapting and speech synthesizing method and apparatus |
CN104867490A (en) * | 2015-06-12 | 2015-08-26 | 百度在线网络技术(北京)有限公司 | Metrical structure predicting method and metrical structure predicting device |
CN104916284A (en) * | 2015-06-10 | 2015-09-16 | 百度在线网络技术(北京)有限公司 | Prosody and acoustics joint modeling method and device for voice synthesis system |
CN105185374A (en) * | 2015-09-11 | 2015-12-23 | 百度在线网络技术(北京)有限公司 | Prosodic hierarchy annotation method and device |
CN105244020A (en) * | 2015-09-24 | 2016-01-13 | 百度在线网络技术(北京)有限公司 | Prosodic hierarchy model training method, text-to-speech method and text-to-speech device |
CN105654939A (en) * | 2016-01-04 | 2016-06-08 | 北京时代瑞朗科技有限公司 | Voice synthesis method based on voice vector textual characteristics |
- 2016-08-08 CN CN201610642956.0A patent/CN106227721B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101452699A (en) * | 2007-12-04 | 2009-06-10 | 株式会社东芝 | Rhythm self-adapting and speech synthesizing method and apparatus |
CN104916284A (en) * | 2015-06-10 | 2015-09-16 | 百度在线网络技术(北京)有限公司 | Prosody and acoustics joint modeling method and device for voice synthesis system |
CN104867490A (en) * | 2015-06-12 | 2015-08-26 | 百度在线网络技术(北京)有限公司 | Metrical structure predicting method and metrical structure predicting device |
CN105185374A (en) * | 2015-09-11 | 2015-12-23 | 百度在线网络技术(北京)有限公司 | Prosodic hierarchy annotation method and device |
CN105244020A (en) * | 2015-09-24 | 2016-01-13 | 百度在线网络技术(北京)有限公司 | Prosodic hierarchy model training method, text-to-speech method and text-to-speech device |
CN105654939A (en) * | 2016-01-04 | 2016-06-08 | 北京时代瑞朗科技有限公司 | Voice synthesis method based on voice vector textual characteristics |
Non-Patent Citations (2)
Title |
---|
Prosodic word prediction using the lexical information; Honghui DONG et al; 《Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05》; 20060227; pages 189-193 *
Prosodic Structure Prediction Based on Deep Learning (基于深度学习的韵律结构预测); Ding Xingguang et al; 《NCMMSC2015》; 20151031; pages 1-5 *
Also Published As
Publication number | Publication date |
---|---|
CN106227721A (en) | 2016-12-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106227721B (en) | Chinese Prosodic Hierarchy forecasting system | |
CN105869634B (en) | It is a kind of based on field band feedback speech recognition after text error correction method and system | |
CN111026842B (en) | Natural language processing method, natural language processing device and intelligent question-answering system | |
CN111931506B (en) | Entity relationship extraction method based on graph information enhancement | |
CN103345922B (en) | A kind of large-length voice full-automatic segmentation method | |
WO2018028077A1 (en) | Deep learning based method and device for chinese semantics analysis | |
CN107330011A (en) | The recognition methods of the name entity of many strategy fusions and device | |
CN107577662A (en) | Towards the semantic understanding system and method for Chinese text | |
CN107818164A (en) | A kind of intelligent answer method and its system | |
CN110083700A (en) | A kind of enterprise's public sentiment sensibility classification method and system based on convolutional neural networks | |
CN110321432A (en) | Textual event information extracting method, electronic device and non-volatile memory medium | |
CN108711421A (en) | A kind of voice recognition acoustic model method for building up and device and electronic equipment | |
CN110502610A (en) | Intelligent sound endorsement method, device and medium based on text semantic similarity | |
CN106294344A (en) | Video retrieval method and device | |
CN108228758A (en) | A kind of file classification method and device | |
Anand Kumar et al. | A sequence labeling approach to morphological analyzer for tamil language | |
CN109918501A (en) | Method, apparatus, equipment and the storage medium of news article classification | |
CN110232123A (en) | The sentiment analysis method and device thereof of text calculate equipment and readable medium | |
CN110188195A (en) | A kind of text intension recognizing method, device and equipment based on deep learning | |
CN110851601A (en) | Cross-domain emotion classification system and method based on layered attention mechanism | |
CN113051914A (en) | Enterprise hidden label extraction method and device based on multi-feature dynamic portrait | |
CN109582788A (en) | Comment spam training, recognition methods, device, equipment and readable storage medium storing program for executing | |
CN110019741A (en) | Request-answer system answer matching process, device, equipment and readable storage medium storing program for executing | |
JP2018005690A (en) | Information processing apparatus and program | |
CN110853656A (en) | Audio tampering identification algorithm based on improved neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||