CN106227721A - Chinese prosodic hierarchy prediction system - Google Patents

Chinese prosodic hierarchy prediction system

Info

Publication number
CN106227721A
CN106227721A
Authority
CN
China
Prior art keywords
module
text
word
feature
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610642956.0A
Other languages
Chinese (zh)
Other versions
CN106227721B (en)
Inventor
陶建华
郑艺斌
李雅
温正棋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201610642956.0A
Publication of CN106227721A
Application granted
Publication of CN106227721B
Active (legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Abstract

The invention discloses a Chinese prosodic hierarchy prediction system. The system includes: a text analysis module that outputs analyzed text data; a text feature parameterization module that outputs parameterized text features; a word-character vector joint training module that receives the analyzed text data generated by the text analysis module and outputs a character-vector-enhanced word vector representation model of the text; a word vector generation module that uses the character-vector-enhanced word vector representation model to output the character-vector-enhanced word vectors of the analyzed text data; a first single-classifier training module that outputs a first mapping model; a second single-classifier training module that outputs a second mapping model; a feature importance ranking module that outputs text parameter features with a predetermined classification performance; and a model fusion module that outputs the prosodic hierarchy prediction result. Embodiments of the present invention improve the accuracy of prosodic hierarchy prediction.

Description

Chinese prosodic hierarchy prediction system
Technical field
Embodiments of the present invention relate to the field of speech synthesis for human-computer interaction, and in particular to a Chinese prosodic hierarchy prediction system.
Background technology
Accurately describing and predicting prosodic hierarchy structure from text has always been an important step in speech synthesis; it is key to improving the naturalness and expressiveness of synthesized speech and an important component of building harmonious human-computer interaction technology. A prosodic structure model can capture the intonation and rhythm of speech and thereby improve the expressiveness and naturalness of synthesized speech. Prosodic structure modeling and prediction are therefore significant for the development of speech synthesis, human-computer interaction, and related fields.
Although much research has been done in this field, prosodic hierarchy prediction still has many problems that remain unsolved.
They are mainly the following:
1. Text features are not described accurately enough. Most existing work describes text with relatively traditional features such as part of speech and word length; little work considers character vector features when describing text, or accounts for the fact that a character may carry different meanings in different words and that the characters inside a word may influence the word vector differently in different contexts.
2. The prediction accuracy of a single model generally falls short of an ideal level, so the naturalness of synthesized speech suffers significantly, which in turn degrades the listening experience.
In view of this, the present invention is proposed.
Summary of the invention
To solve the existing technical problems, embodiments of the present invention provide a Chinese prosodic hierarchy prediction system to improve the accuracy of prosodic hierarchy prediction.
To achieve these goals, according to one aspect of the invention, the following technical scheme is provided:
A Chinese prosodic hierarchy prediction system, the prediction system comprising:
a text analysis module for receiving text data to be analyzed and outputting analyzed text data;
a text feature parameterization module, connected with the text analysis module, for receiving the analyzed text data and outputting parameterized text features;
a word-character vector joint training module, connected with the text analysis module, for receiving the analyzed text data generated by the text analysis module, jointly training a language model based on character vectors and word vectors, and outputting a character-vector-enhanced word vector representation model of the text;
a word vector generation module for using the character-vector-enhanced word vector representation model, based on the analyzed text data output by the text analysis module, to output the character-vector-enhanced word vectors of the analyzed text data;
a first single-classifier training module, connected with the text feature parameterization module, for training a first mapping model from the parameterized text features output by the text feature parameterization module to the prosodic hierarchy structure;
a second single-classifier training module, connected with the word vector generation module, for training a second mapping model from the character-vector-enhanced word vectors output by the word vector generation module to the prosodic hierarchy structure;
a feature importance ranking module, connected with the first single-classifier training module, for outputting text parameter features with a predetermined classification performance;
a model fusion module, connected with the first single-classifier training module, the second single-classifier training module, and the feature importance ranking module, for receiving the first mapping model and the second mapping model output by the first and second single-classifier training modules and the text parameter features with the predetermined classification performance output by the feature importance ranking module, and for fusing the first single-classifier training module and the second single-classifier training module at the decision level using an ensemble learning method, thereby outputting the prosodic hierarchy prediction result.
Further, the text analysis module is specifically for normalizing the text data to be analyzed, correcting polyphone pronunciation errors, normalizing digits, and outputting the analyzed text data.
Further, the analyzed text data includes symbolic features and numeric features; the text feature parameterization module is specifically for processing the symbolic features output by the text analysis module with the one-hot representation and retaining the numeric features output by the text analysis module, thereby outputting the parameterized text features.
Further, the analyzed text data also includes word segmentation features;
The word-character vector joint training module specifically includes:
a character position extraction module for clustering each character according to the word segmentation features output by the text analysis module and the position at which the character appears within a word, and extracting character position information;
a character context clustering module for clustering the characters by their different contexts, based on the character position information extracted by the character position extraction module, and representing the same character with multiple vectors;
a non-decomposable word list building module for building a word list of non-decomposable words based on the analyzed text data output by the text analysis module;
a core word-character vector joint training module for outputting the character-vector-enhanced word vector representation of the text according to the non-decomposable word list output by the non-decomposable word list building module and the output of the character context clustering module.
Further, the analyzed text data also includes word segmentation features;
The word vector generation module is specifically for building a joint word-character vector model based on the word segmentation features output by the text analysis module and the character-vector-enhanced word vector representation model of the text output by the word-character vector joint training module, combining the semantic information carried by the word and by the characters inside it, and outputting the character-vector-enhanced word vectors of the analyzed text data through the mapping of the joint word-character vector model.
Further, the first single-classifier training module is specifically for using the conditional random field method to build the first mapping model of the relationship between the parameterized text features and the prosodic hierarchy structure.
Further, the second single-classifier training module is specifically for using a bidirectional long short-term memory recurrent neural network to build the second mapping model of the relationship between the character-vector-enhanced word vectors and the prosodic hierarchy structure.
Further, the feature importance ranking module specifically includes:
a text feature set extraction module for extracting text feature sets from the parameterized text features by enumeration;
an F-score gain computing module for computing, for each feature extracted by the text feature set extraction module, the gain in F-score on the validation set when that feature is used as the input of the first single-classifier training module;
a feature importance ranking output module for sorting the F-score gains obtained by the F-score gain computing module and outputting the text parameter features with the predetermined classification performance.
Further, the model fusion module may specifically include:
a first single-classifier output module, connected with the first single-classifier training module, for determining, according to the first mapping model, the first probabilities of a pause and of no pause in the prosodic hierarchy prediction;
a second single-classifier output module, connected with the second single-classifier training module, for determining, according to the second mapping model, the second probabilities of a pause and of no pause in the prosodic hierarchy prediction;
a key feature generation module, connected with the feature importance ranking module, for outputting the key features of the prosodic hierarchy by computing the contribution to the F-score of the text parameter features with the predetermined classification performance;
a fusion prediction module for fusing, with iterative decision trees, the first probability output by the first single-classifier output module, the second probability output by the second single-classifier output module, and the key features output by the key feature generation module, to determine the prediction result for the prosodic boundaries of the prosodic hierarchy structure.
As can be seen from the above technical scheme, embodiments of the present invention receive text data to be analyzed through the text analysis module and output analyzed text data. The word vectors obtained by the word-character vector joint training module and the traditional text features obtained by the text feature parameterization module then describe the text from two different aspects, allowing text features to be described more finely. The first single-classifier training module is connected with the text feature parameterization module and trains the first mapping model from the parameterized text features to the prosodic hierarchy structure; the second single-classifier training module is connected with the word vector generation module and trains the second mapping model from the character-vector-enhanced word vectors to the prosodic hierarchy structure. For different prosodic levels, different combinations of text features and different feature window lengths are used, which increases the accuracy of model prediction. Embodiments of the present invention also produce more accurate character-enhanced word vectors through the word-character vector joint training module. The feature importance ranking module connected with the first single-classifier training module outputs the text parameter features with the predetermined classification performance, improving the performance of prosody prediction. Finally, the model fusion module introduces the iterative decision tree algorithm from ensemble learning to fuse the outputs of the two single classifiers with the important features produced by the feature ranking module, greatly improving the performance of prosody prediction, so that the naturalness and expressiveness of synthesized speech are better.
Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description or be understood by practicing the present invention. The objectives and other advantages of the present invention can be realized and obtained by the means particularly pointed out in the written description, the claims, and the accompanying drawings.
Brief description of the drawings
The above and other aspects, features, and advantages of the present invention will become clearer from the following detailed description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a structural schematic diagram of the Chinese prosodic hierarchy prediction system according to an exemplary embodiment;
Fig. 2 is a structural schematic diagram of the word-character vector joint training module according to another exemplary embodiment;
Fig. 3 is a structural schematic diagram of the feature importance ranking module according to an exemplary embodiment;
Fig. 4 is a structural schematic diagram of the model fusion module according to an exemplary embodiment.
Detailed description of the invention
To make the objectives, technical schemes, and advantages of the present invention clearer, the present invention is described in more detail below with reference to specific embodiments and the accompanying drawings. It should be noted that, unless explicitly limited otherwise or in conflict, the embodiments of the invention and the technical features therein may be combined with one another to form technical schemes. In the drawings and the description, similar or identical parts use the same reference numbers, and the drawings are simplified or use convenient notation. Implementations not illustrated or described are in forms known to a person of ordinary skill in the art. In addition, although examples of parameters with particular values may be provided herein, it should be understood that a parameter need not exactly equal the corresponding value, but may approximate it within acceptable error tolerances or design constraints.
The general idea of the embodiments of the present invention is to train the word vector model jointly with character vectors, to model with a bidirectional long short-term memory recurrent neural network based on the character-enhanced word vectors, and to fuse the results of the different classifiers with an ensemble learning method. Specifically, through the text analysis and parameterization modules, a traditional feature representation of the text is obtained; for predicting the prosodic structure at different levels, different combinations of text features and feature window lengths are used to describe the text, and the text features then serve as the input of a conditional random field to build the first single classifier. Through the word-character joint training model, the character-enhanced word vector representation of the text features is built; the text features are thereby described by a set of vectorized parameters obtained statistically, which serve as the input of the bidirectional long short-term memory recurrent neural network to build the second single classifier. Finally, the feature importance ranking module produces the key features that benefit classification, and these features, together with the outputs of the first and second single classifiers, serve as the input of model fusion, where the iterative decision tree algorithm is used.
Fig. 1 is a structural schematic diagram of the Chinese prosodic hierarchy prediction system of an embodiment of the present invention. As shown in Fig. 1, the prediction system may include: a text analysis module 1, a text feature parameterization module 2, a word-character vector joint training module 3, a word vector generation module 4, a first single-classifier training module 5, a second single-classifier training module 6, a feature importance ranking module 7, and a model fusion module 8. The text analysis module 1 receives text data to be analyzed and outputs analyzed text data. The text feature parameterization module 2 is connected with the text analysis module 1, receives the analyzed text data, and outputs parameterized text features. The word-character vector joint training module 3 is connected with the text analysis module 1, receives the analyzed text data generated by the text analysis module 1, jointly trains a language model based on character vectors and word vectors, and outputs the character-vector-enhanced word vector representation model of the text. The word vector generation module 4 uses the character-vector-enhanced word vector representation model output by the word-character vector joint training module 3, based on the analyzed text data output by the text analysis module, to output the character-vector-enhanced word vectors of the analyzed text data. The first single-classifier training module 5 is connected with the text feature parameterization module 2 and trains the first mapping model from the parameterized text features output by the text feature parameterization module 2 to the prosodic hierarchy structure. The second single-classifier training module 6 is connected with the word vector generation module 4 and trains the second mapping model from the character-vector-enhanced word vectors output by the word vector generation module 4 to the prosodic hierarchy structure. The feature importance ranking module 7 is connected with the first single-classifier training module 5 and outputs the text parameter features with the predetermined classification performance. The model fusion module 8 is connected with the first single-classifier training module 5, the second single-classifier training module 6, and the feature importance ranking module 7; it receives the first mapping model and the second mapping model output by the first and second single-classifier training modules and the text parameter features with the predetermined classification performance output by the feature importance ranking module 7, and fuses the first single-classifier training module 5 and the second single-classifier training module 6 at the decision level using an ensemble learning method, thereby outputting the prosodic hierarchy prediction result.
In the above embodiment, the text analysis module 1 may specifically normalize the text data to be analyzed, correct polyphone pronunciation errors, normalize digits, and output the analyzed text data. The analyzed text data includes symbolic features and numeric features, the symbolic features including word segmentation features.
When the text analysis module 1 normalizes the text data to be analyzed, rule-based methods are used to remove redundant symbols from the text data. The analyzed text data output by the text analysis module 1 may include, but is not limited to, the word segmentation, parts of speech, word lengths, and syllable counts of the sentence.
In the above embodiment, the text feature parameterization module 2 may specifically process the symbolic features output by the text analysis module 1 with the one-hot representation and retain the numeric features output by the text analysis module 1, thereby outputting the parameterized text features.
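As a minimal sketch of this parameterization step, symbolic features can be one-hot encoded while numeric features pass through unchanged. The feature names, the toy part-of-speech inventory, and the helper functions below are illustrative assumptions, not from the patent.

```python
# Sketch: parameterize text features. Symbolic features (e.g. part of speech)
# become one-hot vectors; numeric features (e.g. word length) are kept as-is.

def one_hot(value, vocabulary):
    """Return a one-hot list over a fixed vocabulary."""
    vec = [0] * len(vocabulary)
    vec[vocabulary.index(value)] = 1
    return vec

POS_TAGS = ["a", "d", "n", "u", "v"]  # toy part-of-speech inventory (assumption)

def parameterize(token):
    """token: dict with symbolic 'pos' and numeric 'word_len', 'syllables'."""
    return one_hot(token["pos"], POS_TAGS) + [token["word_len"], token["syllables"]]

features = parameterize({"pos": "n", "word_len": 2, "syllables": 2})
```

The resulting vector concatenates the one-hot block with the raw numeric values, which is the usual way to feed mixed symbolic/numeric features to a downstream classifier.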
In the above embodiment, when the word-character vector joint training module 3 trains the character-enhanced word vectors, the influence of the characters inside a word on the word vector is considered during word vector training.
Fig. 2 schematically illustrates the structure of the word-character vector joint training module. As shown in Fig. 2, the word-character vector joint training module 3 may specifically include: a character position extraction module 31, a character context clustering module 32, a non-decomposable word list building module 33, and a core word-character vector joint training module 34.
The character position extraction module 31 clusters each character according to the word segmentation features output by the text analysis module 1 and the position at which the character appears within the word, and extracts the character position information.
As an example, the character position extraction module 31 may cluster each character into three different classes according to whether it appears at the beginning, middle, or end of a word. With the character position extraction module 31, the character-vector-enhanced word vector representation that takes character position into account is:

X_j = \frac{1}{2}\left(w_j + \frac{1}{N_j}\left(c_1^B + \sum_{k=2}^{N_j-1} c_k^M + c_{N_j}^E\right)\right)
where c_1^B denotes the (position-specific vector of the) first character of word x_j; c_k^M denotes the k-th character of x_j, excluding the first and last characters; c_{N_j}^E denotes the last character of x_j; w_j denotes the word vector of x_j; N_j denotes the number of characters in x_j; k denotes the character index; and j is a positive integer.
By providing the character position extraction module 31, the embodiment of the present invention takes the position information of a character within the word into account, thereby eliminating character position ambiguity.
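The position-enhanced formula above can be sketched as follows. Each character keeps three vectors, one per position class (Begin/Middle/End); the word vector is averaged with the mean of the position-specific character vectors. The embedding tables, dimensions, and values are invented for illustration.

```python
# Sketch: X_j = (w_j + (1/N_j) * (c_1^B + sum of c_k^M + c_Nj^E)) / 2,
# using plain Python lists as vectors. Embeddings are toy values.

def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]

def vec_scale(a, s):
    return [x * s for x in a]

def position_enhanced(word_vec, char_vecs_B, char_vecs_M, char_vecs_E, chars):
    """chars: characters of a multi-character word; each character has a
    vector per position class (Begin / Middle / End)."""
    n = len(chars)
    total = list(char_vecs_B[chars[0]])            # first character: B vector
    for c in chars[1:-1]:                          # interior characters: M vectors
        total = vec_add(total, char_vecs_M[c])
    total = vec_add(total, char_vecs_E[chars[-1]]) # last character: E vector
    char_part = vec_scale(total, 1.0 / n)
    return vec_scale(vec_add(word_vec, char_part), 0.5)

B = {"北": [1.0, 0.0], "京": [0.0, 0.0]}
M = {"北": [0.0, 0.0], "京": [0.0, 0.0]}
E = {"北": [0.0, 0.0], "京": [0.0, 1.0]}
Xj = position_enhanced([0.5, 0.5], B, M, E, ["北", "京"])
```

A single-character word would need a convention for which position vector to use; the patent does not specify this, so the sketch assumes words of two or more characters.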
The character context clustering module 32 clusters the characters by their different contexts, based on the character position information extracted by the character position extraction module 31, and represents the same character with multiple vectors.
For example, for a word x_j = \{c_1, \ldots, c_{N_j}\}, the character-vector-enhanced word vector representation of the text that takes context clustering into account is:

r_k^{\max} = \arg\max_{r_k} S\left(c_k^{r_k}, V_{context}\right)

V_{context} = \sum_{t=j-K}^{j+K} x_t = \sum_{t=j-K}^{j+K} \frac{1}{2}\left(w_t + \frac{1}{N_t}\sum_{c_u \in x_t} c_u^{most}\right)

X_j = \frac{1}{2}\left(w_j + \frac{1}{N_j}\sum_{k=1}^{N_j} c_k^{r_k^{\max}}\right)
where X_j denotes the character-vector-enhanced word vector representation of the text; S(\cdot) denotes the cosine similarity function; K denotes the number of context words considered, i.e., the window length, preferably K = 5; c_u^{most} denotes the character vector most frequently selected by the word during training; c_u denotes a character vector selected by the word during training; w_j denotes the word vector of x_j; w_t denotes the plain word vector; k denotes the character index; r_k^{\max} denotes the optimal cluster of the k-th character within the word; and x_t denotes a character-enhanced word.
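The cluster selection r_k^max above can be sketched as follows: among a character's several context-cluster vectors, pick the one with the highest cosine similarity to the context vector V_context. The number of clusters and all vector values below are illustrative assumptions.

```python
# Sketch: choose the context-cluster vector of a character that is most
# cosine-similar to the context vector V_context (window of +/- K words).
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def best_cluster(char_cluster_vecs, v_context):
    """char_cluster_vecs: one vector per context cluster of the character.
    Returns the index r_k^max of the best-matching cluster."""
    sims = [cosine(v, v_context) for v in char_cluster_vecs]
    return sims.index(max(sims))

clusters = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]  # toy cluster vectors
r_max = best_cluster(clusters, [0.1, 0.9])
```

In training, this selection is what lets the same character be represented by different vectors in different contexts.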
The non-decomposable word list building module 33 builds the word list of non-decomposable words based on the analyzed text data output by the text analysis module 1.
In practice, Mandarin Chinese contains many non-decomposable words, such as "sofa" (沙发), "chocolate" (巧克力), and "hovering" (徘徊). In these words, the characters inside the word contribute essentially nothing to the semantics of the whole word. Therefore, to eliminate the influence of the characters of non-decomposable Mandarin words on the word vectors, the influence of the constituent characters must be ignored when training the word vectors of non-decomposable words. When the embodiment of the present invention uses the non-decomposable word list building module 33, the influence of the characters inside such a word on the generated word vector is not considered during training, and the word list of non-decomposable words is built for these hard-to-decompose words.
To keep the word vector dimensions of decomposable and non-decomposable words consistent, the formula for X_j needs to be multiplied by 1/2.
The core word-character vector joint training module 34 outputs the character-vector-enhanced word vector representation model of the text according to the word list of non-decomposable words output by the non-decomposable word list building module 33 and the output of the character context clustering module 32.
When the core word-character vector joint training module 34 considers the influence of the character vectors inside a word while training the word vectors, the character-vector-enhanced word vector representation model X_j of word x_j can be expressed as:

X_j = w_j + \frac{1}{N_j}\sum_{k=1}^{N_j} c_k

where w_j denotes the word vector of x_j; N_j denotes the number of characters in x_j; c_k denotes the character vector of the k-th character of x_j; and k denotes the character index.
The embodiment of the present invention can produce more accurate character-enhanced word vectors through the word-character vector joint training module. The word-character vector joint training module considers the influence of the characters inside a word on the word vector. Moreover, for the character vectors, factors such as the different positions of a character within words and the different contexts in which the character occurs are taken into account, so that the same character is represented by different vectors, and this is applied in the word-character joint training model. In addition, for the words in Chinese that cannot be decomposed, the influence of their internal characters on the word vectors is not considered during training.
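The special-casing of non-decomposable words can be sketched as below: words on the list keep only their word vector, while other words follow the basic enhanced formula X_j = w_j + (1/N_j) Σ c_k. The word list and vectors are toy examples, not the patent's data.

```python
# Sketch: joint word-character vector with a non-decomposable word list.
# Listed words ignore their internal characters entirely.

NON_DECOMPOSABLE = {"沙发", "巧克力"}   # "sofa", "chocolate" (toy list)

def joint_word_vector(word, word_vec, char_vecs):
    if word in NON_DECOMPOSABLE:
        return list(word_vec)             # ignore internal characters
    n = len(word)
    summed = [sum(char_vecs[c][d] for c in word) for d in range(len(word_vec))]
    return [w + s / n for w, s in zip(word_vec, summed)]

cv = {"好": [1.0, 1.0], "人": [1.0, -1.0]}
v1 = joint_word_vector("沙发", [0.2, 0.2], cv)   # non-decomposable: word vector only
v2 = joint_word_vector("好人", [0.0, 0.0], cv)   # decomposable: add averaged chars
```

The design choice here mirrors the text: semantically opaque transliterations and binding words should not inherit meaning from their characters.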
The character-vector-enhanced word vector representation model of the text obtained by training in the word-character vector joint training module and the traditional parameterized text features obtained by the text feature parameterization module describe the text data to be analyzed from two different aspects, making it possible to describe text features more finely.
In the above embodiment, the word vector generation module 4 may specifically build the joint word-character vector model based on the word segmentation features output by the text analysis module 1 and the character-vector-enhanced word vector representation model of the text output by the word-character vector joint training module 3, combining the semantic information carried by the word and by the characters inside it, and output the character-vector-enhanced word vectors of the analyzed text data through the mapping of this joint model.
The word vector generation module 4 jointly considers, during training, the semantic information carried by the word and by the characters inside it. Once the trained joint word-character vector model is obtained, the vector description data of the input text words can be obtained through the mapping of this model.
In the above embodiment, the first single-classifier training module 5 may specifically use the conditional random field method to build the first mapping model of the relationship between the parameterized text features and the prosodic hierarchy structure.
The first mapping model reflects the probability of a pause, or of no pause, after each word at its prosodic level.
Here, the prosodic hierarchy structure may include prosodic words, prosodic phrases, and intonation phrases.
In the prosodic structure prediction based on the first single-classifier training module 5, different combinations of text features and different feature window lengths are used for different prosodic levels, which increases the accuracy of model prediction.
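One way to realize the level-specific feature windows for the CRF input is to emit, per word, a feature dict containing its neighbors' features within a window whose length depends on the prosodic level. The window lengths, feature names, and key format below are assumptions for illustration; the patent does not specify them.

```python
# Sketch: build windowed feature sequences as CRF input, with a different
# window length per prosodic level (all values are assumptions).

WINDOWS = {"prosodic_word": 1, "prosodic_phrase": 2, "intonation_phrase": 3}

def windowed_features(tokens, level):
    """tokens: per-word feature dicts; returns one dict per word including
    the features of neighbors within the level-specific window."""
    w = WINDOWS[level]
    out = []
    for i in range(len(tokens)):
        feats = {}
        for off in range(-w, w + 1):
            j = i + off
            if 0 <= j < len(tokens):
                for k, v in tokens[j].items():
                    feats["%+d:%s" % (off, k)] = v   # e.g. "-1:pos"
        out.append(feats)
    return out

toks = [{"pos": "n"}, {"pos": "v"}, {"pos": "n"}]
fw = windowed_features(toks, "prosodic_word")
```

Dicts of this shape are the conventional input format for CRF sequence-labeling toolkits, which would then be trained to tag each word with pause/no-pause at the given level.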
In the above embodiment, the second single-classifier training module 6 may specifically use a bidirectional long short-term memory recurrent neural network to build the second mapping model of the relationship between the character-vector-enhanced word vectors and the prosodic hierarchy structure.
The second mapping model reflects the probability of a pause, or of no pause, after each word at its prosodic level.
Fig. 3 schematically illustrates the structure of feature importance ranking module.As it is shown on figure 3, feature importance ranking mould Block 7 specifically may include that text feature set extraction module 71, F-Score value promote computing module 72 and feature importance row Sequence output module 73.Wherein, text feature set extraction module 71 is for carrying by enumerative technique based on parameterized text feature Take text feature set.Text feature set extraction module 71 extracts various possible feature by enumerative technique and combines as feature The input of importance ranking module 7.F-Score value promotes computing module 72 for calculating text feature set extraction module respectively 71 each features extracted are respectively as the F-Score brought on checking collection during the input of the first single classifier training module 5 The lifting values of value.Feature importance ranking output module 73 promotes, to F-Score value, each F-Score that computing module 72 obtains The lifting values of value is ranked up, and output has the text parameter feature of predtermined category performance.
Here, the text parameter features having the predetermined classification performance may be obtained by selecting a predetermined number of the top-ranked features in the ranking result as the important features.
By providing the feature importance ranking module 7, the embodiment of the present invention improves the performance of prosody prediction.
Fig. 4 schematically illustrates the structure of the model fusion module. As shown in Fig. 4, the model fusion module 8 may specifically include: a first single-classifier output module 81, a second single-classifier output module 82, an important feature generation module 83 and a fusion prediction module 84. The first single-classifier output module 81 is connected to the first single-classifier training module 5 and determines, according to the first mapping model, the first probabilities of a pause and of no pause in the prosodic hierarchy prediction. The second single-classifier output module 82 is connected to the second single-classifier training module 6 and determines, according to the second mapping model, the second probabilities of a pause and of no pause. The important feature generation module 83 is connected to the feature importance ranking module 7 and outputs the important features of the prosodic hierarchy by computing the contribution of the text parameter features having the predetermined classification performance to the F-Score. The fusion prediction module 84 fuses, by means of an iterated decision tree, the first probabilities output by the first single-classifier output module 81, the second probabilities output by the second single-classifier output module 82 and the important features output by the important feature generation module 83, so as to determine the prosodic boundary prediction result of the prosodic hierarchy.
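The decision-level fusion input can be sketched as follows; it merely assembles, per word, the two classifiers' pause probabilities and the important features into vectors that a boosted decision-tree ensemble (for example scikit-learn's GradientBoostingClassifier, named here only as one possible stand-in for the patent's iterated decision tree) would be trained on:

```python
def fusion_inputs(p_crf, p_blstm, important_feats):
    """Assemble one fusion vector per word:
    [first-classifier pause probability,
     second-classifier pause probability,
     important text-parameter features].
    These vectors are what a boosted decision-tree ensemble consumes."""
    assert len(p_crf) == len(p_blstm) == len(important_feats)
    return [[p1, p2, *feats]
            for p1, p2, feats in zip(p_crf, p_blstm, important_feats)]

# Two words with toy probabilities and two toy important features each.
rows = fusion_inputs([0.9, 0.2], [0.8, 0.4], [[1, 0], [0, 1]])
```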
The above fusion prediction module 84 takes into account the influence of the first single-classifier training module 5, the second single-classifier training module 6 and the feature importance ranking module 7 on the final result, and thereby produces the prediction of the prosodic boundary (i.e., the pause) at the current level; this result then serves as an input feature for the prediction at the next level.
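The level-by-level cascade, in which each level's boundary decisions become input features for the next level, can be sketched as follows; the predictor passed in is a toy stand-in for the fused classifier of Fig. 4:

```python
def predict_cascade(word_feats, predict_level):
    """Run the three levels in order; the boundary decisions of each level
    are appended to every word's feature dict before predicting the next
    level. `predict_level(level, feats)` stands in for the fused classifier."""
    results = {}
    for level in ("PW", "PPH", "IPH"):
        results[level] = predict_level(level, word_feats)
        word_feats = [dict(f, **{level + "_boundary": b})
                      for f, b in zip(word_feats, results[level])]
    return results

def toy_predict(level, feats):
    # Toy rule: a PW boundary after every word; each higher level keeps
    # only every other boundary of the level below it.
    if level == "PW":
        return [1] * len(feats)
    prev = ("PW" if level == "PPH" else "PPH") + "_boundary"
    return [f[prev] if i % 2 else 0 for i, f in enumerate(feats)]

out = predict_cascade([{} for _ in range(4)], toy_predict)
```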
By providing the model fusion module, the embodiment of the present invention introduces the iterated decision tree algorithm from ensemble learning to fuse the outputs of the two single classifiers with the important features produced by the feature ranking module, which substantially improves the performance of prosody prediction, so that the synthesized speech is more natural and expressive.
Therefore, by means of the text analysis module 1, the text feature parameterization module 2, the character-word vector joint training module 3, the word vector generation module 4, the first single-classifier training module 5, the second single-classifier training module 6, the feature importance ranking module 7 and the model fusion module 8, the embodiment of the present invention can predict, for any text, the three levels of the prosodic hierarchy (prosodic word, prosodic phrase and intonation phrase), so as to guide the back end of a speech synthesis system in synthesizing speech, thereby improving the naturalness and expressiveness of the synthesized speech.
It should be noted that, when the Chinese prosodic hierarchy prediction system provided by the above embodiment performs Chinese prosodic hierarchy prediction, the division into the above functional modules is given only by way of illustration. In practical applications, the above functions may be assigned to different functional modules as required; that is, the modules in the embodiment of the present invention may be further decomposed or combined. For example, modules of the above embodiment may be merged into a single module, or may be further split into multiple sub-modules, so as to perform all or part of the functions described above. The names of the modules involved in the embodiment of the present invention are used only to distinguish the modules from one another and are not intended as an improper limitation of the present invention.
As used herein, term " module " may refer to software object or the routine performed on a computing system (it can use the language such as such as C language to be achieved).Can be embodied as disparate modules described herein calculating The object performed in system or process (such as, as independent thread).While it is preferred that realize being retouched herein with software The system and method stated, but realizing also possible and be to be conceived to the combination of hardware or software and hardware 's.
The embodiment of the present invention may run on platforms such as Windows and Linux.
It will be understood by those skilled in the art that the above Chinese prosodic hierarchy prediction system may also include other known structures, such as processors, controllers and memories, where the memories include, but are not limited to, random access memory, flash memory, read-only memory, programmable read-only memory, volatile memory, non-volatile memory, serial memory, parallel memory and registers, and the processors include, but are not limited to, CPLD/FPGA, DSP, ARM processors, MIPS processors and the like. In order not to unnecessarily obscure the embodiments of the disclosure, these known structures are not shown in Fig. 1.
It should be understood that the quantity of each of the modules in Fig. 1 is merely schematic. Each module may be provided in any quantity according to actual needs.
The technical solution provided by the embodiment of the present invention has been described in detail above. Although specific examples have been applied herein to set forth the principle and implementation of the embodiment of the present invention, the description of the above embodiment is only intended to help understand that principle; those skilled in the art may, in accordance with the embodiment of the present invention, make changes within the specific implementation and the scope of application.
It should be noted that the block diagrams referred to herein are not limited to the forms shown herein; other divisions and/or combinations may also be made.
It should further be noted that the reference signs and words in the accompanying drawings are merely intended to illustrate the present invention more clearly and are not intended as an improper limitation of its scope.
It should also be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects, and are not necessarily used to describe or indicate a specific order or sequence. It should be understood that the data so used may be interchanged where appropriate, so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein.
Term " includes " or any other like term is intended to comprising of nonexcludability, so that include that one is The process of row key element, method, article or equipment/device not only include those key elements, but also include being not expressly set out Other key element, or also include the key element that these processes, method, article or equipment/device are intrinsic.
The specific embodiments described above further explain the objectives, technical solutions and beneficial effects of the present invention in detail. It should be understood that the foregoing is merely a specific embodiment of the present invention and is not intended to limit the present invention; any modification, equivalent substitution, improvement and the like made within the spirit and principle of the present invention shall be included within the scope of protection of the present invention.

Claims (9)

1. A Chinese prosodic hierarchy prediction system, characterized in that the prediction system comprises:
a text analysis module, for receiving text data to be analyzed and outputting analyzed text data;
a text feature parameterization module, connected to the text analysis module, for receiving the analyzed text data and outputting parameterized text features;
a character-word vector joint training module, connected to the text analysis module, for receiving the analyzed text data generated by the text analysis module, jointly training a language model based on character vectors and word vectors, and outputting a character-enhanced word vector representation model of the text;
a word vector generation module, for outputting, on the basis of the analyzed text data output by the text analysis module and using the character-enhanced word vector representation model, the character-enhanced word vectors of the analyzed text data;
a first single-classifier training module, connected to the text feature parameterization module, for training a first mapping model from the parameterized text features output by the text feature parameterization module to a prosodic hierarchy;
a second single-classifier training module, connected to the word vector generation module, for training a second mapping model from the character-enhanced word vectors output by the word vector generation module to the prosodic hierarchy;
a feature importance ranking module, connected to the first single-classifier training module, for outputting text parameter features having a predetermined classification performance;
a model fusion module, connected to the first single-classifier training module, the second single-classifier training module and the feature importance ranking module, for receiving the first mapping model and the second mapping model output by the first single-classifier training module and the second single-classifier training module, respectively, and the text parameter features having the predetermined classification performance output by the feature importance ranking module, and for fusing the first single-classifier training module and the second single-classifier training module at the decision level by using an ensemble learning method, so as to output the result of the prosodic hierarchy prediction.
2. The prediction system according to claim 1, characterized in that the text analysis module is specifically configured to perform regularization processing on the text data to be analyzed, correct polyphone pronunciation errors and normalize digits, and to output the analyzed text data.
3. The prediction system according to claim 1, characterized in that the analyzed text data include symbolic features and numeric features;
the text feature parameterization module is specifically configured to process the symbolic features output by the text analysis module using a one-hot representation and to retain the numeric features output by the text analysis module, so as to output the parameterized text features.
4. The prediction system according to claim 1, characterized in that the analyzed text data further include word segmentation features;
the character-word vector joint training module specifically includes:
a character position extraction module, for clustering each character according to the word segmentation features output by the text analysis module and the position at which the character occurs within a word, and extracting character position information;
a character context clustering module, for clustering, on the basis of the character position information extracted by the character position extraction module, the different contexts of each character, and representing the same character with multiple vectors;
a non-decomposable word list building module, for building a word list of non-decomposable words on the basis of the analyzed text data output by the text analysis module;
a concrete character-word vector joint training module, for outputting the character-enhanced word vector representation of the text according to the word list of non-decomposable words output by the non-decomposable word list building module and the output result of the character context clustering module.
5. The prediction system according to claim 1, characterized in that the analyzed text data further include word segmentation features;
the word vector generation module is specifically configured to, on the basis of the word segmentation features output by the text analysis module and the character-enhanced word vector representation model of the text output by the character-word vector joint training module, combine the semantic information contained in the character information within words, build a joint character-word vector model, and, by mapping through the joint character-word vector model, output the character-enhanced word vectors of the analyzed text data.
6. The prediction system according to claim 1, characterized in that the first single-classifier training module is specifically configured to build, using a conditional random field method, the first mapping model that maps the relationship between the parameterized text features and the prosodic hierarchy.
7. The prediction system according to claim 1, characterized in that the second single-classifier training module is specifically configured to build, using a bidirectional long short-term memory recurrent neural network, the second mapping model that maps the relationship between the character-enhanced word vectors and the prosodic hierarchy.
8. The prediction system according to claim 1, characterized in that the feature importance ranking module specifically includes:
a text feature set extraction module, for extracting a text feature set by enumeration on the basis of the parameterized text features;
an F-Score gain computing module, for computing, for each feature extracted by the text feature set extraction module, the gain in F-Score obtained on the validation set when that feature is used as an input of the first single-classifier training module;
a feature importance ranking output module, for sorting the F-Score gains obtained by the F-Score gain computing module, and for outputting the text parameter features having the predetermined classification performance.
9. The prediction system according to claim 8, characterized in that the model fusion module specifically includes:
a first single-classifier output module, connected to the first single-classifier training module, for determining, according to the first mapping model, the first probabilities of a pause and of no pause in the prosodic hierarchy prediction;
a second single-classifier output module, connected to the second single-classifier training module, for determining, according to the second mapping model, the second probabilities of a pause and of no pause in the prosodic hierarchy prediction;
an important feature generation module, connected to the feature importance ranking module, for outputting the important features of the prosodic hierarchy by computing the contribution of the text parameter features having the predetermined classification performance to the F-Score;
a fusion prediction module, for fusing, by means of an iterated decision tree, the first probabilities output by the first single-classifier output module, the second probabilities output by the second single-classifier output module and the important features output by the important feature generation module, so as to determine the prosodic boundary prediction result of the prosodic hierarchy.
CN201610642956.0A 2016-08-08 2016-08-08 Chinese Prosodic Hierarchy forecasting system Active CN106227721B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610642956.0A CN106227721B (en) 2016-08-08 2016-08-08 Chinese Prosodic Hierarchy forecasting system


Publications (2)

Publication Number Publication Date
CN106227721A true CN106227721A (en) 2016-12-14
CN106227721B CN106227721B (en) 2019-02-01

Family

ID=57547688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610642956.0A Active CN106227721B (en) 2016-08-08 2016-08-08 Chinese Prosodic Hierarchy forecasting system

Country Status (1)

Country Link
CN (1) CN106227721B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101452699A (en) * 2007-12-04 2009-06-10 株式会社东芝 Rhythm self-adapting and speech synthesizing method and apparatus
CN104867490A (en) * 2015-06-12 2015-08-26 百度在线网络技术(北京)有限公司 Metrical structure predicting method and metrical structure predicting device
CN104916284A (en) * 2015-06-10 2015-09-16 百度在线网络技术(北京)有限公司 Prosody and acoustics joint modeling method and device for voice synthesis system
CN105185374A (en) * 2015-09-11 2015-12-23 百度在线网络技术(北京)有限公司 Prosodic hierarchy annotation method and device
CN105244020A (en) * 2015-09-24 2016-01-13 百度在线网络技术(北京)有限公司 Prosodic hierarchy model training method, text-to-speech method and text-to-speech device
CN105654939A (en) * 2016-01-04 2016-06-08 北京时代瑞朗科技有限公司 Voice synthesis method based on voice vector textual characteristics


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HONGHUI DONG ET AL: "Prosodic word prediction using the lexical information", 《NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, 2005. IEEE NLP-KE '05》 *
丁星光 ET AL: "Prosodic structure prediction based on deep learning (基于深度学习的韵律结构预测)", 《NCMMSC2015》 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682217A (en) * 2016-12-31 2017-05-17 成都数联铭品科技有限公司 Method for enterprise second-grade industry classification based on automatic screening and learning of information
CN106652995A (en) * 2016-12-31 2017-05-10 深圳市优必选科技有限公司 Voice broadcasting method and system for text
WO2018121757A1 (en) * 2016-12-31 2018-07-05 深圳市优必选科技有限公司 Method and system for speech broadcast of text
CN108628868A (en) * 2017-03-16 2018-10-09 北京京东尚科信息技术有限公司 File classification method and device
CN108628868B (en) * 2017-03-16 2021-08-10 北京京东尚科信息技术有限公司 Text classification method and device
CN107423284A (en) * 2017-06-14 2017-12-01 中国科学院自动化研究所 Merge the construction method and system of the sentence expression of Chinese language words internal structural information
CN107423284B (en) * 2017-06-14 2020-03-06 中国科学院自动化研究所 Method and system for constructing sentence representation fusing internal structure information of Chinese words
CN107451115A (en) * 2017-07-11 2017-12-08 中国科学院自动化研究所 The construction method and system of Chinese Prosodic Hierarchy forecast model end to end
CN107451115B (en) * 2017-07-11 2020-03-06 中国科学院自动化研究所 Method and system for constructing end-to-end Chinese prosody hierarchical structure prediction model
CN107995428A (en) * 2017-12-21 2018-05-04 广东欧珀移动通信有限公司 Image processing method, device and storage medium and mobile terminal
CN108595416A (en) * 2018-03-27 2018-09-28 义语智能科技(上海)有限公司 Character string processing method and equipment
CN108549850B (en) * 2018-03-27 2021-07-16 联想(北京)有限公司 Image identification method and electronic equipment
CN108549850A (en) * 2018-03-27 2018-09-18 联想(北京)有限公司 A kind of image-recognizing method and electronic equipment
CN108595590A (en) * 2018-04-19 2018-09-28 中国科学院电子学研究所苏州研究院 A kind of Chinese Text Categorization based on fusion attention model
CN108763487A (en) * 2018-05-30 2018-11-06 华南理工大学 A kind of word representation method of fusion part of speech and sentence information based on Mean Shift
CN110427608A (en) * 2019-06-24 2019-11-08 浙江大学 A kind of Chinese word vector table dendrography learning method introducing layering ideophone feature
CN111178046A (en) * 2019-12-16 2020-05-19 山东众阳健康科技集团有限公司 Word vector training method based on sorting
CN111226275A (en) * 2019-12-31 2020-06-02 深圳市优必选科技股份有限公司 Voice synthesis method, device, terminal and medium based on rhythm characteristic prediction
CN111738360A (en) * 2020-07-24 2020-10-02 支付宝(杭州)信息技术有限公司 Two-party decision tree training method and system

Also Published As

Publication number Publication date
CN106227721B (en) 2019-02-01

Similar Documents

Publication Publication Date Title
CN106227721A (en) Chinese Prosodic Hierarchy prognoses system
CN105244020B (en) Prosodic hierarchy model training method, text-to-speech method and text-to-speech device
CN107577662A (en) Towards the semantic understanding system and method for Chinese text
CN107818164A (en) A kind of intelligent answer method and its system
CN109523989A (en) Phoneme synthesizing method, speech synthetic device, storage medium and electronic equipment
CN108763539B (en) Text classification method and system based on part-of-speech classification
CN110851601A (en) Cross-domain emotion classification system and method based on layered attention mechanism
CN107180084A (en) Word library updating method and device
CN108563638A (en) A kind of microblog emotional analysis method based on topic identification and integrated study
CN106683667A (en) Automatic rhythm extracting method, system and application thereof in natural language processing
US11727915B1 (en) Method and terminal for generating simulated voice of virtual teacher
JP2018005690A (en) Information processing apparatus and program
CN110472040A (en) Extracting method and device, storage medium, the computer equipment of evaluation information
JP2022128441A (en) Augmenting textual data for sentence classification using weakly-supervised multi-reward reinforcement learning
Raj et al. Text processing for text-to-speech systems in Indian languages.
Kumar et al. Morphological analyzer for agglutinative languages using machine learning approaches
Dongmei Design of English text-to-speech conversion algorithm based on machine learning
CN116205211A (en) Document level resume analysis method based on large-scale pre-training generation model
US20080120108A1 (en) Multi-space distribution for pattern recognition based on mixed continuous and discrete observations
CN110275953A (en) Personality classification method and device
CN110457470A (en) A kind of textual classification model learning method and device
Alías et al. Towards high-quality next-generation text-to-speech synthesis: A multidomain approach by automatic domain classification
CN109117471A (en) A kind of calculation method and terminal of the word degree of correlation
CN112287667A (en) Text generation method and equipment
CN105895076A (en) Speech synthesis method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant