CN110413987A - Punctuation mark prediction technique and relevant device based on multiple prediction models - Google Patents
- Publication number
- CN110413987A
- Application number
- CN201910515571.1A
- Authority
- CN
- China
- Prior art keywords
- prediction
- punctuation mark
- prediction model
- model
- term
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
This application relates to the field of artificial intelligence and discloses a punctuation prediction method and related device based on multiple prediction models. The method includes: constructing three punctuation prediction models, inputting the text to be predicted into each of the three models, analyzing the prediction results of the three models, and producing the prediction output based on the analysis. By performing punctuation prediction with a bidirectional recurrent neural network and combining it with keyword-matching and speech-pause prediction models, the application can effectively improve the accuracy of punctuation prediction.
Description
Technical field
This application relates to the field of artificial intelligence, and in particular to a punctuation prediction method and related device based on multiple prediction models.
Background
Punctuation prediction belongs to the post-processing stage of speech recognition: after speech has been converted to text, the converted text must be post-processed to improve the user experience of speech recognition products. Post-processing mainly includes spoken-language smoothing, punctuation prediction, and inverse text normalization. If the initial transcript is a single unbroken sentence with no punctuation, the user experience is poor. After punctuation prediction, punctuated sentences can be output, which greatly improves the user experience.
Current punctuation prediction relies entirely on keyword matching and the length of speech pauses. It does not use contextual information, so its prediction error rate is high.
Summary of the invention
In view of the deficiencies of the prior art, the purpose of this application is to provide a punctuation prediction method and related device based on multiple prediction models. By performing punctuation prediction with a bidirectional recurrent neural network and combining it with keyword-matching and speech-pause prediction models, the accuracy of punctuation prediction can be effectively improved.
To achieve the above purpose, the technical solution of this application provides a punctuation prediction method and related device based on multiple prediction models.
This application discloses a punctuation prediction method based on multiple prediction models, comprising the following steps:
Constructing a first punctuation prediction model, a second punctuation prediction model, and a third punctuation prediction model, and pre-configuring a prediction probability weight for each of the three models;
Obtaining the text to be predicted, inputting it into the first punctuation prediction model, and obtaining the prediction result of the first punctuation prediction model;
Inputting the text to be predicted into the second punctuation prediction model and obtaining the prediction result of the second punctuation prediction model;
Inputting the text to be predicted into the third punctuation prediction model and obtaining the prediction result of the third punctuation prediction model;
Predicting the punctuation of the text to be predicted according to the prediction results and the prediction probability weights of the first, second, and third punctuation prediction models.
Preferably, constructing the first punctuation prediction model, the second punctuation prediction model, and the third punctuation prediction model comprises:
Constructing a bidirectional recurrent neural network punctuation prediction model, a speech-pause punctuation prediction model, and a keyword-matching punctuation prediction model, and creating an attention layer in the bidirectional recurrent neural network punctuation prediction model, where the attention mechanism of the attention layer satisfies the formula:
$$\alpha_{ij} = \mathrm{softmax}\left(v_a^{T}\tanh\left(W_a s_{i-1} + U_a h_j\right)\right)$$
where $v_a^{T}$ is the attention matrix, $W_a$ is the activation matrix of the hidden layer at the previous moment, $s_{i-1}$ is the activation output of the hidden layer at the previous moment, $U_a$ is the output matrix of the hidden layer at the current moment, $h_j$ is the output of the hidden layer at the current moment, $\alpha_{ij}$ is the activation output of the attention layer, and $c_i$ is the output of the output layer after the attention mechanism.
Preferably, inputting the text to be predicted into the first punctuation prediction model and obtaining the prediction result of the first punctuation prediction model comprises:
Inputting the text to be predicted into the bidirectional recurrent neural network punctuation prediction model, and obtaining the model's prediction items and the prediction probability corresponding to each prediction item;
Storing the prediction items of the bidirectional recurrent neural network punctuation prediction model and their corresponding prediction probabilities in a temporary cache.
Preferably, inputting the text to be predicted into the second punctuation prediction model and obtaining the prediction result of the second punctuation prediction model comprises:
Inputting the text to be predicted into the speech-pause punctuation prediction model, and obtaining the model's prediction items and the prediction probability corresponding to each prediction item;
Storing the prediction items of the speech-pause punctuation prediction model and their corresponding prediction probabilities in the temporary cache.
Preferably, inputting the text to be predicted into the third punctuation prediction model and obtaining the prediction result of the third punctuation prediction model comprises:
Inputting the text to be predicted into the keyword-matching punctuation prediction model, and obtaining the model's prediction items and the prediction probability corresponding to each prediction item;
Storing the prediction items of the keyword-matching punctuation prediction model and their corresponding prediction probabilities in the temporary cache.
Preferably, predicting the punctuation of the text to be predicted according to the prediction results and the prediction probability weights of the first, second, and third punctuation prediction models comprises:
Querying the temporary cache for the prediction items that are common to the first, second, and third punctuation prediction models;
Predicting the punctuation of the text to be predicted according to the prediction probabilities that the three models assign to these common prediction items in the temporary cache and according to the prediction probability weights.
Preferably, predicting the punctuation of the text to be predicted according to the prediction probabilities of the common prediction items in the temporary cache and the prediction probability weights comprises:
Calculating, for each prediction model, a partial prediction probability from the prediction probability that the model assigns to each common prediction item in the temporary cache and from the model's prediction probability weight;
Accumulating the partial prediction probabilities of each prediction item across the prediction models to obtain the total prediction probability of that prediction item;
Finding, among the common prediction items in the temporary cache, the prediction item with the highest total prediction probability, and outputting that prediction item as the punctuation prediction for the text to be predicted.
This application also discloses a punctuation prediction device based on multiple prediction models, the device comprising:
A model construction module, configured to construct a first punctuation prediction model, a second punctuation prediction model, and a third punctuation prediction model, and to pre-configure a prediction probability weight for each of the three models;
A first prediction module, configured to obtain the text to be predicted, input it into the first punctuation prediction model, and obtain the prediction result of the first punctuation prediction model;
A second prediction module, configured to input the text to be predicted into the second punctuation prediction model and obtain the prediction result of the second punctuation prediction model;
A third prediction module, configured to input the text to be predicted into the third punctuation prediction model and obtain the prediction result of the third punctuation prediction model;
A prediction output module, configured to predict the punctuation of the text to be predicted according to the prediction results and the prediction probability weights of the first, second, and third punctuation prediction models.
This application also discloses a computer device comprising a memory and a processor. The memory stores computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation prediction method described above.
This application also discloses a storage medium that can be read and written by a processor. The storage medium stores computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation prediction method described above.
The beneficial effect of this application is that, by performing punctuation prediction with a bidirectional recurrent neural network and combining it with keyword-matching and speech-pause prediction models, the accuracy of punctuation prediction can be effectively improved.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a punctuation prediction method based on multiple prediction models according to a first embodiment of this application;
Fig. 2 is a schematic flowchart of a punctuation prediction method based on multiple prediction models according to a second embodiment of this application;
Fig. 3 is a schematic flowchart of a punctuation prediction method based on multiple prediction models according to a third embodiment of this application;
Fig. 4 is a schematic flowchart of a punctuation prediction method based on multiple prediction models according to a fourth embodiment of this application;
Fig. 5 is a schematic flowchart of a punctuation prediction method based on multiple prediction models according to a fifth embodiment of this application;
Fig. 6 is a schematic flowchart of a punctuation prediction method based on multiple prediction models according to a sixth embodiment of this application;
Fig. 7 is a schematic structural diagram of a punctuation prediction device based on multiple prediction models according to an embodiment of this application.
Detailed description
To make the objects, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application and do not limit it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used herein may also include the plural. It should further be understood that the word "comprising" used in the description of this application indicates the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The flow of a punctuation prediction method based on multiple prediction models according to one embodiment of this application is shown in Fig. 1. This embodiment comprises the following steps:
Step s101: construct a first punctuation prediction model, a second punctuation prediction model, and a third punctuation prediction model, and pre-configure a prediction probability weight for each of the three models;
Specifically, three punctuation prediction models can first be constructed in advance, each of which predicts the punctuation of the input text in a different way. For example, the bidirectional recurrent neural network prediction model predicts on the basis of a neural network and can serve as the first punctuation prediction model; the speech-pause prediction model predicts punctuation from the duration of speech pauses and can serve as the second punctuation prediction model; the keyword-matching prediction model predicts punctuation from preset keywords and can serve as the third punctuation prediction model.
Specifically, after the text to be predicted has been input into the first, second, and third punctuation prediction models, each model yields its own prediction result. A prediction probability weight can therefore be configured in advance for each of the three models. The final prediction result for the text to be predicted is the combination of the three models' prediction results, weighted by their prediction probability weights.
Step s102: obtain the text to be predicted, input it into the first punctuation prediction model, and obtain the prediction result of the first punctuation prediction model;
Specifically, the text to be predicted is obtained first. It is a run of characters and words without any punctuation, for example "I am Chinese I live in Beijing". The purpose of punctuation prediction is to find the positions in this run where punctuation belongs and to add a suitable punctuation mark at each position. Taking "I am Chinese I live in Beijing" as an example, the possible outputs include:
"I am Chinese, I live in Beijing."
"I am, Chinese I live in Beijing"
"I am Chinese! I live in Beijing,"
"I am Chinese? I live in Beijing"
"I am China, people I, live in Beijing".
Each of the above cases contains a prediction item and the probability corresponding to that prediction item. The final output can be the prediction item with the highest probability, or all prediction items together with their corresponding probabilities. The text to be predicted can be a passage entered manually or a passage produced by speech recognition.
Specifically, after the text to be predicted has been input into the first punctuation prediction model, the prediction result of the first punctuation prediction model is obtained.
Step s103: input the text to be predicted into the second punctuation prediction model and obtain the prediction result of the second punctuation prediction model;
Specifically, as in step s102, after the text to be predicted has been input into the second punctuation prediction model, the prediction result of the second punctuation prediction model is obtained. This prediction result contains prediction items and the probability corresponding to each prediction item.
Step s104: input the text to be predicted into the third punctuation prediction model and obtain the prediction result of the third punctuation prediction model;
Specifically, as in step s102, after the text to be predicted has been input into the third punctuation prediction model, the prediction result of the third punctuation prediction model is obtained. This prediction result contains prediction items and the probability corresponding to each prediction item.
Step s105: predict the punctuation of the text to be predicted according to the prediction results and the prediction probability weights of the first, second, and third punctuation prediction models.
Specifically, once the prediction results of the first, second, and third punctuation prediction models have been obtained, and because each prediction result contains prediction items and their corresponding probabilities, the weighted probability sum of each prediction item can be calculated from the prediction probability weights of the three models. The weighted sums of the prediction items are then compared, and the prediction item with the largest weighted sum is the final punctuation prediction for the text to be predicted.
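The weighted combination in step s105 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the model names, the example probabilities, and the weights are all hypothetical, and a prediction item that a model does not output is assumed to contribute nothing from that model.

```python
from collections import defaultdict


def combine_predictions(results, weights):
    """Return the prediction item with the largest weighted probability sum.

    results: model name -> {prediction item: probability}
    weights: model name -> pre-configured prediction probability weight
    """
    totals = defaultdict(float)
    for model, items in results.items():
        for item, prob in items.items():
            # Each model contributes its probability scaled by its weight.
            totals[item] += weights[model] * prob
    best = max(totals, key=totals.get)
    return best, dict(totals)


# Hypothetical outputs of the three models for two candidate punctuations.
results = {
    "birnn": {"A": 0.7, "B": 0.3},
    "pause": {"A": 0.5, "B": 0.5},
    "keyword": {"A": 0.3, "B": 0.7},
}
# Hypothetical pre-configured weights (assumed here to sum to 1).
weights = {"birnn": 0.5, "pause": 0.25, "keyword": 0.25}

best, totals = combine_predictions(results, weights)
```

With these example numbers, item A receives 0.5·0.7 + 0.25·0.5 + 0.25·0.3 and item B receives 0.5·0.3 + 0.25·0.5 + 0.25·0.7, so A is selected.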
In this embodiment, performing punctuation prediction with a bidirectional recurrent neural network and combining it with keyword-matching and speech-pause prediction models can effectively improve the accuracy of punctuation prediction.
In one embodiment, step s101, constructing the first punctuation prediction model, the second punctuation prediction model, and the third punctuation prediction model, comprises:
Constructing a bidirectional recurrent neural network punctuation prediction model, a speech-pause punctuation prediction model, and a keyword-matching punctuation prediction model, and creating an attention layer in the bidirectional recurrent neural network punctuation prediction model, where the attention mechanism of the attention layer satisfies the formula:
$$\alpha_{ij} = \mathrm{softmax}\left(v_a^{T}\tanh\left(W_a s_{i-1} + U_a h_j\right)\right)$$
where $v_a^{T}$ is the attention matrix, $W_a$ is the activation matrix of the hidden layer at the previous moment, $s_{i-1}$ is the activation output of the hidden layer at the previous moment, $U_a$ is the output matrix of the hidden layer at the current moment, $h_j$ is the output of the hidden layer at the current moment, $\alpha_{ij}$ is the activation output of the attention layer, and $c_i$ is the output of the output layer after the attention mechanism.
Specifically, the first punctuation prediction model can be built as a bidirectional recurrent neural network punctuation prediction model. In an ordinary recurrent neural network, state is propagated only from front to back, whereas a bidirectional recurrent neural network propagates state both from front to back and from back to front. In other words, the text or punctuation output at a given moment depends not only on the outputs before it but also on the outputs after it. In ordinary neural networks such as a DNN (deep neural network) or a CNN (convolutional neural network), the hidden-layer nodes are not connected to one another, so information from earlier in the sequence cannot be used to process the current element. The hidden-layer nodes of an RNN (recurrent neural network), by contrast, are connected: the network remembers earlier information and applies it to the current output. The second punctuation prediction model can be built as a speech-pause punctuation prediction model, and the third punctuation prediction model can be built as a keyword-matching punctuation prediction model.
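The difference between one-directional and bidirectional state propagation can be illustrated with a toy example. The sketch below uses a scalar recurrent unit with made-up weights; it is only meant to show that each position receives one state computed from the preceding inputs and one computed from the following inputs, and it does not reproduce the network described in this application.

```python
import math


def rnn_pass(xs, w_in, w_rec):
    """Toy scalar recurrent pass: h_t = tanh(w_in * x_t + w_rec * h_{t-1})."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def bidirectional_states(xs, fwd=(0.5, 0.8), bwd=(0.5, 0.8)):
    """Pair each position's forward state with its backward state.

    fwd and bwd are hypothetical (input weight, recurrent weight) pairs.
    """
    forward = rnn_pass(xs, *fwd)
    # Run the same recurrence over the reversed sequence, then restore order.
    backward = rnn_pass(xs[::-1], *bwd)[::-1]
    return list(zip(forward, backward))


# Only the first input is non-zero.
states = bidirectional_states([1.0, 0.0, 0.0])
```

At the last position the forward state is still non-zero because it remembers the first input, while the backward state is zero because nothing follows that position.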
Specifically, an attention layer can also be created in the bidirectional recurrent neural network punctuation prediction model. An RNN (recurrent neural network) without an attention mechanism gives significant weight only to the few words immediately before the current word, and that weight decays as the distance from the word grows. The attention mechanism addresses this problem well. It can be realized by adding an attention layer to the bidirectional recurrent neural network prediction model, where the attention mechanism of the attention layer satisfies the formula:
$$\alpha_{ij} = \mathrm{softmax}\left(v_a^{T}\tanh\left(W_a s_{i-1} + U_a h_j\right)\right)$$
where $v_a^{T}$ is the attention matrix, $W_a$ is the activation matrix of the hidden layer at the previous moment, $s_{i-1}$ is the activation output of the hidden layer at the previous moment, $U_a$ is the output matrix of the hidden layer at the current moment, $h_j$ is the output of the hidden layer at the current moment, $\alpha_{ij}$ is the activation output of the attention layer, and $c_i$ is the output of the output layer after the attention mechanism.
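As a rough illustration of the attention weights, the following pure-Python sketch evaluates the softmax over the scores v_a^T tanh(W_a s_{i-1} + U_a h_j) for a handful of hidden outputs. The matrices and vectors are arbitrary example values. The expression for c_i is not given in the text above; the sketch assumes the usual additive-attention form, in which c_i is the sum of the h_j weighted by α_ij.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def attention(s_prev, hidden_outputs, W_a, U_a, v_a):
    """Return (alpha, context) for one output step i."""
    ws = matvec(W_a, s_prev)
    scores = []
    for h_j in hidden_outputs:
        uh = matvec(U_a, h_j)
        # Score for position j: v_a^T tanh(W_a s_{i-1} + U_a h_j)
        scores.append(sum(v * math.tanh(a + b) for v, a, b in zip(v_a, ws, uh)))
    # Numerically stable softmax over all positions j.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Assumed form of c_i: attention-weighted sum of the hidden outputs.
    dim = len(hidden_outputs[0])
    context = [sum(a * h[k] for a, h in zip(alpha, hidden_outputs)) for k in range(dim)]
    return alpha, context


# Arbitrary example values (hypothetical, for illustration only).
s_prev = [0.1, -0.2]
hidden_outputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
W_a = [[0.3, 0.1], [-0.2, 0.4]]
U_a = [[0.5, -0.5], [0.25, 0.75]]
v_a = [1.0, -1.0]

alpha, context = attention(s_prev, hidden_outputs, W_a, U_a, v_a)
```

The weights in `alpha` are positive and sum to 1, so `context` is a convex combination of the hidden outputs.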
In this embodiment, creating an attention layer in the first punctuation prediction model improves the precision of punctuation prediction.
Fig. 2 is a schematic flowchart of a punctuation prediction method based on multiple prediction models according to a second embodiment of this application. As shown in the figure, step s102, inputting the text to be predicted into the first punctuation prediction model and obtaining the prediction result of the first punctuation prediction model, comprises:
Step s201: input the text to be predicted into the bidirectional recurrent neural network punctuation prediction model, and obtain the model's prediction items and the prediction probability corresponding to each prediction item;
Specifically, the text to be predicted can be input into the bidirectional recurrent neural network punctuation prediction model; after the model's computation, the prediction items and their corresponding prediction probabilities are output.
Step s202: store the prediction items of the bidirectional recurrent neural network punctuation prediction model and their corresponding prediction probabilities in a temporary cache.
Specifically, once the prediction items of the bidirectional recurrent neural network punctuation prediction model and their corresponding prediction probabilities have been obtained, they can be stored in the temporary cache. The prediction items and corresponding prediction probabilities of the second and third punctuation prediction models are then obtained.
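One simple way to hold each model's output until all three are available is an in-memory mapping from model name to prediction items. The class below is a hypothetical sketch of such a temporary cache; the class name, method names, and example values are assumptions, not part of this application.

```python
class TemporaryCache:
    """In-memory store mapping a model name to {prediction item: probability}."""

    def __init__(self):
        self._results = {}

    def store(self, model_name, predictions):
        # Copy so later changes by the caller do not alter the cached result.
        self._results[model_name] = dict(predictions)

    def load(self, model_name):
        # Return a copy; an unknown model yields an empty result.
        return dict(self._results.get(model_name, {}))

    def models(self):
        return list(self._results)


cache = TemporaryCache()
cache.store("birnn", {"A": 0.7, "B": 0.3})  # hypothetical first-model output
cache.store("pause", {"A": 0.5, "B": 0.5})  # hypothetical second-model output
```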
In this embodiment, storing the prediction results of the punctuation prediction models allows the results to be compared effectively, which improves the precision of prediction.
Fig. 3 is a schematic flowchart of a punctuation prediction method based on multiple prediction models according to a third embodiment of this application. As shown in the figure, step s103, inputting the text to be predicted into the second punctuation prediction model and obtaining the prediction result of the second punctuation prediction model, comprises:
Step s301: input the text to be predicted into the speech-pause punctuation prediction model, and obtain the model's prediction items and the prediction probability corresponding to each prediction item;
Specifically, the text to be predicted can be input into the speech-pause punctuation prediction model, which predicts from the length of speech pauses. For example, a speech pause longer than 1 s is treated as a long pause, and a pause longer than 300 ms but shorter than 1 s is treated as a short pause. A long pause outputs sentence-ending punctuation such as a full stop, question mark, or exclamation mark; a short pause outputs a comma, enumeration comma, or colon. After the computation of the speech-pause punctuation prediction model, the prediction items and their corresponding prediction probabilities are output.
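The pause thresholds in this example can be expressed as a small lookup. The sketch below is a simplified illustration: the function and constant names are hypothetical, a pause of exactly 1 s (which the text leaves unspecified) is treated as a short pause, and the assignment of probabilities to the candidates is left out.

```python
LONG_PAUSE_SECONDS = 1.0    # pauses longer than this are long pauses
SHORT_PAUSE_SECONDS = 0.3   # pauses longer than this (up to 1 s) are short pauses


def candidate_punctuation(pause_seconds):
    """Map a pause duration to the class of punctuation it suggests."""
    if pause_seconds > LONG_PAUSE_SECONDS:
        # Long pause: sentence-ending punctuation.
        return ["full stop", "question mark", "exclamation mark"]
    if pause_seconds > SHORT_PAUSE_SECONDS:
        # Short pause: punctuation inside a sentence.
        return ["comma", "enumeration comma", "colon"]
    # Shorter pauses suggest no punctuation in this sketch.
    return []
```

For instance, a 1.2 s pause yields the sentence-ending candidates, a 0.5 s pause yields the in-sentence candidates, and a 0.1 s pause yields none.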
Step s302: store the prediction items of the speech-pause punctuation prediction model and their corresponding prediction probabilities in the temporary cache.
Specifically, once the prediction items of the speech-pause punctuation prediction model and their corresponding prediction probabilities have been obtained, they can be stored in the temporary cache. The prediction items and corresponding prediction probabilities of the third punctuation prediction model are then obtained.
In this embodiment, storing the prediction results of the punctuation prediction models allows the results to be compared effectively, which improves the precision of prediction.
Fig. 4 is a schematic flowchart of a punctuation prediction method based on multiple prediction models according to a fourth embodiment of this application. As shown in the figure, step s104, inputting the text to be predicted into the third punctuation prediction model and obtaining the prediction result of the third punctuation prediction model, comprises:
Step s401: input the text to be predicted into the keyword-matching punctuation prediction model, and obtain the model's prediction items and the prediction probability corresponding to each prediction item;
Specifically, the text to be predicted can be input into the keyword-matching punctuation prediction model, which predicts punctuation from keyword frequency statistics: word frequencies are counted over the punctuated sentences of the training text, the statistical probability of a word being accompanied by a punctuation mark is calculated, and this statistical probability is supplied during prediction; for example, " " is added after " ", and "!" is added after " ". After the computation of the keyword-matching punctuation prediction model, the prediction items and their corresponding prediction probabilities are output.
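The frequency statistics behind keyword matching can be sketched as follows. This is an illustrative approximation under several assumptions: the training text is whitespace-tokenized with punctuation marks as separate tokens, the punctuation set and the example sentences are made up, and the probability of a mark following a word is estimated as the number of times the mark follows the word divided by the number of times the word occurs.

```python
from collections import Counter, defaultdict

PUNCTUATION = {",", ".", "?", "!", ";", ":"}  # assumed punctuation set


def train_keyword_model(sentences):
    """Estimate P(mark follows word) from punctuated, tokenized sentences."""
    word_counts = Counter()
    follow_counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token in PUNCTUATION:
                continue
            word_counts[token] += 1
            # Record the punctuation mark, if any, directly after this word.
            if i + 1 < len(tokens) and tokens[i + 1] in PUNCTUATION:
                follow_counts[token][tokens[i + 1]] += 1
    return {
        word: {mark: n / word_counts[word] for mark, n in marks.items()}
        for word, marks in follow_counts.items()
    }


# Hypothetical training sentences.
model = train_keyword_model([
    "really ?",
    "is that so , really ?",
    "really good !",
])
```

Here "really" occurs three times and is followed by "?" twice, so the model assigns "?" a probability of 2/3 after "really".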
Step s402: store the prediction items of the keyword-matching punctuation prediction model and their corresponding prediction probabilities in the temporary cache.
Specifically, once the prediction items of the keyword-matching punctuation prediction model and their corresponding prediction probabilities have been obtained, they can be stored in the temporary cache. They can then be analyzed and compared with the previously stored prediction items and corresponding prediction probabilities of the first and second punctuation prediction models.
In this embodiment, storing the prediction results of the punctuation prediction models allows the results to be compared effectively, which improves the precision of prediction.
Fig. 5 is a flow diagram of a punctuation mark prediction method based on multiple prediction models according to a fifth embodiment of the application. As shown, step s105, predicting the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model, comprises:
Step s501: query the temporary cache for the prediction terms that are identical across the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model;
Specifically, once the prediction terms of the first, second and third punctuation mark prediction models and their corresponding prediction probabilities have been obtained, the temporary cache can be queried for the prediction terms that all three models have in common. For example, if the first punctuation mark prediction model outputs the prediction terms A, B and C, the second outputs A and B, and the third outputs A and C, then the prediction term common to all three models is A.
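The query for common prediction terms amounts to a set intersection. The sketch below reproduces the A, B, C example from the text; the function name is an assumption introduced for illustration.

```python
def common_prediction_terms(*term_lists):
    """Return the prediction terms that every model produced."""
    if not term_lists:
        return set()
    return set.intersection(*(set(terms) for terms in term_lists))


first_terms = ["A", "B", "C"]   # first punctuation mark prediction model
second_terms = ["A", "B"]       # second punctuation mark prediction model
third_terms = ["A", "C"]        # third punctuation mark prediction model

common = common_prediction_terms(first_terms, second_terms, third_terms)
```

Here `common` contains only A, matching the example in the text.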
Step s502: predict the punctuation marks of the text to be predicted according to the prediction probabilities corresponding to the prediction terms common to the first, second and third punctuation mark prediction models in the temporary cache, and the prediction probability weights.
Specifically, once the prediction terms common to the first, second and third punctuation mark prediction models have been obtained, the temporary cache can be queried by each common prediction term to obtain its corresponding prediction probabilities, one from each of the three punctuation mark prediction models. The weighted sum of the prediction probabilities for that common prediction term is then calculated from these prediction probabilities and the prediction probability weight of each punctuation mark prediction model, and the punctuation marks of the text to be predicted are predicted according to this weighted sum.
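The weighted sum described in this step can be written compactly. The symbols are notation introduced here for illustration and do not appear in the source: $T_k$ is the set of prediction terms output by model $k$, $p_k(t)$ is the prediction probability that model $k$ assigns to term $t$, and $w_k$ is the preconfigured prediction probability weight of model $k$.

```latex
S(t) = \sum_{k=1}^{3} w_k \, p_k(t), \qquad t \in T_1 \cap T_2 \cap T_3
```

The sum is taken only over terms that all three models output, so every $p_k(t)$ in it is defined.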
In this embodiment, analyzing and comparing the prediction results of the three punctuation mark prediction models allows prediction accuracy to be improved on the basis of the analysis result.
Fig. 6 is a flow diagram of a punctuation mark prediction method based on multiple prediction models according to a sixth embodiment of the application. As shown, step s502, predicting the punctuation marks of the text to be predicted according to the prediction probabilities corresponding to the prediction terms common to the first, second and third punctuation mark prediction models in the temporary cache, and the prediction probability weights, comprises:
Step s601: according to the prediction probabilities corresponding to the prediction terms common to the first, second and third punctuation mark prediction models in the temporary cache, and the prediction probability weights, calculate the partial prediction probability of each prediction model separately;
Specifically, once the prediction terms common to the first, second and third punctuation mark prediction models and the prediction probabilities corresponding to those common prediction terms have been obtained, the partial prediction probability of each prediction model is calculated from the prediction probability weights of the first, second and third punctuation mark prediction models. For example, suppose the common prediction term is A. In the first punctuation mark prediction model the prediction probability of A is 90% and the probability weight is 0.5; in the second punctuation mark prediction model the prediction probability of A is 70% and the probability weight is 0.25; in the third punctuation mark prediction model the prediction probability of A is 80% and the probability weight is 0.25. The partial prediction probability of A is then 0.45 (0.9 × 0.5) in the first punctuation mark prediction model, 0.175 (0.7 × 0.25) in the second, and 0.2 (0.8 × 0.25) in the third.
Step s602: add up the partial prediction probabilities of the prediction models to obtain the total prediction probability of the prediction term;
Specifically, once the partial prediction probabilities of the first, second and third punctuation mark prediction models have been obtained, they can be added together to obtain the total prediction probability of the common prediction term.
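Steps s601 and s602 can be checked against the numbers in the example for prediction term A. The sketch below uses the probabilities and weights given in the text; the function name and the dictionary keys are assumptions made for illustration.

```python
def partial_probabilities(probabilities, weights):
    """Step s601: multiply each model's probability by that model's weight."""
    return {name: probabilities[name] * weights[name] for name in probabilities}


# Values for prediction term A, taken from the example in the text.
probs_for_A = {"first": 0.90, "second": 0.70, "third": 0.80}
weights = {"first": 0.50, "second": 0.25, "third": 0.25}

partials = partial_probabilities(probs_for_A, weights)

# Step s602: accumulate the partial probabilities into the total probability.
total_for_A = sum(partials.values())
```

The partial probabilities are 0.45, 0.175 and 0.2, so the total prediction probability of A is about 0.825 (up to floating-point rounding).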
Step s603: among the prediction terms common to the first, second and third punctuation mark prediction models in the temporary cache, find the prediction term with the highest total prediction probability, and output that prediction term as the punctuation mark prediction for the text to be predicted.
Specifically, the total prediction probability of every common prediction term can be calculated, the highest total prediction probability can then be selected from among them, and the prediction term corresponding to that highest total prediction probability can be looked up. That prediction term is the punctuation mark prediction output for the text to be predicted.
In this embodiment, weighting and summing the prediction results of the three punctuation mark prediction models and then analyzing and comparing them improves prediction accuracy.
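When more than one prediction term is common to all three models, step s603 picks the term with the largest total. The following sketch shows that selection under stated assumptions: the cache layout, the model names and all probability values are hypothetical, and returning `None` when no common term exists is a choice made here, since the source does not describe that case.

```python
def select_punctuation(cache, weights):
    """Pick the common prediction term with the highest weighted total.

    cache:   {model name: {prediction term: prediction probability}}
    weights: {model name: prediction probability weight}
    """
    if not cache:
        return None
    common = set.intersection(*(set(preds) for preds in cache.values()))
    if not common:
        return None  # assumption: the source does not cover this case
    totals = {
        term: sum(weights[name] * cache[name][term] for name in cache)
        for term in common
    }
    best = max(totals, key=totals.get)
    return best, totals[best]


# Hypothetical cached results for one position in the text.
cache = {
    "first": {",": 0.6, ".": 0.3, "?": 0.1},
    "second": {",": 0.5, ".": 0.5},
    "third": {",": 0.4, ".": 0.6},
}
weights = {"first": 0.50, "second": 0.25, "third": 0.25}

result = select_punctuation(cache, weights)
```

With these values the comma totals about 0.525 and the full stop about 0.425, so the comma is output; "?" is ignored because only the first model proposed it.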
The structure of a punctuation mark prediction device based on multiple prediction models according to an embodiment of the application is shown in Fig. 7. The device comprises:
a model construction module 701, a first prediction module 702, a second prediction module 703, a third prediction module 704 and a prediction output module 705. The model construction module 701 is connected to the first prediction module 702, the first prediction module 702 is connected to the second prediction module 703, the second prediction module 703 is connected to the third prediction module 704, and the third prediction module 704 is connected to the prediction output module 705. The model construction module 701 is configured to construct the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model, and to preconfigure a prediction probability weight for each of them. The first prediction module 702 is configured to obtain the text to be predicted, input it into the first punctuation mark prediction model, and obtain the prediction result of the first punctuation mark prediction model. The second prediction module 703 is configured to input the text to be predicted into the second punctuation mark prediction model and obtain the prediction result of the second punctuation mark prediction model. The third prediction module 704 is configured to input the text to be predicted into the third punctuation mark prediction model and obtain the prediction result of the third punctuation mark prediction model. The prediction output module 705 is configured to predict the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first, second and third punctuation mark prediction models.
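The chained module layout of Fig. 7 might be wired together as in the sketch below. Everything here is hypothetical scaffolding: the class and function names are invented, the three models are stubs that return fixed values, and the output module applies the weighted-sum selection from the method embodiments.

```python
class PredictionModule:
    """Stands in for prediction modules 702 to 704: runs one model."""

    def __init__(self, name, model):
        self.name = name
        self.model = model  # any callable: text -> {term: probability}

    def run(self, text, cache):
        cache[self.name] = self.model(text)


class PredictionOutputModule:
    """Stands in for prediction output module 705."""

    def __init__(self, weights):
        self.weights = weights

    def run(self, cache):
        if not cache:
            return None
        common = set.intersection(*(set(p) for p in cache.values()))
        totals = {
            term: sum(self.weights[n] * cache[n][term] for n in cache)
            for term in common
        }
        return max(totals, key=totals.get) if totals else None


def build_device(models, weights):
    """Stands in for model construction module 701."""
    modules = [PredictionModule(name, model) for name, model in models.items()]
    return modules, PredictionOutputModule(weights)


def predict(text, modules, output_module):
    cache = {}  # temporary cache shared by the chained modules
    for module in modules:
        module.run(text, cache)
    return output_module.run(cache)


# Stub models with hypothetical outputs; real models would inspect the text.
models = {
    "first": lambda text: {",": 0.9, ".": 0.1},
    "second": lambda text: {",": 0.7},
    "third": lambda text: {",": 0.8, "!": 0.2},
}
weights = {"first": 0.50, "second": 0.25, "third": 0.25}

modules, output_module = build_device(models, weights)
mark = predict("text to be predicted", modules, output_module)
```

Only the comma is proposed by all three stub models, so it is the mark that the output module returns.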
An embodiment of the application also discloses a computer device comprising a memory and a processor. Computer-readable instructions are stored in the memory and, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation mark prediction method described in the above embodiments.
An embodiment of the application also discloses a storage medium that can be read from and written to by a processor. The storage medium stores computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation mark prediction method described in the above embodiments.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc or a read-only memory (ROM), or a random access memory (RAM), among others.
The technical features of the embodiments described above can be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, as long as a combination contains no contradiction, it should be considered within the scope of this specification.
The embodiments described above present only several implementations of the application. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make various modifications and improvements without departing from the concept of the application, and these all fall within its scope of protection. The scope of protection of this patent shall therefore be determined by the appended claims.
Claims (10)
1. A punctuation mark prediction method based on multiple prediction models, characterized by comprising the following steps:
constructing a first punctuation mark prediction model, a second punctuation mark prediction model and a third punctuation mark prediction model, and preconfiguring a prediction probability weight for each of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model;
obtaining a text to be predicted, inputting the text to be predicted into the first punctuation mark prediction model, and obtaining the prediction result of the first punctuation mark prediction model;
inputting the text to be predicted into the second punctuation mark prediction model, and obtaining the prediction result of the second punctuation mark prediction model;
inputting the text to be predicted into the third punctuation mark prediction model, and obtaining the prediction result of the third punctuation mark prediction model;
predicting the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model.
2. The punctuation mark prediction method based on multiple prediction models according to claim 1, characterized in that constructing the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model comprises:
constructing a bidirectional recurrent neural network punctuation mark prediction model, a speech pause punctuation mark prediction model and a keyword-matching punctuation mark prediction model, and creating an attention layer in the bidirectional recurrent neural network punctuation mark prediction model, the attention mechanism of the attention layer satisfying the formula:
where $\alpha_{ij} = \mathrm{softmax}(v_a^{T} \tanh(W_a s_{i-1} + U_a h_j))$, $v_a^{T}$ is the attention matrix, $W_a$ is the hidden-layer activation matrix of the previous moment, $s_{i-1}$ is the hidden-layer activation output of the previous moment, $U_a$ is the hidden-layer output matrix of the current moment, $h_j$ is the hidden-layer output of the current moment, $\alpha_{ij}$ is the activation output of the attention layer, and $c_i$ is the output-layer output after the attention mechanism.
3. The punctuation mark prediction method based on multiple prediction models according to claim 2, characterized in that inputting the text to be predicted into the first punctuation mark prediction model and obtaining the prediction result of the first punctuation mark prediction model comprises:
inputting the text to be predicted into the bidirectional recurrent neural network punctuation mark prediction model, and obtaining the prediction terms of the bidirectional recurrent neural network punctuation mark prediction model and the prediction probability corresponding to each prediction term;
storing the prediction terms of the bidirectional recurrent neural network punctuation mark prediction model and the prediction probability corresponding to each prediction term in a temporary cache.
4. The punctuation mark prediction method based on multiple prediction models according to claim 3, characterized in that inputting the text to be predicted into the second punctuation mark prediction model and obtaining the prediction result of the second punctuation mark prediction model comprises:
inputting the text to be predicted into the speech pause punctuation mark prediction model, and obtaining the prediction terms of the speech pause punctuation mark prediction model and the prediction probability corresponding to each prediction term;
storing the prediction terms of the speech pause punctuation mark prediction model and the prediction probability corresponding to each prediction term in the temporary cache.
5. The punctuation mark prediction method based on multiple prediction models according to claim 4, characterized in that inputting the text to be predicted into the third punctuation mark prediction model and obtaining the prediction result of the third punctuation mark prediction model comprises:
inputting the text to be predicted into the keyword-matching punctuation mark prediction model, and obtaining the prediction terms of the keyword-matching punctuation mark prediction model and the prediction probability corresponding to each prediction term;
storing the prediction terms of the keyword-matching punctuation mark prediction model and the prediction probability corresponding to each prediction term in the temporary cache.
6. The punctuation mark prediction method based on multiple prediction models according to claim 5, characterized in that predicting the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model comprises:
querying the temporary cache for the prediction terms that are identical across the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model;
predicting the punctuation marks of the text to be predicted according to the prediction probabilities corresponding to the prediction terms common to the first, second and third punctuation mark prediction models in the temporary cache, and the prediction probability weights.
7. The punctuation mark prediction method based on multiple prediction models according to claim 6, characterized in that predicting the punctuation marks of the text to be predicted according to the prediction probabilities corresponding to the prediction terms common to the first, second and third punctuation mark prediction models in the temporary cache, and the prediction probability weights, comprises:
calculating the partial prediction probability of each prediction model separately, according to the prediction probabilities corresponding to the prediction terms common to the first, second and third punctuation mark prediction models in the temporary cache, and the prediction probability weights;
adding up the partial prediction probabilities of the prediction models to obtain the total prediction probability of the prediction term;
finding, among the prediction terms common to the first, second and third punctuation mark prediction models in the temporary cache, the prediction term with the highest total prediction probability, and outputting that prediction term as the punctuation mark prediction for the text to be predicted.
8. A punctuation mark prediction device based on multiple prediction models, characterized in that the device comprises:
a model construction module, configured to construct a first punctuation mark prediction model, a second punctuation mark prediction model and a third punctuation mark prediction model, and to preconfigure a prediction probability weight for each of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model;
a first prediction module, configured to obtain a text to be predicted, input the text to be predicted into the first punctuation mark prediction model, and obtain the prediction result of the first punctuation mark prediction model;
a second prediction module, configured to input the text to be predicted into the second punctuation mark prediction model and obtain the prediction result of the second punctuation mark prediction model;
a third prediction module, configured to input the text to be predicted into the third punctuation mark prediction model and obtain the prediction result of the third punctuation mark prediction model;
a prediction output module, configured to predict the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model.
9. A computer device, characterized in that the computer device comprises a memory and a processor, computer-readable instructions being stored in the memory which, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation mark prediction method according to any one of claims 1 to 7.
10. A storage medium, characterized in that the storage medium can be read from and written to by a processor and stores computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation mark prediction method according to any one of claims 1 to 7.
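The attention weights recited in claim 2 can be illustrated with a small numerical sketch. The formula that combines the weights into c_i is not reproduced in the text, so the code assumes the common additive-attention form in which c_i is the sum of the hidden outputs h_j weighted by α_ij; that combination, the function names, the matrix shapes and all numeric values are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def additive_attention(s_prev, hidden_outputs, W_a, U_a, v_a):
    """alpha_ij = softmax_j(v_a^T tanh(W_a s_{i-1} + U_a h_j))."""
    projected_state = matvec(W_a, s_prev)
    scores = []
    for h_j in hidden_outputs:
        projected_h = matvec(U_a, h_j)
        scores.append(sum(
            v * math.tanh(a + b)
            for v, a, b in zip(v_a, projected_state, projected_h)
        ))
    alphas = softmax(scores)
    # Assumed combination step: c_i = sum_j alpha_ij * h_j.
    size = len(hidden_outputs[0])
    context = [
        sum(alpha * h_j[d] for alpha, h_j in zip(alphas, hidden_outputs))
        for d in range(size)
    ]
    return alphas, context


# Hypothetical toy values with two-dimensional states.
s_prev = [0.1, -0.2]
hidden_outputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
W_a = [[0.5, 0.1], [0.0, 0.3]]
U_a = [[0.2, -0.1], [0.4, 0.6]]
v_a = [1.0, -1.0]

alphas, context = additive_attention(s_prev, hidden_outputs, W_a, U_a, v_a)
```

The weights in `alphas` sum to 1, and `context` has the same dimension as each hidden output.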
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910515571.1A CN110413987B (en) | 2019-06-14 | 2019-06-14 | Punctuation mark prediction method based on multiple prediction models and related equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910515571.1A CN110413987B (en) | 2019-06-14 | 2019-06-14 | Punctuation mark prediction method based on multiple prediction models and related equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110413987A true CN110413987A (en) | 2019-11-05 |
CN110413987B CN110413987B (en) | 2023-05-30 |
Family
ID=68359082
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910515571.1A Active CN110413987B (en) | 2019-06-14 | 2019-06-14 | Punctuation mark prediction method based on multiple prediction models and related equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110413987B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111145732A (en) * | 2019-12-27 | 2020-05-12 | 苏州思必驰信息科技有限公司 | Processing method and system after multi-task voice recognition |
CN111241810A (en) * | 2020-01-16 | 2020-06-05 | 百度在线网络技术(北京)有限公司 | Punctuation prediction method and device |
CN111261162A (en) * | 2020-03-09 | 2020-06-09 | 北京达佳互联信息技术有限公司 | Speech recognition method, speech recognition apparatus, and storage medium |
CN113095062A (en) * | 2021-04-12 | 2021-07-09 | 阿里巴巴新加坡控股有限公司 | Data processing method and device, electronic equipment and computer storage medium |
US20210365632A1 (en) * | 2020-05-19 | 2021-11-25 | International Business Machines Corporation | Text autocomplete using punctuation marks |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9135231B1 (en) * | 2012-10-04 | 2015-09-15 | Google Inc. | Training punctuation models |
CN107221330A (en) * | 2017-05-26 | 2017-09-29 | 北京搜狗科技发展有限公司 | Punctuate adding method and device, the device added for punctuate |
CN107767870A (en) * | 2017-09-29 | 2018-03-06 | 百度在线网络技术(北京)有限公司 | Adding method, device and the computer equipment of punctuation mark |
CN108038580A (en) * | 2017-12-30 | 2018-05-15 | 国网江苏省电力公司无锡供电公司 | The multi-model integrated Forecasting Methodology of photovoltaic power based on synchronous extruding wavelet transformation |
CN109614627A (en) * | 2019-01-04 | 2019-04-12 | 平安科技(深圳)有限公司 | A kind of text punctuate prediction technique, device, computer equipment and storage medium |
CN109858038A (en) * | 2019-03-01 | 2019-06-07 | 科大讯飞股份有限公司 | A kind of text punctuate determines method and device |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111145732A (en) * | 2019-12-27 | 2020-05-12 | 苏州思必驰信息科技有限公司 | Processing method and system after multi-task voice recognition |
CN111145732B (en) * | 2019-12-27 | 2022-05-10 | 思必驰科技股份有限公司 | Processing method and system after multi-task voice recognition |
CN111241810A (en) * | 2020-01-16 | 2020-06-05 | 百度在线网络技术(北京)有限公司 | Punctuation prediction method and device |
CN111241810B (en) * | 2020-01-16 | 2023-08-01 | 百度在线网络技术(北京)有限公司 | Punctuation prediction method and punctuation prediction device |
CN111261162A (en) * | 2020-03-09 | 2020-06-09 | 北京达佳互联信息技术有限公司 | Speech recognition method, speech recognition apparatus, and storage medium |
CN111261162B (en) * | 2020-03-09 | 2023-04-18 | 北京达佳互联信息技术有限公司 | Speech recognition method, speech recognition apparatus, and storage medium |
US20210365632A1 (en) * | 2020-05-19 | 2021-11-25 | International Business Machines Corporation | Text autocomplete using punctuation marks |
US11556709B2 (en) * | 2020-05-19 | 2023-01-17 | International Business Machines Corporation | Text autocomplete using punctuation marks |
CN113095062A (en) * | 2021-04-12 | 2021-07-09 | 阿里巴巴新加坡控股有限公司 | Data processing method and device, electronic equipment and computer storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110413987B (en) | 2023-05-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110413987A (en) | Punctuation mark prediction technique and relevant device based on multiple prediction models | |
CN108536679B (en) | Named entity recognition method, device, equipment and computer readable storage medium | |
Vyas et al. | Fast transformers with clustered attention | |
CN110083705B (en) | Multi-hop attention depth model, method, storage medium and terminal for target emotion classification | |
CN108681610B (en) | generating type multi-turn chatting dialogue method, system and computer readable storage medium | |
Rastogi et al. | Scalable multi-domain dialogue state tracking | |
CN111897941B (en) | Dialogue generation method, network training method, device, storage medium and equipment | |
CN110534087A (en) | A kind of text prosody hierarchy Structure Prediction Methods, device, equipment and storage medium | |
US20220121906A1 (en) | Task-aware neural network architecture search | |
CN110032632A (en) | Intelligent customer service answering method, device and storage medium based on text similarity | |
Yang et al. | Graphdialog: Integrating graph knowledge into end-to-end task-oriented dialogue systems | |
US9099083B2 (en) | Kernel deep convex networks and end-to-end learning | |
CN106897254B (en) | Network representation learning method | |
CN109117480B (en) | Word prediction method, word prediction device, computer equipment and storage medium | |
CN109271646A (en) | Text interpretation method, device, readable storage medium storing program for executing and computer equipment | |
CN109766557A (en) | A kind of sentiment analysis method, apparatus, storage medium and terminal device | |
CN104541324A (en) | A speech recognition system and a method of using dynamic bayesian network models | |
CN112232087B (en) | Specific aspect emotion analysis method of multi-granularity attention model based on Transformer | |
CN110727778A (en) | Intelligent question-answering system for tax affairs | |
CN110347831A (en) | Based on the sensibility classification method from attention mechanism | |
CN110334196B (en) | Neural network Chinese problem generation system based on strokes and self-attention mechanism | |
CN115455171B (en) | Text video mutual inspection rope and model training method, device, equipment and medium | |
CN115495552A (en) | Multi-round dialogue reply generation method based on two-channel semantic enhancement and terminal equipment | |
Shen et al. | Learning how to listen: A temporal-frequential attention model for sound event detection | |
CN110955765A (en) | Corpus construction method and apparatus of intelligent assistant, computer device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||