CN110413987A - Punctuation mark prediction method and related device based on multiple prediction models

Info

Publication number
CN110413987A
Authority
CN
China
Prior art keywords
prediction
punctuation mark
prediction model
model
term
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910515571.1A
Other languages
Chinese (zh)
Other versions
CN110413987B (en
Inventor
李秀丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910515571.1A priority Critical patent/CN110413987B/en
Publication of CN110413987A publication Critical patent/CN110413987A/en
Application granted granted Critical
Publication of CN110413987B publication Critical patent/CN110413987B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

This application relates to the field of artificial intelligence and discloses a punctuation mark prediction method and related device based on multiple prediction models. The method includes: constructing three punctuation mark prediction models, inputting the text to be predicted into each of the three punctuation mark prediction models, analyzing the prediction results of the three models, and outputting a prediction based on the analysis. By performing punctuation prediction with a bidirectional recurrent neural network and combining it with keyword matching and speech pause prediction models, the application can effectively improve the accuracy of punctuation mark prediction.

Description

Punctuation mark prediction method and related device based on multiple prediction models
Technical field
This application relates to the field of artificial intelligence, and in particular to a punctuation mark prediction method and related device based on multiple prediction models.
Background
Punctuation prediction belongs to the post-processing stage of speech recognition: after speech has been converted into text, the converted text must be post-processed to improve the user experience of speech recognition products. This post-processing mainly includes spoken-language smoothing, punctuation prediction and inverse text normalization. If the initial transcribed text is a single unbroken sentence without punctuation marks, the user experience is poor; after punctuation prediction, sentences with punctuation marks can be output, which greatly improves the user experience.
Current punctuation mark prediction is based on keyword matching and on the length of speech pauses. It does not use contextual information, so its prediction error rate is high.
Summary of the invention
In view of the deficiencies of the prior art, the purpose of the application is to provide a punctuation mark prediction method and related device based on multiple prediction models. By performing punctuation prediction with a bidirectional recurrent neural network and combining it with keyword matching and speech pause prediction models, the accuracy of punctuation mark prediction can be effectively improved.
To achieve the above purpose, the technical solution of the application provides a punctuation mark prediction method and related device based on multiple prediction models.
The application discloses a punctuation mark prediction method based on multiple prediction models, comprising the following steps:
constructing a first punctuation mark prediction model, a second punctuation mark prediction model and a third punctuation mark prediction model, and pre-configuring a prediction probability weight for each of the first, second and third punctuation mark prediction models;
obtaining the text to be predicted, inputting the text to be predicted into the first punctuation mark prediction model, and obtaining the prediction result of the first punctuation mark prediction model;
inputting the text to be predicted into the second punctuation mark prediction model, and obtaining the prediction result of the second punctuation mark prediction model;
inputting the text to be predicted into the third punctuation mark prediction model, and obtaining the prediction result of the third punctuation mark prediction model;
predicting the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first, second and third punctuation mark prediction models.
Preferably, constructing the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model comprises:
constructing a bidirectional recurrent neural network punctuation mark prediction model, a speech pause punctuation mark prediction model and a keyword matching punctuation mark prediction model, and creating an attention layer in the bidirectional recurrent neural network punctuation mark prediction model, the attention mechanism of the attention layer satisfying the formula:
$\alpha_{ij} = \mathrm{softmax}\left(v_a^{T} \tanh\left(W_a s_{i-1} + U_a h_j\right)\right)$
where $v_a^{T}$ is the attention matrix, $W_a$ is the hidden-layer activation matrix of the past moment, $s_{i-1}$ is the output of the hidden-layer activation at the past moment, $U_a$ is the hidden-layer output matrix of the current moment, $h_j$ is the hidden-layer output at the current moment, $\alpha_{ij}$ is the output of the attention-layer activation, and $c_i$ is the output of the output layer after the attention mechanism.
Preferably, inputting the text to be predicted into the first punctuation mark prediction model and obtaining the prediction result of the first punctuation mark prediction model comprises:
inputting the text to be predicted into the bidirectional recurrent neural network punctuation mark prediction model, and obtaining the prediction terms of the bidirectional recurrent neural network punctuation mark prediction model and the prediction probability corresponding to each prediction term;
storing the prediction terms of the bidirectional recurrent neural network punctuation mark prediction model and the corresponding prediction probabilities in a temporary cache.
Preferably, inputting the text to be predicted into the second punctuation mark prediction model and obtaining the prediction result of the second punctuation mark prediction model comprises:
inputting the text to be predicted into the speech pause punctuation mark prediction model, and obtaining the prediction terms of the speech pause punctuation mark prediction model and the prediction probability corresponding to each prediction term;
storing the prediction terms of the speech pause punctuation mark prediction model and the corresponding prediction probabilities in the temporary cache.
Preferably, inputting the text to be predicted into the third punctuation mark prediction model and obtaining the prediction result of the third punctuation mark prediction model comprises:
inputting the text to be predicted into the keyword matching punctuation mark prediction model, and obtaining the prediction terms of the keyword matching punctuation mark prediction model and the prediction probability corresponding to each prediction term;
storing the prediction terms of the keyword matching punctuation mark prediction model and the corresponding prediction probabilities in the temporary cache.
Preferably, predicting the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first, second and third punctuation mark prediction models comprises:
querying the temporary cache for the prediction terms that are identical across the first, second and third punctuation mark prediction models;
predicting the punctuation marks of the text to be predicted according to the prediction probability weights and the prediction probabilities in the temporary cache that correspond to the prediction terms shared by the first, second and third punctuation mark prediction models.
Preferably, predicting the punctuation marks of the text to be predicted according to the prediction probability weights and the prediction probabilities in the temporary cache that correspond to the shared prediction terms comprises:
for each shared prediction term, calculating the partial prediction probability of each prediction model from that model's prediction probability for the term in the temporary cache and that model's prediction probability weight;
summing the partial prediction probabilities of all prediction models to obtain the total prediction probability of the prediction term;
finding, among the prediction terms in the temporary cache that are shared by the first, second and third punctuation mark prediction models, the prediction term with the highest total prediction probability, and outputting that prediction term as the punctuation mark prediction for the text to be predicted.
The application also discloses a punctuation mark prediction device based on multiple prediction models, the device comprising:
a model construction module, configured to construct a first punctuation mark prediction model, a second punctuation mark prediction model and a third punctuation mark prediction model, and to pre-configure a prediction probability weight for each of the first, second and third punctuation mark prediction models;
a first prediction module, configured to obtain the text to be predicted, input the text to be predicted into the first punctuation mark prediction model, and obtain the prediction result of the first punctuation mark prediction model;
a second prediction module, configured to input the text to be predicted into the second punctuation mark prediction model and obtain the prediction result of the second punctuation mark prediction model;
a third prediction module, configured to input the text to be predicted into the third punctuation mark prediction model and obtain the prediction result of the third punctuation mark prediction model;
a prediction output module, configured to predict the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first, second and third punctuation mark prediction models.
The application also discloses a computer device comprising a memory and a processor. Computer-readable instructions are stored in the memory and, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation mark prediction method described above.
The application also discloses a storage medium that can be read and written by a processor. The storage medium stores computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation mark prediction method described above.
The beneficial effect of the application is as follows: by performing punctuation prediction with a bidirectional recurrent neural network and combining it with keyword matching and speech pause prediction models, the application can effectively improve the accuracy of punctuation mark prediction.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to an embodiment of the application;
Fig. 2 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to a second embodiment of the application;
Fig. 3 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to a third embodiment of the application;
Fig. 4 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to a fourth embodiment of the application;
Fig. 5 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to a fifth embodiment of the application;
Fig. 6 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to a sixth embodiment of the application;
Fig. 7 is a schematic structural diagram of a punctuation mark prediction device based on multiple prediction models according to an embodiment of the application.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the application and are not intended to limit it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used herein may also include the plural forms. It should be further understood that the word "comprising" used in the specification of the application refers to the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
The flow of a punctuation mark prediction method based on multiple prediction models according to an embodiment of the application is shown in Fig. 1. The embodiment comprises the following steps:
Step s101: construct a first punctuation mark prediction model, a second punctuation mark prediction model and a third punctuation mark prediction model, and pre-configure a prediction probability weight for each of the first, second and third punctuation mark prediction models;
Specifically, three punctuation mark prediction models can be constructed in advance. The three models predict the punctuation marks of the input text in different ways. For example, the bidirectional recurrent neural network prediction model predicts on the basis of a neural network and can serve as the first punctuation mark prediction model; the speech pause prediction model predicts punctuation marks from the duration of speech pauses and can serve as the second punctuation mark prediction model; the keyword matching prediction model predicts punctuation marks from preset keywords and can serve as the third punctuation mark prediction model.
Specifically, after the text to be predicted has been input into the first, second and third punctuation mark prediction models, the prediction result of each model can be obtained. A prediction probability weight can therefore be configured in advance for each of the first, second and third punctuation mark prediction models. The prediction results of the three models, weighted by their respective prediction probability weights, give the final prediction result for the text to be predicted.
Step s102: obtain the text to be predicted, input the text to be predicted into the first punctuation mark prediction model, and obtain the prediction result of the first punctuation mark prediction model;
Specifically, the text to be predicted is obtained first. The text to be predicted is a sequence of characters and words without any punctuation marks, such as "I am Chinese I live in Beijing". The purpose of punctuation mark prediction is to find the punctuation positions in this sequence and add suitable punctuation marks at those positions. Taking "I am Chinese I live in Beijing" as an example, the output may be any of the following:
"I am Chinese, I live in Beijing."
"I am, Chinese I live in Beijing"
"I am Chinese! I live in Beijing,"
"I am Chinese? I live in Beijing"
"I am China, people I, live in Beijing".
Each of the above cases consists of a prediction term and the probability corresponding to that prediction term. The final output can be the prediction term with the highest probability, or all prediction terms together with their corresponding probabilities. The text to be predicted can be a passage entered manually or a passage produced by speech recognition.
Specifically, after the text to be predicted has been input into the first punctuation mark prediction model, the prediction result of the first punctuation mark prediction model can be obtained.
Step s103: input the text to be predicted into the second punctuation mark prediction model, and obtain the prediction result of the second punctuation mark prediction model;
Specifically, as in step s102, after the text to be predicted has been input into the second punctuation mark prediction model, the prediction result of the second punctuation mark prediction model can be obtained. This prediction result includes prediction terms and the probability corresponding to each prediction term.
Step s104: input the text to be predicted into the third punctuation mark prediction model, and obtain the prediction result of the third punctuation mark prediction model;
Specifically, as in step s102, after the text to be predicted has been input into the third punctuation mark prediction model, the prediction result of the third punctuation mark prediction model can be obtained. This prediction result includes prediction terms and the probability corresponding to each prediction term.
Step s105: predict the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first, second and third punctuation mark prediction models.
Specifically, once the prediction results of the first, second and third punctuation mark prediction models have been obtained, each prediction result includes prediction terms and their corresponding probabilities. The weighted probability sum of each prediction term can therefore be calculated from the prediction probability weights of the three models. The weighted probability sums of the prediction terms are then compared, and the prediction term with the largest weighted probability sum is the final punctuation mark prediction result for the text to be predicted.
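The weighted combination in step s105 can be sketched as follows. This is a minimal illustration, not the implementation in the application: the model names, the weight values and the probabilities are all hypothetical, since the source only states that a prediction probability weight is pre-configured for each model.

```python
from collections import defaultdict

# Hypothetical weights; the source does not give concrete values.
WEIGHTS = {"first": 0.5, "second": 0.3, "third": 0.2}

def combine(results, weights):
    """Return (best_term, totals) from per-model {prediction_term: probability} maps."""
    totals = defaultdict(float)
    for model, predictions in results.items():
        for term, prob in predictions.items():
            # Each model contributes its probability scaled by its weight.
            totals[term] += weights[model] * prob
    # The term with the largest weighted probability sum is the final result.
    best = max(totals, key=totals.get)
    return best, dict(totals)

# Hypothetical prediction results for two candidate prediction terms A and B.
results = {
    "first": {"A": 0.7, "B": 0.3},
    "second": {"A": 0.4, "B": 0.6},
    "third": {"A": 0.5, "B": 0.5},
}
best, totals = combine(results, WEIGHTS)
```

With these illustrative numbers, term A accumulates the larger weighted sum and is selected, even though the second model on its own prefers B.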
In this embodiment, by performing punctuation prediction with a bidirectional recurrent neural network and combining it with keyword matching and speech pause prediction models, the accuracy of punctuation mark prediction can be effectively improved.
In one embodiment, step s101, constructing the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model, comprises:
constructing a bidirectional recurrent neural network punctuation mark prediction model, a speech pause punctuation mark prediction model and a keyword matching punctuation mark prediction model, and creating an attention layer in the bidirectional recurrent neural network punctuation mark prediction model, the attention mechanism of the attention layer satisfying the formula:
$\alpha_{ij} = \mathrm{softmax}\left(v_a^{T} \tanh\left(W_a s_{i-1} + U_a h_j\right)\right)$
where $v_a^{T}$ is the attention matrix, $W_a$ is the hidden-layer activation matrix of the past moment, $s_{i-1}$ is the output of the hidden-layer activation at the past moment, $U_a$ is the hidden-layer output matrix of the current moment, $h_j$ is the hidden-layer output at the current moment, $\alpha_{ij}$ is the output of the attention-layer activation, and $c_i$ is the output of the output layer after the attention mechanism.
Specifically, the first punctuation mark prediction model can be constructed as a bidirectional recurrent neural network punctuation mark prediction model. In an ordinary recurrent neural network, the state output runs only from front to back. A bidirectional recurrent neural network has a state output from front to back and also a state output from back to front; that is, the text or punctuation output at the next moment depends not only on the preceding outputs but also on the following ones. Ordinary neural networks such as a DNN (deep neural network) or a CNN (convolutional neural network) have no connections between hidden-layer nodes and therefore cannot use information from earlier in the sequence when processing the current position. The hidden-layer nodes of an RNN (recurrent neural network) are connected, which means the network can remember earlier information and apply it to the current output. The second punctuation mark prediction model can be constructed as a speech pause punctuation mark prediction model, and the third punctuation mark prediction model can be constructed as a keyword matching punctuation mark prediction model.
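The difference between a one-directional and a bidirectional pass can be illustrated with a toy example. The sketch below assumes a scalar tanh recurrence with hypothetical weights `w_in` and `w_rec`; it is not the network described in the application, only a demonstration that the backward state at a position depends on later inputs while the forward state does not.

```python
import math

def rnn_pass(inputs, w_in, w_rec):
    """Run a scalar tanh RNN over inputs and return the hidden state at each step."""
    h, states = 0.0, []
    for x in inputs:
        # The new state mixes the current input with the previous hidden state.
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

def bidirectional_states(inputs, w_in=0.5, w_rec=0.8):
    """Pair each position's forward state with its backward state."""
    forward = rnn_pass(inputs, w_in, w_rec)
    # Run over the reversed sequence, then flip back so positions line up.
    backward = rnn_pass(inputs[::-1], w_in, w_rec)[::-1]
    return list(zip(forward, backward))

states = bidirectional_states([1.0, 0.0, -1.0])
changed = bidirectional_states([1.0, 0.0, 1.0])
```

Changing only the last input leaves the forward state at position 0 untouched but alters its backward state, which is the property the bidirectional model relies on when it uses following context.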
Specifically, an attention layer can also be created in the bidirectional recurrent neural network punctuation mark prediction model. An RNN (recurrent neural network) without an attention mechanism is strongly influenced only by the few words immediately before the current word, and this influence decays as the distance from the word increases. An attention mechanism addresses this problem well. It can be realized by adding an attention layer to the bidirectional recurrent neural network prediction model, the attention mechanism of the attention layer satisfying the formula:
$\alpha_{ij} = \mathrm{softmax}\left(v_a^{T} \tanh\left(W_a s_{i-1} + U_a h_j\right)\right)$
where $v_a^{T}$ is the attention matrix, $W_a$ is the hidden-layer activation matrix of the past moment, $s_{i-1}$ is the output of the hidden-layer activation at the past moment, $U_a$ is the hidden-layer output matrix of the current moment, $h_j$ is the hidden-layer output at the current moment, $\alpha_{ij}$ is the output of the attention-layer activation, and $c_i$ is the output of the output layer after the attention mechanism.
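The attention weights defined above can be computed with a few lines of plain Python. This is a sketch under stated assumptions: the matrices and vectors are small hypothetical values, the softmax is taken over the positions j, and the context vector is assumed to be the usual weighted sum of the hidden outputs h_j weighted by α_ij, which the text does not spell out.

```python
import math

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def attention(s_prev, hs, W_a, U_a, v_a):
    """Return (alphas, context) for one output step i."""
    ws = matvec(W_a, s_prev)
    scores = []
    for h in hs:
        uh = matvec(U_a, h)
        hidden = [math.tanh(a + b) for a, b in zip(ws, uh)]
        # Score for position j: v_a^T tanh(W_a s_{i-1} + U_a h_j)
        scores.append(sum(v * t for v, t in zip(v_a, hidden)))
    # Softmax over positions j, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Assumed context vector: weighted sum of the hidden outputs h_j.
    context = [sum(a * h[k] for a, h in zip(alphas, hs)) for k in range(len(hs[0]))]
    return alphas, context

# Hypothetical parameters and hidden states, chosen only for illustration.
W_a = [[0.1, 0.2], [0.3, 0.4]]
U_a = [[0.5, 0.0], [0.0, 0.5]]
v_a = [1.0, -1.0]
s_prev = [0.0, 1.0]
hs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, context = attention(s_prev, hs, W_a, U_a, v_a)
```

Because the weights come from a softmax, they are positive and sum to one, so every position in the sequence can contribute to the current output regardless of its distance.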
In this embodiment, creating an attention layer in the first punctuation mark prediction model can improve the precision of punctuation mark prediction.
Fig. 2 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to a second embodiment of the application. As shown in the figure, step s102, inputting the text to be predicted into the first punctuation mark prediction model and obtaining the prediction result of the first punctuation mark prediction model, comprises:
Step s201: input the text to be predicted into the bidirectional recurrent neural network punctuation mark prediction model, and obtain the prediction terms of the bidirectional recurrent neural network punctuation mark prediction model and the prediction probability corresponding to each prediction term;
Specifically, the text to be predicted can be input into the bidirectional recurrent neural network punctuation mark prediction model. After the model has performed its calculation, it outputs the prediction terms and the prediction probability corresponding to each prediction term.
Step s202: store the prediction terms of the bidirectional recurrent neural network punctuation mark prediction model and the corresponding prediction probabilities in a temporary cache.
Specifically, once the prediction terms of the bidirectional recurrent neural network punctuation mark prediction model and their corresponding prediction probabilities have been obtained, they can be stored in the temporary cache. The prediction terms and corresponding prediction probabilities of the second and third punctuation mark prediction models are then obtained in turn.
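One simple way to picture the temporary cache is as a dictionary keyed by model name. The sketch below is only an assumption about the data layout: the source does not specify how the cache is organized, and the model names and probabilities are hypothetical.

```python
def store_prediction(cache, model_name, predictions):
    """Store one model's {prediction_term: probability} map under its name."""
    # Copy the map so later changes by the caller do not alter the cache.
    cache[model_name] = dict(predictions)

cache = {}
store_prediction(cache, "first", {"A": 0.6, "B": 0.3, "C": 0.1})
store_prediction(cache, "second", {"A": 0.5, "B": 0.5})
```

Keeping each model's results under its own key makes it straightforward to compare the prediction terms of the three models once all of them have been stored.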
In this embodiment, storing the prediction results of the punctuation mark prediction models allows the prediction results to be compared effectively, which improves the precision of prediction.
Fig. 3 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to a third embodiment of the application. As shown in the figure, step s103, inputting the text to be predicted into the second punctuation mark prediction model and obtaining the prediction result of the second punctuation mark prediction model, comprises:
Step s301: input the text to be predicted into the speech pause punctuation mark prediction model, and obtain the prediction terms of the speech pause punctuation mark prediction model and the prediction probability corresponding to each prediction term;
Specifically, the text to be predicted can be input into the speech pause punctuation mark prediction model, which predicts according to the length of speech pauses. For example, a speech pause longer than 1 s is set as a long pause, and a pause longer than 300 ms but shorter than 1 s is set as a short pause. A long pause outputs sentence-ending punctuation, such as a period, question mark or exclamation mark; a short pause outputs a comma, enumeration comma or colon. After the speech pause punctuation mark prediction model has performed its calculation, it outputs the prediction terms and the prediction probability corresponding to each prediction term.
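The pause thresholds in this example translate directly into a small classification function. The sketch below uses the 1 s and 300 ms values from the text; the function name, the use of punctuation names instead of characters, and the treatment of a pause of exactly 1 s as short are assumptions made for illustration.

```python
LONG_PAUSE_S = 1.0    # from the text: pauses longer than 1 s are long
SHORT_PAUSE_S = 0.3   # from the text: pauses between 300 ms and 1 s are short

def candidates_for_pause(duration_s):
    """Return the candidate punctuation marks for a pause of the given length."""
    if duration_s > LONG_PAUSE_S:
        # Long pause: sentence-ending punctuation.
        return ["period", "question mark", "exclamation mark"]
    if duration_s > SHORT_PAUSE_S:
        # Short pause: punctuation inside a sentence.
        # Assumption: a pause of exactly 1 s falls in this branch.
        return ["comma", "enumeration comma", "colon"]
    # Pauses of 300 ms or less produce no candidate.
    return []

long_marks = candidates_for_pause(1.5)
short_marks = candidates_for_pause(0.5)
```

The function only narrows down the class of punctuation; assigning a probability to each candidate within the class is left to the model and is not specified in the text.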
Step s302: store the prediction terms of the speech pause punctuation mark prediction model and the corresponding prediction probabilities in the temporary cache.
Specifically, once the prediction terms of the speech pause punctuation mark prediction model and their corresponding prediction probabilities have been obtained, they can be stored in the temporary cache. The prediction terms and corresponding prediction probabilities of the third punctuation mark prediction model are then obtained.
In this embodiment, storing the prediction results of the punctuation mark prediction models allows the prediction results to be compared effectively, which improves the precision of prediction.
Fig. 4 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to a fourth embodiment of the application. As shown in the figure, step s104, inputting the text to be predicted into the third punctuation mark prediction model and obtaining the prediction result of the third punctuation mark prediction model, comprises:
Step s401: input the text to be predicted into the keyword matching punctuation mark prediction model, and obtain the prediction terms of the keyword matching punctuation mark prediction model and the prediction probability corresponding to each prediction term;
Specifically, the text to be predicted can be input into the keyword matching punctuation mark prediction model, which predicts punctuation marks from keyword frequency statistics. The word frequencies of punctuated sentences in the training text are counted, and the statistical probability of each word being followed by a punctuation mark is calculated; this statistical probability is then supplied during prediction. For example, one keyword is followed by a specific punctuation mark, and another keyword is followed by "!". After the keyword matching punctuation mark prediction model has performed its calculation, it outputs the prediction terms and the prediction probability corresponding to each prediction term.
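The frequency statistics behind the keyword matching model can be sketched as a count of which punctuation mark follows each word in punctuated training text. The example below assumes pre-tokenized sentences and a small hypothetical punctuation set and corpus; the application does not describe tokenization or the exact form of the statistics.

```python
from collections import Counter, defaultdict

PUNCTUATION = {",", ".", "?", "!"}

def train_keyword_stats(sentences):
    """Estimate P(punctuation | preceding word) from tokenized, punctuated text."""
    follow = defaultdict(Counter)
    for tokens in sentences:
        for word, nxt in zip(tokens, tokens[1:]):
            if word in PUNCTUATION:
                continue
            # The empty string records "no punctuation after this word".
            follow[word][nxt if nxt in PUNCTUATION else ""] += 1
    return {
        word: {mark: n / sum(counts.values()) for mark, n in counts.items()}
        for word, counts in follow.items()
    }

# Hypothetical training corpus, already split into tokens.
corpus = [
    ["are", "you", "ok", "?"],
    ["i", "am", "ok", "."],
    ["ok", "?", "fine", "."],
    ["ok", "then"],
]
stats = train_keyword_stats(corpus)
```

In this toy corpus the word "ok" is followed by a question mark in half of its occurrences, so a lookup of `stats["ok"]` would supply that probability at prediction time.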
Step s402: store the prediction term of the keyword matching punctuation mark prediction model and the prediction probability corresponding to that prediction term in the temporary cache.
Specifically, after the prediction term of the keyword matching punctuation mark prediction model and its corresponding prediction probability are obtained, both can be stored in the temporary cache. They can then be analyzed and compared with the previously stored prediction terms and corresponding prediction probabilities of the first punctuation mark prediction model and the second punctuation mark prediction model.
In this embodiment, storing the prediction results of the punctuation mark prediction models allows the results to be compared effectively, which improves prediction accuracy.
Fig. 5 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to the fifth embodiment of the present application. As shown, step s105, predicting the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model, comprises:
Step s501: query the temporary cache for the prediction terms that are identical across the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model;
Specifically, after the prediction terms of the first, second and third punctuation mark prediction models and their corresponding prediction probabilities are obtained, the temporary cache can be queried for the prediction terms that all three models have in common. For example, if the first punctuation mark prediction model outputs prediction terms A, B and C, the second outputs A and B, and the third outputs A and C, then the prediction term common to all three punctuation mark prediction models is A.
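The query for common prediction terms amounts to a set intersection. The sketch below reproduces the A/B/C example from the text; the cache layout (a dictionary keyed by hypothetical model names) and the function name `common_prediction_terms` are assumptions made for illustration.

```python
def common_prediction_terms(cache):
    """Return the prediction terms that every model in the cache produced."""
    term_sets = [set(terms) for terms in cache.values()]
    if not term_sets:
        return set()
    return set.intersection(*term_sets)


# Hypothetical temporary cache holding each model's prediction terms.
temporary_cache = {
    "first_model": ["A", "B", "C"],
    "second_model": ["A", "B"],
    "third_model": ["A", "C"],
}

print(common_prediction_terms(temporary_cache))  # {'A'}
```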
Step s502: predict the punctuation marks of the text to be predicted according to the prediction probability weights and the prediction probabilities, stored in the temporary cache, that correspond to the prediction terms common to the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model.
Specifically, after the prediction terms common to the three models are obtained, the temporary cache can be queried with each common prediction term to obtain its corresponding prediction probabilities, which comprise the respective prediction probability from each of the three punctuation mark prediction models. The weighted sum of prediction probabilities for the common prediction term is then calculated from these prediction probabilities and the prediction probability weight of each punctuation mark prediction model, and the punctuation marks of the text to be predicted are predicted according to this weighted sum.
In this embodiment, analyzing and comparing the prediction results of the three punctuation mark prediction models allows prediction accuracy to be improved according to the analysis result.
Fig. 6 is a schematic flowchart of a punctuation mark prediction method based on multiple prediction models according to the sixth embodiment of the present application. As shown, step s502, predicting the punctuation marks of the text to be predicted according to the prediction probability weights and the prediction probabilities, stored in the temporary cache, that correspond to the prediction terms common to the first, second and third punctuation mark prediction models, comprises:
Step s601: calculate the partial prediction probability of each prediction model from the prediction probability weights and the prediction probabilities, stored in the temporary cache, that correspond to the prediction terms common to the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model;
Specifically, after the prediction terms common to the first, second and third punctuation mark prediction models and their corresponding prediction probabilities are obtained, the partial prediction probability of each prediction model is calculated from the prediction probability weight of that model. For example, suppose the common prediction term is A. In the first punctuation mark prediction model the prediction probability of A is 90% and the probability weight is 0.5; in the second punctuation mark prediction model the prediction probability of A is 70% and the probability weight is 0.25; in the third punctuation mark prediction model the prediction probability of A is 80% and the probability weight is 0.25. The partial prediction probability of A is then 0.45 in the first punctuation mark prediction model, 0.175 in the second punctuation mark prediction model, and 0.2 in the third punctuation mark prediction model.
Step s602: add up the partial prediction probabilities of the prediction models to obtain the total prediction probability of the prediction term;
Specifically, after the partial prediction probabilities of the first, second and third punctuation mark prediction models are obtained, they can be added together to obtain the total prediction probability of the common prediction term.
Step s603: among the prediction terms in the temporary cache that are common to the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model, find the prediction term with the highest total prediction probability, and output that prediction term as the punctuation mark prediction for the text to be predicted.
Specifically, the total prediction probability of every common prediction term can be calculated, and the highest total prediction probability is then selected from among them. The prediction term corresponding to the highest total prediction probability is the punctuation mark prediction output for the text to be predicted.
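Steps s601 to s603 can be sketched together as a weighted sum followed by a maximum. The probabilities and weights for term A are the ones given in the example above; term B, term C, the model names, the function name `weighted_ensemble`, and the choice to return `None` when the models share no prediction term are hypothetical additions for illustration.

```python
def weighted_ensemble(cache, weights):
    """Pick the common prediction term with the highest total prediction probability.

    cache:   {model name: {prediction term: prediction probability}}
    weights: {model name: pre-configured prediction probability weight}
    """
    term_sets = [set(preds) for preds in cache.values()]
    common = set.intersection(*term_sets) if term_sets else set()

    totals = {}
    for term in common:
        # Step s601: partial prediction probability of each model.
        partials = {model: weights[model] * cache[model][term] for model in cache}
        # Step s602: accumulate the partial probabilities.
        totals[term] = sum(partials.values())

    if not totals:
        return None, totals  # assumption: no output when no term is shared
    # Step s603: the term with the highest total prediction probability.
    best = max(totals, key=totals.get)
    return best, totals


weights = {"first": 0.5, "second": 0.25, "third": 0.25}
cache = {
    "first": {"A": 0.9, "B": 0.6, "C": 0.3},   # B and C are hypothetical
    "second": {"A": 0.7, "B": 0.5},
    "third": {"A": 0.8, "B": 0.4},
}

best, totals = weighted_ensemble(cache, weights)
print(best, totals)
```

With these inputs the total prediction probability of A is 0.45 + 0.175 + 0.2 = 0.825, which exceeds that of the hypothetical term B, so A is output.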
In this embodiment, computing the weighted sum of the prediction results of the three punctuation mark prediction models and comparing the sums improves prediction accuracy.
The structure of a punctuation mark prediction apparatus based on multiple prediction models according to an embodiment of the present application is shown in Fig. 7 and comprises:
A model construction module 701, a first prediction module 702, a second prediction module 703, a third prediction module 704 and a prediction output module 705. The model construction module 701 is connected to the first prediction module 702, the first prediction module 702 is connected to the second prediction module 703, the second prediction module 703 is connected to the third prediction module 704, and the third prediction module 704 is connected to the prediction output module 705. The model construction module 701 is configured to construct a first punctuation mark prediction model, a second punctuation mark prediction model and a third punctuation mark prediction model, and to pre-configure a prediction probability weight for each of them. The first prediction module 702 is configured to obtain the text to be predicted, input it into the first punctuation mark prediction model, and obtain the prediction result of the first punctuation mark prediction model. The second prediction module 703 is configured to input the text to be predicted into the second punctuation mark prediction model and obtain the prediction result of the second punctuation mark prediction model. The third prediction module 704 is configured to input the text to be predicted into the third punctuation mark prediction model and obtain the prediction result of the third punctuation mark prediction model. The prediction output module 705 is configured to predict the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model.
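The module wiring described above might be mirrored in software roughly as follows. This is only a structural sketch under assumptions: the class name `PunctuationPredictionDevice`, its method names, and the stub models that return fixed probabilities are hypothetical, and each model is assumed to be a callable that maps text to a dictionary of prediction terms and prediction probabilities.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Prediction = Dict[str, float]        # prediction term -> prediction probability
Model = Callable[[str], Prediction]  # assumed model interface


@dataclass
class PunctuationPredictionDevice:
    models: List[Model]    # as built by the model construction module (701)
    weights: List[float]   # pre-configured prediction probability weights

    def run_prediction_modules(self, text: str) -> List[Prediction]:
        # Prediction modules 702 to 704: each model receives the same text.
        return [model(text) for model in self.models]

    def prediction_output(self, results: List[Prediction]) -> Optional[str]:
        # Prediction output module 705: weighted sum over the common terms.
        if not results:
            return None
        common = set.intersection(*(set(r) for r in results))
        if not common:
            return None  # assumption: no output when no term is shared
        totals = {
            term: sum(w * r[term] for w, r in zip(self.weights, results))
            for term in common
        }
        return max(totals, key=totals.get)

    def predict(self, text: str) -> Optional[str]:
        return self.prediction_output(self.run_prediction_modules(text))


# Stub models with fixed outputs, standing in for the three real models.
device = PunctuationPredictionDevice(
    models=[
        lambda text: {"A": 0.9, "B": 0.2},
        lambda text: {"A": 0.7, "B": 0.3},
        lambda text: {"A": 0.8},
    ],
    weights=[0.5, 0.25, 0.25],
)
print(device.predict("text to be predicted"))  # A
```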
An embodiment of the present application also discloses a computer device comprising a memory and a processor. Computer-readable instructions are stored in the memory and, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation mark prediction method described in the above embodiments.
An embodiment of the present application also discloses a storage medium that can be read and written by a processor. The storage medium stores computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation mark prediction method described in the above embodiments.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc or a read-only memory (ROM), or a random access memory (RAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present application, all of which fall within its scope of protection. Therefore, the scope of protection of this patent shall be determined by the appended claims.

Claims (10)

1. A punctuation mark prediction method based on multiple prediction models, characterized by comprising the following steps:
constructing a first punctuation mark prediction model, a second punctuation mark prediction model and a third punctuation mark prediction model, and pre-configuring a prediction probability weight for each of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model;
obtaining text to be predicted, inputting the text to be predicted into the first punctuation mark prediction model, and obtaining the prediction result of the first punctuation mark prediction model;
inputting the text to be predicted into the second punctuation mark prediction model, and obtaining the prediction result of the second punctuation mark prediction model;
inputting the text to be predicted into the third punctuation mark prediction model, and obtaining the prediction result of the third punctuation mark prediction model;
predicting the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model.
2. The punctuation mark prediction method based on multiple prediction models according to claim 1, characterized in that constructing the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model comprises:
constructing a bidirectional recurrent neural network punctuation mark prediction model, a speech pause punctuation mark prediction model and a keyword matching punctuation mark prediction model, and creating an attention layer in the bidirectional recurrent neural network punctuation mark prediction model, the attention mechanism of the attention layer satisfying the formula:
where $\alpha_{ij} = \mathrm{softmax}\left(v_a^{T} \tanh\left(W_a s_{i-1} + U_a h_j\right)\right)$, $v_a^{T}$ is the attention matrix, $W_a$ is the hidden-layer activation matrix of the previous time step, $s_{i-1}$ is the hidden-layer activation output of the previous time step, $U_a$ is the hidden-layer output matrix of the current time step, $h_j$ is the hidden-layer output of the current time step, $\alpha_{ij}$ is the activation output of the attention layer, and $c_i$ is the output-layer output after the attention mechanism.
3. The punctuation mark prediction method based on multiple prediction models according to claim 2, characterized in that inputting the text to be predicted into the first punctuation mark prediction model and obtaining the prediction result of the first punctuation mark prediction model comprises:
inputting the text to be predicted into the bidirectional recurrent neural network punctuation mark prediction model, and obtaining the prediction term of the bidirectional recurrent neural network punctuation mark prediction model and the prediction probability corresponding to that prediction term;
storing the prediction term of the bidirectional recurrent neural network punctuation mark prediction model and the prediction probability corresponding to that prediction term in a temporary cache.
4. The punctuation mark prediction method based on multiple prediction models according to claim 3, characterized in that inputting the text to be predicted into the second punctuation mark prediction model and obtaining the prediction result of the second punctuation mark prediction model comprises:
inputting the text to be predicted into the speech pause punctuation mark prediction model, and obtaining the prediction term of the speech pause punctuation mark prediction model and the prediction probability corresponding to that prediction term;
storing the prediction term of the speech pause punctuation mark prediction model and the prediction probability corresponding to that prediction term in the temporary cache.
5. The punctuation mark prediction method based on multiple prediction models according to claim 4, characterized in that inputting the text to be predicted into the third punctuation mark prediction model and obtaining the prediction result of the third punctuation mark prediction model comprises:
inputting the text to be predicted into the keyword matching punctuation mark prediction model, and obtaining the prediction term of the keyword matching punctuation mark prediction model and the prediction probability corresponding to that prediction term;
storing the prediction term of the keyword matching punctuation mark prediction model and the prediction probability corresponding to that prediction term in the temporary cache.
6. The punctuation mark prediction method based on multiple prediction models according to claim 5, characterized in that predicting the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model comprises:
querying the temporary cache for the prediction terms that are identical across the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model;
predicting the punctuation marks of the text to be predicted according to the prediction probability weights and the prediction probabilities, stored in the temporary cache, that correspond to the prediction terms common to the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model.
7. The punctuation mark prediction method based on multiple prediction models according to claim 6, characterized in that predicting the punctuation marks of the text to be predicted according to the prediction probability weights and the prediction probabilities, stored in the temporary cache, that correspond to the prediction terms common to the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model comprises:
calculating the partial prediction probability of each prediction model from the prediction probability weights and the prediction probabilities, stored in the temporary cache, that correspond to the prediction terms common to the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model;
adding up the partial prediction probabilities of the prediction models to obtain the total prediction probability of the prediction term;
among the prediction terms in the temporary cache that are common to the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model, finding the prediction term with the highest total prediction probability, and outputting that prediction term as the punctuation mark prediction for the text to be predicted.
8. A punctuation mark prediction apparatus based on multiple prediction models, characterized in that the apparatus comprises:
a model construction module, configured to construct a first punctuation mark prediction model, a second punctuation mark prediction model and a third punctuation mark prediction model, and to pre-configure a prediction probability weight for each of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model;
a first prediction module, configured to obtain text to be predicted, input the text to be predicted into the first punctuation mark prediction model, and obtain the prediction result of the first punctuation mark prediction model;
a second prediction module, configured to input the text to be predicted into the second punctuation mark prediction model and obtain the prediction result of the second punctuation mark prediction model;
a third prediction module, configured to input the text to be predicted into the third punctuation mark prediction model and obtain the prediction result of the third punctuation mark prediction model;
a prediction output module, configured to predict the punctuation marks of the text to be predicted according to the prediction results and prediction probability weights of the first punctuation mark prediction model, the second punctuation mark prediction model and the third punctuation mark prediction model.
9. A computer device, characterized in that the computer device comprises a memory and a processor, computer-readable instructions are stored in the memory, and the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation mark prediction method according to any one of claims 1 to 7.
10. A storage medium, characterized in that the storage medium can be read and written by a processor and stores computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the punctuation mark prediction method according to any one of claims 1 to 7.
CN201910515571.1A 2019-06-14 2019-06-14 Punctuation mark prediction method based on multiple prediction models and related equipment Active CN110413987B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910515571.1A CN110413987B (en) 2019-06-14 2019-06-14 Punctuation mark prediction method based on multiple prediction models and related equipment

Publications (2)

Publication Number Publication Date
CN110413987A true CN110413987A (en) 2019-11-05
CN110413987B CN110413987B (en) 2023-05-30

Family

ID=68359082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910515571.1A Active CN110413987B (en) 2019-06-14 2019-06-14 Punctuation mark prediction method based on multiple prediction models and related equipment

Country Status (1)

Country Link
CN (1) CN110413987B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9135231B1 (en) * 2012-10-04 2015-09-15 Google Inc. Training punctuation models
CN107221330A (en) * 2017-05-26 2017-09-29 北京搜狗科技发展有限公司 Punctuate adding method and device, the device added for punctuate
CN107767870A (en) * 2017-09-29 2018-03-06 百度在线网络技术(北京)有限公司 Adding method, device and the computer equipment of punctuation mark
CN108038580A (en) * 2017-12-30 2018-05-15 国网江苏省电力公司无锡供电公司 The multi-model integrated Forecasting Methodology of photovoltaic power based on synchronous extruding wavelet transformation
CN109614627A (en) * 2019-01-04 2019-04-12 平安科技(深圳)有限公司 A kind of text punctuate prediction technique, device, computer equipment and storage medium
CN109858038A (en) * 2019-03-01 2019-06-07 科大讯飞股份有限公司 A kind of text punctuate determines method and device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111145732A (en) * 2019-12-27 2020-05-12 苏州思必驰信息科技有限公司 Processing method and system after multi-task voice recognition
CN111145732B (en) * 2019-12-27 2022-05-10 思必驰科技股份有限公司 Processing method and system after multi-task voice recognition
CN111241810A (en) * 2020-01-16 2020-06-05 百度在线网络技术(北京)有限公司 Punctuation prediction method and device
CN111241810B (en) * 2020-01-16 2023-08-01 百度在线网络技术(北京)有限公司 Punctuation prediction method and punctuation prediction device
CN111261162A (en) * 2020-03-09 2020-06-09 北京达佳互联信息技术有限公司 Speech recognition method, speech recognition apparatus, and storage medium
CN111261162B (en) * 2020-03-09 2023-04-18 北京达佳互联信息技术有限公司 Speech recognition method, speech recognition apparatus, and storage medium
US20210365632A1 (en) * 2020-05-19 2021-11-25 International Business Machines Corporation Text autocomplete using punctuation marks
US11556709B2 (en) * 2020-05-19 2023-01-17 International Business Machines Corporation Text autocomplete using punctuation marks
CN113095062A (en) * 2021-04-12 2021-07-09 阿里巴巴新加坡控股有限公司 Data processing method and device, electronic equipment and computer storage medium

Also Published As

Publication number Publication date
CN110413987B (en) 2023-05-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant