CN110705254B - Text sentence-breaking method and device, electronic equipment and storage medium - Google Patents
- Publication number: CN110705254B (application CN201910927354.3A)
- Authority: CN (China)
- Legal status: Active
- Classifications: Information Retrieval, DB Structures and FS Structures Therefor; Machine Translation
Abstract
An embodiment of the invention provides a text sentence-breaking method and device, an electronic device and a storage medium. The method comprises: determining a character feature vector of each character in a text; inputting the character feature vector of each character into a sentence-break model to obtain the sentence-break probability of each character output by the model, the sentence-break model being trained on sample character feature vectors and sentence-break marks of sample characters in sample text; determining a plurality of candidate sentence-break results based on the sentence-break probability of each character; and determining a sentence-break result based on a preset word number threshold and the plurality of candidate sentence-break results. The method, device, electronic device and storage medium can obtain a sentence-break result in which each clause is no longer than the preset word number threshold while ensuring that local semantics are not cut off, achieving efficient and accurate text sentence breaking and avoiding labor and time costs.
Description
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular, to a text sentence-breaking method and apparatus, an electronic device, and a storage medium.
Background
A caption (subtitle) is the commentary displayed on the playback interface while audio or video is playing, and it helps the audience understand the audio or video content.
At present, sentence breaking of caption text is mostly done manually, which consumes human resources and time. Although the rapid development of natural language processing has made text sentence-breaking technology increasingly mature, caption text usually comes with specific sentence-breaking requirements, and general-purpose text sentence-breaking technology cannot meet them.
Disclosure of Invention
The embodiment of the invention provides a text sentence-breaking method and device, electronic equipment and a storage medium, which are used for solving the problems that the existing caption text sentence-breaking is finished manually and wastes time and labor.
In a first aspect, an embodiment of the present invention provides a text sentence-breaking method, including:
determining a character feature vector of each character in the text;
inputting the character feature vector of each character into a sentence-break model to obtain the sentence-break probability of each character output by the sentence-break model; the sentence-break model is obtained by training based on sample character feature vectors of sample characters in a sample text and sentence-break marks;
determining a plurality of candidate sentence-break results based on the sentence-break probability of each word;
and determining a sentence-break result based on a preset word number threshold value and the plurality of candidate sentence-break results.
Preferably, the determining a plurality of candidate sentence-break results based on the sentence-break probability of each word specifically includes:
constructing a search tree based on the sentence break probability of each word;
determining a plurality of candidate sentence-break results based on the search tree.
Preferably, the determining a sentence-break result based on a preset word count threshold and the plurality of candidate sentence-break results specifically includes:
ranking the plurality of candidate sentence-break results in descending order of sentence-break score; the sentence-break score is determined based on the break probability of the character at each break position in the candidate sentence-break result and the non-break probability of the character at each non-break position;
starting from the first candidate sentence-break result, if the word count of every clause in the current candidate sentence-break result is less than or equal to the preset word number threshold, taking the current candidate sentence-break result as the sentence-break result; otherwise, taking the next candidate sentence-break result as the current candidate sentence-break result.
Preferably, the determining the sentence break result based on a preset word count threshold and the plurality of candidate sentence break results further comprises:
if every candidate sentence-break result contains a clause whose word count is greater than the preset word number threshold, determining the clauses whose word count is greater than the preset word number threshold in the first candidate sentence-break result;
breaking those clauses based on the sentence-break probability of each word in them until the word count of every clause in the first candidate sentence-break result is less than or equal to the preset word number threshold;
and taking the first candidate sentence-break result as the sentence-break result.
Preferably, the breaking of clauses based on the sentence-break probability of each word in the clauses whose word count is greater than the preset word number threshold further comprises:
determining the distance between any word and the position of the last sentence break in the clauses with the word number larger than the preset word number threshold;
determining a distance excitation probability of any word based on the distance;
and updating the sentence break probability of any character based on the sentence break probability of any character and the distance excitation probability.
Preferably, the determining the word feature vector of each word in the text specifically includes:
determining the word feature vector of any word based on the word vector of that word, or based on the word vector and the auxiliary feature vector of that word;
wherein the auxiliary feature vector comprises at least one of a position feature vector, a word co-occurrence feature vector and an acoustic feature vector; the position feature vector represents the position of the word within the word segment to which it belongs, and the word co-occurrence feature vector represents the co-occurrence of the word with sentence breaks.
Preferably, the word co-occurrence feature vector of any word includes mutual information between the word and sentence breaks; the mutual information is determined based on the sentence-break occurrence probability, the word occurrence probability of the word, and the word-break co-occurrence probability.
Preferably, the determining a word feature vector of each word in the text further comprises:
extracting audio data from the audio and video file;
and performing voice recognition on the audio data to obtain the text.
Preferably, the determining a sentence-break result based on a preset word count threshold and the plurality of candidate sentence-break results further comprises:
determining the start and end time boundaries of each clause in the sentence-break result based on the audio data corresponding to the text;
and converting the text into a text in subtitle format based on the sentence-break result and the start and end time boundaries of each clause.
In a second aspect, an embodiment of the present invention provides a text sentence-breaking apparatus, including:
the character feature vector determining unit is used for determining a character feature vector of each character in the text;
the sentence-break probability determining unit is used for inputting the character feature vector of each character into the sentence-break model to obtain the sentence-break probability of each character output by the sentence-break model; the sentence-break model is obtained by training based on sample word feature vectors of sample words in the sample text and sentence-break marks;
a candidate sentence-break result determining unit, configured to determine a plurality of candidate sentence-break results based on the sentence-break probability of each word;
and the sentence break result determining unit is used for determining a sentence break result based on a preset word number threshold value and the candidate sentence break results.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory and a bus, where the processor, the communication interface and the memory communicate with one another through the bus, and the processor can call logic instructions in the memory to perform the steps of the method provided in the first aspect.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the method as provided in the first aspect.
According to the text sentence-breaking method and device, the electronic device and the storage medium, text sentence breaking is performed based on the preset word number threshold and the sentence-break probability of each word, yielding a sentence-break result in which the length of each clause is less than or equal to the preset word number threshold while local semantics are not cut off, thereby achieving efficient and accurate text sentence breaking and avoiding labor and time costs.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a text sentence-breaking method according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating a method for determining a word feature vector according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a sentence break model according to an embodiment of the present invention;
fig. 4 is a schematic flow chart of a sentence break result determination method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a search tree according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a text sentence-breaking device according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the traditional subtitle generation method, the speech in the audio or video first has to be transcribed into caption text by manual listening, and the caption text then has to be broken into sentences so that it meets the subtitle requirements of the audio or video. If the audio or video requires captions of at most N words, a worker has to segment the caption text into clauses of no more than N words each without cutting off local semantics. This process consumes human resources and time, and cannot meet the subtitle synchronization requirement of live programs.
In recent years, with the rapid development of natural language processing, text sentence-breaking technology has become increasingly mature, but general-purpose techniques usually break sentences in units of words and impose no limit on clause length. This differs markedly from the caption sentence-breaking requirements of television program scenarios and cannot meet the requirements of an actual caption system.
To solve the above problems, an embodiment of the present invention provides a text sentence-breaking method. The method can be applied to offline generation of non-real-time subtitles, to online generation of real-time subtitles, and to other scenarios with word-count requirements on clauses. In the following embodiments the subtitle generation scenario is used as the example. Fig. 1 is a schematic flowchart of a text sentence-breaking method according to an embodiment of the present invention; as shown in fig. 1, the method includes: step 110, determining a character feature vector of each character in the text; step 120, inputting the character feature vector of each character into a sentence-break model to obtain the sentence-break probability of each character output by the model; step 130, determining a plurality of candidate sentence-break results based on the sentence-break probability of each character; and step 140, determining a sentence-break result based on a preset word number threshold and the plurality of candidate sentence-break results.
The text here is the text that needs sentence break processing. In a subtitle generating scene, the text may be obtained by performing speech recognition on audio data that needs subtitle production, or may be obtained by performing manual audiometry and recording on the audio data that needs subtitle production, which is not specifically limited in this embodiment of the present invention. In the scene of subtitle off-line generation, audio data is extracted from audio and video files needing subtitle production, and in the scene of subtitle on-line generation, the audio data is obtained by performing real-time voice endpoint detection on audio and video streams.
In existing technical schemes, sentence breaking is usually performed in units of word segments in the text. In the caption scenario, however, there is a word-count requirement on every clause, and breaking sentences in units of word segments cannot guarantee it; the embodiment of the invention therefore breaks sentences in units of single characters, where the splitting operation refers to dividing the text into individual characters. Each character has a corresponding character feature vector that represents the features of that single character; the character feature vector of any character may include the character vector of the character, the word vector of the word segment it belongs to, the statistical probability of a sentence break before and after the character, and so on.
Specifically, the sentence break model is a pre-trained model, and is used for analyzing whether a sentence break is performed at the position of each word based on the input word feature vector of each word, and outputting the sentence break probability of each word. Here, the sentence break probability of any word is used to indicate the probability of performing a sentence break at the position of the word, and the sentence break at the position of the word may refer to a sentence break before the word or a sentence break after the word.
In addition, before step 120 is executed, the sentence-break model may be obtained through pre-training, specifically in the following manner: first, a large number of sample texts are collected and broken into sentences based on the preset word number threshold, yielding a sentence-break mark for each sample word in the sample texts, the mark indicating whether the position of the sample word is a break position. A sample word feature vector is also determined for each sample word in the sample text. The initial model is then trained based on the sample word feature vectors and sentence-break marks of the sample words in the sample texts, thereby obtaining the sentence-break model. The initial model may be a single neural network model or a combination of several neural network models, and the embodiment of the present invention does not specifically limit its type and structure.
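As an illustration of this labelling step, the sketch below derives per-character break marks from a sample caption text whose manually chosen break positions are indicated by a delimiter. The delimiter convention and function name are assumptions made for illustration, not the patent's data format.

```python
# Hypothetical sketch: derive per-character break marks (1 = break after this
# character, 0 = no break) from a sample caption text whose manually marked
# break positions are indicated here by '|'. The '|' convention and the names
# are illustrative assumptions, not the patent's format.
def make_break_labels(marked_text: str, delimiter: str = "|"):
    chars, labels = [], []
    for clause in marked_text.split(delimiter):
        clause = clause.strip()
        if not clause:
            continue
        for i, ch in enumerate(clause):
            chars.append(ch)
            labels.append(1 if i == len(clause) - 1 else 0)  # break after clause-final char
    return chars, labels

chars, labels = make_break_labels("this is a sample|with marked breaks")
print(list(zip(chars, labels))[:5])
```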
Specifically, a candidate sentence-break result is a sentence-break result obtained by breaking the text based on the sentence-break probability of each word. The candidate sentence-break results may be obtained by algorithms such as greedy search or beam search; alternatively, a break threshold may be preset and a sentence break made at any word whose break probability is greater than that threshold, so that different break thresholds determine different candidate sentence-break results.
Specifically, the preset word number threshold is the preset maximum number of words in a clause. For example, when the preset word number threshold is 10, the length of every clause in the sentence-break result obtained by breaking the text is less than or equal to 10. The sentence-break result of the text indicates the positions of the sentence breaks in the text, and the length of every clause corresponding to the sentence-break result is less than or equal to the preset word number threshold.
After a plurality of candidate sentence-break results are obtained, whether the length of every clause in each candidate result is less than or equal to the preset word number threshold can be judged, and a candidate result in which every clause satisfies the threshold is selected from the candidates as the final sentence-break result.
According to the method provided by the embodiment of the invention, text sentence breaking is performed based on the preset word number threshold and the sentence-break probability of each word, yielding a sentence-break result in which the length of each clause is less than or equal to the preset word number threshold while local semantics are not cut off, thereby achieving efficient and accurate text sentence breaking and avoiding labor and time costs.
Based on the foregoing embodiment, in the method, step 110 specifically includes: determining the word feature vector of any word based on the word vector of that word, or based on the word vector and the auxiliary feature vector of that word.
Specifically, the word feature vector of any word may be the word vector of that word alone, or a combination of the word vector and the auxiliary feature vector of that word.
Further, fig. 2 is a schematic flowchart of a method for determining a word feature vector according to an embodiment of the present invention, and as shown in fig. 2, in the method, step 110 specifically includes:
Specifically, for any word, its word vector may be initialized with a word2vec model or a GloVe model, or may be initialized randomly, which is not specifically limited in this embodiment of the present invention.
Specifically, the position feature vector of any character represents the position of the character within the word segment it belongs to; it may indicate whether the character is at the beginning, in the middle, or at the end of the word segment, or only whether it is at the end, or only whether it is at the beginning. For example, the word segment "字幕文本" ("caption text") consists of the four characters "字", "幕", "文" and "本". If the position feature vector is a one-bit discrete value indicating whether the character is at the end of its word segment, with "0" meaning not word-final and "1" meaning word-final, then the position feature vectors of "字", "幕" and "文" are all "0" and that of "本" is "1". If instead the position feature vector is a two-bit discrete value, with "00" meaning the character begins the word segment, "01" meaning it lies inside the word segment, and "11" meaning it ends the word segment, then the position feature vector of "字" is "00", those of "幕" and "文" are both "01", and that of "本" is "11".
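The two-bit encoding described above can be sketched as follows; the handling of single-character word segments is an assumption, since the text does not specify it.

```python
# Illustrative sketch of the two-bit position feature described above:
# '00' = first character of its word segment, '01' = interior character,
# '11' = final character. The tokenizer is assumed to have already split
# the text into word segments (a plain list of strings here); treating a
# single-character word as word-final ('11') is an assumption.
def position_features(word_segments):
    feats = []
    for seg in word_segments:
        for i, _ in enumerate(seg):
            if i == len(seg) - 1:
                feats.append("11")      # word-final (also covers one-character words)
            elif i == 0:
                feats.append("00")
            else:
                feats.append("01")
    return feats

print(position_features(["字幕文本", "断句"]))
# ['00', '01', '01', '11', '00', '11']
```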
The word co-occurrence feature vector of any word represents the co-occurrence of that word with sentence breaks and can be obtained by statistics in advance; it measures the correlation between the word and sentence breaks, and the higher the correlation, the higher the probability of a sentence break at the position of the word.
The acoustic feature vector of any word represents the features of the speech data corresponding to that word, such as intensity, loudness, pitch, pause duration and speech rate.
It should be noted that, in the embodiment of the present invention, the execution order of step 111 and step 112 is not specifically limited, and step 111 may be executed before step 112, may be executed after step 112, and may also be executed synchronously with step 112.
Specifically, after the word vector and the auxiliary feature vector of any word are obtained, the word vector and the auxiliary feature vector may be spliced to obtain the word feature vector of the word.
The method provided by the embodiment of the invention enriches the character feature vector of any character by determining the auxiliary feature vector of the character, and is beneficial to improving the accuracy of sentence break probability.
Based on any of the above embodiments, in the method, the word co-occurrence feature vector of any word in the auxiliary feature vector includes mutual information between the word and sentence breaks; the mutual information is determined based on the sentence-break occurrence probability, the word occurrence probability of the word, and the word-break co-occurrence probability.
In particular, pointwise mutual information (PMI) is used to measure the correlation between two variables. In the embodiment of the invention, the mutual information contained in the word co-occurrence feature vector of any word measures the correlation between the word and sentence breaks.
The sentence-break occurrence probability is the probability of a sentence break occurring, obtained from the statistics of the sentence-break results of a large number of sample texts. For any word, the word occurrence probability is the statistical probability of that word occurring in a large number of sample texts, and the word-break co-occurrence probability is the statistical probability of a sentence break occurring at the position of that word. For example, in a 1000-word text with 100 sentence breaks, suppose the word occurs 50 times and a sentence break occurs at its position 20 times. Then the sentence-break occurrence probability is 100/1000 = 0.1, the word occurrence probability is 50/1000 = 0.05, and the word-break co-occurrence probability is 20/1000 = 0.02, so the mutual information of the word is 0.02/(0.1 × 0.05) = 4 according to the PMI formula.
Further, when the word co-occurrence feature vector is calculated, the mutual information PMI(Φ, W_i) between the word and a sentence break before it, and the mutual information PMI(W_i, Φ) between the word and a sentence break after it, can each be calculated by the PMI formula:
PMI(Φ, W_i) = P(Φ, W_i) / (P(Φ) · P(W_i)),  PMI(W_i, Φ) = P(W_i, Φ) / (P(Φ) · P(W_i))
where Φ denotes a sentence break, W_i is any word, P(Φ, W_i) is the co-occurrence probability of a sentence break immediately before the word, P(W_i, Φ) is the co-occurrence probability of a sentence break immediately after the word, P(Φ) is the sentence-break occurrence probability, and P(W_i) is the word occurrence probability.
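A hedged sketch of this statistic is given below. It follows the ratio form of the worked example (0.02/(0.1 × 0.05) = 4) and assumes the per-character break labels produced by the earlier labelling sketch; only the "break after the word" variant is shown, the "break before the word" variant being analogous.

```python
from collections import Counter

# Hedged sketch of the word/break co-occurrence statistic described above.
# Counts are gathered from labelled sample data (chars + 0/1 break labels);
# the ratio form without a logarithm follows the worked example in the text.
def pmi_after_break(chars, labels):
    n = len(chars)
    char_counts = Counter(chars)
    break_count = sum(labels)
    cooc = Counter(c for c, y in zip(chars, labels) if y == 1)  # break right after char
    p_break = break_count / n
    pmi = {}
    for c, cnt in char_counts.items():
        p_char = cnt / n
        p_cooc = cooc[c] / n
        pmi[c] = p_cooc / (p_break * p_char) if p_break * p_char > 0 else 0.0
    return pmi
```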
According to any of the embodiments, in the method, the acoustic feature vector in the auxiliary feature vector includes a pause duration feature vector, or the pause duration feature vector and the speech rate feature vector.
Specifically, for any word, the pause duration of the word is the time interval between the speech data corresponding to the word and the speech data corresponding to the word that follows it. The pause duration feature vector of the word characterizes this pause duration. There is usually a correlation between pause duration and semantic sentence breaks: the longer the pause, the higher the probability of a sentence break at the position of the word.
When the sentence-break probability is determined based on the pause duration feature vector alone, a longer pause is generally taken to mean a higher probability of a break after the word. However, if the speaker in the speech data talks slowly, relying only on the pause duration feature vector makes the sentence-break probability of every word higher than the actual situation. The embodiment of the invention therefore also uses a speech-rate feature vector when determining the sentence-break probability. The speech-rate feature vector represents the speech rate of the speaker corresponding to the voice data; the speech-rate feature vector of any word may be obtained from the number of words up to and including that word and the duration of the speech data up to that word. The speech-rate feature vector complements the pause duration feature vector, which avoids overly fragmented segmentation when the speaker talks slowly.
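A minimal sketch of these two acoustic features, assuming the speech recognizer supplies a (start, end) timestamp in seconds for every word; both the pause definition (gap to the next word) and the running speech-rate definition are illustrative readings of the description above, not the patent's exact formulas.

```python
# Illustrative acoustic features. char_times is a list of (start, end) times
# in seconds, one pair per word/character; the zero pause for the final item
# is an assumption.
def acoustic_features(char_times):
    feats = []
    for i, (start, end) in enumerate(char_times):
        if i + 1 < len(char_times):
            pause = char_times[i + 1][0] - end            # gap before the next item
        else:
            pause = 0.0
        elapsed = end - char_times[0][0]
        speech_rate = (i + 1) / elapsed if elapsed > 0 else 0.0  # items per second so far
        feats.append((pause, speech_rate))
    return feats

print(acoustic_features([(0.0, 0.2), (0.25, 0.4), (1.2, 1.4)]))
```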
Based on any of the above embodiments, fig. 3 is a schematic structural diagram of a sentence-break model provided in an embodiment of the present invention. As shown in fig. 3, the sentence-break model includes an input layer, a hidden layer and an output layer, where W1, W2, …, Wn respectively represent the n words in a text, and Vec1, Vec2, …, Vecn respectively represent the word feature vectors corresponding to the n words. The hidden layer analyzes each word feature vector, and the output layer outputs Punc1, Punc2, …, Puncn, which respectively represent the sentence-break identifications of the n words; each sentence-break identification corresponds to a sentence-break probability representing the confidence of that identification.
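The text does not fix the network type of the hidden layer (LSTM, BLSTM and self-attention are all named later as candidates), so the sketch below uses a bidirectional LSTM as one plausible instantiation of the input–hidden–output structure of Fig. 3; the dimensions and the sigmoid output are assumptions.

```python
import torch
import torch.nn as nn

# Minimal sketch of the structure in Fig. 3: per-character feature vectors in,
# per-character break probabilities out. The BiLSTM hidden layer is one of the
# candidate architectures named in the text; sizes are illustrative.
class SentenceBreakModel(nn.Module):
    def __init__(self, feat_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, 1)

    def forward(self, char_feats):                       # (batch, seq_len, feat_dim)
        hidden, _ = self.encoder(char_feats)
        return torch.sigmoid(self.out(hidden)).squeeze(-1)   # break prob per char

model = SentenceBreakModel(feat_dim=300)
probs = model(torch.randn(1, 10, 300))                   # (1, 10) break probabilities
```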
Based on any of the above embodiments, fig. 4 is a schematic flow chart of a sentence break result determination method provided in an embodiment of the present invention, as shown in fig. 4, in the method, step 130 specifically includes:
Specifically, assume any word is W_i, and that W_i has a sentence-break probability of P(1|W_i); then W_i has a non-break probability of 1 − P(1|W_i). A search tree is constructed based on the break probability and non-break probability of each word in the text. Fig. 5 is a schematic structural diagram of a search tree according to an embodiment of the present invention. As shown in fig. 5, each word corresponds to two nodes, indicating a break or no break at the position of that word, and each node in turn has two child nodes indicating a break or no break at the position of the next word, so that the search tree formed by these nodes covers every combination of breaking or not breaking at each word in the text. The sum of the probabilities corresponding to the nodes on each path in the search tree is calculated as the score of that path, and the several highest-scoring paths, or the paths whose scores are greater than or equal to a preset score threshold, are selected as candidate sentence-break results.
Referring to fig. 5, the leftmost path corresponds to a sentence break at the position of every word and the rightmost path corresponds to no sentence break at any position. Assuming n = 10, the break probability and non-break probability of each word are shown in the following table:
i                                  | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10
Break probability P(1|W_i)         | 0.1 | 0.2 | 0.7 | 0.2 | 0.4 | 0.9 | 0.1 | 0.2 | 0.3 | 0.7
Non-break probability 1 − P(1|W_i) | 0.9 | 0.8 | 0.3 | 0.8 | 0.6 | 0.1 | 0.9 | 0.8 | 0.7 | 0.3
For path A, the values corresponding to the 3rd and 6th nodes are break probabilities and the values corresponding to the other nodes are non-break probabilities, so the score of path A is the sum of the break probabilities of the 3rd and 6th words and the non-break probabilities of the 1st, 2nd, 4th, 5th, 7th, 8th, 9th and 10th words. For path B, the values corresponding to the 3rd, 6th and 10th nodes are break probabilities and the values corresponding to the other nodes are non-break probabilities, so the score of path B is the sum of the break probabilities of the 3rd, 6th and 10th words and the non-break probabilities of the 1st, 2nd, 4th, 5th, 7th, 8th and 9th words.
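The path scoring just described can be sketched as a beam search that, instead of expanding the full tree, keeps only the highest-scoring partial paths at each step; the beam width is an assumed parameter. The example reuses the break probabilities from the table above.

```python
# Hedged sketch of the path scoring described above: every character either
# breaks (contribution p) or does not (contribution 1 - p); a path's score is
# the sum of these contributions, and a beam keeps only the best partial paths.
def candidate_breaks(break_probs, beam_width: int = 5):
    beams = [([], 0.0)]                                   # (break positions, score)
    for i, p in enumerate(break_probs):
        expanded = []
        for positions, score in beams:
            expanded.append((positions + [i], score + p))       # break after char i
            expanded.append((positions, score + (1.0 - p)))     # no break
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams

probs = [0.1, 0.2, 0.7, 0.2, 0.4, 0.9, 0.1, 0.2, 0.3, 0.7]
for positions, score in candidate_breaks(probs, beam_width=3):
    print(positions, round(score, 2))
# the best path breaks exactly where p > 0.5, i.e. after the 3rd, 6th and 10th
# characters, matching path B above
```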
Based on any of the above embodiments, in the method, step 140 specifically includes: step 141, arranging the plurality of candidate sentence-break results in descending order of sentence-break score, the sentence-break score being determined based on the break probability of the word at each break position in the candidate result and the non-break probability of the word at each non-break position; then, starting from the first candidate result, if the word count of every clause in the current candidate result is less than or equal to the preset word number threshold, taking the current candidate result as the sentence-break result, and otherwise taking the next candidate result as the current candidate result.
Specifically, the sentence-break score may be the score of the corresponding path on the search tree constructed in step 131, i.e. the sum of the break probabilities of the words at the break positions and the non-break probabilities of the words at the non-break positions in the candidate result; alternatively, it may be the score obtained by feeding those probabilities into a scoring model trained in advance on sample sentence-break results and their corresponding scores.
Suppose 3 candidate sentence-break results are obtained and arranged in descending order of sentence-break score. The clause word counts of the first candidate result are 7, 9 and 12, those of the second are 10 and 9, and those of the third are 12, 7 and 9. Assuming the preset word number threshold is 10, it is first judged whether the first candidate result satisfies the threshold; since it contains a clause with more than 10 words, the second candidate result is judged next, every clause of which has a length less than or equal to 10, so the second candidate result is taken as the final sentence-break result.
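A sketch of this selection rule is shown below, assuming the candidates are already sorted by score and represented as lists of break positions (as in the beam-search sketch above); the names and the fallback return value are illustrative.

```python
# Sketch of the selection rule above: candidates are already sorted by score,
# and the first one whose every clause has at most `max_len` characters wins.
def clause_lengths(n_chars, break_positions):
    lengths, prev = [], -1
    for pos in sorted(break_positions):
        lengths.append(pos - prev)
        prev = pos
    if prev < n_chars - 1:
        lengths.append(n_chars - 1 - prev)
    return lengths

def select_result(n_chars, ranked_candidates, max_len=10):
    for positions, _score in ranked_candidates:
        if all(l <= max_len for l in clause_lengths(n_chars, positions)):
            return positions
    return None   # handled by the long-clause re-breaking step described next
```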
According to any of the above embodiments, in the method, step 140 further includes: step 142, if every candidate sentence-break result contains a clause whose word count is greater than the preset word number threshold, determining the clauses whose word count is greater than the threshold in the first candidate result; step 143, breaking those clauses based on the sentence-break probability of each word in them until the word count of every clause in the first candidate result is less than or equal to the threshold; and taking the first candidate result as the sentence-break result.
Specifically, when breaking a clause whose word count is greater than the preset word number threshold based on the break probability of each word in it, the break may be made at the position of the word with the largest break probability, or at the positions of the words whose break probability exceeds a preset probability threshold.
After such a clause has been broken, it is judged again whether the first candidate result still contains a clause whose word count exceeds the threshold; if not, the first candidate result is taken as the sentence-break result, and if so, the over-long clauses are determined again and broken further.
Assume the preset word number threshold is 10 and that, among the 3 candidate results, the clause word counts of the first are 7, 9 and 12, those of the second are 11, 9 and 8, and those of the third are 12, 7 and 9, so that every candidate result contains a clause longer than the threshold. In this case, among the clauses of the first candidate result, the clause with more than 10 words is determined to be the third clause, whose word count is 12. The break probability of each word in the third clause is as follows:
i                          | 17  | 18  | 19  | 20  | 21  | 22  | 23  | 24  | 25  | 26  | 27  | 28
Break probability P(1|W_i) | 0.1 | 0.2 | 0.7 | 0.2 | 0.4 | 0.9 | 0.1 | 0.2 | 0.3 | 0.7 | 0.1 | 0.2
In the third clause, the break probability of the 22nd word is the largest, so a break is made at the position of the 22nd word, yielding two clauses of length 6. The clause word counts of the first candidate result are now 7, 9, 6 and 6, all less than 10, and the first candidate result is taken as the sentence-break result.
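The fallback just illustrated can be sketched as follows: any clause longer than the threshold is repeatedly split at its most probable break position until every clause fits. Representing a result as a sorted list of break positions is an assumption carried over from the earlier sketches, and the distance excitation described next is omitted here.

```python
# Hedged sketch of the fallback above: repeatedly split any clause longer than
# `max_len` at the character with the highest break probability inside it.
def rebreak_long_clauses(break_probs, break_positions, max_len=10):
    positions = sorted(break_positions)
    while True:
        bounds = [-1] + positions + [len(break_probs) - 1]
        long_clause = None
        for start, end in zip(bounds, bounds[1:]):
            if end - start > max_len:
                long_clause = (start + 1, end)       # char indices inside the clause
                break
        if long_clause is None:
            return positions
        s, e = long_clause
        # break after the most probable character, excluding the clause-final one
        best = max(range(s, e), key=lambda i: break_probs[i])
        positions = sorted(positions + [best])
```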
According to any of the above embodiments, the method further includes, between step 142 and step 143: determining the distance between any word in a clause whose word count is greater than the preset word number threshold and the position of the previous sentence break; determining a distance excitation probability for the word based on that distance; and updating the break probability of the word based on its break probability and the distance excitation probability.
Specifically, when no candidate sentence-break result satisfies the preset word number threshold, the clauses in the first candidate result whose word count exceeds the threshold must be broken further. To satisfy the threshold as far as possible, the embodiment of the invention uses the distance between a word and the position of the previous break as an excitation condition on the word's break probability: the farther a word is from the previous break, the stronger the excitation. The excitation is embodied by the distance excitation probability, i.e. a probability, derived from the distance, that encourages a break at the position of the word; the break probability is then updated based on the distance excitation probability and the original break probability, yielding the break probability under distance excitation. The distance excitation probability and the break probability may simply be added to obtain the updated break probability, or they may be weighted and combined.
Further, the excitation formula is as follows:
where P(1|W_i) is the break probability of the i-th word W_i, P′(1|W_i) is the break probability of W_i under distance excitation, α and β are adjustment parameters, N is the preset word number threshold, and l is the distance from W_i to the position of the previous sentence break.
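The excitation formula itself is not reproduced in this text, so the sketch below uses one plausible form consistent with the variable definitions above — an increment that grows with the distance l relative to the threshold N, tuned by α and β. This form is purely an assumption for illustration, not the patent's formula.

```python
# Assumed illustrative excitation: the break probability is boosted by a term
# that grows with the distance l from the previous break, scaled by the
# threshold N and tuned by alpha and beta. Not the patent's exact formula.
def excited_break_prob(p_break, l, N, alpha=0.5, beta=2.0):
    return min(1.0, p_break + alpha * (l / N) ** beta)

print(excited_break_prob(0.3, l=9, N=10))   # far from the last break -> boosted
print(excited_break_prob(0.3, l=1, N=10))   # close to the last break -> barely changed
```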
Based on any of the above embodiments, in the method, step 110 further includes: extracting audio data from the audio and video file; and carrying out voice recognition on the audio data to obtain a text.
Here, the audio/video file is an audio file or a video file that needs to be subjected to caption production. The audio and video files can be pre-recorded files or can be generated in real time in the live broadcasting process. After the audio and video file is determined, audio data are extracted according to the audio and video file, and voice recognition is carried out on the audio data, so that a text needing sentence breaking can be obtained.
According to any of the above embodiments, the method further includes, after step 140: determining the start and end time boundaries of each clause in the sentence-break result based on the audio data corresponding to the text; and converting the text into a text in subtitle format based on the sentence-break result and the start and end time boundaries of each clause.
Specifically, after the sentence-break result is obtained, each clause in the sentence-break result is aligned with the audio data through a forced alignment algorithm to obtain the corresponding start and end time boundaries of each clause in the audio data. In particular, the specific position of each word of the text in the audio data can be obtained through a forward-backward algorithm, and from it the specific position of each clause in the audio data, namely its start and end time boundaries.
After the start and end time boundaries of each clause are obtained, the text is format-converted based on the sentence-break result and those boundaries, yielding the subtitle-format text. For example, for the text "this is a sentence-break model the effect is very good", the sentence-break result splits it into the clauses "this is a sentence-break model" and "the effect is very good"; the start and end time boundaries of the first clause are 1 s and 4 s, and those of the second clause are 4.5 s and 7 s, so the resulting subtitle-format text is as follows:
1.000 4.000 this is a sentence-break model
4.500 7.000 the effect is very good
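A sketch of this final formatting step, assuming each clause already carries the start and end times obtained from forced alignment; the "start end text" layout mirrors the example above, and a real system might emit SRT or WebVTT instead.

```python
# Sketch of the final formatting step, assuming each clause already carries its
# start/end time boundaries from forced alignment.
def to_subtitle_lines(clauses_with_times):
    return "\n".join(f"{start:.3f} {end:.3f} {text}"
                     for text, start, end in clauses_with_times)

print(to_subtitle_lines([
    ("this is a sentence-break model", 1.0, 4.0),
    ("the effect is very good", 4.5, 7.0),
]))
```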
Based on any one of the above embodiments, an embodiment of the present invention provides a sentence break model training method, which specifically includes the following steps:
first, a large amount of sample texts meeting the requirements of the subtitle clauses of the media scene are collected. Here, the sample text is a subtitle text where a sentence break position is manually marked in an actual service, for example, a subtitle text manually added in a broadcasting process of various television programs.
After the sample text is obtained, the sample text is split into independent sample words, and a sentence break identifier of each sample word in the sample text is determined, wherein the sentence break identifier is used for representing whether a sentence is broken at the position of the sample word, for example, if the sentence break identifier is 0, the sentence is not broken at the position of the sample word, and if the sentence break identifier is 1, the sentence is broken at the position of the sample word.
Further, a word vector and an assistant feature vector for each sample word in the sample text are determined, and a word feature vector for any sample word is determined by splicing the word vector and the assistant feature vector for any sample word.
The initial model is then trained based on the sample character feature vectors and sentence-break identifiers of the sample characters in the sample text, thereby obtaining the sentence-break model. Here, the initial model may be a long short-term memory network (LSTM), a bidirectional long short-term memory network (BLSTM), a self-attention mechanism, or the like, which is not specifically limited in this embodiment of the present invention.
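A hedged training sketch in the same vein as the model sketch above: per-character binary cross-entropy between the predicted break probability and the 0/1 break mark. The optimizer, learning rate and batching are illustrative choices, not specified by the text.

```python
import torch
import torch.nn as nn

# Hedged training step for the sentence-break model sketched earlier: binary
# cross-entropy between per-character break probabilities and 0/1 break marks.
def train_step(model, optimizer, char_feats, break_labels):
    # char_feats: (batch, seq_len, feat_dim); break_labels: (batch, seq_len) of 0/1
    optimizer.zero_grad()
    probs = model(char_feats)
    loss = nn.functional.binary_cross_entropy(probs, break_labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()

# usage with the SentenceBreakModel sketched earlier (illustrative):
# model = SentenceBreakModel(feat_dim=300)
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# loss = train_step(model, optimizer, feats_batch, labels_batch)
```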
Based on any one of the above embodiments, an embodiment of the present invention provides a subtitle generating method, which specifically includes the following steps:
under a non-real-time subtitle offline generation scene, extracting audio data from an audio/video file needing subtitle production; and under the scene of real-time subtitle on-line generation, audio data is obtained by carrying out real-time voice endpoint detection on the audio and video stream. And then inputting the audio data into a voice recognition system for voice recognition, wherein in the process, the voice recognition system performs slicing processing on the audio data according to the pause information in the audio data and outputs a sliced text, namely a subtitle text.
And after the subtitle text is obtained, preprocessing the subtitle text. Here the preprocessing steps include word segmentation, word vector conversion and assist feature vector extraction. The auxiliary feature vectors include position feature vectors, word co-occurrence feature vectors and acoustic feature vectors, the word co-occurrence feature vectors can be obtained by querying from a word co-occurrence feature table counted in advance, and pause duration feature vectors and speech speed feature vectors in the acoustic feature vectors need to be determined by combining audio data.
After the word vector and auxiliary feature vector of each word in the subtitle text are obtained, the word feature vector of any word is determined by concatenating its word vector and auxiliary feature vector. The word feature vector of each word in the caption text is then input into the sentence-break model, and the sentence-break probability of each word is computed and output through a forward pass of the model.
After the break probability of each character is obtained, a plurality of candidate sentence-break results and the sentence-break score of each candidate result are determined through beam search based on the break probability and non-break probability of each character; the sentence-break score is the sum of the break probabilities of the characters at the break positions and the non-break probabilities of the characters at the non-break positions in the candidate result.
Then, arranging a plurality of candidate sentence-break results based on the sequence of sentence-break scores from large to small, starting from the first candidate sentence-break result, judging whether the word number of each sentence in the current candidate sentence-break result is less than or equal to a preset word number threshold one by one, if so, taking the current candidate sentence-break result as the sentence-break result, otherwise, updating the next candidate sentence-break result as the current candidate sentence-break result for judgment.
If every candidate sentence-break result contains a clause whose word count is greater than the preset word number threshold, the first candidate result is analyzed: the clauses whose word count exceeds the threshold are determined, a break is made at the position of the word with the largest break probability in each such clause, and after the break it is checked again whether the first candidate result still contains an over-long clause. If not, the first candidate result is taken as the sentence-break result; if so, the over-long clauses are broken again until no clause in the first candidate result exceeds the threshold.
After the sentence-breaking result is obtained, aligning each clause in the sentence-breaking result with the audio and video through a forced alignment algorithm to obtain a corresponding front-rear time boundary of each clause in the audio and video. And after the front and rear time boundaries of each clause are obtained, carrying out format conversion on the subtitle text based on the sentence break result and the front and rear time boundaries of each clause. And obtaining the text in the subtitle format.
According to the method provided by the embodiment of the invention, the caption text is broken based on the preset word number threshold and the sentence-break probability of each word, yielding a sentence-break result in which the length of each clause is less than or equal to the preset word number threshold while local semantics are not cut off. This achieves efficient and accurate caption text sentence breaking, avoids labor and time costs, improves the real-time performance of caption sentence breaking, and helps speed up subtitle generation. In addition, determining an auxiliary feature vector for each word enriches the word feature vector and helps improve the accuracy of the sentence-break probability.
Based on any of the above embodiments, fig. 6 is a schematic structural diagram of a text sentence-breaking device provided in an embodiment of the present invention, as shown in fig. 6, the device includes a word feature vector determining unit 610, a sentence-breaking probability determining unit 620, a candidate sentence-breaking result determining unit 630, and a sentence-breaking result determining unit 640;
the word feature vector determining unit 610 is configured to determine a word feature vector of each word in the text;
The sentence break probability determining unit 620 is configured to input the word feature vector of each word into a sentence-break model, so as to obtain the sentence-break probability of each word output by the sentence-break model; the sentence-break model is obtained by training based on sample word feature vectors of sample words in a sample text and sentence-break marks;
the candidate sentence-break result determining unit 630 is configured to determine a plurality of candidate sentence-break results based on the sentence-break probability of each word;
the sentence break result determining unit 640 is configured to determine a sentence break result based on a preset word count threshold and the plurality of candidate sentence break results.
The device provided by the embodiment of the invention can be used for text sentence breaking based on the preset word number threshold and the sentence breaking probability of each word, so that the local semantics are not cut off, and the sentence breaking result of which the length of each sentence is less than or equal to the preset word number threshold is obtained, thereby realizing efficient and accurate text sentence breaking and avoiding the loss of labor cost and time cost.
Based on any of the above embodiments, in the apparatus, the candidate sentence-break result determining unit 630 is specifically configured to:
constructing a search tree based on the sentence break probability of each word;
based on the search tree, a plurality of candidate sentence-break results are determined.
Based on any of the above embodiments, in the apparatus, the sentence break result determining unit 640 includes:
a ranking subunit, configured to rank the plurality of candidate sentence-break results in descending order of sentence-break score; the sentence-break score is determined based on the break probability of the character at each break position in the candidate sentence-break result and the non-break probability of the character at each non-break position;
a word count judging subunit, configured to start from the first candidate sentence-break result and, if the word count of every clause in the current candidate sentence-break result is less than or equal to the preset word number threshold, take the current candidate sentence-break result as the sentence-break result, and otherwise take the next candidate sentence-break result as the current candidate sentence-break result.
Based on any of the above embodiments, in the apparatus, the sentence break result determining unit 640 further includes:
an over-long clause breaking subunit, configured to determine, if every candidate sentence-break result contains a clause whose word count is greater than the preset word number threshold, the clauses in the first candidate sentence-break result whose word count is greater than the preset word number threshold;
break those clauses based on the sentence-break probability of each word in them until the word count of every clause in the first candidate sentence-break result is less than or equal to the preset word number threshold;
and taking the first candidate sentence-breaking result as the sentence-breaking result.
According to any of the above embodiments, in the apparatus, the over-long clause breaking subunit is further configured to:
determining the distance between any word and the position of the last sentence break in the clauses with the word number larger than the preset word number threshold;
determining a distance excitation probability of any word based on the distance;
and updating the sentence break probability of any character based on the sentence break probability and the distance excitation probability of any character.
Based on any of the above embodiments, in the apparatus, the word feature vector determination unit 610 is specifically configured to:
determining the word feature vector of any word based on the word vector of that word, or based on the word vector and the auxiliary feature vector of that word;
wherein the auxiliary feature vector comprises at least one of a position feature vector, a word co-occurrence feature vector and an acoustic feature vector; the position feature vector represents the position of the word within the word segment to which it belongs, and the word co-occurrence feature vector represents the co-occurrence of the word with sentence breaks.
Based on any of the above embodiments, in the apparatus, the word co-occurrence feature vector of any word includes mutual information between the word and sentence breaks; the mutual information is determined based on the sentence-break occurrence probability, the word occurrence probability of the word, and the word-break co-occurrence probability.
Based on any of the above embodiments, the apparatus further comprises:
the text acquisition unit is used for extracting audio data from the audio and video file; and performing voice recognition on the audio data to obtain the text.
Based on any of the above embodiments, the apparatus further comprises:
the caption generating unit is used for determining the start and end time boundaries of each clause in the sentence-break result based on the audio data corresponding to the text;
and converting the text into a text in subtitle format based on the sentence-break result and the start and end time boundaries of each clause.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 7, the electronic device may include: a processor (processor) 710, a communication Interface (Communications Interface) 720, a memory (memory) 730, and a communication bus 740, wherein the processor 710, the communication Interface 720, and the memory 730 communicate with each other via the communication bus 740. Processor 710 may call logic instructions in memory 730 to perform the following method: determining a character feature vector of each character in the text; inputting the character feature vector of each character into a sentence break model to obtain the sentence break probability of each character output by the sentence break model; the sentence break model is obtained by training based on sample word feature vectors of sample words in the sample text and sentence break marks; determining a plurality of candidate sentence-break results based on the sentence-break probability of each word; and determining a sentence-break result based on a preset word number threshold value and the plurality of candidate sentence-break results.
In addition, the logic instructions in the memory 730 can be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Embodiments of the present invention further provide a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the method provided in the foregoing embodiments, the method including: determining a character feature vector of each character in the text; inputting the character feature vector of each character into a sentence break model to obtain the sentence break probability of each character output by the sentence break model; the sentence break model is obtained by training based on sample word feature vectors of sample words in the sample text and sentence break marks; determining a plurality of candidate sentence-break results based on the sentence-break probability of each word; and determining a sentence-break result based on a preset word number threshold and the plurality of candidate sentence-break results.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (11)
1. A text sentence-breaking method, comprising:
determining a character feature vector of each character in the text;
inputting the character feature vector of each character into a sentence-breaking model to obtain the sentence-breaking probability of each character output by the sentence-breaking model; the sentence break model is obtained by training based on sample word feature vectors of sample words in the sample text and sentence break marks;
determining a plurality of candidate sentence-break results based on the sentence-break probability of each word;
determining a sentence-break result based on a preset word number threshold and the plurality of candidate sentence-break results;
the determining a plurality of candidate sentence-break results based on the sentence-break probability of each word specifically includes:
constructing a search tree based on the sentence break probability of each word;
determining a plurality of candidate sentence-break results based on the search tree;
the determining a sentence-break result based on a preset word number threshold and the plurality of candidate sentence-break results comprises:
and selecting, from the plurality of candidate sentence-break results, a candidate sentence-break result in which the length of each clause is less than or equal to the preset word number threshold as the sentence-break result.
2. The text sentence-breaking method of claim 1, wherein the selecting, from the plurality of candidate sentence-break results, a candidate sentence-break result in which the length of each clause is less than or equal to the preset word number threshold as the sentence-break result specifically comprises:
ranking the plurality of candidate sentence-break results in descending order of sentence-break score, wherein the sentence-break score is determined based on the sentence-break probability of the character corresponding to each sentence-break position in the candidate sentence-break result;
starting from the first candidate sentence-break result, if the word number of each clause in the current candidate sentence-break result is less than or equal to the preset word number threshold, taking the current candidate sentence-break result as the sentence-break result; otherwise, updating the next candidate sentence-break result as the current candidate sentence-break result.
3. The text sentence-breaking method of claim 2, wherein the determining a sentence-break result based on a preset word number threshold and the plurality of candidate sentence-break results further comprises:
if every candidate sentence-break result contains a clause whose word number is larger than the preset word number threshold, determining the clauses whose word number is larger than the preset word number threshold in the first candidate sentence-break result;
performing sentence breaking based on the sentence-break probability of each word in the clauses whose word number is larger than the preset word number threshold, until the word number of each clause in the first candidate sentence-break result is less than or equal to the preset word number threshold;
and taking the first candidate sentence-break result as the sentence-break result.
4. The text sentence-breaking method of claim 3, wherein the performing sentence breaking based on the sentence-break probability of each word in the clauses whose word number is larger than the preset word number threshold specifically comprises:
determining, for any word in a clause whose word number is larger than the preset word number threshold, the distance between the word and the position of the previous sentence break;
determining a distance excitation probability of the word based on the distance;
and updating the sentence-break probability of the word based on the sentence-break probability of the word and the distance excitation probability.
5. The text sentence-breaking method of claim 1, wherein the determining a character feature vector of each character in the text specifically comprises:
determining the character feature vector of any character based on the character vector of that character, or based on the character vector and the supplementary feature vector of that character;
wherein the supplementary feature vector comprises at least one of a position feature vector, a character co-occurrence feature vector and an acoustic feature vector; the position feature vector represents the position of the character within the segmented word to which it belongs, and the character co-occurrence feature vector represents the co-occurrence of the character with a sentence break.
6. The text sentence-breaking method of claim 5, wherein the character co-occurrence feature vector of any character includes mutual information between the character and the sentence break; the mutual information is determined based on the occurrence probability of the sentence break, the occurrence probability of the character, and the probability of the character and the sentence break co-occurring.
7. The text sentence-breaking method of claim 1, wherein before the determining a character feature vector of each character in the text, the method further comprises:
extracting audio data from the audio and video file;
and performing voice recognition on the audio data to obtain the text.
8. The text sentence-breaking method of claim 1, wherein after the determining a sentence-break result based on a preset word number threshold and the plurality of candidate sentence-break results, the method further comprises:
determining the start and end time boundaries of each clause in the sentence-break result based on the audio data corresponding to the text;
and converting the text into a text in subtitle format based on the sentence-break result and the start and end time boundaries of each clause.
9. A text sentence-breaking apparatus, comprising:
the character feature vector determining unit is used for determining a character feature vector of each character in the text;
the sentence break probability determining unit is used for inputting the character feature vector of each character into a sentence break model to obtain the sentence break probability of each character output by the sentence break model; the sentence break model is obtained by training based on sample word feature vectors of sample words in the sample text and sentence break marks;
a candidate sentence-break result determining unit for determining a plurality of candidate sentence-break results based on the sentence-break probability of each word;
a sentence-break result determining unit for determining a sentence-break result based on a preset word number threshold and the plurality of candidate sentence-break results;
the candidate sentence-break result determining unit is specifically configured to:
constructing a search tree based on the sentence break probability of each word;
determining a plurality of candidate sentence-breaking results based on the search tree;
the sentence-break result determining unit is specifically configured to:
and selecting, from the plurality of candidate sentence-break results, a candidate sentence-break result in which the length of each clause is less than or equal to the preset word number threshold as the sentence-break result.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the text sentence breaking method according to any of claims 1 to 8 are implemented by the processor when executing the program.
11. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the text sentence-breaking method according to any one of claims 1 to 8.
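Claims 3 and 4 describe how an over-long clause is broken further: a distance excitation probability that grows with the distance from the previous sentence break is combined with each character's break probability, and the clause is cut until every piece fits the threshold. The sketch below illustrates that idea under the assumption of a simple normalized-distance excitation, since the exact excitation function is not given in the patent; all names are illustrative.

```python
def rebreak_long_clause(clause, probs, max_len):
    """Split a clause whose length exceeds max_len.
    probs: break probability of each character in the clause, as output
    by the sentence-break model (indexed by position within the clause).
    Each character's probability is combined with a distance excitation
    term (here: distance from the previous break divided by max_len, a
    placeholder for whatever excitation is actually used), and the
    clause is cut at the best-scoring position until every piece fits."""
    pieces, start = [], 0
    while len(clause) - start > max_len:
        window = range(start, start + max_len)

        def boosted(i):
            distance = i - start + 1          # distance from the previous break
            return probs[i] + distance / max_len  # excitation grows with distance

        cut = max(window, key=boosted)
        pieces.append(clause[start:cut + 1])
        start = cut + 1
    if start < len(clause):
        pieces.append(clause[start:])
    return pieces
```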
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910927354.3A CN110705254B (en) | 2019-09-27 | 2019-09-27 | Text sentence-breaking method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110705254A CN110705254A (en) | 2020-01-17 |
CN110705254B (en) | 2023-04-07
Family
ID=69197110
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910927354.3A Active CN110705254B (en) | 2019-09-27 | 2019-09-27 | Text sentence-breaking method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110705254B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111652002B (en) * | 2020-06-16 | 2023-04-18 | 抖音视界有限公司 | Text division method, device, equipment and computer readable medium |
CN112002328B (en) * | 2020-08-10 | 2024-04-16 | 中央广播电视总台 | Subtitle generation method and device, computer storage medium and electronic equipment |
CN114125571B (en) * | 2020-08-31 | 2024-07-30 | 小红书科技有限公司 | Subtitle generating method, subtitle testing method and subtitle processing device |
CN113392639B (en) * | 2020-09-30 | 2023-09-26 | 腾讯科技(深圳)有限公司 | Title generation method, device and server based on artificial intelligence |
CN113436617B (en) * | 2021-06-29 | 2023-08-18 | 平安科技(深圳)有限公司 | Voice sentence breaking method, device, computer equipment and storage medium |
CN114420102B (en) * | 2022-01-04 | 2022-10-14 | 广州小鹏汽车科技有限公司 | Method and device for speech sentence-breaking, electronic equipment and storage medium |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017140221A1 (en) * | 2016-02-18 | 2017-08-24 | Tencent Technology (Shenzhen) Co., Ltd. | Text information processing method and device |
WO2018036555A1 (en) * | 2016-08-25 | 2018-03-01 | Tencent Technology (Shenzhen) Co., Ltd. | Session processing method and apparatus |
CN107247706A (en) * | 2017-06-16 | 2017-10-13 | China Electronics Standardization Institute | Text punctuation model establishing method, punctuation method, device and computer equipment |
CN109145282A (en) * | 2017-06-16 | 2019-01-04 | Guizhou Xiaoai Robot Technology Co., Ltd. | Punctuation model training method, punctuation method, apparatus and computer equipment |
Non-Patent Citations (1)
Title |
---|
Research on sentence segmentation and punctuation marking methods for Classical Chinese; Wang Chuan et al.; Journal of Henan University (Natural Science Edition); 2009-09-16 (No. 05); full text *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||