CN109002433A - A kind of document creation method and device - Google Patents

A text generation method and device

Info

Publication number
CN109002433A
Authority
CN
China
Prior art keywords
word sequence
text
word
generated
sequence
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810540691.2A
Other languages
Chinese (zh)
Other versions
CN109002433B (en)
Inventor
祝文博
李超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mobvoi Innovation Technology Co Ltd
Original Assignee
Chumen Wenwen Information Technology Co Ltd
Application filed by Chumen Wenwen Information Technology Co Ltd
Priority to CN201810540691.2A
Publication of CN109002433A
Application granted
Publication of CN109002433B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/258 Heading extraction; Automatic titling; Numbering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

An embodiment of the present invention provides a text generation method and device. The method includes: obtaining a keyword and a topic corresponding to a target text to be generated, where the keyword is part of the character sequence formed by the first characters of the text sequences in the target text to be generated; generating the first text sequence of the target text with a pre-trained text generation model, based on the keyword and the topic; generating the other text sequences of the target text with the text generation model according to a preset rhyming rule, based at least on the first text sequence and a preset final (the vowel part of a Chinese syllable); and combining the first text sequence and the other text sequences in the order in which they were generated to obtain a rhymed target text.

Description

A text generation method and device
Technical field
Embodiments of the present invention relate to the field of natural language processing, and in particular to a text generation method and device.
Background
Techniques for automatically generating text with a computer, such as poems, lyrics, and dialogue, belong to the field of natural language processing. They draw mainly on computational linguistics, artificial intelligence, and deep learning to study and simulate the process by which humans produce natural language text. Poetry is a crystallization of human language, characterized by metrical form, antithesis, and rhyme. The acrostic poem is a special form of poetry in which the first characters of the lines spell out what the author wants to express; acrostic poems are regarded as rich in meaning and of high literary value.
With the rapid development of computational linguistics, artificial intelligence, and deep learning, text is now commonly generated with a seq2seq (Sequence to Sequence) model that uses neural networks (NN) as the encoder and the decoder. Because a seq2seq model generates each text sequence from a probability distribution, text generated directly by such a model is often unrhymed, which considerably reduces its aesthetic appeal. Existing text generation methods are therefore unsatisfactory and produce poor results.
Summary of the invention
In view of this, embodiments of the present invention provide a text generation method and device. One purpose of the embodiments is to combine a text generation model with a preset rhyming rule so as to generate text that rhymes on a preset final (the vowel part of a Chinese syllable).
To achieve the above purpose, the embodiments of the present invention mainly provide the following technical solutions:
In a first aspect, an embodiment of the present invention provides a text generation method, including: obtaining a keyword and a topic corresponding to a target text to be generated, where the keyword is part of the character sequence formed by the first characters of the text sequences in the target text to be generated; generating, based on the keyword and the topic, the first text sequence of the target text with a pre-trained text generation model; generating, based at least on the first text sequence and a preset final and according to a preset rhyming rule, the other text sequences of the target text with the text generation model; and combining the first text sequence and the other text sequences in the order in which they were generated to obtain a rhymed target text.
In a second aspect, an embodiment of the present invention provides a text generation device, including a first obtaining unit, a first generating unit, a second generating unit, and a second obtaining unit. The first obtaining unit is configured to obtain a keyword and a topic corresponding to a target text to be generated, where the keyword is part of the character sequence formed by the first characters of the text sequences in the target text to be generated. The first generating unit is configured to generate, based on the keyword and the topic, the first text sequence of the target text with a pre-trained text generation model. The second generating unit is configured to generate, based at least on the first text sequence and a preset final and according to a preset rhyming rule, the other text sequences of the target text with the text generation model. The second obtaining unit is configured to combine the first text sequence and the other text sequences in the order in which they were generated to obtain a rhymed target text.
In a third aspect, an embodiment of the present invention provides a storage medium that includes a stored program, where, when the program runs, the device on which the storage medium resides is controlled to execute the above text generation method.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including: at least one processor; and at least one memory and a bus connected to the processor; where the processor and the memory communicate with each other through the bus, and the processor is configured to call program instructions in the memory to execute the above text generation method.
In the text generation method and device provided by the embodiments of the present invention, a keyword and a topic corresponding to the target text to be generated are obtained first, where the keyword is part of the character sequence formed by the first characters of the text sequences in the target text. The first text sequence of the target text is then generated with a pre-trained text generation model, based on the keyword and the topic. Next, the other text sequences of the target text are generated with the text generation model according to a preset rhyming rule, based at least on the first text sequence and a preset final. Finally, the first text sequence and the other text sequences are combined in the order in which they were generated to obtain the required rhymed target text. Because the other text sequences are generated to rhyme on the preset final in accordance with the preset rhyming rule, the target text assembled from the first text sequence and the other text sequences is rhymed. This improves the quality of the generated text and the user experience.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is a flow diagram of the text generation method in Embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of the text generation system in Embodiment 2 of the present invention;
Fig. 3 is a flow diagram of the text generation method in Embodiment 2 of the present invention;
Fig. 4 is a structural diagram of the text generation device in Embodiment 3 of the present invention;
Fig. 5 is a structural diagram of the electronic device in Embodiment 4 of the present invention.
Detailed description
Exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the invention, it should be understood that the invention may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the invention can be understood thoroughly and its scope conveyed fully to those skilled in the art.
Embodiment one
An embodiment of the present invention provides a text generation method that can be applied wherever rhymed text needs to be generated, for example generating rhymed acrostic poems, rhymed lyrics, or rhymed speech drafts.
Fig. 1 is a flow diagram of the text generation method in Embodiment 1 of the present invention. Referring to Fig. 1, the text generation method includes:
S101: Obtain a keyword and a topic corresponding to the target text to be generated;
Here, the keyword is part of the character sequence formed by the first characters of the text sequences in the target text to be generated, and the topic is the title of the target text to be generated.
In implementation, the keyword may be text that the user enters directly in a user interface, such as "Little Hua is very beautiful", or text extracted from a picture. The keyword may also be obtained in other ways, for example extracted from the user's voice input; the embodiments of the present invention place no specific limit on this.
In practical applications, the number of characters in the keyword can be determined from the number of lines that corresponds to the text type of the target text to be generated. The text type may be preset or chosen by the user. It may be, for example, a five-character quatrain, a five-character regulated verse, a seven-character poem, or a ci to the tune Bu Suan Zi, Ru Meng Ling, or Huan Xi Sha. For example, when the text type is a five-character quatrain, the keyword has at most 4 characters; when the text type is a seven-character octave, the keyword has at most 8 characters; when the text type is a ci to the tune Huan Xi Sha, the keyword has at most 6 characters. The number of characters in the keyword may also be set by the user as needed.
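The length limit described above can be illustrated with a minimal Python sketch. The table `LINES_PER_TYPE`, its keys, and the function `validate_keyword` are hypothetical names introduced only for illustration; the line counts are the three examples given in the text.

```python
# Hypothetical lookup table: number of lines per text type,
# using only the three examples given in the text.
LINES_PER_TYPE = {
    "five-character quatrain": 4,
    "seven-character octave": 8,
    "Huan Xi Sha": 6,
}


def validate_keyword(keyword: str, text_type: str) -> bool:
    """Return True when each keyword character can head its own line."""
    max_len = LINES_PER_TYPE[text_type]
    return 0 < len(keyword) <= max_len
```

A keyword longer than the number of lines is rejected because each keyword character has to become the first character of a separate line.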
In practical applications, the topic may be chosen automatically at random from a preset topic library, such as "Bu Suan Zi: Ode to the Plum Blossom" or "Climbing Stork Tower", or it may be text entered by the user, such as "some day" or "meeting for the first time". The topic may also be obtained in other ways, for example by matching a topic from a pre-stored topic library according to the number of characters in the keyword; the embodiments of the present invention place no specific limit on this.
For example, when a user wants to generate a rhymed target text such as a rhymed acrostic poem, the user can directly enter the keyword to be hidden in the target text. After the server that generates the text receives the keyword, it can select a topic from the topic library as the title of the target text. The keyword and the topic are thus obtained, and the corresponding rhymed target text can be generated from them.
In another embodiment of the present invention, when the target text is a poem, a seq2seq model can be trained with ancient poems as the corpus to obtain a text generation model that can generate rhymed acrostic poems. In that case, before S101, the text generation method further includes: obtaining ancient poems from a pre-stored ancient poem library; training a first seq2seq model using, as the corpus, the title of each ancient poem, its first line, and the first character of its first line; training a second seq2seq model using, as the corpus, each line of the ancient poems and its first character; training a third seq2seq model using, as the corpus, each line of the ancient poems; and taking the first, second, and third seq2seq models together as the text generation model.
In a specific implementation, the first seq2seq model is trained on the title of each ancient poem, its first line, and the first character of its first line, so that the first character of the obtained keyword and the topic can be input to the first seq2seq model to generate the first line of the poem. The second seq2seq model is trained on each line of the ancient poems and its first character, so that when the obtained keyword has a k-th character and the k-th line needs to be generated, the k-th character of the keyword and the (k-1)-th line can be input to the second seq2seq model to generate the k-th line, where k is a positive integer greater than or equal to 2. The third seq2seq model is trained on each line of the ancient poems, so that when the obtained keyword has no g-th character and the g-th line needs to be generated, only the (g-1)-th line is input to the third seq2seq model to generate the g-th line, where g is a positive integer greater than k.
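As a rough sketch of how the three training corpora described above could be assembled from one poem, the following Python function builds (input, output) pairs for each model. The function name `build_corpora`, the separator `SEP`, and the string encoding of the inputs are assumptions made for illustration; the text does not specify how the inputs are encoded.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (model input, expected output line)
SEP = "|"  # assumed separator between input fields


def build_corpora(title: str, lines: List[str]) -> Dict[str, List[Pair]]:
    """Build training pairs for the three seq2seq models from one poem."""
    # First model: title + first character of line 1 -> line 1.
    first = [(f"{title}{SEP}{lines[0][0]}", lines[0])]
    # Second model: previous line + first character of line k -> line k.
    second = [
        (f"{lines[k - 1]}{SEP}{lines[k][0]}", lines[k])
        for k in range(1, len(lines))
    ]
    # Third model: previous line only -> line k.
    third = [(lines[k - 1], lines[k]) for k in range(1, len(lines))]
    return {"first": first, "second": second, "third": third}


poem = ["白日依山尽", "黄河入海流", "欲穷千里目", "更上一层楼"]
corpora = build_corpora("登鹳雀楼", poem)
```

The second and third corpora share the same target lines and differ only in whether the first character of the target line is supplied as part of the input.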
Of course, in practical applications, text generation models for different types of text can be trained by using different types of corpus, such as articles, lyrics, love songs, or dialogue. For example, a seq2seq model can be trained with existing love songs as the corpus to create a text generation model that generates rhymed acrostic love songs. The embodiments of the present invention place no specific limit on this.
S102: Based on the keyword and the topic, generate the first text sequence of the target text to be generated with the pre-trained text generation model;
Specifically, after the keyword and the topic are obtained in S101, the first character of the keyword and the topic can be input to the text generation model to generate the first text sequence of the target text to be generated.
S103: Based at least on the first text sequence and a preset final, generate the other text sequences of the target text to be generated with the text generation model according to a preset rhyming rule;
Specifically, so that the generated target text rhymes, after the first text sequence has been generated in S102, the other text sequences are generated as follows. If the keyword contains characters other than its first character, the other text sequences that satisfy the preset rhyming rule are generated with the text generation model from the first text sequence, the other characters of the keyword, and the preset final. If the keyword contains no characters other than its first character, the other text sequences that satisfy the preset rhyming rule are generated with the text generation model from the first text sequence and the preset final. Because the other text sequences are generated according to the preset final and satisfy the preset rhyming rule, the target text assembled from the first text sequence and the other text sequences is rhymed.
In practical applications, the rhyming rule used when generating the target text depends on the text type of the target text. For example, when the text type is a regulated verse or a quatrain, the rhyming rule may be "even-numbered lines rhyme"; when the text type is a ci to the tune Huan Xi Sha, the rhyming rule is "one rhyme throughout". Depending on the rhyming rule, the method for generating the other text sequences of the target text may include, but is not limited to, the following three cases:
In the first case, the rhyming rule is "even-numbered lines rhyme": the even-numbered text sequences of the target text to be generated rhyme with each other.
In a specific implementation, S103 may include the following steps:
Step 1031a: When i is 2, generate the second text sequence with the text generation model, based at least on the first text sequence;
Step 1031b: When i is 2n-1, generate the i-th text sequence with the text generation model, based at least on the (i-1)-th text sequence, where n is a positive integer greater than or equal to 2, i is a positive integer less than or equal to N, and N is the total number of text sequences in the target text to be generated;
Step 1031c: When i is 2n, take a first final, namely the final of the last character of the second text sequence, as the preset final, and generate with the text generation model, based at least on the (i-1)-th text sequence and the preset final, an i-th text sequence that rhymes with the second text sequence.
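Steps 1031a to 1031c can be sketched as a simple loop. This is a minimal illustration only: the `generate` callable stands in for the text generation model, and its signature (previous line plus an optional required final) is an assumption, as is the `final_of` helper that maps a character to its final.

```python
from typing import Callable, List, Optional

# Hypothetical model interface: (previous line, required final or None) -> new line.
Generate = Callable[[str, Optional[str]], str]
# Hypothetical helper: maps a character to its final.
FinalOf = Callable[[str], str]


def generate_even_rhymed(first_line: str, n_lines: int,
                         generate: Generate, final_of: FinalOf) -> List[str]:
    """Generate lines 2..N so that even-numbered lines rhyme with line 2."""
    lines = [first_line]
    preset_final: Optional[str] = None
    for i in range(2, n_lines + 1):
        prev = lines[-1]
        if i == 2:
            # Step 1031a: line 2 is generated without a rhyme constraint.
            line = generate(prev, None)
            # Its last character supplies the preset final for later even lines.
            preset_final = final_of(line[-1])
        elif i % 2 == 1:
            # Step 1031b: odd-numbered lines are unconstrained.
            line = generate(prev, None)
        else:
            # Step 1031c: even-numbered lines must rhyme with line 2.
            line = generate(prev, preset_final)
        lines.append(line)
    return lines
```

How the model enforces the required final is left open here; one possible approach, not stated in the text, would be to filter candidate lines by the final of their last character.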
In practical applications, the total number N of text sequences in the target text to be generated can be set by the user as needed; for example, when the user needs 28 lines of lyrics, N is 28. N can also be determined from the number of lines that corresponds to the text type selected by the user, where the text type may be a five-character quatrain, a five-character regulated verse, a seven-character quatrain, or a ci to the tune Bu Suan Zi, Ru Meng Ling, or Huan Xi Sha. For example, when the user wants to generate a five-character quatrain from the keyword and the topic, N equals 4; for a seven-character octave, N equals 8; for a ci to the tune Huan Xi Sha, N equals 6. N can also be set automatically by the system; the embodiments of the present invention place no specific limit on this.
The following takes N equal to 4 as an example to show how four text sequences that satisfy the "even-numbered lines rhyme" rule are generated.
First, the first text sequence is generated with the text generation model from the keyword and the topic. Second, the second text sequence is generated with the text generation model, based at least on the first text sequence. Then, the third text sequence is generated with the text generation model, based at least on the second text sequence. Finally, so that the fourth text sequence rhymes with the second, the fourth text sequence is generated with the text generation model, based at least on the third text sequence and the first final, that is, the final of the last character of the second text sequence.
For example, suppose the last character of the second text sequence is "汉" (han) and the last character of the fourth text sequence is "先" (xian). The pronunciation of a Chinese character consists of an initial and a final (a few characters have only a final). For "汉", the initial is h and the final is an, giving han; for "先", the initial is x and the final is ian, giving xian. Table 1 gives the mapping between some finals and rhyme groups. According to Table 1, the finals of "汉" and "先" belong to the same rhyme group, so the fourth text sequence rhymes with the second text sequence.
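The rhyme check in this example can be sketched as follows. Table 1 is not reproduced here, so the `RHYME_GROUPS` list below is an assumed, partial stand-in; the function names are hypothetical. The sketch works on toneless pinyin and deliberately ignores special spellings such as syllables written with y or w.

```python
# Assumed partial rhyme-group table; a stand-in for Table 1 of the source.
RHYME_GROUPS = [
    {"an", "ian", "uan", "üan"},
    {"ang", "iang", "uang"},
]

# Pinyin initials, two-letter initials first so that "zh" wins over "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s")


def final_of(pinyin: str) -> str:
    """Strip the initial from a toneless pinyin syllable and return the final."""
    for initial in INITIALS:
        if pinyin.startswith(initial):
            return pinyin[len(initial):]
    return pinyin  # syllable with no initial


def rhymes(pinyin_a: str, pinyin_b: str) -> bool:
    """Return True when both finals fall in the same rhyme group."""
    fa, fb = final_of(pinyin_a), final_of(pinyin_b)
    return any(fa in group and fb in group for group in RHYME_GROUPS)
```

With this table, `rhymes("han", "xian")` holds because the finals an and ian sit in the same group, which mirrors the example in the text.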
Table 1
In the second case, the rhyming rule is "first line rhymes + even-numbered lines rhyme": in the target text to be generated (containing N text sequences), the first text sequence rhymes as well as the even-numbered text sequences.
In a specific implementation, S103 may include the following steps:
Step 1032a: When i is 2m, take a second final, namely the final of the last character of the first text sequence, as the preset final, and generate with the text generation model, based at least on the (i-1)-th text sequence and the preset final, an i-th text sequence that rhymes with the first text sequence, where m is a positive integer greater than or equal to 1, i is a positive integer less than or equal to N, and N is the total number of text sequences in the target text to be generated;
Step 1032b: When i is 2m+1, generate the i-th text sequence with the text generation model, based at least on the (i-1)-th text sequence.
For example, again with N equal to 4 and the rule "first line rhymes + even-numbered lines rhyme": after the first text sequence is obtained, the second text sequence, which rhymes with the first, is generated with the text generation model, based at least on the first text sequence and the second final, that is, the final of the last character of the first text sequence. Then the third text sequence is generated with the text generation model, based at least on the second text sequence. Finally, so that the fourth text sequence rhymes with the first, the fourth text sequence is generated with the text generation model, based at least on the third text sequence and the second final.
In the third case, the rhyming rule is "every line rhymes": in the target text to be generated (containing N text sequences), the finals of the last characters of all text sequences belong to the same rhyme group.
In a specific implementation, S103 may include the following step:
Step 1033: Take a third final, namely the final of the last character of the first text sequence, as the preset final, and generate with the text generation model, based at least on the (i-1)-th text sequence and the preset final, an i-th text sequence that rhymes with the first text sequence, where i is a positive integer greater than or equal to 2 and less than or equal to N, and N is the total number of text sequences in the target text to be generated.
For example, again with N equal to 4 and the rule "every line rhymes": after the first text sequence is obtained, its last character is taken as the rhyme character, and the final of that character is used as the preset final to generate the other three text sequences, each of which rhymes with the first text sequence.
Of course, the rhyming rule may also be of another type, such as "alternating rhyme", in which the odd-numbered lines rhyme with each other and the even-numbered lines rhyme with each other; the embodiments of the present invention place no specific limit on this.
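The three cases differ only in which line supplies the preset final and which later lines are constrained by it. A compact Python sketch of that difference follows; the rule names "even", "first+even", and "every" and both function names are hypothetical labels introduced for illustration.

```python
def rhyme_anchor(rule: str) -> int:
    """1-based index of the line whose last character supplies the preset final."""
    if rule == "even":
        return 2  # case 1: the second line sets the rhyme
    if rule in ("first+even", "every"):
        return 1  # cases 2 and 3: the first line sets the rhyme
    raise ValueError(f"unknown rule: {rule}")


def must_rhyme(i: int, rule: str) -> bool:
    """Return True when line i (1-based) is generated under the preset final."""
    if rule == "even":
        return i >= 4 and i % 2 == 0  # step 1031c
    if rule == "first+even":
        return i % 2 == 0             # step 1032a
    if rule == "every":
        return i >= 2                 # step 1033
    raise ValueError(f"unknown rule: {rule}")
```

For a four-line text, the constrained lines are line 4 under "even", lines 2 and 4 under "first+even", and lines 2 to 4 under "every".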
In addition, in other embodiments of the present invention, in order to which keyword to be hidden in the text sequence in target text to be generated In column, when generating other word sequences in target text to be generated in addition to first word sequence, according in keyword Whether there is also keywords, if above-mentioned steps 1031b or above-mentioned steps 1032b may include: that there are i-th in keyword Word generates i-th word sequence by text generation model according to i-th of word in (i-1)-th word sequence and keyword, So that the first character in i-th word sequence is i-th of word in keyword;Otherwise, according to (i-1)-th word sequence, lead to Text generation model is crossed, i-th word sequence is generated.
Specifically, when generating i-th word sequence in target text, if there are i-th of word in keyword, this When, it is necessary to it, therefore, can be by using i-th of word in keyword as the lead-in in i-th word sequence that will be generated I-th of word in i-1 word sequences and keyword, is input in text generation model, to generate i-th in the keyword I-th word sequence of a word as first sentence.And when i-th of word being not present in keyword, so that it may directly by i-th text The previous sentence word sequence of sequence, i.e. (i-1)-th word sequence, are input in text generation model and generate i-th text sequence Column.
Similarly, in other embodiments of the present invention, in order to hide the keyword in the word sequences of the target text to be generated, when a word sequence that needs to rhyme is generated, such as an i-th word sequence that rhymes with the first word sequence, the step in step 1032a or step 1033 of "generating, by the text generation model and based at least on the (i-1)-th word sequence and the preset final (simple or compound vowel), the i-th word sequence that rhymes with the first word sequence" depends on whether the keyword still contains a word for that position, and may include: if an i-th word exists in the keyword, generating, by the text generation model, the i-th word sequence that rhymes with the first word sequence according to the (i-1)-th word sequence, the i-th word of the keyword and the preset final; otherwise, generating, by the text generation model, the i-th word sequence that rhymes with the first word sequence according to the (i-1)-th word sequence and the preset final.
In an alternative embodiment of the present invention, if the text type of the target text is a poem, the text generation model may be implemented with the first seq2seq model, the second seq2seq model and the third seq2seq model described above. In this case, S102 may include: inputting the first word of the keyword and the topic into the first seq2seq model to generate the first word sequence.
In other embodiments of the present invention, if the text type of the target text is a poem, the text generation model may be implemented with the first seq2seq model, the second seq2seq model and the third seq2seq model described above. When the i-th word sequence of the target text is generated, if an i-th word exists in the keyword, step 1031b or step 1032b may include: inputting the (i-1)-th word sequence and the i-th word of the keyword into the second seq2seq model to generate the i-th word sequence. If no i-th word exists in the keyword, step 1031b or step 1032b may include: inputting the (i-1)-th word sequence into the third seq2seq model to generate the i-th word sequence.
Similarly, in other embodiments of the present invention, if the text type of the target text is a poem and the text generation model is implemented with the first, second and third seq2seq models described above, then, in order to hide the keyword in the word sequences of the target text to be generated, when a word sequence that needs to rhyme is generated, such as an i-th word sequence that rhymes with the first word sequence, the step in step 1032a or step 1033 of "generating, by the text generation model and based at least on the (i-1)-th word sequence and the preset final (simple or compound vowel), the i-th word sequence that rhymes with the first word sequence" may include: if an i-th word exists in the keyword, generating, by the second seq2seq model, the i-th word sequence that rhymes with the first word sequence according to the (i-1)-th word sequence, the i-th word of the keyword and the preset final; otherwise, generating, by the third seq2seq model, the i-th word sequence that rhymes with the first word sequence according to the (i-1)-th word sequence and the preset final.
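The choice between the second and third seq2seq models described above can be illustrated with a minimal sketch. The function name `generate_ith_line`, the callable signatures and the stub models are assumptions made for illustration only; the patent does not specify a programming interface.

```python
from typing import Callable, Optional


def generate_ith_line(
    i: int,
    prev_line: str,
    keyword: str,
    second_model: Callable[[str, str, Optional[str]], str],
    third_model: Callable[[str, Optional[str]], str],
    preset_final: Optional[str] = None,
) -> str:
    """Generate the i-th line (1-indexed, i >= 2) of the target text.

    If the keyword has an i-th word, the (hypothetical) second model is
    given the previous line and that word; otherwise the (hypothetical)
    third model is given the previous line only. `preset_final` is passed
    through when the line has to rhyme, and is None otherwise.
    """
    if i < 2:
        raise ValueError("the first line is produced by the first model")
    if i <= len(keyword):
        return second_model(prev_line, keyword[i - 1], preset_final)
    return third_model(prev_line, preset_final)


# Stub models, used only to show which branch is taken.
def stub_second(prev: str, head: str, final: Optional[str]) -> str:
    return f"{head}|second|{final}"


def stub_third(prev: str, final: Optional[str]) -> str:
    return f"?|third|{final}"


with_head = generate_ith_line(2, "line one", "AB", stub_second, stub_third, "ang")
without_head = generate_ith_line(3, "line two", "AB", stub_second, stub_third)
```

With the two-word keyword "AB", line 2 is routed to the second model and starts with "B", while line 3 falls back to the third model because the keyword has no third word.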
S104: combine the first word sequence and the other word sequences in the order in which they were generated to obtain the rhymed target text.
For example, assume that executing S102 yields the first word sequence and that executing S103 yields, in order, three further word sequences. Combining the first word sequence and the other word sequences in the order in which they were generated then gives the following rhymed target text:
Liu carrys out elk or accompanies,
Moral between others not sometimes.
It is utterly routed to thing,
Handsome strongly fragrant fluttering is unknowable.
At this point, the process of generating the rhymed text is complete.
As can be seen from the above, in the text generation method provided by the embodiment of the present invention, the keyword and the topic corresponding to the target text to be generated are obtained first, where the keyword is a part of the word sequence formed by the first words of the word sequences in the target text to be generated. The first word sequence of the target text is then generated, based on the keyword and the topic, by a pre-trained text generation model. Next, based at least on the first word sequence and a preset final (simple or compound vowel), the other word sequences of the target text, that is, those other than the first word sequence, are generated by the text generation model according to a preset rhyming rule. Finally, the first word sequence and the other word sequences are combined in the order in which they were generated to obtain the required rhymed target text. Because the other word sequences are generated according to the preset final and the preset rhyming rule, they rhyme on the preset final, so the target text formed from the first word sequence and the other word sequences is rhymed. In this way, the quality of the generated text can be improved, which improves the user experience.
Embodiment two
Based on the foregoing embodiment, this embodiment provides a text generation method applied to the following scenario: the text type of the target text to be generated is a poem, the total number N of lines in the target text is 4, the topic is chosen at random from a preset topic library, and the rhyming rule for the target text is "even-numbered lines rhyme".
The embodiment of the present invention provides a text generation system. Referring to Fig. 2, the system includes a topic library 201, a first seq2seq model 202, a second seq2seq model 203 and a third seq2seq model 204. The topic library 201 is used to randomly select a target topic after the keyword is obtained. The first seq2seq model 202 is trained on a corpus consisting of the topics of ancient poems, the first line of each poem and the first word of that first line, and is used to generate the first line from the input keyword and topic. The second seq2seq model 203 is trained on a corpus consisting of each line of the ancient poems and the first word of that line, and is used to generate the next line from the input keyword and the previous line. The third seq2seq model 204 is trained on a corpus consisting of each line of the ancient poems, and is used to generate the next line from the previous line; it can fill in the missing lines when the number of words in the keyword is less than the total number of lines N.
In practical applications, a seq2seq model, such as the first, second or third seq2seq model, includes an encoder and a decoder. After a word sequence A is input into the seq2seq model, the encoder learns the input word sequence A and encodes it into a state vector S, and then passes the state vector S to the decoder. The decoder learns the state vector S and outputs another word sequence B using a search algorithm such as beam search or greedy search.
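To make the difference between the two search algorithms concrete, the following is a minimal sketch of greedy search and beam search over a toy decoder. The decoder is replaced by a hypothetical lookup table (`TOY_TABLE`) that maps an output prefix to next-token probabilities; a real seq2seq decoder would compute these probabilities from the state vector S. All names here are assumptions for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

END = "</s>"  # end-of-sequence marker (assumed)
Prefix = Tuple[str, ...]
NextProbs = Callable[[Prefix], Dict[str, float]]


def greedy_decode(next_probs: NextProbs, max_len: int = 10) -> List[str]:
    """Pick the single most probable token at every step."""
    seq: Prefix = ()
    for _ in range(max_len):
        probs = next_probs(seq)
        token = max(probs, key=probs.get)
        if token == END:
            break
        seq += (token,)
    return list(seq)


def beam_search(
    next_probs: NextProbs, beam_width: int = 2, max_len: int = 10
) -> List[Tuple[List[str], float]]:
    """Keep the `beam_width` best prefixes; return finished sequences
    with their log-probabilities, best first."""
    beams: List[Tuple[Prefix, float]] = [((), 0.0)]
    finished: List[Tuple[List[str], float]] = []
    for _ in range(max_len):
        candidates: List[Tuple[Prefix, float]] = []
        for seq, score in beams:
            for token, p in next_probs(seq).items():
                if p <= 0.0:
                    continue
                new_score = score + math.log(p)
                if token == END:
                    finished.append((list(seq), new_score))
                else:
                    candidates.append((seq + (token,), new_score))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished


# Toy stand-in for the decoder: prefix -> next-token probabilities.
TOY_TABLE: Dict[Prefix, Dict[str, float]] = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.4, "y": 0.3, "z": 0.3},
    ("b",): {"x": 0.9, "y": 0.1},
}


def toy_next_probs(prefix: Prefix) -> Dict[str, float]:
    return TOY_TABLE.get(prefix, {END: 1.0})


greedy_result = greedy_decode(toy_next_probs)
beam_results = beam_search(toy_next_probs, beam_width=2)
```

In this toy table greedy search commits to "a" at the first step and ends with a sequence of probability 0.24, whereas beam search with width 2 also keeps "b" alive and finds the more probable sequence "b x" (0.36). Beam search additionally returns several ranked candidates, which is what makes a later screening step possible.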
In implementation, the goal is to generate a rhymed word sequence B, that is, one in which the final (simple or compound vowel) of the last word belongs to the same rhyme category as the preset final. If the decoder uses beam search when decoding the state vector S, it can obtain multiple candidate word sequences B ordered by probability and then, using the preset final as a constraint, select as the generation result a word sequence B whose last word has a final in the same rhyme category as the preset final. If the decoder instead uses greedy search when decoding the state vector S, the decoding strategy can be modified according to the preset final so that a rhymed word sequence B is generated.
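The screening of beam search candidates by rhyme can be sketched as follows. The two lookup tables are tiny hand-written assumptions covering only the characters used in the example; a practical system would need a full pronunciation dictionary and a complete rhyme-category table, neither of which is specified in the text.

```python
from typing import Dict, List, Optional

# Toy tables (assumptions for illustration only).
FINAL_OF_CHAR: Dict[str, str] = {
    "光": "uang", "霜": "uang", "乡": "iang", "月": "ue",
}
RHYME_CATEGORY: Dict[str, str] = {
    "ang": "ang", "iang": "ang", "uang": "ang", "ue": "ie",
}


def same_category(final_a: str, final_b: str) -> bool:
    """True if both finals are known and fall in the same rhyme category."""
    cat_a = RHYME_CATEGORY.get(final_a)
    cat_b = RHYME_CATEGORY.get(final_b)
    return cat_a is not None and cat_a == cat_b


def pick_rhyming(candidates: List[str], preset_final: str) -> Optional[str]:
    """Return the first candidate (candidates are assumed to be ordered
    by decreasing probability) whose last character rhymes with the
    preset final, or None if no candidate qualifies."""
    for line in candidates:
        if not line:
            continue
        final = FINAL_OF_CHAR.get(line[-1])
        if final is not None and same_category(final, preset_final):
            return line
    return None


# Candidates as they might come out of beam search, best first.
candidates = ["举头望明月", "低头思故乡", "疑是地上霜"]
chosen = pick_rhyming(candidates, preset_final="uang")
```

Here the most probable candidate ends in a character whose final is in a different category, so it is skipped, and the second candidate is chosen because "iang" and "uang" fall in the same toy category. Note that the match is on rhyme category rather than on an identical final.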
Fig. 3 is a schematic flowchart of the text generation method in Embodiment 2 of the present invention. Referring to Fig. 3, the method includes:
S301: obtain a keyword;
S302: randomly determine a target topic from the preset topic library;
S303: input the first word of the keyword and the target topic into the first seq2seq model to generate the first line of the poem;
S304: determine whether a second word exists in the keyword;
If it is determined that no second word exists in the keyword, S305a is executed to generate the second line; otherwise, S305b is executed to generate the second line.
S305a: input the first line into the third seq2seq model to generate the second line;
After the second line is obtained by executing S305a, S306 to S307 are executed to obtain the third line and the fourth line, which rhymes with the second line.
S306: input the second line into the third seq2seq model to generate the third line;
S307: input the third line and the final (simple or compound vowel) of the last word of the second line into the third seq2seq model to generate the fourth line, which rhymes with the second line;
In practical applications, after the third line and the final of the last word of the second line are input into the third seq2seq model, the third seq2seq model may first generate multiple candidate fourth lines, ordered by probability, from the third line. It may then use the final of the last word of the second line as a screening condition to select a fourth line that rhymes with the second line.
S305b: input the first line and the second word of the keyword into the second seq2seq model to generate the second line;
After the second line is obtained by executing S305b, S308 is executed.
S308: determine whether a third word exists in the keyword;
If it is determined that no third word exists in the keyword, S306 to S307 are executed to obtain the third line and the fourth line, which rhymes with the second line; otherwise, S309 is executed to generate the third line.
S309: input the second line and the third word of the keyword into the second seq2seq model to generate the third line;
After the third line is obtained by executing S309, S310 is executed.
S310: determine whether a fourth word exists in the keyword;
If it is determined that no fourth word exists in the keyword, S307 is executed to generate the fourth line, which rhymes with the second line; otherwise, S311 is executed to generate the fourth line, which rhymes with the second line.
S311: input the third line, the fourth word of the keyword and the final (simple or compound vowel) of the last word of the second line into the second seq2seq model to generate the fourth line, which rhymes with the second line.
In practical applications, after the third line, the fourth word of the keyword and the final of the last word of the second line are input into the second seq2seq model, the second seq2seq model may first generate, from the third line and the fourth word of the keyword, multiple candidate fourth lines, ordered by probability, whose first word is the fourth word. The second seq2seq model may then use the final of the last word of the second line as a screening condition to select a fourth line that rhymes with the second line and whose first word is the fourth word.
S312: arrange the first to fourth lines in the order in which they were generated to obtain the rhymed target text.
Finally, once the first, second, third and fourth lines have been obtained, they can be arranged in the order in which they were generated to obtain the rhymed target text.
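The flow S301 to S312 can be condensed into a short sketch. It relies on the stated property that the keyword is a prefix of the first-word sequence, so the branches S304, S308 and S310 reduce to checking whether the keyword has an i-th word. The function `generate_quatrain`, the model callables and the stubs are hypothetical names introduced for illustration; the real models would be trained seq2seq networks.

```python
import random
from typing import Callable, List, Optional, Sequence


def generate_quatrain(
    keyword: str,
    topics: Sequence[str],
    model1: Callable[[str, str], str],
    model2: Callable[[str, str, Optional[str]], str],
    model3: Callable[[str, Optional[str]], str],
    final_of: Callable[[str], str],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Generate four lines in which line 4 rhymes with line 2."""
    rng = rng or random.Random()
    topic = rng.choice(list(topics))                 # S302
    lines = [model1(keyword[0], topic)]              # S303
    for i in range(2, 5):
        prev = lines[-1]
        # Only line 4 is constrained: it takes the final of the last
        # word of line 2 (S307 / S311).
        final = final_of(lines[1][-1]) if i == 4 else None
        if len(keyword) >= i:                        # S304 / S308 / S310
            lines.append(model2(prev, keyword[i - 1], final))
        else:
            lines.append(model3(prev, final))
    return lines                                     # S312


# Stub models and a stub final lookup, for demonstration only.
def stub_model1(head: str, topic: str) -> str:
    return head + "xxx1"


def stub_model2(prev: str, head: str, final: Optional[str]) -> str:
    return head + "xxx" + (final or "2")


def stub_model3(prev: str, final: Optional[str]) -> str:
    return "zzz" + (final or "3")


def stub_final_of(ch: str) -> str:
    return "R"


poem = generate_quatrain(
    "AB", ["spring"], stub_model1, stub_model2, stub_model3, stub_final_of
)
```

With the two-word keyword "AB", lines 1 and 2 start with "A" and "B", lines 3 and 4 come from the third model, and only the call for line 4 receives a rhyme constraint.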
As can be seen from the above, in the text generation method provided by the embodiment of the present invention, after the first, second and third lines are obtained, the fourth line is generated according to the final (simple or compound vowel) of the last word of the second line and therefore rhymes with the second line, so the target text formed from the first to fourth lines is rhymed. In this way, the quality of the generated text can be improved, which improves the user experience.
Embodiment three
Based on the same inventive concept, and as an implementation of the above method, the embodiment of the present invention provides a text generation device. This device embodiment corresponds to the foregoing method embodiments. For ease of reading, the details of the method embodiments are not repeated one by one here, but it should be clear that the device in this embodiment can implement everything in the foregoing method embodiments.
Fig. 4 is a schematic structural diagram of the text generation device in Embodiment 3 of the present invention. Referring to Fig. 4, the device 40 includes a first obtaining unit 401, a first generation unit 402, a second generation unit 403 and a second obtaining unit 404. The first obtaining unit 401 is configured to obtain a keyword and a topic corresponding to the target text to be generated, where the keyword is a part of the word sequence formed by the first words of the word sequences in the target text to be generated. The first generation unit 402 is configured to generate, based on the keyword and the topic, the first word sequence of the target text by a pre-trained text generation model. The second generation unit 403 is configured to generate, based at least on the first word sequence and a preset final (simple or compound vowel) and according to a preset rhyming rule, the other word sequences of the target text, other than the first word sequence, by the text generation model. The second obtaining unit 404 is configured to combine the first word sequence and the other word sequences in the order in which they were generated to obtain the rhymed target text.
In the embodiment of the present invention, the second generation unit is configured to: when i is 2, generate the second word sequence by the text generation model according to at least the first word sequence; when i is 2n-1, generate the i-th word sequence by the text generation model according to at least the (i-1)-th word sequence, where n is a positive integer greater than or equal to 2, i is a positive integer less than or equal to N, and N is the total number of word sequences in the target text to be generated; and when i is 2n, determine a first final (simple or compound vowel), namely that of the last word of the second word sequence, as the preset final, and generate, by the text generation model and according to at least the (i-1)-th word sequence and the preset final, the i-th word sequence that rhymes with the second word sequence.
In the embodiment of the present invention, the second generation unit is configured to: when i is 2m, determine a second final (simple or compound vowel), namely that of the last word of the first word sequence, as the preset final, and generate, by the text generation model and according to at least the (i-1)-th word sequence and the preset final, the i-th word sequence that rhymes with the first word sequence, where m is a positive integer greater than or equal to 1, i is a positive integer less than or equal to N, and N is the total number of word sequences in the target text to be generated; and when i is 2m+1, generate the i-th word sequence by the text generation model according to at least the (i-1)-th word sequence.
In the embodiment of the present invention, the second generation unit is configured to: if an i-th word exists in the keyword, generate the i-th word sequence by the text generation model according to the (i-1)-th word sequence and the i-th word of the keyword, so that the first word of the i-th word sequence is the i-th word of the keyword; otherwise, generate the i-th word sequence by the text generation model according to the (i-1)-th word sequence.
In other embodiments of the present invention, the device further includes an acquiring unit, a first training unit, a second training unit, a third training unit and a determination unit. The acquiring unit is configured to obtain ancient poems from a pre-stored ancient poem library. The first training unit is configured to train the first seq2seq model using, as the corpus, the topic of each ancient poem, the first line of the poem and the first word of that first line. The second training unit is configured to train the second seq2seq model using, as the corpus, each line of the ancient poems and the first word of that line. The third training unit is configured to train the third seq2seq model using, as the corpus, each line of the ancient poems. The determination unit is configured to determine the first seq2seq model, the second seq2seq model and the third seq2seq model as the text generation model.
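One plausible way to assemble the three training corpora from a poem library is sketched below. The text states which elements each corpus contains but not how inputs and targets are paired, so the pairing used here (previous line as input, next line as target) is an assumption, as are the function name `build_corpora` and the dictionary layout of a poem.

```python
from typing import Dict, List, Tuple

Poem = Dict[str, object]  # assumed layout: {"title": str, "lines": List[str]}


def build_corpora(poems: List[Poem]) -> Tuple[list, list, list]:
    """Build (input, target) pairs for the three models.

    corpus1: (title, first word of line 1)          -> line 1
    corpus2: (previous line, first word of line k)  -> line k
    corpus3: previous line                          -> line k
    """
    corpus1, corpus2, corpus3 = [], [], []
    for poem in poems:
        title = poem["title"]
        lines = poem["lines"]
        if not lines:
            continue
        corpus1.append(((title, lines[0][0]), lines[0]))
        for prev, cur in zip(lines, lines[1:]):
            corpus2.append(((prev, cur[0]), cur))
            corpus3.append((prev, cur))
    return corpus1, corpus2, corpus3


library = [
    {
        "title": "静夜思",
        "lines": ["床前明月光", "疑是地上霜", "举头望明月", "低头思故乡"],
    }
]
corpus1, corpus2, corpus3 = build_corpora(library)
```

A four-line poem yields one pair for the first model and three pairs each for the second and third models. The second corpus differs from the third only in that the first word of the target line is supplied as an extra input, which is what lets the second model place a keyword word at the head of a line.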
In the embodiment of the present invention, the first generation unit is configured to input the first word of the keyword and the topic into the first seq2seq model to generate the first word sequence. The second generation unit is configured to, if an i-th word exists in the keyword, input the (i-1)-th word sequence and the i-th word of the keyword into the second seq2seq model to generate the i-th word sequence, and is further configured to, if no i-th word exists in the keyword, input the (i-1)-th word sequence into the third seq2seq model to generate the i-th word sequence.
In the embodiment of the present invention, the second generation unit is configured to determine a third final (simple or compound vowel), namely that of the last word of the first word sequence, as the preset final, and to generate, by the text generation model and based at least on the (i-1)-th word sequence and the preset final, the i-th word sequence that rhymes with the first word sequence, where i is a positive integer greater than or equal to 2 and less than or equal to N, and N is the total number of word sequences in the target text to be generated.
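The three rhyming variants described for the second generation unit differ only in which earlier line supplies the preset final for line i. A minimal sketch of that mapping follows; the rule names are labels invented here for illustration and do not appear in the text.

```python
from typing import Optional


def rhyme_source(i: int, rule: str) -> Optional[int]:
    """Return the 1-indexed line whose last word supplies the preset
    final for line i, or None if line i is not rhyme-constrained."""
    if i < 2:
        raise ValueError("only lines after the first are generated this way")
    if rule == "even_with_second":
        # i = 2 and odd i are free; i = 2n (n >= 2) rhymes with line 2.
        return 2 if i >= 4 and i % 2 == 0 else None
    if rule == "even_with_first":
        # i = 2m (m >= 1) rhymes with line 1; i = 2m + 1 is free.
        return 1 if i % 2 == 0 else None
    if rule == "all_with_first":
        # Every line from the second onward rhymes with line 1.
        return 1
    raise ValueError(f"unknown rule: {rule}")


schedule = {
    rule: [rhyme_source(i, rule) for i in range(2, 5)]
    for rule in ("even_with_second", "even_with_first", "all_with_first")
}
```

For a four-line text, `schedule` lists the source line for lines 2, 3 and 4 under each rule, which makes it easy to see that the first variant constrains only line 4, the second constrains lines 2 and 4, and the third constrains every generated line.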
The text generation device includes a processor and a memory. The first obtaining unit, the first generation unit, the second generation unit, the second obtaining unit and so on are stored in the memory as program units, and the processor executes the program units stored in the memory to implement the corresponding functions.
The processor may be implemented by a central processing unit (CPU), a microprocessor (Micro Processor Unit, MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA) or the like.
The memory may take the form of volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (Flash RAM). The memory includes at least one storage chip.
Based on the same inventive concept, the embodiment of the present invention provides a storage medium on which a program is stored. When executed by a processor, the program implements the above text generation method.
Based on the same inventive concept, the embodiment of the present invention provides a processor for running a program, where the program, when run, executes the above text generation method.
Since the text generation device described in this embodiment is a device capable of executing the text generation method in the embodiment of the present invention, those skilled in the art can, based on the text generation method described in the embodiment of the present invention, understand the specific implementation of the text generation device of this embodiment and its variations. How the text generation device implements the text generation method in the embodiment of the present invention is therefore not described in detail here. Any device used by those skilled in the art to implement the text generation method in the embodiment of the present invention falls within the scope of protection sought by the present application.
In practical applications, the text generation device can be applied in an electronic device. The electronic device can be implemented in various forms. For example, the electronic device described in the embodiment of the present invention may include mobile terminals such as a smart speaker, a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a personal digital assistant (PDA), a portable media player (PMP), a navigation device, a wearable device, a smart bracelet and a pedometer, as well as fixed terminals such as a smart television, a desktop computer and a server.
Embodiment four
Based on the same inventive concept, the embodiment of the present invention provides an electronic device. Fig. 5 is a schematic structural diagram of the electronic device in Embodiment 4 of the present invention. Referring to Fig. 5, the electronic device includes at least one processor 51, and at least one memory 52 and a bus 53 connected to the processor 51. The processor 51 and the memory 52 communicate with each other through the bus 53. The processor 51 is configured to call the program instructions in the memory 52 to execute the following steps: obtaining a keyword and a topic corresponding to the target text to be generated, where the keyword is a part of the word sequence formed by the first words of the word sequences in the target text to be generated; generating, based on the keyword and the topic, the first word sequence of the target text by a pre-trained text generation model; generating, based at least on the first word sequence and a preset final (simple or compound vowel) and according to a preset rhyming rule, the other word sequences of the target text, other than the first word sequence, by the text generation model; and combining the first word sequence and the other word sequences in the order in which they were generated to obtain the rhymed target text.
In the embodiment of the present invention, the processor may further execute the following steps when calling the program instructions: when i is 2, generating the second word sequence by the text generation model according to at least the first word sequence; when i is 2n-1, generating the i-th word sequence by the text generation model according to at least the (i-1)-th word sequence, where n is a positive integer greater than or equal to 2, i is a positive integer less than or equal to N, and N is the total number of word sequences in the target text to be generated; and when i is 2n, determining a first final (simple or compound vowel), namely that of the last word of the second word sequence, as the preset final, and generating, by the text generation model and according to at least the (i-1)-th word sequence and the preset final, the i-th word sequence that rhymes with the second word sequence.
In the embodiment of the present invention, the processor may further execute the following steps when calling the program instructions: when i is 2m, determining a second final (simple or compound vowel), namely that of the last word of the first word sequence, as the preset final, and generating, by the text generation model and according to at least the (i-1)-th word sequence and the preset final, the i-th word sequence that rhymes with the first word sequence, where m is a positive integer greater than or equal to 1, i is a positive integer less than or equal to N, and N is the total number of word sequences in the target text to be generated; and when i is 2m+1, generating the i-th word sequence by the text generation model according to at least the (i-1)-th word sequence.
In the embodiment of the present invention, the processor may further execute the following steps when calling the program instructions: if an i-th word exists in the keyword, generating the i-th word sequence by the text generation model according to the (i-1)-th word sequence and the i-th word of the keyword, so that the first word of the i-th word sequence is the i-th word of the keyword; otherwise, generating the i-th word sequence by the text generation model according to the (i-1)-th word sequence.
In the embodiment of the present invention, the processor may further execute the following steps when calling the program instructions: obtaining ancient poems from a pre-stored ancient poem library; training the first seq2seq model using, as the corpus, the topic of each ancient poem, the first line of the poem and the first word of that first line; training the second seq2seq model using, as the corpus, each line of the ancient poems and the first word of that line; training the third seq2seq model using, as the corpus, each line of the ancient poems; and determining the first seq2seq model, the second seq2seq model and the third seq2seq model as the text generation model.
In the embodiment of the present invention, the processor may further execute the following steps when calling the program instructions: inputting the first word of the keyword and the topic into the first seq2seq model to generate the first word sequence; if an i-th word exists in the keyword, inputting the (i-1)-th word sequence and the i-th word of the keyword into the second seq2seq model to generate the i-th word sequence; and if no i-th word exists in the keyword, inputting the (i-1)-th word sequence into the third seq2seq model to generate the i-th word sequence.
In the embodiment of the present invention, the processor may further execute the following steps when calling the program instructions: determining a third final (simple or compound vowel), namely that of the last word of the first word sequence, as the preset final; and generating, by the text generation model and based at least on the (i-1)-th word sequence and the preset final, the i-th word sequence that rhymes with the first word sequence, where i is a positive integer greater than or equal to 2 and less than or equal to N, and N is the total number of word sequences in the target text to be generated.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, Compact Disc Read-Only Memory (CD-ROM), optical storage and the like) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface and a memory.
The memory may take the form of volatile memory in a computer-readable medium, RAM and/or non-volatile memory, such as ROM or Flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. A computer-readable storage medium may be a ROM, a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory, a magnetic surface memory, an optical disc or a CD-ROM. It may also be flash memory or other memory technology, a CD-ROM, a digital versatile disc (DVD) or other optical storage, a magnetic cassette, magnetic tape or magnetic disk storage or other magnetic storage device, or any other non-transmission medium that can be used to store information accessible by a computing device. It may also be any of various electronic devices, such as a mobile phone, a computer, a tablet device or a personal digital assistant, that include one or any combination of the above memories. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, commodity or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, commodity or device that includes the element.
Those skilled in the art will understand that the embodiments of the present invention may be provided as a method, a system or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) that contain computer-usable program code.
The above are merely embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of the claims of the present invention.

Claims (10)

1. A text generation method, characterized in that the method comprises:
Obtaining a keyword and a topic corresponding to a target text to be generated, wherein the keyword is a part of the word sequence formed by the first words of the word sequences in the target text to be generated;
Generating, based on the keyword and the topic, a first word sequence of the target text to be generated by a pre-trained text generation model;
Generating, based at least on the first word sequence and a preset final (simple or compound vowel) and according to a preset rhyming rule, the other word sequences of the target text to be generated, other than the first word sequence, by the text generation model;
Combining the first word sequence and the other word sequences in the order in which the first word sequence and the other word sequences were generated, to obtain a rhymed target text.
2. The method according to claim 1, characterized in that said generating, at least based on the first word sequence and the preset final, by means of the text generation model and according to the preset rhyming rule, the other word sequences of the target text to be generated other than the first word sequence comprises:
when i is 2, generating the second word sequence by means of the text generation model at least according to the first word sequence;
when i is 2n-1, generating the i-th word sequence by means of the text generation model at least according to the (i-1)-th word sequence, wherein n is a positive integer greater than or equal to 2, i is a positive integer less than or equal to N, and N is the total number of word sequences included in the target text to be generated;
when i is 2n, determining the first final of the last character of the second word sequence as the preset final; and generating, at least according to the (i-1)-th word sequence and the preset final and by means of the text generation model, an i-th word sequence that rhymes with the second word sequence.
3. The method according to claim 1, characterized in that said generating, at least based on the first word sequence and the preset final, by means of the text generation model and according to the preset rhyming rule, the other word sequences of the target text to be generated other than the first word sequence comprises:
when i is 2m, determining the second final of the last character of the first word sequence as the preset final; and generating, at least according to the (i-1)-th word sequence and the preset final and by means of the text generation model, an i-th word sequence that rhymes with the first word sequence, wherein m is a positive integer greater than or equal to 1, i is a positive integer less than or equal to N, and N is the total number of word sequences included in the target text to be generated;
when i is 2m+1, generating the i-th word sequence by means of the text generation model at least according to the (i-1)-th word sequence.
4. The method according to claim 2 or 3, characterized in that said generating the i-th word sequence by means of the text generation model at least according to the (i-1)-th word sequence comprises:
if an i-th character exists in the keyword, generating the i-th word sequence by means of the text generation model according to the (i-1)-th word sequence and the i-th character of the keyword, so that the first character of the i-th word sequence is the i-th character of the keyword; otherwise, generating the i-th word sequence by means of the text generation model according to the (i-1)-th word sequence.
5. The method according to claim 4, characterized in that, before said obtaining the keyword and the topic, the method further comprises:
obtaining ancient poems from a pre-stored ancient poem library;
training a first seq2seq model using, as corpus, the topic of each ancient poem, the first line (sentence) of the ancient poem, and the first character of that first line;
training a second seq2seq model using, as corpus, each line of the ancient poems and its corresponding first character;
training a third seq2seq model using, as corpus, each line of the ancient poems;
determining the first seq2seq model, the second seq2seq model, and the third seq2seq model as the text generation model.
6. The method according to claim 5, characterized in that said generating, based on the keyword and the topic and by means of the pre-trained text generation model, the first word sequence of the target text to be generated comprises: inputting the first character of the keyword and the topic into the first seq2seq model to generate the first word sequence;
said generating, if an i-th character exists in the keyword, the i-th word sequence by means of the text generation model according to the (i-1)-th word sequence and the i-th character of the keyword comprises: inputting the (i-1)-th word sequence and the i-th character of the keyword into the second seq2seq model to generate the i-th word sequence;
said generating, if no i-th character exists in the keyword, the i-th word sequence according to the (i-1)-th word sequence comprises: inputting the (i-1)-th word sequence into the third seq2seq model to generate the i-th word sequence.
7. The method according to claim 1, characterized in that said generating, at least based on the first word sequence and the preset final, by means of the text generation model and according to the preset rhyming rule, the other word sequences of the target text to be generated other than the first word sequence comprises:
determining the third final of the last character of the first word sequence as the preset final; and generating, at least based on the (i-1)-th word sequence and the preset final and by means of the text generation model, an i-th word sequence that rhymes with the first word sequence, wherein i is a positive integer greater than or equal to 2 and less than or equal to N, and N is the total number of word sequences included in the target text to be generated.
8. A text generating apparatus, characterized in that the apparatus comprises: a first obtaining unit, a first generation unit, a second generation unit, and a second obtaining unit, wherein
the first obtaining unit is configured to obtain a keyword and a topic corresponding to a target text to be generated, wherein the keyword is a part of the character sequence formed by the first characters of the word sequences in the target text to be generated;
the first generation unit is configured to generate, based on the keyword and the topic and by means of a pre-trained text generation model, the first word sequence of the target text to be generated;
the second generation unit is configured to generate, at least based on the first word sequence and a preset final, by means of the text generation model and according to a preset rhyming rule, the other word sequences of the target text to be generated other than the first word sequence;
the second obtaining unit is configured to combine the first word sequence and the other word sequences in the order in which they were generated, to obtain a rhymed target text.
9. A storage medium, characterized in that the storage medium comprises a stored program, wherein, when the program runs, a device on which the storage medium is located is controlled to execute the text generation method according to any one of claims 1 to 7.
10. An electronic device, characterized in that the electronic device comprises:
at least one processor;
and at least one memory and a bus connected to the processor;
wherein the processor and the memory communicate with each other through the bus; and the processor is configured to call program instructions in the memory to execute the text generation method according to any one of claims 1 to 7.
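
The line-by-line procedure recited in the method claims can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the three callables stand in for the three trained seq2seq models, and their signatures, the `final_of` helper, the scheme labels, and the way the preset final is handed to a model are all assumptions introduced here. The claims do not specify how the model enforces the rhyme.

```python
from typing import Callable, List, Optional

# Hypothetical call signatures standing in for the three trained seq2seq models.
FirstModel = Callable[[str, str], str]                 # (head_char, topic) -> line 1
HeadModel = Callable[[str, str, Optional[str]], str]   # (prev_line, head_char, final) -> line i
PlainModel = Callable[[str, Optional[str]], str]       # (prev_line, final) -> line i


def rhyme_anchor(i: int, scheme: str) -> Optional[int]:
    """Return the 1-based line that line i must rhyme with, or None.

    "claim2": lines 4, 6, 8, ... rhyme with line 2.
    "claim3": lines 2, 4, 6, ... rhyme with line 1.
    "claim7": every line from line 2 onward rhymes with line 1.
    """
    if scheme == "claim2":
        return 2 if i >= 4 and i % 2 == 0 else None
    if scheme == "claim3":
        return 1 if i >= 2 and i % 2 == 0 else None
    if scheme == "claim7":
        return 1 if i >= 2 else None
    raise ValueError(f"unknown scheme: {scheme}")


def generate_lines(
    keyword: str,
    topic: str,
    n_lines: int,
    first_model: FirstModel,
    head_model: HeadModel,
    plain_model: PlainModel,
    final_of: Callable[[str], str],
    scheme: str = "claim3",
) -> List[str]:
    """Generate n_lines lines in order; the caller joins them into the text."""
    if not keyword or n_lines < 1:
        raise ValueError("need a non-empty keyword and at least one line")
    # Line 1: first character of the keyword plus the topic.
    lines = [first_model(keyword[0], topic)]
    for i in range(2, n_lines + 1):
        anchor = rhyme_anchor(i, scheme)
        # Preset final: the final of the last character of the anchor line.
        final = final_of(lines[anchor - 1][-1]) if anchor is not None else None
        prev = lines[-1]
        if i <= len(keyword):
            # The keyword has an i-th character: it must open line i.
            lines.append(head_model(prev, keyword[i - 1], final))
        else:
            lines.append(plain_model(prev, final))
    return lines
```

Joining the returned list in order (for example with `"\n".join(lines)`) corresponds to the final combining step. In practice `final_of` would need a pronunciation lookup for Chinese characters, which is left abstract here.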
CN201810540691.2A 2018-05-30 2018-05-30 Text generation method and device Active CN109002433B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810540691.2A CN109002433B (en) 2018-05-30 2018-05-30 Text generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810540691.2A CN109002433B (en) 2018-05-30 2018-05-30 Text generation method and device

Publications (2)

Publication Number Publication Date
CN109002433A true CN109002433A (en) 2018-12-14
CN109002433B CN109002433B (en) 2022-04-01

Family

ID=64574195

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810540691.2A Active CN109002433B (en) 2018-05-30 2018-05-30 Text generation method and device

Country Status (1)

Country Link
CN (1) CN109002433B (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110134960A (en) * 2019-05-15 2019-08-16 北京奇艺世纪科技有限公司 A kind of generation method and relevant device of text
CN110209803A (en) * 2019-06-18 2019-09-06 腾讯科技(深圳)有限公司 Story generation method, device, computer equipment and storage medium
CN110287489A (en) * 2019-06-24 2019-09-27 北京大米科技有限公司 Document creation method, device, storage medium and electronic equipment
CN110377902A (en) * 2019-06-21 2019-10-25 北京百度网讯科技有限公司 The training method and device of text generation model are described
CN110705310A (en) * 2019-09-20 2020-01-17 北京金山数字娱乐科技有限公司 Article generation method and device
CN111444679A (en) * 2020-03-27 2020-07-24 北京小米松果电子有限公司 Poetry generation method and device, electronic equipment and storage medium
CN111767694A (en) * 2019-03-26 2020-10-13 北京京东尚科信息技术有限公司 Text generation method and device and computer readable storage medium
CN111783455A (en) * 2020-07-13 2020-10-16 网易(杭州)网络有限公司 Training method and device of text generation model and text generation method and device
CN111898339A (en) * 2020-07-28 2020-11-06 中国平安人寿保险股份有限公司 Ancient poetry generation method, device, equipment and medium based on constraint decoding
CN115994532A (en) * 2023-03-22 2023-04-21 暗链科技(深圳)有限公司 Corpus classification method, nonvolatile readable storage medium and electronic device
CN116011430A (en) * 2023-03-22 2023-04-25 暗链科技(深圳)有限公司 Vowel duplication elimination method, nonvolatile readable storage medium and electronic equipment
CN116011431A (en) * 2023-03-22 2023-04-25 暗链科技(深圳)有限公司 Method for generating mnemonic words and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102385596A (en) * 2010-09-03 2012-03-21 腾讯科技(深圳)有限公司 Verse searching method and device
CN105185373A (en) * 2015-08-06 2015-12-23 百度在线网络技术(北京)有限公司 Rhythm-level prediction model generation method and apparatus, and rhythm-level prediction method and apparatus
CN105551481A (en) * 2015-12-21 2016-05-04 百度在线网络技术(北京)有限公司 Rhythm marking method of voice data and apparatus thereof
CN105955964A (en) * 2016-06-13 2016-09-21 北京百度网讯科技有限公司 Method and apparatus for automatically generating poem
CN106569995A (en) * 2016-09-26 2017-04-19 天津大学 Method for automatically generating Chinese poetry based on corpus and metrical rule
CN106776517A (en) * 2016-12-20 2017-05-31 科大讯飞股份有限公司 Automatic compose poem method and apparatus and system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
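QIXIN WANG et al.: "Can Machine Generate Traditional Chinese Poetry? A Feigenbaum Test", 《SPRINGER INTERNATIONAL PUBLISHING AG 2016》 *
XIAOYUAN YI et al.: "Generating Chinese Classical Poems with RNN Encoder-Decoder", 《SPRINGER INTERNATIONAL PUBLISHING AG 2017》 *
王哲 (Wang Zhe): "Research on Methods for Generating Traditional Chinese Poetry Based on Deep Learning Technology" (基于深度学习技术的中国传统诗歌生成方法研究), 《China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology Series》 *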

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767694A (en) * 2019-03-26 2020-10-13 北京京东尚科信息技术有限公司 Text generation method and device and computer readable storage medium
CN111767694B (en) * 2019-03-26 2024-04-16 北京京东尚科信息技术有限公司 Text generation method, apparatus and computer readable storage medium
CN110134960A (en) * 2019-05-15 2019-08-16 北京奇艺世纪科技有限公司 A kind of generation method and relevant device of text
CN110209803A (en) * 2019-06-18 2019-09-06 腾讯科技(深圳)有限公司 Story generation method, device, computer equipment and storage medium
CN110209803B (en) * 2019-06-18 2023-11-14 腾讯科技(深圳)有限公司 Story generation method, apparatus, computer device and storage medium
CN110377902A (en) * 2019-06-21 2019-10-25 北京百度网讯科技有限公司 The training method and device of text generation model are described
CN110377902B (en) * 2019-06-21 2023-07-25 北京百度网讯科技有限公司 Training method and device for descriptive text generation model
CN110287489A (en) * 2019-06-24 2019-09-27 北京大米科技有限公司 Document creation method, device, storage medium and electronic equipment
CN110705310B (en) * 2019-09-20 2023-07-18 北京金山数字娱乐科技有限公司 Article generation method and device
CN110705310A (en) * 2019-09-20 2020-01-17 北京金山数字娱乐科技有限公司 Article generation method and device
CN111444679A (en) * 2020-03-27 2020-07-24 北京小米松果电子有限公司 Poetry generation method and device, electronic equipment and storage medium
CN111444679B (en) * 2020-03-27 2024-05-24 北京小米松果电子有限公司 Poem generation method and device, electronic equipment and storage medium
CN111783455A (en) * 2020-07-13 2020-10-16 网易(杭州)网络有限公司 Training method and device of text generation model and text generation method and device
CN111783455B (en) * 2020-07-13 2024-06-04 网易(杭州)网络有限公司 Training method and device of text generation model, and text generation method and device
CN111898339B (en) * 2020-07-28 2023-07-21 中国平安人寿保险股份有限公司 Ancient poetry generating method, device, equipment and medium based on constraint decoding
CN111898339A (en) * 2020-07-28 2020-11-06 中国平安人寿保险股份有限公司 Ancient poetry generation method, device, equipment and medium based on constraint decoding
CN116011431A (en) * 2023-03-22 2023-04-25 暗链科技(深圳)有限公司 Method for generating mnemonic words and electronic equipment
CN116011430A (en) * 2023-03-22 2023-04-25 暗链科技(深圳)有限公司 Vowel duplication elimination method, nonvolatile readable storage medium and electronic equipment
CN115994532A (en) * 2023-03-22 2023-04-21 暗链科技(深圳)有限公司 Corpus classification method, nonvolatile readable storage medium and electronic device
CN116011430B (en) * 2023-03-22 2024-04-02 暗链科技(深圳)有限公司 Vowel duplication elimination method, nonvolatile readable storage medium and electronic equipment

Also Published As

Publication number Publication date
CN109002433B (en) 2022-04-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221206

Address after: 210034 floor 8, building D11, Hongfeng Science Park, Nanjing Economic and Technological Development Zone, Jiangsu Province

Patentee after: New Technology Co.,Ltd.

Address before: 100094 1001, 10th floor, office building a, 19 Zhongguancun Street, Haidian District, Beijing

Patentee before: MOBVOI INFORMATION TECHNOLOGY Co.,Ltd.