CN106227714A - Method and apparatus for obtaining keywords for generating poems based on artificial intelligence
- Publication number
- CN106227714A (CN201610556319.1A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Abstract
The object of the invention is to provide a method and apparatus for obtaining keywords for generating poems based on artificial intelligence. The method according to the invention includes: extracting one or more base keywords from poem request information; when a base keyword is not in the poem corpus, obtaining one or more expanded keywords corresponding to that base keyword; and selecting, from the one or more expanded keywords, at least one expanded keyword contained in the poem corpus as the corpus keyword corresponding to that base keyword, so as to generate a corresponding verse based on that corpus keyword. The advantage of the invention is that expanding the base keywords converts base keywords into corpus keywords, so that the automatic poem-generation mechanism can cope with the continual renewal and change of modern language.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a method and apparatus for obtaining keywords for generating poems based on artificial intelligence.
Background
Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. It is a branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can respond in a manner similar to human intelligence. Research in this field includes robotics, language recognition, image recognition, natural language processing and expert systems.
Existing poem generation techniques can usually accept only keywords as input, not long sentences. Moreover, the accepted keywords generally have to be words common in classical poetry, and the generation process itself relies mainly on a corpus of classical poetry. Natural language, however, has kept evolving: modern vocabulary that does not appear in classical poetry has emerged, such as the names of new things or of contemporary people, for example "hot pot" or "Zhou Jielun", and some words now carry meanings entirely different from their ancient ones. In these cases, existing poem generation methods often cannot fuse new vocabulary with the rhythm of classical poetry, and cannot properly recognize and process the natural language needed to generate a poem.
Summary of the invention
The object of the invention is to provide a method and apparatus for obtaining keywords for generating poems based on artificial intelligence.
According to one aspect of the invention, a method for obtaining keywords for generating poems based on artificial intelligence is provided, wherein the method comprises the following steps:
a. extracting one or more base keywords from poem request information;
b. when a base keyword is not in the poem corpus, obtaining one or more expanded keywords corresponding to that base keyword;
c. selecting, from the one or more expanded keywords, at least one expanded keyword contained in the poem corpus as the corpus keyword corresponding to that base keyword, so as to generate a corresponding verse based on that corpus keyword.
According to another aspect of the invention, a word-taking device for obtaining keywords for generating poems based on artificial intelligence is provided, wherein the word-taking device includes:
an extraction device for extracting one or more base keywords from poem request information;
a first acquisition device for obtaining, when a base keyword is not in the poem corpus, one or more expanded keywords corresponding to that base keyword;
a first selection device for selecting, from the one or more expanded keywords, at least one expanded keyword contained in the poem corpus as the corpus keyword corresponding to that base keyword, so as to generate a corresponding verse based on that corpus keyword.
Compared with the prior art, the invention has the following advantages: by expanding the base keywords, it converts base keywords into corpus keywords, so that poems can be generated automatically that both satisfy the original poem request information and meet the rhythm, wording and other requirements of classical poetry. This fuses modern language and culture with poem types and diction, so that the automatic poem-generation mechanism can cope with the continual renewal and change of modern language and can meet users' demands for generated poems more broadly.
Brief description of the drawings
Other features, objects and advantages of the invention will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
Fig. 1 shows a flowchart of a method for obtaining keywords for generating poems based on artificial intelligence according to the invention;
Fig. 2 shows a structural diagram of a word-taking device for obtaining keywords for generating poems based on artificial intelligence according to the invention;
In the drawings, the same or similar reference numerals denote the same or similar components.
Detailed description
The invention is described in further detail below with reference to the drawings.
Fig. 1 shows a flowchart of a method for obtaining keywords for generating poems based on artificial intelligence according to the invention. The method according to the invention includes step S1, step S2 and step S3.
The method according to the invention is implemented by a word-taking device contained in a computer device. The computer device is an electronic device that can automatically perform numerical computation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), digital signal processors (DSP) and embedded devices. The computer device includes network devices and user devices.
The network device includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a form of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
The user device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, mouse, remote control, touchpad or voice-control device, for example a computer, tablet computer, smartphone, PDA or handheld device.
With reference to Fig. 1, in step S1 the word-taking device extracts one or more base keywords from poem request information.
The poem request information is request information for generating a poem. Preferably, the poem request information includes one or more base keywords.
Preferably, the poem request information takes the form of a long sentence with a complex structure.
Specifically, the word-taking device uses natural language processing such as semantic analysis and word segmentation to extract one or more base keywords from the poem request information.
Preferably, the word-taking device ranks the words in the poem request information by their inverse document frequency (IDF) and selects the required one or more base keywords according to the ranking of the words' IDF values.
The IDF value of a word can be obtained by the following formula (1):
$$\mathrm{idf}_t = \log \frac{|D|}{|D_t|} \qquad (1)$$
where $\mathrm{idf}_t$ is the IDF value of keyword $t$, $|D|$ is the total number of documents in the corpus, and $|D_t|$ is the number of documents containing keyword $t$.
Note that the IDF value of a keyword may differ between corpora. For example, for the same keyword, the IDF value computed on a poem corpus of classical poetry may differ from the one computed on a web corpus of all web pages.
Those skilled in the art should choose the corpus according to the actual situation and requirements, and then obtain the IDF values of the corresponding keywords. For example, the IDF value of each keyword in the poem request information can be computed directly on the poem corpus; as another example, when a keyword in the poem request information is found not to be in the poem corpus, its IDF value can be computed on the web corpus and then adjusted by an adjustment weight between the poem corpus and the web corpus, and so on. Details are omitted here.
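The IDF-based ranking described above can be illustrated with a minimal Python sketch. It assumes a toy corpus in which each document is a set of words, uses the natural logarithm, and ranks higher IDF values first; the text fixes none of these choices, and the names `idf`, `rank_by_idf` and `toy_corpus` are hypothetical.

```python
import math

def idf(term, documents):
    """Inverse document frequency: log(|D| / |D_t|).

    `documents` is a list of word sets, one per document (an assumed toy
    representation). A term that occurs in no document gets infinity.
    """
    containing = sum(1 for words in documents if term in words)
    if containing == 0:
        return math.inf
    return math.log(len(documents) / containing)

def rank_by_idf(words, documents):
    """Return `words` sorted from highest to lowest IDF (assumed ordering)."""
    return sorted(words, key=lambda w: idf(w, documents), reverse=True)

# Toy corpus: each document is the set of words it contains.
toy_corpus = [
    {"moon", "river"},
    {"moon", "wine"},
    {"moon", "river", "sail"},
    {"wine"},
]
ranked = rank_by_idf(["moon", "sail", "river"], toy_corpus)
print(ranked)
```

A word absent from the corpus gets an infinite IDF in this sketch; the text instead suggests falling back to a web corpus with an adjustment weight, which is not modelled here.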
Then, in step S2, when a base keyword is not in the poem corpus, the word-taking device obtains one or more expanded keywords corresponding to that base keyword.
The poem corpus is a corpus composed of poems, for example a corpus containing various materials such as Tang poetry, Song ci and Yuan qu.
Specifically, when a base keyword is not in the poem corpus, the word-taking device expands the base keyword using multiple web pages to obtain one or more expanded keywords corresponding to it.
Preferably, the word-taking device obtains one or more pieces of web page information corresponding to the base keyword, and then extracts from them the expanded keywords corresponding to each piece of web page information.
The expanded keywords are different from the base keyword.
For example, the word-taking device performs a web query based on a base keyword to obtain one or more result pages corresponding to it and, based on the base keyword, extracts from those result pages words that are similar or related to the base keyword as expanded keywords.
The word-taking device can determine the expanded keywords that are similar or related to the base keyword by natural language processing such as semantic analysis.
More preferably, the word-taking device searches a web page database based on the base keyword to obtain multiple pieces of web page information corresponding to it, and then, based on the quality information of each piece of web page information, selects from them the one or more pieces whose quality information meets a predetermined quality condition.
Those skilled in the art will appreciate that the quality information of a web page can be determined from parameters such as its visit count, number of external links and user dwell time; details are omitted here.
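One possible reading of the quality filter is sketched below in Python. The text does not specify the quality condition, so the `PageInfo` record, the conjunctive rule and the threshold values are all assumptions chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class PageInfo:
    """Hypothetical record of the quality parameters named in the text."""
    url: str
    visits: int
    external_links: int
    dwell_seconds: float

def meets_quality(page, min_visits=1000, min_links=10, min_dwell=30.0):
    """Assumed quality condition: every parameter must reach its threshold."""
    return (
        page.visits >= min_visits
        and page.external_links >= min_links
        and page.dwell_seconds >= min_dwell
    )

def filter_pages(pages, **thresholds):
    """Keep only the pages that satisfy the quality condition."""
    return [p for p in pages if meets_quality(p, **thresholds)]

pages = [
    PageInfo("https://example.com/a", visits=5000, external_links=40, dwell_seconds=95.0),
    PageInfo("https://example.com/b", visits=120, external_links=2, dwell_seconds=8.0),
]
good_pages = filter_pages(pages)
print([p.url for p in good_pages])
```

A weighted score over the same parameters would fit the description equally well; the hard thresholds are used here only because they are the simplest condition to state.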
Then, in step S3, the word-taking device selects, from the one or more expanded keywords, at least one expanded keyword contained in the poem corpus as the corpus keyword corresponding to the base keyword, so as to generate a corresponding verse based on that corpus keyword.
Specifically, the word-taking device checks whether each of the expanded keywords is contained in the poem corpus; an expanded keyword that is contained in the poem corpus is taken as a corpus keyword.
In one example according to the invention, the word-taking device obtains the base keyword "Liu Dehua" in step S1 and determines that it is not contained in the poem corpus. Then, in step S2, the word-taking device searches for and obtains one or more web pages for the base keyword "Liu Dehua" and obtains from them multiple expanded keywords corresponding to it, such as "king", "singer" and "performer". Then, in step S3, the word-taking device selects the expanded keyword "king", which is contained in the poem corpus, as the corpus keyword corresponding to the base keyword "Liu Dehua", so as to generate a corresponding verse based on it.
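The "Liu Dehua" example can be traced with a short Python sketch. The web search and extraction of steps S2 is replaced by a hypothetical `expand` callable backed by a fixed table, and the poem corpus is reduced to an assumed vocabulary set; neither is specified by the text.

```python
def select_corpus_keywords(base_keyword, poem_vocabulary, expand):
    """Steps S2 and S3 for a base keyword that is missing from the poem corpus.

    `expand` is a hypothetical stand-in for the web search and extraction
    step: it maps a base keyword to a list of candidate expanded keywords.
    """
    # Expanded keywords must differ from the base keyword itself.
    candidates = [w for w in expand(base_keyword) if w != base_keyword]
    # Keep only the candidates that occur in the poem corpus.
    return [w for w in candidates if w in poem_vocabulary]

# Assumed toy data mirroring the example in the text.
poem_vocabulary = {"king", "moon", "river"}
expansion_table = {"Liu Dehua": ["king", "singer", "performer"]}

corpus_keywords = select_corpus_keywords(
    "Liu Dehua",
    poem_vocabulary,
    lambda keyword: expansion_table.get(keyword, []),
)
print(corpus_keywords)
```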
Preferably, when a base keyword corresponds to one or more expanded keywords, the word-taking device obtains the weight information of each expanded keyword and, based on that weight information, selects at least one of them as the corpus keyword corresponding to the base keyword.
The weight information indicates the importance of an expanded keyword, for example the IDF value of the expanded keyword in the web page database or, when the expanded keyword is contained in the poem corpus, its IDF value in the poem corpus.
More preferably, from the multiple expanded keywords corresponding to a base keyword, the word-taking device selects a predetermined number of expanded keywords (for example, x) according to each expanded keyword's IDF value in the web page database. It then checks whether each of these is in the poem corpus, obtains the poem-corpus IDF value of each expanded keyword that is contained in the poem corpus (for example, y of them, where y ≤ x), and selects at least one expanded keyword based on that IDF value as the corpus keyword corresponding to the base keyword.
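The two-stage selection just described might look as follows in Python. The sketch assumes precomputed IDF tables for the web page database and the poem corpus, and assumes that a higher IDF value means a more important keyword; the text does not state the ranking direction, and all names and numbers are hypothetical.

```python
def select_by_two_stage_idf(candidates, web_idf, poem_idf, x, k=1):
    """Pick corpus keywords from expanded keywords in two stages.

    Stage 1: keep the x candidates with the highest web IDF.
    Stage 2: of those, keep the ones found in the poem corpus (y <= x of
    them) and return the k with the highest poem-corpus IDF.
    """
    top_web = sorted(candidates, key=lambda w: web_idf.get(w, 0.0), reverse=True)[:x]
    in_corpus = [w for w in top_web if w in poem_idf]
    return sorted(in_corpus, key=lambda w: poem_idf[w], reverse=True)[:k]

# Assumed toy IDF tables; a word missing from `poem_idf` is not in the poem corpus.
web_idf = {"king": 2.0, "singer": 3.0, "performer": 2.5, "star": 1.0}
poem_idf = {"king": 1.2, "performer": 0.9, "star": 0.4}

chosen = select_by_two_stage_idf(
    ["king", "singer", "performer", "star"], web_idf, poem_idf, x=3
)
print(chosen)
```

With x = 3 the first stage keeps "singer", "performer" and "king"; only the last two are in the poem corpus (y = 2), and "king" has the higher poem-corpus IDF.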
According to a preferred scheme of the invention, the method further includes step S4 (not shown).
In step S4, when a base keyword is contained in the poem corpus, the word-taking device takes that base keyword as a corpus keyword.
According to another preferred scheme of the invention, the method further includes step S5 (not shown) and step S6 (not shown).
In step S5, the word-taking device obtains the type of poem to be generated.
The poem type covers the structural form of the poem, for example five-character ancient verse, seven-character ancient verse, five-character quatrain, five-character regulated verse, and the various ci and qu tune patterns.
Specifically, the word-taking device can determine the type of poem to be generated from the user's input or from a default type.
Then, in step S6, the word-taking device determines the total number N of required corpus keywords based on the poem type.
Specifically, the word-taking device determines the number of lines corresponding to the type of poem to be generated, and determines the required number of keywords from that number of lines.
Preferably, the word-taking device takes the number of lines of the obtained poem type as the total number N of required corpus keywords.
For example, if the word-taking device determines in step S5 that the user has selected a seven-character quatrain, then in step S6 it determines that the total number of required corpus keywords is 4. As another example, if the word-taking device receives the user's selection of a five-character regulated verse, then in step S6 it determines that the total is 8. As a further example, if the type received is the ci tune "Nian Nu Jiao", then in step S6 the word-taking device determines from the sentence pattern of that tune that the total number of corresponding corpus keywords is 8.
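Steps S5 and S6 amount to a lookup from poem type to line count. The Python sketch below assumes one keyword per line, as in the preferred variant above; the type labels are hypothetical names for the two fixed-length forms in the examples, and the counts 4 and 8 are taken from those examples.

```python
# Hypothetical type labels; the counts 4 and 8 follow the examples in the text.
LINES_PER_POEM_TYPE = {
    "seven-character quatrain": 4,
    "five-character regulated verse": 8,
}

def required_keyword_count(poem_type=None, default_type="seven-character quatrain"):
    """Return N, the number of corpus keywords needed (one per line).

    Falls back to `default_type` when the user supplies no type, mirroring
    the "user input or default type" choice described in step S5.
    """
    chosen = poem_type if poem_type is not None else default_type
    if chosen not in LINES_PER_POEM_TYPE:
        raise ValueError(f"unknown poem type: {chosen!r}")
    return LINES_PER_POEM_TYPE[chosen]

print(required_keyword_count("five-character regulated verse"))
print(required_keyword_count())
```

Tune patterns with irregular line lengths would need their own entries derived from the sentence pattern of each tune, which this sketch leaves out.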
According to a preferred embodiment of this scheme, the method further includes step S7 (not shown).
In step S7, when more than N base keywords are extracted from the poem request information, the word-taking device selects N base keywords from them based on the weight information of each base keyword.
The way the word-taking device selects N base keywords based on their weight information is the same as or similar to the way, described above, in which it selects at least one expanded keyword based on the weight information of the expanded keywords, and is not repeated here.
According to another preferred embodiment of this scheme, the method further includes step S8 (not shown).
In step S8, when fewer than N corpus keywords have been determined, the word-taking device expands the determined corpus keywords based on the poem corpus to obtain the remaining number of corpus keywords from the corpus database.
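Steps S7 and S8 together bring the keyword list to exactly N entries where possible. The following Python sketch shows one way to combine them; `weights` and `suggest_next` are hypothetical inputs, the latter standing in for the corpus-based expansion, and keeping the surviving keywords in their original order is a choice made here rather than something the text requires.

```python
def fit_keywords_to_n(keywords, weights, n, suggest_next):
    """Trim or extend `keywords` so that it holds n entries where possible.

    More than n keywords: keep the n with the largest weight, in their
    original order. Fewer than n: repeatedly ask `suggest_next`, a
    hypothetical callable, for one more keyword given the current list.
    """
    if len(keywords) > n:
        ranked = sorted(keywords, key=lambda w: weights.get(w, 0.0), reverse=True)
        keep = set(ranked[:n])
        return [w for w in keywords if w in keep]
    result = list(keywords)
    while len(result) < n:
        candidate = suggest_next(result)
        if candidate is None or candidate in result:
            break  # nothing new to add; stop rather than loop forever
        result.append(candidate)
    return result

trimmed = fit_keywords_to_n(["a", "b", "c"], {"a": 1.0, "b": 3.0, "c": 2.0}, 2, lambda r: None)
extended = fit_keywords_to_n(["sunset clouds"], {}, 2, lambda r: "lone duck")
print(trimmed, extended)
```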
Specifically, the word-taking device uses language model probabilities, obtained by collecting statistics on the words in the poem corpus, to determine the associated keywords (preceding or following context keywords) corresponding to the one or more corpus keywords, and takes the context keywords obtained as corpus keywords.
The word-taking device can obtain the context keywords of each word directly from the language model probabilities already computed for that word; alternatively, it can compute the language model probability of a given corpus keyword in the poem corpus in real time to obtain the context keywords corresponding to that corpus keyword.
For example, when the corpus keyword obtained is "sunset clouds", the word-taking device can determine from the language model probabilities computed over the keywords of the poems in the poem corpus that the most frequent following keyword is "lone duck", and takes this following keyword as a corpus keyword.
As another example, when the corpus keyword obtained is "Changjiang" (the Yangtze River), the word-taking device collects statistics on the poem lines in the poem corpus that contain the word "Changjiang", as follows:
(1)
Boundless / falling leaves / rustling / down,
Endless / Changjiang / rolling / comes.
(2)
Lone sail / distant shadow / blue sky / vanishes,
Only see / Changjiang / horizon / flows.
From these two passages it can be determined that the preceding keywords of "Changjiang" include "endless" and "only see", and the following keywords include "rolling" and "horizon".
The word-taking device can then select the corresponding context keywords from among them as corpus keywords based on the language model probabilities.
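The context statistics in the "Changjiang" example can be reproduced with a small Python sketch. It assumes the poem lines are already segmented into words and simply counts the word immediately before and after each occurrence of the keyword; the English glosses used as tokens are illustrative only.

```python
from collections import Counter

def context_keywords(segmented_lines, keyword):
    """Count the words directly before and after `keyword` in segmented lines."""
    preceding, following = Counter(), Counter()
    for words in segmented_lines:
        for i, word in enumerate(words):
            if word != keyword:
                continue
            if i > 0:
                preceding[words[i - 1]] += 1
            if i + 1 < len(words):
                following[words[i + 1]] += 1
    return preceding, following

# Assumed segmentation of the two couplets quoted in the text.
segmented_lines = [
    ["boundless", "falling leaves", "rustling", "down"],
    ["endless", "Changjiang", "rolling", "comes"],
    ["lone sail", "distant shadow", "blue sky", "vanishes"],
    ["only see", "Changjiang", "horizon", "flows"],
]
preceding, following = context_keywords(segmented_lines, "Changjiang")
print(preceding.most_common(), following.most_common())
```

On a real corpus, `most_common(1)` on either counter would give the most frequent preceding or following keyword, which corresponds to the "sunset clouds" case described earlier.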
Preferably, suppose that K corpus keywords are required and that m corpus keywords (m < K) have been obtained so far, and let $W_i$ denote the i-th keyword. The process of obtaining the remaining $K - m$ corpus keywords can then be expressed by the following formula (2):
$$\hat{W}_{m+1:K} = \arg\max_{W_{m+1:K}} P(W_{m+1:K} \mid W_{1:m}) \qquad (2)$$
where $W_{m+1:K}$ denotes the sequence from the (m+1)-th to the K-th keyword, and $P(W_{m+1:K} \mid W_{1:m})$ denotes the conditional probability that $W_{m+1:K}$ occurs given $W_{1:m}$ (the sequence from the 1st to the m-th word).
Under the Markov assumption, the probability of each word depends only on the preceding $n - 1$ words (n is a hyperparameter here, typically set to 5). $P(W_{m+1:K} \mid W_{1:m})$ is therefore solved with an n-gram language model, which gives the following formula (3):
$$P(W_{m+1:K} \mid W_{1:m}) = \prod_{j=m+1}^{K} P(W_j \mid W_{j-n+1}, \ldots, W_{j-1}) \qquad (3)$$
where $W_{j-n+1}, \ldots, W_{j-1}$ are the $n - 1$ words preceding $W_j$, and $P(W_j \mid W_{j-n+1}, \ldots, W_{j-1})$ is the conditional probability of generating $W_j$ given the preceding $n - 1$ words.
The probability $P(W_j \mid W_{j-n+1}, \ldots, W_{j-1})$ can be obtained by maximum likelihood estimation using the following formula (4):
$$P(W_j \mid W_{j-n+1}, \ldots, W_{j-1}) = \frac{C(W_{j-n+1}, \ldots, W_j)}{C(W_{j-n+1}, \ldots, W_{j-1})} \qquad (4)$$
where $C(W_{j-n+1}, \ldots, W_j)$ is a frequency count, namely the number of occurrences of the word string $W_{j-n+1}, \ldots, W_j$ in the corpus; similarly, $C(W_{j-n+1}, \ldots, W_{j-1})$ is the number of occurrences of the word string $W_{j-n+1}, \ldots, W_{j-1}$ in the corpus.
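The count-based estimate and the n-gram product described above can be sketched in Python as follows. The sketch assumes n ≥ 2 and uses a bigram model (n = 2) on a two-line toy corpus instead of the n = 5 mentioned in the text, with no smoothing; the function names and the toy data are hypothetical.

```python
from collections import Counter

def ngram_counts(sentences, n):
    """Count every word string of length n-1 and n (requires n >= 2)."""
    counts = Counter()
    for words in sentences:
        for size in (n - 1, n):
            for i in range(len(words) - size + 1):
                counts[tuple(words[i:i + size])] += 1
    return counts

def mle_probability(counts, history, word):
    """Maximum likelihood estimate: C(history + word) / C(history)."""
    denominator = counts[tuple(history)]
    if denominator == 0:
        return 0.0  # unseen history; a real system would smooth instead
    return counts[tuple(history) + (word,)] / denominator

def sequence_probability(counts, given, continuation, n):
    """Product of the per-word conditional probabilities over `continuation`."""
    words = list(given)
    probability = 1.0
    for word in continuation:
        probability *= mle_probability(counts, words[-(n - 1):], word)
        words.append(word)
    return probability

toy_sentences = [
    ["endless", "Changjiang", "rolling", "comes"],
    ["only see", "Changjiang", "horizon", "flows"],
]
counts = ngram_counts(toy_sentences, n=2)
p_next = mle_probability(counts, ["Changjiang"], "rolling")
p_seq = sequence_probability(counts, ["Changjiang"], ["rolling", "comes"], n=2)
print(p_next, p_seq)
```

Choosing the remaining keywords then means searching for the continuation with the highest `sequence_probability`, for example greedily or with beam search; the text does not say which search strategy is used.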
According to another preferred scheme of the invention, the method further includes: the word-taking device determining the poem request information based on received voice information; and/or the word-taking device converting the generated verse into voice information.
Those skilled in the art should convert the text or voice corresponding to the verse into the appropriate form according to the actual situation and requirements; details are omitted here.
By expanding the base keywords, the method according to the invention converts base keywords into corpus keywords, so that poems can be generated automatically that both satisfy the original poem request information and meet the rhythm, wording and other requirements of classical poetry. This fuses modern language and culture with poem types and diction, so that the automatic poem-generation mechanism can cope with the continual renewal and change of modern language and can meet users' demands for generated poems more broadly.
Fig. 2 shows a structural diagram of a word-taking device for obtaining keywords for generating poems based on artificial intelligence according to the invention. The word-taking device according to the invention includes an extraction device 1, a first acquisition device 2 and a first selection device 3.
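The three-part structure can be pictured as a small composition, sketched below in Python. The class name, the callables standing in for extraction device 1 and first acquisition device 2, and the vocabulary set standing in for the poem corpus are all hypothetical; the sketch also passes base keywords that are already in the corpus straight through.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

@dataclass
class WordTakingDevice:
    """Hypothetical composition of the three components shown in Fig. 2."""
    extract: Callable[[str], List[str]]   # stands in for extraction device 1
    acquire: Callable[[str], List[str]]   # stands in for first acquisition device 2
    poem_vocabulary: Set[str]             # assumed view of the poem corpus

    def select(self, candidates: List[str]) -> List[str]:
        """Stands in for first selection device 3: keep in-corpus candidates."""
        return [w for w in candidates if w in self.poem_vocabulary]

    def corpus_keywords(self, request: str) -> List[str]:
        result = []
        for base in self.extract(request):
            if base in self.poem_vocabulary:
                result.append(base)  # already usable as a corpus keyword
            else:
                result.extend(self.select(self.acquire(base)))
        return result

device = WordTakingDevice(
    extract=lambda text: text.split(),
    acquire=lambda word: {"LiuDehua": ["king", "singer"]}.get(word, []),
    poem_vocabulary={"king", "moon"},
)
print(device.corpus_keywords("moon LiuDehua"))
```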
With reference to Fig. 2, the extraction device 1 extracts one or more base keywords from poem request information.
The poem request information is request information for generating a poem. Preferably, the poem request information includes one or more base keywords.
Preferably, the poem request information takes the form of a long sentence with a complex structure.
Specifically, the extraction device 1 uses natural language processing such as semantic analysis and word segmentation to extract one or more base keywords from the poem request information.
Preferably, the extraction device 1 ranks the words in the poem request information by their inverse document frequency (IDF) and selects the required one or more base keywords according to the ranking of the words' IDF values.
The IDF value of a word can be obtained by the following formula (1):
$$\mathrm{idf}_t = \log \frac{|D|}{|D_t|} \qquad (1)$$
where $\mathrm{idf}_t$ is the IDF value of keyword $t$, $|D|$ is the total number of documents in the corpus, and $|D_t|$ is the number of documents containing keyword $t$.
Note that the IDF value of a keyword may differ between corpora. For example, for the same keyword, the IDF value computed on a poem corpus of classical poetry may differ from the one computed on a web corpus of all web pages.
Those skilled in the art should choose the corpus according to the actual situation and requirements, and then obtain the IDF values of the corresponding keywords. For example, the IDF value of each keyword in the poem request information can be computed directly on the poem corpus; as another example, when a keyword in the poem request information is found not to be in the poem corpus, its IDF value can be computed on the web corpus and then adjusted by an adjustment weight between the poem corpus and the web corpus, and so on. Details are omitted here.
Then, when a base keyword is not in the poem corpus, the first acquisition device 2 obtains one or more expanded keywords corresponding to that base keyword.
The poem corpus is a corpus composed of poems, for example a corpus containing various materials such as Tang poetry, Song ci and Yuan qu.
Specifically, when a base keyword is not in the poem corpus, the first acquisition device 2 expands the base keyword using multiple web pages to obtain one or more expanded keywords corresponding to it.
Preferably, a sub-acquisition device (not shown) contained in the first acquisition device 2 obtains one or more pieces of web page information corresponding to the base keyword; a sub-extraction device (not shown) contained in the first acquisition device 2 then extracts from them the expanded keywords corresponding to each piece of web page information.
The expanded keywords are different from the base keyword.
For example, the sub-acquisition device performs a web query based on a base keyword to obtain one or more result pages corresponding to it, and the sub-extraction device, based on the base keyword, extracts from those result pages words that are similar or related to the base keyword as expanded keywords.
The sub-extraction device can determine the expanded keywords that are similar or related to the base keyword by natural language processing such as semantic analysis.
More preferably, a search device (not shown) contained in the sub-acquisition device searches a web page database based on the base keyword to obtain multiple pieces of web page information corresponding to it; a second selection device (not shown) contained in the sub-acquisition device then selects from them, based on the quality information of each piece of web page information, the one or more pieces whose quality information meets a predetermined quality condition.
Those skilled in the art will appreciate that the quality information of a web page can be determined from parameters such as its visit count, number of external links and user dwell time; details are omitted here.
Then, the first selection device 3 selects, from the one or more expanded keywords, at least one expanded keyword contained in the poem corpus as the corpus keyword corresponding to the base keyword, so as to generate a corresponding verse based on that corpus keyword.
Specifically, the first selection device 3 checks whether each of the expanded keywords is contained in the poem corpus; an expanded keyword that is contained in the poem corpus is taken as a corpus keyword.
In one example according to the invention, the extraction device 1 obtains the base keyword "Liu Dehua" and determines that it is not contained in the poem corpus. The first acquisition device 2 then searches for and obtains one or more web pages for the base keyword "Liu Dehua" and obtains from them multiple expanded keywords corresponding to it, such as "king", "singer" and "performer". The first selection device 3 then selects the expanded keyword "king", which is contained in the poem corpus, as the corpus keyword corresponding to the base keyword "Liu Dehua", so as to generate a corresponding verse based on it.
Preferably, when a base keyword corresponds to one or more expanded keywords, a second acquisition device (not shown) contained in the first selection device 3 obtains the weight information of each expanded keyword, and a first sub-selection device (not shown) contained in the first selection device 3 selects at least one of them, based on that weight information, as the corpus keyword corresponding to the base keyword.
The weight information indicates the importance of an expanded keyword, for example the IDF value of the expanded keyword in the web page database or, when the expanded keyword is contained in the poem corpus, its IDF value in the poem corpus.
More preferably, from the multiple expanded keywords corresponding to a base keyword, the first selection device 3 selects a predetermined number of expanded keywords (for example, x) according to each expanded keyword's IDF value in the web page database. It then checks whether each of these is in the poem corpus, obtains the poem-corpus IDF value of each expanded keyword that is contained in the poem corpus (for example, y of them, where y ≤ x), and selects at least one expanded keyword based on that IDF value as the corpus keyword corresponding to the base keyword.
According to a preferred scheme of the invention, when a base keyword is contained in the poem corpus, the word-taking device takes that base keyword as a corpus keyword.
According to another preferred scheme of the invention, the word-taking device further includes a third acquisition device (not shown) and a determining device (not shown).
The third acquisition device obtains the type of poem to be generated.
The poem type covers the structural form of the poem, for example five-character ancient verse, seven-character ancient verse, five-character quatrain, five-character regulated verse, and the various ci and qu tune patterns.
Specifically, the third acquisition device can determine the type of poem to be generated from the user's input or from a default type.
Then, the determining device determines the total number N of required corpus keywords based on the poem type.
Specifically, the determining device determines the number of lines corresponding to the type of poem to be generated, and determines the required number of keywords from that number of lines.
Preferably, the determining device takes the number of lines of the obtained poem type as the total number N of required corpus keywords.
Such as, the 3rd acquisition device obtains user and selects seven-word poem, it is determined that device determines required language material key word
Sum is 4;The most such as, the 3rd acquisition device receives the selection information selecting poem with five characters in one line of user, then the 3rd obtain dress
Put and determine that required language material key word sum is 8;The most such as, the type that the 3rd acquisition device receives is that word " is read slave to spoil ",
Then the 3rd acquisition device is according to clause corresponding to this word, determines that the language material key word sum of correspondence is 8.
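As a rough sketch of this step, the mapping from poem type to N can be held in a lookup table with one keyword per line. The table name `LINES_PER_TYPE`, the default type and the function name are assumptions made for illustration; ci tunes such as the one in the example above would need their own entries based on their sentence patterns.

```python
# Hypothetical lookup table: poem type -> number of lines in that form.
LINES_PER_TYPE = {
    "five-character quatrain": 4,
    "seven-character quatrain": 4,
    "five-character regulated verse": 8,
    "seven-character regulated verse": 8,
}

DEFAULT_TYPE = "seven-character quatrain"  # assumed default, not from the source


def required_keyword_count(poem_type=None):
    """Return N, the number of corpus keywords needed: one per line."""
    # Fall back to the default type when the user made no selection.
    chosen = poem_type if poem_type is not None else DEFAULT_TYPE
    if chosen not in LINES_PER_TYPE:
        raise ValueError(f"unknown poem type: {chosen!r}")
    return LINES_PER_TYPE[chosen]


print(required_keyword_count("five-character regulated verse"))
print(required_keyword_count())
```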
According to a preferred embodiment of this solution, when the number of base keywords extracted from the poem request information is greater than N, the word-taking device selects N base keywords from the multiple base keywords based on the weight information of each base keyword.
The manner in which the word-taking device selects N base keywords based on the weight information of each base keyword is the same as or similar to the manner, described above, in which at least one expanded keyword is selected based on the respective weight information of the one or more expanded keywords, and is not repeated here.
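A minimal sketch of this truncation step is given below. The function name `top_n_base_keywords` and the sample weights are hypothetical; the weights stand in for whatever weight information (for example, IDF values) is available for each base keyword.

```python
import heapq


def top_n_base_keywords(weights, n):
    """Return at most n base keywords, keeping those with the largest weights.

    weights -- hypothetical map: base keyword -> weight (e.g. an IDF value)
    """
    # Selection is only needed when more than n keywords were extracted.
    if len(weights) <= n:
        return list(weights)
    return heapq.nlargest(n, weights, key=weights.get)


# Made-up weights, for illustration only.
weights = {"spring": 1.2, "river": 2.5, "moon": 3.1, "phone": 0.4, "night": 2.0}
print(top_n_base_keywords(weights, 4))
```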
According to another preferred embodiment of this solution, when the number of corpus keywords already determined is less than N, the word-taking device expands the determined corpus keywords based on the poem corpus, so as to obtain the remaining number of corpus keywords from the corpus database.
Specifically, the word-taking device uses language model probabilities, obtained by compiling statistics on each word in the poem corpus, to determine the associated keywords corresponding to the one or more corpus keywords, and uses the preceding/following-context keywords so obtained as corpus keywords.
The word-taking device may obtain the preceding/following-context keywords of each word directly from the language model probability corresponding to that word. Alternatively, it may compute the language model probability of a given corpus keyword in the poem corpus in real time, so as to obtain the preceding/following-context keywords corresponding to that corpus keyword.
For example, when the corpus keyword obtained is "sunset clouds", the word-taking device may determine, from the language model probabilities obtained by compiling statistics on the keywords of the poems in the poem corpus, that the most frequently used following-context keyword is "lone duck", and use this following-context keyword as a corpus keyword.
As another example, when the corpus keyword obtained is "Changjiang" (the Yangtze River), the word-taking device compiles statistics on the poem lines in the poem corpus that contain the word "Changjiang", as follows:
(1)
Boundless / falling leaves / rustling / down,
Endless / Changjiang / rolling / comes.
(2)
Lone sail / distant shadow / blue sky / vanishes,
Only see / Changjiang / horizon / flows.
From these two couplets it can be determined that the preceding-context keywords of "Changjiang" include "endless" and "only see", and that its following-context keywords include "rolling" and "horizon".
The word-taking device may then select from among them, based on the language model probabilities, the corresponding preceding/following-context keywords as corpus keywords.
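The statistics in this example can be sketched as a count of the words that appear immediately before and after the keyword in pre-segmented lines. The function name `context_counts` is hypothetical, the English glosses of the segmented words are approximate, and the segmentation itself is assumed to be available already.

```python
from collections import Counter


def context_counts(segmented_lines, keyword):
    """Count the words directly before and directly after `keyword`."""
    before, after = Counter(), Counter()
    for words in segmented_lines:
        for i, word in enumerate(words):
            if word != keyword:
                continue
            if i > 0:
                before[words[i - 1]] += 1   # preceding-context keyword
            if i + 1 < len(words):
                after[words[i + 1]] += 1    # following-context keyword
    return before, after


# The two couplets from the example, one segmented line per list.
lines = [
    ["boundless", "falling leaves", "rustling", "down"],
    ["endless", "Changjiang", "rolling", "comes"],
    ["lone sail", "distant shadow", "blue sky", "vanishes"],
    ["only see", "Changjiang", "horizon", "flows"],
]
before, after = context_counts(lines, "Changjiang")
print(sorted(before), sorted(after))
```

Dividing each count by the number of occurrences of the keyword would turn these counts into the relative frequencies from which a most likely context keyword can be chosen.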
Preferably, for the case in which K corpus keywords are to be obtained and m corpus keywords have currently been obtained (m < K), let $W_i$ denote the i-th keyword. The process of obtaining the remaining K - m corpus keywords can then be expressed by the following formula (2):

$$W_{m+1:K}^{*} = \arg\max_{W_{m+1:K}} P(W_{m+1:K} \mid W_{1:m}) \qquad (2)$$

where $W_{m+1:K}$ denotes the sequence of the (m+1)-th to K-th keywords, and $P(W_{m+1:K} \mid W_{1:m})$ denotes the conditional probability that $W_{m+1:K}$ occurs given $W_{1:m}$ (the sequence of the 1st to m-th words).
According to the Markov assumption, the probability of occurrence of each word depends only on the preceding n-1 words (n is a hyperparameter here, typically set to 5). $P(W_{m+1:K} \mid W_{1:m})$ is therefore solved with an n-gram language model, which gives the following formula (3):

$$P(W_{m+1:K} \mid W_{1:m}) = \prod_{j=m+1}^{K} P(W_j \mid W_{j-n+1}, \ldots, W_{j-1}) \qquad (3)$$

Here $W_{j-n+1}, \ldots, W_{j-1}$ denotes the n-1 words preceding the word $W_j$, and $P(W_j \mid W_{j-n+1}, \ldots, W_{j-1})$ denotes the conditional probability of generating $W_j$ given the preceding n-1 words.
The probability $P(W_j \mid W_{j-n+1}, \ldots, W_{j-1})$ can be computed by maximum likelihood estimation using the following formula (4):

$$P(W_j \mid W_{j-n+1}, \ldots, W_{j-1}) = \frac{C(W_{j-n+1}, \ldots, W_j)}{C(W_{j-n+1}, \ldots, W_{j-1})} \qquad (4)$$

where $C(W_{j-n+1}, \ldots, W_j)$ in formula (4) denotes a frequency count, that is, the number of times the word string $W_{j-n+1}, \ldots, W_j$ occurs in the corpus, and similarly $C(W_{j-n+1}, \ldots, W_{j-1})$ denotes the number of times the word string $W_{j-n+1}, \ldots, W_{j-1}$ occurs in the corpus.
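The count-ratio estimate of formula (4) and its use for obtaining the remaining keywords can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the three-line corpus is toy data rather than a real poem corpus, n is set to 2 for brevity, and the extension step is greedy (it picks the most probable next word at each position), which is a simplification of searching for the most probable whole sequence.

```python
from collections import Counter


def ngram_counts(sentences, n):
    """Count every word string of length n-1 and n in the segmented corpus."""
    assert n >= 2, "this sketch needs a non-empty history"
    counts = Counter()
    for words in sentences:
        for size in (n - 1, n):
            for i in range(len(words) - size + 1):
                counts[tuple(words[i:i + size])] += 1
    return counts


def mle_prob(counts, history, word):
    """Maximum likelihood estimate: C(history + word) / C(history)."""
    denom = counts[tuple(history)]
    return counts[tuple(history) + (word,)] / denom if denom else 0.0


def extend_keywords(counts, vocab, seed, k, n):
    """Greedily extend `seed` to k keywords using the n-gram estimates."""
    seq = list(seed)
    while len(seq) < k:
        history = seq[-(n - 1):]  # the last n-1 keywords
        best = max(vocab, key=lambda w: mle_prob(counts, history, w))
        if mle_prob(counts, history, best) == 0.0:
            break  # no continuation was observed in the corpus
        seq.append(best)
    return seq


# Toy segmented corpus, for illustration only.
sentences = [
    ["sunset clouds", "lone duck", "fly"],
    ["sunset clouds", "lone duck"],
    ["sunset clouds", "autumn water"],
]
counts = ngram_counts(sentences, 2)
vocab = sorted({w for s in sentences for w in s})
print(extend_keywords(counts, vocab, ["sunset clouds"], 3, 2))
```

A beam search over candidate sequences would follow the sequence-level probability more closely than this greedy loop, at the cost of more computation.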
According to another preferred aspect of the present invention, the word-taking device determines the poem request information based on received voice information; and/or the word-taking device converts the generated verse into voice information.
Those skilled in the art may, according to the actual situation and requirements, convert the text or voice corresponding to the verse into the corresponding form, which is not described further here.
According to the solution of the present invention, base keywords are expanded to achieve conversion between base keywords and corpus keywords, so that poems can be generated automatically that both match the original poem request information and meet the requirements of classical poetry in terms of rhythm, diction and so on. This achieves a fusion between modern language and culture and the forms and vocabulary of poetry, so that the automatic poem generation mechanism can cope with the continual renewal and change of modern language and can thus more broadly meet users' demands for generated poems.
The software program of the present invention can be executed by a processor to implement the steps or functions described above. Similarly, the software program of the present invention (including related data structures) can be stored in a computer-readable recording medium, for example a RAM memory, a magnetic or optical drive, a floppy disk or a similar device. In addition, some steps or functions of the present invention may be implemented in hardware, for example as a circuit that cooperates with a processor to perform each function or step.
In addition, part of the present invention may be applied as a computer program product, for example computer program instructions which, when executed by a computer, can invoke or provide the method and/or technical solution according to the present invention through the operation of that computer. The program instructions that invoke the method of the present invention may be stored in a fixed or removable recording medium, and/or transmitted via a data stream in a broadcast or other signal-bearing medium, and/or stored in the working memory of a computer device that runs according to the program instructions. Here, one embodiment of the present invention includes a device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the device is triggered to run the method and/or technical solution according to the foregoing embodiments of the present invention.
It is obvious to those skilled in the art that the present invention is not limited to the details of the above exemplary embodiments, and that the present invention can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as exemplary and non-restrictive, the scope of the present invention being defined by the appended claims rather than by the above description, and all changes falling within the meaning and range of equivalency of the claims are therefore intended to be embraced by the present invention. No reference sign in a claim should be construed as limiting the claim concerned. Furthermore, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in a system claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Claims (20)
1. A method for obtaining keywords for generating a poem based on artificial intelligence, wherein the method comprises the following steps:
a. extracting one or more base keywords from poem request information;
b. when a base keyword is not in a poem corpus, obtaining one or more expanded keywords corresponding to the base keyword;
c. selecting, from the one or more expanded keywords, at least one expanded keyword contained in the poem corpus as the corpus keyword corresponding to the base keyword, so as to generate a corresponding verse based on the corpus keyword.
2. The method according to claim 1, wherein step b further comprises the following steps:
b1. obtaining one or more pieces of web page information corresponding to the base keyword;
b2. extracting, from the one or more pieces of web page information, the expanded keywords respectively corresponding to the one or more pieces of web page information, wherein the expanded keywords are different from the base keyword.
3. The method according to claim 2, wherein step b1 further comprises:
- searching a web page database based on the base keyword, to obtain multiple pieces of web page information corresponding to the base keyword;
- selecting, based on the quality information of each piece of web page information, one or more pieces of web page information whose quality information satisfies a predetermined quality condition from the multiple pieces of web page information.
4. The method according to any one of claims 1 to 3, wherein step c further comprises the following steps:
- obtaining the weight information of each of the one or more expanded keywords;
- selecting at least one expanded keyword from among them, based on the respective weight information of the one or more expanded keywords, as the corpus keyword corresponding to the base keyword.
5. The method according to any one of claims 1 to 4, wherein the method further comprises the following step:
- when a base keyword is contained in the corpus, using the base keyword as a corpus keyword.
6. The method according to any one of claims 1 to 5, wherein the method further comprises the following steps:
- obtaining the type of poem to be generated;
- determining, based on the poem type, the total number N of corpus keywords required.
7. The method according to claim 6, wherein the method further comprises the following step:
- when the number of base keywords extracted from the poem request information is greater than N, selecting N base keywords from the multiple base keywords based on the weight information of each base keyword.
8. The method according to claim 6 or 7, wherein the method further comprises the following step:
- when the number of corpus keywords already determined is less than N, expanding the determined corpus keywords based on the poem corpus, to obtain the remaining number of corpus keywords from the corpus database.
9. The method according to any one of claims 1 to 8, wherein the method further comprises the following step:
- determining the poem request information based on received voice information.
10. The method according to any one of claims 1 to 9, wherein the method further comprises the following step:
- converting the generated verse into voice information.
11. A word-taking device for obtaining keywords for generating a poem based on artificial intelligence, wherein the word-taking device comprises:
an extraction device, for extracting one or more base keywords from poem request information;
a first acquisition device, for obtaining, when a base keyword is not in a poem corpus, one or more expanded keywords corresponding to the base keyword;
a first selection device, for selecting, from the one or more expanded keywords, at least one expanded keyword contained in the poem corpus as the corpus keyword corresponding to the base keyword, so as to generate a corresponding verse based on the corpus keyword.
12. The word-taking device according to claim 11, wherein the first acquisition device further comprises:
a sub-acquisition device, for obtaining one or more pieces of web page information corresponding to the base keyword;
a sub-extraction device, for extracting, from the one or more pieces of web page information, the expanded keywords respectively corresponding to the one or more pieces of web page information, wherein the expanded keywords are different from the base keyword.
13. The word-taking device according to claim 12, wherein the sub-acquisition device further comprises:
a search device, for searching a web page database based on the base keyword, to obtain multiple pieces of web page information corresponding to the base keyword;
a second selection device, for selecting, based on the quality information of each piece of web page information, one or more pieces of web page information whose quality information satisfies a predetermined quality condition from the multiple pieces of web page information.
14. The word-taking device according to any one of claims 11 to 13, wherein the first selection device further comprises:
a second acquisition device, for obtaining the weight information of each of the one or more expanded keywords;
a first sub-selection device, for selecting at least one expanded keyword from among them, based on the respective weight information of the one or more expanded keywords, as the corpus keyword corresponding to the base keyword.
15. The word-taking device according to any one of claims 11 to 14, wherein the word-taking device is further configured to:
- when a base keyword is contained in the corpus, use the base keyword as a corpus keyword.
16. The word-taking device according to any one of claims 11 to 15, wherein the word-taking device further comprises:
a third acquisition device, for obtaining the type of poem to be generated;
a determining device, for determining, based on the poem type, the total number N of corpus keywords required.
17. The word-taking device according to claim 16, wherein the word-taking device is further configured to:
- when the number of base keywords extracted from the poem request information is greater than N, select N base keywords from the multiple base keywords based on the weight information of each base keyword.
18. The word-taking device according to claim 16 or 17, wherein the word-taking device is further configured to:
- when the number of corpus keywords already determined is less than N, expand the determined corpus keywords based on the poem corpus, to obtain the remaining number of corpus keywords from the corpus database.
19. The word-taking device according to any one of claims 11 to 18, wherein the word-taking device is further configured to:
- determine the poem request information based on received voice information.
20. The word-taking device according to any one of claims 11 to 19, wherein the word-taking device is further configured to:
- convert the generated verse into voice information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610556319.1A CN106227714A (en) | 2016-07-14 | 2016-07-14 | A kind of method and apparatus obtaining the key word generating poem based on artificial intelligence |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610556319.1A CN106227714A (en) | 2016-07-14 | 2016-07-14 | A kind of method and apparatus obtaining the key word generating poem based on artificial intelligence |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106227714A true CN106227714A (en) | 2016-12-14 |
Family
ID=57520060
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610556319.1A Pending CN106227714A (en) | 2016-07-14 | 2016-07-14 | A kind of method and apparatus obtaining the key word generating poem based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106227714A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20161214 |