CN109086270B - Automatic poetry making system and method based on ancient poetry corpus vectorization - Google Patents
- Publication number
- CN109086270B CN109086270B CN201810817519.7A CN201810817519A CN109086270B CN 109086270 B CN109086270 B CN 109086270B CN 201810817519 A CN201810817519 A CN 201810817519A CN 109086270 B CN109086270 B CN 109086270B
- Authority
- CN
- China
- Prior art keywords
- corpus
- poetry
- word
- words
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an automatic poetry making system based on ancient poetry corpus vectorization and a method thereof. The beneficial effects of the invention are that the machine can fully learn the meaning and mood of the poetry, so that when a poem is needed, the required ancient-style poem is obtained by inputting keywords directly into the trained neural network. By learning from the experience of past poets, the machine acquires the ability to compose poems that satisfy the rules of versification while retaining artistic beauty.
Description
Technical Field
The invention relates to the technical field of automatic poetry composition by computers, and in particular to an automatic poetry making system and method based on ancient poetry corpus vectorization.
Background
With the continuous advance of computer technology and hardware computing power, artificial intelligence has moved closer to people's expectations; for example, AlphaGo was able to defeat the world Go champion through computation. In creative and artistic fields, however, artificial intelligence is still not competent for the related work. Chinese classical poetry is a language art whose artistic value and literary achievement have long endured. Ancient poetry is at once regular and abstract: each poetic form prescribes its rhymes and its level-and-oblique (ping-ze) tonal patterns, and each couplet may take a long time to match properly. These strict rules give ancient poetry its phonetic and rhythmic beauty. At the same time, because of the breadth and richness of Chinese culture, every character can carry multiple meanings and be understood differently by different readers. The creation of ancient poetry therefore requires studying and absorbing the outstanding poems of predecessors before one can produce poems rich in beauty and mood.
For computers and artificial intelligence, rule-based work is easy to complete, but abstract creation and artistic beauty are the difficulties of machine-made poetry: 1. how to vectorize natural language into a form a machine can read and understand while preserving as much of the contained information as possible; 2. what method can be used to compute on these vectors so that the computer simulates human processing of natural language; 3. how to construct a neural network model that represents the relationships in the text data appropriately at minimal computational cost; 4. how to solve the training problem through network design, optimization methods, and hyperparameters so as to improve the final effect of the model; 5. if a picture is the input for a poem, how to locate the scenes and themes in the picture and identify the names of the objects; 6. how to preserve the emotion of the machine-generated draft while checking and replacing characters for tonal (ping-ze) and rhyme conformance. At present, parameters such as the learning rate of a neural network and the model architecture require experience accumulated through continued practice in order to obtain a parameter model suited to the problem.
Disclosure of Invention
In order to achieve the goal of a machine composing poetry automatically, the invention provides an automatic poetry making system and method based on ancient poetry corpus vectorization. Each character of outstanding historical poems is converted into a corpus vector, and the relationships among corpus vectors are established, so that the machine can fully learn the meaning and mood of the poetry. When a poem is needed, the required ancient-style poem is obtained by inputting keywords directly into the trained neural network. The ability to compose poems is thus acquired by learning from the experience of predecessors, and the resulting poems satisfy the rules of versification while retaining artistic beauty.
In order to achieve the purpose, the invention adopts the following specific technical scheme:
an automatic poetry making system based on ancient poetry corpus vectorization comprises a corpus processing mechanism, a corpus vector library, an LSTM network model and a poetry screening mechanism;
the corpus processing mechanism is used for converting text into corpus vectors and for computing on corpus vectors;
the corpus vector library is used for storing corpus vectors;
the LSTM network model is used for generating poetry drafts;
the poetry screening mechanism is used for checking the rhyme and the level-and-oblique (ping-ze) tonal pattern of the poetry draft;
the corpus processing mechanism is connected with the corpus vector library in a bidirectional mode, and the corpus processing mechanism, the LSTM network model and the poetry screening mechanism are connected in sequence.
Through this design, the automatic poetry making system learns the linguistic relationships and habits among the words in poetry through the LSTM network model. When a poem is needed, keywords are input into the system; the corpus processing mechanism identifies and decomposes the keywords, the LSTM network model produces poetry drafts, and the poetry screening mechanism then selects the content that conforms to the rhyme and tonal rules as the finalized poem, yielding the final poetry result.
Further, the LSTM network model is a network model composed of two LSTM layers in series; its optimization function is stochastic gradient descent, and its loss function is cross entropy.
The network model formed by two LSTM layers in series can identify the relationships between words more accurately, but because it produces a larger volume of data, the amount of data in the calculation result may need to be reduced appropriately.
Preferably, the LSTM network model discards 20% of the data after calculation (a dropout rate of 0.2), the learning rate is 0.01, and the number of iterations is 700.
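As a hedged illustration (not the patent's own code), the training configuration above — stochastic gradient descent on a cross-entropy loss, learning rate 0.01, 700 iterations — can be sketched in NumPy; the toy softmax classifier and random data here are hypothetical stand-ins for the two-layer LSTM:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, y):
    # Mean negative log-likelihood of the true classes y.
    return -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 10))          # toy inputs (stand-in for LSTM outputs)
y = rng.integers(0, 5, size=64)        # toy next-character labels
W = np.zeros((10, 5))                  # weights of a linear softmax layer

lr = 0.01                              # learning rate from the patent
losses = []
for step in range(700):                # 700 iterations from the patent
    probs = softmax(X @ W)
    losses.append(cross_entropy(probs, y))
    grad = probs.copy()
    grad[np.arange(len(y)), y] -= 1.0  # d(loss)/d(logits) for cross entropy
    W -= lr * (X.T @ grad) / len(y)    # gradient-descent step
```

On this convex toy problem the loss decreases monotonically from ln 5; the real system would apply the same optimizer and loss to the LSTM's parameters.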
An automatic poetry method based on ancient poetry corpus vectorization comprises the following steps:
s1, inputting ancient poems to a corpus processing mechanism, wherein the corpus processing mechanism converts the characters of the ancient poems into corpus vectors and stores the corpus vectors into a corpus vector library;
s2, building an LSTM network model;
s3, inputting the corpus training set to the LSTM network model to complete the training of the LSTM network model;
s4, image words are input to a corpus processing mechanism, and the corpus processing mechanism calculates to obtain poetry alternative words according to the corpus vectors corresponding to each image word in a corpus vector library;
s5, the corpus processing mechanism inputs the poetry alternative words into the LSTM network model to obtain poetry draft;
S6, the poetry screening mechanism selects, from the poetry drafts, the poems that best conform to the rules of versification according to the rhyme and tonal rules of the chosen poem form, obtaining the finalized poem, which is the automatically composed poetry result.
Through this design, a large number of outstanding ancient poems enter the corpus processing mechanism, and every character of every poem is vectorized — for example using a skip-gram model — to obtain corpus vectors, so that the computer can recognize the content associated with each character. The LSTM network model processes the connection relationships between characters, thereby understanding the meaning of each character and analysing the relations between them. The training process of the LSTM network model is this process of learning the characters; when training is complete, the LSTM network model can compose simple poems. Regularity constraints such as rhyme and the level-and-oblique tones are processed after the draft is produced, finally yielding the finalized poem.
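The skip-gram model mentioned above trains on (center, context) character pairs drawn from a window around each character. A minimal sketch of that pair generation, with Latin letters standing in for a hypothetical 5-character line:

```python
def skipgram_pairs(chars, window=2):
    """Generate (center, context) pairs from a sequence of characters,
    as skip-gram training would consume them."""
    pairs = []
    for i, center in enumerate(chars):
        # Every neighbour within `window` positions is a context character.
        for j in range(max(0, i - window), min(len(chars), i + window + 1)):
            if j != i:
                pairs.append((center, chars[j]))
    return pairs

print(skipgram_pairs(list("ABCDE"), window=1))
```

Counting these adjacencies over the whole corpus is exactly the statistic step S1.2 collects before the vectors are trained.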
Further described, the specific content of step S1 is as follows:
s1.1, ancient poems are input to the corpus processing mechanism, which segments every character appearing in the poems and records them as m distinct characters; a character appearing more than once is recorded as the same distinct character;
s1.2, for each distinct character, its number of occurrences and the characters adjacent to it in context in each poem are counted;
s1.3, the corpus processing mechanism assigns a random n-dimensional vector to each distinct character; this n-dimensional vector is the corpus vector of that character and is stored in the corpus vector library, where n ∈ [180, 220] and n is an integer;
s1.4, a Huffman tree is constructed, comprising leaf nodes and intermediate nodes: each leaf node is a child of an intermediate node, each intermediate node has exactly 2 child nodes, each leaf node points to the corpus vector of one distinct character in the corpus vector library and records the number of occurrences of that character as its node value, each intermediate node records a node value equal to the sum of its children's node values, leaf nodes with larger node values are closer to the root, and the root is the intermediate node with the largest node value;
s1.5, the selection probability of the characters adjacent in context to the corpus vector x on the Huffman tree is:
p(context|x) = Π p_i
where p_i is the probability that the i-th intermediate node on the path in the Huffman tree selects its first child node, x is the corpus vector input to the intermediate node, and θ_i is the weight for the corpus vector input at the i-th intermediate node;
s1.6, using gradient descent, partial derivatives are computed for x and θ_i in turn:
first, the partial derivative with respect to θ_i is computed;
the new θ_i is applied to update p(context|x), after which the partial derivative with respect to x is computed;
the new x is updated into the corpus vector library accordingly;
s1.7, reselecting an un-updated corpus vector x and returning to the step S1.5 until each corpus vector x in the corpus vector library is updated once, so as to obtain a new corpus vector library.
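The per-node probability and gradient formulas referenced in steps S1.5 and S1.6 are not reproduced above. As a hedged reconstruction — assuming the standard word2vec hierarchical-softmax formulation, which the surrounding steps match, rather than the patent's verbatim equations — they take the form:

```latex
p_i = \sigma(\theta_i^{\top} x)^{\,1-d_i}\,\bigl(1-\sigma(\theta_i^{\top} x)\bigr)^{d_i},
\qquad \sigma(z) = \frac{1}{1+e^{-z}},
```

with the gradients used in step S1.6:

```latex
\frac{\partial \log p(\mathrm{context}\mid x)}{\partial \theta_i}
  = \bigl(1 - d_i - \sigma(\theta_i^{\top} x)\bigr)\, x,
\qquad
\frac{\partial \log p(\mathrm{context}\mid x)}{\partial x}
  = \sum_i \bigl(1 - d_i - \sigma(\theta_i^{\top} x)\bigr)\, \theta_i,
```

where d_i ∈ {0, 1} records which child is taken at the i-th intermediate node on the path.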
Through this design, the n-dimensional vector of each distinct character is initially random, but after the partial-derivative operations the corpus vectors correspond one-to-one with the content of the Huffman tree; that is, each corpus vector contains the frequency of the character in the input ancient poems and information about the characters adjacent to it in each poem. The larger the value of n, the richer the corresponding information, but the larger the amount of calculation; and the more partial derivatives of x and θ_i are computed, the more accurately the paths in the Huffman tree are recorded, making the calculation more precise.
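A hedged NumPy sketch of one update from steps S1.5–S1.6, assuming the standard word2vec hierarchical-softmax gradients (the patent's own formula images are not reproduced); the vector size, learning rate, and path are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hs_update(x, thetas, path, lr=0.025):
    """One hierarchical-softmax update for corpus vector x.
    thetas: weight vectors of the intermediate nodes on the Huffman path
            (updated in place, as step S1.6 updates theta first);
    path:   d_i in {0, 1}, which child is taken at each node."""
    x_grad = np.zeros_like(x)
    for theta, d in zip(thetas, path):
        p = sigmoid(theta @ x)      # probability of taking the first child
        g = lr * ((1 - d) - p)      # scaled gradient of the log-likelihood
        x_grad += g * theta         # accumulate dL/dx using the old theta
        theta += g * x              # update theta_i (first, per S1.6)
    return x + x_grad               # then update x with the accumulated grad

rng = np.random.default_rng(1)
x = rng.normal(size=8)
thetas = [rng.normal(size=8) for _ in range(3)]
x_new = hs_update(x, thetas, path=[0, 1, 0])
```

Step S1.7 would repeat this for every corpus vector in the library until each has been updated once.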
Further, the corpus training set is a set composed of 80% of the corpus vectors in the corpus vector library, and the corpus vectors in the corpus training set are ordered according to the word order in the corresponding ancient poems;
the corpus training set is split in a 9:1 ratio into a training corpus and a verification corpus, where the training corpus is used to train the LSTM network model and tune its parameter settings, and the verification corpus is used to verify and check the LSTM network model after training and tuning.
The data of the corpus vector library is thus divided into a training corpus and a verification corpus: the training corpus is input for learning during training, and the verification corpus is input after learning to verify the learning effect, until a good learning effect is achieved.
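A minimal sketch of the 80% selection and the 9:1 split described above; the list of integers stands in for corpus vectors in poem order:

```python
def split_corpus(vectors):
    """Take 80% of the corpus vectors (kept in poem order) as the training
    set, then split that set 9:1 into training and verification corpora."""
    train_set = vectors[: int(len(vectors) * 0.8)]
    cut = int(len(train_set) * 0.9)
    return train_set[:cut], train_set[cut:]

train, valid = split_corpus(list(range(1000)))
print(len(train), len(valid))  # 720 80
```

Whether the remaining 20% of the library is held out entirely is not specified; this sketch simply leaves it unused.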
Further, the image words in step S4 are obtained by inputting images to the image feature extraction model, and the specific method is as follows:
s4.1, inputting an image to an image feature extraction model, wherein the image feature extraction model extracts image words from the image;
and S4.2, the corpus processing mechanism matches corresponding corpus vectors in a corpus vector library for the image words one by one, and the corpus vectors are selected words of the poetry.
The image words can be entered manually as keywords, which the corpus processing mechanism identifies and matches against the corpus vectors in the corpus vector library. Alternatively, an image feature extraction model can be added: it extracts the scenes in an image and converts them into words, so that simply inputting an image yields the words for the key scenes in the image, and the corpus processing mechanism processes the extracted words to obtain the poetry candidate words.
Further, the image feature extraction model is an improved VGG-16 convolutional neural network model comprising, connected in sequence: convolutional layer group 1, a pooling layer, convolutional layer group 2, a pooling layer, convolutional layer group 3, a pooling layer, convolutional layer group 4, a pooling layer, convolutional layer group 5, a pooling layer, 2 convolutional layers, a Bounding-box layer, and a Softmax layer; convolutional layer groups 1 and 2 each consist of 2 convolutional layers in series, convolutional layer groups 3, 4, and 5 each consist of 3 convolutional layers in series, and every convolutional layer is connected to the Bounding-box layer.
The traditional VGG-16 convolutional neural network structure is, connected in sequence: 2 convolutional layers, a pooling layer, 2 convolutional layers, a pooling layer, 3 convolutional layers, a pooling layer, 3 convolutional layers, a pooling layer, 3 convolutional layers, a pooling layer, 3 fully-connected layers, and a Softmax layer. On this basis, the improved VGG-16 convolutional neural network model replaces the 3 fully-connected layers with 2 convolutional layers and a Bounding-box layer, connects every convolutional layer directly to the Bounding-box layer to form a fully convolutional network, and adjusts the parameters of each convolutional layer through the Bounding-box layer. In addition, when the input image is large and more scenes need to be extracted, further convolutional layers can be added before the Bounding-box layer accordingly.
Further, the image words of step S4 may instead be obtained by inputting a single word A: the corpus processing mechanism obtains subsequent associated words by association calculation starting from word A, and word A together with its subsequent associated words forms a word string, which serves as the poetry candidate words;
the method for calculating a subsequent associated word is to find, from the corpus vector of the previous word, the next word with the highest matching degree in the corpus vector library, the matching degree being the cosine similarity
cos(a, b) = (a · b) / (‖a‖ ‖b‖),
where a is the corpus vector of the previous word and b is the corpus vector of any word in the corpus vector library; the word whose corpus vector b maximizes cos(a, b) is taken as the next word.
Inputting a word A, the corpus processing mechanism calculates the word B with the highest matching degree to A in the corpus vector library, then the word C with the highest matching degree to B, and so on, finally obtaining a string of matched words; inputting this word string into the LSTM network model yields a poem. In this method only one cue word is provided, and all subsequent content is matched and calculated entirely by the machine.
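A minimal NumPy sketch of this word-string association; the four-word vocabulary and its 2-D corpus vectors are hypothetical, and words already in the chain are excluded so the chain cannot loop:

```python
import numpy as np

def cosine(a, b):
    # cos(a, b) = (a . b) / (||a|| ||b||)
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def word_string(start, library, length=4):
    """Chain words by repeatedly picking the unused word whose corpus
    vector has the highest cosine similarity to the previous word's."""
    chain = [start]
    while len(chain) < length:
        prev = library[chain[-1]]
        best = max(
            (w for w in library if w not in chain),
            key=lambda w: cosine(prev, library[w]),
        )
        chain.append(best)
    return chain

# Hypothetical corpus vectors for four words.
lib = {
    "moon":  np.array([1.0, 0.1]),
    "frost": np.array([0.9, 0.2]),
    "home":  np.array([0.1, 1.0]),
    "dream": np.array([0.2, 0.9]),
}
print(word_string("moon", lib))  # -> ['moon', 'frost', 'dream', 'home']
```

The resulting word string would then be fed to the LSTM network model as the poetry candidate words.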
Further, after the ancient poems are input in step S1, characters and words with the same or similar meanings are classified to create an image-word table, and the poetry candidate words in step S4 include both the input image words and the same- or similar-meaning words from the image-word table.
Because Chinese characters are polysemous and have near-synonyms, the words describing the same thing may differ between poems. The image-word table is designed to group words with the same or similar meaning into one class; when a poem formed from the input words lacks beauty, the words can be adjusted accordingly, the replacement being chosen from among the words with the same or similar meaning.
Further, the poem form used for automatic composition is the seven-character regulated verse, and its level-and-oblique rule follows one of the standard tonal templates, in which each character position is prescribed as level tone (平, ping), oblique tone (仄, ze), or free (中), with a level-start template and an oblique-start template both permitted;
where "free" (中) indicates that the character at that position may carry either a level or an oblique tone;
the selection method under the ping-ze rule in step S6 is as follows: the poetry screening mechanism compares the poetry draft with the ping-ze template character by character; if a character does not conform, it is replaced with a character of the same or similar meaning from the image-word table, and the comparison is repeated until the poetry draft conforms to the ping-ze template completely.
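A minimal sketch of this check-and-replace loop, assuming a hypothetical tone lookup and synonym table (the real system would use the ping-ze template of the chosen verse form and the image-word table built from the corpus); Latin letters stand in for characters:

```python
def enforce_pattern(draft, pattern, tone_of, synonyms):
    """Replace characters whose tone violates the template.
    pattern: 'P' (level), 'Z' (oblique), or '*' (free) per position;
    tone_of: char -> 'P' or 'Z';
    synonyms: char -> same/similar-meaning replacement candidates."""
    out = list(draft)
    for i, (ch, want) in enumerate(zip(out, pattern)):
        if want != "*" and tone_of[ch] != want:
            # Pick a same-meaning character that carries the required tone.
            for alt in synonyms.get(ch, []):
                if tone_of[alt] == want:
                    out[i] = alt
                    break
    return "".join(out)

tones = {"a": "P", "b": "Z", "c": "P", "d": "Z"}
syns = {"a": ["b"], "c": ["d"]}
print(enforce_pattern("ac", "ZZ", tones, syns))  # -> "bd"
```

A conforming draft passes through unchanged, mirroring the "compare again until completely consistent" loop in the text.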
The invention has the beneficial effects that:
1. The recurrent neural network is modeled on features of the human brain and its neuronal connections, and its learning of natural language is very close to how humans learn natural language. After the LSTM network model is introduced and trained on a large corpus, the machine obtains a good generative model and can handle the logic, versification, and imagery relations of poetry.
2. The convolutional neural network excels at object recognition, can extract most of the scene features required, and thus provides rich keywords and image themes for poetry creation.
3. Because the word vectors are calculated from the word frequencies and co-occurrences of the poetry corpus, and co-occurrence reflects the relations between words, the cosine computed between word vectors reflects how closely two words are related. It can therefore be used for rhyming-character selection, ping-ze replacement, word-cloud expansion, and so on, and is convenient and fast when combined with a word classification table of ancient poetry words.
4. The image-word table can be used in the keyword-input step of machine generation; expanding through the image-word table alleviates the problems of inconsistent themes and random topic jumps found in most machine poetry systems.
5. The invention uses the word-string technique to make the machine simulate the human way of thinking, which is a beneficial practice in cognitive engineering and can, to a certain extent, realize in the machine's poetry task the artistic creative intelligence of the human writing mechanism.
Drawings
FIG. 1 is a block diagram of the system architecture of the present invention;
FIG. 2 is a schematic structural diagram of an LSTM network model of an embodiment;
FIG. 3 is a flow chart of a method of the present invention;
fig. 4 is a detailed flowchart of step S1;
FIG. 5 is a schematic view of the Huffman tree of an embodiment;
FIG. 6 is a schematic diagram of the improved VGG-16 convolutional neural network model structure of the present invention;
FIG. 7 is a schematic structural diagram of an improved VGG-16 convolutional neural network model of an embodiment.
Detailed Description
The invention is described in further detail below with reference to the figures and the embodiments.
As shown in fig. 1, an automatic poetry making system based on ancient poetry corpus vectorization comprises a corpus processing mechanism, a corpus vector library, an LSTM network model and a poetry screening mechanism;
the corpus processing mechanism is connected with the corpus vector library in a bidirectional mode, and the corpus processing mechanism, the LSTM network model and the poetry screening mechanism are connected in sequence.
The LSTM network model in this embodiment is preferably a network model composed of two LSTM layers in series, as shown in fig. 2, where the two dotted boxes in the upper and lower parts of the figure each represent one LSTM layer, each A_{i,j} in the figure represents a neuron, X_1 and X_2 are the corpus vectors of two consecutively connected characters in an ancient poem, and the output h is the connection relation of the two characters; that is, each time a connected 2-character word is input, the LSTM network model learns the word relation;
preferably, the optimization function of the LSTM network model is stochastic gradient descent and the loss function is cross entropy; the LSTM network model discards 20% of the data after calculation, the learning rate is 0.01, and the number of iterations is 700.
As shown in fig. 3, an automatic poetry method based on ancient poetry corpus vectorization adopts the following steps:
s1, inputting ancient poems to a corpus processing mechanism, wherein the corpus processing mechanism converts the characters of the ancient poems into corpus vectors and stores the corpus vectors into a corpus vector library;
s2, building an LSTM network model;
s3, inputting the corpus training set to the LSTM network model to complete the training of the LSTM network model;
s4, image words are input to a corpus processing mechanism, and the corpus processing mechanism calculates to obtain poetry alternative words according to the corpus vectors corresponding to each image word in a corpus vector library;
s5, the corpus processing mechanism inputs the poetry alternative words into the LSTM network model to obtain poetry draft;
S6, the poetry screening mechanism selects, from the poetry drafts, the poems that best conform to the rules of versification according to the rhyme and tonal rules of the chosen poem form, obtaining the finalized poem, which is the automatically composed poetry result.
The specific content of step S1 is as shown in fig. 4:
s1.1, ancient poems are input to the corpus processing mechanism, which segments every character appearing in the poems and records them as m distinct characters; a character appearing more than once is recorded as the same distinct character;
s1.2, for each distinct character, its number of occurrences and the characters adjacent to it in context in each poem are counted;
s1.3, the corpus processing mechanism assigns a random n-dimensional vector to each distinct character; this n-dimensional vector is the corpus vector of that character and is stored in the corpus vector library, where n ∈ [180, 220] and n is an integer, n preferably being 200;
s1.4, a Huffman tree is constructed, comprising leaf nodes and intermediate nodes: each leaf node is a child of an intermediate node, each intermediate node has exactly 2 child nodes, each leaf node points to the corpus vector of one distinct character in the corpus vector library and records the number of occurrences of that character as its node value, each intermediate node records a node value equal to the sum of its children's node values, leaf nodes with larger node values are closer to the root, and the root is the intermediate node with the largest node value;
Preferably, this embodiment selects lines from two seven-character quatrains: from Li Bai's "Viewing the Waterfall at Mount Lu", "Its torrent plunges straight down three thousand feet, as if the Silver River were falling from the ninth heaven"; and from Du Fu's "Quatrain", "The window frames the thousand-autumn snows of the Western Ridge; at the door moors a boat from Eastern Wu, ten thousand li away." A Huffman tree is built from these lines: the character "thousand" (千) appears 2 times while every other character appears only once, so the leaf node for "thousand" is closer to the root than the leaf nodes of the other characters; its node value is 2 and all the others are 1, finally forming the Huffman tree shown in fig. 5.
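The Huffman construction of step S1.4 can be sketched with Python's heapq; the character counts below mirror the example's shape (one character appearing twice, the rest once) but use hypothetical English stand-ins:

```python
import heapq
import itertools

def build_huffman(freqs):
    """Build a Huffman tree from {character: count}; returns the root as
    (total_count, tiebreak, subtree). Leaves are 1-tuples (char,);
    intermediate nodes are pairs (left_subtree, right_subtree)."""
    counter = itertools.count()  # tie-breaker so heap tuples always compare
    heap = [(count, next(counter), (char,)) for char, count in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        c1, _, n1 = heapq.heappop(heap)  # the two smallest subtrees
        c2, _, n2 = heapq.heappop(heap)
        # Merge under a new intermediate node whose value is the sum, so
        # more frequent characters end up closer to the root.
        heapq.heappush(heap, (c1 + c2, next(counter), (n1, n2)))
    return heap[0]

# Hypothetical counts: one character appears twice, the rest once.
root = build_huffman({"thousand": 2, "river": 1, "snow": 1, "boat": 1})
print(root[0])  # total count at the root: 5
```

With these counts, "thousand" is merged last and so sits closest to the root, matching the example in the text.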
S1.5, the selection probability of the words adjacent to the context of the corpus vector x on the Huffman tree is as follows:
p(context|x) = Π p_i
where p_i is the probability that the i-th intermediate node on the path in the Huffman tree selects its first child node, x is the corpus vector input to the intermediate node, and θ_i is the weight for the corpus vector input at the i-th intermediate node;
s1.6, using gradient descent, partial derivatives are computed for x and θ_i in turn: first the partial derivative with respect to θ_i is computed; the new θ_i is applied to update p(context|x), after which the partial derivative with respect to x is computed;
the new x is updated into the corpus vector library accordingly;
s1.7, reselecting an un-updated corpus vector x and returning to the step S1.5 until each corpus vector x in the corpus vector library is updated once, so as to obtain a new corpus vector library.
The corpus training set adopted in this embodiment is a set composed of 80% of corpus vectors in a corpus vector library, and the corpus vectors in the corpus training set are ordered according to the word order in the corresponding ancient poetry;
the corpus training set is split in a 9:1 ratio into a training corpus and a verification corpus, where the training corpus is used to train the LSTM network model and tune its parameter settings, and the verification corpus is used to verify and check the LSTM network model after training and tuning.
In this embodiment, poems are composed from an input image; that is, the image words in step S4 are obtained by inputting an image into an image feature extraction model, as follows:
s4.1, inputting an image to an image feature extraction model, wherein the image feature extraction model extracts image words from the image;
and S4.2, the corpus processing mechanism matches corresponding corpus vectors in a corpus vector library for the image words one by one, and the corpus vectors are selected words of the poetry.
As shown in fig. 6, the image feature extraction model is an improved VGG-16 convolutional neural network model comprising, connected in sequence: convolutional layer group 1, a pooling layer (Pool), convolutional layer group 2, a pooling layer, convolutional layer group 3, a pooling layer, convolutional layer group 4, a pooling layer, convolutional layer group 5, a pooling layer, 2 convolutional layers, a Bounding-box layer, and a Softmax layer; convolutional layer groups 1 and 2 each consist of 2 convolutional layers (Conv) in series, convolutional layer groups 3, 4, and 5 each consist of 3 convolutional layers in series, and every convolutional layer is connected to the Bounding-box layer.
The improved VGG-16 convolutional neural network model preferred in this embodiment has the structure shown in fig. 7. The dotted portion of the figure is the convolutional part of the traditional VGG-16 structure, namely, connected in sequence: 2 convolutional layers, a pooling layer, 2 convolutional layers, a pooling layer, 3 convolutional layers, a pooling layer, 3 convolutional layers, a pooling layer, 3 convolutional layers, and a pooling layer; 6 convolutional layers are then connected in sequence, followed finally by a Bounding-box layer and a Softmax layer. Compared with the structure of fig. 6, the structure of fig. 7 adds 4 convolutional layers before the Bounding-box layer so as to obtain more image features. The convolution kernel of each convolutional layer is 3 × 3, and each pooling layer is 2 × 2.
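As a hedged sketch (the strides, padding, and 224 × 224 input are assumptions taken from standard VGG-16: 3 × 3 convolutions with padding 1 and stride 1, 2 × 2 pooling with stride 2), the spatial size of the feature map through the five conv/pool stages of the backbone can be traced:

```python
def trace_sizes(size, stages):
    """Return the feature-map side length after each (n_convs, pool) stage.
    A 3x3 'same' conv (pad 1, stride 1) keeps the size; a 2x2 pool with
    stride 2 halves it."""
    sizes = []
    for n_convs, pool in stages:
        # n_convs 'same' convolutions leave the spatial size unchanged.
        if pool:
            size //= 2
        sizes.append(size)
    return sizes

# The five stages of the (traditional and improved) VGG-16 backbone.
stages = [(2, True), (2, True), (3, True), (3, True), (3, True)]
print(trace_sizes(224, stages))  # -> [112, 56, 28, 14, 7]
```

The extra 3 × 3 convolutions added before the Bounding-box layer would, under the same 'same'-padding assumption, leave this 7 × 7 map unchanged in size while deepening the features.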
Example two: the image words of the step S4 are input as a word a, the corpus processing mechanism obtains a subsequent associated word by association calculation according to the word a, the word a and the subsequent associated word form a word string, and the word string is a poem alternative word;
The subsequent associated word is obtained by finding, in the corpus vector library, the next word whose corpus vector has the highest matching degree with the corpus vector of the previous word, the matching degree being calculated as:

cos(a, b) = (a · b) / (|a| |b|)

wherein a is the corpus vector of the previous word and b is the corpus vector of any word in the corpus vector library; the word whose corpus vector b maximizes cos(a, b) is taken as the next word.
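The matching-degree step above can be sketched as follows (a minimal illustration: the vector library, its words, and the vectors themselves are toy placeholders, not the patent's actual corpus data):

```python
import math

def cos(a, b):
    """Cosine similarity cos(a, b) = (a . b) / (|a| |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def next_word(prev_vec, library):
    """Pick the word whose corpus vector maximizes cos() with prev_vec."""
    return max(library, key=lambda w: cos(prev_vec, library[w]))

# toy corpus vector library: word -> corpus vector
library = {"moon": [0.9, 0.1], "wine": [0.2, 0.8], "frost": [0.8, 0.3]}
print(next_word([1.0, 0.2], library))  # moon
```

Chaining `next_word` from the input word a yields the word string that serves as the poetry alternative words.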
After the ancient poems are input in step S1, characters with the same or similar meanings are grouped to establish a meaning word list, and the poetry alternative words in step S4 comprise the input image words together with the words in the meaning word list having the same or similar meanings.
The poetry form for automatic composition is the seven-character regulated verse (qilü), whose level-and-oblique (ping-ze) rule follows one of the two standard qilü tonal templates, each character position being prescribed as level (ping), oblique (ze), or flexible (zhong);
wherein 'flexible' (zhong) indicates that the character at that position may carry either a level or an oblique tone;
The selection method under the level-and-oblique rule in step 6 is as follows: the poetry screening mechanism compares the poetry draft with the level-and-oblique rule character by character; where a character does not conform, it is replaced by a character with the same or similar meaning from the meaning word list, and the comparison is repeated until the poetry draft conforms to the level-and-oblique rule completely.
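The screening loop just described can be sketched as below (a hedged illustration: the tone lookup table, the synonym table, and the single-letter "characters" are invented placeholders standing in for the meaning word list and real tonal data):

```python
LEVEL, OBLIQUE, EITHER = "P", "Z", "*"  # ping, ze, and the flexible position

def conforms(tones, template):
    """True when every tone matches the template (flexible '*' matches any)."""
    return all(w == EITHER or w == t for t, w in zip(tones, template))

def screen_line(chars, tone_of, template, synonyms):
    """Replace non-conforming characters with same-meaning substitutes
    whose tone class matches the template position."""
    out = list(chars)
    for i, ch in enumerate(chars):
        want = template[i]
        if want != EITHER and tone_of[ch] != want:
            for alt in synonyms.get(ch, []):
                if tone_of[alt] == want:
                    out[i] = alt
                    break
    return "".join(out)

# toy data: 'a'/'c' are level-tone characters, 'b'/'d' oblique,
# with 'b' a synonym of 'a' and 'd' a synonym of 'c'
tone_of = {"a": "P", "b": "Z", "c": "P", "d": "Z"}
synonyms = {"a": ["b"], "c": ["d"]}
print(screen_line("ac", tone_of, "ZP", synonyms))  # bc
```

The real mechanism repeats this comparison over every line of the draft until `conforms` holds for the whole poem.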
Claims (6)
1. An automatic poetry method based on ancient poetry corpus vectorization is characterized by comprising the following steps:
s1, inputting ancient poems to a corpus processing mechanism, wherein the corpus processing mechanism converts the characters of the ancient poems into corpus vectors and stores the corpus vectors into a corpus vector library;
s2, building an LSTM network model;
s3, inputting the corpus training set to the LSTM network model to complete the training of the LSTM network model;
s4, image words are input to a corpus processing mechanism, and the corpus processing mechanism calculates to obtain poetry alternative words according to the corpus vectors corresponding to each image word in a corpus vector library;
s5, the corpus processing mechanism inputs the poetry alternative words into the LSTM network model to obtain poetry draft;
S6, the poetry screening mechanism selects, according to the rhyme and level-and-oblique (ping-ze) rules of the poetry form, the poem that best conforms to these rules from the poetry drafts, obtaining the finalized poem, which is the automatic poetry composition result;
the specific content of step S1 is as follows:
S1.1, ancient poems are input to the corpus processing mechanism, which segments every character appearing in the ancient poems and records them as m non-repeating characters, a character appearing more than once being recorded as the same non-repeating character;
S1.2, the number of occurrences of each non-repeating character, and of the characters adjacent to it in context, is counted for each poem;
s1.3, the corpus processing mechanism sets a random n-dimensional vector for each nonrepeating word, the n-dimensional vector is the corpus vector of the nonrepeating word, the corpus vector is correspondingly stored in a corpus vector library, n belongs to [180, 220], and n is an integer;
S1.4, a Huffman tree is constructed, comprising end nodes and intermediate nodes; each end node is a child of an intermediate node, each intermediate node has exactly 2 child nodes, each end node points to the corpus vector of one non-repeating character in the corpus vector library and records as its node value the number of occurrences of that character, each intermediate node records as its node value the sum of the node values of its child nodes, end nodes with larger node values are closer to the root node, and the root node is the intermediate node with the largest node value;
S1.5, the context-adjacent characters of any corpus vector x are selected on the Huffman tree according to the following selection probability:

p(context | x) = ∏ p_i

wherein p_i is the probability that the i-th intermediate node on the path selects its first child node, calculated as:

p_i = σ(x · θ_i) = 1 / (1 + e^(−x · θ_i))

wherein x is the corpus vector input to the intermediate node, θ_i is the weight of the corpus vector input at the i-th intermediate node, and the probability of selecting the second child node is 1 − p_i;
S1.6, the gradient descent method is used to calculate the partial derivatives with respect to x and θ_i respectively:

first, the partial derivative with respect to θ_i is calculated and θ_i is updated:

θ_i ← θ_i + η (1 − d_i − σ(x · θ_i)) x

wherein η is the learning rate and d_i = 0 when the first child node is selected at the i-th intermediate node, d_i = 1 otherwise; after the new θ_i is correspondingly updated into p(context | x), the partial derivative with respect to x is calculated and x is updated:

x ← x + η Σ_i (1 − d_i − σ(x · θ_i)) θ_i
updating the new x to the corpus vector library correspondingly;
s1.7, reselecting an un-updated corpus vector x and returning to the step S1.5 until each corpus vector x in the corpus vector library is updated once, so as to obtain a new corpus vector library.
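Steps S1.4 through S1.6 can be condensed into a short Python sketch (an assumption-laden illustration: the update rules are the standard word2vec hierarchical-softmax ones, which the formulas above follow, and the word frequencies, vectors, and learning rate are toy values):

```python
import heapq
import math

def huffman_codes(freqs):
    """Build a Huffman tree over character frequencies and return the
    root-to-leaf bit code of each character (frequent chars get short codes)."""
    # heap item: (frequency, tie-breaker, {char: code-so-far})
    heap = [(f, i, {w: ""}) for i, (w, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {w: "0" + c for w, c in left.items()}
        merged.update({w: "1" + c for w, c in right.items()})
        heapq.heappush(heap, (f1 + f2, i, merged))
        i += 1
    return heap[0][2]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hs_update(x, thetas, code, lr=0.1):
    """One gradient-descent step on -log p(context|x): update each intermediate
    node's weights theta_i in place and return the updated corpus vector x."""
    grad_x = [0.0] * len(x)
    for theta, bit in zip(thetas, code):
        z = sum(xi * ti for xi, ti in zip(x, theta))
        g = lr * ((1 - int(bit)) - sigmoid(z))  # bit = d_i branch indicator
        for j in range(len(x)):
            grad_x[j] += g * theta[j]  # accumulate with the pre-update theta
            theta[j] += g * x[j]       # then update theta_i itself
    return [xi + gi for xi, gi in zip(x, grad_x)]

codes = huffman_codes({"moon": 5, "wine": 2, "frost": 1, "window": 1})
print(sorted(len(c) for c in codes.values()))  # [1, 2, 3, 3]
```

As in step S1.4, the most frequent character sits closest to the root (shortest code), and each bit of a character's code selects one child of an intermediate node, matching the p_i / (1 − p_i) branches of step S1.5.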
2. The automatic poetry method based on ancient poetry corpus vectorization as claimed in claim 1, characterized in that: the corpus training set is a set formed by 80% of the corpus vectors in the corpus vector library, the corpus vectors in the corpus training set being ordered according to the word order in the corresponding ancient poems;
the corpus training set is divided at a ratio of 9:1 into a training corpus and a verification corpus, wherein the training corpus is used to train and tune the parameter settings of the LSTM network model, and the verification corpus is used to verify and proofread the trained and tuned LSTM network model.
3. The automatic poetry method based on ancient poetry corpus vectorization as claimed in claim 1, characterized in that: the image words in step S4 are image words obtained by inputting images to the image feature extraction model, and the specific method is as follows:
S4.1, an image is input to the image feature extraction model, which extracts image words from the image;
S4.2, the corpus processing mechanism matches each image word, one by one, with its corresponding corpus vector in the corpus vector library, these corpus vectors being the poetry alternative words.
4. The ancient poetry corpus vectorization-based automatic poetry method as claimed in claim 3, wherein: the image feature extraction model is an improved VGG-16 convolutional neural network model and comprises a convolutional layer group 1, a pooling layer, a convolutional layer group 2, a pooling layer, a convolutional layer group 3, a pooling layer, a convolutional layer group 4, a pooling layer, a convolutional layer group 5, a pooling layer, 2 convolutional layers, a Bounding-box layer and a Softmax layer which are sequentially connected, wherein the convolutional layer group 1 and the convolutional layer group 2 are composed of 2 convolutional layers which are connected in series, the convolutional layer group 3, the convolutional layer group 4 and the convolutional layer group 5 are composed of 3 convolutional layers which are connected in series, and each convolutional layer is connected with the Bounding-box layer.
5. The automatic poetry method based on ancient poetry corpus vectorization as claimed in claim 1, characterized in that: the image words in step S4 are input as a word a; the corpus processing mechanism obtains subsequent associated words by association calculation from the word a; the word a and its subsequent associated words form a word string, and the word string constitutes the poetry alternative words;
The subsequent associated word is obtained by finding, in the corpus vector library, the next word whose corpus vector has the highest matching degree with the corpus vector of the previous word, the matching degree being calculated as:

cos(a, b) = (a · b) / (|a| |b|)

wherein a is the corpus vector of the previous word and b is the corpus vector of any word in the corpus vector library; the word whose corpus vector b maximizes cos(a, b) is taken as the next word.
6. The automatic poetry method based on ancient poetry corpus vectorization as claimed in claim 1, characterized in that: after the ancient poems are input in step S1, characters with the same or similar meanings are grouped to establish a meaning word list, and the poetry alternative words in step S4 comprise the input image words together with the words in the meaning word list having the same or similar meanings.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810817519.7A CN109086270B (en) | 2018-07-24 | 2018-07-24 | Automatic poetry making system and method based on ancient poetry corpus vectorization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109086270A CN109086270A (en) | 2018-12-25 |
CN109086270B true CN109086270B (en) | 2022-03-01 |
Family
ID=64838256
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810817519.7A Active CN109086270B (en) | 2018-07-24 | 2018-07-24 | Automatic poetry making system and method based on ancient poetry corpus vectorization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109086270B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110309510B (en) * | 2019-07-02 | 2023-05-12 | 中国计量大学 | C-S and GRU-based painting and calligraphy observation method |
CN111814488A (en) * | 2020-07-22 | 2020-10-23 | 网易(杭州)网络有限公司 | Poetry generation method and device, electronic equipment and readable storage medium |
CN112101006A (en) * | 2020-09-14 | 2020-12-18 | 中国平安人寿保险股份有限公司 | Poetry generation method and device, computer equipment and storage medium |
CN112257775B (en) * | 2020-10-21 | 2022-11-15 | 东南大学 | Poetry method by graph based on convolutional neural network and unsupervised language model |
CN112434145A (en) * | 2020-11-25 | 2021-03-02 | 天津大学 | Picture-viewing poetry method based on image recognition and natural language processing |
CN112883710A (en) * | 2021-01-13 | 2021-06-01 | 戴宇航 | Method for optimizing poems authored by user |
CN113051877B (en) * | 2021-03-11 | 2023-06-16 | 杨虡 | Text content generation method and device, electronic equipment and storage medium |
CN113553822B (en) * | 2021-07-30 | 2023-06-30 | 网易(杭州)网络有限公司 | Ancient poetry generating model training, ancient poetry generating method, equipment and storage medium |
CN116070643B (en) * | 2023-04-03 | 2023-08-15 | 武昌理工学院 | Fixed style translation method and system from ancient text to English |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1889366A (en) * | 2006-07-13 | 2007-01-03 | 浙江大学 | Hafman decoding method |
CN104951554A (en) * | 2015-06-29 | 2015-09-30 | 浙江大学 | Method for matching landscape with verses according with artistic conception of landscape |
CN105930318A (en) * | 2016-04-11 | 2016-09-07 | 深圳大学 | Word vector training method and system |
CN105955964A (en) * | 2016-06-13 | 2016-09-21 | 北京百度网讯科技有限公司 | Method and apparatus for automatically generating poem |
CN106569995A (en) * | 2016-09-26 | 2017-04-19 | 天津大学 | Method for automatically generating Chinese poetry based on corpus and metrical rule |
CN107102981A (en) * | 2016-02-19 | 2017-08-29 | 腾讯科技(深圳)有限公司 | Term vector generation method and device |
CN107291693A (en) * | 2017-06-15 | 2017-10-24 | 广州赫炎大数据科技有限公司 | A kind of semantic computation method for improving term vector model |
CN107480132A (en) * | 2017-07-25 | 2017-12-15 | 浙江工业大学 | A kind of classic poetry generation method of image content-based |
CN107832292A (en) * | 2017-11-02 | 2018-03-23 | 合肥工业大学 | A kind of conversion method based on the image of neural network model to Chinese ancient poetry |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10380983B2 (en) * | 2016-12-30 | 2019-08-13 | Google Llc | Machine learning to generate music from text |
Non-Patent Citations (4)
Title |
---|
Chinese Song Iambics Generation with Neural Attention-based Model; Qixin Wang et al.; arXiv:1604.06274v2 [cs.CL]; 2016-06-21; pp. 1-7 *
Evaluation of Word Vector Representations by Subspace Alignment; Yulia Tsvetkov et al.; Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing; 2015-09-21; pp. 2049-2054 *
Computational Research on Classical Chinese Poetry and Couplets (中国古典诗词楹联的计算化研究); Zhou Changle et al.; Mind and Computation (《心智与计算》); 2012-12-31; vol. 6, no. 2, pp. 75-82 *
Construction of a Segmentation Corpus of the Complete Song Ci Based on Statistical Word Extraction and Metrical Rules (基于统计抽词和格律的全宋词切分语料库建立); Su Jinsong et al.; Journal of Chinese Information Processing (《中文信息学报》); 2007-03-31; vol. 21, no. 2, pp. 52-57 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||