CN114662456A - Image ancient poem generation method based on Faster R-convolutional neural network detection model - Google Patents
- Publication number: CN114662456A (application CN202210273907.XA)
- Authority: CN (China)
- Prior art keywords: image, ancient, poetry, picture, neural network
- Legal status: Withdrawn (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F40/166—Editing, e.g. inserting or deleting
- G06F16/951—Indexing; Web crawling techniques
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F40/30—Semantic analysis
- G06N3/045—Combinations of networks
- G06N3/047—Probabilistic or stochastic networks
- G06N3/08—Learning methods
Abstract
An image-to-ancient-poem generation method based on a Faster R-CNN (faster region-based convolutional neural network) detection model comprises: collecting pictures of imagery words common in ancient poems, preprocessing the pictures, constructing an imagery-word image data set, inputting a user image, extracting image keyword features, extracting visual image features, constructing an ancient-poem text generation model, judging the emotional tendency of the generated poem, and displaying the generated poem. Collecting, training on, and constructing a purpose-built imagery-word image data set improves the accuracy, detection speed, and generation speed of image detection during poem generation. Combining image keyword features with image visual features in the poem-generation network improves the thematic consistency between the picture and the poem. Judging the emotional tendency of the generated poem enriches the method's functionality and raises generation quality. The method offers fast generation and high consistency between the image and the poem's theme, and can be used in the technical field of image-based ancient-poem generation.
Description
Technical Field
The invention belongs to the technical field of computers, and particularly relates to computer image target detection, natural language generation and text emotion classification.
Background
Ancient-poem generation is an important and challenging task in natural language generation, aiming to enable computers to create high-quality poetry as poets do. Research on automatic poem generation has shifted from traditional machine-translation approaches to deep-learning approaches with text input. However, the mainstream deep-learning text-input approach has notable problems. First, a small number of input keywords, limited by the expressive ability of text, may fail to convey the user's writing intent and emotional nuance in the generated poem. Second, existing image data sets suffer from low detection accuracy and slow detection speed when applied to image-based poem generation; a dedicated data set of ancient-poem imagery words is lacking. Third, the emotional tendency of the generated poem is not judged. Compared with keywords, pictures carry richer semantic and visual information, are better suited as input for poem generation, and can express the user's writing intent more fully. Building a dedicated imagery-word image data set markedly improves the quality of the generated poems.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an image-to-ancient-poem generation method based on a Faster R-CNN (faster region-based convolutional neural network) detection model, with high retrieval accuracy, fast generation, and good generation quality.
The technical scheme adopted for solving the technical problems comprises the following steps:
(1) Collecting ancient-poem imagery-word pictures
For each of 100 imagery words common in ancient poems, 100 corresponding pictures are crawled from Internet image data, yielding 10,000 imagery-word pictures in total.
(2) Preprocessing the imagery-word pictures
The collected imagery-word pictures are resized to a uniform size, and detail gray-level processing is applied with a piecewise linear gray-level enhancement method to enhance the image contrast and compress unnecessary image detail.
(3) Constructing the imagery-word image data set
The preprocessed pictures are labeled in the Pascal Visual Object Classes (VOC) format: the imagery-word labels contained in each picture are annotated in turn, and an extensible markup language (XML) file is output for each picture. The 10,000 collected pictures and their 10,000 corresponding XML files are split into training and test sets at a ratio of 8:2, and a Faster R-CNN is trained to obtain the ancient-poem imagery-word image data set.
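The 8:2 train/test split of picture/XML annotation pairs described in step (3) can be sketched as follows; the file-stem naming scheme is a hypothetical example:

```python
import random

def split_dataset(stems, train_ratio=0.8, seed=42):
    """Shuffle the annotated samples and split them 8:2 into train/test sets."""
    stems = list(stems)
    random.Random(seed).shuffle(stems)  # deterministic shuffle for reproducibility
    cut = int(len(stems) * train_ratio)
    return stems[:cut], stems[cut:]

# each stem names one picture and its Pascal-VOC XML annotation,
# e.g. img_00042.jpg paired with img_00042.xml
stems = [f"img_{i:05d}" for i in range(10000)]
train, test = split_dataset(stems)
```

Splitting by stem rather than by file keeps each picture and its annotation in the same partition.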
(4) Inputting user images
A single picture for which a poem is to be generated is selected as user input; there is no requirement on the picture size.
(5) Extracting image keyword features
A convolutional neural network extracts high-dimensional semantic features from the user's input picture, and a Softmax function predicts the probability distribution over image labels. The predicted label distribution Π is determined by:
Π = Softmax(f(I)), π_j(I) = exp(f_j(I)) / Σ_n exp(f_n(I))
where I represents the input picture, f represents the convolutional-network computation, j indexes the j-th component of Π, π_j(I) represents the probability that picture I contains the j-th label, f_n(I) represents the n-th label score of picture I after the convolutional-network computation, and j ranges from 0 to 9.
The loss function J of the keyword extraction network is:
where Ψ represents the number of samples.
A probability threshold is set, and each label whose probability exceeds the threshold is taken as a label of the sample, i.e. a keyword of the image. The keyword set K is expressed as:
K = {k1, k2, …, kN}
where N represents the number of extracted keywords and ranges from 0 to 9.
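The Softmax label prediction and threshold-based keyword selection of step (5) can be sketched as below; the label names, scores, and threshold value are illustrative assumptions:

```python
import math

def softmax(scores):
    """Pi = Softmax(f(I)): turn raw label scores into a probability distribution."""
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_keywords(labels, scores, threshold=0.15):
    """Keep every label whose predicted probability exceeds the threshold."""
    probs = softmax(scores)
    return [lab for lab, p in zip(labels, probs) if p >= threshold]

# hypothetical imagery-word labels and network scores f(I)
labels = ["moon", "river", "wind", "snow"]
scores = [2.0, 1.5, 0.1, -1.0]
keywords = select_keywords(labels, scores)   # the keyword set K = {k1, ..., kN}
```

With these toy scores, only the two high-probability labels survive the threshold.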
(6) Extracting visual features of an image
A set V of visual feature vectors is extracted from the picture; each vector contains local visual coding information for a different position of the picture, and the different vectors determine the weight given to each word when the poem is generated.
The visual feature vectors of the user's input picture are obtained through convolutional-network processing, where each convolutional layer is determined as follows: the n-th convolutional layer applies the convolution operation to the output of the (n-1)-th layer and then the ReLU activation function g to produce its own output;
extracting a visual feature vector V according to formula :
V={v1,v2,…,vB}
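A toy sketch of step (6): one convolution-plus-ReLU layer, followed by flattening a (C, H, W) feature map into the vector set V = {v1, …, vB} with B = H·W local vectors; the array sizes are illustrative:

```python
import numpy as np

def conv2d_relu(img, kernel):
    """One valid-mode 2D convolution followed by the ReLU activation g."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return np.maximum(out, 0.0)

def feature_vectors(fmap):
    """Flatten a (C, H, W) feature map into B = H*W local vectors of dimension C."""
    C, H, W = fmap.shape
    return fmap.reshape(C, H * W).T

fmap = conv2d_relu(np.ones((4, 4)), np.ones((2, 2)))   # shape (3, 3), all 4.0
V = feature_vectors(np.stack([fmap, 2 * fmap]))        # B = 9 vectors of dim C = 2
```

Each row of V encodes one spatial position of the picture, matching the per-position vectors described above.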
(7) Constructing the ancient-poem text generation model
The N-element keyword set K = {k1, k2, …, kN} produced by the keyword-extraction network is combined, through the convolutional neural network, with the visual feature vector set V = {v1, v2, …, vB}, where vj carries the information of the j-th region of the picture. The poem is generated line by line: when generating the i-th line li, all previously generated lines l1:i-1 = {x1, x2, …, xC} serve as model input, where l1:i-1 denotes the concatenated sequence of lines 1 through i-1, xj is the vector representation of the j-th word, and C is the length of the poem sequence. The first line of the poem is generated from the keyword set K and the visual feature vector set V, and the remaining lines are generated in the same manner.
The encoder of the poem-generation network is a bidirectional gated recurrent unit (GRU), which encodes the generated sequence l1:i-1 into hidden vectors H. The forward GRU encodes l1:i-1 from front to back to obtain the forward semantic hidden vector hj(f), and the backward GRU encodes l1:i-1 from back to front to obtain the backward semantic hidden vector hj(b). Their concatenation forms the coding vector of the sequence:
hj = [hj(f); hj(b)]
where GRU() denotes the gated-recurrent-unit operation and [ ; ] denotes the concatenation of the forward and backward semantic hidden vectors.
The decoder of the poem-generation network is a unidirectional GRU. It decodes the visual feature information V together with the encoded context H into the next line li = {y1, y2, …, yG}. The decoder's GRU cyclically updates an internal transition state st used to decode yt; after st is updated, a Softmax function computes the probability distribution over candidate words, and the word with the highest probability is selected as yt, i.e. the next word, so that the next line of the poem is generated word by word.
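The word-by-word decoding loop — Softmax over next-word scores, then pick the highest-probability word — can be sketched with a toy transition table standing in for the decoder GRU state; the vocabulary and scores are invented for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def greedy_decode(start_id, logits_fn, eos_id, max_len=10):
    """Repeatedly pick the highest-probability next word until end-of-line."""
    seq, tok = [], start_id
    for _ in range(max_len):
        tok = int(np.argmax(softmax(logits_fn(tok))))
        if tok == eos_id:
            break
        seq.append(tok)
    return seq

vocab = ["<s>", "spring", "river", "moon", "</s>"]
T = np.full((5, 5), -5.0)   # toy next-word scores in place of the GRU state s_t
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    T[a, b] = 5.0            # force the chain <s> -> spring -> river -> moon -> </s>
line = [vocab[i] for i in greedy_decode(0, lambda t: T[t], eos_id=4)]
```

In the real model the scores would come from the decoder state st rather than a fixed table.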
(8) Judging emotional tendency of ancient poems
The generated poem lines are input to an emotional-tendency judgment network. By querying an emotion dictionary, emotion words are assigned weights according to their emotional strength, and the emotional tendency of the poem is judged from the weighted sum:
S = Σ(i = 1 … Np) w_pi − Σ(j = 1 … Nn) w_nj
where Np represents the number of positive-emotion words, Nn the number of negative-emotion words, w_pi the weight of the i-th positive-emotion word, and w_nj the weight of the j-th negative-emotion word.
The weighted result S is the judgment basis: when S > 0 the generated poem has a positive emotional tendency, when S < 0 a negative one, and when S = 0 no emotional tendency.
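The weighted emotion sum can be sketched as below; the dictionary entries and weights are invented stand-ins for a real emotion lexicon:

```python
def sentiment_score(words, lexicon):
    """Weighted sum: positive-word weights minus negative-word weights."""
    weights = [lexicon.get(w, 0.0) for w in words]
    pos = sum(w for w in weights if w > 0)    # sum over positive-emotion words
    neg = sum(-w for w in weights if w < 0)   # sum over negative-emotion words
    return pos - neg

def sentiment_label(score):
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# hypothetical emotion dictionary: word -> signed emotional-strength weight
lexicon = {"joy": 2.0, "bright": 1.0, "sorrow": -2.0}
score = sentiment_score(["joy", "sorrow", "bright", "moon"], lexicon)
```

Words absent from the dictionary contribute zero weight and do not affect the judgment.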
(9) Displaying the generated poem
The generated ancient poem and the judgment of its emotional tendency are displayed to the user.
In step (2) of the present invention, the detail gray-level processing with the piecewise linear gray-level enhancement method is as follows: the gray levels of the user-input image f(x, y) lie in the range 0-32 and those of the enhanced image h(x, y) in the range 38-64; h(x, y) is obtained by linearly mapping each gray interval of the original image onto the corresponding interval of the enhanced image, where x represents the pixel column, y represents the pixel row, and the intervals [a, b] and [c, d] are corresponding gray intervals of the original and enhanced image respectively.
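The piecewise linear mapping of one gray interval [a, b] of the original image onto the interval [c, d] of the enhanced image can be sketched as below, using the ranges stated above (input 0-32, enhanced 38-64):

```python
def stretch(v, a, b, c, d):
    """Linearly map gray level v from [a, b] onto [c, d], clamping outside the interval."""
    if v <= a:
        return float(c)
    if v >= b:
        return float(d)
    return c + (d - c) * (v - a) / (b - a)

# map the stated input range 0-32 onto the enhanced range 38-64
enhanced = [stretch(v, 0, 32, 38, 64) for v in (0, 16, 32)]
```

A full piecewise curve would apply one such mapping per segment, with steeper slopes on the intervals whose contrast is to be enhanced.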
In step (3), labeling the preprocessed pictures in the Pascal VOC format comprises: image collection, image preprocessing, image annotation, data-set splitting, Faster R-CNN model training, and model export and storage. The loss function L({pi}, {ti}) of the Faster R-CNN model is determined by the classification loss Lcls and the regression loss Lreg:
L({pi}, {ti}) = (1/Ncls) Σi Lcls(pi, pi*) + λ (1/Nreg) Σi pi* Lreg(ti, ti*)
Lreg(ti, ti*) = smooth_L1(ti − ti*)
smooth_L1(x) = 0.5 x^2 if |x| < 1, |x| − 0.5 otherwise
where i is the index of an anchor box in the mini-batch; pi is the predicted probability that anchor i is a target; pi* is the ground-truth label, taking the value 0 or 1; ti is the 4-dimensional parameterized coordinate vector of the predicted bounding box and ti* that of the ground-truth box associated with a positive anchor; Ncls is the mini-batch size; Nreg is the number of anchor positions, here 2400; λ is 10. The learning rate of the model is 0.0025, the number of training epochs is 12, the batch size of each training step is 2, and the backbone network is ResNet-50.
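The smooth-L1 regression term of the Faster R-CNN loss has the standard form below (quadratic near zero, linear elsewhere); the combined-loss function shows how the classification and regression terms are balanced by λ, with toy per-anchor values:

```python
def smooth_l1(x):
    """smooth_L1(x) = 0.5*x^2 if |x| < 1, else |x| - 0.5."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def frcnn_loss(cls_losses, reg_diffs, labels, n_cls, n_reg, lam=10.0):
    """Toy scalar version: (1/N_cls) sum L_cls + lam * (1/N_reg) sum p* smooth_L1(t - t*)."""
    cls_term = sum(cls_losses) / n_cls
    # only positive anchors (p* = 1) contribute to the regression term
    reg_term = sum(p * smooth_l1(d) for p, d in zip(labels, reg_diffs)) / n_reg
    return cls_term + lam * reg_term

vals = [smooth_l1(0.5), smooth_l1(-2.0)]
loss = frcnn_loss([0.3, 0.1], [0.5, 2.0], [1, 0], n_cls=2, n_reg=2400)
```

In the real model each ti is a 4-vector and smooth_L1 is applied per coordinate; the scalar version above only illustrates the weighting.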
Compared with the prior art, the invention has the following advantages:
The method builds its own ancient-poem imagery-word image data set and applies an image detection algorithm based on Faster R-CNN, which improves detection accuracy and detection speed during poem generation. Driving poem generation with both the extracted keyword information and the visual feature information of the image strengthens the thematic consistency between the generated poem and the image. The combination of a bidirectional-GRU encoder and a unidirectional-GRU decoder improves the cohesion and continuity between the lines of the poem, and judging the emotional tendency of the poem enriches the functionality and readability of poem generation, raising its quality and interest. The invention lets users experience the fun of turning an input picture into an ancient poem, popularizes ancient-poem culture to the general public, attracts the public to inherit and spread poetry culture, and helps strengthen cultural confidence while carrying forward excellent traditional Chinese culture.
Drawings
FIG. 1 is a flow chart of example 1 of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the drawings and examples, but the present invention is not limited to the embodiments described below.
Example 1
In fig. 1, the image-to-ancient-poem generation method of this embodiment, based on the Faster R-CNN detection model, comprises the following steps:
(1) Collecting ancient-poem imagery-word pictures
For each of 100 imagery words common in ancient poems, 100 corresponding pictures are crawled from Internet image data, yielding 10,000 imagery-word pictures in total.
(2) Preprocessing the imagery-word pictures
The collected imagery-word pictures are resized to a uniform size, and detail gray-level processing is applied with a piecewise linear gray-level enhancement method to enhance the image contrast and compress unnecessary image detail.
The detail gray-level processing with the piecewise linear gray-level enhancement method is as follows: the gray levels of the user-input image f(x, y) lie in the range 0-32; in this embodiment f(x, y) has 16 gray levels, the exact number being determined by the input image. The gray levels of the enhanced image h(x, y) lie in the range 38-64; in this embodiment h(x, y) has 50 gray levels, the exact number being determined by the result of the detail gray-level processing.
The enhanced image h(x, y) is obtained by linearly mapping each gray interval of the original image onto the corresponding interval of the enhanced image, where x represents the pixel column, y represents the pixel row, and the intervals [a, b] and [c, d] are corresponding gray intervals of the original and enhanced image respectively.
(3) Constructing the imagery-word image data set
The preprocessed pictures are labeled in the Pascal Visual Object Classes (VOC) format: the imagery-word labels contained in each picture are annotated in turn, and an XML file is output for each picture. The 10,000 collected pictures and their 10,000 corresponding XML files are split into training and test sets at a ratio of 8:2, and a Faster R-CNN is trained to obtain the ancient-poem imagery-word image data set.
Labeling the preprocessed pictures in the Pascal VOC format comprises the following steps: image collection, image preprocessing, image annotation, data-set splitting, Faster R-CNN model training, and model export and storage. The loss function L({pi}, {ti}) of the Faster R-CNN model is determined by the classification loss Lcls and the regression loss Lreg:
L({pi}, {ti}) = (1/Ncls) Σi Lcls(pi, pi*) + λ (1/Nreg) Σi pi* Lreg(ti, ti*)
Lreg(ti, ti*) = smooth_L1(ti − ti*)
smooth_L1(x) = 0.5 x^2 if |x| < 1, |x| − 0.5 otherwise
where i is the index of an anchor box in the mini-batch; pi is the predicted probability that anchor i is a target; pi* is the ground-truth label, taking the value 0 or 1; ti is the 4-dimensional parameterized coordinate vector of the predicted bounding box and ti* that of the ground-truth box associated with a positive anchor; Ncls is the mini-batch size; Nreg is the number of anchor positions, here 2400; λ is 10. The learning rate of the model is 0.0025, the number of training epochs is 12, the batch size of each training step is 2, and the backbone network is ResNet-50.
(4) Inputting user images
A single picture for which a poem is to be generated is selected as user input; there is no requirement on the picture size.
(5) Extracting image keyword features
A convolutional neural network extracts high-dimensional semantic features from the user's input picture, and a Softmax function predicts the probability distribution over image labels. The predicted label distribution Π is determined by:
Π = Softmax(f(I)), π_j(I) = exp(f_j(I)) / Σ_n exp(f_n(I))
where I represents the input picture, f represents the convolutional-network computation, j indexes the j-th component of Π, π_j(I) represents the probability that picture I contains the j-th label, f_n(I) represents the n-th label score of picture I after the convolutional-network computation, and j ranges from 0 to 9; in this embodiment j takes the value 5.
The loss function J of the keyword extraction network is:
where Ψ represents the number of samples.
A probability threshold is set, and each label whose probability exceeds the threshold is taken as a label of the sample, i.e. a keyword of the image. The keyword set K is expressed as:
K = {k1, k2, …, kN}
where N represents the number of extracted keywords and ranges from 0 to 9; in this embodiment N takes the value 5.
(6) Extracting visual features of an image
A set V of visual feature vectors is extracted from the picture; each vector contains local visual coding information for a different position of the picture, and the different vectors determine the weight given to each word when the poem is generated.
The visual feature vectors of the user's input picture are obtained through convolutional-network processing, where each convolutional layer is determined as follows: the n-th convolutional layer applies the convolution operation to the output of the (n-1)-th layer and then the ReLU activation function g to produce its own output.
The set of visual feature vectors V is extracted as:
V = {v1, v2, …, vB}
Driving poem generation with both the extracted keyword information and the visual feature information of the image enhances the thematic consistency between the generated poem and the image information.
(7) Creating ancient poetry text generation model
The N-element keyword set K = {k1, k2, …, kN} produced by the keyword-extraction network is combined, through the convolutional neural network, with the visual feature vector set V = {v1, v2, …, vB}, where vj carries the information of the j-th region of the picture. The poem is generated line by line: when generating the i-th line li, all previously generated lines l1:i-1 = {x1, x2, …, xC} serve as model input, where l1:i-1 denotes the concatenated sequence of lines 1 through i-1, xj is the vector representation of the j-th word, and C is the length of the poem sequence. The first line of the poem is generated from the keyword set K and the visual feature vector set V, and the remaining lines are generated in the same manner.
The encoder of the poem-generation network is a bidirectional gated recurrent unit (GRU), which encodes the generated sequence l1:i-1 into hidden vectors H. The forward GRU encodes l1:i-1 from front to back to obtain the forward semantic hidden vector hj(f), and the backward GRU encodes l1:i-1 from back to front to obtain the backward semantic hidden vector hj(b). Their concatenation forms the coding vector of the sequence:
hj = [hj(f); hj(b)]
where GRU() denotes the gated-recurrent-unit operation and [ ; ] denotes the concatenation of the forward and backward semantic hidden vectors.
The decoder of the poem-generation network is a unidirectional GRU. It decodes the visual feature information V together with the encoded context H into the next line li = {y1, y2, …, yG}. The decoder's GRU cyclically updates an internal transition state st used to decode yt; after st is updated, a Softmax function computes the probability distribution over candidate words, and the word with the highest probability is selected as yt, i.e. the next word, so that the next line of the poem is generated word by word.
The invention adopts the combination of a bidirectional-GRU encoder and a unidirectional-GRU decoder, which improves the cohesion and continuity between the lines of the generated poem.
(8) Judging emotional tendency of ancient poems
The generated poem lines are input to an emotional-tendency judgment network. By querying an emotion dictionary, emotion words are assigned weights according to their emotional strength, and the emotional tendency of the poem is judged from the weighted sum:
S = Σ(i = 1 … Np) w_pi − Σ(j = 1 … Nn) w_nj
where Np represents the number of positive-emotion words, Nn the number of negative-emotion words, w_pi the weight of the i-th positive-emotion word, and w_nj the weight of the j-th negative-emotion word.
The weighted result S is the judgment basis: when S > 0 the generated poem has a positive emotional tendency, when S < 0 a negative one, and when S = 0 no emotional tendency.
(9) Displaying the generated poem
The generated ancient poem and the judgment of its emotional tendency are displayed to the user.
This completes the image-to-ancient-poem generation method based on the Faster R-CNN detection model.
Example 2
The image-to-ancient-poem generation method of this embodiment, based on the Faster R-CNN detection model, comprises the following steps:
(1) Collecting ancient-poem imagery-word pictures
This procedure is the same as in example 1.
(2) Preprocessing the imagery-word pictures
The collected imagery-word pictures are resized to a uniform size, and detail gray-level processing is applied with a piecewise linear gray-level enhancement method to enhance the image contrast and compress unnecessary image detail.
The detail gray-level processing with the piecewise linear gray-level enhancement method is as follows: the gray levels of the user-input image f(x, y) lie in the range 0-32; in this embodiment the gray level of f(x, y) is taken as 0, the exact number being determined by the input image. The gray levels of the enhanced image h(x, y) lie in the range 38-64; in this embodiment the gray level of h(x, y) is taken as 38, the exact number being determined by the result of the detail gray-level processing.
The specific method for determining the enhanced image h (x, y) is the same as in example 1.
(3) Constructing the imagery-word image data set
This procedure is the same as in example 1.
(4) Inputting user images
This procedure is the same as in example 1.
(5) Extracting image keyword features
A convolutional neural network extracts high-dimensional semantic features from the user's input picture, and a Softmax function predicts the probability distribution over image labels. The predicted label distribution Π is determined by:
Π = Softmax(f(I)), π_j(I) = exp(f_j(I)) / Σ_n exp(f_n(I))
where I represents the input picture, f represents the convolutional-network computation, j indexes the j-th component of Π, π_j(I) represents the probability that picture I contains the j-th label, f_n(I) represents the n-th label score of picture I after the convolutional-network computation, and j ranges from 0 to 9; in this embodiment j takes the value 0.
The loss function J of the keyword extraction network is:
where Ψ represents the number of samples.
A probability threshold is set, and each label whose probability exceeds the threshold is taken as a label of the sample, i.e. a keyword of the image. The keyword set K is expressed as:
K = {k1, k2, …, kN}
where N represents the number of extracted keywords and ranges from 0 to 9; in this embodiment N takes the value 0.
The other steps were the same as in example 1.
This completes the image-to-ancient-poem generation method based on the Faster R-CNN detection model.
Example 3
The image-to-ancient-poem generation method of this embodiment, based on the Faster R-CNN detection model, comprises the following steps:
(1) Collecting ancient-poem imagery-word pictures
This procedure is the same as in example 1.
(2) Preprocessing the imagery-word pictures
The collected imagery-word pictures are resized to a uniform size, and detail gray-level processing is applied with a piecewise linear gray-level enhancement method to enhance the image contrast and compress unnecessary image detail.
The detail gray-level processing with the piecewise linear gray-level enhancement method is as follows: the gray levels of the user-input image f(x, y) lie in the range 0-32; in this embodiment f(x, y) has 32 gray levels, the exact number being determined by the input image. The gray levels of the enhanced image h(x, y) lie in the range 38-64; in this embodiment h(x, y) has 64 gray levels, the exact number being determined by the result of the detail gray-level processing.
The specific method for determining the enhanced image h (x, y) is the same as in example 1.
(3) Constructing the imagery-word image data set
This procedure is the same as in example 1.
(4) Inputting user images
This procedure is the same as in example 1.
(5) Extracting image keyword features
A convolutional neural network extracts high-dimensional semantic features from the user's input picture, and a Softmax function predicts the probability distribution over image labels. The predicted label distribution Π is determined by:
Π = Softmax(f(I)), π_j(I) = exp(f_j(I)) / Σ_n exp(f_n(I))
where I represents the input picture, f represents the convolutional-network computation, j indexes the j-th component of Π, π_j(I) represents the probability that picture I contains the j-th label, f_n(I) represents the n-th label score of picture I after the convolutional-network computation, and j ranges from 0 to 9; in this embodiment j takes the value 9.
The loss function J of the keyword extraction network is:
where Ψ represents the number of samples.
A probability threshold is set, and each label whose probability exceeds the threshold is taken as a label of the sample, i.e. a keyword of the image. The keyword set K is expressed as:
K = {k1, k2, …, kN}
where N represents the number of extracted keywords and ranges from 0 to 9; in this embodiment N takes the value 9.
The other steps were the same as in example 1.
This completes the image ancient poem generation method based on the Faster R-convolutional neural network detection model.
Claims (3)
1. An image ancient poem generation method based on a Faster R-convolutional neural network detection model is characterized by comprising the following steps:
(1) collecting ancient poetry imagery-word pictures
Based on 100 imagery words commonly used in ancient poems, crawling 100 pictures corresponding to each imagery word from Internet image data by a web-crawler method, obtaining 10000 ancient poetry imagery-word pictures;
(2) preprocessing the ancient poetry imagery-word pictures
Carrying out size unification on the collected imagery-word pictures, and carrying out detail gray-level processing on the pictures by a piecewise linear gray-level enhancement method to enhance the image contrast and compress unnecessary image details;
(3) constructing the ancient poetry imagery-word image data set
Labeling the preprocessed pictures by the Pascal Visual Object Classes method, labeling in turn the imagery-word tags contained in each picture, outputting the extensible markup language (XML) file corresponding to each picture, and splitting the data set of the 10000 collected pictures and their 10000 corresponding XML files at a ratio of 8:2; training with the Faster R-convolutional neural network to obtain the ancient poetry imagery data set;
(4) inputting user images
Selecting a single picture for which a poem is to be generated as the user input, with no requirement on the picture size;
(5) extracting image keyword features
Extracting high-dimensional semantic features from the user's input image with a convolutional neural network, predicting the probability distribution of the image tags with a Softmax function, and determining the predicted tag distribution Π by the formula:
Π = Softmax(f(I))
π_j(I) = exp(f_j(I)) / Σ_n exp(f_n(I))
where I represents the input picture, f represents the convolutional neural network computation, j indexes the jth component of Π, π_j(I) represents the probability that picture I contains the jth tag, and f_n(I) represents the nth tag score of picture I after the convolutional neural network computation; j ranges from 0 to 9;
the loss function J of the keyword extraction network is the cross-entropy over the training samples:
J = -(1/Ψ) Σ_{i=1}^{Ψ} Σ_j y_j^(i) log π_j(I_i)
where Ψ represents the number of samples and y_j^(i) is the ground-truth indicator (0 or 1) of the jth tag for the ith sample;
setting a probability threshold and selecting the tags whose probability exceeds the threshold as the labels of the sample, i.e., the keywords of the image; the keyword set K is expressed as:
K={k1,k2,……,kN}
wherein N represents the number of extracted keywords, and N ranges from 0 to 9;
(6) extracting the visual features of the image
A set V of visual feature vectors is extracted from the picture; each vector contains the local visual coding information of a different position of the picture, and the different vectors represent the weight given to each character when the ancient poem is generated;
obtaining the visual feature vectors of the user-input picture through convolutional neural network processing, the convolutional layers being determined by the formula:
x^(n) = g(W^(n) * x^(n-1) + b^(n))
where n denotes the nth convolutional layer, x^(n) denotes the output of the nth convolutional layer, W^(n) and b^(n) are its weights and bias, * denotes the convolution operation, and g denotes the ReLU activation function;
extracting the visual feature vector set V according to the formula:
V={v1,v2,…,vB}
where B is the number of extracted visual feature vectors;
(7) building the ancient poetry text generation model
The N keywords K = {k1, k2, …, kN} obtained by the keyword extraction network are combined with the visual feature vector set V = {v1, v2, …, vB} by convolutional neural networks, wherein v_j represents the information of the jth part of the picture contained in each visual feature vector; the ancient poem is generated line by line, and when generating the ith line l_i, all previously generated lines l_{1:i-1} = {x1, x2, …, xC} serve as inputs to the model, wherein l_{1:i-1} denotes the concatenated sequence of lines 1 to i-1 of the poem, x_j denotes the vector representation of the jth word, and C is the length of the poem sequence; the first line of the poem is generated from the keyword set K and the visual feature vector set V, and the other lines are generated in the same manner;
the encoder of the ancient poetry generation network selects a bidirectional gated recurrent unit (GRU) and encodes the generated poem sequence l_{1:i-1} into a hidden vector H; the forward GRU is responsible for encoding the sequence l_{1:i-1} in the forward direction to obtain the forward semantic hidden vector h→_j, the backward GRU is responsible for encoding the sequence l_{1:i-1} in the reverse direction to obtain the backward semantic hidden vector h←_j, and their concatenation is taken as the encoding vector h_j of the sequence l_{1:i-1}, determined by the formulas:
h→_j = GRU(x_j, h→_{j-1})
h←_j = GRU(x_j, h←_{j+1})
h_j = [h→_j; h←_j]
wherein GRU() is the gated recurrent unit operation, and [·;·] denotes the splicing of the forward and backward semantic hidden vectors;
the decoder of the ancient poetry generation network selects a unidirectional gated recurrent unit, and decodes the visual feature information V and the preceding encoding information H into the next poem line l_i = {y1, y2, …, yG}; the decoder's gated recurrent unit cyclically updates the internal transition state s_t used for decoding y_t; after each update of s_t, a Softmax function computes the probability distribution of each word, the output with the highest probability is selected as y_t, and the next poem line is generated word by word;
(8) judging emotional tendency of ancient poems
Outputting the generated poem lines to an emotional-tendency judgment network, assigning weights to emotional words according to their emotional strength by querying an emotion dictionary, and judging the emotional tendency E of the poem by the weighted sum:
E = Σ_{i=1}^{Np} w_pi - Σ_{j=1}^{Nn} w_nj
wherein Np represents the number of positive-emotion words, Nn represents the number of negative-emotion words, w_pi represents the weight of the ith positive-emotion word, and w_nj represents the weight of the jth negative-emotion word;
the weighted result is the judgment basis: E > 0 means the generated poem has a positive emotional tendency, E < 0 a negative emotional tendency, and E = 0 no emotional tendency;
(9) show and generate ancient poems
Displaying the generated ancient poem and its emotional-tendency judgment result to the user.
2. The method for generating image ancient poetry based on the Faster R-convolutional neural network detection model as claimed in claim 1, wherein in the step (2), the detail gray-level processing of the picture with the piecewise linear gray-level enhancement method is: the gray level of the user-input image f (x, y) is 0-32 levels, the gray level of the enhanced image h (x, y) is 38-64 levels, and h (x, y) is determined by the piecewise linear transform:
h(x, y) = (c/a)·f(x, y), for 0 ≤ f(x, y) < a
h(x, y) = ((d-c)/(b-a))·(f(x, y)-a) + c, for a ≤ f(x, y) < b
h(x, y) = ((64-d)/(32-b))·(f(x, y)-b) + d, for b ≤ f(x, y) ≤ 32
wherein x represents the pixel coordinate along the picture length, y represents the pixel coordinate along the picture width, and the intervals [a, b] and [c, d] are a gray-level interval of the original image and the corresponding interval of the enhanced image, respectively.
3. The method for generating image ancient poetry based on the Faster R-convolutional neural network detection model as claimed in claim 1, wherein in the step (3), said labeling the preprocessed pictures by the Pascal Visual Object Classes method comprises: image collection, image preprocessing, image annotation, data-set segmentation, Faster R-convolutional neural network model training, and model export and storage; the loss function L({p_i}, {t_i}) of the Faster R-convolutional neural network model is determined by the classification loss function L_cls(p_i, p_i*) and the regression function L_reg(t_i, t_i*) as follows:
L({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ·(1/N_reg) Σ_i p_i*·L_reg(t_i, t_i*)
L_cls(p_i, p_i*) = -log[p_i*·p_i + (1 - p_i*)(1 - p_i)]
L_reg(t_i, t_i*) = R(t_i - t_i*)
R = smooth_L1(x)
wherein i is the index of an anchor box in the mini-batch; p_i is the predicted probability that anchor box i is a target; p_i* is the ground-truth box label, with value 0 or 1; t_i is the vector of 4 parameterized coordinates representing the predicted bounding box; t_i* is the coordinate vector of the ground-truth box corresponding to a positive anchor box; N_cls is the mini-batch size; N_reg is the number of anchor box positions, here 2400; smooth_L1(x) is the smooth L1 function; λ is 10; the learning rate of the model is 0.0025, the number of iteration rounds is 12, the batch size of each training round is 2, and the backbone network is the residual network ResNet-50.
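As an illustrative sketch (not part of the claims), the smooth-L1 term and the combined two-part loss of claim 3 can be written as follows; the function names, array shapes, and the ε-smoothing of the logarithm are assumptions:

```python
import numpy as np

def smooth_l1(x):
    """smooth_L1(x) = 0.5*x^2 if |x| < 1, else |x| - 0.5."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(np.abs(x) < 1.0, 0.5 * x ** 2, np.abs(x) - 0.5)

def faster_rcnn_loss(p, p_star, t, t_star, n_cls, n_reg, lam=10.0):
    """Classification plus lambda-weighted regression loss over the anchors.

    p, p_star: predicted probability / ground-truth label (0 or 1) per anchor;
    t, t_star: predicted / ground-truth box coordinates, shape (anchors, 4).
    """
    eps = 1e-12  # avoid log(0)
    l_cls = -(p_star * np.log(p + eps) + (1 - p_star) * np.log(1 - p + eps))
    l_reg = smooth_l1(t - t_star).sum(axis=1)  # summed over the 4 coordinates
    return l_cls.sum() / n_cls + lam * (p_star * l_reg).sum() / n_reg
```

The regression term is gated by p_i*, so only positive anchors contribute box-coordinate loss, matching the claim's Σ_i p_i*·L_reg(t_i, t_i*).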
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210273907.XA CN114662456A (en) | 2022-03-19 | 2022-03-19 | Image ancient poem generation method based on Faster R-convolutional neural network detection model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114662456A true CN114662456A (en) | 2022-06-24 |
Family
ID=82031805
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115062179A (en) * | 2022-07-06 | 2022-09-16 | 吴致远 | Image-oriented end-to-end Chinese ancient poetry recommendation method based on deep learning |
CN115080786A (en) * | 2022-08-22 | 2022-09-20 | 科大讯飞股份有限公司 | Picture poetry-based method, device and equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication |
Application publication date: 20220624 |