CN114662456A - Image ancient-poem generation method based on a Faster R-CNN detection model - Google Patents


Info

Publication number
CN114662456A
CN114662456A (application CN202210273907.XA)
Authority
CN
China
Prior art keywords
image
ancient
poetry
picture
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210273907.XA
Other languages
Chinese (zh)
Inventor
谈启雷
吴晓军
杨红红
张玉梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi Normal University
Original Assignee
Shaanxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shaanxi Normal University filed Critical Shaanxi Normal University
Priority: CN202210273907.XA
Publication: CN114662456A
Legal status: Withdrawn

Classifications

    • G06F40/166 — Handling natural language data; Text processing; Editing, e.g. inserting or deleting
    • G06F16/951 — Information retrieval; Retrieval from the web; Indexing; Web crawling techniques
    • G06F18/214 — Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F40/30 — Handling natural language data; Semantic analysis
    • G06N3/045 — Neural networks; Combinations of networks
    • G06N3/047 — Neural networks; Probabilistic or stochastic networks
    • G06N3/08 — Neural networks; Learning methods


Abstract

An image ancient-poem generation method based on a Faster R-CNN detection model comprises: collecting ancient-poetry imagery-word pictures, preprocessing them, constructing an imagery-word image data set, inputting a user image, extracting image keyword features, extracting visual image features, constructing an ancient-poem text generation model, judging the emotional tendency of the poem, and displaying the generated poem. By collecting the pictures and training on the self-built imagery-word image data set, the method improves the accuracy and speed of image detection, and the generation speed, during image-to-poem generation. Combining image keyword features with image visual features in the poem-generation network improves the thematic consistency between the picture and the poem. Judging the emotional tendency of the generated poem enriches the generation function and raises the generation quality. The method offers fast generation and high consistency between the image and the poem's theme, and can be used in the technical field of image-based ancient-poem generation.

Description

Image ancient-poem generation method based on a Faster R-CNN detection model
Technical Field
The invention belongs to the technical field of computers, and particularly relates to computer image target detection, natural language generation and text emotion classification.
Background
Ancient-poetry generation is an important and challenging task in natural language generation, which aims to enable computers to create high-quality poems as poets do. Research on automatic poetry generation has shifted from traditional machine-translation approaches to deep-learning approaches with text input. The mainstream deep-learning text-input methods, however, have notable problems. First, a small number of input keywords, limited by the expressive power of the input text, may fail to convey the user's writing intent and emotional nuance in the generated poem. Second, existing image data sets suffer from low detection accuracy and slow detection speed when applied to image-to-poem generation, and a dedicated data set of ancient-poetry imagery words is lacking. Third, emotional-tendency judgment of the generated poems is missing. Compared with keywords, pictures carry richer semantic and visual information, are better suited as input for poem generation, and express the user's writing intent more fully. Building a dedicated imagery-word image data set for ancient poetry markedly improves generation quality.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an image ancient-poem generation method based on a Faster R-CNN detection model, which has high retrieval accuracy, high generation speed and good generation quality.
The technical scheme adopted for solving the technical problems comprises the following steps:
(1) Collecting ancient-poetry imagery-word pictures
For each of 100 imagery words common in ancient poetry, 100 corresponding pictures are crawled from Internet image data, yielding 10,000 ancient-poetry imagery-word pictures.
(2) Preprocessing the ancient-poetry imagery-word pictures
The collected imagery-word pictures are resized to a uniform size, and detail gray-level processing is applied with a piecewise linear gray-level enhancement method to enhance image contrast and compress unneeded image details.
(3) Constructing the ancient-poetry imagery-word image data set
The preprocessed pictures are annotated with the Pascal Visual Object Classes (VOC) method: the imagery-word labels contained in each picture are marked in turn, and an extensible markup language (XML) file is output for each picture. The 10,000 collected pictures and their 10,000 corresponding XML files are split into training and test sets at an 8:2 ratio, and a Faster R-CNN is trained to obtain the ancient-poetry imagery-word image data set.
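The 8:2 data-set segmentation described above can be sketched as follows; the helper name and random seed are illustrative, not from the patent:

```python
import random

def split_dataset(picture_ids, train_ratio=0.8, seed=42):
    """Shuffle picture IDs and split them 8:2 into train/test lists.
    Each ID is assumed to pair a picture with its XML annotation."""
    ids = list(picture_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]

# 10,000 pictures, each with a corresponding XML file of the same stem.
train, test = split_dataset(range(10000))
```

With 10,000 samples this yields 8,000 training and 2,000 test items, with no overlap between the two sets.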
(4) Inputting user images
A single picture for which a poem is to be written is selected as the user input; there is no requirement on the picture's size.
(5) Extracting image keyword features
High-dimensional semantic features of the user's input picture are extracted with a convolutional neural network, the probability distribution over image labels is predicted with a Softmax function, and the predicted label distribution Π is determined by:

Π = Softmax(f(I))

π_j(I) = e^{f_j(I)} / Σ_n e^{f_n(I)}

where I denotes the input picture, f the convolutional-neural-network computation, j the j-th component of Π, π_j(I) the probability that picture I contains the j-th label, and f_n(I) the n-th label score of picture I after the network computation; j ranges from 0 to 9.

The loss function J of the keyword-extraction network is the cross-entropy

J = −(1/Ψ) Σ_{ψ=1}^{Ψ} log π_{y^{(ψ)}}(I^{(ψ)})

where Ψ is the number of samples and y^{(ψ)} the ground-truth label of sample ψ.

A probability threshold is set, and the labels whose probability exceeds the threshold are taken as the sample's labels, i.e. the keywords of the image. The keyword set K is represented as:

K = {k_1, k_2, ……, k_N}

where N is the number of extracted keywords, ranging from 0 to 9.
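A minimal sketch of the Softmax-and-threshold keyword selection; the label names, scores and the 0.2 threshold are invented for illustration:

```python
import numpy as np

def extract_keywords(scores, labels, threshold=0.2):
    """Softmax over the label scores f(I); keep every label whose
    probability exceeds the threshold, i.e. the image's keyword set K."""
    e = np.exp(scores - np.max(scores))   # numerically stable softmax
    probs = e / e.sum()
    return [lab for lab, p in zip(labels, probs) if p > threshold]

labels = ["moon", "mountain", "river", "pine", "snow"]
scores = np.array([4.0, 3.5, 0.5, 0.2, 0.1])   # hypothetical f(I) scores
K = extract_keywords(scores, labels)
```

Here "moon" and "mountain" dominate the distribution, so only they survive the threshold and become the keyword set K.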
(6) Extracting visual features of an image
A set V of visual feature vectors is extracted from the picture; each vector contains local visual encoding information of a different position of the picture, and the different vectors weight each word when the poem is generated.

The visual feature vectors of the user's input picture are obtained through convolutional-neural-network processing, where the convolution layers are determined by:

x^n = g(x^{n−1} ∗ w^n + b^n)

where n indexes the n-th convolution layer, x^n is its output, ∗ is the convolution operation, w^n and b^n are the layer's kernel and bias, and g is the ReLU activation function.

The visual feature vector set V is extracted as:

V = {v_1, v_2, …, v_B}
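As a toy illustration of one convolution layer x^n = g(x^{n−1} ∗ w^n + b^n) with ReLU activation (not the patent's actual network; the kernel and bias values are arbitrary):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def conv2d(x, w, b):
    """One convolution layer with ReLU activation g, valid padding,
    stride 1, single channel (for illustration only)."""
    H, W = x.shape
    kh, kw = w.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * w) + b
    return relu(out)

# A 5x5 all-ones input with a 3x3 all-ones kernel: each window sums to 9.
feat = conv2d(np.ones((5, 5)), np.ones((3, 3)), b=-8.0)
```

In a real network, stacks of such layers produce the B local feature vectors v_1…v_B over picture regions.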
(7) Creating the ancient-poem text generation model
The N keywords K = {k_1, k_2, …, k_N} obtained from the keyword-extraction network are combined, via convolutional neural networks, with the visual feature vector set V = {v_1, v_2, …, v_B}, where v_j denotes the information of the j-th picture region contained in each visual feature vector. The poem is generated line by line: when generating the i-th line l_i, all previously generated lines l_{1:i−1} = {x_1, x_2, …, x_C} serve as model input, where l_{1:i−1} denotes the sequence obtained by concatenating lines 1 to i−1, x_j is the vector representation of the j-th word, and C is the length of the poem sequence. The first line of the poem is generated from the keyword set K and the visual feature vector set V; the remaining lines are generated by the same method.
The encoder of the poem-generation network is a bidirectional gated recurrent unit (Bi-GRU) that encodes the generated poem sequence l_{1:i−1} into a hidden vector H. The forward GRU encodes l_{1:i−1} in the forward direction to obtain the forward semantic hidden vectors h⃗_j, the backward GRU encodes l_{1:i−1} in reverse to obtain the backward semantic hidden vectors h⃖_j, and their concatenation serves as the encoding of the sequence l_{1:i−1}. The forward hidden vector h⃗_j, backward hidden vector h⃖_j and encoding vector h_j are determined by:

h⃗_j = GRU(x_j, h⃗_{j−1})

h⃖_j = GRU(x_j, h⃖_{j+1})

h_j = [h⃗_j ; h⃖_j]

where GRU(·) is the gated-recurrent-unit operation and [· ; ·] denotes the concatenation of the forward and backward semantic hidden vectors.
The decoder of the poem-generation network is a unidirectional gated recurrent unit. The next poem line l_i = {y_1, y_2, …, y_G} is obtained by decoding the visual feature information V and the preceding encoding information H: the decoder GRU cyclically updates an internal transition state s_t used to decode y_t; after each update of s_t, a Softmax function computes the probability distribution over words, and the highest-probability output is selected as y_t, i.e. the next word, so the next line is generated word by word.
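A toy NumPy sketch of the encoder's bidirectional GRU (Cho et al. formulation); the weights are random and the dimensions tiny, whereas the real network's parameters are learned:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, P):
    """One GRU cell update: update gate z, reset gate r, candidate h~."""
    z = sigmoid(P["Wz"] @ x + P["Uz"] @ h)
    r = sigmoid(P["Wr"] @ x + P["Ur"] @ h)
    h_tilde = np.tanh(P["Wh"] @ x + P["Uh"] @ (r * h))
    return (1 - z) * h + z * h_tilde

def bigru_encode(xs, P_fwd, P_bwd, hidden):
    """Run a forward and a backward GRU over the word vectors and
    concatenate their states: h_j = [h_fwd_j ; h_bwd_j]."""
    n = len(xs)
    h = np.zeros(hidden); fwd = []
    for x in xs:
        h = gru_step(x, h, P_fwd); fwd.append(h)
    h = np.zeros(hidden); bwd = [None] * n
    for j in range(n - 1, -1, -1):
        h = gru_step(xs[j], h, P_bwd); bwd[j] = h
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

rng = np.random.default_rng(0)
d, hidden = 4, 3
def params():
    # W* act on the input (hidden x d), U* on the state (hidden x hidden).
    return {k: rng.standard_normal((hidden, d if k[0] == "W" else hidden))
            for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}

H = bigru_encode([rng.standard_normal(d) for _ in range(5)], params(), params(), hidden)
```

Each of the 5 word positions receives a 6-dimensional encoding: the 3-dimensional forward state concatenated with the 3-dimensional backward state.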
(8) Judging the emotional tendency of the ancient poem
The generated poem lines are fed to the emotional-tendency judgment network. Emotion-word weights are assigned according to emotional strength by querying an emotion dictionary, and the weighted sum W that determines the poem's emotional tendency is computed as:

W = Σ_{i=1}^{N_p} w_{pi} − Σ_{j=1}^{N_n} w_{nj}

where N_p is the number of positive-emotion words, N_n the number of negative-emotion words, w_{pi} the weight of the i-th positive-emotion word, and w_{nj} the weight of the j-th negative-emotion word.

The weighted result is the judgment basis: W > 0 means the generated poem has a positive emotional tendency, W < 0 a negative emotional tendency, and W = 0 no emotional tendency.
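A minimal sketch of the weighted-sum judgment; the dictionary entries and weights are invented for illustration:

```python
# Hypothetical emotion dictionary: word -> (polarity, weight by strength).
EMOTION_DICT = {
    "joy": ("pos", 2.0), "bright": ("pos", 1.0),
    "sorrow": ("neg", 2.0), "cold": ("neg", 1.0),
}

def emotional_tendency(words):
    """W = sum of positive-word weights minus sum of negative-word weights;
    W > 0 -> positive, W < 0 -> negative, W == 0 -> no tendency."""
    w = 0.0
    for word in words:
        polarity, weight = EMOTION_DICT.get(word, (None, 0.0))
        if polarity == "pos":
            w += weight
        elif polarity == "neg":
            w -= weight
    if w > 0:
        return "positive"
    if w < 0:
        return "negative"
    return "none"
```

Words absent from the dictionary contribute nothing, so a poem with no emotion words is judged to have no emotional tendency.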
(9) Displaying the generated ancient poem
The generated ancient poem and its emotional-tendency judgment result are displayed to the user.
In step (2) of the present invention, the detail gray-level processing with the piecewise linear gray-level enhancement method is as follows: the user-input image f(x, y) has gray levels 0–32 and the enhanced image h(x, y) has gray levels 38–64; h(x, y) is determined by:

h(x, y) = (c/a) · f(x, y),  0 ≤ f(x, y) < a
h(x, y) = ((d − c)/(b − a)) · (f(x, y) − a) + c,  a ≤ f(x, y) < b
h(x, y) = ((M_h − d)/(M_f − b)) · (f(x, y) − b) + d,  b ≤ f(x, y) ≤ M_f

where x and y index the pixel along the length and width of the picture, the intervals [a, b] and [c, d] are corresponding gray-level intervals of the original and enhanced image respectively, and M_f, M_h are their maximum gray levels.
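A NumPy sketch of the three-segment piecewise linear transform; the interval endpoints a, b, c, d and the maxima are illustrative values, not from the patent:

```python
import numpy as np

def piecewise_linear_enhance(f, a, b, c, d, mf, mh):
    """Three-segment piecewise linear gray-level transform: stretch the
    interval [a, b] of the input onto [c, d] of the output, compressing
    the gray levels outside it."""
    f = f.astype(np.float64)
    return np.where(
        f < a, (c / a) * f,
        np.where(f < b,
                 (d - c) / (b - a) * (f - a) + c,
                 (mh - d) / (mf - b) * (f - b) + d))

img = np.array([[0, 8, 16], [24, 32, 32]])
enh = piecewise_linear_enhance(img, a=8, b=24, c=4, d=56, mf=32, mh=64)
```

The middle segment [8, 24] is stretched onto [4, 56] (slope 3.25 > 1, enhancing contrast), while the outer segments are mapped with gentler slopes, compressing their detail.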
In step (3), labeling the preprocessed pictures with the Pascal Visual Object Classes (VOC) method comprises: image collection, image preprocessing, image annotation, data-set segmentation, Faster R-CNN model training, and model export and storage. The loss function L({p_i}, {t_i}) of the Faster R-CNN model is determined from the classification loss L_cls(p_i, p_i*) and the regression loss L_reg(t_i, t_i*) as follows:

L({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*)

L_cls(p_i, p_i*) = −log[p_i* p_i + (1 − p_i*)(1 − p_i)]

L_reg(t_i, t_i*) = R(t_i − t_i*)

R = smooth_L1(x)

smooth_L1(x) = 0.5 x² if |x| < 1, |x| − 0.5 otherwise

where i is the index of an anchor box in the mini-batch of stochastic gradient descent; p_i is the predicted probability that anchor i is a target; p_i* is the ground-truth label, valued 0 or 1; t_i is the vector of 4 parameterized coordinates of the predicted bounding box; t_i* is the coordinate vector of the ground-truth box corresponding to a positive anchor; N_cls is the mini-batch size; N_reg is the number of anchor positions, here 2400; λ = 10; the learning rate of the model is 0.0025, the number of training epochs 12, the batch size of each round 2, and the backbone network is ResNet-50.
Compared with the prior art, the invention has the following advantages:
The method uses an image detection algorithm based on Faster R-CNN together with a self-built image data set of ancient-poetry imagery words, improving detection accuracy and detection speed during image-to-poem generation. The extracted keyword information and visual feature information of the image jointly drive poem generation, strengthening the thematic consistency between the generated poem and the image. The combination of an encoder Bi-GRU and a decoder GRU improves the cohesion and continuity between poem lines. Judging the emotional tendency of the poem enriches the functionality and readability of poem generation and raises its quality and appeal. The invention lets users experience the pleasure of turning an input picture into an ancient poem, popularizes ancient-poetry culture among the general public, attracts the public to inherit and spread poetry culture, and helps strengthen the public's cultural confidence while carrying forward the fine traditional culture of China.
Drawings
FIG. 1 is a flow chart of example 1 of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the drawings and examples, but the present invention is not limited to the embodiments described below.
Example 1
In fig. 1, the image ancient-poem generation method of this embodiment, based on the Faster R-CNN detection model, comprises the following steps:
(1) Collecting ancient-poetry imagery-word pictures
For each of 100 imagery words common in ancient poetry, 100 corresponding pictures are crawled from Internet image data, yielding 10,000 ancient-poetry imagery-word pictures.
(2) Preprocessing the ancient-poetry imagery-word pictures
The collected imagery-word pictures are resized to a uniform size, and detail gray-level processing is applied with a piecewise linear gray-level enhancement method to enhance image contrast and compress unneeded image details.
The detail gray-level processing with the piecewise linear gray-level enhancement method is as follows: the user-input image f(x, y) has gray levels 0–32 (16 levels in this embodiment, the exact number depending on the input image), and the enhanced image h(x, y) has gray levels 38–64 (50 levels in this embodiment, the exact number depending on the result of the detail gray-level processing).

The enhanced image h(x, y) is determined by:

h(x, y) = (c/a) · f(x, y),  0 ≤ f(x, y) < a
h(x, y) = ((d − c)/(b − a)) · (f(x, y) − a) + c,  a ≤ f(x, y) < b
h(x, y) = ((M_h − d)/(M_f − b)) · (f(x, y) − b) + d,  b ≤ f(x, y) ≤ M_f

where x and y index the pixel along the length and width of the picture, the intervals [a, b] and [c, d] are corresponding gray-level intervals of the original and enhanced image respectively, and M_f, M_h are their maximum gray levels.
(3) Constructing the ancient-poetry imagery-word image data set
The preprocessed pictures are annotated with the Pascal Visual Object Classes (VOC) method: the imagery-word labels contained in each picture are marked in turn, and an XML file is output for each picture. The 10,000 collected pictures and their 10,000 corresponding XML files are split into training and test sets at an 8:2 ratio, and a Faster R-CNN is trained to obtain the ancient-poetry imagery-word image data set.
Labeling the preprocessed pictures with the Pascal Visual Object Classes (VOC) method comprises: image collection, image preprocessing, image annotation, data-set segmentation, Faster R-CNN model training, and model export and storage. The loss function L({p_i}, {t_i}) of the Faster R-CNN model is determined from the classification loss L_cls(p_i, p_i*) and the regression loss L_reg(t_i, t_i*) as follows:

L({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*)

L_cls(p_i, p_i*) = −log[p_i* p_i + (1 − p_i*)(1 − p_i)]

L_reg(t_i, t_i*) = R(t_i − t_i*)

R = smooth_L1(x)

smooth_L1(x) = 0.5 x² if |x| < 1, |x| − 0.5 otherwise

where i is the index of an anchor box in the mini-batch of stochastic gradient descent; p_i is the predicted probability that anchor i is a target; p_i* is the ground-truth label, valued 0 or 1; t_i is the vector of 4 parameterized coordinates of the predicted bounding box; t_i* is the coordinate vector of the ground-truth box corresponding to a positive anchor; N_cls is the mini-batch size; N_reg is the number of anchor positions, here 2400; λ = 10; the learning rate of the model is 0.0025, the number of training epochs 12, the batch size of each round 2, and the backbone network is ResNet-50.
(4) Inputting user images
A single picture for which a poem is to be written is selected as the user input; there is no requirement on the picture's size.
(5) Extracting image keyword features
High-dimensional semantic features of the user's input picture are extracted with a convolutional neural network, the probability distribution over image labels is predicted with a Softmax function, and the predicted label distribution Π is determined by:

Π = Softmax(f(I))

π_j(I) = e^{f_j(I)} / Σ_n e^{f_n(I)}

where I denotes the input picture, f the convolutional-neural-network computation, j the j-th component of Π, π_j(I) the probability that picture I contains the j-th label, and f_n(I) the n-th label score of picture I after the network computation; j ranges from 0 to 9, and in this embodiment j = 5.

The loss function J of the keyword-extraction network is the cross-entropy

J = −(1/Ψ) Σ_{ψ=1}^{Ψ} log π_{y^{(ψ)}}(I^{(ψ)})

where Ψ is the number of samples and y^{(ψ)} the ground-truth label of sample ψ.

A probability threshold is set, and the labels whose probability exceeds the threshold are taken as the sample's labels, i.e. the keywords of the image. The keyword set K is represented as:

K = {k_1, k_2, ……, k_N}

where N is the number of extracted keywords, ranging from 0 to 9; in this embodiment N = 5.
(6) Extracting visual features of an image
A set V of visual feature vectors is extracted from the picture; each vector contains local visual encoding information of a different position of the picture, and the different vectors weight each word when the poem is generated.

The visual feature vectors of the user's input picture are obtained through convolutional-neural-network processing, where the convolution layers are determined by:

x^n = g(x^{n−1} ∗ w^n + b^n)

where n indexes the n-th convolution layer, x^n is its output, ∗ is the convolution operation, w^n and b^n are the layer's kernel and bias, and g is the ReLU activation function.

The visual feature vector set V is extracted as:

V = {v_1, v_2, …, v_B}
the method adopts the extracted keyword information and visual characteristic information of the image to determine the generation of the ancient poems, thereby enhancing the theme consistency of the generated ancient poems and the image information.
(7) Creating the ancient-poem text generation model
The N keywords K = {k_1, k_2, …, k_N} obtained from the keyword-extraction network are combined, via convolutional neural networks, with the visual feature vector set V = {v_1, v_2, …, v_B}, where v_j denotes the information of the j-th picture region contained in each visual feature vector. The poem is generated line by line: when generating the i-th line l_i, all previously generated lines l_{1:i−1} = {x_1, x_2, …, x_C} serve as model input, where l_{1:i−1} denotes the sequence obtained by concatenating lines 1 to i−1, x_j is the vector representation of the j-th word, and C is the length of the poem sequence. The first line of the poem is generated from the keyword set K and the visual feature vector set V; the remaining lines are generated by the same method.
The encoder of the poem-generation network is a bidirectional gated recurrent unit (Bi-GRU) that encodes the generated poem sequence l_{1:i−1} into a hidden vector H. The forward GRU encodes l_{1:i−1} in the forward direction to obtain the forward semantic hidden vectors h⃗_j, the backward GRU encodes l_{1:i−1} in reverse to obtain the backward semantic hidden vectors h⃖_j, and their concatenation serves as the encoding of the sequence l_{1:i−1}. The forward hidden vector h⃗_j, backward hidden vector h⃖_j and encoding vector h_j are determined by:

h⃗_j = GRU(x_j, h⃗_{j−1})

h⃖_j = GRU(x_j, h⃖_{j+1})

h_j = [h⃗_j ; h⃖_j]

where GRU(·) is the gated-recurrent-unit operation and [· ; ·] denotes the concatenation of the forward and backward semantic hidden vectors.
The decoder of the poem-generation network is a unidirectional gated recurrent unit. The next poem line l_i = {y_1, y_2, …, y_G} is obtained by decoding the visual feature information V and the preceding encoding information H: the decoder GRU cyclically updates an internal transition state s_t used to decode y_t; after each update of s_t, a Softmax function computes the probability distribution over words, and the highest-probability output is selected as y_t, i.e. the next word, so the next line is generated word by word.
The invention combines an encoder Bi-GRU with a decoder GRU, improving the cohesion and continuity between poem lines.
(8) Judging the emotional tendency of the ancient poem
The generated poem lines are fed to the emotional-tendency judgment network. Emotion-word weights are assigned according to emotional strength by querying an emotion dictionary, and the weighted sum W that determines the poem's emotional tendency is computed as:

W = Σ_{i=1}^{N_p} w_{pi} − Σ_{j=1}^{N_n} w_{nj}

where N_p is the number of positive-emotion words, N_n the number of negative-emotion words, w_{pi} the weight of the i-th positive-emotion word, and w_{nj} the weight of the j-th negative-emotion word.

The weighted result is the judgment basis: W > 0 means the generated poem has a positive emotional tendency, W < 0 a negative emotional tendency, and W = 0 no emotional tendency.
(9) Displaying the generated ancient poem
The generated ancient poem and its emotional-tendency judgment result are displayed to the user.
This completes the image ancient-poem generation method based on the Faster R-CNN detection model.
Example 2
The image ancient-poem generation method based on the Faster R-CNN detection model comprises the following steps:
(1) Collecting ancient-poetry imagery-word pictures
This procedure is the same as in example 1.
(2) Preprocessing the ancient-poetry imagery-word pictures
The collected imagery-word pictures are resized to a uniform size, and detail gray-level processing is applied with a piecewise linear gray-level enhancement method to enhance image contrast and compress unneeded image details.

The detail gray-level processing with the piecewise linear gray-level enhancement method is as follows: the user-input image f(x, y) has gray levels 0–32 (0 levels in this embodiment, the exact number depending on the input image), and the enhanced image h(x, y) has gray levels 38–64 (38 levels in this embodiment, the exact number depending on the result of the detail gray-level processing).

The specific method for determining the enhanced image h(x, y) is the same as in example 1.
(3) Constructing the ancient-poetry imagery-word image data set
This procedure is the same as in example 1.
(4) Inputting user images
This procedure is the same as in example 1.
(5) Extracting image keyword features
High-dimensional semantic features of the user's input picture are extracted with a convolutional neural network, the probability distribution over image labels is predicted with a Softmax function, and the predicted label distribution Π is determined by:

Π = Softmax(f(I))

π_j(I) = e^{f_j(I)} / Σ_n e^{f_n(I)}

where I denotes the input picture, f the convolutional-neural-network computation, j the j-th component of Π, π_j(I) the probability that picture I contains the j-th label, and f_n(I) the n-th label score of picture I after the network computation; j ranges from 0 to 9, and in this embodiment j = 0.

The loss function J of the keyword-extraction network is the cross-entropy

J = −(1/Ψ) Σ_{ψ=1}^{Ψ} log π_{y^{(ψ)}}(I^{(ψ)})

where Ψ is the number of samples and y^{(ψ)} the ground-truth label of sample ψ.

A probability threshold is set, and the labels whose probability exceeds the threshold are taken as the sample's labels, i.e. the keywords of the image. The keyword set K is represented as:

K = {k_1, k_2, ……, k_N}

where N is the number of extracted keywords, ranging from 0 to 9; in this embodiment N = 0.
The other steps were the same as in example 1.
The image ancient poem generation method based on the Faster R-CNN detection model is thus completed.
Example 3
The method for generating ancient poems from images based on the Faster R-CNN detection model comprises the following steps:
(1) Collecting ancient poetry imagery-word pictures
This procedure is the same as in example 1.
(2) Preprocessing the ancient poetry imagery-word pictures
The collected imagery-word pictures are unified in size, and detail gray-level processing is applied with a piecewise linear gray-level enhancement method to enhance the image contrast and compress unnecessary image details.
The detail gray-level processing of the picture with the piecewise linear gray-level enhancement method is as follows: the gray level of the user-input image f(x, y) lies in the range 0-32 levels and in this embodiment is 32 levels, the exact number of gray levels being determined by the input image; the gray level of the enhanced image h(x, y) lies in the range 38-64 levels and in this embodiment is 64 levels, the exact number of gray levels being determined by the result of the detail gray-level processing of the image.
The specific method for determining the enhanced image h (x, y) is the same as in example 1.
(3) Construction of the ancient poetry imagery-word image data set
This procedure is the same as in example 1.
(4) Inputting user images
This procedure is the same as in example 1.
(5) Extracting image keyword features
Extracting high-dimensional semantic features from the user's input picture with a convolutional neural network, predicting the probability distribution of the image labels with a Softmax function, and determining the predicted label distribution Π by the following formulas:
Π = Softmax(f(I))
π_j(I) = exp(f_j(I)) / Σ_n exp(f_n(I))
where I represents the input picture, f represents the convolutional neural network computation, j indexes the jth component of Π, π_j(I) is the probability that picture I contains the jth label, and f_n(I) is the nth label score of picture I after the convolutional neural network computation; j ranges from 0 to 9, and in this embodiment j is 9.
The loss function J of the keyword extraction network is the cross-entropy over the training samples:
J = -(1/Ψ) Σ_{i=1}^{Ψ} Σ_{j=0}^{9} y_j^{(i)} log π_j(I^{(i)})
where Ψ represents the number of samples and y_j^{(i)} is the ground-truth indicator of the jth label of the ith sample.
A probability threshold is set, and the labels whose probability exceeds the threshold are selected as the labels of the sample, namely the keywords of the image; the keyword set K is represented as:
K = {k_1, k_2, …, k_N}
wherein N represents the number of extracted keywords, the value range of N is 0-9, and the value of N in this embodiment is 9.
The other steps were the same as in example 1.
The image ancient poem generation method based on the Faster R-CNN detection model is thus completed.

Claims (3)

1. An image ancient poem generation method based on a Faster R-CNN (Faster Region-based Convolutional Neural Network) detection model, characterized by comprising the following steps:
(1) collecting ancient poetry imagery-word pictures
Based on 100 common imagery words of ancient poems, crawling 100 pictures for each imagery word from Internet image data with a crawler, obtaining 10000 ancient poetry imagery-word pictures;
(2) preprocessing the ancient poetry imagery-word pictures
Carrying out size unification on the collected imagery-word pictures, and carrying out detail gray-level processing on the pictures with a piecewise linear gray-level enhancement method to enhance the image contrast and compress unnecessary image details;
(3) constructing the ancient poetry imagery-word image data set
Annotating the preprocessed pictures with the Pascal Visual Object Classes (VOC) method, labeling in turn the imagery-word labels contained in each picture, outputting an Extensible Markup Language (XML) file for each picture, and splitting the data set of the 10000 collected pictures and their 10000 corresponding XML files at a ratio of 8:2; training a Faster R-CNN to obtain the ancient poetry imagery data set;
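As a minimal sketch of the 8:2 data-set split described in this step (the file names and the `split_dataset` helper are illustrative assumptions, not part of the patent):

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=42):
    """Shuffle and split (picture, annotation) pairs at a ratio of 8:2."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# 10000 collected pictures with their corresponding XML annotation files
pairs = [(f"img_{i:05d}.jpg", f"img_{i:05d}.xml") for i in range(10000)]
train, test = split_dataset(pairs)     # 8000 training / 2000 test samples
```

The Faster R-CNN training itself would then consume the training split; only the split logic is sketched here.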
(4) inputting the user image
Selecting as the user input a single picture for which a poem is to be generated; there is no restriction on the picture size;
(5) extracting image keyword features
Extracting high-dimensional semantic features from the user's input image with a convolutional neural network, predicting the probability distribution of the image labels with a Softmax function, and determining the predicted label distribution Π by the following formulas:
Π=Softmax(f(I))
π_j(I) = exp(f_j(I)) / Σ_n exp(f_n(I))
where I represents the input picture, f represents the convolutional neural network computation, j indexes the jth component of Π, π_j(I) is the probability that picture I contains the jth label, and f_n(I) is the nth label score of picture I after the convolutional neural network computation; j ranges from 0 to 9;
the loss function J of the keyword extraction network is the cross-entropy over the training samples:
J = -(1/Ψ) Σ_{i=1}^{Ψ} Σ_{j=0}^{9} y_j^{(i)} log π_j(I^{(i)})
where Ψ represents the number of samples and y_j^{(i)} is the ground-truth indicator of the jth label of the ith sample;
setting a probability threshold and selecting the labels whose probability exceeds the threshold as the labels of the sample, namely the keywords of the image, the keyword set K being represented as:
K = {k_1, k_2, …, k_N}
wherein N represents the number of the extracted keywords, and the value range of N is 0-9;
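A minimal numpy sketch of this keyword-extraction step, assuming illustrative imagery-word labels and raw network scores f_n(I) (the label names, scores, and threshold value are placeholders, not from the patent):

```python
import numpy as np

def predict_keywords(scores, labels, threshold=0.2):
    """Softmax over the 10 label scores f_n(I), then keep the labels whose
    probability pi_j(I) exceeds the threshold as the keyword set K."""
    e = np.exp(scores - np.max(scores))   # numerically stable softmax
    pi = e / e.sum()
    return [lab for lab, p in zip(labels, pi) if p > threshold], pi

labels = ["moon", "willow", "goose", "mountain", "river",
          "plum", "snow", "wind", "boat", "lantern"]
scores = np.array([4.0, 3.5, 1.0, 0.5, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1])
K, pi = predict_keywords(scores, labels)   # K holds the above-threshold labels
```

With these toy scores, only the two highest-scoring labels pass the threshold, giving a keyword set of size N = 2.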
(6) extracting visual features of the image
A set V of visual feature vectors is extracted from the picture, each vector containing local visual coding information of a different position of the picture; the different vectors represent the weight of each character when the ancient poem is generated;
obtaining the visual feature vectors of the user input picture through convolutional neural network processing, the convolutional layers being determined by the following formula:
x^n = g(x^{n-1} ⊗ w^n + b^n)
where n denotes the nth convolutional layer, x^n is the output of the nth convolutional layer, ⊗ is the convolution operation, w^n and b^n are the weights and bias of the nth layer, and g is the ReLU activation function;
extracting the visual feature vector set V according to the following formula:
V = {v_1, v_2, …, v_B}
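A toy numpy sketch of one such convolutional layer (single channel, "valid" padding; the kernel values are placeholder assumptions, not trained weights):

```python
import numpy as np

def relu(x):
    """Activation function g."""
    return np.maximum(x, 0.0)

def conv_layer(x, w, b):
    """One layer x_n = g(x_{n-1} * w_n + b_n): a 2-D 'valid' convolution
    (cross-correlation, as in most CNN libraries) followed by ReLU."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * w) + b
    return relu(out)

x = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 input "image"
w = np.array([[1.0, 0.0], [0.0, -1.0]])        # toy 2x2 kernel
feat = conv_layer(x, w, b=0.0)                 # 3x3 feature map
```

In the method above, the set V would be formed by slicing the final feature maps at different spatial positions; this sketch shows only the per-layer computation.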
(7) creating ancient poetry text generation model
The keyword extraction network yields the set of N keywords K = {k_1, k_2, …, k_N}, which is combined with the visual feature vector set V = {v_1, v_2, …, v_B} by a convolutional neural network, where v_j carries the information of the jth part of the picture contained in each visual feature vector; the ancient poem is generated sentence by sentence: when generating the ith line l_i, all previously generated lines l_{1:i-1} ∈ {x_1, x_2, …, x_C} serve as inputs to the model, where l_{1:i-1} denotes the sequence formed by concatenating lines 1 through i-1 of the poem, x_j is the vector representation of the jth word, and C is the length of the poem sequence; the first sentence of the poem is generated from the keyword set K and the visual feature vector set V, and the remaining sentences are generated in the same way;
the encoder of the ancient poetry generation network is a bidirectional gated recurrent unit (GRU), which encodes the generated poem sequence l_{1:i-1} into hidden vectors H; the forward GRU encodes the sequence l_{1:i-1} in the forward direction to obtain the forward semantic hidden vector →h_j, and the backward GRU encodes l_{1:i-1} in the reverse direction to obtain the reverse semantic hidden vector ←h_j; the concatenation [→h_j; ←h_j] serves as the semantic vector of the sequence l_{1:i-1}; the forward semantic hidden vector →h_j, the reverse semantic hidden vector ←h_j, and the encoding vector h_j are determined by the following formulas:
→h_j = GRU(x_j, →h_{j-1})
←h_j = GRU(x_j, ←h_{j+1})
h_j = [→h_j; ←h_j]
where GRU() is the gated recurrent unit operation and [→h_j; ←h_j] denotes the concatenation of the forward semantic hidden vector →h_j and the reverse semantic hidden vector ←h_j;
the decoder of the ancient poetry generation network is a unidirectional gated recurrent unit, which decodes the visual feature information V and the preceding encoded information H into the next poem line l_i ∈ {y_1, y_2, …, y_G}; the decoder's gated recurrent unit cyclically updates an internal transition state s_t used for decoding y_t; after each update of s_t, a Softmax function computes the probability distribution over words for y_t, the word with the highest probability is selected as the output y_t, and the next line of the poem is generated word by word;
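A minimal numpy sketch of the bidirectional GRU encoding h_j = [→h_j; ←h_j] (the random weights are untrained placeholders, biases are omitted, and the two directions share weights for brevity; a real implementation would use a deep-learning framework):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Wr, Wh):
    """One GRU step on the concatenation [h, x]."""
    hx = np.concatenate([h, x])
    z = sigmoid(Wz @ hx)                              # update gate
    r = sigmoid(Wr @ hx)                              # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h, x]))  # candidate state
    return (1 - z) * h + z * h_tilde

def bigru_encode(xs, d, rng):
    """Encode word vectors into h_j = [forward_j ; backward_j]."""
    k = d + xs[0].shape[0]
    Wz, Wr, Wh = (rng.standard_normal((d, k)) * 0.1 for _ in range(3))
    fwd, h = [], np.zeros(d)
    for x in xs:                       # forward pass over the sequence
        h = gru_step(x, h, Wz, Wr, Wh)
        fwd.append(h)
    bwd, h = [], np.zeros(d)
    for x in reversed(xs):             # backward pass over the sequence
        h = gru_step(x, h, Wz, Wr, Wh)
        bwd.append(h)
    bwd.reverse()
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

rng = np.random.default_rng(0)
seq = [rng.standard_normal(8) for _ in range(5)]   # 5 toy word vectors
H = bigru_encode(seq, d=16, rng=rng)               # 5 vectors of size 32
```

The decoder would then run a unidirectional GRU over H and the visual features, taking the argmax of a Softmax at each step; only the encoder half is sketched here.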
(8) judging emotional tendency of ancient poems
The generated poem lines are output to an emotional tendency judgment network: emotion-word weights are assigned according to emotional strength by looking up an emotion dictionary, and the emotional tendency of the poem is judged by the weighted sum W determined by the following formula:
W = Σ_{i=1}^{N_p} w_{pi} − Σ_{j=1}^{N_n} w_{nj}
where N_p is the number of positive-emotion words, N_n is the number of negative-emotion words, w_{pi} is the weight of the ith positive-emotion word, and w_{nj} is the weight of the jth negative-emotion word;
the weighted result W is the judgment basis: W > 0 means the generated poem has a positive emotional tendency, W < 0 means a negative emotional tendency, and W = 0 means no emotional tendency;
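The weighted-sum judgment can be sketched as follows (the toy emotion dictionary and its weights are illustrative assumptions; positive words carry positive weights and negative words carry negative weights):

```python
def sentiment_tendency(words, lexicon):
    """Sum the emotion-dictionary weights of the words in a poem line;
    the sign of the total gives the emotional tendency."""
    score = sum(lexicon.get(w, 0.0) for w in words)  # unknown words weigh 0
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Toy emotion dictionary: weight magnitude reflects emotional strength.
lexicon = {"bright": 1.0, "joy": 2.0, "sorrow": -2.0, "cold": -0.5}
label = sentiment_tendency(["bright", "moon", "joy"], lexicon)
```

Here the positive weights (1.0 + 2.0) outweigh the absent negative ones, so the line is judged positive.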
(9) displaying the generated ancient poem
Displaying to the user the generated ancient poem and the judgment result of its emotional tendency.
2. The method for generating image ancient poetry based on the Faster R-CNN detection model as claimed in claim 1, wherein in step (2) the detail gray-level processing of the picture with the piecewise linear gray-level enhancement method is as follows: the gray level of the user-input image f(x, y) is in the range 0-32 levels, the gray level of the enhanced image h(x, y) is in the range 38-64 levels, and the enhanced image h(x, y) is determined by the following formula:
h(x, y) = (c/a)·f(x, y), for 0 ≤ f(x, y) < a
h(x, y) = ((d − c)/(b − a))·(f(x, y) − a) + c, for a ≤ f(x, y) ≤ b
h(x, y) = ((M_h − d)/(M_f − b))·(f(x, y) − b) + d, for b < f(x, y) ≤ M_f
where x represents the pixel coordinate along the picture length, y represents the pixel coordinate along the picture width, the intervals [a, b] and [c, d] are corresponding gray-level intervals of the original image and the enhanced image respectively, and M_f and M_h are the maximum gray levels of the original and enhanced images.
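A numpy sketch of the three-segment transform, assuming maximum gray levels of 32 (input) and 64 (output) and illustrative breakpoints a, b, c, d (these numeric choices are assumptions, not taken from the patent):

```python
import numpy as np

def piecewise_linear_enhance(f, a, b, c, d, f_max=32.0, h_max=64.0):
    """Map the gray interval [a, b] of the input onto [c, d] of the output,
    stretching contrast inside [a, b] and compressing it outside."""
    f = np.asarray(f, dtype=float)
    low = (c / a) * f                                   # [0, a) -> [0, c)
    mid = (d - c) / (b - a) * (f - a) + c               # [a, b] -> [c, d]
    high = (h_max - d) / (f_max - b) * (f - b) + d      # (b, f_max] -> (d, h_max]
    return np.where(f < a, low, np.where(f <= b, mid, high))

img = np.array([[4.0, 10.0], [20.0, 30.0]])             # toy gray levels
out = piecewise_linear_enhance(img, a=8, b=24, c=38, d=60)
```

With these breakpoints the middle segment has slope (60−38)/(24−8) = 1.375 > 1, so details inside [8, 24] are stretched while the outer segments are compressed.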
3. The method for generating image ancient poetry based on the Faster R-CNN detection model as claimed in claim 1, wherein in step (3) annotating the preprocessed pictures with the Pascal Visual Object Classes method comprises: image collection, image preprocessing, image annotation, data set segmentation, Faster R-CNN model training, and model export and storage; the loss function L({p_i}, {t_i}) of the Faster R-CNN model is determined from the classification loss function L_cls(p_i, p_i*) and the regression loss function L_reg(t_i, t_i*) by the following formulas:
L({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ·(1/N_reg) Σ_i p_i*·L_reg(t_i, t_i*)
L_cls(p_i, p_i*) = −log[p_i·p_i* + (1 − p_i)(1 − p_i*)]
L_reg(t_i, t_i*) = R(t_i − t_i*)
R = smooth_L1(x)
smooth_L1(x) = 0.5x² for |x| < 1, and |x| − 0.5 otherwise
where i is the index of an anchor box in the mini-batch descent; p_i is the predicted probability that anchor box i is a target; p_i* is the ground-truth box label, taking the value 0 or 1; t_i is the 4-dimensional parameterized coordinate vector of the predicted bounding box; t_i* is the coordinate vector of the ground-truth box corresponding to a positive anchor box; N_cls is the mini-batch normalization term and N_reg is the number of anchor box positions, here 2400; smooth_L1(x) is the smoothing function; λ is 10; the learning rate of the model is 0.0025, the number of iteration rounds is 12, the batch size of each training round is 2, and the backbone network is ResNet-50.
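A numpy sketch of the smooth-L1 regression term and the combined loss, with two toy anchors (all numeric inputs are illustrative assumptions):

```python
import numpy as np

def smooth_l1(x):
    """R = smooth_L1: 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < 1, 0.5 * x ** 2, np.abs(x) - 0.5)

def rpn_loss(p, p_star, t, t_star, n_cls, n_reg, lam=10.0):
    """Combined classification (cross-entropy) and regression loss,
    normalized by N_cls and N_reg; regression counts only positive anchors."""
    p, p_star = np.asarray(p, float), np.asarray(p_star, float)
    l_cls = -(p_star * np.log(p) + (1 - p_star) * np.log(1 - p)).sum()
    diff = np.asarray(t, float) - np.asarray(t_star, float)
    l_reg = (p_star[:, None] * smooth_l1(diff)).sum()
    return l_cls / n_cls + lam * l_reg / n_reg

# One positive anchor (p* = 1) with a small box offset, one negative anchor.
loss = rpn_loss(p=[0.9, 0.2], p_star=[1, 0],
                t=[[0.1, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
                t_star=[[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
                n_cls=2, n_reg=2)
```

The quadratic region of smooth_L1 keeps gradients small near zero offset, while the linear region limits the influence of outlier boxes.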
CN202210273907.XA 2022-03-19 2022-03-19 Image ancient poem generation method based on Faster R-convolutional neural network detection model Withdrawn CN114662456A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210273907.XA CN114662456A (en) 2022-03-19 2022-03-19 Image ancient poem generation method based on Faster R-convolutional neural network detection model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210273907.XA CN114662456A (en) 2022-03-19 2022-03-19 Image ancient poem generation method based on Faster R-convolutional neural network detection model

Publications (1)

Publication Number Publication Date
CN114662456A true CN114662456A (en) 2022-06-24

Family

ID=82031805

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210273907.XA Withdrawn CN114662456A (en) 2022-03-19 2022-03-19 Image ancient poem generation method based on Faster R-convolutional neural network detection model

Country Status (1)

Country Link
CN (1) CN114662456A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115062179A (en) * 2022-07-06 2022-09-16 吴致远 Image-oriented end-to-end Chinese ancient poetry recommendation method based on deep learning
CN115080786A (en) * 2022-08-22 2022-09-20 科大讯飞股份有限公司 Picture poetry-based method, device and equipment and storage medium


Similar Documents

Publication Publication Date Title
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
CN113254599A (en) Multi-label microblog text classification method based on semi-supervised learning
CN111985239A (en) Entity identification method and device, electronic equipment and storage medium
CN114662456A (en) Image ancient poem generation method based on Faster R-convolutional neural network detection model
CN113505200B (en) Sentence-level Chinese event detection method combined with document key information
CN112183058B (en) Poetry generation method and device based on BERT sentence vector input
CN111858878B (en) Method, system and storage medium for automatically extracting answer from natural language text
CN112861524A (en) Deep learning-based multilevel Chinese fine-grained emotion analysis method
CN112349294B (en) Voice processing method and device, computer readable medium and electronic equipment
CN112561718A (en) Case microblog evaluation object emotion tendency analysis method based on BilSTM weight sharing
CN113657115A (en) Multi-modal Mongolian emotion analysis method based on ironic recognition and fine-grained feature fusion
CN111145914B (en) Method and device for determining text entity of lung cancer clinical disease seed bank
CN113051887A (en) Method, system and device for extracting announcement information elements
CN111311364B (en) Commodity recommendation method and system based on multi-mode commodity comment analysis
CN113032541A (en) Answer extraction method based on bert and fusion sentence cluster retrieval
CN113094502A (en) Multi-granularity takeaway user comment sentiment analysis method
CN115408488A (en) Segmentation method and system for novel scene text
CN114298055B (en) Retrieval method and device based on multilevel semantic matching, computer equipment and storage medium
CN112329449B (en) Emotion analysis method based on emotion dictionary and Transformer
CN116644759B (en) Method and system for extracting aspect category and semantic polarity in sentence
CN116522165B (en) Public opinion text matching system and method based on twin structure
CN116757195B (en) Implicit emotion recognition method based on prompt learning
CN112651225A (en) Multi-item selection machine reading understanding method based on multi-stage maximum attention
CN115759102A (en) Chinese poetry wine culture named entity recognition method
CN116258147A (en) Multimode comment emotion analysis method and system based on heterogram convolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220624