CN112434145A - Picture-viewing poetry method based on image recognition and natural language processing - Google Patents

Info

Publication number
CN112434145A
CN112434145A CN202011333715.0A CN202011333715A CN112434145A CN 112434145 A CN112434145 A CN 112434145A CN 202011333715 A CN202011333715 A CN 202011333715A CN 112434145 A CN112434145 A CN 112434145A
Authority
CN
China
Prior art keywords
poetry
image recognition
picture
model
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011333715.0A
Other languages
Chinese (zh)
Inventor
李雪威
解向川
雷松源
陈志超
童跃凡
任艺丹
徐天一
赵满坤
高洁
刘志强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202011333715.0A
Publication of CN112434145A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to a picture-viewing poetry method based on image recognition and natural language processing, comprising the following steps: S1, collecting and processing an image data set; S2, establishing an image recognition model for extracting keywords related to the image; S3, testing the image recognition effect; S4, collecting and processing a poetry data set; S5, establishing a keyword and poem matching model; and S6, establishing a poem-writing model. The design is scientific and reasonable: images are processed by a VGG16 neural network to obtain keywords, the best-matching poem is computed with a bag-of-words model, TF-IDF and cosine similarity, and new poems are composed by an LSTM neural network. After the picture input by the user is processed, the output obtained by segmenting the data set and optimizing the keywords is linguistically meaningful to a certain degree and gives a good result.

Description

Picture-viewing poetry method based on image recognition and natural language processing
Technical Field
The invention belongs to the fields of natural language processing and image recognition, and particularly relates to a picture-viewing poetry method based on image recognition and natural language processing.
Background
Image recognition is a computer vision technique in which a computer processes an unknown image and recognizes relevant information in it. In general, image recognition can be reduced to four steps: image acquisition, image preprocessing, feature extraction and image identification. Feature extraction converts the information contained in an image into feature vectors that are convenient for a computer to process under a specific recognition task. It typically uses a convolutional neural network, a deep feedforward neural network built on convolution operations, whose artificial neurons respond to the surrounding units within a limited receptive field and which performs very well on large-scale image processing. After feature extraction, the information in the image has been converted into a series of feature vectors, and image identification is the process of recognizing these feature vectors.
TF-IDF is a weighting technique commonly used in information retrieval and text mining to evaluate how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases with the frequency with which it appears across the corpus.
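As an illustration of the weighting just described, the following is a minimal sketch in Python. It assumes the common logarithmic form of the inverse document frequency, which the text does not specify, and it uses a hypothetical toy corpus of already-segmented English words in place of real poems.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {word: weight} dict per tokenized document.

    tf  = occurrences of the word in the document / document length
    idf = log(number of documents / documents containing the word)
    (the logarithmic idf is an assumption, not taken from the patent)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each word occurs.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights

# Hypothetical, already-segmented toy corpus.
corpus = [
    ["moon", "frost", "moon"],
    ["chrysanthemum", "frost"],
    ["plum", "snow"],
]
weights = tf_idf(corpus)
# "moon" is frequent in the first document and rare in the corpus,
# so it outweighs "frost", which also appears in the second document.
print(weights[0]["moon"] > weights[0]["frost"])
```

With this unsmoothed form, a word that occurs in every document receives a weight of zero.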
A Recurrent Neural Network (RNN) is a type of neural network that takes sequence data as input, recurses along the direction in which the sequence evolves, and connects all of its nodes in a chain. Long short-term memory (LSTM) is a recurrent neural network that passes on information useful for subsequent calculations by forgetting part of the information in the cell state and remembering new information; useless information is discarded, and a hidden state is output at each time step.
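The forgetting and remembering described above can be sketched as a single LSTM time step. The code below is a minimal illustration using the standard LSTM gate equations with one hidden unit and plain Python floats; the parameter layout (`params`) and function names are hypothetical and are not taken from the patent.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(weights, vector):
    return sum(w * v for w, v in zip(weights, vector))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step for a single hidden unit.

    x: list of input features.
    h_prev, c_prev: previous hidden state and cell state (floats).
    params: dict mapping gate name -> (input_weights, recurrent_weight, bias).
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return dot(w_x, x) + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    g = math.tanh(pre("candidate"))   # candidate new information
    o = sigmoid(pre("output"))        # how much of the cell state to expose

    c = f * c_prev + i * g            # forget old information, remember new
    h = o * math.tanh(c)              # hidden state output at this time step
    return h, c

# With all-zero parameters every gate is 0.5 and the candidate is 0,
# so the cell state is simply halved.
zero = {name: ([0.0, 0.0], 0.0, 0.0)
        for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step([1.0, 2.0], h_prev=0.0, c_prev=1.0, params=zero)
print(h, c)
```

A real poetry model would use vector-valued states and learned weights; the scalar version only shows how the gates combine.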
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a picture-viewing poetry method based on image recognition and natural language processing, in which images are processed by a VGG16 neural network to obtain keywords, the best-matching poem is computed with a bag-of-words model, TF-IDF and cosine similarity, and new poems are composed by an LSTM neural network. After the picture input by the user is processed, the output obtained by segmenting the data set and optimizing the keywords is linguistically meaningful to a certain degree and gives a good result.
The technical problem to be solved by the invention is realized by the following technical scheme:
A picture-viewing poetry method based on image recognition and natural language processing, characterized in that the method comprises the following steps:
S1, collecting and processing an image data set;
S2, establishing an image recognition model for extracting keywords related to the image;
S3, testing the image recognition effect;
S4, collecting and processing a poetry data set;
S5, establishing a keyword and poetry matching model for obtaining poetry matched with the image;
S6, establishing a poetry-writing model for creating poetry related to the keywords.
Moreover, the specific steps of collecting and processing the image data set in step S1 are as follows:
a. collecting pictures related to ancient poems with a web crawler;
b. manually screening the pictures to ensure their quality;
c. converting the original picture data set into files in TFRecord format for subsequent use.
Moreover, the specific steps of step S2 are: an image recognition model is built from the VGG16 model, a 16-layer CNN constructed by repeatedly stacking small 3 × 3 convolution kernels and 2 × 2 max-pooling layers.
Moreover, in step S3 the model is tested with the test-set pictures, and the recognition effect is judged from the picture classification results and their probabilities.
Moreover, in step S4 a large number of ancient poetry data sets are collected and preprocessed, for example by unifying the format and removing abnormal data, and the processed data is used as the corpus.
Moreover, matching the keywords with the poems in step S5 specifically comprises the steps of:
a. segmenting the poems into words, and calling a TF-IDF model to compute the TF-IDF weight of each word in the word-frequency matrix;
b. calculating the similarity between the keywords and each poem;
c. weighting the cosine similarity and the Jaccard value at one half each, sorting the poems by the resulting similarity, and selecting a poem with a higher matching degree as the result.
The invention has the advantages and beneficial effects that:
1. The invention processes an image through a VGG16 neural network to obtain keywords, computes the best-matching poem with a bag-of-words model, TF-IDF and cosine similarity, and composes new poems with an LSTM neural network. After the picture input by the user is processed, the output obtained by segmenting the data set and optimizing the keywords is linguistically meaningful to a certain degree and gives a good result.
2. The invention can extract keywords for the objects present in an image and, on the basis of those keywords, either match an existing ancient poem or create a new poem, as the user requires. The invention aims to explore, through the fusion and collision of technologies from different fields, a way of endowing computers with higher-level intelligence. In actual tests the results met expectations to a certain extent, and the method accomplishes the goal of composing poetry from a picture.
Drawings
FIG. 1 is a schematic diagram of a VGG16 structure according to the present invention;
FIG. 2 is a comparison graph of the differences between LSTM and RNN of the present invention;
FIG. 3 is a schematic diagram illustrating the effect of the present invention;
FIG. 4 is a schematic diagram illustrating another effect of the present invention.
Detailed Description
The present invention is further illustrated by the following specific examples, which are illustrative only and do not limit the scope of the invention.
A picture-viewing poetry method based on image recognition and natural language processing, characterized in that the method comprises the following specific steps:
Step 1: collecting and preprocessing the image data set
To build a reliable image recognition model that meets the requirements, the training images must satisfy high standards: the model has to be accurate and sufficiently sensitive to the imagery that occurs frequently in poetry. A sufficient number of high-quality images of the imagery commonly found in ancient poetry therefore needs to be collected.
Step 2: establishing an image recognition model
The quality of the image recognition model directly determines the final result. The invention uses the VGG16 model proposed by the Visual Geometry Group of the University of Oxford. The required image recognition model is obtained by training this model on 70% of the collected image data set, and the remaining 30% is used to test the model and ensure its quality.
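The 70%/30% division mentioned above can be sketched as follows. This is only an illustration: the patent does not say how the split is made, so the random shuffle, the fixed seed and the file names are assumptions.

```python
import random

def split_dataset(items, train_percent=70, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point surprises at the cut point.
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names standing in for the collected pictures.
images = [f"img_{k:03d}.jpg" for k in range(10)]
train, test = split_dataset(images)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, so the same test pictures are held out every time the model is retrained.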
Step 3: collecting and preprocessing poetry data
To train the poetry-writing model and to provide the poetry-matching function, poetry data also needs to be collected. The invention uses the existing poetry data set "chinese-poetry", described as the most complete database of classical Chinese poetry. It contains about 55,000 Tang poems, 260,000 Song poems, 21,000 Song ci and other classical texts. Following the classification of the image data set, the poetry data set is likewise classified by the imagery the poems contain and stored in files in CSV format.
Step 4: establishing a matching model of keywords and poems
The principle of the matching model is as follows: first, the TF-IDF weights of each poem are calculated; then the similarity between the keywords obtained from the image recognition model and each poem is calculated. The total similarity is obtained from the cosine similarity and the Jaccard similarity, each weighted at 50%. The poems are sorted by their total similarity to the keywords, and the poem with the highest similarity is selected as the final result.
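Written out, the equal weighting described above can be expressed as follows. The symbols are introduced here only for illustration and do not appear in the patent: v_q and v_p are the TF-IDF vectors of the keywords and of a poem, and W_q and W_p are the corresponding sets of words.

```latex
S(q, p) = 0.5 \cdot \cos(\mathbf{v}_q, \mathbf{v}_p) + 0.5 \cdot J(W_q, W_p),
\qquad
\cos(\mathbf{v}_q, \mathbf{v}_p) = \frac{\mathbf{v}_q \cdot \mathbf{v}_p}{\lVert \mathbf{v}_q \rVert \, \lVert \mathbf{v}_p \rVert},
\qquad
J(W_q, W_p) = \frac{\lvert W_q \cap W_p \rvert}{\lvert W_q \cup W_p \rvert}
```

Both terms lie between 0 and 1 for non-negative TF-IDF weights, so the total similarity S also lies between 0 and 1.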
Step 5: establishing the poetry-writing model
The invention uses the collected poetry data set to train a poetry-writing model based on an LSTM network. The trained model generates a poem from the input keywords, but the format and punctuation of the generated result can be faulty, so the output is cut to remove any content after the last period.
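The trimming step mentioned above can be sketched as a small helper. It assumes that the period in question is the Chinese full stop "。", which the text does not state explicitly; the function name and the sample string are hypothetical.

```python
def trim_after_last_period(text, period="。"):
    """Drop anything after the final full stop; return text unchanged if there is none."""
    last = text.rfind(period)
    return text if last == -1 else text[: last + 1]

# A toy string standing in for raw model output with a dangling fragment.
raw = "床前明月光，疑是地上霜。举头望明月，低头思故乡。举头"
print(trim_after_last_period(raw))
```

Returning the text unchanged when no period is found is a design choice made here; a caller could equally treat that case as a failed generation.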
The VGG16 model is a convolutional neural network proposed by Simonyan and Zisserman of the Visual Geometry Group at the University of Oxford; it comprises 13 convolutional layers and 3 fully-connected layers. The VGG16 model performs well in image recognition, which is why it was chosen. The network structure is shown in FIG. 1.
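The layer count quoted above can be checked against the standard VGG16 layout. The sketch below lists that layout as a plain configuration and counts its layers; the 224-pixel input and the 1000-class output are the values from the original ImageNet setting and are assumptions here, since the patent's own class count is not given.

```python
# Standard VGG16 layout: integers are 3x3 convolution output channels,
# "M" marks a 2x2 max-pooling layer with stride 2.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
# Fully-connected layer sizes; the last one is the class count
# (1000 for ImageNet, an assumption for this illustration).
VGG16_FC = [4096, 4096, 1000]

def summarize(conv_cfg, fc_cfg, input_size=224):
    """Count weight layers and track the feature-map size through the pooling layers."""
    n_conv = sum(1 for layer in conv_cfg if layer != "M")
    n_pool = conv_cfg.count("M")
    size = input_size
    for layer in conv_cfg:
        if layer == "M":
            size //= 2  # padded 3x3 convolutions keep the size; pooling halves it
    return {"conv": n_conv, "pool": n_pool, "fc": len(fc_cfg),
            "weight_layers": n_conv + len(fc_cfg), "final_map": size}

print(summarize(VGG16_CONV, VGG16_FC))
```

The "16" in the name counts only layers with weights, that is, the 13 convolutional and 3 fully-connected layers; the pooling layers are not included.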
When training the VGG16 network, the crawled data set is divided into two parts, each of which is converted into a TFRecord file and used as input. The labels are keywords that are common in poetry, such as chrysanthemum, plum blossom and moon. Training is iterative, with a learning rate of 0.0001 and a batch size of 64, and runs for 5000 steps in total, yielding two trained network models.
A test set is constructed in the same way and used to evaluate the VGG16 image recognition models. The main purpose of computing accuracy on the test set is to evaluate how the models perform on data outside the training set and whether overfitting occurs. Both recognition models reach an accuracy above 90%, which shows that they also perform well on the test set.
In the module that matches poems to keywords, the poems are segmented into words using natural language processing techniques, and a bag-of-words model is called to convert the words in the text into a word-frequency matrix. A TF-IDF model is then called to compute the TF-IDF weight of each word in the word-frequency matrix, and the cosine similarity and the Jaccard value between the keywords and the words of each poem are obtained. In this way, the poem that best matches the keywords recognized from the image can be found with relatively high accuracy.
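The matching module described above can be sketched end to end in a few lines. The code assumes that the poems are already segmented into words, uses a smoothed logarithmic idf (an assumption, since the patent does not give its exact formula), and works on hypothetical English toy data; all function names are illustrative.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build bag-of-words counts per document and weight them by a smoothed idf."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))
    # Smoothed idf (assumption) so that no weight collapses to zero in a tiny corpus.
    idf = {w: math.log((1 + n) / (1 + d)) + 1 for w, d in df.items()}
    vectors = [{w: c * idf[w] for w, c in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Jaccard value between two word collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_poems(keywords, poems):
    """Return (poem indices, best first; scores) using 0.5 * cosine + 0.5 * Jaccard."""
    vectors, idf = tfidf_vectors(poems)
    query = {w: c * idf.get(w, 0.0) for w, c in Counter(keywords).items()}
    scores = [0.5 * cosine(query, vec) + 0.5 * jaccard(keywords, poem)
              for vec, poem in zip(vectors, poems)]
    order = sorted(range(len(poems)), key=lambda k: scores[k], reverse=True)
    return order, scores

# Hypothetical segmented poems and a keyword from the image recognition model.
poems = [["moon", "frost", "home"],
         ["chrysanthemum", "fence", "mountain"],
         ["plum", "snow", "fragrance"]]
order, scores = rank_poems(["moon"], poems)
print(order[0])
```

For real Chinese poems, a word segmenter would be needed before this step; which tool the inventors used is not stated.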
The poetry-writing model adopts an LSTM network, a special kind of RNN. Compared with a plain RNN, the LSTM alleviates the vanishing-gradient and exploding-gradient problems that arise when training on long sequences, and it performs better on longer sequences. In practice, the LSTM network writes poems that meet the requirements according to the keywords. The main input-output differences between the LSTM structure and an ordinary RNN are shown in FIG. 2.
Combining the above, the invention can either match an existing ancient poem or create a new poem from a picture, as the user requires. The practical results are shown in FIGS. 3 and 4, which show that the task of composing poetry from a picture is accomplished well.
In summary, the method extracts keywords for the objects present in an image and, on the basis of those keywords, either matches an existing ancient poem or creates a new poem, as the user requires. The invention aims to explore, through the fusion and collision of technologies from different fields, a way of endowing computers with higher-level intelligence. In actual tests the results met expectations to a certain extent, and the method accomplishes the goal of composing poetry from a picture.
Although the embodiments of the present invention and the accompanying drawings are disclosed for illustrative purposes, those skilled in the art will appreciate that: various substitutions, changes and modifications are possible without departing from the spirit and scope of the invention and the appended claims, and therefore the scope of the invention is not limited to the disclosure of the embodiments and the accompanying drawings.

Claims (6)

1. A picture-viewing poetry method based on image recognition and natural language processing, characterized in that the method comprises the following steps:
S1, collecting and processing an image data set;
S2, establishing an image recognition model for extracting keywords related to the image;
S3, testing the image recognition effect;
S4, collecting and processing a poetry data set;
S5, establishing a keyword and poetry matching model for obtaining poetry matched with the image;
S6, establishing a poetry-writing model for creating poetry related to the keywords.
2. The picture-viewing poetry method based on image recognition and natural language processing as claimed in claim 1, wherein the specific steps of collecting and processing the image data set in step S1 are as follows:
a. collecting pictures related to ancient poems with a web crawler;
b. manually screening the pictures to ensure their quality;
c. converting the original picture data set into files in TFRecord format for subsequent use.
3. The picture-viewing poetry method based on image recognition and natural language processing as claimed in claim 1, wherein the specific steps of step S2 are: an image recognition model is built from the VGG16 model, a 16-layer CNN constructed by repeatedly stacking small 3 × 3 convolution kernels and 2 × 2 max-pooling layers.
4. The picture-viewing poetry method based on image recognition and natural language processing as claimed in claim 1, wherein in step S3 the model is tested with the test-set pictures, and the recognition effect is judged from the picture classification results and their probabilities.
5. The picture-viewing poetry method based on image recognition and natural language processing as claimed in claim 1, wherein in step S4 a large number of ancient poetry data sets are collected and preprocessed, for example by unifying the format and removing abnormal data, and the processed data is used as the corpus.
6. The picture-viewing poetry method based on image recognition and natural language processing as claimed in claim 1, wherein matching the keywords with the poems in step S5 specifically comprises the steps of:
a. segmenting the poems into words, and calling a TF-IDF model to compute the TF-IDF weight of each word in the word-frequency matrix;
b. calculating the similarity between the keywords and each poem;
c. weighting the cosine similarity and the Jaccard value at one half each, sorting the poems by the resulting similarity, and selecting a poem with a higher matching degree as the result.
CN202011333715.0A 2020-11-25 2020-11-25 Picture-viewing poetry method based on image recognition and natural language processing Pending CN112434145A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011333715.0A CN112434145A (en) 2020-11-25 2020-11-25 Picture-viewing poetry method based on image recognition and natural language processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011333715.0A CN112434145A (en) 2020-11-25 2020-11-25 Picture-viewing poetry method based on image recognition and natural language processing

Publications (1)

Publication Number Publication Date
CN112434145A true CN112434145A (en) 2021-03-02

Family

ID=74697395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011333715.0A Pending CN112434145A (en) 2020-11-25 2020-11-25 Picture-viewing poetry method based on image recognition and natural language processing

Country Status (1)

Country Link
CN (1) CN112434145A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113794915A (en) * 2021-09-13 2021-12-14 海信电子科技(武汉)有限公司 Server, display equipment, poetry and song endowing generation method and media asset playing method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107480132A (en) * 2017-07-25 2017-12-15 浙江工业大学 A kind of classic poetry generation method of image content-based
CN109086270A (en) * 2018-07-24 2018-12-25 重庆大学 System and method of composing poem automatically based on classic poetry corpus vectorization
CN111695349A (en) * 2019-02-28 2020-09-22 北京京东尚科信息技术有限公司 Text matching method and text matching system
CN111797262A (en) * 2020-06-24 2020-10-20 北京小米松果电子有限公司 Poetry generation method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
CN111126386B (en) Sequence domain adaptation method based on countermeasure learning in scene text recognition
CN109993100B (en) Method for realizing facial expression recognition based on deep feature clustering
CN111552803B (en) Text classification method based on graph wavelet network model
CN111008224B (en) Time sequence classification and retrieval method based on deep multitasking representation learning
CN111581368A (en) Intelligent expert recommendation-oriented user image drawing method based on convolutional neural network
CN113806494A (en) Named entity recognition method based on pre-training language model
CN111582506A (en) Multi-label learning method based on global and local label relation
CN112749274A (en) Chinese text classification method based on attention mechanism and interference word deletion
CN114579746A (en) Optimized high-precision text classification method and device
CN116152554A (en) Knowledge-guided small sample image recognition system
CN115062727A (en) Graph node classification method and system based on multi-order hypergraph convolutional network
CN114881173A (en) Resume classification method and device based on self-attention mechanism
CN112860898B (en) Short text box clustering method, system, equipment and storage medium
CN112434686B (en) End-to-end misplaced text classification identifier for OCR (optical character) pictures
CN112231476B (en) Improved graphic neural network scientific literature big data classification method
CN112434145A (en) Picture-viewing poetry method based on image recognition and natural language processing
Hajihashemi et al. A pattern recognition based Holographic Graph Neuron for Persian alphabet recognition
CN111859955A (en) Public opinion data analysis model based on deep learning
CN113822061B (en) Small sample patent classification method based on feature map construction
CN115100694A (en) Fingerprint quick retrieval method based on self-supervision neural network
Gong et al. Graph convolutional networks-based label distribution learning for image classification
CN114003706A (en) Keyword combination generation model training method and device
CN109472319B (en) Three-dimensional model classification method and retrieval method
Sun et al. Analysis of English writing text features based on random forest and Logistic regression classification algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210302
