CN109857884B - Automatic image semantic description method - Google Patents

Publication number
CN109857884B
Authority
CN
China
Prior art keywords
image
description
semantic
labeling
classifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811564965.8A
Other languages
Chinese (zh)
Other versions
CN109857884A (en)
Inventor
李祖贺
张涛
钱晓亮
曾黎
金保华
于泽琦
田二林
于源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou University of Light Industry
Original Assignee
Zhengzhou University of Light Industry
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-12-20
Filing date
2018-12-20
Publication date
2023-02-07
Application filed by Zhengzhou University of Light Industry
Priority to CN201811564965.8A
Publication of CN109857884A
Application granted
Publication of CN109857884B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an automatic image semantic description method comprising the following steps: clustering an image set according to visual features using a clustering algorithm; dividing the clustered image set into a plurality of categories; performing image pre-description processing using a CNN; labeling the image with a plurality of categories, the first-layer pre-description being called the category label of the image; constructing a classifier for the images of each category using an SVM; using the classifier to determine whether to add the description of that category to the image; applying the MBRM model annotation algorithm; and obtaining the image semantics by combining the image regions obtained from the relevant training sets. The method effectively fuses the low-level features of an image with the high-level semantic information of the image semantic description. It offers high precision and accuracy, is well-defined, formalized, shareable, and conceptualized, can be applied widely in fields including information retrieval, information extraction, semantic networks, and knowledge management, and has strong applicability.

Description

Automatic image semantic description method
Technical Field
The invention relates to the technical field of image semantic description, in particular to an automatic image semantic description method.
Background
Automatic image content description (image captioning) means describing the content of an image automatically in natural language. Because it has broad application prospects, for example in human-computer interaction and guidance systems for the blind, it has recently become a new focus in computer vision and artificial intelligence. Unlike image classification or object detection, automatic image description aims at a comprehensive description of objects, scenes, and the relations among them. It involves visual scene analysis, content semantic understanding, and natural language processing, and so combines cutting-edge techniques from several fields in a single task.
In the prior art, image features extracted by a CNN are used as the input of a recurrent neural network (RNN), and the image semantic description is the output of the RNN. The image semantic description problem is thus treated as a translation process from image to semantic description, and an automatic image semantic description model based on the CNN and the RNN is constructed.
Disclosure of Invention
The invention aims to provide an automatic image semantic description method that solves the problems identified in the background art.
To achieve this purpose, the invention provides the following technical scheme: an automatic image semantic description method comprising the following steps:
step 1, clustering an image set according to visual features using a clustering algorithm;
step 2, dividing the clustered image set into a plurality of categories, each category containing a plurality of images;
step 3, performing image pre-description processing using a CNN, and marking the purpose of the pre-description;
step 4, labeling the image with a plurality of categories, the first-layer pre-description being called the category label of the image;
step 5, constructing a classifier for the images of each category using an SVM;
step 6, using the classifier to judge whether to add the description of that category to the image;
step 7, annotating the test image in detail with semantic keywords using the MBRM model annotation algorithm;
step 8, assigning the second-layer label of the image according to the category of the test image to obtain the corresponding images;
step 9, using these images together as the training set of the detailed annotation stage, and obtaining the image semantics by combining the image regions obtained from the relevant training sets.
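Steps 5 and 6 above can be illustrated with a minimal sketch. The patent does not specify the SVM formulation, so the code below assumes a linear SVM trained by subgradient descent on the hinge loss, one classifier per category (one-vs-rest). The function names, hyperparameters, and toy feature vectors are all hypothetical and are not taken from the patent.

```python
# Sketch of steps 5-6: one linear SVM per category, then a yes/no decision
# per category for a new image. All names and data here are hypothetical.

def train_linear_svm(features, labels, epochs=200, lr=0.1, reg=0.01):
    """Train a linear SVM by subgradient descent on the hinge loss.

    features: list of feature vectors (lists of floats)
    labels:   list of +1 / -1 values (image is in the category or not)
    """
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 regularization shrinks the weights on every update.
            w = [wi - lr * reg * wi for wi in w]
            if margin < 1:
                # The sample violates the margin: shift the hyperplane.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def build_category_classifiers(features, category_ids):
    """Step 5: one binary (one-vs-rest) classifier per category."""
    classifiers = {}
    for category in sorted(set(category_ids)):
        labels = [1 if c == category else -1 for c in category_ids]
        classifiers[category] = train_linear_svm(features, labels)
    return classifiers


def categories_to_add(classifiers, x):
    """Step 6: keep every category whose classifier scores x positively."""
    selected = []
    for category, (w, b) in classifiers.items():
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if score > 0:
            selected.append(category)
    return selected


# Toy 2-D "visual features" for two clusters of images (made-up values).
train_x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
           [2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]
train_c = [0, 0, 0, 1, 1, 1]
classifiers = build_category_classifiers(train_x, train_c)
print(categories_to_add(classifiers, [0.1, 0.1]))
print(categories_to_add(classifiers, [2.1, 2.0]))
```

A kernel SVM from an established library would be the more usual choice in practice; the linear version is used here only to keep the sketch self-contained.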
Preferably, in step 1, the clustering algorithm is the K-means image clustering algorithm.
Preferably, in step 1, the clustering algorithm is the ISODATA image clustering algorithm.
Preferably, in step 3, the image pre-description processing comprises: vectorization, attribute establishment, projection transformation, and data format conversion.
Preferably, in step 6, the classifier guides visual analysis and the generation of categories in the prediction stage, thereby optimizing the parameters of the semantic classifier.
Preferably, in step 4, the category labeling of the image is mapped to a category space based on the features of each node, a classification loss is calculated, and the labeling is optimized by backpropagating the loss.
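The loss-based optimization of the category labeling can be sketched as follows. The patent does not state the form of the mapping or of the loss, so this example assumes a linear map from a node's feature vector to category scores, a softmax cross-entropy loss, and plain gradient descent. Every name and value is hypothetical.

```python
import math

# Sketch: map a feature vector to a category space, compute a classification
# loss, and update the mapping by backpropagating that loss.
# The linear map and the softmax cross-entropy loss are assumptions.

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def loss_and_gradients(weights, bias, feature, label):
    """Forward pass plus the gradients of the cross-entropy loss."""
    scores = [sum(w * x for w, x in zip(row, feature)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(scores)
    loss = -math.log(probs[label])
    # d(loss)/d(score_k) = p_k - 1 if k is the true category, else p_k.
    dscores = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    grad_w = [[d * x for x in feature] for d in dscores]
    return loss, grad_w, dscores


def train_step(weights, bias, feature, label, lr=0.2):
    """One gradient-descent update; returns the loss before the update."""
    loss, grad_w, grad_b = loss_and_gradients(weights, bias, feature, label)
    weights = [[w - lr * g for w, g in zip(row, grow)]
               for row, grow in zip(weights, grad_w)]
    bias = [b - lr * g for b, g in zip(bias, grad_b)]
    return loss, weights, bias


# Three categories, two-dimensional features, one training example.
weights = [[0.0, 0.0] for _ in range(3)]
bias = [0.0, 0.0, 0.0]
feature, label = [1.0, 2.0], 2
losses = []
for _ in range(20):
    loss, weights, bias = train_step(weights, bias, feature, label)
    losses.append(loss)
print(losses[0], losses[-1])
```

With all parameters initialized to zero, the first loss equals ln 3 because the three categories start equally likely, and each update lowers it.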
Compared with the prior art, the invention has the following beneficial effects:
1. the method effectively fuses the low-level features of an image with the high-level semantic information of the image semantic description, achieves high precision and accuracy, reaches high semantic description precision with fewer parameters, and meets the requirements of practical applications well;
2. the method describes semantics through the relations between concepts, is well-defined, formalized, shareable, and conceptualized, can be applied widely in fields including information retrieval, information extraction, semantic networks, and knowledge management, and has strong applicability.
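Step 7 of the method names the MBRM model annotation algorithm without describing it. MBRM is generally understood as the Multiple Bernoulli Relevance Model from the image annotation literature. The sketch below is a heavily simplified reading of that model and not the patent's implementation: region likelihoods use an isotropic Gaussian kernel, word probabilities use a smoothed Bernoulli estimate, and training images have a uniform prior. The bandwidth, the smoothing weight mu, and the toy data are assumptions. Following steps 8 and 9, the training set passed in would hold the images sharing the test image's category.

```python
import math

# Simplified MBRM-style annotation sketch; names and data are hypothetical.

def region_likelihood(region, image_regions, bandwidth):
    """Kernel density estimate of P(region | image), up to a constant."""
    total = 0.0
    for r in image_regions:
        sq = sum((a - b) ** 2 for a, b in zip(region, r))
        total += math.exp(-sq / (2 * bandwidth ** 2))
    return total / len(image_regions)


def word_probability(word, image_words, word_counts, n_images, mu):
    """Smoothed Bernoulli estimate of P(word | image)."""
    present = 1.0 if word in image_words else 0.0
    return (mu * present + word_counts[word]) / (mu + n_images)


def annotate(test_regions, training_set, vocabulary,
             top_k=2, bandwidth=1.0, mu=10.0):
    """Rank words by their summed score over all training images."""
    n = len(training_set)
    word_counts = {w: sum(1 for _, words in training_set if w in words)
                   for w in vocabulary}
    scores = {w: 0.0 for w in vocabulary}
    for regions, words in training_set:
        weight = 1.0 / n  # uniform prior over training images
        for r in test_regions:
            weight *= region_likelihood(r, regions, bandwidth)
        for w in vocabulary:
            scores[w] += weight * word_probability(w, words, word_counts, n, mu)
    ranked = sorted(vocabulary, key=lambda w: scores[w], reverse=True)
    return ranked[:top_k]


# Each training image: (list of region feature vectors, set of keywords).
training_set = [
    ([[0.0, 0.0], [0.1, 0.1]], {"sky", "sea"}),
    ([[0.1, 0.0], [0.0, 0.2]], {"sky", "beach"}),
    ([[5.0, 5.0], [5.1, 4.9]], {"tiger", "grass"}),
]
vocabulary = ["sky", "sea", "beach", "tiger", "grass"]
print(annotate([[0.05, 0.05]], training_set, vocabulary, top_k=1))
print(annotate([[5.0, 5.0]], training_set, vocabulary, top_k=2))
```

A test region close to the regions of a training image pulls that image's keywords to the top of the ranking, which is the mechanism by which image regions from the training set are combined into semantics.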
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1:
The invention provides the following technical scheme: an automatic image semantic description method comprising the following steps:
step 1, clustering an image set according to visual features using a clustering algorithm, the clustering algorithm being the K-means image clustering algorithm, which can improve clustering accuracy;
step 2, dividing the clustered image set into a plurality of categories, each category containing a plurality of images;
step 3, performing image pre-description processing using a CNN and marking the purpose of the pre-description, the image pre-description processing comprising: vectorization, attribute establishment, projection transformation, and data format conversion;
step 4, labeling the image with a plurality of categories, the first-layer pre-description being called the category label of the image; the category labeling is mapped to a category space based on the features of each node, a classification loss is calculated, and the labeling is optimized by backpropagating the loss;
step 5, constructing a classifier for the images of each category using an SVM;
step 6, using the classifier to judge whether to add the description of that category to the image, the classifier guiding visual analysis and the generation of categories in the prediction stage so as to optimize the parameters of the semantic classifier;
step 7, annotating the test image in detail with semantic keywords using the MBRM model annotation algorithm;
step 8, assigning the second-layer label of the image according to the category of the test image to obtain the corresponding images;
step 9, using these images together as the training set of the detailed annotation stage, and obtaining the image semantics by combining the image regions obtained from the relevant training sets.
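Steps 1 and 2 of Example 1 can be sketched with a small K-means routine. The patent does not give initialization or stopping rules, so this sketch assumes a deterministic farthest-first initialization and stops when assignments no longer change. The feature vectors stand in for visual features and are made up.

```python
# Sketch of steps 1-2 with K-means; data and parameter choices are hypothetical.

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(p[d] for p in points) / len(points)
            for d in range(len(points[0]))]


def kmeans(points, k, max_iter=100):
    """Return (centers, assignment) for k clusters."""
    # Farthest-first initialization keeps the example deterministic.
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(squared_distance(p, c)
                                                 for c in centers))
        centers.append(list(farthest))
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each image goes to its nearest center.
        new_assignment = [min(range(k),
                              key=lambda j: squared_distance(p, centers[j]))
                          for p in points]
        if new_assignment == assignment:
            break  # converged: no image changed cluster
        assignment = new_assignment
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = mean(members)
    return centers, assignment


# Step 1: cluster toy visual features of six images.
features = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0],
            [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, assignment = kmeans(features, k=2)

# Step 2: divide the image set into categories (image indices per cluster).
groups = {}
for index, cluster_id in enumerate(assignment):
    groups.setdefault(cluster_id, []).append(index)
print(groups)
```

K-means needs the number of categories k fixed in advance, which is the main practical difference from the ISODATA variant of Example 2.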
Example 2:
The invention provides the following technical scheme: an automatic image semantic description method comprising the following steps:
step 1, clustering an image set according to visual features using a clustering algorithm, the clustering algorithm being the ISODATA image clustering algorithm, which can automatically increase or decrease the number of categories during clustering and improves efficiency;
step 2, dividing the clustered image set into a plurality of categories, each category containing a plurality of images;
step 3, performing image pre-description processing using a CNN and marking the purpose of the pre-description, the image pre-description processing comprising: vectorization, attribute establishment, projection transformation, and data format conversion;
step 4, labeling the image with a plurality of categories, the first-layer pre-description being called the category label of the image; the category labeling is mapped to a category space based on the features of each node, a classification loss is calculated, and the labeling is optimized by backpropagating the loss;
step 5, constructing a classifier for the images of each category using an SVM;
step 6, using the classifier to judge whether to add the description of that category to the image, the classifier guiding visual analysis and the generation of categories in the prediction stage so as to optimize the parameters of the semantic classifier;
step 7, annotating the test image in detail with semantic keywords using the MBRM model annotation algorithm;
step 8, assigning the second-layer label of the image according to the category of the test image to obtain the corresponding images;
step 9, using these images together as the training set of the detailed annotation stage, and obtaining the image semantics by combining the image regions obtained from the relevant training sets.
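The ability of ISODATA to increase or decrease the number of categories, as used in step 1 of Example 2, can be sketched as below. This is a simplified ISODATA-style loop and not the full algorithm: each pass discards undersized clusters, splits a cluster whose spread along one dimension exceeds a threshold, and merges centers that lie too close together (using an unweighted midpoint). The thresholds and data are hypothetical.

```python
import math

# Simplified ISODATA-style clustering sketch; thresholds are hypothetical.

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    return [sum(p[d] for p in points) / len(points)
            for d in range(len(points[0]))]


def assign(points, centers):
    """Group points by their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda j: dist(p, centers[j]))
        clusters[nearest].append(p)
    return clusters


def isodata_like(points, centers, min_size=2, split_std=1.0,
                 merge_dist=1.0, iterations=10):
    dims = len(points[0])
    for _ in range(iterations):
        # 1. Assign points and discard clusters that are too small.
        clusters = [c for c in assign(points, centers) if len(c) >= min_size]
        if not clusters:
            clusters = [list(points)]
        centers = []
        for cluster in clusters:
            center = centroid(cluster)
            stds = [math.sqrt(sum((p[d] - center[d]) ** 2 for p in cluster)
                              / len(cluster)) for d in range(dims)]
            widest = max(range(dims), key=lambda d: stds[d])
            # 2. Split a cluster that is too spread out (count goes up).
            if stds[widest] > split_std and len(cluster) >= 2 * min_size:
                low, high = list(center), list(center)
                low[widest] -= stds[widest]
                high[widest] += stds[widest]
                centers.extend([low, high])
            else:
                centers.append(center)
        # 3. Merge the closest pair while it is too close (count goes down).
        merged = True
        while merged and len(centers) > 1:
            merged = False
            pairs = [(dist(centers[i], centers[j]), i, j)
                     for i in range(len(centers))
                     for j in range(i + 1, len(centers))]
            d, i, j = min(pairs)
            if d < merge_dist:
                midpoint = [(a + b) / 2 for a, b in zip(centers[i], centers[j])]
                centers = [c for idx, c in enumerate(centers)
                           if idx not in (i, j)] + [midpoint]
                merged = True
    return centers


points = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2],
          [6.0, 6.0], [6.2, 6.1], [6.1, 5.9], [5.9, 6.2]]
grown = isodata_like(points, [[3.0, 3.0]])          # starts with 1 center
shrunk = isodata_like(points, [[0.0, 0.0], [0.4, 0.4], [6.0, 6.0]])  # 3 centers
print(len(grown), len(shrunk))
```

Starting from one center or from three, the loop settles on the two groups present in the toy data, which illustrates why the number of categories need not be fixed in advance.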
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (6)

1. An automatic image semantic description method, characterized by comprising the following steps:
step 1, clustering an image set according to visual features using a clustering algorithm;
step 2, dividing the clustered image set into a plurality of categories, each category containing a plurality of images;
step 3, performing image pre-description processing using a CNN, and marking the purpose of the pre-description;
step 4, labeling the image with a plurality of categories, the first-layer pre-description being called the category label of the image;
step 5, constructing a classifier for the images of each category using an SVM;
step 6, using the classifier to judge whether to add the description of that category to the image;
step 7, annotating the test image in detail with semantic keywords using the MBRM model annotation algorithm;
step 8, assigning the second-layer label of the image according to the category of the test image to obtain the corresponding images;
step 9, using these images together as the training set of the detailed annotation stage, and obtaining the image semantics by combining the image regions obtained from the relevant training sets.
2. The automatic image semantic description method according to claim 1, characterized in that: in step 1, the clustering algorithm is the K-means image clustering algorithm.
3. The automatic image semantic description method according to claim 1, characterized in that: in step 1, the clustering algorithm is the ISODATA image clustering algorithm.
4. The automatic image semantic description method according to claim 1, characterized in that: in step 3, the image pre-description processing comprises: vectorization, attribute establishment, projection transformation, and data format conversion.
5. The automatic image semantic description method according to claim 1, characterized in that: in step 6, the classifier guides visual analysis and the generation of categories in the prediction stage, thereby optimizing the parameters of the semantic classifier.
6. The automatic image semantic description method according to claim 1, characterized in that: in step 4, the category labeling of the image is mapped to a category space based on the features of each node, a classification loss is calculated, and the labeling is optimized by backpropagating the loss.
CN201811564965.8A 2018-12-20 2018-12-20 Automatic image semantic description method Active CN109857884B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811564965.8A CN109857884B (en) 2018-12-20 2018-12-20 Automatic image semantic description method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811564965.8A CN109857884B (en) 2018-12-20 2018-12-20 Automatic image semantic description method

Publications (2)

Publication Number Publication Date
CN109857884A 2019-06-07
CN109857884B 2023-02-07

Family

ID=66891712

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811564965.8A Active CN109857884B (en) 2018-12-20 2018-12-20 Automatic image semantic description method

Country Status (1)

Country Link
CN (1) CN109857884B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112069342A (en) * 2020-09-03 2020-12-11 Oppo广东移动通信有限公司 Image classification method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101075263A (en) * 2007-06-28 2007-11-21 北京交通大学 Automatic image marking method emerged with pseudo related feedback and index technology
CN101963995A (en) * 2010-10-25 2011-02-02 哈尔滨工程大学 Image marking method based on characteristic scene
WO2016095487A1 (en) * 2014-12-17 2016-06-23 中山大学 Human-computer interaction-based method for parsing high-level semantics of image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
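A new automatic image annotation method based on semantic clustering and graph algorithms; Rui Xiaoguang (芮晓光) et al.; Journal of Image and Graphics (《中国图象图形学报》); 2007-02-28 (No. 02); full text *
Image semantic annotation based on classification fusion and association rule mining; Qin Ming (秦铭) et al.; Computer Engineering and Science (《计算机工程与科学》); 2018-05-15 (No. 05); full text *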

Also Published As

Publication number Publication date
CN109857884A (en) 2019-06-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant