CN109857884B - Automatic image semantic description method - Google Patents
Automatic image semantic description method
- Publication number
- CN109857884B (application CN201811564965.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- description
- semantic
- labeling
- classifier
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an automatic image semantic description method comprising the following steps: clustering an image set according to visual features by using a clustering algorithm; dividing the clustered image set into a plurality of categories; performing image pre-description processing by using a CNN; labeling the image with a plurality of categories, the first-layer pre-description being referred to as the category labeling of the image; constructing a classifier for the images of each category by using an SVM; using the classifier to determine whether to add the description of that category to the image; performing detailed labeling by using an MBRM model labeling algorithm; and obtaining the image semantics by combining the image regions obtained from the related training sets. The method can effectively fuse the low-level features of an image with the high-level semantic information of the image semantic description, offers high precision and accuracy, is explicit, formalized, shareable and conceptualized, can be widely applied in fields including information retrieval, information extraction, semantic networks and knowledge management, and has strong applicability.
Description
Technical Field
The invention relates to the technical field of image semantic description, in particular to an automatic image semantic description method.
Background
Automatic image content description (image captioning) means describing the content of an image automatically in natural language. Because it has broad application prospects, for example in human-computer interaction and guidance systems for the blind, it has recently become a new focus in computer vision and artificial intelligence. Unlike image classification or object detection, automatic image description aims at a comprehensive description of objects, scenes and the relations among them. It involves visual scene analysis, semantic understanding of content and natural language processing, and therefore integrates cutting-edge techniques in a single hybrid task.
In the prior art, image features extracted by a convolutional neural network (CNN) are used as the input of a recurrent neural network (RNN), and the image semantic description information is the output of the RNN. The image semantic description problem is thus treated as a translation from an image to a semantic description, and an automatic image semantic description model based on the CNN and the RNN is constructed.
Disclosure of Invention
The invention aims to provide an automatic image semantic description method that solves the problems described in the Background.
To achieve this purpose, the invention provides the following technical solution: an automatic image semantic description method comprising the following steps:
step 1, clustering an image set according to visual features by using a clustering algorithm;
step 2, dividing the clustered image set into a plurality of categories, each category containing a plurality of images;
step 3, performing image pre-description processing by using a CNN, and labeling the purpose of the pre-description;
step 4, labeling the image with a plurality of categories, the first-layer pre-description being referred to as the category labeling of the image;
step 5, constructing a classifier for the images of each category by using an SVM;
step 6, using the classifier to judge whether to add the description of that category to the image;
step 7, performing detailed labeling of the semantic keywords of the test image by using an MBRM model labeling algorithm;
step 8, applying the second-layer labels to the image according to the category of the test image to obtain the corresponding images;
step 9, using these images together as the training set of the detailed labeling stage, and obtaining the image semantics by combining the image regions obtained from the related training sets.
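Steps 5 and 6 can be illustrated with a minimal sketch, assuming one binary linear SVM per category trained by stochastic subgradient descent on the hinge loss. The source does not specify the kernel, the features or the training procedure, so `train_linear_svm`, `should_add_label`, the toy 2-D features, the category names and all hyperparameters are hypothetical.

```python
import random

def train_linear_svm(samples, labels, epochs=200, lr=0.05, reg=0.01, seed=0):
    """Train a linear SVM (hinge loss + L2 penalty) by stochastic subgradient descent.

    samples: list of feature vectors; labels: +1 (in the category) or -1 (not).
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Shrink the weights (gradient of the L2 penalty).
            w = [wj * (1.0 - lr * reg) for wj in w]
            # Hinge loss is active only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def should_add_label(model, x):
    """Step 6: add the category description only if the SVM score is positive."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b > 0.0

# Hypothetical toy data: 2-D features and the category of each training image.
features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
categories = ["sky", "sky", "grass", "grass"]

# Step 5: one classifier per category (one-vs-rest).
classifiers = {}
for cat in sorted(set(categories)):
    y = [1 if c == cat else -1 for c in categories]
    classifiers[cat] = train_linear_svm(features, y)

test_image = [0.85, 0.15]
added = [cat for cat, model in classifiers.items() if should_add_label(model, test_image)]
print(added)
```

A one-vs-rest layout is used here because step 6 makes an independent yes/no decision per category, so an image may receive several category descriptions or none.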
Preferably, in step 1, the clustering algorithm is the K-means image clustering algorithm.
Preferably, in step 1, the clustering algorithm is the ISODATA image clustering algorithm.
Preferably, in step 3, the image pre-description processing comprises vectorization, attribute establishment, projection transformation and data format conversion.
Preferably, in step 6, the classifier guides visual analysis and the generation of prediction-stage categories, thereby optimizing the parameters of the semantic classifier.
Preferably, in step 4, the category labeling of the image is mapped to a category space based on the features of each node, a classification loss is calculated, and optimization is achieved by backpropagating the loss.
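The optimization described for step 4 can be sketched as follows, assuming a single linear layer that maps a node feature vector to category scores, a softmax cross-entropy classification loss, and plain gradient descent. The source states neither the network architecture nor the loss function, so `softmax`, `loss_and_grad`, `train`, `predict`, the weight matrix `W` and the toy data are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def loss_and_grad(W, x, target):
    """Cross-entropy loss of the linear map W (categories x features) and dLoss/dW."""
    scores = [sum(wij * xj for wij, xj in zip(row, x)) for row in W]
    probs = softmax(scores)
    loss = -math.log(probs[target])
    # dLoss/dscore_k = p_k - 1[k == target]; the chain rule then gives dLoss/dW.
    grad = [[(p - (1.0 if k == target else 0.0)) * xj for xj in x]
            for k, p in enumerate(probs)]
    return loss, grad

def train(W, data, lr=0.5, epochs=100):
    """Propagate the classification loss back to W with gradient descent."""
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, target in data:
            loss, grad = loss_and_grad(W, x, target)
            total += loss
            W = [[wij - lr * gij for wij, gij in zip(w_row, g_row)]
                 for w_row, g_row in zip(W, grad)]
        history.append(total / len(data))
    return W, history

def predict(W, x):
    """Return the index of the highest-scoring category."""
    scores = [sum(wij * xj for wij, xj in zip(row, x)) for row in W]
    return max(range(len(scores)), key=lambda k: scores[k])

# Hypothetical node features paired with category indices.
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([0.9, 0.2], 0), ([0.1, 0.8], 1)]
W, history = train([[0.0, 0.0], [0.0, 0.0]], data)
print(history[0], history[-1])
```

The mean loss recorded in `history` should fall over the epochs, which is the sense in which the category labeling is "optimized" in this sketch.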
Compared with the prior art, the invention has the following beneficial effects:
1. the low-level features of an image can be effectively fused with the high-level semantic information of the image semantic description; precision and accuracy are high; high semantic description precision can be achieved with fewer parameters; and the requirements of practical applications can be well met;
2. the method describes semantics through the relations between concepts, is explicit, formalized, shareable and conceptualized, can be widely applied in fields including information retrieval, information extraction, semantic networks and knowledge management, and has strong applicability.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1:
The invention provides a technical solution: an automatic image semantic description method comprising the following steps:
step 1, clustering an image set according to visual features by using a clustering algorithm, wherein the clustering algorithm is the K-means image clustering algorithm, which can improve clustering accuracy;
step 2, dividing the clustered image set into a plurality of categories, each category containing a plurality of images;
step 3, performing image pre-description processing by using a CNN, and labeling the purpose of the pre-description, wherein the image pre-description processing comprises vectorization, attribute establishment, projection transformation and data format conversion;
step 4, labeling the image with a plurality of categories, the first-layer pre-description being referred to as the category labeling of the image, wherein the category labeling of the image is mapped to a category space based on the features of each node, a classification loss is calculated, and optimization is achieved by backpropagating the loss;
step 5, constructing a classifier for the images of each category by using an SVM;
step 6, using the classifier to judge whether to add the description of that category to the image, wherein the classifier guides visual analysis and the generation of prediction-stage categories, thereby optimizing the parameters of the semantic classifier;
step 7, performing detailed labeling of the semantic keywords of the test image by using an MBRM model labeling algorithm;
step 8, applying the second-layer labels to the image according to the category of the test image to obtain the corresponding images;
step 9, using these images together as the training set of the detailed labeling stage, and obtaining the image semantics by combining the image regions obtained from the related training sets.
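Steps 1 and 2 of Example 1 can be sketched as follows, assuming each image is already represented by a numeric visual feature vector. The source does not say how features are extracted or how cluster centers are initialized; the farthest-first initialization, the function names and the toy features below are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centers(points, k):
    """Farthest-first initialization (an assumption; the source names no scheme)."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    return centers

def kmeans(points, k, max_iters=100):
    """Plain K-means: alternate nearest-center assignment and center update."""
    centers = init_centers(points, k)
    assignment = None
    for _ in range(max_iters):
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing, so the algorithm has converged
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers

# Hypothetical visual features, one vector per image.
visual_features = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [5.0, 5.1], [5.1, 5.0], [4.9, 5.0],
]

# Step 1: cluster the image set by visual features.
assignment, centers = kmeans(visual_features, k=2)

# Step 2: divide the clustered set into categories (cluster index -> image ids).
categories = {}
for image_id, cluster in enumerate(assignment):
    categories.setdefault(cluster, []).append(image_id)
print(categories)
```

K-means requires the number of categories `k` to be chosen in advance, which is the main difference from the ISODATA variant of Example 2.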
Example 2:
The invention provides a technical solution: an automatic image semantic description method comprising the following steps:
step 1, clustering an image set according to visual features by using a clustering algorithm, wherein the clustering algorithm is the ISODATA image clustering algorithm, in which the number of categories can be automatically increased or decreased during clustering, which improves efficiency;
step 2, dividing the clustered image set into a plurality of categories, each category containing a plurality of images;
step 3, performing image pre-description processing by using a CNN, and labeling the purpose of the pre-description, wherein the image pre-description processing comprises vectorization, attribute establishment, projection transformation and data format conversion;
step 4, labeling the image with a plurality of categories, the first-layer pre-description being referred to as the category labeling of the image, wherein the category labeling of the image is mapped to a category space based on the features of each node, a classification loss is calculated, and optimization is achieved by backpropagating the loss;
step 5, constructing a classifier for the images of each category by using an SVM;
step 6, using the classifier to judge whether to add the description of that category to the image, wherein the classifier guides visual analysis and the generation of prediction-stage categories, thereby optimizing the parameters of the semantic classifier;
step 7, performing detailed labeling of the semantic keywords of the test image by using an MBRM model labeling algorithm;
step 8, applying the second-layer labels to the image according to the category of the test image to obtain the corresponding images;
step 9, using these images together as the training set of the detailed labeling stage, and obtaining the image semantics by combining the image regions obtained from the related training sets.
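The way ISODATA increases or decreases the number of categories in step 1 of Example 2 can be illustrated with a simplified sketch. It keeps only the core split and merge rules: a cluster whose largest per-dimension standard deviation exceeds `split_std` is split in two, and centers closer than `merge_dist` are merged. Full ISODATA also uses a desired cluster count and a minimum cluster size and weights merges by cluster size; those parts are omitted. All names, thresholds and toy data are hypothetical.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

def assign_points(points, centers):
    """Group every point with its nearest center."""
    groups = [[] for _ in centers]
    for p in points:
        idx = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
        groups[idx].append(p)
    return groups

def isodata(points, centers, split_std=1.0, merge_dist=1.0, iters=10):
    centers = [list(c) for c in centers]
    for _ in range(iters):
        # Drop empty clusters, then recompute each center as the cluster mean.
        groups = [g for g in assign_points(points, centers) if g]
        centers = [mean(g) for g in groups]
        # Split: a cluster that spreads too far along one dimension becomes two.
        split = []
        for g, c in zip(groups, centers):
            stds = [math.sqrt(sum((p[d] - c[d]) ** 2 for p in g) / len(g))
                    for d in range(len(c))]
            d = max(range(len(c)), key=lambda i: stds[i])
            if stds[d] > split_std and len(g) >= 2:
                low, high = list(c), list(c)
                low[d] -= stds[d]
                high[d] += stds[d]
                split += [low, high]
            else:
                split.append(c)
        # Merge: centers closer than merge_dist collapse to their midpoint
        # (unweighted, which is a simplification).
        merged = []
        for c in split:
            for i, m in enumerate(merged):
                if math.sqrt(sq_dist(c, m)) < merge_dist:
                    merged[i] = [(a + b) / 2 for a, b in zip(c, m)]
                    break
            else:
                merged.append(c)
        centers = merged
    groups = [g for g in assign_points(points, centers) if g]
    return [mean(g) for g in groups], groups

# Hypothetical visual features forming two well-separated groups.
points = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [6.0, 0.0], [6.2, 0.0], [6.0, 0.2]]

# Starting from one center, the split rule raises the count to two.
grown_centers, grown_groups = isodata(points, [[3.0, 0.0]])
# Starting from three centers, the merge rule lowers the count to two.
shrunk_centers, shrunk_groups = isodata(points, [[0.0, 0.0], [0.2, 0.0], [6.0, 0.0]])
print(len(grown_centers), len(shrunk_centers))
```

In both runs the cluster count adapts to the data, so no fixed number of categories has to be supplied up front.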
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
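The detailed labeling stage of steps 7 to 9 can be illustrated with a heavily simplified sketch in the spirit of a multiple-Bernoulli relevance model, which is what MBRM commonly denotes in the image annotation literature. The source gives no formulas, so everything here is an assumption: regions are scored with a Gaussian kernel density over each training image's regions, word presence uses a smoothed Bernoulli estimate, and words are ranked individually. The training set passed to `annotate` is assumed to be already restricted to images of the test image's category, as steps 8 and 9 describe. All names, parameters and data are hypothetical.

```python
import math

def region_density(r, train_regions, bandwidth):
    """Gaussian kernel density of region feature r under one training image."""
    norm = (2 * math.pi * bandwidth ** 2) ** (len(r) / 2)
    total = 0.0
    for g in train_regions:
        d2 = sum((a - b) ** 2 for a, b in zip(r, g))
        total += math.exp(-d2 / (2 * bandwidth ** 2)) / norm
    return total / len(train_regions)

def word_probability(word, train_words, word_counts, n_images, mu):
    """Smoothed Bernoulli estimate that `word` is present given a training image."""
    present = 1.0 if word in train_words else 0.0
    return (mu * present + word_counts[word]) / (mu + n_images)

def annotate(test_regions, training_set, vocabulary, bandwidth=0.5, mu=1.0, top_k=2):
    """Rank words by their summed relevance over all training images."""
    n = len(training_set)
    counts = {w: sum(1 for _, words in training_set if w in words)
              for w in vocabulary}
    scores = {w: 0.0 for w in vocabulary}
    for regions, words in training_set:
        # Visual likelihood of all test regions under this training image.
        visual = 1.0
        for r in test_regions:
            visual *= region_density(r, regions, bandwidth)
        # Uniform prior 1/n over training images.
        for w in vocabulary:
            scores[w] += (visual / n) * word_probability(w, words, counts, n, mu)
    ranked = sorted(vocabulary, key=lambda w: scores[w], reverse=True)
    return ranked[:top_k]

# Hypothetical training images: (region feature vectors, semantic keywords).
training_set = [
    ([[0.9, 0.1], [0.8, 0.2]], {"sky", "cloud"}),
    ([[0.1, 0.9], [0.2, 0.8]], {"grass", "tree"}),
]
vocabulary = ["cloud", "grass", "sky", "tree"]

keywords = annotate([[0.88, 0.12]], training_set, vocabulary)
print(keywords)
```

Restricting `training_set` to one category shrinks the set of images that must be compared against each test image, which is one plausible reading of why the method performs category labeling before detailed labeling.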
Claims (6)
1. An automatic image semantic description method, characterized by comprising the following steps:
step 1, clustering an image set according to visual features by using a clustering algorithm;
step 2, dividing the clustered image set into a plurality of categories, each category containing a plurality of images;
step 3, performing image pre-description processing by using a CNN, and labeling the purpose of the pre-description;
step 4, labeling the image with a plurality of categories, the first-layer pre-description being referred to as the category labeling of the image;
step 5, constructing a classifier for the images of each category by using an SVM;
step 6, using the classifier to judge whether to add the description of that category to the image;
step 7, performing detailed labeling of the semantic keywords of the test image by using an MBRM model labeling algorithm;
step 8, applying the second-layer labels to the image according to the category of the test image to obtain the corresponding images;
step 9, using these images together as the training set of the detailed labeling stage, and obtaining the image semantics by combining the image regions obtained from the related training sets.
2. The automatic image semantic description method according to claim 1, characterized in that: in step 1, the clustering algorithm is the K-means image clustering algorithm.
3. The automatic image semantic description method according to claim 1, characterized in that: in step 1, the clustering algorithm is the ISODATA image clustering algorithm.
4. The automatic image semantic description method according to claim 1, characterized in that: in step 3, the image pre-description processing comprises vectorization, attribute establishment, projection transformation and data format conversion.
5. The automatic image semantic description method according to claim 1, characterized in that: in step 6, the classifier guides visual analysis and the generation of prediction-stage categories, thereby optimizing the parameters of the semantic classifier.
6. The automatic image semantic description method according to claim 1, characterized in that: in step 4, the category labeling of the image is mapped to a category space based on the features of each node, a classification loss is calculated, and optimization is achieved by backpropagating the loss.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811564965.8A CN109857884B (en) | 2018-12-20 | 2018-12-20 | Automatic image semantic description method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109857884A (en) | 2019-06-07 |
CN109857884B (en) | 2023-02-07 |
Family
ID=66891712
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811564965.8A Active CN109857884B (en) | 2018-12-20 | 2018-12-20 | Automatic image semantic description method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109857884B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112069342A (en) * | 2020-09-03 | 2020-12-11 | Oppo广东移动通信有限公司 | Image classification method and device, electronic equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101075263A (en) * | 2007-06-28 | 2007-11-21 | 北京交通大学 | Automatic image marking method emerged with pseudo related feedback and index technology |
CN101963995A (en) * | 2010-10-25 | 2011-02-02 | 哈尔滨工程大学 | Image marking method based on characteristic scene |
WO2016095487A1 (en) * | 2014-12-17 | 2016-06-23 | 中山大学 | Human-computer interaction-based method for parsing high-level semantics of image |
Non-Patent Citations (2)
Title |
---|
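A new automatic image annotation method based on semantic clustering and graph algorithms; Rui Xiaoguang et al.; Journal of Image and Graphics (《中国图象图形学报》); 2007-02-28 (No. 02); full text * |
Image semantic annotation based on classification fusion and association rule mining; Qin Ming et al.; Computer Engineering & Science (《计算机工程与科学》); 2018-05-15 (No. 05); full text * |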
Also Published As
Publication number | Publication date |
---|---|
CN109857884A (en) | 2019-06-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105844239B (en) | Method for detecting violent and terrorist videos based on CNN and LSTM | |
CN109271539B (en) | Image automatic labeling method and device based on deep learning | |
CN104376105A (en) | Feature fusing system and method for low-level visual features and text description information of images in social media | |
Cornia et al. | Explaining transformer-based image captioning models: An empirical analysis | |
CN115203421A (en) | Method, device and equipment for generating label of long text and storage medium | |
CN115099239B (en) | Resource identification method, device, equipment and storage medium | |
CN114091472B (en) | Training method of multi-label classification model | |
CN110287369B (en) | Semantic-based video retrieval method and system | |
CN114333062B (en) | Pedestrian re-recognition model training method based on heterogeneous dual networks and feature consistency | |
CN109857884B (en) | Automatic image semantic description method | |
CN109344911B (en) | Parallel processing classification method based on multilayer LSTM model | |
CN113590827A (en) | Scientific research project text classification device and method based on multiple angles | |
CN113743079A (en) | Text similarity calculation method and device based on co-occurrence entity interaction graph | |
CN116662924A (en) | Aspect-level multi-mode emotion analysis method based on dual-channel and attention mechanism | |
CN110580280A (en) | Method, device and storage medium for discovering new words | |
CN114842301A (en) | Semi-supervised training method of image annotation model | |
Yang et al. | Fine-Grained Lip Image Segmentation using Fuzzy Logic and Graph Reasoning | |
CN113987536A (en) | Method and device for determining security level of field in data table, electronic equipment and medium | |
CN113886602A (en) | Multi-granularity cognition-based domain knowledge base entity identification method | |
CN113177478A (en) | Short video semantic annotation method based on transfer learning | |
Tripathi et al. | Multimodal query-guided object localization | |
Zhang et al. | Image caption generation method based on an interaction mechanism and scene concept selection module | |
CN115374765B (en) | Computing power network 5G data analysis system and method based on natural language processing | |
CN109657684A (en) | A kind of image, semantic analytic method based on Weakly supervised study | |
Zhao | Construction of Safety Early Warning Model for Construction of Engineering Based on Convolution Neural Network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||