CN102142089A - Semantic binary tree-based image annotation method - Google Patents

Semantic binary tree-based image annotation method

Info

Publication number
CN102142089A
CN102142089A, CN201110002770, CN201110002770A
Authority
CN
China
Prior art keywords
image
word
binary tree
semantic
mark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 201110002770
Other languages
Chinese (zh)
Other versions
CN102142089B (en)
Inventor
刘咏梅 (Liu Yongmei)
杨帆 (Yang Fan)
杜福鹏 (Du Fupeng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University filed Critical Harbin Engineering University
Priority to CN201110002770A priority Critical patent/CN102142089B/en
Publication of CN102142089A publication Critical patent/CN102142089A/en
Application granted granted Critical
Publication of CN102142089B publication Critical patent/CN102142089B/en
Expired - Fee Related (current legal status)
Anticipated expiration


Landscapes

  • Image Analysis (AREA)

Abstract


The invention provides a semantic binary tree-based image annotation method. Step 1: for an image set of a specific scene, segment the annotated images used for learning with an image segmentation algorithm to obtain visual descriptions of the image regions. Step 2: construct the visual nearest-neighbor graph of all images used for learning. Step 3: build the semantic binary tree of the scene from the nearest-neighbor graph of step 2. Step 4: for an image to be annotated in the scene, find the corresponding position by descending the semantic binary tree from the root node towards the leaf nodes, and pass all annotation words on the path from that node back to the root node to the image. The invention builds a semantic binary tree over the annotated training image set of a specific scene in order to improve the accuracy of automatic semantic annotation of images that have been scene-classified by their visual features.

Figure (application 201110002770): flowchart of the annotation method

Description

An Image Annotation Method Based on a Semantic Binary Tree

Technical Field

The present invention relates to a method for the automatic semantic annotation of images.

Background Art

Image annotation words are a very valuable image description resource and reflect the high-level semantic information of an image well. Making full use of the annotation word information of the training images is an important means of improving image annotation accuracy. The background of the present invention is to extract the semantic scenes of the training images on the basis of jointly exploiting the correlation between image semantics and visual features, to build a visual model for the training images of each scene, and finally to assign the images to be annotated to a semantic category according to their visual features.

Summary of the Invention

The object of the present invention is to provide a semantic binary tree-based image annotation method that improves the annotation accuracy for images to be annotated after scene classification.

The object of the present invention is achieved as follows:

Step 1: for an image set of a specific scene, segment the annotated images used for learning with an image segmentation algorithm to obtain visual descriptions of the image regions.

Step 2: construct the visual nearest-neighbor graph of all images used for learning.

Step 3: build the semantic binary tree of the scene from the nearest-neighbor graph of step 2.

Step 4: for an image to be annotated in the scene, find the corresponding position by descending the semantic binary tree from the root node towards the leaf nodes, and pass all annotation words on the path from that node back to the root node to the image.

The visual nearest-neighbor graph of all images used for learning is constructed as follows: the visual distance between images is the Earth Mover's Distance, a similarity measure based on integrated multi-region matching; each vertex of the graph corresponds to one image, and the edges connecting the vertices correspond to the visual distances between images.

The semantic binary tree is built as follows: the root node of the binary tree gathers all annotated images of the scene, and the annotation word representing the scene is the semantic representation of the root node. The nearest-neighbor graph of step 2 is bisected with the normalized cut algorithm, dividing the images into two sets that represent the left subtree and the right subtree of the root node. For each of the two sets, one significant annotation word, other than the annotation word at the root node, is determined, and the attribution of each image is re-determined according to this word. The significant word is found by counting the occurrences of every annotation word in the set and taking the word with the highest count as the significant word; if more than one word has the highest count, the word with the lower word frequency is taken as the significant word.

The above operation is repeated for the left subtree and the right subtree of the root node until a set contains only one image or contains no significantly occurring annotation word. The leaf nodes at the bottom correspond to images whose annotation words occur with lower frequency.

The present invention uses annotation words and visual information to build a semantic binary tree for the annotated images of a specific scene, and gives a concrete method for doing so. The root of the tree corresponds to the most common annotation word of the scene. As the semantic tree grows, the semantics of each node are refined branch by branch: the semantics of the child nodes become progressively finer and the annotation words they represent become progressively more specific. Through the constructed semantic binary tree, an image of the scene to be annotated obtains its annotation information along the path from the root of the scene's semantic tree to a leaf node.

The present invention aims to build a semantic binary tree over the annotated training image set of a specific scene in order to improve the accuracy of automatic semantic annotation of images that have been scene-classified by their visual features.

The present invention applies a binary tree whose nodes carry keywords to image annotation and has high practical value. It will be of significant help to many CBIR (content-based image retrieval) applications, such as Google's image search engine.

Brief Description of the Drawing

The accompanying drawing is a flowchart of the present invention.

Detailed Description

The present invention is described in more detail below with reference to the accompanying drawing and an example:

Step 1: for an image set of a specific scene, segment the annotated images used for learning with an image segmentation algorithm to obtain visual descriptions of the image regions.
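The patent does not name a particular segmentation algorithm or region descriptor. As a minimal illustrative sketch of step 1 (an assumption, not the patented implementation), the following Python code uses SLIC superpixels from scikit-image and describes each region by its relative area and mean color, which yields a region-level signature that step 2 can compare.

```python
# Illustrative sketch of step 1 (assumption: SLIC superpixels and simple
# color signatures stand in for the unspecified segmentation and descriptors).
import numpy as np
from skimage.io import imread
from skimage.segmentation import slic

def region_signature(image_path, n_segments=20):
    """Segment one image and return (weights, features) describing its regions."""
    img = imread(image_path)                       # H x W x 3 RGB image
    labels = slic(img, n_segments=n_segments, start_label=0)
    weights, features = [], []
    for r in np.unique(labels):
        mask = labels == r
        weights.append(mask.sum() / mask.size)     # relative area of the region
        features.append(img[mask].mean(axis=0))    # mean RGB color of the region
    return np.asarray(weights), np.asarray(features)
```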

Step 2: construct the visual nearest-neighbor graph of all images used for learning. The visual distance between images is the Earth Mover's Distance (EMD), a similarity measure based on integrated multi-region matching. Each vertex of the graph corresponds to one image, and the edges connecting the vertices correspond to the visual distances between images.
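A minimal sketch of step 2, assuming the region signatures produced above and the POT library's ot.emd2 as the Earth Mover's Distance between two images; each image is then connected to its k visually nearest images in a networkx graph. The choice of k and of POT is illustrative and not part of the patent.

```python
# Illustrative sketch of step 2 (assumption: POT's ot.emd2 computes the Earth
# Mover's Distance between the region signatures from step 1).
import itertools
import networkx as nx
import numpy as np
import ot  # Python Optimal Transport

def emd_distance(sig_a, sig_b):
    """Earth Mover's Distance between two (weights, features) region signatures."""
    (w_a, f_a), (w_b, f_b) = sig_a, sig_b
    w_a = w_a / w_a.sum()                          # equal total mass on both sides
    w_b = w_b / w_b.sum()
    cost = ot.dist(f_a, f_b)                       # pairwise ground distances
    return ot.emd2(w_a, w_b, cost)                 # optimal transport cost

def nearest_neighbor_graph(signatures, k=5):
    """Vertices are images; each image is linked to its k visually nearest images."""
    n = len(signatures)
    dist = np.zeros((n, n))
    for i, j in itertools.combinations(range(n), 2):
        dist[i, j] = dist[j, i] = emd_distance(signatures[i], signatures[j])
    graph = nx.Graph()
    graph.add_nodes_from(range(n))
    for i in range(n):
        for j in np.argsort(dist[i])[1:k + 1]:     # index 0 is the image itself
            graph.add_edge(i, int(j), weight=float(dist[i, j]))
    return graph, dist
```

Both the graph and the full distance matrix are returned, because the bisection in step 3 operates on the distances of a subset of images.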

Step 3: build the semantic binary tree of the scene from the nearest-neighbor graph of step 2. The method is as follows.

The root node of the binary tree gathers all annotated images of the scene, and the annotation word representing the scene is the semantic representation of the root node. The nearest-neighbor graph of step 2 is bisected with the N-Cut (Normalized Cut) algorithm, dividing the images into two sets that represent the left subtree and the right subtree of the root node. For each of the two sets, one significant annotation word, other than the annotation word at the root node, is determined, and the attribution of each image is re-determined according to this word. The significant word is found by counting the occurrences of every annotation word in the set and taking the word with the highest count as the significant word. If more than one word has the highest count, the word with the lower word frequency is taken as the significant word.
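A small sketch of the significant-word rule just described, under the assumption that each image carries a plain list of annotation words and that the word frequency used for tie-breaking is the count of the word over the whole scene corpus:

```python
# Illustrative sketch of the significant-word rule (assumption: annotations are
# plain word lists and ties are broken by the lower count over the whole scene).
from collections import Counter

def significant_word(image_words, used_words, corpus_freq):
    """Most frequent word in the set, ignoring words already used on the path;
    among tied words, the one that is rarer in the whole corpus wins."""
    counts = Counter(w for words in image_words for w in words
                     if w not in used_words)
    if not counts:
        return None                                # no significant word: stop splitting
    best = max(counts.values())
    tied = [w for w, c in counts.items() if c == best]
    return min(tied, key=lambda w: corpus_freq[w])
```

For example, with image_words = [['sky', 'sea'], ['sky', 'boat']] and used_words = {'beach'}, the function returns 'sky'.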

The above operation is repeated for the left subtree and the right subtree of the root node until a set contains only one image or contains no significantly occurring annotation word. The leaf nodes at the bottom correspond to images whose annotation words occur with lower frequency.
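The following sketch puts the recursion together, reusing the significant_word helper and the distance matrix from the earlier sketches. It assumes scikit-learn's SpectralClustering with two clusters as a stand-in for the normalized-cut bisection, a Gaussian kernel over the EMD distances as the affinity, and that corpus_freq is a Counter of all annotation words of the scene; how an image carrying both children's words is reassigned is kept deliberately simple and is an assumption.

```python
# Illustrative sketch of step 3 (assumptions: SpectralClustering stands in for
# the normalized cut, a Gaussian kernel over EMD distances is the affinity,
# and images are reassigned to a child simply by carrying its significant word).
import numpy as np
from sklearn.cluster import SpectralClustering

class Node:
    def __init__(self, word, images):
        self.word = word            # significant annotation word at this node
        self.images = images        # indices of the training images gathered here
        self.left = None
        self.right = None

def build_tree(images, word, image_words, dist, corpus_freq, used=frozenset()):
    node = Node(word, images)
    used = used | {word}
    if len(images) < 2:
        return node                                 # a single image ends the branch
    sub = dist[np.ix_(images, images)]              # EMD distances of this subset
    affinity = np.exp(-sub ** 2 / (2 * (sub.mean() + 1e-9) ** 2))
    labels = SpectralClustering(n_clusters=2,
                                affinity="precomputed").fit_predict(affinity)
    halves = [[img for img, lab in zip(images, labels) if lab == s] for s in (0, 1)]
    words = [significant_word([image_words[i] for i in half], used, corpus_freq)
             for half in halves]
    if words[0] is None and words[1] is None:
        return node                                 # no significant word left: leaf
    for w, side in zip(words, ("left", "right")):   # reassign by the two words
        if w is None:
            continue
        members = [i for i in images if w in image_words[i]]
        if members:
            setattr(node, side,
                    build_tree(members, w, image_words, dist, corpus_freq, used))
    return node
```

Because `used` grows at every level, the recursion necessarily stops: either a set shrinks to a single image or no unused word occurs significantly, which matches the stopping rule above.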

Step 4: for an image to be annotated in the scene, find the corresponding position by descending the semantic binary tree from the root node towards the leaf nodes, and pass all annotation words on the path from that node back to the root node to the image.
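A sketch of step 4, reusing the emd_distance helper and the Node tree above, under the assumption that the image to be annotated descends by moving, at each node, to the child whose training images are on average visually closer, and collects the annotation words from the root down to the node where the descent stops; the patent does not spell out the descent rule, so this is purely illustrative.

```python
# Illustrative sketch of step 4 (assumption: the new image follows the child
# whose training images are, on average, visually closer to it).
import numpy as np

def annotate(tree, new_signature, train_signatures):
    """Collect annotation words from the root to the node where the descent stops."""
    words, node = [], tree
    while node is not None:
        words.append(node.word)
        children = [c for c in (node.left, node.right) if c is not None]
        if not children:
            break                                   # reached a leaf
        node = min(children,
                   key=lambda c: np.mean([emd_distance(new_signature,
                                                       train_signatures[i])
                                          for i in c.images]))
    return words
```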

Claims (3)

1. An image annotation method based on a semantic binary tree, characterized in that:
Step 1: for an image set of a specific scene, an image segmentation algorithm is used to segment the annotated images used for learning, and visual descriptions of the image regions are obtained;
Step 2: the visual nearest-neighbor graph of all images used for learning is constructed;
Step 3: the semantic binary tree of said scene is built from the nearest-neighbor graph of step 2;
Step 4: for an image to be annotated in said scene, the corresponding position is found by descending the semantic binary tree from the root node towards the leaf nodes, and all annotation words on the path from that node back to the root node are passed to said image.
2. The image annotation method based on a semantic binary tree according to claim 1, characterized in that the visual nearest-neighbor graph of all images used for learning is constructed as follows: the visual distance between images is the Earth Mover's Distance, a similarity measure based on integrated multi-region matching; each vertex of the graph corresponds to one image, and the edges connecting the vertices correspond to the visual distances between the images.
3. The image annotation method based on a semantic binary tree according to claim 1 or 2, characterized in that the semantic binary tree is built as follows: the root node of the binary tree gathers all annotated images of the scene, and the annotation word representing said scene is the semantic representation of the root node; the nearest-neighbor graph of step 2 is bisected with the normalized cut algorithm, dividing the images into two sets that represent the left subtree and the right subtree of the root node; for each of the two sets a significant annotation word, other than the annotation word at the root node, is determined, and the attribution of each image is re-determined according to this word; the significant word is found by counting the occurrences of every annotation word in the set and taking the word with the highest count as the significant word; if more than one word has the highest count, the word with the lower word frequency is taken as the significant word;
The above operation is repeated for the left subtree and the right subtree of the root node until a set contains only one image or contains no significantly occurring annotation word; the leaf nodes at the bottom correspond to images whose annotation words occur with lower frequency.
CN201110002770A 2011-01-07 2011-01-07 Semantic binary tree-based image annotation method Expired - Fee Related CN102142089B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110002770A CN102142089B (en) 2011-01-07 2011-01-07 Semantic binary tree-based image annotation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110002770A CN102142089B (en) 2011-01-07 2011-01-07 Semantic binary tree-based image annotation method

Publications (2)

Publication Number Publication Date
CN102142089A (en) 2011-08-03
CN102142089B CN102142089B (en) 2012-09-26

Family

ID=44409586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110002770A Expired - Fee Related CN102142089B (en) 2011-01-07 2011-01-07 Semantic binary tree-based image annotation method

Country Status (1)

Country Link
CN (1) CN102142089B (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436583A (en) * 2011-09-26 2012-05-02 哈尔滨工程大学 Image segmentation method based on annotated image learning
CN103365850A (en) * 2012-03-27 2013-10-23 富士通株式会社 Method and device for annotating images
CN103530415A (en) * 2013-10-29 2014-01-22 谭永 Natural language search method and system compatible with keyword search
CN103632388A (en) * 2013-12-19 2014-03-12 百度在线网络技术(北京)有限公司 Semantic annotation method, device and client for image
CN106814162A (en) * 2016-12-15 2017-06-09 珠海华海科技有限公司 A kind of Outdoor Air Quality solution and system
CN108171283A (en) * 2017-12-31 2018-06-15 厦门大学 A kind of picture material automatic describing method based on structuring semantic embedding
CN108182443A (en) * 2016-12-08 2018-06-19 广东精点数据科技股份有限公司 A kind of image automatic annotation method and device based on decision tree
WO2019021088A1 (en) * 2017-07-24 2019-01-31 International Business Machines Corporation Navigating video scenes using cognitive insights
CN110199525A (en) * 2017-01-18 2019-09-03 Pcms控股公司 For selecting scene with the system and method for the browsing history in augmented reality interface
CN110288019A (en) * 2019-06-21 2019-09-27 北京百度网讯科技有限公司 Image labeling method, device and storage medium
CN110413820A (en) * 2019-07-12 2019-11-05 深兰科技(上海)有限公司 A kind of acquisition methods and device of picture description information
US10916013B2 (en) 2018-03-14 2021-02-09 Volvo Car Corporation Method of segmentation and annotation of images
CN112347278A (en) * 2019-10-25 2021-02-09 北京沃东天骏信息技术有限公司 Method and apparatus for training a characterization model
US11100366B2 (en) 2018-04-26 2021-08-24 Volvo Car Corporation Methods and systems for semi-automated image segmentation and annotation

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1936892A (en) * 2006-10-17 2007-03-28 浙江大学 Image content semanteme marking method

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1936892A (en) * 2006-10-17 2007-03-28 浙江大学 Image content semanteme marking method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
IEEE, 2009-12-31, Lixing Jiang et al., "Automatic Image Annotation Based on Decision Tree Machine Learning" *
智能系统学报 (CAAI Transactions on Intelligent Systems), 2010-02-28, Liu Yongmei et al., "K-means Image Segmentation Based on Spatial Position Constraints", Vol. 5, No. 1 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436583A (en) * 2011-09-26 2012-05-02 哈尔滨工程大学 Image segmentation method based on annotated image learning
CN103365850A (en) * 2012-03-27 2013-10-23 富士通株式会社 Method and device for annotating images
CN103365850B (en) * 2012-03-27 2017-07-14 富士通株式会社 Image labeling method and image labeling device
CN103530415A (en) * 2013-10-29 2014-01-22 谭永 Natural language search method and system compatible with keyword search
CN103632388A (en) * 2013-12-19 2014-03-12 百度在线网络技术(北京)有限公司 Semantic annotation method, device and client for image
CN108182443A (en) * 2016-12-08 2018-06-19 广东精点数据科技股份有限公司 A kind of image automatic annotation method and device based on decision tree
CN108182443B (en) * 2016-12-08 2020-08-07 广东精点数据科技股份有限公司 Automatic image labeling method and device based on decision tree
CN106814162A (en) * 2016-12-15 2017-06-09 珠海华海科技有限公司 A kind of Outdoor Air Quality solution and system
US11663751B2 (en) 2017-01-18 2023-05-30 Interdigital Vc Holdings, Inc. System and method for selecting scenes for browsing histories in augmented reality interfaces
CN110199525B (en) * 2017-01-18 2021-12-14 Pcms控股公司 System and method for browsing history records in augmented reality interface
CN110199525A (en) * 2017-01-18 2019-09-03 Pcms控股公司 For selecting scene with the system and method for the browsing history in augmented reality interface
WO2019021088A1 (en) * 2017-07-24 2019-01-31 International Business Machines Corporation Navigating video scenes using cognitive insights
US10970334B2 (en) 2017-07-24 2021-04-06 International Business Machines Corporation Navigating video scenes using cognitive insights
CN108171283B (en) * 2017-12-31 2020-06-16 厦门大学 Image content automatic description method based on structured semantic embedding
CN108171283A (en) * 2017-12-31 2018-06-15 厦门大学 A kind of picture material automatic describing method based on structuring semantic embedding
US10916013B2 (en) 2018-03-14 2021-02-09 Volvo Car Corporation Method of segmentation and annotation of images
US11100366B2 (en) 2018-04-26 2021-08-24 Volvo Car Corporation Methods and systems for semi-automated image segmentation and annotation
CN110288019A (en) * 2019-06-21 2019-09-27 北京百度网讯科技有限公司 Image labeling method, device and storage medium
CN110413820B (en) * 2019-07-12 2022-03-29 深兰科技(上海)有限公司 Method and device for acquiring picture description information
CN110413820A (en) * 2019-07-12 2019-11-05 深兰科技(上海)有限公司 A kind of acquisition methods and device of picture description information
CN112347278A (en) * 2019-10-25 2021-02-09 北京沃东天骏信息技术有限公司 Method and apparatus for training a characterization model

Also Published As

Publication number Publication date
CN102142089B (en) 2012-09-26

Similar Documents

Publication Publication Date Title
CN102142089A (en) Semantic binary tree-based image annotation method
Goëau et al. Pl@ntNet mobile app
CN102012939B (en) Method for automatically tagging animation scenes for matching through comprehensively utilizing overall color feature and local invariant features
CN102768670B (en) Webpage clustering method based on node property label propagation
CN115294150A (en) Image processing method and terminal equipment
CN101963995A (en) Image marking method based on characteristic scene
CN105389326B (en) Image labeling method based on weak matching probability typical relevancy models
Weyand et al. Visual landmark recognition from internet photo collections: A large-scale evaluation
CN103425757A (en) Cross-medial personage news searching method and system capable of fusing multi-mode information
CN104050682A (en) Image segmentation method fusing color and depth information
CN112464328A (en) Deep design method and system based on BIM technology and atlas
Cai et al. A comparative study of deep learning approaches to rooftop detection in aerial images
CN110377659A (en) A kind of intelligence chart recommender system and method
Lynen et al. Trajectory-based place-recognition for efficient large scale localization
Li et al. Multi-label pattern image retrieval via attention mechanism driven graph convolutional network
CN104392439A (en) Image similarity confirmation method and device
Kuric et al. ANNOR: Efficient image annotation based on combining local and global features
CN104063701A (en) Rapid television station caption recognition system based on SURF vocabulary tree and template matching and implementation method of rapid television station caption recognition system
Huang et al. Improved small-object detection using YOLOv8: A comparative study
JP2012022419A (en) Learning data creation device, learning data creation method, and program
CN107480693A (en) Condition random field framework is embedded in the Weakly supervised image scene understanding method of registration information
CN110766045A (en) Underground drainage pipeline disease identification method, intelligent terminal and storage medium
CN103744903A (en) Sketch based scene image retrieval method
CN105574535A (en) Graphic symbol identification method based on indirect distance angle histogram space relation expression model
Wang et al. GOReloc: Graph-Based Object-Level Relocalization for Visual SLAM

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 2012-09-26

Termination date: 2018-01-07
