CN102880612B - Image annotation method and device thereof - Google Patents

Image annotation method and device thereof

Info

Publication number
CN102880612B
CN102880612B (application CN201110197235.0A)
Authority
CN
China
Prior art keywords
image
input image
similarity
label
comparison
Prior art date
Legal status (assumed; not a legal conclusion)
Active
Application number
CN201110197235.0A
Other languages
Chinese (zh)
Other versions
CN102880612A (en)
Inventor
曹琼
刘汝杰
于浩
Current Assignee (may be inaccurate)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to CN201110197235.0A priority Critical patent/CN102880612B/en
Publication of CN102880612A publication Critical patent/CN102880612A/en
Application granted granted Critical
Publication of CN102880612B publication Critical patent/CN102880612B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An embodiment of the invention provides an image annotation method and a device thereof. The image annotation method comprises the steps of: obtaining, for an input image, an initial label set comprising a plurality of labels; calculating the label-set-based similarity between the label set of the input image and the label set of a comparison image stored in a database; merging the label-set-based similarity with the vision-based similarity to obtain the merged similarity between the input image and the comparison image; and updating the label set of the input image based on the merged similarity. According to the embodiment of the invention, the low-level features and high-level semantics of the image can be considered at the same time, so the precision of image annotation is improved; moreover, automatic labeling is realized and annotation efficiency is improved.

Description

Image labeling method and device
Technical Field
The invention relates to the field of image classification and retrieval, in particular to an image labeling method and an image labeling device.
Background
With the development of computer networks and multimedia technology, the amount of multimedia information available on the Internet has grown rapidly. This proliferation provides users with rich resources, but at the same time, quickly and effectively finding resources of interest in such massive amounts of information poses a huge challenge. Image classification and retrieval techniques are therefore gaining increasing attention.
Content-Based Image Retrieval (CBIR) has been widely studied since its introduction in the 1990s. By indexing the visual content features of an image itself (low-level features such as color, texture, shape, and spatial layout), other images with similar visual characteristics can be retrieved, so that images can be directly compared and retrieved based on a visual similarity computed from their low-level features.
However, because an image is described by its low-level visual features, and these features have no uniform, rule-based correlation with a person's subjective judgment of the image's high-level semantics, completely different types of images may well have similar low-level features. Direct comparison based on visual similarity therefore often fails to produce accurate retrieval results.
On the other hand, methods have appeared that label images using Text-Based Image Retrieval (TBIR) techniques. Images similar to the image to be labeled are found through low-level features, and the labels of those similar images are assigned to the image to be labeled, so that image vision and related text information can be combined for retrieval.
However, in the process of implementing the invention, the inventors found that the prior art has the following defects: because of the gap between the low-level features and the high-level semantics of images, the accuracy of image annotation is low; and if images are labeled only through human-computer interaction or manually, the efficiency is low and the burden on the user is heavy.
Disclosure of Invention
The embodiment of the invention provides an image annotation method and a device thereof, aiming at simultaneously considering the low-level characteristics and high-level semantics of an image and improving the accuracy of image annotation; and moreover, automatic labeling of the label is realized, and the labeling efficiency is improved.
According to an aspect of an embodiment of the present invention, there is provided an image annotation method, including:
obtaining an initial tag set comprising a plurality of tags for an input image, wherein the accuracy with which the semantics of the input image are represented is determined from the plurality of tags;
calculating a tagset-based similarity between the tagset of the input image and the tagsets of the comparison images stored in the database;
performing a merging calculation on the similarity based on the label set and the similarity based on vision to obtain a merged similarity of the input image and the comparison image;
updating the label set of the input image based on the merged similarity.
According to another aspect of the embodiments of the present invention, there is provided an image annotation apparatus including:
an initializer for obtaining an initial tag set for an input image, the tag set comprising a plurality of tags, from which the accuracy with which the semantics of the input image are represented is determined;
a relation calculator that calculates a similarity based on a tag set between the tag set of the input image and a tag set of a comparison image stored in a database;
a merging calculator which performs merging calculation on the similarity based on the label set and the similarity based on vision to obtain a merged similarity of the input image and the comparison image;
a tag set updater that updates a tag set of the input image based on the merged similarity.
The embodiment of the invention has the advantages that the low-level characteristics and high-level semantics of the image can be considered simultaneously by combining the similarity based on the label set and the similarity based on the vision, so that the accuracy of image annotation is improved; and moreover, automatic labeling of the label is realized, and the labeling efficiency is improved.
Features that are described and/or illustrated with respect to one embodiment may be used in one or more other embodiments, in combination with or instead of the features of the other embodiments, by the same or similar methods.
It should be emphasized that the terms "comprises" and "comprising," when used in this specification, are taken to specify the presence of stated features, integers, steps or components but do not preclude the presence or addition of one or more other features, integers, steps, components or groups thereof.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
FIG. 1 is a flowchart of an image annotation method according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of an annotated image in an embodiment of the invention;
FIG. 3 is a schematic diagram of obtaining an initial set of tags according to an embodiment of the present invention;
FIG. 4 is a flowchart of an image annotation method according to an embodiment of the invention;
FIG. 5 is a diagram illustrating an iterative process of an image annotation method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an image annotation apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of another configuration of the image annotation device in the embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
An embodiment of the present invention provides an image annotation method, and fig. 1 is a flowchart of the image annotation method according to the embodiment of the present invention. As shown in fig. 1, the image annotation method includes:
step 101, obtaining an initial label set comprising a plurality of labels for an input image, wherein the accuracy with which the semantics of the input image are represented is determined from the plurality of labels;
step 102, calculating the label-set-based similarity between the label set of the input image and the label set of a comparison image stored in the database;
step 103, merging the label-set-based similarity and the vision-based similarity to obtain the merged similarity of the input image and the comparison image;
step 104, updating the label set of the input image based on the merged similarity.
In this embodiment, each image may be labeled with a label set, the label set may include a plurality of labels, and the accuracy of representing the semantics of the input image may be determined according to the arrangement order of the labels, for example, the accuracy of the label arranged in the front is higher than that of the label arranged in the back.
FIG. 2 is a schematic diagram of an annotated image in an embodiment of the invention. As shown in fig. 2, the image corresponds to a label set comprising four labels: { golden, gate, bridge, sights }. As can be seen from the order of the labels in fig. 2, the accuracy of golden is greater than that of gate, the accuracy of gate is greater than that of bridge, and the accuracy of bridge is greater than that of sights.
In a specific implementation, a weight may also be given to each label, for example, 60 for golden, 52 for gate, 48 for bridge, and 30 for sights, with the weight reflecting the accuracy with which the label expresses the semantics. It should be noted that the above is merely an illustrative example of a label set; the specific implementation can be determined according to the actual situation.
In one embodiment, the obtaining an initial tag set for the input image may specifically include: a set of tags is randomly assigned.
In another embodiment, the initial tag set of the input image may be obtained using a visual-similarity-based approach: a vision-based similarity between the input image and the comparison images stored in the database is calculated, and an initial tag set of the input image is obtained based on the vision-based similarity.
A large number of comparison images that have already been annotated can be stored in the database. After the vision-based similarity is calculated, the first few comparison images closest to the input image in vision-based similarity may be selected, and an initial label set of the input image may be obtained based on the labels of these comparison images. For example, the collected tags of those comparison images may be employed as the initial tag set; alternatively, the initial tag set may be obtained by voting, as described below.
FIG. 3 is a schematic diagram of obtaining an initial set of tags according to an embodiment of the present invention. As shown in fig. 3, based on the visual similarity, a plurality of comparison images that are relatively close in visual similarity may be found in the database for the input image; then Voting (Voting) or statistics is carried out according to the searched label set of the comparison image, and an initial label set of the input image can be obtained.
The above is merely an illustrative description of how to obtain the initial tag set, but is not limited thereto. As for the specific implementation of how to calculate the similarity based on the vision, how to search and compare images according to the similarity based on the vision, how to perform voting, and the like, the prior art can be adopted, and details are not repeated here.
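The voting-based initialization described above can be sketched as follows. This is a minimal illustration, assuming the visual similarities to the database images have already been computed; the `(similarity, tags)` input format and the cutoffs `k` and `n_tags` are illustrative assumptions, not prescribed by the patent.

```python
from collections import Counter

def initial_tag_set(neighbors, k=3, n_tags=4):
    """Vote an initial tag set from (visual_similarity, tag_list) pairs.

    `neighbors` are database images scored by some low-level visual
    similarity; only the k visually closest ones get to vote.
    """
    ranked = sorted(neighbors, key=lambda n: n[0], reverse=True)
    votes = Counter()
    for _, tags in ranked[:k]:
        votes.update(tags)  # each close neighbor votes for all of its tags
    # most-voted tags first: the order encodes annotation accuracy
    return [tag for tag, _ in votes.most_common(n_tags)]
```

For instance, three visually close neighbors tagged with variations of golden/gate/bridge/sights would yield an initial set ordered by vote count.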
In this embodiment, after the input image is initialized with an initial tag set, the relationship between the tag set of the input image and the tag set of the comparison image may be calculated, so as to obtain the tag-set-based similarity.
In one embodiment, calculating the relationship between the tag set of the input image and the tag set of the comparison image stored in the database may specifically include: calculating the ratio of the intersection of the tag set of the input image and the tag set of the comparison image to the union of the two tag sets; and determining the tag-set-based similarity between the tag set of the input image and the tag set of the comparison image stored in the database according to the obtained ratio.
For example, the initial label set of the input image shown in FIG. 2 is { golden, gate, bridge, sights }. Suppose the label set of a comparison image is { golden, gate, bridge, 2006 }: the intersection contains 3 tags and the union contains 5, so the tag-set-based similarity is 3/5. Suppose instead the label set of the comparison image is { facade, sanfrancisco, bridge, gold }: since tags are matched as exact strings (gold and golden are different tags), only bridge is shared, the union contains 7 tags, and the similarity is 1/7.
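A minimal sketch of this intersection-over-union computation (exact string matching, which is why gold and golden count as different tags in the example above):

```python
def tagset_similarity(tags_a, tags_b):
    """Tag-set similarity: |intersection| / |union| of the two tag sets."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b)
```

This reproduces the 3/5 and 1/7 values of the example.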
In another embodiment, calculating the relationship between the tag set of the input image and the tag set of the comparison image stored in the database may specifically include: calculating semantic distances between the label sets of the input image and the comparison image; determining a similarity based on the tag set between the tag set of the input image and the tag set of the comparison image stored in the database according to the calculated semantic distance.
For example, the initial label set of the input image shown in FIG. 2 is { golden, gate, bridge, sights }, and suppose the label set of the comparison image is { golden, gate, bridge, 2006 }. First, the semantic closeness between each input-image label and each comparison-image label can be computed (1 for identical labels, lower values for semantically distant ones), giving the matrix shown in the following table:
TABLE 1
         golden  gate  bridge  sights
golden   1       0     0.1     0
gate     0       1     0.1     0.3
bridge   0.1     0.1   1       0.6
2006     0       0     0       0
Then, based on this matrix, the optimal one-to-one correspondence is found, which can be realized by a greedy matching method or by the Munkres optimal matching (Hungarian) algorithm. For example, using the Munkres method, the obtained correspondence among the labels is golden-golden, gate-gate, bridge-bridge, and 2006-sights, and the tag-set similarity is (1+1+1+0)/4 = 3/4.
The above is only an illustrative description of how to calculate the similarity based on the tag set, but is not limited to this, for example, a weight may be added, or a specific implementation may be determined according to actual situations by using an existing method for calculating the similarity.
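The matching-based similarity can be sketched with the simpler greedy strategy mentioned above (the Munkres algorithm would give the optimal matching). Here `pair_sim` stands in for whatever semantic tag-to-tag similarity is used, with 1.0 for identical tags; it is an assumed interface, not the patent's API.

```python
def matching_similarity(tags_a, tags_b, pair_sim):
    """Greedily match tags one-to-one, highest pairwise similarity first,
    then average the matched scores over the larger tag set."""
    pairs = sorted(((pair_sim(a, b), a, b) for a in tags_a for b in tags_b),
                   reverse=True)
    used_a, used_b = set(), set()
    total = 0.0
    for s, a, b in pairs:
        if a not in used_a and b not in used_b:
            used_a.add(a)
            used_b.add(b)
            total += s
    return total / max(len(tags_a), len(tags_b))
```

With the Table 1 values this yields (1+1+1+0)/4 = 3/4, the same result as the optimal matching in this example.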
In this embodiment, after the tag-set-based similarity is obtained, it may be merged with the vision-based similarity to obtain the merged similarity of the input image and the comparison image.
For example, the tag-set-based similarity and the vision-based similarity may each be weighted and then added (or multiplied). Assuming the tag-set-based similarity is 1/7, the vision-based similarity is 1/3, and the weights are 3 and 2 respectively, the merged similarity may be: (1/7) × 3 + (1/3) × 2 = 23/21.
The above description is only for illustrative purposes of how to calculate the merging similarity, but is not limited thereto, and for example, an existing calculation method may be adopted, and a specific implementation may be determined according to actual situations.
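The weighted-sum variant of the merging calculation can be sketched as follows; the default weights are the ones from the example and are illustrative, not prescribed.

```python
def merged_similarity(tag_sim, visual_sim, w_tag=3.0, w_vis=2.0):
    """Merge tag-set-based and vision-based similarity as a weighted sum.

    The patent leaves the exact combination open (weighted sum or
    product); this sketch implements the weighted sum of the example.
    """
    return tag_sim * w_tag + visual_sim * w_vis
```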
In this embodiment, after the merged similarity is obtained from the tag-set-based similarity and the vision-based similarity, updating the tag set of the input image based on the merged similarity may specifically include: if the merged similarity is greater than a preset value, counting the number of times each tag appears in the tag set of the comparison image; and adjusting the tags in the tag set of the input image according to the statistics, so as to update the tag set of the input image.
In the embodiment, the similarity based on the label set and the similarity based on the vision can be combined, so that the low-level features and the high-level semantics of the image can be considered simultaneously, and the accuracy of image annotation is improved. To improve accuracy, multiple iterations may be performed.
FIG. 4 is a flowchart of an image annotation method according to an embodiment of the invention. As shown in fig. 4, the image annotation method includes:
step 401, obtaining an initial label set comprising a plurality of labels for an input image, wherein the accuracy of representing the semantics of the input image is determined according to the plurality of labels;
step 402, calculating the similarity based on the label set between the label set of the input image and the label set of the comparison image stored in the database;
step 403, merging and calculating the similarity based on the label set and the similarity based on the vision to obtain the merged similarity of the input image and the comparison image;
step 404, judging whether the merged similarity is greater than a preset value; if so, executing step 405, otherwise executing step 406;
step 405, counting the number of times each label appears in the label set of the comparison image, and adjusting the labels in the label set of the input image according to the statistics so as to update the label set;
step 406, judging whether a preset condition has been reached;
if the preset condition is not reached, selecting another comparison image from the database and returning to step 402; if the preset condition is reached, ending the image annotation process.
In this embodiment, when selecting the comparison image, only one comparison image may be selected. Multiple comparison images can also be selected; in step 405, statistics may be performed based on the label sets of the plurality of comparison images, and thus the results may be accumulated, further improving accuracy.
In this embodiment, the step 406 of determining whether the preset condition is reached may specifically include: judging whether the label set of the input image is the same as or similar to the label set of the previous iteration; if the two are the same or similar, the label set of the input image reaches a stable state, and the preset condition is determined to be reached. The preset condition may also be a preset number of iterations or an iteration time, and the specific iteration condition may be determined according to an actual situation.
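The stopping rule of step 406 can be written as a small predicate; the iteration budget `max_iterations` is one of the alternative preset conditions mentioned above, and the function name is an illustrative choice.

```python
def reached_preset_condition(tag_set, prev_tag_set, iteration, max_iterations):
    """True when the tag set is stable across iterations (same tags as the
    previous round) or when the iteration budget is exhausted."""
    return set(tag_set) == set(prev_tag_set) or iteration >= max_iterations
```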
FIG. 5 is a schematic diagram of an iterative process of an image annotation method according to an embodiment of the present invention. As shown in fig. 5, the input image is labeled with a label set having a plurality of labels, and the weight of the label may correspond to a statistical histogram, which represents the accuracy of the semantics of the input image. A plurality of comparison images may be selected based on the similarity based on the tagset and the similarity based on the vision, votes may be cast based on the tagsets of the comparison images, and the tagset of the input image may be updated. And the updating process can be iterated for multiple times, so that the accuracy of the tag set is further improved.
How the tag set is updated is further explained by example below. Suppose the tag set of the input image is { gate, golden, bridge, sights }. After the merged similarity is calculated, 5 comparison images with merged similarity greater than the preset value are obtained, and their label sets are, respectively: { bridge, 2006, favorite }, { sanfrancisco, bridge, golden }, { bridge, traffic, sanfrancisco, gate }, { bridge, golden, sanfrancisco, favorite }, { gate, bridge, favorite }.
The number of occurrences of each label can then be counted over the label sets of these 5 images. The statistics are shown in the following table:
TABLE 2
Label          Count
bridge         5
favorite       3
sanfrancisco   3
gate           2
golden         2
2006           1
traffic        1
The 4 tags with the most votes can be selected as the new tags to update the tag set of the input image, i.e. the tag set of the input image is updated to { bridge, favorite, sanfrancisco, gate }. The above explains only one iteration; the process may be iterated multiple times.
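The vote-counting update of this example can be sketched as follows. Note that gate and golden are tied at 2 votes, so the fourth selected tag depends on the tie-breaking rule; this sketch breaks ties by first occurrence.

```python
from collections import Counter

def update_tag_set(comparison_tagsets, n_tags=4):
    """Count how often each tag occurs in the tag sets of the comparison
    images whose merged similarity exceeded the preset value, then keep
    the n_tags most-voted tags (ties broken by first occurrence)."""
    votes = Counter()
    for tags in comparison_tagsets:
        votes.update(tags)
    return [tag for tag, _ in votes.most_common(n_tags)]
```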
In addition, in this embodiment, besides being used to calculate the tag similarity, the tag set obtained in the previous round of the iteration may also participate in the process of obtaining the new tags. Moreover, the vote counts can be normalized, further improving the accuracy of the update.
For example, as shown in fig. 3, the 4 tags with the most votes are the initial tags { gate, golden, bridge, sights }, with 4, 3, 3, and 2 votes respectively. The vote counts may be normalized, for example so that their sum is 1; the normalized result is: {4, 3, 3, 2}/12 = {0.33, 0.25, 0.25, 0.17}.
The merged similarity may then be calculated based on the tag set { gate, golden, bridge, sights } and the vision-based similarity. According to the merged similarity, 5 comparison images with merged similarity greater than the preset value are obtained, and their label sets are, respectively: { bridge, 2006, favorite, ca }, { sanfrancisco, bridge, golden }, { bridge, traffic, gate }, { bridge, golden, sanfrancisco, favorite }, { gate, bridge, favorite }.
Then in the tag set of these 5 images, the number of occurrences of each tag can be counted as shown in the following table:
TABLE 3
Label          Count
bridge         5
favorite       3
sanfrancisco   2
gate           2
golden         2
2006           1
traffic        1
ca             1
Similarly, the vote counts can be normalized so that they sum to 1: {5, 3, 2, 2, 2, 1, 1, 1}/17 = {0.294, 0.177, 0.118, 0.118, 0.118, 0.059, 0.059, 0.059}. The current vote counts and the previous vote counts can then be added with different weights. Assuming the weight of the previous vote counts is a (0 < a < 1) and the weight of this round's vote counts is 1 - a, the results shown in the following table can be obtained:
TABLE 4
Label          Previous votes   This-round votes   Sum (a = 0.5)
gate           0.33             0.118              0.224
golden         0.25             0.118              0.184
bridge         0.25             0.294              0.272
sights         0.17             0                  0.085
favorite       0                0.177              0.0885
sanfrancisco   0                0.118              0.0590
2006           0                0.059              0.0295
traffic        0                0.059              0.0295
ca             0                0.059              0.0295
Thus, the four tags with the largest results can be selected, { gate, golden, bridge, favorite }, to update the tag set of the input image. Meanwhile, the weights of these 4 tags can be normalized: from {0.224, 0.184, 0.272, 0.0885}, the normalized weights {0.29, 0.24, 0.35, 0.12} are obtained for use in the next round of the iteration.
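The cross-round accumulation above can be sketched as follows; the dictionary bookkeeping and the function name are illustrative assumptions.

```python
def accumulate_votes(prev_weights, round_votes, a=0.5):
    """Blend the previous round's normalized tag weights with this round's
    votes: score = a * previous + (1 - a) * current, with the current
    votes first normalized to sum to 1 (tags absent in a round count 0)."""
    total = sum(round_votes.values())
    current = {tag: count / total for tag, count in round_votes.items()}
    tags = set(prev_weights) | set(current)
    return {tag: a * prev_weights.get(tag, 0.0)
                 + (1 - a) * current.get(tag, 0.0)
            for tag in tags}
```

With the Table 4 data and a = 0.5, the four highest scores belong to gate, golden, bridge, and favorite, matching the updated tag set above.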
According to the embodiment, the similarity based on the label set and the similarity based on the vision are combined, so that the low-level features and the high-level semantics of the image can be considered at the same time, and the accuracy of image annotation is improved; and moreover, automatic labeling of the label is realized, and the labeling efficiency is improved.
An embodiment of the present invention further provides an image annotation apparatus, and fig. 6 is a schematic diagram of a structure of the image annotation apparatus in the embodiment of the present invention. As shown in fig. 6, the image labeling apparatus includes: an initializer 601, a relation calculator 602, a merging calculator 603 and a tag set updater 604; wherein,
the initializer 601 acquires an initial label set including a plurality of labels for the input image, wherein the accuracy of representing the semantics of the input image is determined according to the plurality of labels;
the relation calculator 602 calculates a similarity based on the tag set between the tag set of the input image and the tag set of the comparison image stored in the database;
the merging calculator 603 performs merging calculation on the similarity based on the tag set and the similarity based on the vision to obtain a merged similarity of the input image and the comparison image;
the label set updater 604 updates the label set of the input image based on the merged similarity.
In particular, the initializer 601 may be specifically configured to: randomly assigning an initial tag set; or calculating a vision-based similarity of the input image and a comparison image stored in a database; an initial set of labels for the input image is obtained based on the vision-based similarity.
In one embodiment, the relationship calculator 602 may specifically include: a set calculator and a relationship determiner. The set calculator calculates the ratio of the intersection of the tag set of the input image and the tag set of the comparison image to the union of the two tag sets; the relationship determiner determines the tag-set-based similarity between the tag set of the input image and the tag set of the comparison image stored in the database based on the calculated ratio.
In another embodiment, the relationship calculator 602 may specifically include: a distance calculator and a relationship determiner. Wherein the distance calculator calculates a semantic distance between the set of labels of the input image and the set of labels of the comparison image; a relationship determiner determines a tag set-based similarity between the tag set of the input image and the tag set of the comparison image stored in the database according to the calculated semantic distance.
In a specific implementation, the tag set updater 604 may specifically include: a counter and a tag adjuster. The counter counts the number of times each tag appears in the tag set of the comparison image when the merged similarity obtained by the merging calculator 603 is greater than a preset value; and the tag adjuster adjusts the tags in the tag set of the input image according to the statistics, so as to update the tag set of the input image.
FIG. 7 is a schematic diagram of another configuration of the image annotation device in the embodiment of the invention. As shown in fig. 7, the image labeling apparatus includes: an initializer 701, a relation calculator 702, a merging calculator 703 and a tag set updater 704; as mentioned above, no further description is provided herein.
As shown in fig. 7, the image annotation apparatus may further include: a condition determiner 705 and an image selector 706; the condition judger 705 is configured to judge whether a preset condition is reached; when the condition determiner 705 determines that the preset condition is not reached, the image selector 706 selects another comparative image from the database. Also, the relationship calculator 702 is further configured to calculate a similarity based on the tag set between the tag set of the input image and the tag sets of the other comparison images.
In one embodiment, the condition decider 705 may be specifically configured to: judging whether the label set of the input image is the same as or similar to the label set of the previous iteration; if the two are the same or similar, the label set of the input image reaches a stable state and is determined to reach a preset condition.
According to the embodiment, the similarity based on the label set and the similarity based on the vision are combined, so that the low-level features and the high-level semantics of the image can be considered at the same time, and the accuracy of image annotation is improved; and moreover, automatic labeling of the label is realized, and the labeling efficiency is improved.
It will be further appreciated by those of ordinary skill in the art that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, computer software, or combinations of both, and that the components and steps of the examples have been described in functional terms in the foregoing description for the purpose of clearly illustrating the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), flash memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
With regard to the embodiments including the above embodiments, the following remarks are also disclosed.
(supplementary note 1) an image annotation method comprising:
obtaining an initial tag set comprising a plurality of tags for an input image, wherein the accuracy with which the semantics of the input image are represented is determined from the plurality of tags;
calculating a tagset-based similarity between the tagset of the input image and the tagsets of the comparison images stored in the database;
performing a merging calculation on the similarity based on the label set and the similarity based on vision to obtain a merged similarity of the input image and the comparison image;
updating the label set of the input image based on the merged similarity.
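The merging calculation above is not tied to a particular formula in these notes; one straightforward realization (an assumption here, not the patent's stated method) is a convex combination of the two similarities, with a purely illustrative weighting parameter `alpha`:

```python
def merged_similarity(label_sim: float, visual_sim: float, alpha: float = 0.5) -> float:
    """Merge a label-set-based similarity with a vision-based similarity.

    `alpha` is a hypothetical weight (not specified in the patent);
    alpha = 1.0 would use only the label-set-based similarity.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * label_sim + (1.0 - alpha) * visual_sim
```

With equal weights, an input/comparison pair with label-set similarity 0.8 and visual similarity 0.4 would receive a merged similarity of 0.6.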
(supplementary note 2) The image annotation method according to supplementary note 1, further comprising:
judging whether a preset condition is reached;
if the preset condition is not reached, selecting another comparison image from the database, and calculating the label-set-based similarity between the label set of the input image and the label set of the other comparison image.
(supplementary note 3) The image annotation method according to supplementary note 2, wherein the step of judging whether the preset condition is reached specifically comprises:
judging whether the label set of the input image is the same as or similar to the label set from the previous iteration; and if the two are the same or similar, determining that the preset condition is reached.
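A minimal sketch of the convergence test of supplementary note 3, assuming "similar" means that the overlap between the label sets of consecutive iterations exceeds a hypothetical threshold (the notes leave the criterion unspecified):

```python
def has_converged(current: set[str], previous: set[str], threshold: float = 0.9) -> bool:
    """Return True when the label set has stabilized across iterations.

    `threshold` is a hypothetical cutoff on the Jaccard overlap of the
    two sets; identical sets always count as converged.
    """
    if current == previous:
        return True
    union = current | previous
    if not union:  # both empty: treat as converged
        return True
    overlap = len(current & previous) / len(union)
    return overlap >= threshold
```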
(supplementary note 4) The image annotation method according to supplementary note 1 or 2, wherein calculating the label-set-based similarity between the label set of the input image and the label set of a comparison image stored in the database specifically comprises:
calculating the ratio of the intersection of the label set of the input image and the label set of the comparison image to the union of the two label sets;
determining the label-set-based similarity between the label set of the input image and the label set of the comparison image stored in the database according to the obtained ratio.
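The intersection-to-union ratio of supplementary note 4 is the Jaccard index; a minimal sketch (the empty-set convention is an assumption, since the notes do not address that edge case):

```python
def jaccard_similarity(labels_a: set[str], labels_b: set[str]) -> float:
    """Ratio of the intersection to the union of two label sets.

    Returns 0.0 when both sets are empty (a conventional choice not
    stated in the patent).
    """
    union = labels_a | labels_b
    if not union:
        return 0.0
    return len(labels_a & labels_b) / len(union)
```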
(supplementary note 5) The image annotation method according to supplementary note 1 or 2, wherein calculating the label-set-based similarity between the label set of the input image and the label set of a comparison image stored in the database specifically comprises:
calculating the semantic distance between the label set of the input image and the label set of the comparison image;
and determining the label-set-based similarity between the label set of the input image and the label set of the comparison image stored in the database according to the obtained semantic distance.
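Supplementary note 5 leaves the semantic distance unspecified; a common realization (an assumption here) averages pairwise word distances, for example WordNet path distances or embedding distances, and maps the result to a similarity. The distance table below is a purely illustrative stand-in for such a resource:

```python
from itertools import product

# Hypothetical pairwise word distances (stand-ins for WordNet path
# distances or embedding distances); symmetric by construction.
WORD_DISTANCE = {
    frozenset({"sea", "ocean"}): 0.1,
    frozenset({"sea", "car"}): 0.9,
    frozenset({"ocean", "car"}): 0.9,
}

def word_distance(a: str, b: str) -> float:
    if a == b:
        return 0.0
    return WORD_DISTANCE.get(frozenset({a, b}), 1.0)  # unknown pairs: maximal

def semantic_distance(labels_a: set[str], labels_b: set[str]) -> float:
    """Average pairwise word distance between the two label sets."""
    if not labels_a or not labels_b:
        return 1.0  # maximal distance for an empty set (a convention)
    pairs = list(product(labels_a, labels_b))
    return sum(word_distance(a, b) for a, b in pairs) / len(pairs)

def semantic_similarity(labels_a: set[str], labels_b: set[str]) -> float:
    """Map the semantic distance into a [0, 1] similarity."""
    return 1.0 - semantic_distance(labels_a, labels_b)
```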
(supplementary note 6) The image annotation method according to supplementary note 1 or 2, wherein acquiring the initial label set for the input image specifically comprises: randomly assigning the initial label set; or
calculating a vision-based similarity between the input image and a comparison image stored in the database, and obtaining the initial label set according to the vision-based similarity.
(supplementary note 7) The image annotation method according to supplementary note 1 or 2, wherein updating the label set of the input image based on the merged similarity specifically comprises:
if the merged similarity is larger than a preset value, counting the number of times each label appears in the label set of the comparison image;
and adjusting the labels in the label set of the input image according to the counting result, so as to update the label set of the input image.
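A sketch of the update step of supplementary note 7, assuming the updated label set keeps the most frequent labels drawn from comparison images whose merged similarity exceeds the preset value; `threshold` and `top_k` are hypothetical parameters, not values given in the patent:

```python
from collections import Counter

def update_label_set(
    comparisons: list[tuple[float, set[str]]],  # (merged similarity, label set)
    threshold: float = 0.6,
    top_k: int = 5,
) -> set[str]:
    """Count labels over sufficiently similar comparison images and keep
    the `top_k` most frequent ones as the updated label set."""
    counts: Counter[str] = Counter()
    for similarity, labels in comparisons:
        if similarity > threshold:
            counts.update(labels)
    return {label for label, _ in counts.most_common(top_k)}
```

For example, with comparison images labeled {sky, sea}, {sea}, and {car} at merged similarities 0.9, 0.8, and 0.2, only the first two pass the threshold and "sea" is the most frequent surviving label.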
(supplementary note 8) An image annotation apparatus, comprising:
an initializer for obtaining an initial label set for an input image, the label set comprising a plurality of labels, wherein the accuracy with which the semantics of the input image are represented is determined by the plurality of labels;
a relation calculator that calculates a label-set-based similarity between the label set of the input image and the label set of a comparison image stored in a database;
a merging calculator that merges the label-set-based similarity and a vision-based similarity to obtain a merged similarity between the input image and the comparison image; and
a label set updater that updates the label set of the input image based on the merged similarity.
(supplementary note 9) The image annotation apparatus according to supplementary note 8, further comprising:
a condition judger for judging whether a preset condition is reached;
an image selector for selecting another comparison image from the database if the preset condition is not reached;
and the relation calculator is further configured to calculate the label-set-based similarity between the label set of the input image and the label set of the other comparison image.
(supplementary note 10) The image annotation apparatus according to supplementary note 9, wherein the condition judger is configured to: judge whether the label set of the input image is the same as or similar to the label set from the previous iteration; and, if the two are the same or similar, determine that the preset condition is reached.
(supplementary note 11) The image annotation apparatus according to supplementary note 8 or 9, wherein the relation calculator specifically comprises:
a set calculator that calculates the ratio of the intersection of the label set of the input image and the label set of the comparison image to the union of the two label sets;
and a relation determiner that determines the label-set-based similarity between the label set of the input image and the label set of the comparison image stored in the database according to the obtained ratio.
(supplementary note 12) The image annotation apparatus according to supplementary note 8 or 9, wherein the relation calculator specifically comprises:
a distance calculator that calculates the semantic distance between the label set of the input image and the label set of the comparison image;
and a relation determiner that determines the label-set-based similarity between the label set of the input image and the label set of the comparison image stored in the database according to the obtained semantic distance.
(supplementary note 13) The image annotation apparatus according to supplementary note 8 or 9, wherein the initializer is specifically configured to: randomly assign the initial label set; or
calculate a vision-based similarity between the input image and a comparison image stored in the database, and obtain the initial label set according to the vision-based similarity.
(supplementary note 14) The image annotation apparatus according to supplementary note 8 or 9, wherein the label set updater specifically comprises:
a frequency counter for counting the number of times each label appears in the label set of the comparison image if the merged similarity is larger than a preset value;
and a label adjuster for adjusting the labels in the label set of the input image according to the counting result, so as to update the label set of the input image.

Claims (10)

1. An image annotation method, comprising:
obtaining, for an input image, an initial label set comprising a plurality of labels, wherein the accuracy with which the semantics of the input image are represented is determined by the plurality of labels;
calculating a label-set-based similarity between the label set of the input image and the label set of a comparison image stored in a database;
merging the label-set-based similarity and a vision-based similarity to obtain a merged similarity between the input image and the comparison image; and
updating the label set of the input image based on the merged similarity.
2. The image annotation method according to claim 1, further comprising, after updating the label set of the input image based on the merged similarity:
judging whether a preset condition is reached;
if the preset condition is not reached, selecting another comparison image from the database, and calculating the label-set-based similarity between the label set of the input image and the label set of the other comparison image.
3. The image annotation method according to claim 2, wherein the step of judging whether the preset condition is reached specifically comprises: judging whether the label set of the input image is the same as or similar to the label set from the previous iteration; and if the two are the same or similar, determining that the preset condition is reached.
4. The image annotation method according to claim 1 or 2, wherein calculating the label-set-based similarity between the label set of the input image and the label set of the comparison image stored in the database specifically comprises:
calculating the ratio of the intersection of the label set of the input image and the label set of the comparison image to the union of the two label sets;
determining the label-set-based similarity between the label set of the input image and the label set of the comparison image stored in the database according to the obtained ratio.
5. The image annotation method according to claim 1 or 2, wherein calculating the label-set-based similarity between the label set of the input image and the label set of the comparison image stored in the database specifically comprises:
calculating the semantic distance between the label set of the input image and the label set of the comparison image;
and determining the label-set-based similarity between the label set of the input image and the label set of the comparison image stored in the database according to the obtained semantic distance.
6. The image annotation method according to claim 1 or 2, wherein obtaining the initial label set for the input image specifically comprises: randomly assigning the initial label set; or
calculating a vision-based similarity between the input image and a comparison image stored in the database, and obtaining the initial label set according to the vision-based similarity.
7. The image annotation method according to claim 1 or 2, wherein updating the label set of the input image based on the merged similarity specifically comprises:
if the merged similarity is larger than a preset value, counting the number of times each label appears in the label set of the comparison image;
and adjusting the labels in the label set of the input image according to the counting result, so as to update the label set of the input image.
8. An image annotation apparatus, comprising:
an initializer for obtaining an initial label set for an input image, the label set comprising a plurality of labels, wherein the accuracy with which the semantics of the input image are represented is determined by the plurality of labels;
a relation calculator that calculates a label-set-based similarity between the label set of the input image and the label set of a comparison image stored in a database;
a merging calculator that merges the label-set-based similarity and a vision-based similarity to obtain a merged similarity between the input image and the comparison image; and
a label set updater that updates the label set of the input image based on the merged similarity.
9. The image annotation apparatus according to claim 8, further comprising:
a condition judger for judging whether a preset condition is reached;
an image selector for selecting another comparison image from the database if the preset condition is not reached;
and the relation calculator is further configured to calculate the label-set-based similarity between the label set of the input image and the label set of the other comparison image.
10. The image annotation apparatus according to claim 8 or 9, wherein the label set updater specifically comprises:
a frequency counter for counting the number of times each label appears in the label set of the comparison image if the merged similarity is larger than a preset value;
and a label adjuster for adjusting the labels in the label set of the input image according to the counting result, so as to update the label set of the input image.
CN201110197235.0A 2011-07-14 2011-07-14 Image annotation method and device thereof Active CN102880612B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110197235.0A CN102880612B (en) 2011-07-14 2011-07-14 Image annotation method and device thereof

Publications (2)

Publication Number Publication Date
CN102880612A (en) 2013-01-16
CN102880612B (en) 2015-05-06

Family

ID=47481940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110197235.0A Active CN102880612B (en) 2011-07-14 2011-07-14 Image annotation method and device thereof

Country Status (1)

Country Link
CN (1) CN102880612B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104035916B (en) * 2013-03-07 2017-05-24 富士通株式会社 Method and device for standardizing annotation tool
CN103218460B (en) * 2013-05-14 2016-08-10 清华大学 Image tag complementing method based on the sparse reconstruct of optimum linearity
CN103810274B (en) * 2014-02-12 2017-03-29 北京联合大学 Multi-characteristic image tag sorting method based on WordNet semantic similarities
FR3030846B1 (en) * 2014-12-23 2017-12-29 Commissariat Energie Atomique SEMANTIC REPRESENTATION OF THE CONTENT OF AN IMAGE
JP6402653B2 (en) * 2015-03-05 2018-10-10 オムロン株式会社 Object recognition device, object recognition method, and program
US10002136B2 (en) * 2015-07-27 2018-06-19 Qualcomm Incorporated Media label propagation in an ad hoc network
CN106250396B (en) * 2016-07-19 2021-09-03 厦门雅迅网络股份有限公司 Automatic image label generation system and method
CN107766853B (en) * 2016-08-16 2021-08-06 阿里巴巴集团控股有限公司 Image text information generation and display method and electronic equipment
CN107818160A (en) * 2017-10-31 2018-03-20 上海掌门科技有限公司 Expression label updates and realized method, equipment and the system that expression obtains
CN110033018B (en) * 2019-03-06 2023-10-31 平安科技(深圳)有限公司 Graph similarity judging method and device and computer readable storage medium
CN111797653B (en) * 2019-04-09 2024-04-26 华为技术有限公司 Image labeling method and device based on high-dimensional image
CN110069647B (en) * 2019-05-07 2023-05-09 广东工业大学 Image tag denoising method, device, equipment and computer readable storage medium
CN113408633B (en) * 2021-06-29 2023-04-18 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for outputting information

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1150215A2 (en) * 2000-04-28 2001-10-31 Canon Kabushiki Kaisha A method of annotating an image
CN1936892A (en) * 2006-10-17 2007-03-28 浙江大学 Image content semanteme marking method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Jing Liu et al. "An Adaptive Graph Model for Automatic Image Annotation." Proceedings of MIR '06: the 8th ACM International Workshop on Multimedia Information Retrieval, 2006. *
Liu Shuoyan et al. "An Algorithm for Generating Visual Words of Image Patches Based on Contextual Semantic Information." Acta Electronica Sinica, 2010-05-25, full text. *


Similar Documents

Publication Publication Date Title
CN102880612B (en) Image annotation method and device thereof
WO2022126971A1 (en) Density-based text clustering method and apparatus, device, and storage medium
CN110209808B (en) Event generation method based on text information and related device
CN105022754B (en) Object classification method and device based on social network
Yasmin et al. Content based image retrieval by shape, color and relevance feedback
CN109508374B (en) Text data semi-supervised clustering method based on genetic algorithm
CN103902597A (en) Method and device for determining search relevant categories corresponding to target keywords
CN110737788B (en) Rapid three-dimensional model index establishing and retrieving method
CN109145143A (en) Sequence constraints hash algorithm in image retrieval
CN106997379A (en) A kind of merging method of the close text based on picture text click volume
CN111708942B (en) Multimedia resource pushing method, device, server and storage medium
CN111797267A (en) Medical image retrieval method and system, electronic device and storage medium
CN107391594A (en) A kind of image search method based on the sequence of iteration vision
CN102760127B (en) Method, device and the equipment of resource type are determined based on expanded text information
CN109657695A (en) A kind of fuzzy division clustering method and device based on definitive operation
JP6017277B2 (en) Program, apparatus and method for calculating similarity between contents represented by set of feature vectors
CN112417091A (en) Text retrieval method and device
CN111930885A (en) Method and device for extracting text topics and computer equipment
CN104123382B (en) A kind of image set abstraction generating method under Social Media
CN114943285B (en) Intelligent auditing system for internet news content data
CN114168733B (en) Rule retrieval method and system based on complex network
CN107609006B (en) Search optimization method based on local log research
CN112487782B (en) Article popularity calculation method based on similar quantity of articles
CN112395408B (en) Stop word list generation method and device, electronic equipment and storage medium
WO2021017638A2 (en) Method for determining similarity of any two technology systems

Legal Events

Code Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant