CN110738267B - Image classification method, device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN110738267B
Authority
CN
China
Prior art keywords
preset number
labels
label
tags
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910995342.4A
Other languages
Chinese (zh)
Other versions
CN110738267A (en
Inventor
张志伟
李焱
吴丽军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201910995342.4A priority Critical patent/CN110738267B/en
Publication of CN110738267A publication Critical patent/CN110738267A/en
Application granted granted Critical
Publication of CN110738267B publication Critical patent/CN110738267B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure relates to an image classification method, an image classification device, electronic equipment and a storage medium in the technical field of computers. The method comprises: inputting a target image into a preset number of image classification models; acquiring the preset number of labels output by the preset number of image classification models; determining the information entropy of the preset number of labels; and, if the information entropy is greater than or equal to a preset threshold, determining the common superior label of the preset number of labels as the classification label of the target image. With the method and the device, the electronic equipment can accurately determine the classification label of the target image, and the classification label accurately reflects the classification result of the target image.

Description

Image classification method, device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computers, and in particular, to an image classification method, an image classification device, an electronic device, and a storage medium.
Background
Currently, image recognition technology is increasingly applied to various fields, including recognition of the kind of objects in an image according to the image recognition technology.
In the related art, the electronic device may determine the classification label of an image according to an image classification model. For example, if there is a husky in image A, the electronic device may determine, according to the image classification model, that the classification label of image A is "husky".
However, some objects are hard to tell apart. For the husky in image A above, for example, there is a breed of dog that closely resembles the husky, the Eskimo dog. If the electronic device determines that the classification label of image A is "Eskimo dog", the classification result is inaccurate.
Disclosure of Invention
The disclosure provides an image classification method, an image classification device, an electronic device and a storage medium, so as to at least solve the problem of inaccurate image classification result in the related art. The technical scheme of the present disclosure is as follows:
according to a first aspect of an embodiment of the present disclosure, there is provided an image classification method, including:
inputting target images into a preset number of image classification models, and obtaining preset number of labels output by the preset number of image classification models;
determining information entropy of the preset number of labels, wherein the information entropy represents the coincidence degree of the preset number of labels and the target image;
if the information entropy is greater than or equal to a preset threshold value, determining that the public superior labels of the preset number of labels are classification labels of the target image according to a preset tree label structure, wherein the classification granularity represented by labels with higher levels in the preset tree label structure is larger, and the classification granularity represented by labels with the same level is the same.
Optionally, the step of determining the information entropy of the preset number of tags includes:
establishing probability vectors according to the preset number of tags, wherein the probability vectors comprise the probability of each identical tag in the preset number of tags;
and determining the information entropy corresponding to the probability vector.
Optionally, after the step of determining the entropy of the information of the preset number of tags, the method further includes:
if the information entropy is smaller than a preset threshold, determining that the label with the highest occurrence probability in the preset number of labels is the classification label of the target image.
Optionally, the step of determining, according to a preset tree tag structure, that the common superior tag of the preset number of tags is a classified tag of the target image includes:
according to a preset tree-shaped tag structure, determining public superior tags of the lowest level of the preset number of tags in the preset tree-shaped tag structure;
and determining the common upper-level label of the lowest level as the classification label of the target image.
According to a second aspect of embodiments of the present disclosure, there is provided an image classification apparatus, comprising:
an acquisition unit configured to perform inputting a target image into a preset number of image classification models, and acquire a preset number of labels output by the preset number of image classification models;
a determining unit configured to perform determining an information entropy of the preset number of tags, the information entropy representing a degree of coincidence of the preset number of tags with the target image;
the determining unit is further configured to determine, according to a preset tree-shaped tag structure, that the common superior tag of the preset number of tags is a classification tag of the target image if the information entropy is greater than or equal to a preset threshold, wherein the classification granularity represented by the tag with the higher level in the preset tree-shaped tag structure is greater, and the classification granularity represented by the tag with the same level is the same.
Optionally, the determining unit is specifically configured to perform:
establishing probability vectors according to the preset number of tags, wherein the probability vectors comprise the probability of each identical tag in the preset number of tags;
and determining the information entropy corresponding to the probability vector.
Optionally, the determining unit is further configured to determine that the label with the highest occurrence probability in the preset number of labels is the classification label of the target image if the information entropy is smaller than the preset threshold.
Optionally, the determining unit is specifically configured to perform:
according to a preset tree-shaped tag structure, determining public superior tags of the lowest level of the preset number of tags in the preset tree-shaped tag structure;
and determining the common upper-level label of the lowest level as the classification label of the target image.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and a processor, configured to implement the method steps described in the first aspect when executing the program stored in the memory.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements the method steps of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect described above.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
the electronic device may input the target image to a preset number of image classification models, obtain a preset number of labels output by the preset number of image classification models, and then determine information entropy of the preset number of labels. If the information entropy is larger than or equal to a preset threshold value, determining the public superior labels of the preset number of labels as classification labels of the target image according to a preset tree label structure. When the information entropy is greater than or equal to a preset threshold, the preset number of labels cannot accurately represent the classification of the target image, so that the electronic equipment can determine that the common superior label of the preset number of labels in the preset tree-shaped label structure is the classification label of the target image. The common upper-level label can represent all lower-level labels of the common upper-level label, so that the classification label can represent any one of the preset number of labels, and the classification label can accurately reflect the classification result of the target image.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.
FIG. 1 is a flow chart illustrating a method of image classification according to an exemplary embodiment;
FIG. 2 is a flow chart illustrating another image classification method according to an exemplary embodiment;
FIG. 3 is a flowchart illustrating another image classification method according to an exemplary embodiment;
FIG. 4 is a flowchart illustrating another image classification method according to an exemplary embodiment;
FIG. 5 is a schematic diagram illustrating a preset tree tag structure according to an exemplary embodiment;
FIG. 6 is a flowchart illustrating another image classification method according to an exemplary embodiment;
FIG. 7 is a block diagram of an image classification device according to an exemplary embodiment;
FIG. 8 is a block diagram illustrating a configuration of an electronic device according to an exemplary embodiment.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
FIG. 1 is a flowchart illustrating an image classification method according to an exemplary embodiment. The method is used in an electronic device and, as shown in FIG. 1, includes the following steps:
Step 101, inputting a target image into a preset number of image classification models, and obtaining the preset number of labels output by the preset number of image classification models.
The electronic equipment can input the target image into the image classification model, and acquire a label output by the image classification model, wherein the label represents a classification result of the image classification model on the target image.
In the embodiment of the disclosure, the electronic device may classify the target image through a plurality of image classification models, and since each image classification model is different, the electronic device may determine a plurality of classification results of the target image through the plurality of image classification models.
In the embodiment of the present disclosure, the preset number of image classification models may be a preset number of different image classification models, and the preset number is not limited in the embodiment of the present disclosure.
For example, the electronic device is provided with 10 image classification models. The electronic device may input image A into each of the 10 image classification models and acquire the 10 labels they output: 3 "cat" labels, 3 "dog" labels, 2 "monkey" labels and 2 "husky" labels.
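Step 101 can be pictured with a small sketch. The source does not say how the image classification models are implemented, so the code below assumes, purely for illustration, that each model is a callable returning one label; `Classifier`, `collect_labels` and `constant_model` are hypothetical names, and the toy models simply reproduce the label counts of the image A example.

```python
from typing import Callable, List, Sequence

# Assumption: a "model" is any callable that maps an image to a single label.
Classifier = Callable[[object], str]


def collect_labels(image: object, models: Sequence[Classifier]) -> List[str]:
    """Feed the same image to every model and keep one label per model."""
    return [model(image) for model in models]


def constant_model(label: str) -> Classifier:
    """Toy stand-in for a trained model: ignores the image, returns a fixed label."""
    return lambda _image: label


# Ten toy models reproducing the counts in the example: 3 cat, 3 dog, 2 monkey, 2 husky.
toy_models = [constant_model(name) for name in
              ["cat"] * 3 + ["dog"] * 3 + ["monkey"] * 2 + ["husky"] * 2]
labels = collect_labels("image A", toy_models)
```

The number of labels always equals the number of models, which is what the text means by "a preset number of labels".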
Step 102, determining the information entropy of the preset number of tags.
The information entropy represents the degree to which the preset number of labels match the target image. The smaller the information entropy, the higher the degree of coincidence, indicating that the preset number of tags is more accurate. The greater the entropy of the information, the lower the degree of match, indicating that the preset number of tags are more likely to be erroneous or inaccurate.
The information entropy is a mathematical calculation result, and the electronic equipment can determine the stability degree of the probability vector through the numerical value of the information entropy. I.e. the more the terms in the probability vector fit the target image, the less the information entropy, the more stable the probability vector.
It can also be understood that the larger the difference of the preset number of labels output by the preset number of image classification models, the larger the information entropy, and the lower the degree of coincidence between the preset number of labels and the target image is proved.
Step 103, if the information entropy is greater than or equal to a preset threshold, determining, according to a preset tree label structure, the common superior label of the preset number of labels as the classification label of the target image.
A superior label is a label whose level in the preset tree label structure is higher than that of the preset number of labels. Labels at the same level in the preset tree label structure represent the same classification granularity, and labels at a higher level represent a larger classification granularity.
If the information entropy is greater than or equal to the preset threshold, the preset number of labels do not accurately match the target image, so the common superior label of the preset number of labels needs to be determined.
A superior label covers a larger content range than the labels below it. For example, the superior label of "husky" may be "dog", and the superior label of "dog" may be "animal".
A common superior label is a superior label shared by at least two labels. For example, the common superior label of "husky" and "Eskimo dog" may be "dog" or "animal".
The preset threshold may be a preset empirical value, which may be set by a person according to experience, and the magnitude and setting manner of the preset threshold are not limited in the embodiments of the present disclosure.
In the embodiment of the disclosure, if the information entropy is greater than or equal to the preset threshold, it is indicated that the matching degree of the preset number of labels and the target image is low, that is, the probability of including an error or an inaccurate label in the preset number of labels is high, the electronic device may determine a public superior label of the preset number of labels, where the public superior label can more accurately reflect the category of the target image.
Because the objects in many images are hard for the electronic device to tell apart, such as a "husky" and an "Eskimo dog", the electronic device may be unable to determine which of the preset number of labels is the exact one when it classifies the target image.
However, the superior labels of these labels are accurate. For example, there is a dog in the target image, and the electronic device cannot distinguish whether the dog is a "husky" or an "Eskimo dog", but the common superior label "dog" of "husky" and "Eskimo dog" accurately reflects the classification of the target image. The electronic device can therefore attach an accurate classification label to the target image by following this rule.
For example, after image A is input into 10 image classification models, the 10 labels output by the 10 models are: 3 "cat" labels, 3 "dog" labels, 2 "monkey" labels and 2 "husky" labels. Among these 10 labels, "cat", "dog" and "monkey" are three different kinds of animals with obvious differences between them, so the 10 labels cannot clearly represent the category of image A, and their degree of coincidence with image A is low.
For another example, the 10 labels that the 10 image classification models output for image B contain only "dog" and "husky". Both "dog" and "husky" belong to the class of dogs, so image B very probably belongs under the "dog" label, and the 10 labels have a high degree of coincidence with image B.
The embodiment of the disclosure provides an image classification method, an electronic device may input a target image into a preset number of image classification models, obtain a preset number of labels output by the preset number of image classification models, and then determine information entropy of the preset number of labels. If the information entropy is larger than or equal to a preset threshold value, determining the public superior labels of the preset number of labels as classification labels of the target image according to a preset tree label structure. When the information entropy is greater than or equal to a preset threshold, the preset number of labels cannot accurately represent the classification of the target image, so that the electronic equipment can determine that the common superior label of the preset number of labels in the preset tree-shaped label structure is the classification label of the target image. The common upper-level label can represent all lower-level labels of the common upper-level label, so that the classification label can represent any one of the preset number of labels, and the classification label can accurately reflect the classification result of the target image.
Optionally, as shown in fig. 2, the specific implementation manner of determining the information entropy of the preset number of tags in step 102 may include:
Step 1021, establishing a probability vector according to the preset number of tags.
The probability vector comprises, for each distinct tag, the probability with which that tag appears among the preset number of tags.
Step 1022, determining the information entropy corresponding to the probability vector.
In FIG. 2, steps 101 to 103 are the same as steps 101 to 103 in FIG. 1, and are not described here again.
In one implementation, the electronic device may determine a tag vector of a predetermined number of tags, and then establish a probability vector of the predetermined number of tags according to the tag vector.
The label vector comprises a preset number of labels.
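A minimal sketch of going from a tag vector to a probability vector, assuming the tag vector is a plain Python list. The function name `probability_vector` is hypothetical, and the sample counts are those of the image A example (3 "cat", 3 "dog", 2 "monkey", 2 "husky"). Exact fractions are used so that the values can be compared directly with the 3/10 and 2/10 quoted in the text.

```python
from collections import Counter
from fractions import Fraction


def probability_vector(tag_vector):
    """Map each distinct tag to its relative frequency in the tag vector."""
    counts = Counter(tag_vector)
    total = len(tag_vector)
    return {tag: Fraction(count, total) for tag, count in counts.items()}


# Tag vector with the counts used in the image A example.
tag_vector = ["cat"] * 3 + ["dog"] * 3 + ["monkey"] * 2 + ["husky"] * 2
probabilities = probability_vector(tag_vector)
```

Note that `Fraction` reduces 2/10 to 1/5 when printed; the value is the same.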
For example, the 10 image classification models output 10 tags for image A: 3 "cat" tags, 3 "dog" tags, 2 "monkey" tags and 2 "husky" tags. The tag vector is the list of these 10 tags.
The probability vector determined by the electronic device according to the tag vector is [3/10, 3/10, 2/10, 2/10]. "cat" appears 3 times among the 10 tags, so the probability corresponding to "cat" is 3/10.
"dog" appears 3 times among the 10 tags, so the probability corresponding to "dog" is 3/10.
"monkey" appears 2 times among the 10 tags, so the probability corresponding to "monkey" is 2/10.
"husky" appears 2 times among the 10 tags, so the probability corresponding to "husky" is 2/10.
After the electronic device determines the probability vector, it can calculate the information entropy corresponding to the probability vector according to the following formula:
$$H(x) = -\sum_{i} p_i \log p_i$$
where H(x) is the information entropy, i indexes the labels corresponding to the probabilities in the probability vector, and p_i is the probability corresponding to label i.
For example, p_1 is the probability corresponding to the label "cat", 3/10; p_2 is the probability corresponding to the label "dog", 3/10; p_3 is the probability corresponding to the label "monkey", 2/10; and p_4 is the probability corresponding to the label "husky", 2/10.
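As a worked check of this example, and assuming the standard Shannon entropy with the natural logarithm (the source does not state the base of the logarithm), the entropy of the probability vector [3/10, 3/10, 2/10, 2/10] would be:

```latex
H(x) = -\sum_{i=1}^{4} p_i \ln p_i
     = -\left( 2 \cdot \tfrac{3}{10} \ln \tfrac{3}{10} + 2 \cdot \tfrac{2}{10} \ln \tfrac{2}{10} \right)
     \approx 0.722 + 0.644
     \approx 1.366
```

Under the same assumption, a unanimous vote (a single label with probability 1) gives H(x) = 0, and four equally likely labels give ln 4 ≈ 1.386, so this example is close to the largest disagreement possible among four distinct labels.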
As can be seen from the above formula, where p_i is the probability of each label in the probability vector, the lower the degree of coincidence between the labels and the target image, the greater the information entropy of the probability vector, and the lower the degree of coincidence between the probability vector and the target image.
In practical application, the information entropy corresponding to the probability vector shows the stability of the probability vector, and the more stable the probability vector is, the more the preset number of labels corresponding to the probability vector accord with the classification of the target image. Therefore, the electronic equipment can more accurately determine the classification label of the target image through the information entropy corresponding to the probability vector.
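The entropy computation can be sketched as follows. This assumes the standard Shannon entropy; the `base` parameter and the name `information_entropy` are illustrative choices, since the source does not specify the base of the logarithm.

```python
import math


def information_entropy(probabilities, base=math.e):
    """Shannon entropy of a probability vector.

    Entries equal to zero are skipped, since they contribute nothing to the sum.
    """
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


# Image A example: the models disagree strongly, so the entropy is high.
h_mixed = information_entropy([3 / 10, 3 / 10, 2 / 10, 2 / 10])

# All models agree on one label: the entropy is zero, the most "stable" case.
h_unanimous = information_entropy([1.0])
```

A lower value means the models agree more, which is what the text calls a more stable probability vector.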
Optionally, as shown in fig. 3, after determining the information entropy of the preset number of labels in step 102, the electronic device may further determine the classification label of the target image by the following steps:
Step 301, if the information entropy of the preset number of labels is smaller than a preset threshold, determining that the label with the highest occurrence probability in the preset number of labels is the classification label of the target image.
In FIG. 3, steps 101 to 103 are the same as steps 101 to 103 in FIG. 1, and are not described here again.
For example, image A is an image of a husky, the three labels "husky", "dog" and "animal" all fit image A, and the probabilities corresponding to the three labels are 8/10, 1/10 and 1/10, respectively. If the information entropy is smaller than the preset threshold, the electronic device can directly determine the label with the highest occurrence probability among the three labels ("husky") as the classification label of the target image, that is, the label that best fits image A.
If the information entropy of the preset number of labels is smaller than a preset threshold, the coincidence degree of the preset number of labels relative to the target image is proved to be high, namely the preset number of labels can accurately represent the target image, and the electronic equipment does not need to determine the public superior label of the lowest level of the preset number of labels. Instead, the electronic device may directly determine, as the classification label of the target image, a label with the highest occurrence probability among the preset number of labels. Therefore, the electronic equipment can determine the label which can more accurately represent the target image for the target image while ensuring the label accuracy.
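The low-entropy branch amounts to a majority vote over the labels. A minimal sketch, with `majority_label` as a hypothetical name; how ties are broken is not addressed in the source, so this sketch simply inherits the behaviour of `Counter.most_common`, which keeps the label encountered first.

```python
from collections import Counter


def majority_label(labels):
    """Return the label that occurs most often among the model outputs."""
    return Counter(labels).most_common(1)[0][0]


# The husky example from the text: probabilities 8/10, 1/10 and 1/10.
husky_votes = ["husky"] * 8 + ["dog", "animal"]
winner = majority_label(husky_votes)
```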
Optionally, as shown in fig. 4, in step 103, if the information entropy is greater than or equal to the preset threshold, according to a preset tree-shaped tag structure, a specific implementation manner of determining that a common upper-level tag of a preset number of tags is a classification tag of the target image may be:
Step 1031, determining, according to the preset tree label structure, the common superior label of the lowest level of the preset number of labels in the preset tree label structure.
Step 1032, determining the common superior label of the lowest level as the classification label of the target image.
The preset number of labels may have more than one common superior label in the preset tree label structure. For example, the common superior label of "husky" and "Eskimo dog" may be "dog" or "animal".
In FIG. 4, steps 101 to 103 are the same as steps 101 to 103 in FIG. 1, and are not described here again.
By adopting the method, the electronic equipment can determine the public superior label of the lowest level in the preset tree label structure as the classification label of the target image. Therefore, the classification represented by the public superior label can cover the classification represented by any label in the preset number of labels, and the label with the minimum classification granularity can be determined for the target image on the premise of ensuring the classification accuracy of the target image, so that the finally determined label can represent the target image more accurately.
FIG. 5 is a schematic diagram of a preset tree tag structure according to an embodiment of the disclosure.
As shown in FIG. 5, if "husky" is a first-level tag, then "dog" is the superior tag of "husky". Meanwhile, "Eskimo dog" and "husky" are classification tags of the same level and represent the same classification granularity.
Suppose there are 4 different tags: "cat", "dog", "monkey" and "husky". According to the above-described procedure of determining the common superior tag of the preset number of tags and the preset tree tag structure shown in FIG. 5, the superior tags of "cat" are "animal" and "organism", the superior tags of "dog" are "animal" and "organism", the superior tags of "monkey" are "animal" and "organism", and the superior tags of "husky" are "dog", "animal" and "organism".
Thus, the common superior tag of the lowest level for "cat", "dog", "monkey" and "husky" is "animal", which ensures that it can represent any one of "cat", "dog", "monkey" and "husky".
According to a schematic diagram of a preset tree tag structure shown in fig. 5, an embodiment of the present disclosure provides an implementation manner of determining a common upper level tag of a lowest level, which may specifically include:
The probability vector determined by the electronic equipment covers the labels "cat", "dog", "monkey" and "husky".
a. Traversing from the top label "organism" down to the label "cat";
b. caching the labels encountered during the traversal in order, these labels being the path labels of "cat";
c. traversing to "dog", "monkey" and "husky" in the same manner;
d. determining the lowest-level label shared by the path labels of all four labels, here "animal", as the common superior label of the lowest level.
In the embodiment of the disclosure, through the above-mentioned method for determining the public superior label of the lowest level, the electronic device can ensure the accuracy of the classification label of the target image and the finest classification granularity of the classification label of the target image.
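The path-based procedure above can be sketched as follows. The tree is an assumption reconstructed from the labels named in the text (the figure itself is not reproduced here), stored as a hypothetical child-to-parent map `PARENT`. The sketch builds each path by walking up from the label and reversing it, which yields the same root-to-label path as the top-down traversal described above. It also treats a label as covering itself, so the labels "dog" and "husky" together yield "dog"; the source does not discuss that corner case.

```python
# Assumed tree, child -> parent, mirroring the labels mentioned in the text.
PARENT = {
    "animal": "organism",
    "cat": "animal",
    "dog": "animal",
    "monkey": "animal",
    "husky": "dog",
    "Eskimo dog": "dog",
}


def path_from_root(label):
    """Path labels from the top label down to `label`, inclusive."""
    path = [label]
    while path[-1] in PARENT:          # climb until the top label is reached
        path.append(PARENT[path[-1]])
    return path[::-1]                  # reverse so the top label comes first


def lowest_common_superior(labels):
    """Deepest label that appears on the path of every distinct input label."""
    paths = [path_from_root(label) for label in set(labels)]
    common = None
    for level in zip(*paths):          # compare the paths level by level
        if len(set(level)) != 1:       # the paths diverge at this level
            break
        common = level[0]
    return common


result = lowest_common_superior(["cat", "dog", "monkey", "husky"])
```

For the four labels in the example the paths share "organism" and "animal" and then diverge, so `result` is "animal"; for "husky" and "Eskimo dog" alone the paths also share "dog", which becomes the finer-grained answer.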
As shown in fig. 6, an embodiment of the present disclosure provides an embodiment of an image classification method, and specific steps may be as follows:
Step 601, inputting a target image into a preset number of image classification models, and obtaining a preset number of labels output by the preset number of image classification models.
Step 602, a predicted result tag vector of a preset number of tags is established.
Step 603, establishing a probability vector corresponding to the prediction result label vector according to the prediction result label vector.
Step 604, calculating the information entropy corresponding to the probability vector.
The information entropy is the basis on which the electronic device judges whether the preset number of labels can accurately represent the content of the target image.
Step 605, judging whether the information entropy is smaller than a preset threshold. If the information entropy is smaller than the preset threshold, step 607 is performed; otherwise, step 606 is performed.
The judging process is based on the following formula:
$$\mathrm{label}(X) = \begin{cases} \mathrm{vote}(X), & H(X) < \alpha \\ \mathrm{down}(X), & H(X) \ge \alpha \end{cases}$$
where label(X) is the classification label of the target image, vote(X) is the label with the largest occurrence probability among the preset number of labels, that is, the execution result of step 607, down(X) is the lowest-level common superior label of the preset number of labels, that is, the execution result of step 606, H(X) is the information entropy, and α is the preset threshold.
That is, if the information entropy H(X) is smaller than the preset threshold, step 607 is executed, and if the information entropy H(X) is greater than or equal to the preset threshold, step 606 is executed.
Step 606, taking the common superior label of the lowest level of the preset number of labels as the classification label of the target image.
Step 607, determining the label with the largest occurrence probability in the preset number of labels as the target classification label of the target image.
According to the embodiment shown in fig. 6, if the preset number of tags can accurately represent the content represented by the target image, the electronic device may directly determine the classification tag of the target image from the preset number of classification tags, and if the preset number of tags cannot accurately represent the content represented by the target image, the electronic device may determine the common superior tag of the lowest level as the classification tag of the target image.
In either case, therefore, the electronic device can determine a classification label for the target image that can accurately represent the content of the target image.
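The flow of steps 602 to 607 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the disclosed implementation: the function names are hypothetical, the entropy is taken as Shannon entropy with the natural logarithm (the logarithm base is not given here), and the tree lookup of step 606 is passed in as a callable so that the block stays self-contained.

```python
import math
from collections import Counter
from typing import Callable, Sequence


def label_entropy(labels: Sequence[str]) -> float:
    """Steps 602-604: build the probability vector of the labels and return its
    Shannon entropy. The natural logarithm is an assumption of this sketch."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def classify(labels: Sequence[str],
             alpha: float,
             common_superior: Callable[[Sequence[str]], str]) -> str:
    """Steps 605-607: choose between the majority label and the tree fallback.

    `alpha` is the preset threshold; `common_superior` is a hypothetical callable
    that returns the lowest-level common superior label (step 606).
    """
    if label_entropy(labels) < alpha:
        # Step 607: the labels agree well enough, so take the most frequent one.
        return Counter(labels).most_common(1)[0][0]
    # Step 606: the labels disagree, so fall back to the common superior label.
    return common_superior(labels)
```

With an illustrative threshold of `alpha=1.0` and a stand-in fallback that always returns `"animal"`, the labels `["cat", "cat", "cat", "dog"]` give an entropy of about 0.56 and are classified as `"cat"`, while `["cat", "dog", "monkey"]` give an entropy of about 1.10 and are classified as `"animal"`. The threshold value is chosen only for the example.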
Fig. 7 is a block diagram of an image classification device according to an exemplary embodiment. Referring to fig. 7, the apparatus includes an acquisition unit 701 and a determination unit 702.
An acquisition unit 701 configured to perform inputting the target image into a preset number of image classification models, and obtain a preset number of labels output by the preset number of image classification models;
a determining unit 702 configured to perform determining an information entropy of the preset number of tags, the information entropy representing a degree of coincidence of the preset number of tags with the target image;
the determining unit 702 is further configured to determine, according to the preset tree label structure, that the common superior label of the preset number of labels is a classification label of the target image if the information entropy is greater than or equal to the preset threshold, wherein the classification granularity represented by the label with the higher level in the preset tree label structure is greater, and the classification granularity represented by the label with the same level is the same.
Optionally, the determining unit 702 is specifically configured to:
establishing a probability vector according to the preset number of tags, wherein the probability vector comprises the occurrence probability of each distinct tag among the preset number of tags;
and determining the information entropy corresponding to the probability vector.
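As a worked illustration of these two operations, assume the information entropy is the Shannon entropy of the probability vector, computed with the natural logarithm (the base is an assumption of this example):

$$H(X) = -\sum_{i} p_i \ln p_i$$

If four models output "cat", "cat", "cat" and "dog", the probability vector is $(3/4,\ 1/4)$ and

$$H(X) = -\left(\tfrac{3}{4}\ln\tfrac{3}{4} + \tfrac{1}{4}\ln\tfrac{1}{4}\right) \approx 0.562.$$

If the four models output four different tags, the probability vector is $(1/4,\ 1/4,\ 1/4,\ 1/4)$ and $H(X) = \ln 4 \approx 1.386$, the largest value possible for four tags. Lower entropy therefore indicates stronger agreement among the models.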
Optionally,
the determining unit 702 is further configured to determine, as the classification label of the target image, a label with the largest occurrence probability among the preset number of labels if the information entropy is smaller than the preset threshold.
Optionally, the determining unit 702 is specifically configured to perform:
according to a preset tree-shaped label structure, determining the public superior labels of the lowest level of a preset number of labels in the preset tree-shaped label structure;
and determining the common upper-level label of the lowest level as the classification label of the target image.
The embodiment of the disclosure provides an image classification device. The electronic device can input a target image into a preset number of image classification models, acquire the preset number of labels output by the preset number of image classification models, and then determine the information entropy of the preset number of labels. If the information entropy is greater than or equal to a preset threshold, the common superior label of the preset number of labels is determined as the classification label of the target image according to a preset tree label structure. When the information entropy is greater than or equal to the preset threshold, the preset number of labels cannot accurately represent the classification of the target image, so the electronic device can determine the common superior label of the preset number of labels in the preset tree label structure as the classification label of the target image. Because the common superior label can represent all of its lower-level labels, the classification label can represent any one of the preset number of labels and therefore accurately reflects the classification result of the target image.
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments has been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
Fig. 8 is a block diagram of an electronic device, according to an example embodiment. For example, the electronic device may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 8, an electronic device may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the electronic device, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device. Examples of such data include instructions for any application or method operating on the electronic device, contact data, phonebook data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power component 806 provides power to various components of the electronic device. The power component 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device.
The multimedia component 808 includes a screen that provides an output interface between the electronic device and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessment of various aspects of the electronic device. For example, the sensor component 814 may detect an on/off state of the electronic device and the relative positioning of components, such as the display and keypad of the electronic device. The sensor component 814 may also detect a change in position of the electronic device or a component of the electronic device, the presence or absence of user contact with the electronic device, an orientation or acceleration/deceleration of the electronic device, and a change in temperature of the electronic device. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communication between the electronic device and other devices, either wired or wireless. The electronic device may access a wireless network based on a communication standard, such as WiFi, an operator network (e.g., 2G, 3G, 4G, or 5G), or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a storage medium is also provided, such as a memory 804 including instructions executable by a processor 820 of an electronic device to perform the above-described method. Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, ROM, Random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (8)

1. A method of classifying images, the method comprising:
inputting target images into a preset number of image classification models, and obtaining preset number of labels output by the preset number of image classification models;
determining information entropy of the preset number of labels, wherein the information entropy represents the coincidence degree of the preset number of labels and the target image;
if the information entropy is larger than or equal to a preset threshold value, determining that the public superior labels of the preset number of labels are classification labels of the target image according to a preset tree-shaped label structure, wherein the classification granularity represented by the labels with higher levels in the preset tree-shaped label structure is larger, and the classification granularity represented by the labels with the same level is the same; the public upper-level tags are upper-level tags which are shared by at least two tags, wherein the upper-level tags are higher-level tags with a content range larger than that represented by a preset number of tags;
the step of determining the common superior labels of the preset number of labels as the classified labels of the target image according to the preset tree label structure comprises the following steps:
according to a preset tree-shaped tag structure, determining public superior tags of the lowest level of the preset number of tags in the preset tree-shaped tag structure;
determining the public superior label of the lowest level as the classification label of the target image;
the method for determining the common upper-level tag of the lowest level is as follows:
for each tag, traversing from the top-level tag to that tag;
caching labels encountered in the traversal process according to the sequence, wherein the labels encountered in the traversal process are path labels;
and determining the lowest-level label in the path labels corresponding to each label as the common upper-level label of the lowest level.
2. The image classification method according to claim 1, wherein the step of determining the information entropy of the preset number of tags includes:
establishing probability vectors according to the preset number of tags, wherein the probability vectors comprise the probability of each identical tag in the preset number of tags;
and determining the information entropy corresponding to the probability vector.
3. The image classification method according to claim 2, characterized in that, after the step of determining the information entropy of the preset number of tags, the method further comprises:
if the information entropy is smaller than a preset threshold, determining that the label with the highest occurrence probability in the preset number of labels is the classification label of the target image.
4. An image classification apparatus, the apparatus comprising:
an acquisition unit configured to perform inputting a target image into a preset number of image classification models, and acquire a preset number of labels output by the preset number of image classification models;
a determining unit configured to perform determining an information entropy of the preset number of tags, the information entropy representing a degree of coincidence of the preset number of tags with the target image;
the determining unit is further configured to determine, according to a preset tree-shaped tag structure, that a common superior tag of the preset number of tags is a classification tag of the target image if the information entropy is greater than or equal to a preset threshold, wherein classification granularity represented by a tag with a higher level in the preset tree-shaped tag structure is greater, and classification granularity represented by a tag with the same level is the same; the public upper-level tags are upper-level tags which are shared by at least two tags, wherein the upper-level tags are higher-level tags with a content range larger than that represented by a preset number of tags;
the determination unit is specifically configured to perform:
according to a preset tree-shaped tag structure, determining public superior tags of the lowest level of the preset number of tags in the preset tree-shaped tag structure;
determining the public superior label of the lowest level as the classification label of the target image; the method for determining the common upper-level tag of the lowest level is as follows: for each tag, traversing from the top-level tag to that tag; caching labels encountered in the traversal process according to the sequence, wherein the labels encountered in the traversal process are path labels; and determining the lowest-level label in the path labels corresponding to each label as the common upper-level label of the lowest level.
5. The image classification apparatus according to claim 4, wherein the determination unit is specifically configured to perform:
establishing probability vectors according to the preset number of tags, wherein the probability vectors comprise the probability of each identical tag in the preset number of tags;
and determining the information entropy corresponding to the probability vector.
6. The image classification apparatus according to claim 5, wherein,
and the determining unit is further configured to determine that the label with the highest occurrence probability in the preset number of labels is the classified label of the target image if the information entropy is smaller than a preset threshold value.
7. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the image classification method of any one of claims 1 to 3.
8. A storage medium storing instructions which, when executed by a processor of an image classification electronic device, cause the image classification electronic device to perform the image classification method of any one of claims 1 to 3.
CN201910995342.4A 2019-10-18 2019-10-18 Image classification method, device, electronic equipment and storage medium Active CN110738267B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910995342.4A CN110738267B (en) 2019-10-18 2019-10-18 Image classification method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110738267A CN110738267A (en) 2020-01-31
CN110738267B (en) 2023-08-22

Family

ID=69270221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910995342.4A Active CN110738267B (en) 2019-10-18 2019-10-18 Image classification method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110738267B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428806B (en) * 2020-04-03 2023-10-10 北京达佳互联信息技术有限公司 Image tag determining method and device, electronic equipment and storage medium
CN112052799A (en) * 2020-09-08 2020-12-08 中科光启空间信息技术有限公司 Rosemary planting distribution high-resolution satellite remote sensing identification method

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103116588A (en) * 2011-11-17 2013-05-22 腾讯科技(深圳)有限公司 Method and system for personalized recommendation
CN108171254A (en) * 2017-11-22 2018-06-15 北京达佳互联信息技术有限公司 Image tag determines method, apparatus and terminal
CN108664989A (en) * 2018-03-27 2018-10-16 北京达佳互联信息技术有限公司 Image tag determines method, apparatus and terminal
CN109117862A (en) * 2018-06-29 2019-01-01 北京达佳互联信息技术有限公司 Image tag recognition methods, device and server
CN109409414A (en) * 2018-09-28 2019-03-01 北京达佳互联信息技术有限公司 Sample image determines method and apparatus, electronic equipment and storage medium
CN109583501A (en) * 2018-11-30 2019-04-05 广州市百果园信息技术有限公司 Picture classification, the generation method of Classification and Identification model, device, equipment and medium
CN109685110A (en) * 2018-11-28 2019-04-26 北京陌上花科技有限公司 Training method, image classification method and device, the server of image classification network
CN109961094A (en) * 2019-03-07 2019-07-02 北京达佳互联信息技术有限公司 Sample acquiring method, device, electronic equipment and readable storage medium storing program for executing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10984054B2 (en) * 2017-07-27 2021-04-20 Robert Bosch Gmbh Visual analytics system for convolutional neural network based classifiers

Also Published As

Publication number Publication date
CN110738267A (en) 2020-01-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant