CN111353549B - Image label verification method and device, electronic equipment and storage medium - Google Patents

Image label verification method and device, electronic equipment and storage medium

Publication number
CN111353549B
Authority
CN
China
Prior art keywords
image
checked
label
images
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010169690.9A
Other languages
Chinese (zh)
Other versions
CN111353549A (en)
Inventor
秦永强
李素莹
纪双西
张祥伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ainnovation Chongqing Technology Co ltd
Original Assignee
Ainnovation Chongqing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ainnovation Chongqing Technology Co ltd filed Critical Ainnovation Chongqing Technology Co ltd
Priority to CN202010169690.9A priority Critical patent/CN111353549B/en
Publication of CN111353549A publication Critical patent/CN111353549A/en
Application granted granted Critical
Publication of CN111353549B publication Critical patent/CN111353549B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The application provides an image label verification method and device, electronic equipment, and a computer-readable storage medium. The method comprises the following steps: performing classification calculation on the images to be checked in a set to be checked by using a machine learning model trained on an initial training set, to obtain a prediction label and a label credibility for each image to be checked; screening credible images from the set to be checked according to the label credibility of the images to be checked; retraining the machine learning model according to the prediction labels of the credible images and repeating the classification calculation on the images to be checked until the label prediction results of the images to be checked meet the credibility requirement; and comparing the label prediction result of each image to be checked with its annotation label to obtain a verification result. In the embodiments provided by the application, the whole verification process is executed by a computer, which reduces labor cost.

Description

Image tag verification method and device, electronic device and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for verifying an image tag, an electronic device, and a computer-readable storage medium.
Background
For a supervised machine learning model, a large amount of labeled data is required as a training set, and the data quality of the training set is important to how well the model learns and performs. To ensure data quality, manually labeled data usually has to go through multiple rounds of checking and repeated cleaning so that each item carries a correct label. The labor cost of this process is very high.
Disclosure of Invention
The embodiment of the application aims to provide an image label verification method and device, electronic equipment and a computer readable storage medium, which are used for reducing the labor cost for verifying image labels.
In one aspect, the present application provides a method for verifying an image tag, including:
performing classification calculation on the images to be checked in a set to be checked by using a machine learning model trained on an initial training set, to obtain a prediction label and a label credibility of each image to be checked;
screening credible images from the set to be checked according to the label credibility of the images to be checked;
retraining the machine learning model according to the prediction labels of the credible images and performing the classification calculation on the images to be checked again, until the label prediction results of the images to be checked meet the credibility requirement;
and comparing the label prediction result of each image to be checked with its annotation label to obtain a verification result.
In an embodiment, before the performing classification calculation on the images to be checked in the set to be checked by using the machine learning model trained on the initial training set, the method further includes:
selecting annotated images whose annotation labels are credible and adding them to the initial training set as sample images, and adding the annotated images outside the initial training set to the set to be checked as images to be checked;
training the machine learning model using the sample images in the initial training set.
In one embodiment, the selecting an annotated image for which the annotation tag is authentic includes:
clustering the annotated images whose annotation labels indicate the same image category information into a plurality of clusters;
calculating a consistency parameter between the cluster center image of each cluster and each annotated image in the cluster;
and screening out the annotated images whose annotation labels are credible according to the consistency parameters.
In one embodiment, the label credibility comprises a feature distance; performing classification calculation on the images to be checked in the set to be checked to obtain the prediction labels and the label credibility of the images to be checked comprises the following steps:
extracting image features of the image to be checked through the machine learning model, and performing classification calculation based on the image features to obtain a prediction label of the image to be checked;
for each image to be checked, extracting the image characteristics of the sample image with the same labeling label as the prediction label of the image to be checked through the machine learning model;
and calculating the characteristic distance between the image characteristic of the image to be checked and the image characteristic of the sample image.
In an embodiment, the screening of the credible images from the to-be-verified set according to the label credibility of the to-be-verified image includes:
for each prediction label, screening candidate credible images from the images to be checked with the prediction label;
and for each candidate credible image, judging whether the characteristic distance corresponding to the candidate credible image is smaller than a preset distance threshold, and if so, determining that the candidate credible image is a credible image.
In an embodiment, for each prediction label, the screening of candidate credible images from the images to be checked with the prediction label comprises:
for each prediction label, determining a first number of the candidate credible images according to a preset ratio corresponding to the training times and the total number of the images to be checked with the prediction label;
and selecting, for the prediction label, the first number of images to be checked with the smallest feature distances as the candidate credible images.
In an embodiment, for each prediction tag, the screening of candidate credible images from the images to be checked with the prediction tag includes:
and judging whether the confidence corresponding to the prediction label of each image to be checked is larger than a preset confidence threshold, and if so, taking the image to be checked as a candidate credible image corresponding to the prediction label.
On the other hand, the present application further provides an image tag verification apparatus, including:
the prediction module is used for performing classification calculation on the images to be checked in the set to be checked by using the machine learning model trained on the initial training set, to obtain the prediction labels and the label credibility of the images to be checked;
the screening module is used for screening the credible images from the to-be-checked set according to the label credibility of the to-be-checked images;
the training module is used for carrying out training of the machine learning model and classification calculation of the image to be checked again according to the prediction label of the credible image until the label prediction result of the image to be checked meets the credibility requirement;
and the verification module is used for comparing the label prediction result of the image to be verified with the labeled label to obtain a verification result.
Further, the present application also provides an electronic device, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the image tag verification method.
In addition, the present application also provides a computer readable storage medium, which stores a computer program executable by a processor to perform the above-mentioned image tag verification method.
In the embodiment provided by the application, a machine learning model is trained to perform classified calculation on an image to be checked to obtain a prediction label and label reliability, a credible image is screened out according to the label reliability, then the machine learning model is retrained according to the prediction label of the credible image, and the trained machine learning model is used for performing classified calculation on the image to be checked until the label prediction result of the image to be checked meets the reliability requirement, and a checking result can be obtained by comparing the label prediction result of the image to be checked with a label; the whole checking process is executed by a computer, so that the labor cost is reduced.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required to be used in the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application;
Fig. 2 is a schematic flowchart of a method for verifying an image tag according to an exemplary embodiment of the present application;
Fig. 3 is a schematic flowchart of a method for verifying an image tag according to another exemplary embodiment of the present application;
Fig. 4 is a block diagram of an apparatus for verifying an image tag according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
Like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
As shown in fig. 1, an electronic device 1 provided in an embodiment of the present application includes: at least one processor 11 and a memory 12, one processor 11 being exemplified in fig. 1. The processor 11 and the memory 12 are connected by a bus 10, and the memory 12 stores instructions executable by the processor 11, and the instructions are executed by the processor 11, so that the electronic device 1 can execute all or part of the flow of the method in the embodiments described below. In an embodiment, the electronic device 1 may be a host that performs verification of image tags.
The memory 12 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The present application also provides a computer-readable storage medium storing a computer program executable by the processor 11 to perform the method for verifying an image tag provided by the present application.
Referring to fig. 2, a flowchart of a method for verifying an image tag according to an embodiment of the present application is shown, and as shown in fig. 2, the method may include the following steps 210 to 240.
Step 210: Perform classification calculation on the images to be checked in the set to be checked by using the machine learning model trained on the initial training set, to obtain the prediction labels and the label credibility of the images to be checked.
The machine learning model is a network model for image classification, and may be any one of network models such as AlexNet, ZFNet, VGGNet, GoogLeNet, and ResNet.
The initial training set comprises sample images selected in an initial state, and the sample images carry annotation labels indicating image category information. The sample images in the initial training set can be manually selected, and the accuracy of their annotation labels is ensured through manual verification. After being trained on the sample images, the machine learning model can perform classification calculation.
The to-be-verified set can comprise all to-be-verified images of which the labeling labels need to be verified.
In one embodiment, the host computer extracts image features of the image to be checked through the machine learning model, and performs classification calculation based on the image features to obtain a prediction tag of the image to be checked and a confidence corresponding to the prediction tag. Here, the prediction tag indicates image category information calculated by the machine learning model for the image to be checked.
After all the images to be checked in the set to be checked have been classified by the machine learning model, the prediction label of each image to be checked and the label credibility corresponding to that prediction label are obtained. The label credibility characterizes how likely the prediction label is to be correct.
For example, the label credibility may be represented by a feature distance between an image feature of the image to be checked and an image feature of the sample image. The feature distance is inversely related to the label credibility: the larger the feature distance, the lower the label credibility, and conversely, the smaller the feature distance, the higher the label credibility.
In an embodiment, if the label reliability is the feature distance, for each image to be checked, the host computer may extract, through the machine learning model, the image features of the sample image with the label identical to the predicted label of the image to be checked.
Here, the image feature may be a feature vector or a feature map, and the feature distance may be a euclidean distance between the feature vector of the image to be verified and the feature vector of the sample image, or between the feature map of the image to be verified and the feature map of the sample image.
For each image to be checked, the host computer may select a sample image with the label identical to the prediction label of the image to be checked from the initial training set.
The host computer then extracts the image features of the sample image through the machine learning model and calculates the feature distance between the image features of the image to be checked and the image features of the sample image.
Because there may be a plurality of sample images with the same annotation label, for each image to be checked, the host computer can calculate the Euclidean distance between the image features of the image to be checked and the image features of each such sample image, and then select the minimum Euclidean distance as the feature distance. Alternatively, the average of these Euclidean distances is calculated and used as the feature distance.
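The two ways of reducing several Euclidean distances to one feature distance can be sketched as follows. This is only an illustrative sketch: it assumes the image features are already available as plain lists of floats, and the names `euclidean`, `feature_distance`, and the `reduce` parameter are hypothetical, not taken from the application.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def feature_distance(query: Vector, samples: Sequence[Vector], reduce: str = "min") -> float:
    """Feature distance between one image to be checked and the sample images
    whose annotation label equals its prediction label.

    reduce="min" keeps the smallest Euclidean distance;
    reduce="mean" averages all Euclidean distances.
    """
    if not samples:
        raise ValueError("no sample images carry this label")
    distances = [euclidean(query, sample) for sample in samples]
    if reduce == "min":
        return min(distances)
    if reduce == "mean":
        return sum(distances) / len(distances)
    raise ValueError("reduce must be 'min' or 'mean'")
```

With a query feature `[0.0, 0.0]` and sample features `[3.0, 4.0]` and `[6.0, 8.0]`, the individual distances are 5 and 10, so the minimum variant gives 5 and the average variant gives 7.5.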
Step 220: Screen credible images from the set to be checked according to the label credibility of the images to be checked.
For each prediction tag, the host may screen candidate trusted images from the images to be verified that have that prediction tag.
In an embodiment, for each prediction tag, the host may determine the first number of candidate reliable images according to a preset ratio corresponding to the training times and the total number of images to be checked with the prediction tag.
Here, the number of training times refers to the number of times the machine learning model has been trained. Because the prediction capability of the machine learning model improves as the number of training times increases, the preset ratio used by the host for screening is gradually increased. For example, if the total number of training times is set to 5, the preset ratios corresponding to the successive training times may be 20%, 40%, 60%, 80%, and 100%, respectively.
The total number of images to be checked corresponding to each prediction label may be different; based on that total number and the preset ratio, the first number used for screening the candidate credible images in the current round can be determined. For example, if there are 100 images to be checked with the prediction label "face", and the preset ratio corresponding to the number of times the machine learning model had been trained before this prediction is 60%, the first number is 60.
Based on the first number, the host can select the images to be checked with the smallest feature distances as the candidate credible images corresponding to the prediction label.
After the first number is calculated, for the images to be checked with the same prediction label, the host may sort the images by their feature distances and then select the first number of images with the smallest feature distances as the candidate credible images corresponding to the prediction label. For example, if the first number is 60 and the total number of images to be checked with the same prediction label is 100, the 100 images are sorted by feature distance, and the 60 images with the smallest feature distances are selected as candidate credible images.
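The ratio-based selection described above can be illustrated with a short sketch. It assumes each prediction is given as an `(image_id, prediction_label, feature_distance)` tuple and that the first number is rounded down when the product is not an integer; the rounding rule and all names (`PRESET_PERCENTAGES`, `select_candidates`) are assumptions made for illustration. The percentages are the example values from the text.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Example schedule from the text: one preset ratio per training round, in percent.
PRESET_PERCENTAGES = (20, 40, 60, 80, 100)

Prediction = Tuple[str, str, float]  # (image_id, prediction_label, feature_distance)


def select_candidates(
    predictions: Sequence[Prediction],
    training_round: int,
    percentages: Sequence[int] = PRESET_PERCENTAGES,
) -> Dict[str, List[str]]:
    """For each prediction label, keep the images with the smallest feature distances.

    training_round is 1-based; rounds beyond the schedule reuse its last ratio.
    """
    if training_round < 1:
        raise ValueError("training_round is 1-based")
    percent = percentages[min(training_round, len(percentages)) - 1]

    by_label: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for image_id, label, distance in predictions:
        by_label[label].append((distance, image_id))

    candidates: Dict[str, List[str]] = {}
    for label, items in by_label.items():
        # Integer arithmetic avoids floating-point surprises; rounds down (assumption).
        first_number = len(items) * percent // 100
        items.sort()  # ascending feature distance
        candidates[label] = [image_id for _, image_id in items[:first_number]]
    return candidates
```

For 100 images predicted as "face" in the third training round, the schedule gives 60%, so 60 candidate credible images are kept for that label.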
In yet another embodiment, the host computer may determine whether the confidence corresponding to the prediction tag of each image to be checked is greater than a preset confidence threshold. Illustratively, the confidence threshold may be 0.5.
In one case, the confidence does not exceed the confidence threshold, which indicates that the prediction result of the host for the image to be checked is not credible.
In the other case, the confidence exceeds the confidence threshold, which indicates that the prediction result of the host for the image to be checked is relatively credible, and the image to be checked is determined to be a candidate credible image.
After the candidate trusted images are screened out, the host computer can verify whether the candidate trusted images are trusted images.
For each candidate trusted image, the host may determine whether a feature distance corresponding to the candidate trusted image is smaller than a preset distance threshold. Here, the distance threshold may be an empirical value.
In one case, the feature distance is not less than the distance threshold, which indicates that the prediction result of the candidate trusted image by the host is not trusted.
In another case, the feature distance is smaller than the distance threshold, which indicates that the prediction result of the host for the candidate trusted image is trusted, and the candidate trusted image is determined to be a trusted image.
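The confidence-based variant followed by the distance check can be sketched as below. The confidence threshold 0.5 is the illustrative value from the text; the distance threshold 1.0 is only a placeholder, since the text describes it as an empirical value. The `Prediction` record and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Prediction:
    image_id: str
    prediction_label: str
    confidence: float        # confidence output by the classifier for the label
    feature_distance: float  # distance to sample images with the same label


def screen_trusted(
    predictions: Sequence[Prediction],
    confidence_threshold: float = 0.5,  # example value from the text
    distance_threshold: float = 1.0,    # placeholder; the text says "empirical"
) -> List[Prediction]:
    """Two-stage screening: candidate by confidence, then trusted by distance."""
    trusted = []
    for p in predictions:
        is_candidate = p.confidence > confidence_threshold
        if is_candidate and p.feature_distance < distance_threshold:
            trusted.append(p)
    return trusted
```

An image with confidence 0.9 and feature distance 0.3 passes both checks, while an image with confidence 0.4 fails the first and an image with feature distance 2.0 fails the second.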
Step 230: Retrain the machine learning model according to the prediction labels of the credible images and perform the classification calculation on the images to be checked again, until the label prediction results of the images to be checked meet the credibility requirement.
After the credible images are screened out, the host can retrain the machine learning model according to the prediction labels of the credible images and the sample images in the initial training set, and perform classification calculation on the images to be checked based on the trained machine learning model.
The prediction capability of the machine learning model is enhanced after each training, and the accuracy of the machine learning model on the label prediction result of each image to be checked is improved. The whole process is carried out in an iteration mode, the credible images predicted by the machine learning model can be increased gradually along with the increase of the training times, and the prediction capability of the machine learning model can be improved through the training of the credible images on the machine learning model, so that the model is more stable.
For each image to be checked, the label prediction result output by the machine learning model each time is possibly different, the predicted label tends to be stable along with the increase of the training times, and the reliability of the corresponding label is increased.
This process is repeated until the label prediction results of the images to be checked meet the credibility requirement. The credibility requirement may be that the label credibility corresponding to every image to be checked in the set to be checked indicates that the image is a credible image.
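The iteration of steps 210 to 230 can be sketched end to end with a deliberately simple stand-in model. The following is not the network model described in the application: it assumes fixed, precomputed feature vectors, uses a nearest-centroid classifier in place of the neural network, and uses the distance to the predicted label's centroid as the feature distance. All names and the `max_rounds` cap are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Sample = Tuple[Vector, str]  # (feature vector, label)


def train(samples: Sequence[Sample]) -> Dict[str, List[float]]:
    """Stand-in 'training': compute one centroid per label."""
    grouped: Dict[str, List[Vector]] = {}
    for vec, label in samples:
        grouped.setdefault(label, []).append(vec)
    return {
        label: [sum(v[i] for v in vecs) / len(vecs) for i in range(len(vecs[0]))]
        for label, vecs in grouped.items()
    }


def predict(model: Dict[str, List[float]], vec: Vector) -> Tuple[str, float]:
    """Return (prediction label, distance to that label's centroid)."""
    return min(((label, math.dist(vec, c)) for label, c in model.items()),
               key=lambda pair: pair[1])


def verify_iteratively(
    initial_samples: List[Sample],
    to_check: Dict[str, Vector],
    distance_threshold: float,
    max_rounds: int = 5,
) -> Tuple[Dict[str, str], bool]:
    """Retrain on initial samples plus trusted images until every image is trusted."""
    trusted: Dict[str, str] = {}
    predictions: Dict[str, Tuple[str, float]] = {}
    for _ in range(max_rounds):
        # Trusted images join the training data with their prediction labels.
        extra = [(to_check[i], label) for i, label in trusted.items()]
        model = train(initial_samples + extra)
        predictions = {i: predict(model, v) for i, v in to_check.items()}
        trusted = {i: label for i, (label, d) in predictions.items()
                   if d < distance_threshold}
        if len(trusted) == len(to_check):
            break
    labels = {i: label for i, (label, _) in predictions.items()}
    return labels, len(trusted) == len(to_check)
```

In this sketch the second return value reports whether the credibility requirement was met within `max_rounds`; a real system would also need to decide what to do when it is not.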
Step 240: Compare the label prediction result of each image to be checked with its annotation label to obtain a verification result.
When all the images to be checked are credible images, the host can compare the prediction label and the annotation label of each image to be checked to obtain a verification result. The verification result may include the images to be checked whose prediction labels are inconsistent with their annotation labels.
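The comparison in step 240 amounts to collecting the images whose two labels disagree. A minimal sketch, assuming both label sets are dictionaries keyed by a hypothetical image identifier:

```python
from typing import Dict, Tuple


def check_labels(
    predicted: Dict[str, str],
    annotated: Dict[str, str],
) -> Dict[str, Tuple[str, str]]:
    """Map each mismatching image id to (annotation label, prediction label)."""
    mismatches = {}
    for image_id, annotation in annotated.items():
        prediction = predicted[image_id]  # every checked image must have a prediction
        if prediction != annotation:
            mismatches[image_id] = (annotation, prediction)
    return mismatches
```

The returned dictionary plays the role of the verification result: an empty dictionary means every annotation label agrees with the model's prediction.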
In an embodiment, before performing step 210, the host computer may select an annotated image with a credible annotation tag, add the annotated image as a sample image into the initial training set, and add an annotated image outside the initial training set as an image to be verified into the set to be verified.
By the measures of this embodiment, the labor cost for acquiring the sample image can be further reduced.
The host may cluster-divide the tagged images for which the tag indicates the same image category information into a plurality of clusters.
Taking the K-means algorithm as an example, the host can obtain the number K of clusters for any image category information. Wherein the number of clusters may be predetermined. For example, an image whose image type information is "face" can be divided into 5 clusters according to the shooting angle, i.e., front view, left view, right view, top view, and bottom view.
The host computer can randomly select K cluster center images from the annotation images of which the annotation labels indicate the image type information. Here, the cluster center image is an image at the center position in one cluster.
For each annotated image whose annotation label indicates the image category information, the host computer may calculate the Euclidean distance between the annotated image and each cluster center image, and assign the annotated image to the cluster of the cluster center image with the smallest Euclidean distance. When the Euclidean distance is calculated, the difference between the pixel values at each position of the annotated image and the cluster center image can be calculated and squared; the squares of all the differences are then summed, and the square root of the sum is taken to obtain the Euclidean distance.
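Written as a formula, and assuming for illustration that both images are single-channel and of the same size H × W (the symbols A, C, H, and W are introduced here and do not appear in the application), the pixel-wise Euclidean distance between an annotated image A and a cluster center image C is:

```latex
d(A, C) = \sqrt{\sum_{i=1}^{H} \sum_{j=1}^{W} \left( A_{ij} - C_{ij} \right)^{2}}
```

For multi-channel images the sum would simply run over the channel index as well.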
After one round of calculation, the host has divided all the annotated images with the same annotation label into a plurality of clusters. For each cluster, the host computes a new cluster center image: it can calculate the average value of the pixels at each position over all the annotated images in the cluster, and the image formed by these average values is used as the new cluster center image.
After obtaining the new cluster center image, the host may recalculate the euclidean distance between each annotation image and each cluster center image, thereby dividing all annotation images into a plurality of clusters according to the minimum euclidean distance.
Generally, because the cluster center images change in each round, the annotated images assigned to each cluster also change. After the host has repeated the calculation process a number of times, the division result no longer changes, and the clustering is then complete.
The host computer can calculate a consistency parameter between the cluster center image of each cluster and each annotated image within the cluster, and can screen out the annotated images whose annotation labels are credible according to the consistency parameters.
For example, if the host performs cluster partitioning according to the euclidean distance between the cluster center image and the label image in the cluster, the consistency parameter may be the euclidean distance.
In this case, the host may directly take the Euclidean distance between each annotated image in the cluster and the cluster center image from the last round of calculation, and determine whether the consistency parameter corresponding to each annotated image is smaller than a preset parameter threshold; if so, the annotated image is taken as an annotated image with a credible annotation label.
By requiring the Euclidean distance to be smaller than the parameter threshold, the host can screen out the annotated images in each cluster that are sufficiently similar to the cluster center image. Because it is rare for a large number of annotation labels to be wrong, the annotation labels of the screened images, which share the same annotation label and are similar to one another, can be determined to be credible.
Through the measures, the labor cost for selecting the sample image is reduced.
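The clustering and consistency screening described above can be sketched in pure Python. The sketch assumes each annotated image has been flattened into a list of pixel values of equal length, follows the K-means steps as described (random initial centers, nearest-center assignment, mean update, stop when assignments no longer change), and uses the Euclidean distance as the consistency parameter. The function names, the `seed` and `max_iter` parameters, and the handling of empty clusters are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

Image = Sequence[float]  # flattened pixel values


def mean_image(images: Sequence[Image]) -> List[float]:
    """Pixel-wise average of a non-empty list of equally sized images."""
    n = len(images)
    return [sum(img[i] for img in images) / n for i in range(len(images[0]))]


def kmeans(images: Sequence[Image], k: int, seed: int = 0,
           max_iter: int = 100) -> Tuple[List[List[float]], List[int]]:
    """Cluster images of one annotation label; returns (centers, assignment)."""
    rng = random.Random(seed)
    centers = [list(img) for img in rng.sample(list(images), k)]
    assignment: List[int] = []
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda c: math.dist(img, centers[c]))
            for img in images
        ]
        if new_assignment == assignment:
            break  # division result no longer changes
        assignment = new_assignment
        for c in range(k):
            members = [img for img, a in zip(images, assignment) if a == c]
            if members:  # an empty cluster keeps its previous center (assumption)
                centers[c] = mean_image(members)
    return centers, assignment


def credible_indices(images: Sequence[Image], centers: Sequence[Image],
                     assignment: Sequence[int],
                     parameter_threshold: float) -> List[int]:
    """Indices of images whose distance to their cluster center is below the threshold."""
    return [
        i for i, (img, a) in enumerate(zip(images, assignment))
        if math.dist(img, centers[a]) < parameter_threshold
    ]
```

The images selected by `credible_indices` would form the initial training set, and the remaining annotated images would go into the set to be checked.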
Referring to fig. 3, a flowchart of a method for verifying an image tag according to another exemplary embodiment of the present application is shown.
The host computer executes step 310 to obtain sample images with credible annotation labels, and then executes step 320 to train a machine learning model with the sample images. After the trained machine learning model is obtained, the host computer executes step 330 and performs classification calculation on the images to be checked through the trained machine learning model to obtain a prediction label corresponding to each image to be checked. The host computer may calculate the label credibility of the prediction label of each image to be checked, and perform step 340 to determine whether the label prediction results meet the credibility requirement. The credibility requirement is that the label credibility of the prediction label of each image to be checked allows the image to be determined to be a credible image.
If the label prediction results do not meet the credibility requirement, the host re-executes step 310, in which it can screen out the credible images as sample images according to the label credibility. The host computer may then perform steps 320 through 340 again, until the label prediction results meet the credibility requirement.
If the label prediction results meet the credibility requirement, the host computer may execute step 350 to compare the prediction label and the annotation label of each image to be checked, thereby obtaining a verification result. The verification result may include the images to be checked whose prediction labels are inconsistent with their annotation labels.
Fig. 4 is a block diagram of an image tag verification apparatus according to an embodiment of the present invention, and as shown in fig. 4, the apparatus may include: prediction module 410, screening module 420, training module 430, and verification module 440.
The prediction module 410 is configured to perform classification calculation on the images to be checked in the set to be checked by using the machine learning model trained on the initial training set, so as to obtain the prediction labels and the label credibility of the images to be checked.
The screening module 420 is configured to screen the credible images from the set to be checked according to the label credibility of the images to be checked.
The training module 430 is configured to retrain the machine learning model according to the prediction labels of the credible images and perform the classification calculation on the images to be checked again, until the label prediction results of the images to be checked meet the credibility requirement.
The verification module 440 is configured to compare the label prediction result of each image to be checked with its annotation label to obtain a verification result.
In one embodiment, the apparatus further comprises:
a selecting unit (not shown in the figure), used for selecting annotated images whose annotation labels are credible, adding them to the initial training set as sample images, and adding the annotated images outside the initial training set to the set to be checked as images to be checked;
a training unit (not shown in the figure), used for training the machine learning model by using the sample images in the initial training set.
In an embodiment, the selecting unit (not shown in the figure) is further configured to:
clustering the annotated images whose annotation labels indicate the same image category into a plurality of clusters;
calculating a consistency parameter between the cluster center image of each cluster and each annotated image in the cluster;
and screening out, according to the consistency parameters, the annotated images whose annotation labels are credible.
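The consistency step can be sketched in Python as follows. This is only an illustration under stated assumptions: the images are taken to be already grouped into one cluster (the clustering algorithm is not specified in this section), the "cluster center image" is taken to be the member closest to the cluster mean, the consistency parameter is taken to be cosine similarity, and the threshold `min_consistency` and all function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity, used here as an assumed consistency parameter."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return dot / norm if norm else 0.0

def center_image(features):
    """Index of the member closest to the cluster mean (the assumed 'cluster center image')."""
    dim = len(features[0])
    mean = [sum(f[i] for f in features) / len(features) for i in range(dim)]
    return min(range(len(features)), key=lambda i: math.dist(features[i], mean))

def credible_members(cluster, min_consistency=0.9):
    """cluster: list of (image_id, feature) that share one annotation label and one cluster.
    Returns the ids whose consistency with the center image reaches the threshold."""
    feats = [f for _, f in cluster]
    center = feats[center_image(feats)]
    return [img_id for img_id, f in cluster if cosine(center, f) >= min_consistency]
```

Images returned by `credible_members` would go into the initial training set; the remaining annotated images would go into the set to be checked.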
In an embodiment, the prediction module 410 is further configured to:
extracting image features of the image to be checked through the machine learning model, and performing classification calculation based on the image features to obtain a prediction label of the image to be checked;
for each image to be checked, extracting, through the machine learning model, the image features of the sample images whose annotation label is the same as the prediction label of the image to be checked;
and calculating the feature distance between the image features of the image to be checked and the image features of the sample images.
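A short Python sketch of the feature-distance computation follows. The section does not say how the distances to several sample images are aggregated or which metric is used, so the sketch assumes the Euclidean distance to the mean feature vector (a class prototype) of the sample images; `label_feature_distance` is a hypothetical name.

```python
import math

def label_feature_distance(image_feature, sample_features):
    """Distance between one image to be checked and the sample images
    whose annotation label equals the image's prediction label.

    Assumption: the sample images are summarized by their mean feature
    vector, and the distance is Euclidean.
    """
    dim = len(image_feature)
    prototype = [sum(s[i] for s in sample_features) / len(sample_features)
                 for i in range(dim)]
    return math.dist(image_feature, prototype)
```

Under this reading, a smaller distance means the image lies closer to the trusted samples of its predicted class, which corresponds to higher label credibility.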
In an embodiment, the screening module 420 is further configured to:
for each prediction label, screening candidate credible images from the images to be checked that have the prediction label;
and for each candidate credible image, judging whether the feature distance corresponding to the candidate credible image is smaller than a preset distance threshold, and if so, determining that the candidate credible image is a credible image.
In an embodiment, the screening module 420 is further configured to:
for each prediction label, determining a first number of candidate credible images according to a preset ratio corresponding to the number of training rounds and the total number of images to be checked that have the prediction label;
and selecting, based on the first number, the images to be checked with the smallest feature distances as the candidate credible images corresponding to the prediction label.
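The two screening embodiments above can be combined in a small Python sketch. The ratio schedule `ratios`, the rounding of the first number with `math.ceil`, the threshold value, and the name `select_credible` are assumptions for illustration; the section only states that the ratio depends on the number of training rounds.

```python
import math

def select_credible(distances, round_index,
                    ratios=(0.25, 0.5, 0.75), dist_threshold=1.0):
    """distances: dict image_id -> feature distance, for images that all
    share one prediction label. Returns the ids accepted as credible."""
    # Preset ratio for this training round (clamped to the last entry).
    ratio = ratios[min(round_index, len(ratios) - 1)]
    # "First number" of candidates: ratio times the number of images with this label.
    first_number = math.ceil(ratio * len(distances))
    # Candidates are the images with the smallest feature distances.
    candidates = sorted(distances, key=distances.get)[:first_number]
    # A candidate becomes credible only if its distance is below the threshold.
    return [img_id for img_id in candidates if distances[img_id] < dist_threshold]
```

With a schedule that grows over rounds, early rounds admit only the closest few images per label, and later rounds admit more as the retrained model becomes more reliable.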
In an embodiment, the screening module 420 is further configured to:
judging whether the confidence corresponding to the prediction label of each image to be checked is greater than a preset confidence threshold, and if so, taking the image to be checked as a candidate credible image corresponding to the prediction label.
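A confidence-based candidate filter might look like the following Python sketch. It assumes the confidence is the softmax probability of the predicted class computed from raw classifier scores, which this section does not state; `softmax`, `confident_candidates`, and the threshold value are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confident_candidates(logits_by_image, labels, conf_threshold=0.9):
    """Return {image_id: prediction label} for images whose top-class
    confidence is greater than the preset confidence threshold."""
    out = {}
    for img_id, logits in logits_by_image.items():
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] > conf_threshold:
            out[img_id] = labels[best]
    return out
```

The candidates returned here would still be subject to the feature-distance threshold before being accepted as credible images.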
For the implementation of the functions and actions of each module in the apparatus, refer to the implementation of the corresponding steps in the image label verification method described above; details are not repeated here.
In the embodiments provided in the present application, the disclosed apparatus and method can also be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the part of it that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.

Claims (9)

1. An image label verification method, comprising:
performing classification calculation on the images to be checked in a set to be checked by using a machine learning model trained with an initial training set, to obtain the prediction labels and the label credibility of the images to be checked, which comprises: extracting image features of the image to be checked through the machine learning model, and performing classification calculation based on the image features to obtain the prediction label of the image to be checked; for each image to be checked, extracting, through the machine learning model, the image features of the sample images whose annotation label is the same as the prediction label of the image to be checked; and calculating a feature distance between the image features of the image to be checked and the image features of the sample images, wherein the feature distance is used to represent the label credibility;
screening credible images from the set to be checked according to the label credibility of the images to be checked;
retraining the machine learning model and repeating the classification calculation on the images to be checked according to the prediction labels of the credible images, until the label prediction results of the images to be checked meet the credibility requirement;
and comparing the label prediction result of each image to be checked with its annotation label to obtain a verification result.
2. The method of claim 1, wherein before performing the classification calculation on the images to be checked in the set to be checked by using the machine learning model trained with the initial training set, the method further comprises:
selecting annotated images whose annotation labels are credible and adding them to the initial training set as sample images, and adding the annotated images outside the initial training set to the set to be checked as images to be checked;
and training the machine learning model by using the sample images in the initial training set.
3. The method of claim 2, wherein selecting annotated images whose annotation labels are credible comprises:
clustering the annotated images whose annotation labels indicate the same image category into a plurality of clusters;
calculating a consistency parameter between the cluster center image of each cluster and each annotated image in the cluster;
and screening out, according to the consistency parameters, the annotated images whose annotation labels are credible.
4. The method according to claim 1, wherein screening credible images from the set to be checked according to the label credibility of the images to be checked comprises:
for each prediction label, screening candidate credible images from the images to be checked that have the prediction label;
and judging whether the feature distance corresponding to each candidate credible image is smaller than a preset distance threshold, and if so, determining that the candidate credible image is a credible image.
5. The method of claim 4, wherein, for each prediction label, screening candidate credible images from the images to be checked that have the prediction label comprises:
for each prediction label, determining a first number of candidate credible images according to a preset ratio corresponding to the number of training rounds and the total number of images to be checked that have the prediction label;
and selecting, based on the first number, the images to be checked with the smallest feature distances as the candidate credible images corresponding to the prediction label.
6. The method of claim 4, wherein, for each prediction label, screening candidate credible images from the images to be checked that have the prediction label comprises:
judging whether the confidence corresponding to the prediction label of each image to be checked is greater than a preset confidence threshold, and if so, taking the image to be checked as a candidate credible image corresponding to the prediction label.
7. An image label verification apparatus, comprising:
a prediction module, configured to perform classification calculation on the images to be checked in a set to be checked by using a machine learning model trained with an initial training set, to obtain the prediction labels and the label credibility of the images to be checked, which comprises: extracting image features of the image to be checked through the machine learning model, and performing classification calculation based on the image features to obtain the prediction label of the image to be checked; for each image to be checked, extracting, through the machine learning model, the image features of the sample images whose annotation label is the same as the prediction label of the image to be checked; and calculating a feature distance between the image features of the image to be checked and the image features of the sample images, wherein the feature distance is used to represent the label credibility;
a screening module, configured to screen credible images from the set to be checked according to the label credibility of the images to be checked;
a training module, configured to retrain the machine learning model and repeat the classification calculation on the images to be checked according to the prediction labels of the credible images, until the label prediction results of the images to be checked meet the credibility requirement;
and a verification module, configured to compare the label prediction result of each image to be checked with its annotation label to obtain a verification result.
8. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the image label verification method of any one of claims 1 to 6.
9. A computer-readable storage medium, characterized in that the storage medium stores a computer program executable by a processor to perform the image label verification method of any one of claims 1 to 6.
CN202010169690.9A 2020-03-10 2020-03-10 Image label verification method and device, electronic equipment and storage medium Active CN111353549B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010169690.9A CN111353549B (en) 2020-03-10 2020-03-10 Image label verification method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111353549A CN111353549A (en) 2020-06-30
CN111353549B true CN111353549B (en) 2023-01-31

Family

ID=71197376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010169690.9A Active CN111353549B (en) 2020-03-10 2020-03-10 Image label verification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111353549B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115996A (en) * 2020-09-11 2020-12-22 北京达佳互联信息技术有限公司 Image data processing method, device, equipment and storage medium
CN112131415A (en) * 2020-09-18 2020-12-25 北京影谱科技股份有限公司 Method and device for improving data acquisition quality based on deep learning
CN112163110B (en) * 2020-09-27 2023-01-03 Oppo(重庆)智能科技有限公司 Image classification method and device, electronic equipment and computer-readable storage medium
CN112801114B (en) * 2021-01-20 2024-03-08 杭州依图医疗技术有限公司 Method and device for determining projection position information of breast image
CN112906817A (en) * 2021-03-16 2021-06-04 中科海拓(无锡)科技有限公司 Intelligent image labeling method
CN113407680B (en) * 2021-06-30 2023-06-02 竹间智能科技(上海)有限公司 Heterogeneous integrated model screening method and electronic equipment
CN114863242B (en) * 2022-04-26 2022-11-29 北京拙河科技有限公司 Deep learning network optimization method and system for image recognition

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105117429A (en) * 2015-08-05 2015-12-02 广东工业大学 Scenario image annotation method based on active learning and multi-label multi-instance learning
CN105848109A (en) * 2016-04-26 2016-08-10 国网安徽省电力公司信息通信分公司 Indoor Internet of things active tag location method
CN108416370A (en) * 2018-02-07 2018-08-17 深圳大学 Image classification method, device based on semi-supervised deep learning and storage medium
CN109241903A (en) * 2018-08-30 2019-01-18 平安科技(深圳)有限公司 Sample data cleaning method, device, computer equipment and storage medium
CN109271529A (en) * 2018-10-10 2019-01-25 内蒙古大学 Cyrillic Mongolian and the double language knowledge mapping construction methods of traditional Mongolian
CN109345515A (en) * 2018-09-17 2019-02-15 代黎明 Sample label confidence calculations method, apparatus, equipment and model training method
CN109784391A (en) * 2019-01-04 2019-05-21 杭州比智科技有限公司 Sample mask method and device based on multi-model
CN110263814A (en) * 2019-05-27 2019-09-20 南京信息工程大学 Increment cluster data mining method based on dynamic clustering trend analysis
CN110458107A (en) * 2019-08-13 2019-11-15 北京百度网讯科技有限公司 Method and apparatus for image recognition
WO2019233271A1 (en) * 2018-06-08 2019-12-12 Oppo广东移动通信有限公司 Image processing method, computer readable storage medium and electronic device
CN110704661A (en) * 2019-10-12 2020-01-17 腾讯科技(深圳)有限公司 Image classification method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8194938B2 (en) * 2009-06-02 2012-06-05 George Mason Intellectual Properties, Inc. Face authentication using recognition-by-parts, boosting, and transduction
CN108764208B (en) * 2018-06-08 2021-06-08 Oppo广东移动通信有限公司 Image processing method and device, storage medium and electronic equipment
CN108875821A (en) * 2018-06-08 2018-11-23 Oppo广东移动通信有限公司 The training method and device of disaggregated model, mobile terminal, readable storage medium storing program for executing
CN109800320B (en) * 2019-01-04 2023-08-18 平安科技(深圳)有限公司 Image processing method, device and computer readable storage medium
CN110598033B (en) * 2019-08-14 2023-03-28 中国平安财产保险股份有限公司 Intelligent self-checking vehicle method and device and computer readable storage medium
CN110781859B (en) * 2019-11-05 2022-08-19 深圳奇迹智慧网络有限公司 Image annotation method and device, computer equipment and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Change Detection in Remote Sensing Images Based on Image Mapping and a Deep Capsule Network;Wenping Ma 等;《Remote Sensing》;20190314;1-24 *
Metric Learning for Regression Problems and Human Age Estimation;Bo Xiao 等;《Pacific-Rim Conference on Multimedia 2009》;20091231;88–99 *
基于弱标签数据的图像精细分类研究;肖浩泉;《中国优秀硕士学位论文全文数据库 信息科技辑》;20200215;I138-1311 *
复杂仿真实验结果可信度评估方法研究;胡晓峰;《中国优秀硕士学位论文全文数据库 信息科技辑》;20200115;I138-2315 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant