CN111353549B

CN111353549B - Image label verification method and device, electronic equipment and storage medium

Info

Publication number: CN111353549B
Application number: CN202010169690.9A
Authority: CN
Inventors: 秦永强; 李素莹; 纪双西; 张祥伟
Original assignee: Ainnovation Chongqing Technology Co ltd
Current assignee: Ainnovation Chongqing Technology Co ltd
Priority date: 2020-03-10
Filing date: 2020-03-10
Publication date: 2023-01-31
Anticipated expiration: 2040-03-10
Also published as: CN111353549A

Abstract

The application provides a method and a device for verifying an image label, an electronic device, and a computer-readable storage medium. The method comprises: performing classification calculation on the images to be checked in a set to be checked by using a machine learning model trained with an initial training set, to obtain a prediction label and a label credibility for each image to be checked; screening credible images from the set to be checked according to the label credibility of the images to be checked; according to the prediction labels of the credible images, performing the training of the machine learning model and the classification calculation of the images to be checked again, until the label prediction result of the images to be checked meets the credibility requirement; and comparing the label prediction result of each image to be checked with its annotation label to obtain a verification result. In the embodiments provided by the application, the whole verification process is executed by a computer, which reduces the labor cost.

Description

Image tag verification method and device, electronic device and storage medium

Technical Field

The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for verifying an image tag, an electronic device, and a computer-readable storage medium.

Background

For a supervised machine learning model, a large amount of labeled data is required as a training set, and the data quality of the training set is important for the learning effect and performance of the model. To ensure data quality, after the data is labeled manually, the labeled data usually undergoes multiple rounds of checking and repeated cleaning to ensure that each item carries a correct label. The labor cost of this process is very high.

Disclosure of Invention

The embodiment of the application aims to provide an image label verification method and device, electronic equipment and a computer readable storage medium, which are used for reducing the labor cost for verifying image labels.

In one aspect, the present application provides a method for verifying an image tag, including:

classifying and calculating images to be checked in a set to be checked by using a machine learning model trained by an initial training set to obtain a prediction label and label reliability of the images to be checked;

screening credible images from the to-be-checked set according to the label credibility of the to-be-checked images;

according to the prediction label of the credible image, the training of the machine learning model and the classification calculation of the image to be checked are carried out again until the label prediction result of the image to be checked meets the credibility requirement;

and comparing the label prediction result of the image to be checked with the annotation label to obtain a verification result.

In an embodiment, before the performing the classification calculation on the image to be checked in the image to be checked set by using the machine learning model trained by the initial training set, the method further includes:

selecting an annotation image with the credible annotation label as a sample image to be added into the initial training set, and adding an annotation image except the initial training set as an image to be checked into the set to be checked;

training the machine learning model using the sample images in the initial training set.

In one embodiment, the selecting an annotated image for which the annotation tag is authentic includes:

clustering and dividing the marked images of which the marking labels indicate the same image category information into a plurality of clusters;

calculating consistency parameters between the cluster center image of each cluster and each labeled image in the cluster;

and screening out the credible annotation images of the annotation labels according to the consistency parameters.

In one embodiment, the label credibility comprises a feature distance; performing classification calculation on the images to be checked in the set to be checked to obtain the prediction labels and the label credibility of the images to be checked comprises:

extracting image features of the image to be checked through the machine learning model, and performing classification calculation based on the image features to obtain a prediction label of the image to be checked;

for each image to be checked, extracting the image characteristics of the sample image with the same labeling label as the prediction label of the image to be checked through the machine learning model;

and calculating the characteristic distance between the image characteristic of the image to be checked and the image characteristic of the sample image.

In an embodiment, the screening of the credible images from the to-be-verified set according to the label credibility of the to-be-verified image includes:

for each prediction label, screening candidate credible images from the images to be checked with the prediction label;

and for each candidate credible image, judging whether the characteristic distance corresponding to the candidate credible image is smaller than a preset distance threshold, and if so, determining that the candidate credible image is a credible image.

In an embodiment, for each prediction label, the screening of candidate credible images from the images to be checked with the prediction label comprises:

for each prediction label, determining a first number of the candidate credible images according to a preset ratio corresponding to the training times and the total number of the images to be checked with the prediction label;

and selecting the image to be checked with the minimum characteristic distance as a candidate credible image corresponding to the prediction label based on the first quantity.

In an embodiment, for each prediction tag, the screening of candidate credible images from the images to be checked with the prediction tag includes:

and judging whether the confidence corresponding to the prediction label of each image to be checked is greater than a preset confidence threshold, and if so, taking the image to be checked as a candidate credible image corresponding to the prediction label.

On the other hand, the present application further provides an image tag verification apparatus, including:

the prediction module is used for carrying out classification calculation on the images to be checked in the set to be checked by using the machine learning model trained by the initial training set to obtain the prediction labels and the label credibility of the images to be checked;

the screening module is used for screening the credible images from the to-be-checked set according to the label credibility of the to-be-checked images;

the training module is used for carrying out training of the machine learning model and classification calculation of the image to be checked again according to the prediction label of the credible image until the label prediction result of the image to be checked meets the credibility requirement;

and the verification module is used for comparing the label prediction result of the image to be verified with the labeled label to obtain a verification result.

Further, the present application also provides an electronic device, including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to perform the image tag verification method.

In addition, the present application also provides a computer readable storage medium, which stores a computer program executable by a processor to perform the above-mentioned image tag verification method.

In the embodiment provided by the application, a machine learning model is trained to perform classified calculation on an image to be checked to obtain a prediction label and label reliability, a credible image is screened out according to the label reliability, then the machine learning model is retrained according to the prediction label of the credible image, and the trained machine learning model is used for performing classified calculation on the image to be checked until the label prediction result of the image to be checked meets the reliability requirement, and a checking result can be obtained by comparing the label prediction result of the image to be checked with a label; the whole checking process is executed by a computer, so that the labor cost is reduced.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required to be used in the embodiments of the present application will be briefly described below.

Fig. 1 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application;

Fig. 2 is a schematic flowchart of a method for verifying an image tag according to an exemplary embodiment of the present application;

Fig. 3 is a schematic flowchart of a method for verifying an image tag according to another exemplary embodiment of the present application;

Fig. 4 is a block diagram of an apparatus for verifying an image tag according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.

Like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.

As shown in fig. 1, an electronic device 1 provided in an embodiment of the present application includes: at least one processor 11 and a memory 12, one processor 11 being exemplified in fig. 1. The processor 11 and the memory 12 are connected by a bus 10, and the memory 12 stores instructions executable by the processor 11, and the instructions are executed by the processor 11, so that the electronic device 1 can execute all or part of the flow of the method in the embodiments described below. In an embodiment, the electronic device 1 may be a host that performs verification of image tags.

The memory 12 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.

The present application also provides a computer-readable storage medium storing a computer program executable by the processor 11 to perform the method for verifying an image tag provided by the present application.

Referring to fig. 2, a flowchart of a method for verifying an image tag according to an embodiment of the present application is shown, and as shown in fig. 2, the method may include the following steps 210 to 240.

Step 210: Perform classification calculation on the images to be checked in the set to be checked by using the machine learning model trained with the initial training set, to obtain the prediction labels and the label credibility of the images to be checked.

The machine learning model is a network model for image classification, and may be any one of network models such as AlexNet, ZFNet, VGGNet, GoogLeNet, and ResNet.

The initial training set comprises sample images selected in an initial state, each carrying an annotation label indicating image category information. The sample images in the initial training set can be selected manually, so that the accuracy of their annotation labels has been confirmed by manual verification. After being trained on the sample images, the machine learning model can perform classification calculation.

The to-be-verified set can comprise all to-be-verified images of which the labeling labels need to be verified.

In one embodiment, the host computer extracts image features of the image to be checked through the machine learning model, and performs classification calculation based on the image features to obtain a prediction tag of the image to be checked and a confidence corresponding to the prediction tag. Here, the prediction tag indicates image category information calculated by the machine learning model for the image to be checked.

After all the images to be checked in the set to be checked are classified through the machine learning model, the prediction label of each image to be checked and the label credibility corresponding to the prediction label can be obtained. The label credibility is used to characterize the correctness of the prediction label.

For example, the label credibility may be a feature distance between an image feature of the image to be checked and an image feature of the sample image. The feature distance is inversely related to the label credibility: the larger the feature distance, the lower the label credibility, and conversely, the smaller the feature distance, the higher the label credibility.

In an embodiment, if the label reliability is the feature distance, for each image to be checked, the host computer may extract, through the machine learning model, the image features of the sample image with the label identical to the predicted label of the image to be checked.

Here, the image feature may be a feature vector or a feature map, and the feature distance may be a Euclidean distance between the feature vector of the image to be checked and the feature vector of the sample image, or between the feature map of the image to be checked and the feature map of the sample image.

For each image to be checked, the host computer may select a sample image with the label identical to the prediction label of the image to be checked from the initial training set.

And the host computer extracts the image characteristics of the sample image through a machine learning model and calculates the characteristic distance between the image characteristics of the image to be verified and the image characteristics of the sample image.

Because a plurality of sample images with the same annotation label exist, for each image to be checked, the host can calculate the Euclidean distance between the image features of the image to be checked and the image features of each sample image, and then select the minimum Euclidean distance as the feature distance. Alternatively, the host can calculate the average value of the plurality of Euclidean distances and take the average value as the feature distance.
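The two ways of deriving the feature distance described above can be illustrated with a minimal sketch. It assumes that image features are already available as plain lists of numbers; the function names `euclidean` and `feature_distance` and the `mode` parameter are hypothetical and do not appear in the application.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def feature_distance(query, samples, mode="min"):
    """Distance from one image's features to the same-label sample features.

    mode="min"  -> smallest Euclidean distance to any sample image
    mode="mean" -> average Euclidean distance over all sample images
    (the mode switch is an assumption made for this illustration)
    """
    if not samples:
        raise ValueError("at least one sample feature vector is required")
    distances = [euclidean(query, s) for s in samples]
    if mode == "min":
        return min(distances)
    if mode == "mean":
        return sum(distances) / len(distances)
    raise ValueError(f"unknown mode: {mode}")
```

A smaller returned value corresponds to a higher label credibility for the prediction label.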

Step 220: Screen the credible images from the set to be checked according to the label credibility of the images to be checked.

For each prediction tag, the host may screen candidate trusted images from the images to be verified that have that prediction tag.

In an embodiment, for each prediction label, the host may determine the first number of candidate credible images according to a preset ratio corresponding to the number of training times and the total number of images to be checked with the prediction label.

Here, the number of training times refers to the number of times the machine learning model has been trained. Because the prediction capability of the machine learning model increases with the number of training times, the preset ratio used by the host for screening is gradually increased. For example, if the total number of training times is set to 5, the preset ratios corresponding to the successive training times may be 20%, 40%, 60%, 80%, and 100%, respectively.

The total number of images to be checked corresponding to each prediction label may differ, and based on this total number and the preset ratio, the first number used for screening the candidate credible images in the current round may be determined. For example, if there are 100 images to be checked corresponding to the prediction label "face", and the preset ratio corresponding to the number of training times the machine learning model has undergone before this prediction is 60%, the first number is 60.

The host computer can select the image to be checked with the minimum feature distance as the candidate credible image corresponding to the prediction label based on the first number.

After the first number is calculated, for the images to be checked with the same prediction label, the host may sort the images to be checked by their feature distances from small to large, and then select the first number of images with the smallest feature distances as the candidate credible images corresponding to the prediction label. For example, if the first number is 60 and the total number of images to be checked with the same prediction label is 100, the 100 images to be checked are sorted by feature distance, and the first 60 images with the smallest feature distances are selected as candidate credible images.
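The ratio-based selection can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_candidates`, the dictionary input, the use of integer percentages, and rounding the first number down are all choices made here, not details given in the application. The default ratios are the 20% to 100% example from the text.

```python
def select_candidates(distances_by_image, training_round,
                      ratios_percent=(20, 40, 60, 80, 100)):
    """Pick candidate credible images for ONE prediction label.

    distances_by_image: dict mapping image id -> feature distance
    training_round:     1-based index of the current training round
    """
    percent = ratios_percent[training_round - 1]
    # First number: total images with this label times the preset ratio
    # (rounded down; the rounding rule is an assumption).
    first_number = len(distances_by_image) * percent // 100
    # Sort ascending so the smallest feature distances come first.
    ranked = sorted(distances_by_image, key=distances_by_image.get)
    return ranked[:first_number]
```

With 100 images for one prediction label and the third training round (60%), the call returns the 60 image ids with the smallest feature distances.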

In yet another embodiment, the host computer may determine whether the confidence corresponding to the prediction tag of each image to be checked is greater than a preset confidence threshold. Illustratively, the confidence threshold may be 0.5.

In one case, the confidence does not reach the confidence threshold, which indicates that the host's prediction result for the image to be checked is not credible.

In the other case, the confidence reaches the confidence threshold, which indicates that the host's prediction result for the image to be checked is relatively credible, and the image to be checked is determined to be a candidate credible image.

After the candidate trusted images are screened out, the host computer can verify whether the candidate trusted images are trusted images.

For each candidate trusted image, the host may determine whether a feature distance corresponding to the candidate trusted image is smaller than a preset distance threshold. Here, the distance threshold may be an empirical value.

In one case, the feature distance is not less than the distance threshold, which indicates that the prediction result of the candidate trusted image by the host is not trusted.

In another case, the feature distance is smaller than the distance threshold, which indicates that the host's prediction result for the candidate credible image is credible, and the candidate credible image is determined to be a credible image.
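The two threshold checks (confidence first, then feature distance) can be combined in a short sketch. The record layout with the keys `id`, `confidence` and `distance`, the function name `screen_credible`, and the default distance threshold of 1.0 are hypothetical; the text only says that the distance threshold may be an empirical value. The confidence threshold of 0.5 is the illustrative value from the text.

```python
def screen_credible(images, confidence_threshold=0.5, distance_threshold=1.0):
    """Return ids of images that pass both screening stages.

    images: list of dicts with keys "id", "confidence", "distance"
            (this layout is an assumption made for the illustration).
    """
    credible = []
    for image in images:
        # Stage 1: candidate credible image if confidence exceeds the threshold.
        is_candidate = image["confidence"] > confidence_threshold
        # Stage 2: credible image if the feature distance is below the threshold.
        if is_candidate and image["distance"] < distance_threshold:
            credible.append(image["id"])
    return credible
```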

Step 230: According to the prediction labels of the credible images, perform the training of the machine learning model and the classification calculation of the images to be checked again, until the label prediction result of the images to be checked meets the credibility requirement.

After the credible images are screened out, the host can retrain the machine learning model according to the prediction labels of the credible images and the sample images in the initial training set, and perform classification calculation on the images to be checked based on the trained machine learning model.

The prediction capability of the machine learning model is enhanced after each training, and the accuracy of the machine learning model on the label prediction result of each image to be checked is improved. The whole process is carried out in an iteration mode, the credible images predicted by the machine learning model can be increased gradually along with the increase of the training times, and the prediction capability of the machine learning model can be improved through the training of the credible images on the machine learning model, so that the model is more stable.

For each image to be checked, the label prediction result output by the machine learning model each time is possibly different, the predicted label tends to be stable along with the increase of the training times, and the reliability of the corresponding label is increased.

This process is repeated until the label prediction result of the images to be checked meets the credibility requirement. The credibility requirement may be that the label credibility of every image to be checked in the set to be checked indicates that the image is a credible image.
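The iterative retraining loop can be sketched with a deliberately simple stand-in model. The assumptions are significant: a nearest-centroid classifier over raw feature vectors replaces the image classification network, the distance to the predicted class centroid replaces the feature distance to sample images, and the names `train`, `predict`, `iterate` and the `max_rounds` cap are hypothetical. The sketch shows only the control flow (predict, screen, retrain, repeat), not the application's actual model.

```python
import math


def train(samples):
    """samples: list of (feature_vector, label). Returns label -> centroid."""
    by_label = {}
    for vec, label in samples:
        by_label.setdefault(label, []).append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in by_label.items()
    }


def predict(model, vec):
    """Return (predicted label, distance to that label's centroid)."""
    label = min(model, key=lambda name: math.dist(vec, model[name]))
    return label, math.dist(vec, model[label])


def iterate(initial_samples, to_check, distance_threshold, max_rounds=5):
    """Repeat predict -> screen -> retrain until every image is credible."""
    model = train(initial_samples)
    predictions = {}
    for _round in range(max_rounds):
        predictions = {i: predict(model, vec) for i, vec in enumerate(to_check)}
        credible = [i for i, (_lbl, dist) in predictions.items()
                    if dist < distance_threshold]
        if len(credible) == len(to_check):
            break  # credibility requirement met for all images
        # Retrain on the initial samples plus credible images,
        # using their prediction labels as training labels.
        pseudo = [(to_check[i], predictions[i][0]) for i in credible]
        model = train(initial_samples + pseudo)
    return {i: label for i, (label, _dist) in predictions.items()}
```

The `max_rounds` cap is a safeguard added for the sketch so that the loop terminates even if some images never become credible.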

Step 240: Compare the label prediction result of each image to be checked with its annotation label to obtain a verification result.

When all the images to be checked are credible images, the host can compare the prediction label and the annotation label of each image to be checked to obtain a verification result. The verification result may include the images to be checked whose prediction label and annotation label are inconsistent.
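The comparison step can be illustrated with a small sketch. The dictionaries keyed by image id and the function name `verify` are assumptions made for the example.

```python
def verify(predicted, annotated):
    """Return ids of images whose prediction label differs from the annotation label.

    predicted, annotated: dicts mapping image id -> label (assumed layout).
    """
    return sorted(
        image_id for image_id in predicted
        if predicted[image_id] != annotated[image_id]
    )
```

The returned ids are the images whose annotation labels are flagged for correction or manual review.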

In an embodiment, before performing step 210, the host computer may select an annotated image with a credible annotation tag, add the annotated image as a sample image into the initial training set, and add an annotated image outside the initial training set as an image to be verified into the set to be verified.

By the measures of this embodiment, the labor cost for acquiring the sample image can be further reduced.

The host may cluster-divide the tagged images for which the tag indicates the same image category information into a plurality of clusters.

Taking the K-means algorithm as an example, the host can obtain the number K of clusters for any image category information. Wherein the number of clusters may be predetermined. For example, an image whose image type information is "face" can be divided into 5 clusters according to the shooting angle, i.e., front view, left view, right view, top view, and bottom view.

The host computer can randomly select K cluster center images from the annotation images of which the annotation labels indicate the image type information. Here, the cluster center image is an image at the center position in one cluster.

For each annotated image whose annotation label indicates the image type information, the host may calculate the Euclidean distance between the annotated image and each cluster center image, and determine that the annotated image belongs to the same cluster as the cluster center image with the smallest Euclidean distance. To calculate the Euclidean distance, the host can calculate the difference between the pixel values at the same position in the annotated image and the cluster center image, square each difference, accumulate the squares of all the differences, and then take the square root of the sum.

After one round of calculation, the host has divided all the annotated images with the same annotation label into a plurality of clusters. For each cluster of annotated images, the host computes a new cluster center image. The host can calculate the average value of the pixels at the same position across all the annotated images in the cluster, and the image formed by the average values of all the pixels is used as the new cluster center image.

After obtaining the new cluster center images, the host may recalculate the Euclidean distance between each annotated image and each cluster center image, thereby dividing all annotated images into a plurality of clusters again according to the minimum Euclidean distance.

Generally, because the cluster center images change from one round to the next, the annotated images assigned to each cluster also change. After the host repeats the calculation process a number of times, the division result no longer changes, and the clustering is then complete.
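The clustering procedure described above follows the standard K-means pattern, which can be sketched in a few lines. The sketch assumes that every image is given as a flat list of pixel values of equal length; the function name `kmeans`, the `seed` and `max_iter` parameters, and the rule of keeping the old center when a cluster is empty are assumptions added for the illustration.

```python
import math
import random


def kmeans(images, k, seed=0, max_iter=100):
    """Cluster flattened images; returns (assignment list, cluster center images)."""
    rng = random.Random(seed)
    # Randomly select K annotated images as the initial cluster center images.
    centers = [list(img) for img in rng.sample(images, k)]
    assignment = None
    for _ in range(max_iter):
        # Assign each image to the center with the smallest Euclidean distance.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(img, centers[c]))
            for img in images
        ]
        if new_assignment == assignment:
            break  # the division result no longer changes
        assignment = new_assignment
        # New center image: per-pixel average of the images in the cluster.
        for c in range(k):
            members = [img for img, a in zip(images, assignment) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(px) / len(members) for px in zip(*members)]
    return assignment, centers
```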

The host computer can calculate a consistency parameter between the cluster-center image of each cluster and each annotated image within the cluster. The host can screen out the credible annotation image of the annotation label according to the consistency parameter.

For example, if the host performs cluster partitioning according to the Euclidean distance between the cluster center image and the annotated images in the cluster, the consistency parameter may be the Euclidean distance.

In this case, the host may directly obtain the Euclidean distance between each annotated image in the cluster and the cluster center image from the last round of calculation, and determine whether the consistency parameter corresponding to each annotated image is smaller than a preset parameter threshold; if so, the host takes the annotated image as an annotated image with a credible annotation label.

By requiring the Euclidean distance to be smaller than the parameter threshold, the host can screen out the annotated images in each cluster that are sufficiently similar to one another. Because it is rare for a large number of annotation labels to be wrong, the annotation labels of the screened images, which share the same annotation label and are similar to one another, can be determined to be credible.
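The consistency screening can be sketched as a filter over the clustering result. The inputs mirror a typical clustering output (one cluster index per image plus the list of cluster center images); the function name `credible_annotations` and this input layout are hypothetical, and the parameter threshold would in practice be chosen for the data at hand.

```python
import math


def credible_annotations(images, assignment, centers, parameter_threshold):
    """Return indices of annotated images close enough to their cluster center.

    images:     flattened images (lists of pixel values)
    assignment: cluster index for each image
    centers:    cluster center images, indexed by cluster index
    """
    return [
        i for i, (img, c) in enumerate(zip(images, assignment))
        # Consistency parameter: Euclidean distance to the cluster center image.
        if math.dist(img, centers[c]) < parameter_threshold
    ]
```

Images returned by this filter would serve as sample images in the initial training set; the remaining annotated images would go into the set to be checked.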

Through the measures, the labor cost for selecting the sample image is reduced.

Referring to fig. 3, a flowchart of a method for verifying an image tag according to another exemplary embodiment of the present application is shown.

The host executes step 310 to obtain sample images with credible annotation labels, and then executes step 320 to train the machine learning model with the sample images. After the trained machine learning model is obtained, the host executes step 330 and performs classification calculation on the images to be checked through the trained machine learning model to obtain a prediction label corresponding to each image to be checked. The host may calculate the label credibility of the prediction label of each image to be checked, and execute step 340 to determine whether the label prediction result meets the credibility requirement. The credibility requirement is that the label credibility of the prediction label of each image to be checked is sufficient to determine that the image to be checked is a credible image.

If the label prediction result does not meet the credibility requirement, the host re-executes step 310: the host can screen out credible images as sample images according to the label credibility. The host may then execute steps 320 to 340 again, and repeats this process until the label prediction result meets the credibility requirement.

If the label prediction result meets the credibility requirement, the host may execute step 350 to compare the prediction label and the annotation label of each image to be checked, thereby obtaining a verification result. The verification result may include the images to be checked whose prediction label and annotation label are inconsistent.

Fig. 4 is a block diagram of an image tag verification apparatus according to an embodiment of the present invention, and as shown in fig. 4, the apparatus may include: prediction module 410, screening module 420, training module 430, and verification module 440.

The prediction module 410 is configured to perform classification calculation on the images to be checked in the set to be checked by using the machine learning model trained with the initial training set, so as to obtain the prediction labels and the label credibility of the images to be checked.

And the screening module 420 is configured to screen the trusted images from the set to be verified according to the tag credibility of the image to be verified.

The training module 430 is configured to perform the training of the machine learning model and the classification calculation of the images to be checked again according to the prediction labels of the credible images, until the label prediction result of the images to be checked meets the credibility requirement.

And the verification module 440 is configured to compare the label prediction result of the image to be verified with the labeled label to obtain a verification result.

In one embodiment, the apparatus further comprises:

and the selecting unit (not shown in the figure) is used for selecting the labeling image with the credible labeling label, adding the labeling image serving as a sample image into the initial training set, and adding the labeling images except the initial training set serving as images to be checked into the set to be checked.

A training unit (not shown in the figure) for training the machine learning model by using the sample images in the initial training set.

In an embodiment, the selecting unit (not shown in the figure) is further configured to:

clustering and dividing the labeled images of which the label labels indicate the same image category information into a plurality of clusters;

calculating consistency parameters between the cluster center image of each cluster and each marked image in the cluster;

and screening out the credible annotation image of the annotation label according to the consistency parameter.

In an embodiment, the prediction module 410 is further configured to:

In an embodiment, the filtering module 420 is further configured to:

In an embodiment, the screening module 420 is further configured to:

The implementation processes of the functions and actions of the modules in the device are specifically described in the implementation processes of the corresponding steps in the image tag verification method, and are not described herein again.

In the embodiments provided in the present application, the disclosed apparatus and method can also be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.

The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program code.

Claims

1. A method for verifying an image tag, comprising:

performing classification calculation on the images to be checked in a set to be checked by using a machine learning model trained on an initial training set, to obtain a prediction label and label credibility of each image to be checked, comprising: extracting image features of the image to be checked through the machine learning model, and performing classification calculation based on the image features to obtain the prediction label of the image to be checked; for each image to be checked, extracting, through the machine learning model, image features of the sample images whose annotation label is the same as the prediction label of the image to be checked; calculating a feature distance between the image features of the image to be checked and the image features of the sample images, wherein the feature distance is used for representing the label credibility;
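By way of non-limiting illustration, the feature distance (characteristic distance) recited above might be computed as in the following minimal Python sketch. The sketch assumes that the image features have already been extracted by the machine learning model as plain numeric vectors. The names `euclidean`, `feature_distance`, and `samples_by_label` are hypothetical, and the choice of Euclidean distance averaged over all sample images with the same label is an assumption made for illustration only; the claim does not fix a particular metric or aggregation.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def feature_distance(
    query_features: Vector,
    predicted_label: str,
    samples_by_label: Dict[str, List[Vector]],
) -> float:
    """Mean distance from the image to be checked to the sample images
    whose annotation label equals its prediction label.

    Assumption: averaging is only one possible aggregation. A smaller
    value is read as higher label credibility.
    """
    samples = samples_by_label.get(predicted_label, [])
    if not samples:
        # No sample image carries this label, so no credibility can be derived.
        return math.inf
    return sum(euclidean(query_features, s) for s in samples) / len(samples)


# Hypothetical features: two sample images annotated "cat".
samples = {"cat": [[3.0, 4.0], [0.0, 0.0]]}
distance = feature_distance([0.0, 0.0], "cat", samples)
```

In this sketch a smaller distance indicates that the image to be checked lies close to the trusted sample images of its predicted class, which is one way the distance can represent label credibility.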

2. The method of claim 1, wherein before performing the classification calculation on the images to be checked in the set to be checked by using the machine learning model trained on the initial training set, the method further comprises:

selecting annotated images whose annotation labels are credible as sample images and adding them to the initial training set, and adding the annotated images outside the initial training set, as images to be checked, to the set to be checked;

3. The method of claim 2, wherein selecting an annotated image whose annotation label is credible comprises:

4. The method according to claim 1, wherein screening the credible images from the set to be checked according to the label credibility of the images to be checked comprises:

and determining whether the feature distance corresponding to each candidate credible image is smaller than a preset distance threshold, and if so, determining that the candidate credible image is a credible image.
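By way of non-limiting illustration, the threshold comparison recited above might be sketched in Python as follows. The function name `screen_credible`, the image identifiers, and the threshold value are hypothetical. The sketch assumes that the candidate credible images and their feature distances have already been obtained in an earlier step that is not shown here.

```python
from typing import Dict, List


def screen_credible(
    candidate_distances: Dict[str, float],
    distance_threshold: float,
) -> List[str]:
    """Return the identifiers of candidate credible images whose feature
    distance is strictly smaller than the preset distance threshold."""
    return [
        image_id
        for image_id, distance in candidate_distances.items()
        if distance < distance_threshold
    ]


# Hypothetical candidates mapped to their feature distances.
candidates = {"img_a": 0.2, "img_b": 0.5, "img_c": 0.9}
credible = screen_credible(candidates, distance_threshold=0.5)
```

The comparison is strict, so a candidate whose distance equals the threshold (here `img_b`) is not treated as credible, which matches the "smaller than" wording of the claim.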

5. The method of claim 4, wherein, for each prediction label, screening candidate credible images from the images to be checked having the prediction label comprises:

6. The method of claim 4, wherein, for each prediction label, screening candidate credible images from the images to be checked having the prediction label comprises:

7. An image tag verification apparatus, comprising:

a prediction module, configured to perform classification calculation on the images to be checked in a set to be checked by using a machine learning model trained on an initial training set, to obtain a prediction label and label credibility of each image to be checked, including: extracting image features of the image to be checked through the machine learning model, and performing classification calculation based on the image features to obtain the prediction label of the image to be checked; for each image to be checked, extracting, through the machine learning model, image features of the sample images whose annotation label is the same as the prediction label of the image to be checked; calculating a feature distance between the image features of the image to be checked and the image features of the sample images, wherein the feature distance is used for representing the label credibility;

8. An electronic device, characterized in that the electronic device comprises:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to perform the method of image tag verification of any of claims 1-6.

9. A computer-readable storage medium, characterized in that the storage medium stores a computer program executable by a processor to perform the method of verifying an image tag according to any one of claims 1 to 6.