CN111310846A - Method, device, storage medium and server for selecting sample image - Google Patents

Method, device, storage medium and server for selecting sample image

Info

Publication number
CN111310846A
Authority
CN
China
Prior art keywords
image
sample
sample image
unlabeled
uncertainty
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010127598.6A
Other languages
Chinese (zh)
Other versions
CN111310846B (en)
Inventor
王俊
高鹏
谢国彤
柳杨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010127598.6A priority Critical patent/CN111310846B/en
Publication of CN111310846A publication Critical patent/CN111310846A/en
Priority to PCT/CN2020/119302 priority patent/WO2021169301A1/en
Application granted granted Critical
Publication of CN111310846B publication Critical patent/CN111310846B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application is applicable to the technical field of computers, and provides a method, a device, a storage medium and a server for selecting a sample image. With the method, when sample images are selected for manual annotation, an uncertainty index and a representative index are calculated for each unlabeled sample image, and the annotation value of each sample image is determined by combining the two indexes. Because highly representative samples are unlikely to be outliers and better reflect the characteristics of the samples in the sample set, they, like samples with high uncertainty, are samples with high labeling value. Since both the uncertainty and the representativeness of a sample are considered when measuring its annotation value, the portion of sample images with high annotation value can be selected for manual annotation, so that the performance of the image classification model is better optimized.

Description

Method, device, storage medium and server for selecting sample image
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a storage medium, and a server for selecting a sample image.
Background
Currently, when images are classified, an image annotation method based on active learning is generally adopted. The method mainly includes: acquiring a portion of labeled images and a portion of unlabeled images; using the labeled images as a training set to train an initial classification model; classifying the unlabeled images with the initial classification model to obtain a prediction result for each image; calculating the credibility of each prediction result, selecting the images with the greatest uncertainty, and submitting them to an expert for manual annotation; adding the manually annotated images to the training set and retraining the classification model to optimize it; and repeating these steps iteratively until the accuracy of the classification model meets the requirement or the number of iterations reaches a specified number. With this active-learning-based image annotation method, a portion of samples can be selected from a large number of unlabeled images and submitted for manual annotation, thereby reducing the workload of manual annotation. However, when images are selected for manual annotation from the prediction results, only the uncertainty of the images is considered, which does not reflect the annotation value of the images well, and it is difficult to ensure that the images with high annotation value are selected for manual annotation.
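For illustration only (not part of the original disclosure), the following Python sketch shows one round of the uncertainty-only selection described above, assuming a scikit-learn-style classifier with fit and predict_proba methods; the model, feature arrays and the value of k are placeholders.

```python
import numpy as np

def pick_most_uncertain(model, X_labeled, y_labeled, X_unlabeled, k=1):
    """One round of the prior-art loop: train, predict, return the k most uncertain samples."""
    model.fit(X_labeled, y_labeled)                    # train the initial classification model
    probs = model.predict_proba(X_unlabeled)           # prediction result for each unlabeled image
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return np.argsort(-entropy)[:k]                    # indices to submit for manual annotation
```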
Disclosure of Invention
In view of this, the present application provides a method for selecting a sample image, which can select a portion of the sample image with a high labeling value for manual labeling, so as to better optimize the performance of an image classification model.
In a first aspect, an embodiment of the present application provides a method for selecting a sample image, including:
acquiring an unlabeled image set and a labeled image set, wherein the unlabeled image set comprises a plurality of unlabeled sample images, and the labeled image set comprises a plurality of labeled sample images;
training to obtain an image classification model by taking the marked image set as a training set;
classifying each unlabeled sample image in the unlabeled image set by adopting the image classification model to obtain a classification result of each unlabeled sample image;
for each unlabeled sample image, respectively calculating an uncertainty index and a representative index according to its classification result, and determining its labeling value by combining the uncertainty index and the representative index, wherein the uncertainty index is used for measuring the uncertainty of the image classification result of the sample, and the representative index is used for measuring the probability that the sample can serve as a representative sample of the unlabeled image set;
and selecting and outputting the sample image with the highest labeling value from the unlabeled sample images.
In the process, when sample images are selected for manual annotation, an uncertainty index and a representative index are calculated for each unlabeled sample image, and the annotation value of each sample image is determined by combining the two indexes. Because highly representative samples are unlikely to be outliers and better reflect the characteristics of the samples in the sample set, they, like samples with high uncertainty, are samples with high labeling value. Since both the uncertainty and the representativeness of a sample are considered when measuring its annotation value, the portion of sample images with high annotation value can be selected for manual annotation, so that the performance of the image classification model is better optimized.
Further, the uncertainty indicator of any one unlabeled target sample image in the set of unlabeled images can be calculated by the following formula:
f(x, L, u) = -∑_{y∈Y} p_θ(y|x) * log(p_θ(y|x))
where f(x, L, u) represents the uncertainty indicator of the target sample image x, L represents the samples of the labeled image set, u represents the samples of the unlabeled image set, p_θ(y|x) represents the probability that the target sample image x belongs to label y, and Y is a pre-constructed set of label categories.
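As a hedged illustration (not from the patent text), the uncertainty indicator above can be computed from the predicted label probabilities roughly as follows; the probability vector is an assumed input.

```python
import numpy as np

def uncertainty_index(p_y_given_x, eps=1e-12):
    """f(x, L, u) = -sum over y in Y of p_theta(y|x) * log(p_theta(y|x))."""
    p = np.asarray(p_y_given_x, dtype=float)
    return float(-np.sum(p * np.log(p + eps)))

# Predictions near 0.5 contribute the most uncertainty:
print(uncertainty_index([0.5, 0.5, 0.5]))    # ~1.04
print(uncertainty_index([0.99, 0.01, 0.9]))  # ~0.15
```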
Further, the representative index of the target sample image may be calculated by the following formula:
Rep(x) = (1/n) * ∑_{i=1}^{n} sim(x, x_i)
where Rep(x) represents the representative index of the target sample image x, n represents the number of sample images in the unlabeled image set, and sim(x, x_i) represents the similarity between the target sample image x and a sample image x_i of the unlabeled image set. Suppose the target sample image x is expressed in the attribute space as x = {x_1, x_2, ..., x_j, ..., x_m} and the sample image x_i is expressed as x_i = {x_{i1}, x_{i2}, ..., x_{ij}, ..., x_{im}}; then sim(x, x_i) is specifically expressed as:
sim(x, x_i) = (∑_{j=1}^{m} x_j * x_{ij}) / (√(∑_{j=1}^{m} x_j²) * √(∑_{j=1}^{m} x_{ij}²))
the annotation value Value(x) of the target sample image can be calculated by the following formula:
Value(x) = f(x, L, u) * Rep(x).
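A minimal sketch of the similarity-based representative index and the resulting annotation value, assuming each image is already represented by an attribute (feature) vector; the function and variable names are illustrative and not taken from the patent.

```python
import numpy as np

def cosine_sim(x, xi, eps=1e-12):
    # sim(x, x_i) = <x, x_i> / (||x|| * ||x_i||)
    return float(np.dot(x, xi) / (np.linalg.norm(x) * np.linalg.norm(xi) + eps))

def representative_index(x, unlabeled_features):
    # Rep(x): average cosine similarity between x and the n samples of the unlabeled set
    return float(np.mean([cosine_sim(x, xi) for xi in unlabeled_features]))

def annotation_value(uncertainty, x, unlabeled_features):
    # Value(x) = f(x, L, u) * Rep(x)
    return uncertainty * representative_index(x, unlabeled_features)
```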
further, the uncertainty index of any one unlabeled target sample image in the set of unlabeled images can be determined by:
calculating an information entropy index of the target sample image;
counting the number of labels obtained by classifying the target sample image according to the classification result of the target sample image;
and calculating to obtain an uncertainty index of the target sample image by combining the information entropy index and the number of the labels.
The number of labels obtained by classifying a sample image can be used to measure the information diversity of the sample; by combining the information entropy index with this label diversity, a more accurate index that comprehensively considers both the uncertainty of the sample prediction and the label diversity can be obtained.
Specifically, the information entropy index of the target sample image can be calculated by the following formula:
Ent(x, L, u) = -∑_{y∈Y} p_θ(y|x) * log(p_θ(y|x))
where Ent(x, L, u) represents the information entropy index of the target sample image x, L represents the samples of the labeled image set, u represents the samples of the unlabeled image set, p_θ(y|x) represents the probability that the target sample image x belongs to label y, and Y is a pre-constructed set of label categories;
the uncertainty indicator for the target sample image may be calculated by the following formula:
f(x, L, u) = Ent(x, L, u) * Mul(x)^a
where f(x, L, u) represents the uncertainty index of the target sample image x, Mul(x) represents the number of labels, and a is a parameter for adjusting the relative weight.
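A hedged sketch of this combined indicator; how the predicted label count Mul(x) is obtained from the classification result is not fixed here, so a 0.5 probability threshold is assumed purely for illustration.

```python
import numpy as np

def multilabel_uncertainty(p_y_given_x, a=1.0, threshold=0.5, eps=1e-12):
    p = np.asarray(p_y_given_x, dtype=float)
    ent = float(-np.sum(p * np.log(p + eps)))      # Ent(x, L, u): information entropy index
    mul = max(int(np.sum(p >= threshold)), 1)      # Mul(x): number of predicted labels (assumed rule)
    return ent * mul ** a                          # f(x, L, u) = Ent(x, L, u) * Mul(x)^a
```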
Further, the representative index of the target sample image may be calculated by the following kernel density estimation formula:
Rep(x) = (1/(n*h)) * ∑_{i=1}^{n} K((x - x_i)/h)
The annotation value Value(x) of the target sample image can be calculated by the following formula:
Value(x) = f(x, L, u) * Rep(x)^β
where Rep(x) represents the representative index of the target sample image x, n represents the number of sample images in the unlabeled image set, h is the bandwidth of the kernel density estimation, the sample images of the unlabeled image set are represented by {x_1, x_2, ..., x_i, ..., x_n}, K(·) is a preset weight (kernel) function, and β is a parameter for adjusting the relative weight.
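A one-dimensional sketch of the kernel-density-based representative index and the weighted annotation value, assuming a Gaussian weight function K; in practice the samples would be points in the reduced-dimension feature space described later, and the names here are illustrative.

```python
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def kde_representative_index(x, samples, h):
    # Rep(x) = (1 / (n*h)) * sum_i K((x - x_i) / h)
    samples = np.asarray(samples, dtype=float)
    return float(np.sum(gaussian_kernel((x - samples) / h)) / (samples.size * h))

def annotation_value(uncertainty, rep, beta=1.0):
    # Value(x) = f(x, L, u) * Rep(x)**beta
    return uncertainty * rep ** beta
```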
Further, after selecting and outputting a sample image with the highest labeling value from the unlabeled sample images, the method may further include:
transferring the sample image with the highest labeling value after manual labeling from the unlabeled image set to the labeled image set, and updating the labeled image set;
taking the updated labeled image set as a training set, and performing optimization updating on the image classification model;
and if the optimization updating times of the image classification model reach the set iteration times or the accuracy of the image classification model reaches the set threshold value, determining the current image classification model as the final image classification model.
After a sample image with the highest labeling value is selected, manual labeling is carried out on the part of sample image, then the sample image after manual labeling is transferred from the unlabeled image set to the labeled image set, the updated labeled image set is used as a training set, and the image classification model is optimized and updated, so that the performance of the image classification model is improved.
In a second aspect, an embodiment of the present application provides an apparatus for selecting a sample image, including:
the image set acquisition module is used for acquiring an unlabeled image set and a labeled image set, wherein the unlabeled image set comprises a plurality of unlabeled sample images, and the labeled image set comprises a plurality of labeled sample images;
the classification model training module is used for training to obtain an image classification model by taking the marked image set as a training set;
the sample image classification module is used for classifying each unlabeled sample image in the unlabeled image set by adopting the image classification model to obtain a classification result of each unlabeled sample image;
a sample annotation value determining module, configured to calculate, for each unlabeled sample image, a respective uncertainty index and a respective representative index according to the respective classification result, and determine a respective annotation value by combining them, where the uncertainty index is used to measure the uncertainty of the image classification result of a sample, and the representative index is used to measure the probability that a sample can serve as a representative sample of the unlabeled image set;
and the sample image selecting module is used for selecting and outputting the sample image with the highest labeling value from the unlabeled sample images.
In a third aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for selecting a sample image as set forth in the first aspect of the embodiment of the present application is implemented.
In a fourth aspect, an embodiment of the present application provides a server, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the method for selecting a sample image as set forth in the first aspect of the embodiment of the present application.
In a fifth aspect, an embodiment of the present application provides a computer program product, which, when running on a terminal device, causes the terminal device to execute the method for selecting a sample image according to the first aspect.
It is understood that the beneficial effects of the second aspect to the fifth aspect can be referred to the related description of the first aspect, and are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a flowchart of a first embodiment of a method for selecting a sample image according to an embodiment of the present application;
FIG. 2 is a flowchart of a second embodiment of a method for selecting a sample image according to an embodiment of the present application;
FIG. 3 is a block diagram of an embodiment of an apparatus for selecting a sample image according to an embodiment of the present disclosure;
fig. 4 is a schematic diagram of a server according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail. Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
The application provides a method for selecting sample images, which can select the sample images with high labeling value to be manually labeled, so that the performance of an image classification model is better optimized.
It should be understood that the execution subject of the method for selecting a sample image proposed in the embodiments of the present application is a server.
Referring to fig. 1, a first embodiment of a method for selecting a sample image according to an embodiment of the present application includes:
101. acquiring an unlabeled image set and a labeled image set, wherein the unlabeled image set comprises a plurality of unlabeled sample images, and the labeled image set comprises a plurality of labeled sample images;
firstly, an unlabeled image set and a labeled image set are obtained, wherein the unlabeled image set comprises a plurality of unlabeled sample images, and the labeled image set comprises a plurality of labeled sample images. These sample images may be multi-label images, i.e. one image contains a plurality of different class labels. For example, an ophthalmic OCT image may simultaneously carry 1 to 6 lesion-type labels (vitreous macular traction, epiretinal membrane, macular hole, intra-retinal effusion, pigment epithelium detachment, drusen or retinal atrophy, etc.).
102. Training to obtain an image classification model by taking the marked image set as a training set;
and then, an image classification model is obtained by training with the labeled image set as the training set. The image classification model may adopt various types of depth models, such as DenseNet, ResNet, ResNeXt, MobileNet, NASNet, and the like. Among them, DenseNet is preferred: by concatenating features along the channel dimension it reuses features, and it can achieve better performance than ResNet with fewer parameters and lower computational cost.
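For illustration, a hedged PyTorch/torchvision sketch of training such a multi-label DenseNet classifier on the labeled image set; the data loader, number of labels, learning rate and epoch count are assumptions, not values from the patent.

```python
import torch
import torch.nn as nn
from torchvision import models

def train_image_classifier(train_loader, num_labels, epochs=5, device="cuda"):
    model = models.densenet121(pretrained=True)                        # dense connections reuse features
    model.classifier = nn.Linear(model.classifier.in_features, num_labels)
    model = model.to(device)
    criterion = nn.BCEWithLogitsLoss()                                  # one sigmoid per label (multi-label)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    model.train()
    for _ in range(epochs):
        for images, labels in train_loader:                             # labels: multi-hot vectors
            optimizer.zero_grad()
            loss = criterion(model(images.to(device)), labels.float().to(device))
            loss.backward()
            optimizer.step()
    return model
```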
103. Classifying each unlabeled sample image in the unlabeled image set by adopting the image classification model to obtain a classification result of each unlabeled sample image;
after the image classification model is obtained by training, it is used to classify each unlabeled sample image in the unlabeled image set, obtaining a classification result for each unlabeled sample image. The classification result of a sample image is specifically the probability that the sample image belongs to each of the different preset class labels, such as class label A: 90%, class label B: 20%, and so on.
104. For each unlabeled sample image, respectively calculating to obtain respective uncertainty indexes and representative indexes according to respective classification results, and determining respective labeling values by combining the respective uncertainty indexes and the representative indexes;
the embodiment of the application selects a strategy of uncertainties) + retrieval (Representativeness) to measure the annotation value of the sample image. For any unmarked sample image, the uncertainty index and the representative index of the sample image are calculated according to the classification result of the sample image, and then the marking value of the sample image is determined by combining the uncertainty index and the representative index. The uncertainty index is used for measuring the uncertainty of the image classification result of the sample, and the representative index is used for measuring the probability size of the sample which can be used as the representative sample of the unlabeled image set.
Optionally, the uncertainty indicator of any unlabeled target sample image in the set of unlabeled images may be calculated by the following formula (1-1):
f(x, L, u) = -∑_{y∈Y} p_θ(y|x) * log(p_θ(y|x))    (1-1)
where f(x, L, u) represents the uncertainty indicator of the target sample image x, L represents the samples of the labeled image set, u represents the samples of the unlabeled image set, p_θ(y|x) represents the probability that the target sample image x belongs to label y, and Y is a pre-constructed set of label categories. In the classification result of a sample, the closer the predicted probability for a certain label is to 0.5, the higher the uncertainty of the current model about that label of the sample, that is, the more the sample is worth labeling.
Considering the representativeness of a sample, it can be measured by how many samples are similar to it: a highly representative sample is unlikely to be an outlier, and the higher the similarity between samples, the more consistent their characteristics; if the similarity is greater than a set threshold, the sample information is considered redundant. Specifically, the similarity between sample points may be calculated using a similarity coefficient in the following manner. Assume that the target sample image x is represented in the attribute space as x = {x_1, x_2, ..., x_j, ..., x_m}, and a sample image x_i of the unlabeled image set is expressed as x_i = {x_{i1}, x_{i2}, ..., x_{ij}, ..., x_{im}}.
The cosine formula is used to calculate the similarity between the two, i.e. the following formula (1-2):
sim(x, x_i) = (∑_{j=1}^{m} x_j * x_{ij}) / (√(∑_{j=1}^{m} x_j²) * √(∑_{j=1}^{m} x_{ij}²))    (1-2)
Then the representative index of the target sample image x can be obtained by averaging the similarities, as shown in the following formula (1-3):
Rep(x) = (1/n) * ∑_{i=1}^{n} sim(x, x_i)    (1-3)
where Rep(x) represents the representative index of the target sample image x, and n represents the number of sample images in the unlabeled image set.
After the uncertainty index and the representative index of the target sample image are obtained, the annotation value of the target sample image can be determined from these two indexes; for example, the annotation value Value(x) can be calculated by the following formula (1-4):
Value(x) = f(x, L, u) * Rep(x)    (1-4)
The first term f(x, L, u) in formula (1-4) represents the information amount (uncertainty index) of the sample x under the current query strategy, and the second term is the representative index, obtained by calculating the similarity of the sample relative to the other samples and expressed as the average similarity between sample x and all samples in the sample space. The higher the density of the region where a sample lies, the higher its information content and the greater its chance of being selected for labeling. By introducing the representativeness criterion, samples with high mutual similarity are discarded to reduce the addition of redundant samples; by integrating the uncertainty index and the representative index, representative high-information samples are retained, and the influence of isolated points on the quality of sample selection is effectively reduced.
Obviously, the sample image x* with the highest labeling value is:
x* = argmax_x [f(x, L, u) * Rep(x)]
Here argmax denotes the argument of the maximum. For a function y = f(x), argmax(f(x)) is the value x_0 at which f(x) attains its maximum over its domain; if several points give the same maximum, argmax(f(x)) is the set of those points. In other words, argmax(f(x)) is the point x (or set of points) corresponding to the maximum value of f(x).
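A small illustrative helper (not from the patent) that applies this argmax selection to a list of pre-computed annotation values, returning all indices on ties:

```python
import numpy as np

def select_highest_value(values):
    # x* = argmax_x Value(x): index of the unlabeled sample with the highest annotation value
    values = np.asarray(values, dtype=float)
    best = np.flatnonzero(values == values.max())
    return best.tolist() if best.size > 1 else int(best[0])
```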
One way of calculating the uncertainty index and the representative index, and of determining the annotation value of the sample image by combining these two indexes, has been shown above; a different way of calculation is presented below.
Optionally, the uncertainty indicator of any unlabeled target sample image in the set of unlabeled images may be determined by:
(1) calculating an information entropy index of the target sample image;
(2) counting the number of labels obtained by classifying the target sample image according to the classification result of the target sample image;
(3) and calculating to obtain an uncertainty index of the target sample image by combining the information entropy index and the number of the labels.
For step (1), the uncertainty of the current sample is measured by information entropy. The information entropy index of the sample image x is defined as Ent(x, L, u), where L represents the samples of the labeled set and u represents the samples of the unlabeled set; Ent(x, L, u) can be calculated by the following formula (1-5):
Ent(x, L, u) = -∑_{y∈Y} p_θ(y|x) * log(p_θ(y|x))    (1-5)
where p_θ(y|x) represents the probability that the target sample image x belongs to label y, and Y is a pre-constructed set of label categories.
Regarding steps (2) and (3), considering the difference between multi-label and single-label classification, the sample selection strategy can be guided and improved by mining the label diversity in multi-label classification. The uncertainty of the current sample is measured by information entropy, and the diversity of labels is further fused through the number of labels predicted for the sample by the model: when the previous round of the depth model predicts more labels for a sample, the sample is considered to contain more information for improving model performance. By jointly considering the two dimensions of sample and label, the labeling value of an unlabeled sample can be measured more effectively. Specifically, the uncertainty index of the target sample image can be calculated by the following formula (1-6):
f(x, L, u) = Ent(x, L, u) * Mul(x)^a    (1-6)
where f(x, L, u) represents the uncertainty index of the target sample image x, Mul(x) represents the number of labels, and a is a parameter for adjusting the relative weight.
For the representative index of a sample image, representativeness can be measured by the similarity between the sample and the other samples; a highly representative sample is unlikely to be an outlier. When calculating the similarity between samples, the last fully-connected layer of a deep pre-trained network is first used to extract the feature vector of each image sample, and the LargeVis method can then be used to reduce the dimension of the extracted high-dimensional feature vectors. In the two-dimensional space of the dimension-reduced sample points, the distribution density at the position of each sample point is calculated by the kernel density method: the higher the kernel density of the region where a sample point lies, the more representative the sample. That is, the representativeness of a sample point can be characterized by its kernel density, and the process of calculating the representative index can be converted into the process of calculating the kernel density.
Further, the representative index of the target sample image may be calculated by the following kernel density estimation formula (1-7):
Rep(x) = (1/(n*h)) * ∑_{i=1}^{n} K((x - x_i)/h)    (1-7)
where Rep(x) represents the representative index of the target sample image x, n represents the number of sample points, i.e. the number of sample images in the unlabeled image set, h is the bandwidth of the kernel density estimation, the sample images of the unlabeled image set are represented by {x_1, x_2, ..., x_i, ..., x_n}, and K(·) is a preset weight function.
Formula (1-7) is a weighted-average calculation, and the kernel K(·) is a weight function whose shape and support control how many data points are used, and to what degree, when estimating Rep(x) at the point x; intuitively, the effect of kernel density estimation depends on the choice of the kernel and the bandwidth h. A typical weight function is symmetric about the origin and integrates to 1, such as the commonly used Uniform, Epanechnikov, Quartic and Gaussian functions.
For the Uniform kernel, only the points for which |(x - x_i)/h| is less than 1 (that is, points whose distance from x is less than the bandwidth h) are used to estimate the value of Rep(x), and all contributing data points receive the same weight.
For the Gaussian kernel, as can be seen from the expression of Rep(x), the closer x_i is to x, the closer (x - x_i)/h is to zero and the larger its contribution to the density value. Since the support of the normal density is the entire real axis, all data points are used to estimate the value of Rep(x); points closer to x simply have a greater effect on the estimate. When h is small, only points particularly close to x have a large effect, and as h increases, the influence of more distant points increases.
If kernel functions of the Epanechnikov or Quartic form are used, there is not only truncation (points at a distance from x greater than the bandwidth h contribute nothing), but the weight of the contributing points also decreases as the distance from x increases. In general, the choice of kernel function affects the kernel estimate far less than the choice of the bandwidth h.
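For reference, minimal implementations of the weight functions mentioned above (Uniform, Epanechnikov, Quartic and Gaussian); each is symmetric about the origin and integrates to 1. This is illustrative code, not part of the patent.

```python
import numpy as np

def uniform_kernel(u):
    return 0.5 * (np.abs(u) <= 1)                        # truncated; equal weight for all points used

def epanechnikov_kernel(u):
    return 0.75 * (1.0 - u ** 2) * (np.abs(u) <= 1)      # truncated; weight decays with |u|

def quartic_kernel(u):
    return (15.0 / 16.0) * (1.0 - u ** 2) ** 2 * (np.abs(u) <= 1)

def gaussian_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)  # no truncation; every point contributes
```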
The choice of the bandwidth h has a great influence on the estimator Rep(x). If h is too small, the density estimate concentrates the probability density too closely around the observed data, producing many false peaks in the estimated density function; if h is too large, the density estimate spreads the probability density too widely, and some important features of the density function are lost.
Specifically, the bandwidth h can be selected in the following 2 ways.
The selection mode (1) of the bandwidth is as follows:
to judge the quality of a bandwidth, it is necessary to know how to evaluate the density estimator Rep(x). The integrated mean square error is typically used as the criterion for whether the density estimator is good or bad. Writing p(x) for the density being estimated, its expression is:
MISE(h) = E ∫ (Rep(x) - p(x))² dx ≈ AMISE(h)
wherein:
AMISE(h) = R(K)/(n*h) + (1/4) * σ_K⁴ * h⁴ * R(p''),  with R(g) = ∫ g²(z) dz and σ_K² = ∫ z² * K(z) dz
AMISE(h) is called the asymptotic mean integrated square error. To minimize this error, h must take some intermediate value, so that Rep(x) avoids having either too large a bias or too large a variance. To find the h that minimizes AMISE(h), the orders of the bias term and the variance term in AMISE(h) must be balanced exactly, so the optimal bandwidth is:
h_opt = [R(K) / (σ_K⁴ * R(p'') * n)]^(1/5)
the selection mode (2) of the bandwidth is as follows:
the bandwidth is chosen using Silverman's rule of thumb. For simplicity, define R(g) = ∫ g²(z) dz. The optimal bandwidth obtained by minimizing AMISE(h) contains the unknown quantity R(p''), where p is the density estimated by Rep(x). A first-class method, the rule of thumb (i.e. an empirical method), can be used: p is replaced by a normal density whose variance matches the estimated variance, which is equivalent to using
R(p'') ≈ R(φ'') / σ̂⁵ = 3 / (8 * √π * σ̂⁵)
to estimate R(p''), where φ is the standard normal density function. Taking K(·) as a Gaussian density kernel and the sample variance
σ̂² = (1/(n-1)) * ∑_{i=1}^{n} (x_i - x̄)²
as the variance estimate, the optimal bandwidth obtained with Silverman's rule of thumb is:
ĥ = (4/(3*n))^(1/5) * σ̂ ≈ 1.06 * σ̂ * n^(-1/5)
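A sketch of this rule-of-thumb bandwidth for a Gaussian kernel in one dimension, using the form given above; the sample array is an assumed input.

```python
import numpy as np

def silverman_bandwidth(samples):
    # h_hat = (4 / (3n))**(1/5) * sigma_hat  (approximately 1.06 * sigma_hat * n**(-1/5))
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    sigma_hat = samples.std(ddof=1)
    return (4.0 / (3.0 * n)) ** 0.2 * sigma_hat
```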
after the uncertainty index and the representative index of the target sample image are calculated, the annotation value Value(x) of the target sample image can be calculated by the following formula (1-12):
Value(x) = f(x, L, u) * Rep(x)^β    (1-12)
In formula (1-12), the first term f(x, L, u) is the uncertainty index of the target sample image x, based on the information content (including uncertainty and label diversity information) of the sample x under the current query strategy; the second term Rep(x) represents the representative index of the target sample image x, obtained by calculating the kernel density of each instance in the dimension-reduced feature space and expressed as the average similarity between x and the other samples in the sample space. β is a parameter that adjusts the relative weight of the two terms. The higher the density of the region where a sample lies, the greater its chance of being selected for labeling.
Then, the sample image x* with the highest labeling value is:
x* = argmax_x [f(x, L, u) * Rep(x)^β]
105. and selecting and outputting the sample image with the highest labeling value from the unlabeled sample images.
After the labeling value of each sample image is determined by combining the uncertainty index and the representative index, the sample image with the highest labeling value is selected, output, and submitted for manual labeling. Testing shows that, compared with the common approach of selecting sample images by Random Sampling (RS), the method of this embodiment, which fuses deep learning with Active Learning (AL), can in each round select from the current unlabeled sample set the sample image that contributes most to classification, based on the strong feature expression capability of the depth model combined with the active selection strategy. Among a large number of unlabeled original images, only a portion of high-value samples is selected for expert labeling instead of labeling all samples; lower-quality samples are filtered out, and the samples most valuable for improving the deep learning model are selected in each round and added to training, effectively reducing the workload of manual labeling while ensuring task accuracy.
According to the method for selecting sample images provided above, when sample images are selected for manual annotation, an uncertainty index and a representative index are calculated for each unlabeled sample image, and the annotation value of each sample image is determined by combining the two indexes. Because highly representative samples are unlikely to be outliers and better reflect the characteristics of the samples in the sample set, they, like samples with high uncertainty, are samples with high labeling value. Since both the uncertainty and the representativeness of a sample are considered when measuring its annotation value, the portion of sample images with high annotation value can be selected for manual annotation, so that the performance of the image classification model is better optimized.
Referring to fig. 2, a second embodiment of a method for selecting a sample image according to an embodiment of the present application includes:
201. acquiring an unlabeled image set and a labeled image set, wherein the unlabeled image set comprises a plurality of unlabeled sample images, and the labeled image set comprises a plurality of labeled sample images;
202. training to obtain an image classification model by taking the marked image set as a training set;
203. classifying each unlabeled sample image in the unlabeled image set by adopting the image classification model to obtain a classification result of each unlabeled sample image;
204. for each unlabeled sample image, respectively calculating to obtain respective uncertainty indexes and representative indexes according to respective classification results, and determining respective labeling values by combining the respective uncertainty indexes and the representative indexes;
205. selecting and outputting a sample image with the highest labeling value from the unlabeled sample images;
the steps 201-205 are the same as the steps 101-105, and the related description of the steps 101-105 can be referred to.
206. Transferring the sample image with the highest labeling value after manual labeling from the unlabeled image set to the labeled image set, and updating the labeled image set;
after the sample image with the highest labeling value is output, the sample image is handed to an expert for manual labeling. And then, transferring the manually marked sample image from the unmarked image set to the marked image set, and updating the marked image set.
207. Taking the updated labeled image set as a training set, and performing optimization updating on the image classification model;
and then, retraining the image classification model by taking the updated labeled image set as a training set so as to optimize and update model parameters and improve the performance of the model.
208. Judging whether the optimization updating times of the image classification model reach set iteration times or not, or whether the accuracy of the image classification model reaches a set threshold value or not;
after the image classification model is optimized and updated, whether the current optimization and update times reach the set iteration times or not or whether the accuracy of the image classification model reaches the set threshold value or not is judged. If yes, go to step 209; if not, returning to the step 203, and re-executing the iterative optimization operation of the classification model until the condition is met.
209. And determining the current image classification model as a final image classification model.
When the number of optimization updates of the image classification model reaches the set number of iterations, or the accuracy of the image classification model reaches the set threshold, the optimization and updating of the image classification model is complete. The current model can therefore be determined as the final image classification model, and the images to be classified are classified using this final image classification model.
According to the method and the device, after the sample image with the highest labeling value is selected, the sample image is labeled manually, the sample image after being labeled manually is transferred from the unlabeled image set to the labeled image set, the updated labeled image set is used as a training set, the image classification model is optimized and updated, and therefore the performance of the image classification model is improved.
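A hedged end-to-end sketch of the iterative procedure of steps 201-209, assuming a scikit-learn-style model (fit/predict_proba/score) and caller-supplied functions for computing the annotation value and obtaining the expert label; none of these names come from the patent.

```python
import numpy as np

def active_learning_loop(model, X_labeled, y_labeled, X_unlabeled,
                         annotation_value_fn, expert_label_fn,
                         X_val, y_val, max_iters=50, acc_threshold=0.95):
    X_labeled, y_labeled, X_unlabeled = list(X_labeled), list(y_labeled), list(X_unlabeled)
    for _ in range(max_iters):                                        # step 208: iteration limit
        model.fit(np.array(X_labeled), np.array(y_labeled))           # steps 202/207: (re)train the model
        if model.score(np.array(X_val), np.array(y_val)) >= acc_threshold:
            break                                                     # step 208: accuracy threshold reached
        probs = model.predict_proba(np.array(X_unlabeled))            # step 203: classify the unlabeled set
        values = [annotation_value_fn(x, p, X_unlabeled)              # step 204: uncertainty * representativeness
                  for x, p in zip(X_unlabeled, probs)]
        best = int(np.argmax(values))                                 # step 205: highest annotation value
        x_new = X_unlabeled.pop(best)                                 # step 206: move to the labeled set
        X_labeled.append(x_new)
        y_labeled.append(expert_label_fn(x_new))                      # manual annotation by an expert
    return model                                                      # step 209: final classification model
```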
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 3 is a block diagram of a device for selecting a sample image according to an embodiment of the present application, which corresponds to the method for selecting a sample image described in the foregoing embodiments; for convenience of description, only the portions related to the embodiment of the present application are shown.
Referring to fig. 3, the apparatus includes:
an image set acquisition module 301, configured to acquire an unlabeled image set and a labeled image set, where the unlabeled image set includes multiple unlabeled sample images, and the labeled image set includes multiple labeled sample images;
a classification model training module 302, configured to train to obtain an image classification model by using the labeled image set as a training set;
a sample image classification module 303, configured to classify each unlabeled sample image in the unlabeled image set by using the image classification model, respectively, to obtain a classification result of each unlabeled sample image;
a sample annotation value determining module 304, configured to calculate, for each unlabeled sample image, a respective uncertainty index and a respective representative index according to the respective classification result, and determine a respective annotation value by combining them, where the uncertainty index is used to measure the uncertainty of the image classification result of a sample, and the representative index is used to measure the probability that a sample can serve as a representative sample of the unlabeled image set;
and a sample image selecting module 305, configured to select and output a sample image with the highest labeling value from the unlabeled sample images.
Further, the sample annotation value determination module can include:
a first uncertainty index calculation unit, configured to calculate an uncertainty index of any one unlabeled target sample image in the set of unlabeled images by using the following formula:
f(x, L, u) = -∑_{y∈Y} p_θ(y|x) * log(p_θ(y|x))
where f(x, L, u) represents the uncertainty indicator of the target sample image x, L represents the samples of the labeled image set, u represents the samples of the unlabeled image set, p_θ(y|x) represents the probability that the target sample image x belongs to label y, and Y is a pre-constructed set of label categories.
Further, the sample annotation value determination module can include:
a first representative index calculation unit for calculating a representative index of the target sample image by the following formula:
Rep(x) = (1/n) * ∑_{i=1}^{n} sim(x, x_i)
where Rep(x) represents the representative index of the target sample image x, n represents the number of sample images in the unlabeled image set, and sim(x, x_i) represents the similarity between the target sample image x and a sample image x_i of the unlabeled image set; suppose the target sample image x is expressed in the attribute space as x = {x_1, x_2, ..., x_j, ..., x_m} and the sample image x_i is expressed as x_i = {x_{i1}, x_{i2}, ..., x_{ij}, ..., x_{im}}, then sim(x, x_i) is specifically expressed as:
sim(x, x_i) = (∑_{j=1}^{m} x_j * x_{ij}) / (√(∑_{j=1}^{m} x_j²) * √(∑_{j=1}^{m} x_{ij}²))
The annotation value Value(x) of the target sample image is calculated by the following formula:
Value(x) = f(x, L, u) * Rep(x).
further, the sample annotation value determination module can include:
the information entropy calculation unit is used for calculating an information entropy index of the target sample image;
the label quantity counting unit is used for counting the quantity of labels obtained by classifying the target sample images according to the classification result of the target sample images;
and the uncertainty index determining unit is used for calculating the uncertainty index of the target sample image by combining the information entropy index and the number of the labels.
Further, the information entropy calculating unit is specifically configured to calculate an information entropy index of the target sample image according to the following formula:
Ent(x, L, u) = -∑_{y∈Y} p_θ(y|x) * log(p_θ(y|x))
where Ent(x, L, u) represents the information entropy index of the target sample image x, L represents the samples of the labeled image set, u represents the samples of the unlabeled image set, p_θ(y|x) represents the probability that the target sample image x belongs to label y, and Y is a pre-constructed set of label categories;
further, the sample annotation value determination module can include:
a second uncertainty index calculation unit for calculating an uncertainty index of the target sample image by the following formula:
f(x, L, u) = Ent(x, L, u) * Mul(x)^a
where f(x, L, u) represents the uncertainty index of the target sample image x, Mul(x) represents the number of labels, and a is a parameter for adjusting the relative weight.
Further, the sample annotation value determination module can include:
a second representative index calculation unit for calculating a representative index of the target sample image by the following kernel density estimation formula:
Rep(x) = (1/(n*h)) * ∑_{i=1}^{n} K((x - x_i)/h)
The annotation value Value(x) of the target sample image is calculated by the following formula:
Value(x) = f(x, L, u) * Rep(x)^β;
where Rep(x) represents the representative index of the target sample image x, n represents the number of sample images in the unlabeled image set, h is the bandwidth of the kernel density estimation, the sample images of the unlabeled image set are represented by {x_1, x_2, ..., x_i, ..., x_n}, K(·) is a preset weight function, and β is a parameter for adjusting the relative weight.
Further, the apparatus for selecting a sample image may further include:
the image set updating module is used for transferring the sample image with the highest labeling value after manual labeling from the unlabeled image set to the labeled image set and updating the labeled image set;
the image classification model optimization module is used for optimizing and updating the image classification model by taking the updated labeled image set as a training set;
and the image classification model determining module is used for determining the current image classification model as the final image classification model if the optimization updating times of the image classification model reach the set iteration times or the accuracy of the image classification model reaches the set threshold value.
Embodiments of the present application further provide a computer-readable storage medium, which stores computer-readable instructions, and the computer-readable instructions, when executed by a processor, implement the steps of any one of the methods for selecting a sample image as shown in fig. 1 or fig. 2.
Embodiments of the present application further provide a server, which includes a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, and the processor executes the computer readable instructions to implement the steps of any one of the methods for selecting a sample image as shown in fig. 1 or fig. 2.
Embodiments of the present application further provide a computer program product, which when run on a server, causes the server to execute the steps of implementing any one of the methods for selecting a sample image as shown in fig. 1 or fig. 2.
Fig. 4 is a schematic diagram of a server according to an embodiment of the present application. As shown in fig. 4, the server 4 of this embodiment includes: a processor 40, a memory 41, and computer readable instructions 42 stored in the memory 41 and executable on the processor 40. The processor 40, when executing the computer readable instructions 42, performs the steps in the various sample image selection method embodiments described above, such as steps 101-105 shown in fig. 1. Alternatively, the processor 40, when executing the computer readable instructions 42, implements the functions of the modules/units in the above device embodiments, such as the functions of the modules 301 to 305 shown in fig. 3.
Illustratively, the computer readable instructions 42 may be partitioned into one or more modules/units that are stored in the memory 41 and executed by the processor 40 to accomplish the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing certain functions, which are used to describe the execution of the computer-readable instructions 42 in the server 4.
The server 4 may be a computing device such as a smart phone, a notebook, a palm computer, and a cloud server. The server 4 may include, but is not limited to, a processor 40, a memory 41. Those skilled in the art will appreciate that fig. 4 is merely an example of a server 4 and does not constitute a limitation of server 4 and may include more or fewer components than shown, or some components in combination, or different components, e.g., server 4 may also include input output devices, network access devices, buses, etc.
The Processor 40 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 41 may be an internal storage unit of the server 4, such as a hard disk or a memory of the server 4. The memory 41 may also be an external storage device of the server 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the server 4. Further, the memory 41 may also include both an internal storage unit of the server 4 and an external storage device. The memory 41 is used to store the computer readable instructions and other programs and data required by the server. The memory 41 may also be used to temporarily store data that has been output or is to be output.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and can implement the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer Memory, Read-Only Memory (ROM), random-access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium. Such as a usb-disk, a removable hard disk, a magnetic or optical disk, etc. In certain jurisdictions, computer-readable media may not be an electrical carrier signal or a telecommunications signal in accordance with legislative and patent practice.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method of selecting a sample image, comprising:
acquiring an unlabeled image set and a labeled image set, wherein the unlabeled image set comprises a plurality of unlabeled sample images, and the labeled image set comprises a plurality of labeled sample images;
training to obtain an image classification model by taking the marked image set as a training set;
classifying each unlabeled sample image in the unlabeled image set by adopting the image classification model to obtain a classification result of each unlabeled sample image;
for each unlabeled sample image, respectively calculating an uncertainty index and a representative index according to its classification result, and determining its labeling value by combining the uncertainty index and the representative index, wherein the uncertainty index is used for measuring the uncertainty of the image classification result of the sample, and the representative index is used for measuring the probability that the sample can serve as a representative sample of the unlabeled image set;
and selecting and outputting the sample image with the highest labeling value from the unlabeled sample images.
2. The method of selecting a sample image as claimed in claim 1, wherein the uncertainty indicator for any one unlabeled target sample image in the set of unlabeled images is calculated by the following formula:
f(x, L, u) = -∑_{y∈Y} p_θ(y|x) * log(p_θ(y|x))
wherein f(x, L, u) represents the uncertainty indicator of the target sample image x, L represents the samples of the labeled image set, u represents the samples of the unlabeled image set, p_θ(y|x) represents the probability that the target sample image x belongs to label y, and Y is a pre-constructed set of label categories.
3. The method of selecting a sample image as claimed in claim 2, wherein the representative index of the target sample image is calculated by the following formula:
Rep(x) = (1/n) * ∑_{i=1}^{n} sim(x, x_i)
wherein Rep(x) represents the representative index of the target sample image x, n represents the number of sample images in the unlabeled image set, and sim(x, x_i) represents the similarity between the target sample image x and a sample image x_i of the unlabeled image set; assuming that the target sample image x is expressed in the attribute space as x = {x_1, x_2, ..., x_j, ..., x_m} and the sample image x_i is expressed as x_i = {x_i1, x_i2, ..., x_ij, ..., x_im}, sim(x, x_i) is computed over the m attribute components according to the similarity expression given by the original formula image;
the annotation value Value(x) of the target sample image is calculated by the following formula:
Value(x) = f(x, L, u) * Rep(x).
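A brief illustrative sketch of how Rep(x) and Value(x) as combined above could be evaluated on attribute vectors; cosine similarity is used for sim(x, x_i) purely as an assumed stand-in (the claim's exact similarity expression appears only as a formula image), and the function names are likewise assumptions:

```python
import numpy as np

def representativeness(x, unlabeled_X):
    """Rep(x): average similarity of x to the n samples of the unlabeled set."""
    eps = 1e-12
    x = np.asarray(x, dtype=float)
    U = np.asarray(unlabeled_X, dtype=float)
    # Assumed cosine similarity between attribute vectors x and each x_i.
    sims = (U @ x) / (np.linalg.norm(U, axis=1) * np.linalg.norm(x) + eps)
    return float(sims.mean())

def labeling_value(uncertainty, rep):
    """Value(x) = f(x, L, u) * Rep(x)."""
    return uncertainty * rep
```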
4. the method of selecting a sample image as claimed in claim 1, wherein the uncertainty indicator for any one unlabeled target sample image in the set of unlabeled images is determined by:
calculating an information entropy index of the target sample image;
counting, according to the classification result of the target sample image, the number of labels assigned to the target sample image;
and calculating to obtain an uncertainty index of the target sample image by combining the information entropy index and the number of the labels.
5. The method of selecting a sample image as claimed in claim 4, wherein the information entropy index of the target sample image is calculated by the following formula:
Ent(x, L, u) = -∑_{y∈Y} p_θ(y|x) * log(p_θ(y|x))
wherein Ent(x, L, u) represents the information entropy index of the target sample image x, L represents the samples of the labeled image set, u represents the samples of the unlabeled image set, and p_θ(y|x) represents the probability that the target sample image x belongs to a label y, Y being a pre-constructed label category set;
the uncertainty indicator of the target sample image is calculated by the following formula:
f(x, L, u) = Ent(x, L, u) * Mul(x)^a
wherein f(x, L, u) represents the uncertainty indicator of the target sample image x, Mul(x) represents the number of labels, and a is a parameter for adjusting the relative weight.
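An illustrative sketch of the uncertainty indicator of claim 5, combining the information entropy with the label count Mul(x); the function name and the default value of a are assumptions of this sketch:

```python
import numpy as np

def uncertainty_with_label_count(probs, num_labels, a=1.0):
    """f(x, L, u) = Ent(x, L, u) * Mul(x)**a for one sample.

    probs:      predicted probability of each label y in Y
    num_labels: Mul(x), the number of labels assigned to the sample
    a:          parameter controlling the relative weight of Mul(x)
    """
    probs = np.asarray(probs, dtype=float)
    eps = 1e-12
    entropy = -np.sum(probs * np.log(probs + eps))  # Ent(x, L, u)
    return float(entropy * (num_labels ** a))

# Example: with equal entropy, a sample that received more labels scores higher.
print(uncertainty_with_label_count([0.4, 0.3, 0.3], num_labels=3, a=0.5))
```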
6. The method of selecting a sample image as claimed in claim 5, wherein the representative index of the target sample image is calculated by the following kernel density estimation formula:
Rep(x) = (1/(n*h)) * ∑_{i=1}^{n} K((x - x_i)/h)
the annotation value Value(x) of the target sample image is calculated by the following formula:
Value(x) = f(x, L, u) * Rep(x)^β;
wherein Rep(x) represents the representative index of the target sample image x, n represents the number of sample images in the unlabeled image set, h is the bandwidth of the kernel density estimation, the sample images of the unlabeled image set are represented by {x_1, x_2, ..., x_i, ..., x_n}, K(·) is a preset weight function, and β is a parameter for adjusting the relative weight.
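An illustrative sketch of a kernel density estimate for Rep(x) and the β-weighted value of claim 6; the Gaussian choice of the weight function K, the use of the Euclidean norm for multi-dimensional attribute vectors, and the function names are assumptions of this sketch:

```python
import numpy as np

def kde_representativeness(x, unlabeled_X, h=1.0):
    """Rep(x) as a kernel density estimate over the unlabeled set.

    A Gaussian kernel is used here as one illustrative choice; the claim only
    requires K to be a preset weight function and h to be the bandwidth.
    """
    x = np.asarray(x, dtype=float)
    U = np.asarray(unlabeled_X, dtype=float)
    n = U.shape[0]
    d = np.linalg.norm((U - x) / h, axis=1)         # scaled distances (x - x_i) / h
    k = np.exp(-0.5 * d ** 2) / np.sqrt(2 * np.pi)  # Gaussian kernel values K(.)
    return float(k.sum() / (n * h))

def labeling_value(uncertainty, rep, beta=1.0):
    """Value(x) = f(x, L, u) * Rep(x)**beta."""
    return uncertainty * (rep ** beta)
```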
7. The method of selecting sample images according to any one of claims 1 to 6, further comprising, after selecting and outputting a sample image with the highest labeling value from among the unlabeled sample images:
transferring the sample image with the highest labeling value, after it has been manually labeled, from the unlabeled image set to the labeled image set, so as to update the labeled image set;
taking the updated labeled image set as a training set, and performing optimization updating on the image classification model;
and if the optimization updating times of the image classification model reach the set iteration times or the accuracy of the image classification model reaches the set threshold value, determining the current image classification model as the final image classification model.
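An illustrative sketch of the iterative procedure of claim 7, stopping when either the set number of iterations or the accuracy threshold is reached; all function and parameter names (train, select, oracle, evaluate, max_iters, target_accuracy) are assumptions of this sketch:

```python
def active_learning_loop(labeled, unlabeled, train, select, oracle,
                         max_iters=20, target_accuracy=0.95, evaluate=None):
    """Iterate claim 7: label the most valuable sample, move it, retrain.

    train(labeled)      -> a new image classification model
    select(model, pool) -> index of the most valuable unlabeled sample
    oracle(sample)      -> the manual annotation for that sample
    evaluate(model)     -> optional accuracy on a held-out set
    """
    model = train(labeled)
    for _ in range(max_iters):
        if not unlabeled:
            break
        idx = select(model, unlabeled)
        sample = unlabeled.pop(idx)               # remove from the unlabeled set
        labeled.append((sample, oracle(sample)))  # add the manually labeled sample
        model = train(labeled)                    # optimize/update the model
        if evaluate is not None and evaluate(model) >= target_accuracy:
            break                                 # accuracy threshold reached
    return model
```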
8. An apparatus for selecting a sample image, comprising:
the image set acquisition module is used for acquiring an unlabeled image set and a labeled image set, wherein the unlabeled image set comprises a plurality of unlabeled sample images, and the labeled image set comprises a plurality of labeled sample images;
the classification model training module is used for training to obtain an image classification model by taking the labeled image set as a training set;
the sample image classification module is used for classifying each unlabeled sample image in the unlabeled image set by adopting the image classification model to obtain a classification result of each unlabeled sample image;
a sample annotation value determining module, configured to calculate, for each unlabeled sample image, an uncertainty index and a representative index according to its classification result, and to determine its annotation value by combining the uncertainty index and the representative index, wherein the uncertainty index is used to measure the uncertainty of the image classification result of the sample, and the representative index is used to measure the probability that the sample can serve as a representative sample of the unlabeled image set;
and the sample image selecting module is used for selecting and outputting the sample image with the highest labeling value from the unlabeled sample images.
9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method of selecting a sample image according to any one of claims 1 to 7.
10. A server comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements a method of selecting a sample image as claimed in any one of claims 1 to 7.
CN202010127598.6A 2020-02-28 2020-02-28 Method, device, storage medium and server for selecting sample image Active CN111310846B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010127598.6A CN111310846B (en) 2020-02-28 2020-02-28 Method, device, storage medium and server for selecting sample image
PCT/CN2020/119302 WO2021169301A1 (en) 2020-02-28 2020-09-30 Method and device for selecting sample image, storage medium and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010127598.6A CN111310846B (en) 2020-02-28 2020-02-28 Method, device, storage medium and server for selecting sample image

Publications (2)

Publication Number Publication Date
CN111310846A true CN111310846A (en) 2020-06-19
CN111310846B CN111310846B (en) 2024-07-02

Family

ID=71145364

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010127598.6A Active CN111310846B (en) 2020-02-28 2020-02-28 Method, device, storage medium and server for selecting sample image

Country Status (2)

Country Link
CN (1) CN111310846B (en)
WO (1) WO2021169301A1 (en)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113793604B (en) * 2021-09-14 2024-01-05 思必驰科技股份有限公司 Speech recognition system optimization method and device
CN114139726A (en) * 2021-12-01 2022-03-04 北京欧珀通信有限公司 Data processing method and device, electronic equipment and storage medium
CN116246756B (en) * 2023-01-06 2023-12-22 浙江医准智能科技有限公司 Model updating method, device, electronic equipment and medium


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9002100B2 (en) * 2008-04-02 2015-04-07 Xerox Corporation Model uncertainty visualization for active learning
CN110689038B (en) * 2019-06-25 2024-02-02 深圳市腾讯计算机系统有限公司 Training method and device for neural network model and medical image processing system
CN111310846B (en) * 2020-02-28 2024-07-02 平安科技(深圳)有限公司 Method, device, storage medium and server for selecting sample image

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100217732A1 (en) * 2009-02-24 2010-08-26 Microsoft Corporation Unbiased Active Learning
CN108764281A (en) * 2018-04-18 2018-11-06 华南理工大学 A kind of image classification method learning across task depth network based on semi-supervised step certainly
WO2019232853A1 (en) * 2018-06-04 2019-12-12 平安科技(深圳)有限公司 Chinese model training method, chinese image recognition method, device, apparatus and medium
CN109299668A (en) * 2018-08-30 2019-02-01 中国科学院遥感与数字地球研究所 A kind of hyperspectral image classification method based on Active Learning and clustering
CN109388784A (en) * 2018-09-12 2019-02-26 深圳大学 Minimum entropy Density Estimator device generation method, device and computer readable storage medium
CN109886925A (en) * 2019-01-19 2019-06-14 天津大学 A kind of aluminium material surface defect inspection method that Active Learning is combined with deep learning
CN110766080A (en) * 2019-10-24 2020-02-07 腾讯科技(深圳)有限公司 Method, device and equipment for determining labeled sample and storage medium

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021169301A1 (en) * 2020-02-28 2021-09-02 平安科技(深圳)有限公司 Method and device for selecting sample image, storage medium and server
WO2021164306A1 (en) * 2020-09-17 2021-08-26 平安科技(深圳)有限公司 Image classification model training method, apparatus, computer device, and storage medium
CN112614570A (en) * 2020-12-16 2021-04-06 上海壁仞智能科技有限公司 Sample set labeling method, pathological image classification method and classification model construction method and device
CN112614570B (en) * 2020-12-16 2022-11-25 上海壁仞智能科技有限公司 Sample set labeling method, pathological image classification method, classification model construction method and device
CN112785585A (en) * 2021-02-03 2021-05-11 腾讯科技(深圳)有限公司 Active learning-based training method and device for image video quality evaluation model
CN112785585B (en) * 2021-02-03 2023-07-28 腾讯科技(深圳)有限公司 Training method and device for image video quality evaluation model based on active learning
CN113064973A (en) * 2021-04-12 2021-07-02 平安国际智慧城市科技股份有限公司 Text classification method, device, equipment and storage medium
CN113706448B (en) * 2021-05-11 2022-07-12 腾讯医疗健康(深圳)有限公司 Method, device and equipment for determining image and storage medium
CN113706448A (en) * 2021-05-11 2021-11-26 腾讯科技(深圳)有限公司 Method, device and equipment for determining image and storage medium
CN113435540A (en) * 2021-07-22 2021-09-24 中国人民大学 Image classification method, system, medium, and device when class distribution is mismatched
CN113487617A (en) * 2021-07-26 2021-10-08 推想医疗科技股份有限公司 Data processing method, data processing device, electronic equipment and storage medium
CN113657510A (en) * 2021-08-19 2021-11-16 支付宝(杭州)信息技术有限公司 Method and device for determining data sample with marked value
CN113590764B (en) * 2021-09-27 2021-12-21 智者四海(北京)技术有限公司 Training sample construction method and device, electronic equipment and storage medium
CN113590764A (en) * 2021-09-27 2021-11-02 智者四海(北京)技术有限公司 Training sample construction method and device, electronic equipment and storage medium
CN114141382A (en) * 2021-12-10 2022-03-04 厦门影诺医疗科技有限公司 Digestive endoscopy video data screening and labeling method, system and application
CN114120048A (en) * 2022-01-26 2022-03-01 中兴通讯股份有限公司 Image processing method, electronic device and computer storage medium
WO2023143038A1 (en) * 2022-01-26 2023-08-03 中兴通讯股份有限公司 Image processing method, electronic device, and computer-readable storage medium
CN116994085A (en) * 2023-06-27 2023-11-03 中电金信软件有限公司 Image sample screening method, model training method, device and computer equipment

Also Published As

Publication number Publication date
CN111310846B (en) 2024-07-02
WO2021169301A1 (en) 2021-09-02

Similar Documents

Publication Publication Date Title
CN111310846B (en) Method, device, storage medium and server for selecting sample image
TWI689871B (en) Gradient lifting decision tree (GBDT) model feature interpretation method and device
WO2018121690A1 (en) Object attribute detection method and device, neural network training method and device, and regional detection method and device
CN111079780B (en) Training method for space diagram convolution network, electronic equipment and storage medium
CN112862093B (en) Graphic neural network training method and device
CN111127364B (en) Image data enhancement strategy selection method and face recognition image data enhancement method
WO2019015246A1 (en) Image feature acquisition
CN112231592B (en) Graph-based network community discovery method, device, equipment and storage medium
CN115439887A (en) Pedestrian re-identification method and system based on pseudo label optimization and storage medium
CN110348516B (en) Data processing method, data processing device, storage medium and electronic equipment
CN116451081A (en) Data drift detection method, device, terminal and storage medium
CN112115996A (en) Image data processing method, device, equipment and storage medium
CN111062406B (en) Heterogeneous domain adaptation-oriented semi-supervised optimal transmission method
CN110059743B (en) Method, apparatus and storage medium for determining a predicted reliability metric
CN111612021B (en) Error sample identification method, device and terminal
WO2012077818A1 (en) Method for determining conversion matrix for hash function, hash-type approximation nearest neighbour search method using said hash function, and device and computer program therefor
CN116229172A (en) Federal few-sample image classification model training method, classification method and equipment based on comparison learning
CN115982351A (en) Test question evaluation method and related device, electronic equipment and storage medium
CN117454375A (en) Malicious encryption traffic identification model training method and device and electronic equipment
CN113673583A (en) Image recognition method, recognition network training method and related device
CN113283598A (en) Model training method and device, storage medium and electronic equipment
CN111797905A (en) Target detection optimization method based on positive and negative sample sampling ratio and model fine tuning
CN117058498B (en) Training method of segmentation map evaluation model, and segmentation map evaluation method and device
CN113672783B (en) Feature processing method, model training method and media resource processing method
CN110890978B (en) Cross-region communication quality prediction method with privacy protection based on model reuse

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (ref country code: HK; ref legal event code: DE; ref document number: 40032370; country of ref document: HK)
SE01 Entry into force of request for substantive examination
GR01 Patent grant