CN114332521A - Image classification method and device, mobile terminal and computer-readable storage medium - Google Patents
- Publication number: CN114332521A (application CN202011018044.9A)
- Authority
- CN
- China
- Prior art keywords
- probability
- target type
- final
- input image
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The embodiment of the invention provides an image classification method and device, a mobile terminal, and a computer-readable storage medium, relating to the technical field of image processing. The similarities between the feature vector and a plurality of preset weight vectors are determined respectively, the similarity distribution of the target type is obtained, the extreme value probability that the input image is the target type is determined according to the similarity distribution of the target type and the similarity corresponding to the target type, and the final probability is then determined according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type. Because the extreme value probability is determined first to estimate the probability that the input image does not belong to a given in-distribution class, and the final probability combines the extreme value probability with the normalized probability, even if the input image belongs to an out-of-distribution class and has a high normalized probability on some class, its final probability on that class remains low, which improves resistance to negative samples.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to an image classification method, an image classification device, a mobile terminal and a computer readable storage medium.
Background
As technology advances, neural network models are applied ever more widely, notably to image classification. Specifically, a picture is input into a trained neural network classification model, which outputs a classification result label that determines the type of object contained in the picture.
However, existing classification models have low resistance to negative samples, which leads to inaccurate classification results.
Disclosure of Invention
In view of the above, the present invention provides an image classification method, an image classification device, a mobile terminal and a computer-readable storage medium to solve the above problems.
In a first aspect, an embodiment of the present application provides an image classification method, where the image classification method includes:
performing feature extraction on an input image to obtain a feature vector of the input image;
respectively determining the similarity between the feature vector and a plurality of preset weight vectors, wherein the preset weight vectors correspond to a plurality of preset types one to one, and the types comprise target types;
acquiring the similarity distribution of the target type, and determining the extreme value probability that the input image is the target type according to the similarity distribution of the target type and the similarity corresponding to the target type, wherein the extreme value probability characterizes the probability that the input image does not belong to the category in the similarity distribution, and the similarity distribution characterizes how the similarities between samples of the target type and the weight vector corresponding to the target type are distributed;
determining a plurality of normalized probabilities of the input image according to the similarity, wherein the normalized probabilities are in one-to-one correspondence with the types;
and determining a final probability according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type, wherein the final probability represents the probability that the input image is the target type.
In an optional implementation manner, the step of determining a final probability according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type includes:
and carrying out weighted average operation on the normalized probability by taking the extreme value probability as a weight to determine the final probability.
In an optional embodiment, the extreme value probability, the normalized probability and the final probability satisfy the following equation:
P_final_i = (1 - P_evt_i) * P_softmax_i
wherein i denotes the target type, P_final_i is the final probability corresponding to the target type, P_evt_i is the extreme value probability corresponding to the target type, and P_softmax_i is the normalized probability corresponding to the target type.
In an optional implementation manner, after the step of determining a final probability according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type, the method further includes:
determining a negative sample probability from a plurality of the final probabilities, the negative sample probability characterizing a probability that the input image does not belong to any of the plurality of types.
In an alternative embodiment, the final probabilities and the negative sample probability satisfy the equation:
P_final_f = 1 - Σ_{i=1}^{k} P_final_i
wherein P_final_f is the negative sample probability, k denotes the number of the plurality of types, i denotes the target type, and P_final_i is the final probability corresponding to the target type.
In an alternative embodiment, after the step of determining a negative sample probability from a plurality of the final probabilities, the method further comprises:
determining a type of the input image according to the plurality of final probabilities and the negative sample probability.
In an alternative embodiment, the step of determining the type of the input image according to the plurality of final probabilities and the negative sample probability comprises:
determining a maximum probability value of the plurality of final probabilities and the negative sample probabilities;
determining the type corresponding to the maximum probability value as the type of the input image.
In a second aspect, an embodiment of the present application further provides an image classification apparatus, including:
the characteristic extraction module is used for extracting the characteristics of an input image to obtain a characteristic vector of the input image;
the calculation module is used for respectively determining the similarity between the feature vector and a plurality of preset weight vectors, wherein the preset weight vectors correspond to preset types in a one-to-one mode, and the types comprise target types;
the extreme value estimation module is used for acquiring the similarity distribution condition of the target type, and determining the extreme value probability of the input image as the target type according to the similarity distribution condition of the target type and the similarity corresponding to the target type, wherein the extreme value probability represents the probability that the input image does not belong to the category in the similarity distribution condition;
a normalized probability determining module, configured to determine multiple normalized probabilities of the input image according to the similarity, where the multiple normalized probabilities are in one-to-one correspondence with the multiple types;
and the final probability determining module is used for determining a final probability according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type, wherein the final probability represents the probability that the input image is the target type.
In a third aspect, an embodiment of the present application further provides a mobile terminal, including a processor and a memory, where the memory stores machine executable instructions that can be executed by the processor, and the processor can execute the machine executable instructions to implement the image classification method in any one of the above embodiments.
In a fourth aspect, the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the image classification method in any one of the above-mentioned embodiments.
According to the image classification method and device, the mobile terminal and the computer-readable storage medium, the similarities between the feature vector and the preset weight vectors are determined respectively, the similarity distribution of the target type is obtained, the extreme value probability that the input image is the target type is determined according to the similarity distribution of the target type and the similarity corresponding to the target type, and the final probability is then determined according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type. Because the extreme value probability is determined first to estimate the probability that the input image does not belong to a given in-distribution class, and the final probability combines the extreme value probability with the normalized probability, even if the input image belongs to an out-of-distribution class and has a high normalized probability on some class, its final probability on that class remains low, which improves resistance to negative samples.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 shows a flowchart of an image classification method provided by an embodiment of the present invention.
Fig. 2 shows a detailed flowchart of S107 in fig. 1.
Fig. 3 is a functional block diagram of an image classification apparatus according to an embodiment of the present invention.
Fig. 4 is a block diagram of a mobile terminal according to an embodiment of the present invention.
Icon: 100-image classification means; 110-a feature extraction module; 120-a calculation module; 130-extreme estimation module; 140-a normalized probability determination module; 150-a final probability determination module; 160-type determination module; 200-a mobile terminal; 201-a radio frequency unit; 202-a network module; 203-an audio output unit; 204-an input unit; 2041-an image processor; 2042-microphone; 205-a sensor; 206-a display unit; 2061 — a display panel; 207-user input unit; 2071-touch panel; 2072-other input devices; 208-an interface unit; 209-a memory; 210-a processor; 211-power supply.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It is noted that relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
In a conventional neural network, a classification model is typically used to determine the normalized probability that an input image belongs to a certain type, and a classification result label is then output according to that normalized probability. However, traditional neural networks suffer from low resistance to negative samples.
In the prior art, negative samples are mostly handled by threshold filtering of the model output probability: a result of a certain type is adopted only when the probability the model outputs for it exceeds a manually set threshold, the model then being considered sufficiently confident in the result; otherwise, no result is output.
However, this method improves negative sample resistance only to a limited extent. It requires a large amount of testing and parameter-tuning work and experience, and the data relied upon during tuning must be obtained by testing in many real environments, at high cost. Meanwhile, tuning depends on the experience of the test engineer: if the threshold is too large, the model's accuracy drops and correct classifications cannot be obtained in the target scene; if it is too small, negative sample resistance is not noticeably improved. Further, when the model is updated and maintained later, the previously tuned threshold can no longer be used and must be re-tested, so maintenance cost is high. Finally, the method cannot fundamentally improve the model's negative sample resistance: some out-of-distribution classes may still receive a high classification probability on a certain class.
The inventor finds that the main reasons for the low negative sample resistance of the traditional classification model are as follows:
First, the traditional classification model uses normalized probabilities to determine class labels. The probability produced by the softmax function, because it is normalized (i.e., the probabilities of the classes sum to 1), is essentially a relative probability: it implicitly assumes that the input image already belongs to some in-distribution class. In other words, the softmax-normalized probability answers the question: given that the input image is not of an out-of-distribution class, what is the probability that it belongs to a certain category? The softmax probability produced by a traditional classification model is therefore not applicable to out-of-distribution data.
A neural network model obtains the ability to classify pictures only by training on a large amount of picture data; the classes contained in the data used in training are called in-distribution classes, and accordingly, the data classes not used in training are out-of-distribution classes. For example, if pictures of apples and oranges are used to train a classification network that distinguishes apples from oranges, then apple and orange are in-distribution classes, while banana is an out-of-distribution class.
Secondly, the last layer of a conventional classification model is linear regression, so its decision boundary is linear with respect to the model's last-layer features, and a linear decision boundary over an unbounded feature space is ill-suited to negative sample recognition.
Therefore, the embodiment of the application provides an image classification method, which can determine the extreme value probability to determine the probability that an input image does not belong to a certain in-distribution class, and determine the final probability by combining the extreme value probability and the normalization probability, so that even if the input image belongs to an out-of-distribution data class and the normalization probability of the input image on a certain class is high, the final probability of the input image on the class is low, and the effect of improving the negative sample resistance is achieved.
The following will describe embodiments of the present application in detail.
Referring to fig. 1, which is a flowchart of an image classification method according to an embodiment of the present application, an execution subject of a processing flow described in the embodiment may be a mobile terminal 200. The image classification method comprises the following steps:
s101, extracting the characteristics of the input image to obtain the characteristic vector of the input image.
It should be noted that the mobile terminal 200 is configured with a classification model that has been trained to convergence. Therefore, after receiving the input image, the mobile terminal 200 may directly extract the feature vector of the input image. Algorithms for extracting features include, but are not limited to, the Histogram of Oriented Gradients (HOG) algorithm and the Scale-Invariant Feature Transform (SIFT) algorithm.
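As a purely illustrative sketch (not the patent's own extractor), a toy HOG-style descriptor can be written in a few lines of NumPy. Real HOG divides the image into cells and blocks with local contrast normalization, which is omitted here; the function name and bin count are assumptions:

```python
import numpy as np

def orientation_histogram(image, bins=9):
    """Toy HOG-style descriptor: one histogram of gradient orientations
    weighted by gradient magnitude (illustrative only; real HOG adds
    cells, blocks and block normalization)."""
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    hist, _ = np.histogram(angle, bins=bins, range=(0.0, 180.0),
                           weights=magnitude)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist          # unit-length vector
```

A unit-length feature vector is convenient here because the next step (S102) compares it to the weight vectors by cosine similarity.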
S102, determining similarity of the feature vector and a plurality of preset weight vectors respectively, wherein the preset weight vectors correspond to a plurality of preset types one to one, and the plurality of types comprise target types.
It should be noted that the types the classification model can identify are the preset types, and each type has a corresponding preset weight vector. Specifically, the plurality of preset weight vectors may come from the last fully-connected layer of a classification model trained to convergence on the in-distribution data set; each weight vector in that layer can be regarded as the feature-space representative vector of one in-distribution data class, i.e., a preset weight vector.
In an alternative embodiment, the similarity may be a cosine similarity, calculated by the following equation:
cos_s_i = (x · y_i) / (||x|| · ||y_i||)
wherein i denotes the type, cos_s_i is the similarity corresponding to type i, x is the feature vector, and y_i is the weight vector corresponding to type i.
Further, the target type may be one or more of the plurality of types. The similarity corresponding to the target type characterizes how close the feature vector is to the weight vector corresponding to the target type.
In other embodiments, the similarity may be other types of similarities, such as euclidean distance.
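For the cosine variant, the similarity of S102 against all preset weight vectors can be computed at once; the function and variable names below are illustrative:

```python
import numpy as np

def cosine_similarities(x, weight_vectors):
    """Cosine similarity between feature vector x and each preset weight
    vector y_i: cos_s_i = (x . y_i) / (||x|| * ||y_i||)."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(weight_vectors, dtype=float)       # shape (k, d)
    return (W @ x) / (np.linalg.norm(W, axis=1) * np.linalg.norm(x))
```

Each entry of the returned vector is the similarity for one preset type, ready for the extreme value estimation of S103 and the normalization of S104.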
S103, obtaining the similarity distribution condition of the target type, and determining the extreme value probability of the input image as the target type according to the similarity distribution condition of the target type and the similarity corresponding to the target type.
It should be noted that the similarity distribution may specifically refer to the distribution of the similarities corresponding to a certain type; i.e., it characterizes how the similarities between samples of the target type and the weight vector corresponding to the target type are distributed. In an optional implementation, after training of the classification model is completed, the similarities between the feature vector of each sample in the validation set and the plurality of preset weight vectors are collected, all similarities are grouped by the type they correspond to, and determining the distribution of each group of similarities gives the similarity distribution of the corresponding type.
For example, suppose a trained classification model can identify three types of images A, B and C, and the validation set contains 3n samples, n per class. Each sample yields 3 similarities (corresponding to categories A, B and C respectively), so the 3n samples yield 9n similarities. Grouping these 9n similarities by the types the samples belong to gives the n A-type similarities of the A-type samples, the n B-type similarities of the B-type samples, and the n C-type similarities of the C-type samples.
In an alternative embodiment, the similarity distribution of a type can be obtained by fitting a set of extreme value estimation parameters, using a Weibull distribution, to all similarities belonging to that type.
After the similarity distribution of the target type is obtained, an extreme value theory (EVT) estimation is performed according to the similarity distribution of the target type and the similarity corresponding to the target type, thereby determining the extreme value probability for the target type.
It will be appreciated that the extreme probability characterizes the probability that the input image does not belong to a class within the similarity distribution, in other words the extreme probability of the target type characterizes the "probability of the input image belonging to an extreme of the target type".
Thus, the greater the extreme value probability, the greater the probability that the input image belongs to an out-of-distribution category. Generally, if the input image belongs to an out-of-distribution category, the extreme value probability approaches 1; if it belongs to an in-distribution category, the extreme value probability is close to 0.
It should be noted that the target type may be multiple, that is, extreme value probabilities of all types may be determined by this step.
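The patent does not fix the exact fitting and scoring procedure. One plausible sketch, borrowed from OpenMax-style Weibull-tail calibration, is the following; the tail size, the conversion of similarity to distance, and the use of SciPy's `weibull_min` are all assumptions:

```python
import numpy as np
from scipy.stats import weibull_min

def fit_tail(class_similarities, tail_size=20):
    """Fit a Weibull to the largest cosine *distances* (i.e. the lowest
    similarities) of one class's validation samples. This is one common
    way to realize the per-type similarity distribution of S103."""
    d = 1.0 - np.asarray(class_similarities, dtype=float)  # cosine distance
    tail = np.sort(d)[-tail_size:]                         # farthest samples
    return weibull_min.fit(tail, floc=0)                   # (c, loc, scale)

def extreme_probability(similarity, params):
    """P_evt for one type: the Weibull CDF at the input's distance.
    A distance far beyond the fitted tail gives a value near 1."""
    c, loc, scale = params
    return weibull_min.cdf(1.0 - similarity, c, loc=loc, scale=scale)
```

Under this sketch, an in-distribution input (similarity inside the class's usual range) gets P_evt near 0, while an input whose similarity falls far below the class distribution gets P_evt near 1, matching the behavior described above.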
And S104, determining a plurality of normalization probabilities of the input image according to the similarity, wherein the normalization probabilities correspond to the types one by one.
It will be appreciated that the normalized probability may be determined using softmax or AM-Softmax. In a preferred embodiment, the normalized probability is determined by AM-Softmax; using AM-Softmax together with model optimization methods such as mixup and AC-block can improve the generalization of the model, so that features more representative of the in-distribution categories are extracted.
And S105, determining a final probability according to the extreme value probability corresponding to the target type and the normalization probability corresponding to the target type, wherein the final probability represents the probability that the input image is the target type.
In an alternative embodiment, the final probability is determined by performing a weighted average operation on the normalized probabilities with the extreme probability as a weight.
Specifically, the extreme value probability, the normalized probability and the final probability satisfy the following equation:
P_final_i = (1 - P_evt_i) * P_softmax_i
where i denotes the target type, P_final_i is the final probability corresponding to the target type, P_evt_i is the extreme value probability corresponding to the target type, and P_softmax_i is the normalized probability corresponding to the target type.
Understandably, since P_evt_i characterizes the probability that the input image is an extreme value of the target type, 1 - P_evt_i characterizes the probability that the input image is not an extreme value of that type. Using this as a weight measuring the probability that the input image belongs to an in-distribution category, and multiplying it by the normalized probability of the same type, yields the final probability that the input image is of that type.
It is understood that if the trained model can identify n types, the final probability of the input image being each of the n types can be obtained.
Assuming the input image belongs to an in-distribution category, the extreme value probability corresponding to each type is close to 0, so P_final_i ≈ P_softmax_i, similar to a conventional classification model. If the input image belongs to an out-of-distribution category, the extreme value probability corresponding to each type approaches 1, so even if the normalized probability of some type is close to 1, the final probability of that type is close to 0, which improves resistance to negative samples.
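The combination rule of S105 and its behavior on the two regimes can be checked numerically; all probability values below are invented for illustration:

```python
import numpy as np

def final_probabilities(p_softmax, p_evt):
    """S105: P_final_i = (1 - P_evt_i) * P_softmax_i."""
    return (1.0 - np.asarray(p_evt, dtype=float)) * np.asarray(p_softmax, dtype=float)

# In-distribution input: extreme probabilities near 0 -> finals track softmax.
in_dist = final_probabilities([0.90, 0.05, 0.05], [0.01, 0.02, 0.02])
# Out-of-distribution input: extreme probabilities of 1 -> all finals vanish.
out_dist = final_probabilities([0.90, 0.05, 0.05], [1.0, 1.0, 1.0])
```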
In addition, because the final probability is determined using the similarity, and the similarity can be regarded as the inner product of two unit-normalized vectors, the feature space after unit normalization is a high-dimensional sphere rather than a traditional Euclidean space that extends infinitely. After extreme value estimation, the decision boundary on the high-dimensional sphere is a ring on the sphere, and this ring naturally delimits the finite region on the sphere in which vectors of a given type are allowed to lie.
And S106, determining the probability of the negative sample according to the plurality of final probabilities, wherein the probability of the negative sample represents the probability that the input image does not belong to any one of the plurality of types.
Specifically, the plurality of final probabilities and the negative sample probability satisfy the equation:
P_final_f = 1 - Σ_{i=1}^{k} P_final_i
wherein P_final_f is the negative sample probability, k denotes the number of the plurality of types, i denotes the target type, and P_final_i is the final probability corresponding to the target type.
That is, assuming the total probability of all output results is 1, subtracting from it the final probabilities of the input image being each of the n types gives the probability that the input image does not belong to any of the plurality of types, i.e., the probability that the input image is a negative sample.
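The negative-sample probability of S106 is then simply the residual probability mass:

```python
def negative_sample_probability(final_probs):
    """S106: P_final_f = 1 - sum_i P_final_i."""
    return 1.0 - float(sum(final_probs))
```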
And S107, determining the type of the input image according to the plurality of final probabilities and the negative sample probabilities.
It is to be understood that, assuming n types can be recognized, the mobile terminal 200 may output n + 1 kinds of result, i.e., one of the n types or that the input image is a negative sample.
Please refer to fig. 2, which is a detailed flowchart of S107. S107 includes:
s1071, a maximum probability value of the plurality of final probabilities and the negative sample probability is determined.
For example, the plurality of final probabilities and negative sample probabilities may be ranked to determine a maximum probability value of the plurality of final probabilities and negative sample probabilities.
S1072, determining the type corresponding to the maximum probability value as the type of the input image.
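Steps S1071 and S1072 amount to a threshold-free argmax over the k final probabilities plus the negative sample probability. A minimal sketch (names are assumptions for illustration):

```python
def classify(final_probs, type_names):
    """Append the negative-sample probability, then take the argmax.

    No confidence threshold is needed: "negative sample" competes with
    the known types as an (n+1)-th output.
    """
    p_neg = 1.0 - sum(final_probs)
    probs = list(final_probs) + [p_neg]
    names = list(type_names) + ["negative sample"]
    best = max(range(len(probs)), key=lambda i: probs[i])
    return names[best]

print(classify([0.2, 0.3, 0.1], ["apple", "banana", "orange"]))  # negative sample
print(classify([0.6, 0.2, 0.1], ["apple", "banana", "orange"]))  # apple
```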
That is, the method and the device can determine the type of the input image without setting a threshold, have high negative sample resistance, save the cost of the threshold determination process, and avoid the problem of inaccurate classification result caused by unreasonable threshold setting.
Taking the classification model capable of identifying apples, bananas and oranges as an example, the principle of the image classification method provided by the embodiment of the application is explained.
First, the preset weight vectors include a first weight vector, a second weight vector and a third weight vector corresponding to apple, banana and orange respectively. After the feature vector of the input image is extracted, a first similarity between the feature vector and the first weight vector, a second similarity between the feature vector and the second weight vector, and a third similarity between the feature vector and the third weight vector are calculated respectively.
And then respectively acquiring a first similarity distribution condition, a second similarity distribution condition and a third similarity distribution condition which respectively correspond to the apple, the banana and the orange, carrying out EVT on the first similarity and the first similarity distribution condition to obtain a first extreme value probability P1, carrying out EVT on the second similarity and the second similarity distribution condition to obtain a second extreme value probability P2, and carrying out EVT on the third similarity and the third similarity distribution condition to obtain a third extreme value probability P3.
And then respectively calculating a first normalized probability M1, a second normalized probability M2 and a third normalized probability M3 of the input image respectively being apple, banana and orange.
Then, a first final probability is determined from Pf1 = M1 × (1 − P1), a second final probability from Pf2 = M2 × (1 − P2), and a third final probability from Pf3 = M3 × (1 − P3). The first final probability Pf1 represents the probability that the input image is an apple, the second final probability Pf2 represents the probability that the input image is a banana, and the third final probability Pf3 represents the probability that the input image is an orange.
Then a negative sample probability Pfu is determined from Pfu = 1 − Pf1 − Pf2 − Pf3, where Pfu represents the probability that the input image is not any one of apple, banana or orange.
Finally, the maximum of the first final probability Pf1, the second final probability Pf2, the third final probability Pf3 and the negative sample probability Pfu is determined, and the type corresponding to the maximum probability value is determined as the type of the input image. If the first final probability Pf1 is the maximum, the type of the input image may be determined to be apple; if the negative sample probability Pfu is the maximum, the input image may be determined to be a negative sample.
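The apple/banana/orange walkthrough above can be sketched end to end. The numbers below are hypothetical, chosen to show how a high extreme value probability suppresses an otherwise confident normalized probability:

```python
# Hypothetical normalized probabilities M_i and extreme value
# probabilities P_i for the three known types.
M = {"apple": 0.7, "banana": 0.2, "orange": 0.1}
P = {"apple": 0.9, "banana": 0.5, "orange": 0.5}  # high P_i => similarity is extreme
                                                  # for that type's distribution

# Final probability Pf_i = M_i * (1 - P_i)
Pf = {name: M[name] * (1 - P[name]) for name in M}

# Negative sample probability Pfu = 1 - Pf1 - Pf2 - Pf3
Pf["negative sample"] = 1 - sum(M[n] * (1 - P[n]) for n in M)

# Even though the normalized probability for "apple" is high (0.7),
# the extreme value probability 0.9 drives Pf_apple down to 0.07,
# so the negative sample wins the argmax.
result = max(Pf, key=Pf.get)
print(result)  # negative sample
```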
In order to perform the corresponding steps in the above embodiments and various possible manners, an implementation of the image classification apparatus 100 is given below. Referring to fig. 3, fig. 3 is a functional block diagram of the image classification apparatus 100 according to an embodiment of the present application. It should be noted that the basic principle and technical effect of the image classification apparatus 100 provided in this embodiment are the same as those of the above embodiments; for brevity, for any part not mentioned in this embodiment, reference may be made to the corresponding contents of the above embodiments. The image classification apparatus 100 includes: a feature extraction module 110, a calculation module 120, an extreme value estimation module 130, a normalized probability determination module 140, a final probability determination module 150, and a type determination module 160.
The feature extraction module 110 is configured to perform feature extraction on the input image to obtain a feature vector of the input image.
It is to be appreciated that in an alternative embodiment, the feature extraction module 110 may be configured to execute S101 to implement the corresponding function.
The calculating module 120 is configured to determine similarity between the feature vector and a plurality of preset weight vectors, where the preset weight vectors correspond to a plurality of preset types one to one, and the plurality of types include a target type.
It is to be appreciated that in an alternative embodiment, the calculation module 120 may be configured to execute S102 to implement the corresponding functions.
The extreme value estimation module 130 is configured to obtain a similarity distribution condition of the target type, and determine an extreme value probability that the input image is the target type according to the similarity distribution condition of the target type and the similarity corresponding to the target type.
It is to be appreciated that in an alternative embodiment, the extreme value estimation module 130 may be configured to execute S103 to implement the corresponding function.
The normalized probability determination module 140 is configured to determine a plurality of normalized probabilities of the input image according to the similarity, where the plurality of normalized probabilities correspond to a plurality of types one to one.
It is to be appreciated that in an alternative embodiment, the normalized probability determination module 140 may be configured to execute S104 to implement the corresponding functions.
The final probability determination module 150 is configured to determine a final probability according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type, where the final probability represents a probability that the input image is the target type.
It is to be appreciated that in an alternative embodiment, the final probability determination module 150 may be configured to execute S105 to implement the corresponding function.
The final probability determination module 150 is further configured to determine a negative sample probability from the plurality of final probabilities, the negative sample probability characterizing a probability that the input image does not belong to any of the plurality of types.
It is to be appreciated that in an alternative embodiment, the final probability determination module 150 may be configured to execute S106 to implement the corresponding function.
The type determining module 160 is configured to determine the type of the input image according to the plurality of final probabilities and the negative sample probability.
Specifically, the type determining module 160 is configured to determine a maximum probability value of the plurality of final probabilities and the negative sample probabilities, and determine a type corresponding to the maximum probability value as the type of the input image.
It is to be appreciated that in an alternative embodiment, the type determining module 160 may be configured to execute S107 to implement the corresponding function.
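The cooperation of modules 110 to 160 can be sketched as a single pipeline. The class and method names below are illustrative assumptions, not the patent's implementation; softmax is used for the normalized probability, and the extreme value step is abstracted as a caller-supplied function:

```python
import numpy as np

class ImageClassifier:
    """Illustrative composition of modules 110-160: similarity ->
    extreme value estimation -> normalization -> final probability -> type."""

    def __init__(self, weight_vectors, type_names, evt_fn):
        # Preset weight vectors, unit-normalized (one per known type).
        self.W = np.array([w / np.linalg.norm(w) for w in weight_vectors])
        self.names = list(type_names)
        # evt_fn(i, s): extreme value probability that similarity s is
        # extreme for type i's similarity distribution (module 130).
        self.evt_fn = evt_fn

    def classify(self, feature):
        f = feature / np.linalg.norm(feature)
        sims = self.W @ f                                   # module 120
        p_evt = np.array([self.evt_fn(i, s) for i, s in enumerate(sims)])
        p_soft = np.exp(sims) / np.exp(sims).sum()          # module 140
        p_final = (1 - p_evt) * p_soft                      # module 150
        p_neg = 1 - p_final.sum()                           # negative sample
        probs = np.append(p_final, p_neg)
        labels = self.names + ["negative sample"]
        return labels[int(np.argmax(probs))]                # module 160
```

For example, with two orthogonal weight vectors and an EVT function that always returns 0, a feature aligned with the first weight vector is classified as the first type; if the EVT function returns a value near 1 for every type, the negative sample wins.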
To sum up, according to the image classification method and apparatus, the mobile terminal 200, and the computer-readable storage medium provided in the embodiments of the present application, the similarity between the feature vector and each of the preset weight vectors is determined, the similarity distribution of the target type is obtained, the extreme value probability that the input image is the target type is determined according to the similarity distribution of the target type and the similarity corresponding to the target type, and the final probability is determined according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type. Because the extreme value probability is determined first, estimating the probability that the input image does not belong to a given in-distribution class, and the final probability combines the extreme value probability with the normalized probability, even if the input image belongs to an out-of-distribution class and its normalized probability for some class is high, its final probability for that class remains low, which improves resistance to negative samples.
In the several embodiments provided in the present application, the coupling or direct coupling or communication connection between the modules shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or modules may be in an electrical, mechanical or other form.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
Fig. 4 is a block diagram of the mobile terminal 200. The mobile terminal 200 includes a radio frequency unit 201, a network module 202, an audio output unit 203, an input unit 204, a sensor 205, a display unit 206, a user input unit 207, an interface unit 208, a memory 209, a processor 210, a power supply 211, and the like. Those skilled in the art will appreciate that the mobile terminal architecture illustrated in fig. 4 is not intended to limit the mobile terminal 200, and that the mobile terminal 200 may include more or fewer components than those illustrated, or some components may be combined, or the components may be arranged differently. In the embodiment of the present invention, the mobile terminal 200 includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, and the like.
The processor 210 is configured to determine similarity between the feature vector and a plurality of preset weight vectors, acquire a similarity distribution condition of the target type, determine an extreme probability that the input image is the target type according to the similarity distribution condition of the target type and the similarity corresponding to the target type, and determine a final probability according to the extreme probability corresponding to the target type and the normalization probability corresponding to the target type.
It should be understood that, in the embodiment of the present invention, the radio frequency unit 201 may be used for receiving and sending signals during a message transmission and reception process or a call process; specifically, it receives downlink data from a base station and forwards the received downlink data to the processor 210 for processing, and transmits uplink data to the base station. In general, the radio frequency unit 201 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 201 can also communicate with a network and other devices through a wireless communication system.
The mobile terminal provides the user with wireless broadband internet access through the network module 202, such as helping the user send and receive e-mails, browse webpages, access streaming media, and the like.
The audio output unit 203 may convert audio data received by the radio frequency unit 201 or the network module 202 or stored in the memory 209 into an audio signal and output as sound. Also, the audio output unit 203 may also provide audio output related to a specific function performed by the mobile terminal 200 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 203 includes a speaker, a buzzer, a receiver, and the like.
The input unit 204 is used to receive an audio or video signal. The input unit 204 may include a Graphics Processing Unit (GPU) 2041 and a microphone 2042; the graphics processing unit 2041 processes image data of a still picture or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 206. The image frames processed by the graphics processing unit 2041 may be stored in the memory 209 (or other storage medium) or transmitted via the radio frequency unit 201 or the network module 202. The microphone 2042 may receive sound and process it into audio data. In the case of a phone call mode, the processed audio data may be converted into a format transmittable to a mobile communication base station and output via the radio frequency unit 201.
The mobile terminal 200 also includes at least one sensor 205, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 2061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 2061 and/or the backlight when the mobile terminal 200 is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used to identify the posture of the mobile terminal (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), and vibration identification related functions (such as pedometer, tapping); the sensors 205 may also include fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc., which are not described in detail herein.
The display unit 206 is used to display information input by the user or information provided to the user. The display unit 206 may include a display panel 2061, and the display panel 2061 may be configured in the form of a Liquid Crystal Display (LCD), an organic light-emitting diode (OLED), or the like.
The user input unit 207 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the user input unit 207 includes a touch panel 2071 and other input devices 2072. Touch panel 2071, also referred to as a touch screen, may collect touch operations by a user on or near the touch panel 2071 (e.g., user operation on or near the touch panel 2071 using a finger, a stylus, or any other suitable object or attachment). The touch panel 2071 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 210, and receives and executes commands sent by the processor 210. In addition, the touch panel 2071 may be implemented by using various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The user input unit 207 may include other input devices 2072 in addition to the touch panel 2071. In particular, the other input devices 2072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not further described herein.
Further, the touch panel 2071 may be overlaid on the display panel 2061. When the touch panel 2071 detects a touch operation on or near it, the touch operation is transmitted to the processor 210 to determine the type of the touch event, and the processor 210 then provides a corresponding visual output on the display panel 2061 according to the type of the touch event. Although the touch panel 2071 and the display panel 2061 are shown as two separate components in fig. 4, in some embodiments the touch panel 2071 and the display panel 2061 may be integrated to implement the input and output functions of the mobile terminal, which is not limited herein.
The interface unit 208 is an interface through which an external device is connected to the mobile terminal 200. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 208 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the mobile terminal 200 or may be used to transmit data between the mobile terminal 200 and external devices.
The memory 209 may be used to store software programs as well as various data. The memory 209 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 209 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The processor 210 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by operating or executing software programs and/or modules stored in the memory 209 and calling data stored in the memory 209, thereby performing overall monitoring of the mobile terminal. Processor 210 may include one or more processing units; preferably, the processor 210 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 210.
The mobile terminal 200 may further include a power source 211 (e.g., a battery) for supplying power to various components, and preferably, the power source 211 may be logically connected to the processor 210 through a power management system, so as to manage charging, discharging, and power consumption management functions through the power management system.
In addition, the mobile terminal 200 includes some functional modules that are not shown, and thus, the detailed description thereof is omitted.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by the processor 210, the computer program implements each process of the above-mentioned embodiment of the image classification method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. An image classification method, characterized in that the image classification method comprises:
performing feature extraction on an input image to obtain a feature vector of the input image;
respectively determining the similarity between the feature vector and a plurality of preset weight vectors, wherein the preset weight vectors correspond to a plurality of preset types one to one, and the types comprise target types;
acquiring the similarity distribution condition of the target type, and determining the extreme value probability of the input image as the target type according to the similarity distribution condition of the target type and the similarity corresponding to the target type, wherein the extreme value probability represents the probability that the input image does not belong to the category in the similarity distribution condition, and the similarity distribution condition represents the distribution condition of the similarity of the sample of the target type and the weight vector corresponding to the target type;
determining a plurality of normalized probabilities of the input image according to the similarity, wherein the normalized probabilities are in one-to-one correspondence with the types;
and determining a final probability according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type, wherein the final probability represents the probability that the input image is the target type.
2. The image classification method according to claim 1, wherein the step of determining a final probability according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type comprises:
and carrying out weighted average operation on the normalized probability by taking the extreme value probability as a weight to determine the final probability.
3. The image classification method according to claim 2, wherein the extreme value probability, the normalized probability, and the final probability satisfy the equation:
P_final_i = (1 − P_evt_i) × P_softmax_i
wherein i denotes the target type, P_final_i is the final probability corresponding to the target type, P_evt_i is the extreme value probability corresponding to the target type, and P_softmax_i is the normalized probability corresponding to the target type.
4. The method of image classification according to claim 1, characterized in that after the step of determining a final probability according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type, the method further comprises:
determining a negative sample probability from a plurality of the final probabilities, the negative sample probability characterizing a probability that the input image does not belong to any of the plurality of types.
5. The image classification method according to claim 4, wherein the final probabilities and the negative sample probability satisfy the equation:

P_final_f = 1 − Σ_{i=1}^{k} P_final_i

wherein P_final_f is the negative sample probability, k denotes the number of the plurality of types, i denotes the target type, and P_final_i is the final probability corresponding to the target type.
6. The method of image classification according to claim 4, characterized in that after the step of determining a negative sample probability from a plurality of the final probabilities, the method further comprises:
determining a type of the input image according to the plurality of final probabilities and the negative sample probability.
7. The image classification method according to claim 6, wherein the step of determining the type of the input image according to the plurality of final probabilities and the negative sample probability comprises:
determining a maximum probability value of the plurality of final probabilities and the negative sample probabilities;
determining the type corresponding to the maximum probability value as the type of the input image.
8. An image classification apparatus, characterized by comprising:
the characteristic extraction module is used for extracting the characteristics of an input image to obtain a characteristic vector of the input image;
the calculation module is used for respectively determining the similarity between the feature vector and a plurality of preset weight vectors, wherein the preset weight vectors correspond to preset types in a one-to-one mode, and the types comprise target types;
the extreme value estimation module is used for acquiring the similarity distribution condition of the target type, and determining the extreme value probability of the input image as the target type according to the similarity distribution condition of the target type and the similarity corresponding to the target type, wherein the extreme value probability represents the probability that the input image does not belong to the category in the similarity distribution condition;
a normalized probability determining module, configured to determine multiple normalized probabilities of the input image according to the similarity, where the multiple normalized probabilities are in one-to-one correspondence with the multiple types;
and the final probability determining module is used for determining a final probability according to the extreme value probability corresponding to the target type and the normalized probability corresponding to the target type, wherein the final probability represents the probability that the input image is the target type.
9. A mobile terminal comprising a processor and a memory, the memory storing machine executable instructions executable by the processor to implement the image classification method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the image classification method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011018044.9A CN114332521B (en) | 2020-09-24 | 2020-09-24 | Image classification method, device, mobile terminal and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114332521A true CN114332521A (en) | 2022-04-12 |
CN114332521B CN114332521B (en) | 2024-10-01 |
Family
ID=81010848
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011018044.9A Active CN114332521B (en) | 2020-09-24 | 2020-09-24 | Image classification method, device, mobile terminal and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114332521B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107505133A (en) * | 2017-08-10 | 2017-12-22 | 滁州学院 | The probability intelligent diagnosing method of rolling bearing fault based on adaptive M RVM |
CN110197252A (en) * | 2018-02-26 | 2019-09-03 | Gsi 科技公司 | Deep learning based on distance |
CN110263697A (en) * | 2019-06-17 | 2019-09-20 | 哈尔滨工业大学(深圳) | Pedestrian based on unsupervised learning recognition methods, device and medium again |
CN111353542A (en) * | 2020-03-03 | 2020-06-30 | 腾讯科技(深圳)有限公司 | Training method and device of image classification model, computer equipment and storage medium |
-
2020
- 2020-09-24 CN CN202011018044.9A patent/CN114332521B/en active Active
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||