CN111325242A - Image classification method, terminal and computer storage medium - Google Patents
Image classification method, terminal and computer storage medium
- Publication number
- CN111325242A (application CN202010078654.1A)
- Authority
- CN
- China
- Prior art keywords
- image
- trained
- feature vector
- fine
- group
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
An embodiment of the application discloses an image classification method applied to a terminal, comprising the following steps: acquiring an image to be classified, and classifying the image to be classified by adopting a pre-trained fine-grained classification model to obtain a classified image. Embodiments of the application also provide a terminal and a computer storage medium.
Description
Technical Field
The present application relates to fine-grained image classification technologies, and in particular, to a method, a terminal, and a computer storage medium for classifying images.
Background
The fine-grained image classification technology is a branch of image classification. Because its categories all belong to the same coarse class (for example, dogs of different breeds all belong to the coarse class of dogs), the differences between categories are relatively small, while the diversity of backgrounds and appearances still produces considerable variation within each category.
At present, image fine-grained classification methods can be roughly divided into several branches. The branch concerned here mainly focuses on improving general classification algorithms, whose most common loss function is softmax; however, softmax provides low inter-class discrimination for fine-grained classification, so adopting the existing fine-grained image classification algorithms results in poor image classification accuracy.
Disclosure of Invention
The embodiment of the application provides an image classification method, a terminal and a computer storage medium, which can improve the accuracy of image classification.
The technical scheme of the application is realized as follows:
the embodiment of the application provides an image classification method, which is applied to a terminal and comprises the following steps:
acquiring an image to be classified;
classifying the images to be classified by adopting a pre-trained fine-grained classification model to obtain classified images;
the trained fine-grained classification model is obtained by adopting the following method:
grouping images of the image set to be trained according to the acquired image labels of the image set to be trained to obtain a grouped image set to be trained; wherein the image labels are used for characterizing the categories of the images;
extracting feature vectors from the images in the grouped image set to be trained to obtain a feature vector group;
and training the fine-grained classification model by adopting the feature vector group to determine the model parameters at which the values of the loss function and the objective function in the fine-grained classification model are minimum, so as to obtain the trained fine-grained classification model.
An embodiment of the application provides a terminal, which includes:
the acquisition module is used for acquiring an image to be classified;
the classification module is used for classifying the images to be classified by adopting a pre-trained fine-grained classification model to obtain classified images;
the trained fine-grained classification model is obtained by adopting the following method:
grouping images of the image set to be trained according to the acquired image labels of the image set to be trained to obtain a grouped image set to be trained; wherein the image labels are used for characterizing the categories of the images;
extracting feature vectors from the images in the grouped image set to be trained to obtain a feature vector group;
and training the fine-grained classification model by adopting the feature vector group to determine the model parameters at which the values of the loss function and the objective function in the fine-grained classification model are minimum, so as to obtain the trained fine-grained classification model.
An embodiment of the present application further provides a terminal, where the terminal includes: a processor and a storage medium storing instructions executable by the processor, wherein the storage medium relies on the processor to perform operations through a communication bus, and when the instructions are executed by the processor, the image classification method of one or more of the embodiments is performed.
The embodiment of the application provides a computer storage medium, which stores executable instructions, and when the executable instructions are executed by one or more processors, the processors execute the image classification method of one or more embodiments.
Embodiments of the application provide an image classification method, a terminal, and a computer storage medium. The method is applied to a terminal and includes: obtaining an image to be classified, and classifying the image to be classified by adopting a pre-trained fine-grained classification model to obtain a classified image. The trained fine-grained classification model is obtained as follows: the images of an image set to be trained are grouped according to the acquired image labels of the image set to be trained to obtain a grouped image set to be trained, where the image labels are used for characterizing the categories of the images; feature vectors are extracted from the images in the grouped image set to be trained to obtain a feature vector group; and the fine-grained classification model is trained by adopting the feature vector group to determine the model parameters at which the values of the loss function and the objective function in the fine-grained classification model are minimum, so as to obtain the trained fine-grained classification model. That is to say, in the embodiments of the application, the image to be classified is classified by a pre-trained fine-grained classification model, where training groups the images of the image set to be trained according to their image labels, extracts feature vectors, and trains the fine-grained classification model with the feature vector groups; by setting a loss function and an objective function in the model and obtaining the model parameters at which their values are minimum, the trained fine-grained classification model is obtained.
Drawings
Fig. 1 is a schematic flowchart of an optional image classification method according to an embodiment of the present application;
fig. 2 is a schematic flowchart of an example of an optional image classification method according to an embodiment of the present application;
fig. 3 is a first schematic structural diagram of a terminal according to an embodiment of the present application;
fig. 4 is a second schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Example one
An embodiment of the present application provides a method for classifying an image, and the method is applied to a terminal. Fig. 1 is a schematic flowchart of an optional method for classifying an image according to the embodiment of the present application; as shown in fig. 1, the method for classifying an image may include:
s101: acquiring an image to be classified;
at present, fine-grained image classification models can be roughly divided into the following branches: methods based on fine-tuning an existing classification network, methods based on fine-grained feature learning, methods based on combining detection and classification of target blocks, and methods based on a visual attention mechanism. A method based on fine-tuning an existing classification network usually takes an existing classification network (such as MobileNet, Xception, and the like), pre-trains it on ImageNet to obtain a trained classification model, and then continues fine-tuning on a fine-grained data set, so that the model becomes better suited to distinguishing sub-classes. A method based on fine-grained feature learning combines information acquired by two networks: one network acquires the position information of the target, and the other extracts an abstract feature expression of the target. A fine-grained classification method based on combining detection and classification of target blocks borrows the idea of object detection: it first frames the target area of the image through an object detection module and then performs fine-grained classification on that target area, where the classification algorithm can be a traditional Support Vector Machine (SVM) classifier or a general classification network. Compared with a general classification algorithm, a fine-grained classification algorithm based on an attention mechanism adds an attention mechanism so that the model focuses more on the information expressed at the target position.
The approach here is mainly an improvement on general classification algorithms. The most common loss function of current classification algorithms is softmax, but softmax provides low inter-class discrimination for fine-grained classification and has the following defects: firstly, the distance between the feature centers of different classes is short, which easily causes misclassification between classes; secondly, the intra-class features are not clustered tightly enough, so the feature distributions of several classes overlap, which also causes misclassification between classes; thirdly, algorithms that add a detection module introduce complex operations, increase the computation cost, and cause more time delay. All of this results in poor image classification accuracy.
In order to improve the accuracy of image classification, the terminal first acquires an image to be classified, where the image to be classified may include an object to be classified, for example a dog, a car, or a tree; such objects are subdivided according to their fine-grained categories, so that the dog, car, or tree in the image to be classified can be classified.
S102: classifying the images to be classified by adopting a pre-trained fine-grained classification model to obtain classified images;
in order to realize the classification of the images to be classified, the images to be classified are classified by adopting a pre-trained fine-grained classification model, wherein the trained fine-grained classification model is obtained by adopting the following method:
grouping images of the image set to be trained according to the acquired image labels of the image set to be trained to obtain a grouped image set to be trained;
extracting feature vectors from the images in the grouped image set to be trained to obtain a feature vector group;
and training the fine-grained classification model by adopting the feature vector group to determine the model parameters at which the values of the loss function and the objective function in the fine-grained classification model are minimum, so as to obtain the trained fine-grained classification model.
Specifically, the terminal first acquires an image set to be trained and the image labels of the images in the image set to be trained, wherein the image labels are used for characterizing the categories of the images; for example, for images classified by vehicle category, the image labels may classify vehicles by brand, and the image labels in the image set to be trained may include BYD, Audi, BMW, and so on.
After the image labels of the image set to be trained are obtained, the image set to be trained is grouped according to those labels to obtain the grouped image set to be trained. For each group of images to be trained, the feature vectors of the images in the group are extracted, and each group of feature vectors forms a feature vector group.
After the feature vector groups are obtained, they are used to train the fine-grained classification model. The fine-grained classification model includes not only a loss function but also an objective function; adding this optimization objective during training pulls the intra-class features together and pushes the inter-class features apart, so that the trained fine-grained classification model classifies the categories more accurately.
Specifically, when the fine-grained classification model is trained, the model parameters at which the values of the loss function and the objective function are smallest are determined, mainly in an iterative manner; once these model parameters are determined, the trained fine-grained classification model is obtained.
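Purely as an illustrative sketch of how a terminal might apply the trained model to an image to be classified (S101 and S102), the following Python fragment assumes a PyTorch checkpoint; the file names, preprocessing, and output handling are hypothetical and are not specified by this application.

```python
# Illustrative only: load an assumed pre-trained fine-grained classifier and
# classify a single image. File names and preprocessing are placeholders.
import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

model = torch.load("fine_grained_classifier.pt", map_location="cpu")  # hypothetical checkpoint
model.eval()

image = preprocess(Image.open("image_to_classify.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    logits = model(image)                          # scores over the fine-grained sub-classes
    predicted_index = int(logits.argmax(dim=1))    # index of the predicted category
print("predicted fine-grained class index:", predicted_index)
```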
In order to implement grouping of the image set to be trained and determine the feature vector groups, in an alternative embodiment, grouping the images of the image set to be trained according to the acquired image labels of the image set to be trained to obtain a grouped image set to be trained includes:
sequentially determining images in an image set to be trained as first images;
aiming at the first image, selecting a second image and a third image from the image set to be trained except the first image; the image label of the second image is the same as that of the first image, and the image label of the third image is different from that of the first image;
forming a group by using the first image, the second image and the third image to obtain a grouped image set to be trained;
correspondingly, extracting feature vectors from the images in the grouped image set to be trained to obtain a feature vector group includes:
extracting feature vectors of the first image, the second image and the third image respectively by adopting a fine-grained classification model to obtain the feature vector of the first image, the feature vector of the second image and the feature vector of the third image;
and forming a feature vector group by using the feature vector of the first image, the feature vector of the second image and the feature vector of the third image.
A first image is determined from the image set to be trained; in fact, each image in the image set to be trained is determined as the first image in turn. A second image and a third image are then selected from the image set to be trained, excluding the first image, according to the following rule:
the image label of the second image is the same as that of the first image, and the image label of the third image is different from that of the first image. In other words, the image set to be trained is traversed, each image is taken as the first image in turn, and a second image and a third image are selected according to the rule above; the first, second, and third images form one group, and in this way a plurality of groups, namely the grouped image set to be trained, is obtained.
After the grouped image set to be trained is obtained, each group contains three images, namely a first image, a second image, and a third image. Feature vectors are extracted from the three images in each group to obtain the feature vector of the first image, the feature vector of the second image, and the feature vector of the third image, and these three feature vectors form one feature vector group, as illustrated in the sketch below.
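The following sketch is given only as a hedged illustration of one way the grouping rule and the per-group feature extraction described above could be realized; the list-based interface, the random selection of the second and third images, and the backbone call are assumptions rather than the application's own implementation.

```python
# Illustrative sketch: build (first, second, third) groups from labeled images and
# extract a feature-vector group for each. Data structures are assumptions.
import random

def build_groups(images, labels):
    """Each image becomes the first image in turn; the second image shares its
    label and the third image has a different label."""
    groups = []
    for i, anchor_label in enumerate(labels):
        same = [j for j, lbl in enumerate(labels) if j != i and lbl == anchor_label]
        diff = [j for j, lbl in enumerate(labels) if lbl != anchor_label]
        if same and diff:
            groups.append((images[i], images[random.choice(same)], images[random.choice(diff)]))
    return groups

def extract_feature_group(backbone, group):
    """Apply the fine-grained classification backbone to each of the three images
    to obtain one feature vector group."""
    return tuple(backbone(img) for img in group)
```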
In order to make the classification accuracy of the trained fine-grained classification model higher, the feature vector groups need to be screened. In an optional embodiment, after extracting feature vectors from the images in the grouped image set to be trained to obtain the feature vector groups, and before training the fine-grained classification model with the feature vector groups to determine the model parameters at which the values of the loss function and the objective function are minimum and obtaining the trained fine-grained classification model, the method further includes:
selecting groups which do not meet preset conditions from the feature vector groups;
and deleting the groups which do not meet the preset condition from the feature vector group to update the feature vector group.
Specifically, groups that do not satisfy the preset condition are selected from the feature vector groups and deleted, so as to update the feature vector groups. The groups that do not satisfy the preset condition may take various forms; in general, groups in which the intra-class feature distance between the first image and the second image is large need to be deleted, and/or groups in which the inter-class feature distance between the first image and the third image is small need to be deleted, which is not specifically limited in the embodiments of the present application.
In order to select the group that does not satisfy the preset condition from the feature vector group, in an optional embodiment, the selecting the group that does not satisfy the preset condition from the feature vector group includes:
calculating a first distance value between the feature vector of the first image and the feature vector of the second image;
when the first distance value is larger than or equal to a first preset threshold value, determining a group containing the first image and the second image as a group which does not meet a preset condition, and selecting the group which does not meet the preset condition;
and/or,
calculating a second distance value between the feature vector of the first image and the feature vector of the third image;
and when the second distance value is smaller than or equal to a second preset threshold value, determining the group comprising the first image and the third image as the group which does not meet the preset condition, and selecting the group which does not meet the preset condition.
Specifically, the distance between the feature vector of the first image and the feature vector of the second image may first be calculated and recorded as a first distance value. A first preset threshold is preset in the terminal, and the first distance value is compared with it. When the first distance value is greater than or equal to the first preset threshold, the intra-class feature difference between the first image and the second image is large; since groups with a small intra-class feature difference between the first image and the second image are required, the feature vector group of the first and second images corresponding to this first distance value is determined as a group that does not satisfy the preset condition and is deleted from the feature vector groups.
Likewise, the distance between the feature vector of the first image and the feature vector of the third image may first be calculated and recorded as a second distance value. A second preset threshold is preset in the terminal, and the second distance value is compared with it. When the second distance value is smaller than or equal to the second preset threshold, the inter-class feature difference between the first image and the third image is small; since groups with a large inter-class feature difference between the first image and the third image are required, the feature vector group of the first and third images corresponding to this second distance value is determined as a group that does not satisfy the preset condition and is deleted from the feature vector groups.
Alternatively, both distances may be used together: the distance between the feature vectors of the first and second images is recorded as the first distance value, the distance between the feature vectors of the first and third images is recorded as the second distance value, and a first preset threshold and a second preset threshold are set in the terminal. The first distance value is compared with the first preset threshold and the second distance value with the second preset threshold; when the first distance value is greater than or equal to the first preset threshold and the second distance value is smaller than or equal to the second preset threshold, the intra-class feature difference between the first and second images is large and the inter-class feature difference between the first and third images is small, so the group corresponding to these first and second distance values is determined as a group that does not satisfy the preset condition and is deleted from the feature vector groups. A minimal sketch of this screening appears below.
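Only as an illustration, and assuming Euclidean distances over PyTorch feature vectors, the threshold values t1 and t2 below stand in for the first and second preset thresholds; they are placeholders, not values taught by the application.

```python
# Illustrative screening rule: drop a group when the anchor/positive distance is
# too large and/or the anchor/negative distance is too small. Thresholds are placeholders.
import torch.nn.functional as F

def violates_preset_condition(v_first, v_second, v_third, t1=1.0, t2=0.2):
    d1 = F.pairwise_distance(v_first, v_second)   # first distance value (same label)
    d2 = F.pairwise_distance(v_first, v_third)    # second distance value (different label)
    return bool((d1 >= t1).any()) or bool((d2 <= t2).any())
```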
Further, in order to delete the groups that do not satisfy the predetermined condition, in an alternative embodiment, the selecting the groups that do not satisfy the predetermined condition from the feature vector group includes:
calculating a first distance value between the feature vector of the first image and the feature vector of the second image, and a second distance value between the feature vector of the first image and the feature vector of the third image;
and when the difference value between the second distance value and the first distance value is larger than a third preset threshold value, determining the group comprising the first image, the second image and the third image as the group which does not meet the preset condition, and selecting the group which does not meet the preset condition.
First, the distance between the feature vector of the first image and the feature vector of the second image is calculated as the first distance value, and the distance between the feature vector of the first image and the feature vector of the third image is calculated as the second distance value; the difference between the second distance value and the first distance value is then computed. A third preset threshold is preset in the terminal and compared with this difference. When the difference is greater than the third preset threshold, the inter-class feature distance between the first image and the third image already exceeds the intra-class feature distance between the first image and the second image by more than the margin; such a group already keeps the inter-class feature distance at a large value and contributes little to further training.
Therefore, the group is determined as a group that does not satisfy the preset condition, and it is selected and deleted to update the feature vector groups, so that training concentrates on the remaining groups, the intra-class features of the updated feature vector groups are pulled together and the inter-class features are pushed apart, and the accuracy of the trained fine-grained classification model is improved. A sketch of this margin-based screening follows.
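As a hedged illustration of the rule just described, the following fragment flags a group whose inter-class distance already exceeds its intra-class distance by more than the third preset threshold; the default value 0.5 mirrors the margin example given later for formula (2) and is only a trial setting.

```python
# Illustrative margin-based screening: a group whose negative distance already exceeds
# the positive distance by more than the threshold is dropped from the feature vector groups.
import torch.nn.functional as F

def exceeds_margin(v_first, v_second, v_third, third_threshold=0.5):
    d1 = F.pairwise_distance(v_first, v_second)   # intra-class (first vs second image)
    d2 = F.pairwise_distance(v_first, v_third)    # inter-class (first vs third image)
    return bool(((d2 - d1) > third_threshold).any())
```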
In order to train a more optimized fine-grained classification model, in an optional embodiment, the method for training a fine-grained classification model by using a feature vector group to determine a model parameter when values of a loss function and a target function in the fine-grained classification model are minimum to obtain a trained fine-grained classification model includes:
training the fine-grained classification model by adopting the feature vector group to determine a model parameter when the loss function value in the fine-grained classification model is minimum and the value of the target function is minimum, so as to obtain the trained fine-grained classification model;
or,
and training the fine-grained classification model by adopting the feature vector group to determine a model parameter when the sum of the value of the loss function and the value of the target function in the fine-grained classification model is minimum, so as to obtain the trained fine-grained classification model.
Here, the feature vector group may be used to train the fine-grained classification model, and a model parameter when the loss function value in the model is minimum and the newly added objective function value is minimum is determined, so that training is completed after the model parameter is obtained, and a trained fine-grained classification model is obtained.
The fine-grained classification model can also be trained by adopting the feature vector groups with a weight assigned to the loss function and a weight assigned to the objective function; during training, the weighted sum of the value of the loss function and the value of the objective function is computed, and the model parameters that minimize this weighted sum are obtained, thereby yielding the trained fine-grained classification model. Generally, the same weight can be chosen for the loss function and the objective function, in which case it suffices to determine the model parameters at which the plain sum of the value of the loss function and the value of the objective function is minimum, and the trained fine-grained classification model is obtained. A sketch of one such training step follows.
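The training step below is a minimal sketch of this weighted combination, assuming a PyTorch model that returns both feature vectors and classification logits; the equal default weights reproduce the plain sum, and all names and the margin value are illustrative assumptions.

```python
# Illustrative training step: minimise a weighted sum of the softmax loss and the
# triplet objective. The model interface (features, logits) is an assumption.
import torch
import torch.nn.functional as F

def training_step(model, optimizer, images, labels, anchor_idx, pos_idx, neg_idx,
                  margin=0.5, w_loss=1.0, w_obj=1.0):
    optimizer.zero_grad()
    features, logits = model(images)                       # assumed dual output
    loss = F.cross_entropy(logits, labels)                 # classification loss
    objective = F.triplet_margin_loss(features[anchor_idx],
                                      features[pos_idx],
                                      features[neg_idx],
                                      margin=margin)       # added objective function
    total = w_loss * loss + w_obj * objective              # equal weights give the plain sum
    total.backward()
    optimizer.step()
    return float(total)
```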
In order to optimize the fine-grained classification model, an objective function is added to the model. In an alternative embodiment, the objective function is a function related to the feature vector of the first image, the feature vector of the second image, the feature vector of the third image and a third preset threshold.
With this objective function, when the trained fine-grained classification model is used for classification, the intra-class features are clustered together and the inter-class features are kept apart, so that the category of the object to be classified in the image to be classified can be determined more accurately during classification.
The following describes a classification method of images described in one or more embodiments above by way of example.
In practical applications, the loss function in the fine-grained classification model is usually based on softmax, and the calculation formula of softmax is as follows:

$$L_{s} = -\frac{1}{m}\sum_{i=1}^{m}\log\frac{e^{W_{y_i}^{T}x_i + b_{y_i}}}{\sum_{j=1}^{N} e^{W_{j}^{T}x_i + b_j}} \qquad (1)$$

where x_i is the abstract feature vector extracted for the i-th sample by the classification network, y_i is the label of the i-th sample, m is the number of samples, W is the classification-layer weight for the features extracted by the full convolutional network, W_j is its j-th column (i.e. the weight of class j), b_j is the bias term for that column, T is the matrix transpose operation, and N is the number of classes.
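For illustration only, formula (1) can be written directly in NumPy as below; the array shapes are assumptions made to match the notation above.

```python
# Illustrative NumPy rendering of formula (1): mean softmax cross-entropy.
import numpy as np

def softmax_loss(X, y, W, b):
    """X: (m, d) feature vectors x_i, y: (m,) integer labels y_i,
    W: (d, N) classification weights with columns W_j, b: (N,) bias terms b_j."""
    logits = X @ W + b                                   # W_j^T x_i + b_j for every class j
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(y)), y].mean()
```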
Formula (1) does not include any distance-metric optimization for features belonging to the same category. Therefore, on the basis of softmax, the features of the same category are constrained according to the constructed triples so that they are closer in distance, and the features of different categories are also constrained so that they are farther apart. The most complex module is the construction of the triples. Fig. 2 is a schematic flowchart of an example of an optional image classification method provided in the embodiment of the present application; as shown in fig. 2, the process of constructing a triple is as follows:
the method comprises the steps that an Anchor is a current input image Va, Positive is a Positive sample Vp consistent with a label of the Anchor input image, Negative is a Negative sample Vn different from the label of the current input image, a process of constructing a triple can be offline or online, the triple is updated conveniently in an online mode, a Convolutional Neural Network (CNN) is adopted to process according to input data in each branch (N) to obtain a feature vector of each image, random combination is a triple, and N groups of triples { Va, Vp, Vn } can be combinedi. However, not all triples are reasonable, and too many useless triples may cause the algorithm to fail to converge or converge very slowly, so that the triples still need to be filtered, and in the N × N group of triples, only the image labels of Va and Vp are consistent, and when the image labels of Va and Vn are opposite, the valid triples described in fig. 2 are satisfied.
Because the purpose of the algorithm is to make the intra-class features more clustered and the inter-class features more separated, the goal of the optimization is to make the distance between Va and Vp smaller than the distance between Va and Vn, and the optimization objective function is described as follows:

$$L_{t} = \sum_{i=1}^{N}\max\left(\left\|f(V_a^{i}) - f(V_p^{i})\right\|_2^{2} - \left\|f(V_a^{i}) - f(V_n^{i})\right\|_2^{2} + a,\; 0\right) \qquad (2)$$

where f(·) denotes the feature vector extracted for an image. In formula (2), a is the margin, a constant used to enlarge the inter-class interval; based on experimental experience, an initial value of 0.5 can be tried. Combining this objective function with the loss can improve the precision of fine-grained classification.
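Again only as an illustrative sketch, formula (2) can be evaluated over a batch of valid triples as follows; the squared Euclidean distance and the default margin of 0.5 are assumptions consistent with the description above.

```python
# Illustrative NumPy rendering of formula (2): triplet objective with margin a.
import numpy as np

def triplet_objective(f_anchor, f_positive, f_negative, a=0.5):
    """f_anchor, f_positive, f_negative: (num_triples, d) feature vectors of the
    first, second and third images of each valid triple."""
    d_pos = np.sum((f_anchor - f_positive) ** 2, axis=1)   # intra-class distance
    d_neg = np.sum((f_anchor - f_negative) ** 2, axis=1)   # inter-class distance
    return np.maximum(d_pos - d_neg + a, 0.0).sum()
```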
By the aid of the image classification method, inter-class distinctiveness and intra-class cohesion of the classification algorithm can be improved, fine-grained classification effect can be further improved, and the image classification method can be applied to fine-grained classification recognition.
An embodiment of the application provides an image classification method applied to a terminal, including: obtaining an image to be classified, and classifying the image to be classified by adopting a pre-trained fine-grained classification model to obtain a classified image. The trained fine-grained classification model is obtained as follows: the images of an image set to be trained are grouped according to the acquired image labels of the image set to be trained to obtain a grouped image set to be trained, where the image labels are used for characterizing the categories of the images; feature vectors are extracted from the images in the grouped image set to be trained to obtain a feature vector group; and the fine-grained classification model is trained by adopting the feature vector group to determine the model parameters at which the values of the loss function and the objective function in the fine-grained classification model are minimum, so as to obtain the trained fine-grained classification model. That is to say, in this embodiment of the application, the image to be classified is classified by a pre-trained fine-grained classification model, where training groups the images of the image set to be trained according to their image labels, extracts feature vectors, and trains the fine-grained classification model with the feature vector groups; by setting a loss function and an objective function in the model and obtaining the model parameters at which their values are minimum, the trained fine-grained classification model is obtained.
Example two
Fig. 3 is a first schematic structural diagram of a terminal provided in an embodiment of the present application, and as shown in fig. 3, an embodiment of the present application provides a terminal, including:
an obtaining module 31, configured to obtain an image to be classified;
the classification module 32 is configured to classify the image to be classified by using a pre-trained fine-grained classification model to obtain a classified image;
the trained fine-grained classification model is obtained by adopting the following method:
grouping images of the image set to be trained according to the acquired image labels of the image set to be trained to obtain a grouped image set to be trained; the image label is used for representing the category of the image;
extracting feature vectors from the images in the grouped image set to be trained to obtain a feature vector group;
and training the fine-grained classification model by adopting the feature vector group to determine the model parameters at which the values of the loss function and the objective function in the fine-grained classification model are minimum, so as to obtain the trained fine-grained classification model.
Optionally, the method for grouping the images of the image set to be trained by the terminal according to the acquired image labels of the image set to be trained to obtain a grouped image set to be trained includes:
sequentially determining images in an image set to be trained as first images;
aiming at the first image, selecting a second image and a third image from the image set to be trained except the first image; the image label of the second image is the same as that of the first image, and the image label of the third image is different from that of the first image;
forming a group by using the first image, the second image and the third image to obtain a grouped image set to be trained;
correspondingly, the terminal extracting feature vectors from the images in the grouped image set to be trained to obtain a feature vector group includes:
extracting feature vectors of the first image, the second image and the third image respectively by adopting a fine-grained classification model to obtain the feature vector of the first image, the feature vector of the second image and the feature vector of the third image;
and forming a feature vector group by using the feature vector of the first image, the feature vector of the second image and the feature vector of the third image.
Optionally, the terminal is further configured to:
after feature vectors are extracted from the images in the grouped image set to be trained to obtain a feature vector group, and before the fine-grained classification model is trained with the feature vector group to determine the model parameters at which the values of the loss function and the objective function in the fine-grained classification model are minimum and the trained fine-grained classification model is obtained, selecting a group that does not meet the preset condition from the feature vector group;
and deleting the groups which do not meet the preset condition from the feature vector group to update the feature vector group.
Optionally, the selecting, by the terminal, a group that does not satisfy the preset condition from the feature vector group includes:
calculating a first distance value between the feature vector of the first image and the feature vector of the second image;
when the first distance value is larger than or equal to a first preset threshold value, determining a group containing the first image and the second image as a group which does not meet a preset condition, and selecting the group which does not meet the preset condition;
and/or,
calculating a second distance value between the feature vector of the first image and the feature vector of the third image;
and when the second distance value is smaller than or equal to a second preset threshold value, determining the group comprising the first image and the third image as the group which does not meet the preset condition, and selecting the group which does not meet the preset condition.
Optionally, the selecting, by the terminal, a group that does not satisfy the preset condition from the feature vector group includes:
calculating a first distance value between the feature vector of the first image and the feature vector of the second image, and a second distance value between the feature vector of the first image and the feature vector of the third image;
and when the difference value between the second distance value and the first distance value is larger than a third preset threshold value, determining the group comprising the first image, the second image and the third image as the group which does not meet the preset condition, and selecting the group which does not meet the preset condition.
Optionally, the training of the fine-grained classification model by the terminal through the feature vector group to determine a model parameter when the values of the loss function and the target function in the fine-grained classification model are minimum, and obtaining the trained fine-grained classification model includes:
training the fine-grained classification model by adopting the feature vector group to determine a model parameter when the loss function value in the fine-grained classification model is minimum and the value of the target function is minimum, so as to obtain the trained fine-grained classification model;
or,
and training the fine-grained classification model by adopting the feature vector group to determine a model parameter when the sum of the value of the loss function and the value of the target function in the fine-grained classification model is minimum, so as to obtain the trained fine-grained classification model.
Optionally, the objective function is a function related to the feature vector of the first image, the feature vector of the second image, the feature vector of the third image, and a third preset threshold.
In practical applications, the obtaining module 31 and the classifying module 32 may be implemented by a processor located on a terminal, specifically, implemented by a CPU, a Microprocessor Unit (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), or the like.
Fig. 4 is a schematic structural diagram of a terminal according to an embodiment of the present application, and as shown in fig. 4, an embodiment of the present application provides a terminal 400, including:
a processor 41 and a storage medium 42 storing instructions executable by the processor 41, wherein the storage medium 42 relies on the processor 41 to perform operations through a communication bus 43, and when the instructions are executed by the processor 41, the method for classifying images according to the first embodiment is performed.
It should be noted that, in practical applications, the various components in the terminal are coupled together by a communication bus 43. It will be appreciated that the communication bus 43 is used to enable communications among the components. The communication bus 43 includes a power bus, a control bus, and a status signal bus, in addition to a data bus. But for clarity of illustration the various buses are labeled in figure 4 as communication bus 43.
The embodiment of the application provides a computer storage medium, which stores executable instructions, and when the executable instructions are executed by one or more processors, the processors execute the image classification method of the first embodiment.
The computer-readable storage medium may be a magnetic random access Memory (FRAM), a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Flash Memory (Flash Memory), a magnetic surface Memory, an optical Disc, or a Compact Disc Read-Only Memory (CD-ROM), among others.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present application, and is not intended to limit the scope of the present application.
Claims (10)
1. A method for classifying images is applied to a terminal and comprises the following steps:
acquiring an image to be classified;
classifying the images to be classified by adopting a pre-trained fine-grained classification model to obtain classified images;
the trained fine-grained classification model is obtained by adopting the following method:
grouping images of the image set to be trained according to the acquired image labels of the image set to be trained to obtain a grouped image set to be trained; wherein the image tag is used for characterizing the category of the image;
extracting feature vectors from the images in the grouped image set to be trained to obtain a feature vector group;
and training the fine-grained classification model by adopting the feature vector group to determine a model parameter when the values of the loss function and the target function in the fine-grained classification model are minimum, so as to obtain the trained fine-grained classification model.
2. The method according to claim 1, wherein the grouping images of the image set to be trained according to the obtained image labels of the image set to be trained to obtain a grouped image set to be trained, comprises:
sequentially determining the images in the image set to be trained as first images;
selecting a second image and a third image from the image set to be trained except the first image aiming at the first image; wherein the image label of the second image is the same as the image label of the first image, and the image label of the third image is different from the image label of the first image;
forming a group by using the first image, the second image and the third image to obtain the grouped image set to be trained;
correspondingly, the extracting feature vectors from the images in the grouped image set to be trained to obtain a feature vector group includes:
extracting feature vectors of the first image, the second image and the third image by adopting the fine-grained classification model respectively to obtain the feature vector of the first image, the feature vector of the second image and the feature vector of the third image;
and forming the feature vector group by using the feature vector of the first image, the feature vector of the second image and the feature vector of the third image.
3. The method according to claim 2, wherein after extracting feature vectors from the grouped images in the image set to be trained to obtain a feature vector group, before training a fine-grained classification model by using the feature vector group to determine a model parameter when values of a loss function and an objective function in the fine-grained classification model are minimum to obtain the trained fine-grained classification model, the method further comprises:
selecting a group which does not meet a preset condition from the feature vector group;
and deleting the groups which do not meet the preset condition from the feature vector group so as to update the feature vector group.
4. The method of claim 3, wherein the selecting the group of the feature vector group that does not satisfy the predetermined condition comprises:
calculating a first distance value between the feature vector of the first image and the feature vector of the second image;
when the first distance value is larger than or equal to a first preset threshold value, determining a group containing the first image and the second image as the group which does not meet the preset condition, and selecting the group which does not meet the preset condition;
and/or,
calculating a second distance value between the feature vector of the first image and the feature vector of the third image;
and when the second distance value is smaller than or equal to a second preset threshold value, determining the group comprising the first image and the third image as the group which does not meet the preset condition, and selecting the group which does not meet the preset condition.
5. The method of claim 3, wherein the selecting the group of the feature vector group that does not satisfy the predetermined condition comprises:
calculating a first distance value between the feature vector of the first image and the feature vector of the second image, and a second distance value between the feature vector of the first image and the feature vector of the third image;
and when the difference value between the second distance value and the first distance value is larger than a third preset threshold value, determining the group containing the first image, the second image and the third image as a group which does not meet a preset condition, and selecting the group which does not meet the preset condition.
6. The method of claim 1, wherein the training of the fine-grained classification model by using the feature vector group to determine a model parameter when values of a loss function and an objective function in the fine-grained classification model are minimum to obtain the trained fine-grained classification model comprises:
training a fine-grained classification model by adopting the feature vector group to determine a model parameter when a loss function value is minimum and a target function value is minimum in the fine-grained classification model, so as to obtain the trained fine-grained classification model;
or,
and training the fine-grained classification model by adopting the feature vector group to determine a model parameter when the sum of the value of the loss function and the value of the target function in the fine-grained classification model is minimum, so as to obtain the trained fine-grained classification model.
7. The method according to claim 5 or 6, wherein the objective function is a function related to the feature vector of the first image, the feature vector of the second image, the feature vector of the third image and the third preset threshold.
8. A terminal, comprising:
the acquisition module is used for acquiring an image to be classified;
the classification module is used for classifying the images to be classified by adopting a pre-trained fine-grained classification model to obtain classified images;
the trained fine-grained classification model is obtained by adopting the following method:
grouping images of the image set to be trained according to the acquired image labels of the image set to be trained to obtain a grouped image set to be trained; wherein the image tag is used for characterizing the category of the image;
extracting feature vectors from the images in the grouped image set to be trained to obtain a feature vector group;
and training the fine-grained classification model by adopting the feature vector group to determine a model parameter when the values of the loss function and the target function in the fine-grained classification model are minimum, so as to obtain the trained fine-grained classification model.
9. A terminal, characterized in that the terminal comprises: a processor and a storage medium storing instructions executable by the processor, the storage medium relying on the processor to perform operations through a communication bus, wherein the instructions, when executed by the processor, perform the method of classifying an image according to any one of claims 1 to 7.
10. A computer storage medium having stored thereon executable instructions which, when executed by one or more processors, perform the method of classifying an image of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010078654.1A CN111325242A (en) | 2020-02-03 | 2020-02-03 | Image classification method, terminal and computer storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010078654.1A CN111325242A (en) | 2020-02-03 | 2020-02-03 | Image classification method, terminal and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111325242A true CN111325242A (en) | 2020-06-23 |
Family
ID=71168785
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010078654.1A Pending CN111325242A (en) | 2020-02-03 | 2020-02-03 | Image classification method, terminal and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111325242A (en) |
- 2020-02-03 CN CN202010078654.1A patent/CN111325242A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101436302A (en) * | 2008-12-10 | 2009-05-20 | 南京大学 | Method for sorting colors of colorful three-dimensional model based on neural network |
WO2019227614A1 (en) * | 2018-06-01 | 2019-12-05 | 平安科技(深圳)有限公司 | Method and device for obtaining triple of samples, computer device and storage medium |
CN109359684A (en) * | 2018-10-17 | 2019-02-19 | 苏州大学 | Fine granularity model recognizing method based on Weakly supervised positioning and subclass similarity measurement |
CN109784366A (en) * | 2018-12-07 | 2019-05-21 | 北京飞搜科技有限公司 | The fine grit classification method, apparatus and electronic equipment of target object |
CN110263659A (en) * | 2019-05-27 | 2019-09-20 | 南京航空航天大学 | A kind of finger vein identification method and system based on triple loss and lightweight network |
Non-Patent Citations (1)
Title |
---|
王耀玮: "Research on a fine-grained vehicle recognition system based on convolutional neural networks", pages 034 - 1452 *
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111914761A (en) * | 2020-08-04 | 2020-11-10 | 南京华图信息技术有限公司 | Thermal infrared face recognition method and system |
Legal Events
Date | Code | Title | Description
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200623 |