CN111507403A - Image classification method and device, computer equipment and storage medium - Google Patents

Image classification method and device, computer equipment and storage medium

Info

Publication number
CN111507403A
CN111507403A
Authority
CN
China
Prior art keywords
image
classification
classified
classifiers
features
Prior art date
Legal status
Pending
Application number
CN202010303814.8A
Other languages
Chinese (zh)
Inventor
李岩
康斌
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010303814.8A priority Critical patent/CN111507403A/en
Publication of CN111507403A publication Critical patent/CN111507403A/en
Pending legal-status Critical Current

Classifications

    • G06F18/24 — Pattern recognition; Analysing; Classification techniques
    • G06F18/214 — Pattern recognition; Design or setup of recognition systems; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N3/045 — Neural networks; Architecture; Combinations of networks

Abstract

The application relates to the technical field of artificial intelligence and provides an image classification method, an image classification apparatus, a computer device, and a storage medium. An image to be classified is obtained, and at least two image features of the image are correspondingly input into at least two image classifiers, which respectively correspond to at least two classification levels; the image features input to the image classifiers of adjacent classification levels are subject to a similarity constraint relationship that reduces the similarity between them. The hierarchical classification result of the image to be classified is then obtained from the classification results output by the image classifiers at their corresponding classification levels. Because the similarity constraint relationship reduces the similarity between the image features input to the classifiers of adjacent classification levels, image classifiers at different classification levels attend to different image features of the image to be classified, which improves the accuracy of hierarchical image classification.

Description

Image classification method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to an image classification method, apparatus, computer device, and storage medium.
Background
With the development of artificial intelligence, techniques for classifying images based on deep learning, such as deep neural networks, have emerged; for example, an image classifier can be constructed on a deep neural network to classify input images. In a conventional image classification task, the categories of the images have equal status, i.e., no hierarchy is imposed among them, and the image classifier handles only a flat classification task, such as distinguishing car images from other categories such as cat and dog images. With hierarchical image classification, a hierarchical classification task can be completed, for example, by first judging whether an image belongs to the animal class or the non-animal class, and then further distinguishing the cat class and the dog class within the animal class.
Hierarchical image classification methods provided by conventional technology usually input images directly into, for example, two different image classifiers, one trained to classify coarse categories and the other to classify the sub-categories under those categories. However, hierarchical classification performed in this way is less accurate.
Disclosure of Invention
In view of the above, it is necessary to provide an image classification method, apparatus, computer device and storage medium for solving the above technical problems.
A method of image classification, the method comprising:
acquiring an image to be classified;
correspondingly inputting at least two image features of the image to be classified into at least two image classifiers; the at least two image classifiers respectively correspond to at least two classification levels; the image features input to the image classifiers corresponding to adjacent classification levels are subject to a similarity constraint relationship that reduces the similarity between them;
and obtaining a hierarchical classification result of the image to be classified according to the classification results of the image to be classified output by the image classifiers at their corresponding classification levels.
An image classification apparatus, the apparatus comprising:
the image acquisition module is used for acquiring an image to be classified;
the feature input module is used for correspondingly inputting at least two image features of the image to be classified into at least two image classifiers; the at least two image classifiers respectively correspond to at least two classification levels; the image features input to the image classifiers corresponding to adjacent classification levels are subject to a similarity constraint relationship that reduces the similarity between them;
and the result acquisition module is used for obtaining a hierarchical classification result of the image to be classified according to the classification results of the image to be classified output by the image classifiers at their corresponding classification levels.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring an image to be classified; correspondingly inputting at least two image features of the image to be classified into at least two image classifiers; the at least two image classifiers respectively correspond to at least two classification levels; the image features input to the image classifiers corresponding to adjacent classification levels are subject to a similarity constraint relationship that reduces the similarity between them; and obtaining a hierarchical classification result of the image to be classified according to the classification results of the image to be classified output by the image classifiers at their corresponding classification levels.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring an image to be classified; correspondingly inputting at least two image features of the image to be classified into at least two image classifiers; the at least two image classifiers respectively correspond to at least two classification levels; the image features input to the image classifiers corresponding to adjacent classification levels are subject to a similarity constraint relationship that reduces the similarity between them; and obtaining a hierarchical classification result of the image to be classified according to the classification results of the image to be classified output by the image classifiers at their corresponding classification levels.
With the image classification method, apparatus, computer device, and storage medium, an image to be classified is acquired, and at least two image features of the image are correspondingly input into at least two image classifiers, which respectively correspond to at least two classification levels; the image features input to the image classifiers of adjacent classification levels are subject to a similarity constraint relationship that reduces the similarity between them. The hierarchical classification result of the image to be classified can then be obtained according to the classification results output by the image classifiers at their corresponding classification levels. By applying the similarity constraint relationship between the image features, the similarity between the image features input to the image classifiers of adjacent classification levels is reduced as far as possible, so that image classifiers at different classification levels attend to different image features of the same image to be classified and classify the image at their respective classification levels according to the corresponding image features. This improves the accuracy of hierarchical image classification and allows the classification tasks of the image at multiple classification levels to be completed simultaneously.
Drawings
FIG. 1 is a diagram of an application environment of an image classification method in one embodiment;
FIG. 2 is a diagram illustrating an image classification task in one embodiment;
FIG. 3 is a flow diagram illustrating a method for image classification in one embodiment;
FIG. 4 is a flowchart illustrating steps of constructing an image classifier in one embodiment;
FIG. 5 is a schematic diagram of image classification in one embodiment;
FIG. 6 is a schematic flow chart illustrating the steps for obtaining features of a sample image in one embodiment;
FIG. 7 is a schematic diagram of an interface for displaying image information in one embodiment;
FIG. 8 is a schematic diagram of image classification in an example application;
FIG. 9 is a block diagram showing the structure of an image classification apparatus according to an embodiment;
FIG. 10 is a diagram showing an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The image classification method provided by the present application can be applied to the application environment shown in fig. 1, where fig. 1 is an application environment diagram of the image classification method in one embodiment. Wherein the terminal 110 may communicate with the server 120 through a network. The terminal 110 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 120 may be implemented by an independent server or a server cluster formed by a plurality of servers.
The application provides an image classification method and relates to the technical field of artificial intelligence. Artificial Intelligence (AI) comprises the theories, methods, techniques, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
Artificial intelligence technology includes Computer Vision (CV) technology, in which cameras, computers, and other terminal devices replace human eyes to perform machine vision tasks such as identification, tracking, and measurement of a target, and further perform image processing so that the processed image is more suitable for human observation or for transmission to an instrument for detection. Computer vision techniques may include image recognition and image classification, such as recognizing whether an image is an image of a car, a cat, or a dog.
By combining machine learning with computer vision, the terminal device can intelligently classify images to be classified according to the learned image classification knowledge. Furthermore, the terminal device can be used to classify the images to be classified hierarchically.
Fig. 2 is a schematic diagram of an image classification task in an embodiment and illustrates the difference between general classification and hierarchical classification. In a general classification task, the categories of images have equal status and the image classifier does not distinguish levels among them; an image is directly recognized as a cat image, dog image, bicycle image, or automobile image. In reality, however, the relationships between categories differ: among the four categories of cats, dogs, automobiles, and bicycles, cats and dogs both belong to the animal class and are relatively close to each other, while the animal class and the vehicle class to which automobiles and bicycles belong are relatively far apart. A hierarchical classification task can therefore first judge whether an image belongs to the animal class or a non-animal class such as the vehicle class, and then further identify a cat, dog, bicycle, or automobile within those classes. The animal class and the vehicle class belong to the same classification level, which may be called the coarse classification level; cats and dogs under the animal class and bicycles and automobiles under the vehicle class belong to another classification level, which may be called the sub-class classification level. The image classification method provided by the application can obtain the classification result of the image to be classified at each classification level and thereby obtain its hierarchical classification result, for example, that the image belongs to the animal class and is a cat image.
The image classification method provided by the application can be applied to various content-auditing and content-understanding tasks that involve images; for video services, for example, a video frame-extraction strategy can be adopted to audit and understand the video content.
Specifically, the image classification method provided by the present application may be executed by the terminal 110 or the server 120 alone, or may be executed by the terminal 110 and the server 120 in cooperation.
First, taking execution by the terminal 110 alone as an example: the terminal 110 may obtain an image to be classified and correspondingly input at least two image features of the image into at least two image classifiers. The at least two image classifiers may be pre-configured on the terminal 110 and respectively correspond to at least two classification levels, and the image features input to the image classifiers of adjacent classification levels are subject to a similarity constraint relationship that reduces the similarity between them. Finally, the terminal 110 may obtain the hierarchical classification result of the image to be classified according to the classification results output by the image classifiers at their corresponding classification levels.
The image classification method provided by the application can also be executed by the terminal 110 and the server 120 in cooperation. Specifically, the terminal 110 can acquire an image to be classified and send it to the server 120, which correspondingly inputs at least two image features of the image into at least two image classifiers. The at least two image classifiers may be pre-configured on the server 120 and respectively correspond to at least two classification levels, and the image features input to the image classifiers of adjacent classification levels are subject to a similarity constraint relationship that reduces the similarity between them. The server 120 may then send the classification results output by each image classifier at its corresponding classification level to the terminal 110, which obtains the hierarchical classification result of the image to be classified from those results.
In an embodiment, as shown in fig. 3, fig. 3 is a flowchart illustrating an image classification method in an embodiment, and an image classification method is provided, which is described by taking the method as an example applied to the terminal 110 in fig. 1, and includes the following steps:
step S301, acquiring an image to be classified;
in this step, the terminal 110 may obtain an image to be classified. The image to be classified may be an image captured by the terminal 110 through an image capturing device such as a camera, or may be an image pre-stored in an electronic gallery of the terminal 110, and the image to be classified may include objects such as animals and plants. Specifically, the terminal 110 may capture an image of the cat in real time through a camera configured therein, and the image of the cat may be used as an image to be classified.
Step S302, correspondingly inputting at least two image characteristics of the image to be classified into at least two image classifiers;
In this step, the terminal 110 may correspondingly input at least two image features of the image to be classified into at least two image classifiers. In a hierarchical classification task, the terminal 110 may be configured with two image classifiers: for example, one classifies between animals and vehicles at its level, and the other classifies, at its level, between cats and dogs under the animal class and between bicycles and cars under the vehicle class.
Furthermore, the image features input to the image classifiers corresponding to adjacent classification levels are subject to a similarity constraint relationship, which is used to reduce the similarity between those image features. Take a three-level classification task as an example: the first classification level is plant or animal; the second classification level, taking animals as an example, is mammal or reptile; the third classification level, taking reptiles as an example, is lizard or snake. In this case, two similarity constraint relationships are added: the first is applied between the image features input to the image classifiers corresponding to the first and second classification levels, and the second between the image features input to the image classifiers corresponding to the second and third classification levels. Image features are generally expressed as vectors, so the similarity constraint relationship can be applied by reducing the similarity between the vectors, thereby reducing the similarity between the image features input to the image classifiers of adjacent classification levels.
Illustratively, the similarity constraint relationship may include a mutual-information constraint or an orthogonality constraint. Under the mutual-information constraint, the similarity between two image features can be reduced by obtaining the mutual information between the image features input to the image classifiers of adjacent classification levels and minimizing it. Similarly, under the orthogonality constraint, the similarity between two image features can be reduced by making the image features input to the image classifiers of adjacent classification levels orthogonal to each other.
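As an illustration of the orthogonality constraint described above, the following sketch computes a penalty that vanishes when the feature vectors of adjacent classification levels are orthogonal. The concrete loss form and the function name are assumptions for illustration; the patent does not prescribe a specific formula.

```python
import numpy as np

def orthogonality_penalty(feat_a, feat_b):
    """Squared cosine similarity between two feature vectors.

    The penalty is 0 when the vectors are orthogonal and 1 when they are
    parallel, so minimizing it pushes the image features of adjacent
    classification levels apart (illustrative loss form only).
    """
    dot = float(np.dot(feat_a, feat_b))
    denom = float(np.linalg.norm(feat_a) * np.linalg.norm(feat_b))
    return (dot / denom) ** 2

# Two toy image features for adjacent classification levels.
a = np.array([1.0, 0.0, 1.0, 0.0])
b = np.array([0.0, 1.0, 0.0, 1.0])
print(orthogonality_penalty(a, b))  # orthogonal pair -> 0.0
```

During training, such a penalty would be added to the classification losses so that the feature extractor learns to decouple the two features.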
In this way, the terminal 110 can decouple the two image features input to the image classifiers of adjacent classification levels, eliminating the similar parts between them so that the two image features correspond to different characteristics of the image. Based on the decoupled input features, each image classifier can focus on a different type of feature of the image and, using the image features best suited to its classification level, classify the image to be classified more accurately at that level, completing the classification tasks of multiple classification levels simultaneously.
Step S303, obtaining a hierarchical classification result of the image to be classified according to the classification results of the image to be classified output by the image classifiers at their corresponding classification levels.
In this step, the terminal 110 may obtain the classification result output by each image classifier, which includes the classification result of the image to be classified at each classification level. For example, with three image classifiers, the terminal 110 may obtain the three classification results, corresponding to the three classification levels, output by the three classifiers respectively. The terminal 110 may take the three classification results together as the hierarchical classification result of the image to be classified, or may select the classification result of one classification level from them as the hierarchical classification result it requires. In this way, the terminal 110 can simultaneously complete the classification tasks of the image to be classified at multiple classification levels.
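The step of combining the per-level classifier outputs into a hierarchical classification result can be sketched as follows. This is a minimal illustration; the class names and the `hierarchical_result` helper are assumptions, not part of the patent.

```python
def hierarchical_result(level_outputs):
    """Take the most probable class at each classification level.

    level_outputs: one dict per classification level, mapping class
    name -> probability (class names here are illustrative).
    """
    return [max(probs, key=probs.get) for probs in level_outputs]

# Outputs of two image classifiers on adjacent classification levels.
coarse = {"animal": 0.9, "vehicle": 0.1}
fine = {"cat": 0.7, "dog": 0.2, "bicycle": 0.05, "car": 0.05}
print(hierarchical_result([coarse, fine]))  # ['animal', 'cat']
```

The returned list corresponds to the classification result at each level; keeping the whole list, or selecting one entry, matches the two options described above.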
In the image classification method, the terminal 110 obtains an image to be classified and correspondingly inputs at least two image features of the image into at least two image classifiers, which respectively correspond to at least two classification levels; the image features input to the image classifiers of adjacent classification levels are subject to a similarity constraint relationship that reduces the similarity between them. The terminal 110 may then obtain a hierarchical classification result of the image to be classified according to the classification results output by the image classifiers at their corresponding classification levels. By applying the similarity constraint relationship between the image features, the similarity between the image features input to the image classifiers of adjacent classification levels is reduced as far as possible, so that image classifiers at different classification levels attend to different image features of the same image and classify it at their respective levels according to the corresponding features. This improves the accuracy of hierarchical image classification and allows the classification tasks of the image at multiple classification levels to be completed simultaneously.
In one embodiment, the inputting of the at least two image features of the image to be classified into the at least two image classifiers in step S302 may include:
at least two image features are obtained through a pre-constructed feature extractor and are correspondingly input to at least two image classifiers.
In this embodiment, the terminal 110 may obtain at least two image features from the image to be classified using a pre-constructed feature extractor and input them into the at least two image classifiers. The feature extractor and the at least two image classifiers are constructed based on the similarity constraint relationship, and the feature extractor may be implemented based on a neural network model. The terminal 110 may input the image to be classified into the neural-network-based feature extractor, divide the image features output by its last convolutional layer into the at least two image features, and correspondingly input them into the at least two image classifiers.
In the scheme of this embodiment, the terminal 110 may perform feature extraction on the image to be classified through a feature extractor that was trained in advance together with the at least two image classifiers under the similarity constraint relationship, thereby obtaining multiple image features that each image classifier can use for classification. There is no need to recalculate the similarity between image features each time an image is classified, which improves image classification efficiency.
In one embodiment, further, the feature extractor may further include a feature extraction network and an encoder; the step of obtaining at least two image features through the pre-constructed feature extractor may specifically include:
inputting the image to be classified into the feature extraction network to obtain the initial image features output by the network; inputting the initial image features into the encoder to obtain the encoded initial image features output by the encoder; and acquiring at least two image features based on the encoded initial image features.
In this embodiment, the feature extractor may further include a feature extraction network and an encoder. Wherein the feature extraction network and the encoder may be implemented based on a neural network model, such as a ResNet residual network model.
The feature extraction network preliminarily acquires the image features of the image to be classified; these features are often high-dimensional and contain redundant information. Therefore, the terminal 110 may first input the image to be classified into the feature extraction network to obtain the initial image features it outputs, and then input the initial image features into the encoder, which maps them to encoded features of lower dimensionality and removes redundant information from the initial image features. The terminal 110 then obtains at least two image features from the encoded initial image features output by the encoder. With this scheme, when classifying images the terminal 110 can directly use the trained feature extraction network to obtain the initial image features, reduce their dimensionality with the encoder, and divide the encoded initial image features into at least two image features, where the image features input to the image classifiers of adjacent classification levels are subject to the constraint relationship; this improves the efficiency and accuracy of image classification.
In one embodiment, as shown in fig. 4, fig. 4 is a flowchart illustrating steps of constructing an image classifier in an embodiment, before at least two image features are obtained by a pre-constructed feature extractor, the feature extractor and the image classifier may be constructed by the following steps:
step S401, obtaining a sample image, and obtaining classification labels of the sample image on at least two classification levels as real classification labels of at least two image classifiers;
in this step, the terminal 110 may obtain a sample image, and the number of the sample images is generally multiple. The terminal 110 further needs to obtain a classification label corresponding to the sample image, where the classification label includes a classification label of the sample image at each classification level. As described with reference to fig. 2, the terminal 110 may obtain a cat image as a sample image, and the terminal 110 further needs to obtain classification labels at two classification levels corresponding to the cat image, that is, "animal" and "cat". Further, the terminal 110 may use the classification labels of the sample image on at least two classification levels as real classification labels of at least two image classifiers, for training the image classifiers and the feature extractor.
Step S402, inputting the sample image into the feature extractor, and acquiring at least two sample image features with the same dimension according to the image features of the sample image output by the feature extractor.
Referring to fig. 5, a schematic diagram of image classification in an embodiment, the terminal 110 inputs a sample image into the feature extractor, which outputs the image features of the sample image; the terminal 110 then divides these into at least two sample image features of the same dimension. Specifically, the terminal 110 may divide the sample image feature into an image feature A and an image feature B of the same dimension: assuming the sample image feature has dimension 2d, the terminal 110 splits it into an image feature A and an image feature B of dimension d each. For example, if the image feature of the sample image output by the feature extractor is a 2048-dimensional vector, the split may assign the first half (the 0th to 1023rd dimensions) to image feature A and the second half (the 1024th to 2047th dimensions) to image feature B.
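The split of a 2048-dimensional feature vector into two 1024-dimensional halves described above can be sketched as follows; the function name and the use of NumPy are illustrative assumptions.

```python
import numpy as np

def split_features(features):
    """Split a 2d-dimensional feature vector into two d-dimensional halves."""
    d = features.shape[0] // 2
    return features[:d], features[d:]

# e.g. a 2048-dimensional feature vector from the extractor's last layer
features = np.arange(2048, dtype=np.float32)
feat_a, feat_b = split_features(features)
print(feat_a.shape, feat_b.shape)  # (1024,) (1024,)
```

Each half is then routed to its own image classifier, as in fig. 5.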
In one embodiment, the inputting the sample image to the feature extractor in step S402 may specifically include: preprocessing the sample image to obtain a sample image with the image size being a preset image size; the sample image of the preset image size is input to a feature extractor.
Specifically, the image size used for model training of the image classifier, the feature extractor and the like generally needs to be fixed. Therefore, in this embodiment, a sample image of any image size may first be scaled to an image size of 256 × 256, and a 224 × 224 region is then randomly cropped from it to serve as the sample image of the preset image size for training.
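The scale-then-random-crop preprocessing can be sketched as below; actual resizing to 256 × 256 would use an image library, so here the 256 × 256 input is simulated with a random array, and all names are illustrative:

```python
import numpy as np

# Assume the sample image has already been scaled to 256 x 256 (an image
# library such as PIL would do the resizing); then randomly crop 224 x 224.
rng = np.random.default_rng(0)
image_256 = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)

top = int(rng.integers(0, 256 - 224 + 1))    # random crop origin, rows
left = int(rng.integers(0, 256 - 224 + 1))   # random crop origin, columns
crop = image_256[top:top + 224, left:left + 224]   # sample image of preset size
```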
Step S403, respectively inputting at least two sample image characteristics to at least two image classifiers, and obtaining predicted classification labels of sample images output by the at least two image classifiers on corresponding classification levels;
In this step, referring to fig. 5, the terminal 110 may input the image feature A to the image classifier A and the image feature B to the image classifier B. The image classifier A and the image classifier B are used for classifying the image to be classified at different classification levels. During model construction, the image classifier A predicts a classification result from the input image feature A to obtain a predicted classification label A; similarly, the image classifier B obtains a predicted classification label B from the input image feature B. A predicted classification label may be a probability value of belonging to each category at the corresponding classification level. Specifically, as described with reference to fig. 2, the predicted classification label may be the probability value of the sample image belonging to "animal" or "vehicle" at the animal-or-vehicle classification level.
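A classifier head that outputs such per-category probability values is commonly a softmax over logits; the small sketch below, with hypothetical logits for the two-way "animal or vehicle" level, illustrates the idea (the softmax head is an assumption, not stated in the source):

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax: probabilities over the categories at one level
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

# Hypothetical logits from image classifier A over {"animal", "vehicle"}
logits_a = np.array([2.0, -1.0])
probs_a = softmax(logits_a)   # predicted classification label A (probabilities)
```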
S404, constructing a similarity constraint relation among sample image features input to the image classifiers corresponding to the adjacent classification levels;
in this step, the terminal 110 constructs a similarity constraint relationship between sample image features input to the image classifiers corresponding to the adjacent classification levels.
Step S405, training the feature extractor and the at least two image classifiers based on the real classification label, the prediction classification label and the similarity constraint relation, and constructing the feature extractor and the at least two image classifiers.
According to the technical scheme of this embodiment, the terminal 110 may jointly train the feature extractor and the at least two image classifiers based on the real classification labels, the predicted classification labels and the similarity constraint relation of the sample image, thereby constructing the feature extractor and the at least two image classifiers. The trained feature extractor can then obtain, from an image to be classified, at least two image features having the similarity constraint relation, and input the at least two image features into the at least two image classifiers for classification, realizing rapid and accurate classification of the image to be classified.
In one embodiment, as shown in fig. 6, fig. 6 is a flow chart illustrating the steps of obtaining the features of the sample image in one embodiment, and the feature extractor may include a feature extraction network and an encoder; the step of inputting the sample image into the feature extractor and obtaining at least two sample image features with the same dimension according to the image features of the sample image output by the feature extractor in step S402 may include:
step S601, inputting a sample image into a feature extraction network to obtain initial sample image features output by the feature extraction network;
step S602, inputting the initial sample image characteristics to an encoder to obtain the encoded initial sample image characteristics output by the encoder;
step S603, splitting the initial sample image features into at least two sample image features with the same dimension.
In this embodiment, the terminal 110 may obtain the at least two sample image features based on the feature extraction network and the encoder included in the feature extractor. Referring to fig. 5, fig. 5 is a schematic diagram of image classification in an embodiment. The terminal 110 may input a sample image to the feature extraction network, which preliminarily obtains image features from the sample image, and the terminal 110 obtains the initial sample image features output by the feature extraction network. As described in the above embodiment, the image features obtained by the feature extraction network often contain redundant information and have relatively high feature dimensions, so the terminal 110 further inputs the initial sample image features to the encoder. The encoder maps the image features to encoded features, reducing the feature dimension of the initial sample image features acquired by the feature extraction network and removing redundant information from them. Finally, the terminal 110 splits the encoded initial sample image features output by the encoder into at least two sample image features with the same dimension. With the scheme of this embodiment, the feature extraction network, the encoder and the at least two image classifiers can be trained together based on the similarity constraint relation, so that after training they can be used as a whole, as an image classification tool, to rapidly and accurately classify images to be classified.
In one embodiment, the training of the feature extractor and the at least two image classifiers based on the real classification label, the prediction classification label and the similarity constraint relationship in step S405 may include:
according to the real classification label and the predicted classification label, constructing a first loss function corresponding to each classification level to obtain at least two first loss functions; constructing a second loss function according to the similarity constraint relation; and training the feature extractor and the at least two image classifiers based on the at least two first loss functions and the second loss function, so that the at least two first loss functions are minimized while the second loss function, through gradient reversal, is maximized.
The present embodiment provides a specific way to train the feature extractor and the at least two image classifiers. Specifically, the terminal 110 may construct the first loss functions based on the real classification labels and the predicted classification labels; there are several first loss functions, each corresponding to a different classification level — for example, with three classification levels there are three first loss functions. In addition, the terminal 110 constructs the second loss function according to the similarity constraint relation, that is, the second loss function is constructed from the similarity constraint relation between the image features input to the image classifiers corresponding to adjacent classification levels. Accordingly, if there are two classification levels, one second loss function is constructed, and if there are three classification levels, two second loss functions are constructed. The terminal 110 then trains the feature extractor and the at least two image classifiers using the at least two first loss functions and the second loss function; minimizing the combined loss drives the first loss functions down, while the gradient reversal applied inside the second loss function (described below) effectively maximizes it, reducing the similarity between the image features.
Specifically, take an image classification task with two classification levels as an example, corresponding to large-category classification and sub-category classification, where the large-category label is denoted $y^{super}$ and the sub-category label $y^{sub}$. The first loss functions corresponding to the two classification levels are the cross-entropy losses:

$$L_{super} = -\sum_{i=1}^{C_{super}} y_i^{super} \log p_i^{super}$$

$$L_{sub} = -\sum_{i=1}^{C_{sub}} y_i^{sub} \log p_i^{sub}$$

wherein $L_{super}$ denotes the first loss function of the large category, $L_{sub}$ the first loss function of the sub-category, $C_{super}$ the total number of large (super) categories, $C_{sub}$ the total number of sub-categories, $y_i^{super}$ the true classification label of the large category, $y_i^{sub}$ the true classification label of the sub-category, $p_i^{super}$ the predicted classification label of the large category, i.e. the probability value that the image to be classified belongs to category $i$ among the large categories, and $p_i^{sub}$ the predicted classification label of the sub-category, i.e. the probability value that the image to be classified belongs to category $i$ among the sub-categories.
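The two cross-entropy first loss functions can be evaluated numerically as below, with hypothetical one-hot true labels and predicted probabilities for a single sample:

```python
import numpy as np

# Hypothetical labels/predictions for one sample at two classification levels.
y_super = np.array([1.0, 0.0])       # true large category: "animal" of {animal, vehicle}
p_super = np.array([0.9, 0.1])       # predicted probabilities at the large-category level
y_sub = np.array([0.0, 1.0, 0.0])    # true sub-category: "cat" of 3 sub-categories
p_sub = np.array([0.1, 0.8, 0.1])    # predicted probabilities at the sub-category level

L_super = -np.sum(y_super * np.log(p_super))   # cross-entropy, large category
L_sub = -np.sum(y_sub * np.log(p_sub))         # cross-entropy, sub-category
```

With a one-hot true label, each loss reduces to the negative log-probability assigned to the true category, so confident correct predictions give a loss near zero.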
In addition, denote the image features input to the image classifiers of the large category and the sub-category as $E_\alpha(x)$ and $E_\beta(x)$, respectively. The similarity constraint relation applied to the two image features may be a mutual information constraint relation or an orthogonal constraint relation. Taking the mutual information constraint as an example, the corresponding second loss function is:

$$L_{mul} = -\left( r\big(E_\alpha(x)\big)^{\top} r\big(E_\beta(x)\big) \right)^{2}$$

wherein $L_{mul}$ denotes the second loss function and $r$ denotes a gradient reversal layer (gradient reverse layer), whose role is to multiply the gradient by $-1$, i.e. to "reverse the gradient", when the network back-propagates gradients. $E_\alpha(x)$ and $E_\beta(x)$ are normalized using L2 normalization (i.e. L2 Normalization) to limit the range of values of the features. In this way, the second loss function $L_{mul}$ takes its minimum value $-1$ only when the two image features are identical, i.e. $E_\alpha(x) = E_\beta(x)$, and takes its maximum value $0$ when the two image features are perfectly orthogonal and thus maximally different. Since a gradient reversal layer is used, minimizing the mutual information loss after gradient reversal is equivalent to maximizing the second loss function, i.e. the similarity between the two image features is reduced as much as possible.
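The stated extremes of the second loss (−1 for identical features, 0 for orthogonal features) can be checked numerically; the squared inner product of L2-normalized features used below is an assumed concrete form chosen to match those extremes, not a definitive statement of the patented loss:

```python
import numpy as np

def l2_normalize(v):
    return v / np.linalg.norm(v)

def mutual_info_loss(e_a, e_b):
    # -(E_alpha . E_beta)^2 on L2-normalized features:
    # -1 when the features are identical, 0 when they are orthogonal
    a, b = l2_normalize(e_a), l2_normalize(e_b)
    return -(float(np.dot(a, b)) ** 2)

identical = mutual_info_loss(np.array([1.0, 2.0]), np.array([1.0, 2.0]))
orthogonal = mutual_info_loss(np.array([1.0, 0.0]), np.array([0.0, 5.0]))
```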
Finally, the two first loss functions, together with the one second loss function, cooperatively train the whole network using the combined loss $L = L_{super} + L_{sub} + L_{mul}$, which constitutes the training of the feature extractor and the at least two image classifiers; when the feature extractor includes the feature extraction network and the encoder, the feature extraction network and the encoder are trained together with the at least two image classifiers.
In one embodiment, the inputting of the at least two image features of the image to be classified into the at least two image classifiers in step S302 may include:
sending the image to be classified to a server so that the server correspondingly inputs at least two image characteristics of the image to be classified to at least two image classifiers to obtain classification results of the image to be classified output by the image classifiers on corresponding classification levels; and receiving the classification result obtained by the server.
Referring to fig. 1, in this embodiment the image classification process is mainly handled by the server 120. Specifically, the terminal 110 may obtain an image to be classified and then send it to the server 120, where the at least two image classifiers may be configured in advance. After receiving the image to be classified, the server 120 correspondingly inputs at least two image features of the image to be classified to the at least two image classifiers to obtain the classification results of the image to be classified at the corresponding classification levels output by the image classifiers, and then sends the classification results to the terminal 110, which receives them.
By adopting the technical solution of the embodiment, the terminal 110 may transfer the task of image classification processing to the server 120 for processing, so as to reduce the data processing pressure of the terminal 110.
In an embodiment, as shown in fig. 7, fig. 7 is an interface schematic diagram showing image information in an embodiment, and after the hierarchical classification result of the image to be classified is obtained according to the classification result of the image to be classified on the corresponding classification level output by the image classifier in step S303, the method may further include the following steps:
acquiring image classification information carrying the hierarchical classification result; and displaying the image classification information on the image to be classified.
In this embodiment, the terminal 110 can directly display the classification result of the image to be classified on the image itself. Referring to fig. 7, the terminal 110 may display an image 700 to be classified, and after obtaining the hierarchical classification result of the image 700, display the image classification information carrying the hierarchical classification result in an information display area 710. The hierarchical classification result of the image 700 may include a large-category classification result A1 and a sub-category classification result B2 of the image to be classified. Specifically, assuming the image 700 to be classified is a cat image, the image classification information displayed by the terminal 110 may include the large-category classification result "animal" and the sub-category classification result "cat". With this technical scheme, the hierarchical classification result can be displayed superimposed on the image to be classified, improving the display efficiency of the hierarchical classification result.
In order to clarify the technical solutions provided by the present application more clearly, the principle of image classification is described in detail with reference to fig. 8, and fig. 8 is a schematic diagram of the principle of image classification in an application example.
In general, the input image x may be of any size, while training of the model (including the feature extraction network, the encoder, and the large-category and sub-category classifiers) generally requires a fixed image size. Therefore, an image of any image size can first be scaled to an image size of 256 × 256, and a 224 × 224 region is then randomly cropped from it as the image to be processed. A feature extraction network is then used to extract the image features f(x). Next, an encoder maps the image features to encoding features E(x), which can be viewed as the image features prior to decoupling. The encoding features are split into two parts: the first partial features E_α(x) are used for training the large-category classifier, and the second partial features E_β(x) are used for training the sub-category classifier; meanwhile, a mutual information constraint is applied between the two decoupled features, so that the similarity between them is reduced. It should be noted that the two partial features extracted from the image to be processed and input to the large-category classifier and the sub-category classifier need not have any correlated characteristics; that is, only the image to be processed needs to be provided to the large-category classifier and the sub-category classifier, and no special processing of the image features needs to be performed in advance. Training of the models, including the feature extraction network, the encoder, the large-category classifier and the sub-category classifier, can be completed by inputting the image to be processed into the model, and based on the trained model, hierarchical classification of an image can be realized by inputting the image to be classified.
Specifically, assume the input image is x, its large-category label is y^super and its sub-category label is y^sub. The image features are extracted using a feature extraction network, which is not particularly limited; various neural network models may be used, and generally the output of the last convolutional layer of the neural network can be used as the extracted image feature f(x). Next, an encoder is used to map the extracted features f(x) to encoding features E(x) of dimension 2d; the encoder structure may be a single fully connected layer (fully connected layer). Then the 2d-dimensional encoding feature E(x) is split into two parts of the same dimension: E(x) → [E_α(x); E_β(x)], where the features E_α(x) and E_β(x) are both 1d-dimensional features, used for large-category classification and sub-category classification respectively. The corresponding loss functions are:
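Under toy dimensions, the encode-and-split step just described can be sketched as follows; the random weights and all sizes are illustrative stand-ins, not the patented model:

```python
import numpy as np

# Toy sketch: a single fully connected layer maps the extracted feature f(x)
# to a 2d-dimensional encoding E(x), which is split into E_alpha(x) and E_beta(x).
rng = np.random.default_rng(0)
d = 4                                  # toy half-dimension "1d"; real models might use 1024
f_x = rng.standard_normal(16)          # extracted image feature f(x) (toy size)
W = rng.standard_normal((2 * d, 16))   # encoder weights: one fully connected layer
b = np.zeros(2 * d)                    # encoder bias

E_x = W @ f_x + b                      # encoding feature E(x), dimension 2d
E_alpha, E_beta = E_x[:d], E_x[d:]     # E(x) -> [E_alpha(x); E_beta(x)]
```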
$$L_{super} = -\sum_{i=1}^{C_{super}} y_i^{super} \log p_i^{super}$$

$$L_{sub} = -\sum_{i=1}^{C_{sub}} y_i^{sub} \log p_i^{sub}$$

wherein $L_{super}$ denotes the loss function of the large category, $L_{sub}$ the loss function of the sub-categories, $C_{super}$ the total number of large (super) categories, $C_{sub}$ the total number of sub-categories, $y_i^{super}$ the true category label of the large category, $y_i^{sub}$ the true category label of the sub-category, $p_i^{super}$ the predicted classification label of the large category, i.e. the probability value that the image belongs to category $i$ among the large categories, and $p_i^{sub}$ the predicted classification label of the sub-category, i.e. the probability value that the image belongs to category $i$ among the sub-categories.
In addition, to ensure that the features $E_\alpha(x)$ and $E_\beta(x)$ learn image features that are as different as possible, a mutual information constraint relation is applied between the two image features:

$$L_{mul} = -\left( r\big(E_\alpha(x)\big)^{\top} r\big(E_\beta(x)\big) \right)^{2}$$

wherein $L_{mul}$ denotes the mutual information loss function and $r$ denotes a gradient reversal layer (gradient reverse layer), whose role is to multiply the gradient by $-1$, i.e. to "reverse the gradient", when the network back-propagates gradients. The decoupled features $E_\alpha(x)$ and $E_\beta(x)$ can be normalized using L2 normalization (i.e. L2 Normalization) to limit the range of values of the features.

In this case, the mutual information loss function takes its minimum value $-1$ only when the two image features are identical, i.e. $E_\alpha(x) = E_\beta(x)$, and the second loss function $L_{mul}$ takes its maximum value $0$ when the two image features are perfectly orthogonal and thus maximally different. Finally, the three loss functions are used to cooperatively train the whole network (including the feature extraction network, the encoder, and the large-category and sub-category classifiers): $L = L_{super} + L_{sub} + L_{mul}$.
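The effect of the gradient reversal layer can be illustrated with the analytic gradient of the assumed loss form L_mul = −(E_α·E_β)²: the forward value is unchanged, while the gradient actually propagated to the encoder is the negated one, so gradient descent pushes the two features toward orthogonality rather than similarity. A numpy sketch with hypothetical, already L2-normalized feature values:

```python
import numpy as np

def grad_l_mul_wrt_a(a, b):
    # analytic gradient of L_mul = -(a . b)^2 with respect to a (b held fixed)
    return -2.0 * np.dot(a, b) * b

a = np.array([0.6, 0.8])   # L2-normalized feature E_alpha(x) (hypothetical values)
b = np.array([0.8, 0.6])   # L2-normalized feature E_beta(x)

g = grad_l_mul_wrt_a(a, b)   # gradient the loss would send backward
g_reversed = -g              # gradient after the reversal layer reaches the encoder
```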
The technical scheme provided by the application example can decouple the image features of the image into two partial image features suitable for large-class classification and sub-class classification, and simultaneously reduce the similarity degree between the two partial image features by using mutual information constraint, so that the two partial image features pay attention to different characteristics in the image as far as possible, and a hierarchical classification task is better completed.
It should be understood that, although the steps in the flowcharts of fig. 3 to 6 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 3 to 6 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different moments; these sub-steps or stages are likewise not necessarily performed sequentially, and may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In an embodiment, as shown in fig. 9, fig. 9 is a block diagram of an image classification apparatus in an embodiment, and provides an image classification apparatus, which may adopt a software module or a hardware module, or a combination of the two modules, as a part of a computer device, where the apparatus 900 specifically includes:
an image obtaining module 901, configured to obtain an image to be classified;
a feature input module 902, configured to correspondingly input at least two image features of the image to be classified to at least two image classifiers; the at least two image classifiers correspond respectively to at least two classification levels; there is a similarity constraint relation between the image features input to the image classifiers corresponding to adjacent classification levels, the similarity constraint relation being used for reducing the similarity between the image features;
and the result obtaining module 903 is configured to obtain a hierarchical classification result of the image to be classified according to the classification result of the image to be classified output by the image classifier on the corresponding classification level.
In one embodiment, the feature input module 902 is further configured to obtain at least two image features through a pre-constructed feature extractor, and correspondingly input the at least two image features to at least two image classifiers; the feature extractor and the at least two image classifiers are constructed based on similarity constraint relations.
In one embodiment, the feature extractor includes a feature extraction network and an encoder; the feature input module 902 is further configured to input the image to be classified into the feature extraction network, so as to obtain an initial image feature output by the feature extraction network; inputting the initial image characteristics to an encoder to obtain the encoded initial image characteristics output by the encoder; and acquiring at least two image characteristics based on the coded initial image characteristics.
In one embodiment, the apparatus 900 may further include:
the classifier building module is used for obtaining a sample image and obtaining classification labels of the sample image on at least two classification levels as real classification labels of at least two image classifiers; inputting the sample image into a feature extractor, and acquiring at least two sample image features with the same dimension according to the image features of the sample image output by the feature extractor; respectively inputting at least two sample image characteristics to at least two image classifiers, and acquiring predicted classification labels of sample images output by the at least two image classifiers on corresponding classification levels; constructing similarity constraint relation among sample image features input to image classifiers corresponding to adjacent classification levels; training the feature extractor and the at least two image classifiers based on the real classification label, the prediction classification label and the similarity constraint relation, and constructing the feature extractor and the at least two image classifiers.
In one embodiment, the feature extractor includes a feature extraction network and an encoder; a classifier building module further configured to: inputting the sample image into a feature extraction network to obtain initial sample image features output by the feature extraction network; inputting the initial sample image characteristics to an encoder to obtain the encoded initial sample image characteristics output by the encoder; and splitting the initial sample image features into at least two sample image features with the same dimension.
In one embodiment, the classifier building module is further configured to: construct, according to the real classification label and the predicted classification label, a first loss function corresponding to each classification level to obtain at least two first loss functions; construct a second loss function according to the similarity constraint relation; and train the feature extractor and the at least two image classifiers based on the at least two first loss functions and the second loss function, so that the at least two first loss functions are minimized while the second loss function, through gradient reversal, is maximized.
In one embodiment, the similarity constraint relationship comprises a mutual information constraint relationship or an orthogonal constraint relationship.
In one embodiment, the classifier building module is further configured to: preprocessing the sample image to obtain a sample image with the image size being a preset image size; a sample image of a preset image size is input to the feature extractor.
In one embodiment, the apparatus 900 may further include:
the information display module is used for acquiring image classification information carrying the hierarchical classification result, and displaying the image classification information on the image to be classified.
In an embodiment, the feature input module 902 is further configured to send the image to be classified to a server, so that the server correspondingly inputs at least two image features of the image to be classified to at least two image classifiers, and obtains a classification result of the image to be classified on a corresponding classification level, which is output by the image classifiers; and receiving the classification result obtained by the server.
For the specific definition of the image classification device, reference may be made to the above definition of the image classification method, which is not described herein again. The modules in the image classification device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, the computer device may be a terminal, an internal structure diagram of which may be as shown in fig. 10, and fig. 10 is an internal structure diagram of the computer device in one embodiment. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement an image classification method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 10 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (15)

1. A method of image classification, the method comprising:
acquiring an image to be classified;
correspondingly inputting at least two image characteristics of the image to be classified into at least two image classifiers; the at least two image classifiers correspond respectively to at least two classification levels; there is a similarity constraint relation between the image features input to the image classifiers corresponding to adjacent classification levels, the similarity constraint relation being used for reducing the similarity between the image features;
and acquiring the hierarchical classification result of the image to be classified according to the classification result of the image to be classified at the corresponding classification level output by the image classifier.
2. The method according to claim 1, wherein the inputting at least two image features of the image to be classified into at least two image classifiers comprises:
acquiring the at least two image characteristics through a pre-constructed characteristic extractor, and correspondingly inputting the at least two image characteristics to the at least two image classifiers; the feature extractor and the at least two image classifiers are constructed based on the similarity constraint relationship.
3. The method of claim 2, wherein the feature extractor comprises a feature extraction network and an encoder; the obtaining of the at least two image features by a pre-constructed feature extractor includes:
inputting the image to be classified into the feature extraction network to obtain an initial image feature output by the feature extraction network;
inputting the initial image feature to the encoder to obtain an encoded initial image feature output by the encoder;
and acquiring the at least two image features based on the encoded initial image feature.
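The claim-3 pipeline (feature extraction network, then encoder, then deriving the per-level features from the encoded feature) can be sketched as below. The two linear maps are hypothetical stand-ins for real neural networks, and the final split into same-dimension halves follows the splitting described in claim 5:

```python
def linear(weights, feat):
    return [sum(w * f for w, f in zip(row, feat)) for row in weights]

image = [0.5, -0.5, 1.0, 0.0]                       # flattened input image (toy)

# stand-in feature extraction network: 4-dim input -> 6-dim initial feature
W_extract = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
             [1, 1, 0, 0], [0, 0, 1, 1]]
# stand-in encoder: 6-dim initial feature -> 4-dim encoded feature
W_encode = [[1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0],
            [0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0]]

initial_feature = linear(W_extract, image)          # initial image feature
encoded = linear(W_encode, initial_feature)         # encoded initial image feature
half = len(encoded) // 2
feat_level1, feat_level2 = encoded[:half], encoded[half:]  # same dimension each
```

Splitting one encoded vector guarantees the per-level features have equal dimension, which the training procedure of claim 4 relies on.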
4. The method of claim 2, wherein before the obtaining the at least two image features through the pre-constructed feature extractor, the method further comprises:
acquiring a sample image, and acquiring classification labels of the sample image at the at least two classification levels as real classification labels for the at least two image classifiers;
inputting the sample image into the feature extractor, and acquiring at least two sample image features of the same dimension according to the image features of the sample image output by the feature extractor;
respectively inputting the at least two sample image features to the at least two image classifiers, and obtaining predicted classification labels, output by the at least two image classifiers, of the sample image at the corresponding classification levels;
constructing a similarity constraint relationship between the sample image features input to the image classifiers corresponding to adjacent classification levels;
and training the feature extractor and the at least two image classifiers based on the real classification labels, the predicted classification labels, and the similarity constraint relationship, so as to construct the feature extractor and the at least two image classifiers.
5. The method of claim 4, wherein the feature extractor comprises a feature extraction network and an encoder; and the inputting the sample image into the feature extractor and acquiring at least two sample image features of the same dimension according to the image features of the sample image output by the feature extractor comprises:
inputting the sample image into the feature extraction network to obtain an initial sample image feature output by the feature extraction network;
inputting the initial sample image feature to the encoder to obtain an encoded initial sample image feature output by the encoder;
and splitting the encoded initial sample image feature into at least two sample image features of the same dimension.
6. The method of claim 4, wherein training the feature extractor and the at least two image classifiers based on the true classification label, the predicted classification label, and a similarity constraint relationship comprises:
according to the real classification label and the prediction classification label, constructing first loss functions corresponding to the two classification levels to obtain at least two first loss functions;
constructing a second loss function according to the similarity constraint relation;
training the feature extractor and the at least two image classifiers based on the at least two first loss functions and the second loss function such that the at least two first loss functions and the second loss function are maximized.
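Under the orthogonal-constraint reading of claim 7, the claim-6 losses can be sketched as one cross-entropy first loss per classification level plus a second loss that penalizes similarity between the two levels' features; in standard practice the combined loss is minimized jointly during training. All feature values and labels below are hypothetical, and the "classifiers" are reduced to identity maps for brevity:

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(logits, label):
    # negative log-likelihood of the true class
    return -math.log(softmax(logits)[label])

feat1 = [1.0, 0.0, 2.0]                   # feature for level 1
feat2 = [0.0, 3.0, 0.5]                   # feature for level 2
label1, label2 = 2, 1                     # real classification labels

loss1 = cross_entropy(feat1, label1)      # first loss, level 1
loss2 = cross_entropy(feat2, label2)      # first loss, level 2
# second loss: squared inner product, zero iff the two features are orthogonal
loss_sim = sum(a * b for a, b in zip(feat1, feat2)) ** 2
total_loss = loss1 + loss2 + loss_sim     # quantity minimized during training
```

Driving `loss_sim` toward zero pushes the per-level features apart, which is the stated purpose of the similarity constraint (reducing similarity between the features).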
7. The method of claim 6, wherein the similarity constraint relationship comprises a mutual information constraint relationship or an orthogonal constraint relationship.
8. The method of claim 4, wherein inputting the sample image to the feature extractor comprises:
preprocessing the sample image to obtain a sample image having a preset image size;
inputting the sample image of the preset image size to the feature extractor.
9. The method according to claim 1, wherein after the acquiring the hierarchical classification result of the image to be classified according to the classification results, output by the image classifiers, of the image to be classified at the corresponding classification levels, the method further comprises:
acquiring image classification information carrying the hierarchical classification result;
and displaying the image classification information on the image to be classified.
10. The method according to claim 1, wherein the correspondingly inputting at least two image features of the image to be classified into at least two image classifiers comprises:
sending the image to be classified to a server, so that the server correspondingly inputs the at least two image features of the image to be classified into the at least two image classifiers and obtains the classification results, output by the image classifiers, of the image to be classified at the corresponding classification levels;
and receiving the classification results obtained by the server.
11. An image classification apparatus, characterized in that the apparatus comprises:
the image acquisition module is used for acquiring an image to be classified;
the feature input module is used for correspondingly inputting at least two image features of the image to be classified into at least two image classifiers, the at least two image classifiers respectively corresponding to at least two classification levels; wherein the image features input to the image classifiers corresponding to adjacent classification levels have a similarity constraint relationship, the similarity constraint relationship being used to reduce the similarity between those image features;
and the result acquisition module is used for acquiring a hierarchical classification result of the image to be classified according to the classification results, output by the image classifiers, of the image to be classified at the corresponding classification levels.
12. The apparatus of claim 11, wherein the feature input module is further configured to obtain the at least two image features through a pre-constructed feature extractor, and correspondingly input the at least two image features to the at least two image classifiers; the feature extractor and the at least two image classifiers are constructed based on the similarity constraint relationship.
13. The apparatus of claim 12, wherein the feature extractor comprises a feature extraction network and an encoder; the feature input module is further configured to input the image to be classified into the feature extraction network to obtain an initial image feature output by the feature extraction network; input the initial image feature to the encoder to obtain an encoded initial image feature output by the encoder; and acquire the at least two image features based on the encoded initial image feature.
14. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 10 when executing the computer program.
15. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method according to any one of claims 1 to 10.
CN202010303814.8A 2020-04-17 2020-04-17 Image classification method and device, computer equipment and storage medium Pending CN111507403A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010303814.8A CN111507403A (en) 2020-04-17 2020-04-17 Image classification method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010303814.8A CN111507403A (en) 2020-04-17 2020-04-17 Image classification method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111507403A (en) 2020-08-07

Family

ID=71864176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010303814.8A Pending CN111507403A (en) 2020-04-17 2020-04-17 Image classification method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111507403A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112364933A (en) * 2020-11-23 2021-02-12 北京达佳互联信息技术有限公司 Image classification method and device, electronic equipment and storage medium
CN113255766A (en) * 2021-05-25 2021-08-13 平安科技(深圳)有限公司 Image classification method, device, equipment and storage medium
CN113673576A (en) * 2021-07-26 2021-11-19 浙江大华技术股份有限公司 Image detection method, terminal and computer readable storage medium thereof
CN113836338A (en) * 2021-07-21 2021-12-24 北京邮电大学 Fine-grained image classification method and device, storage medium and terminal
JP7082239B1 (en) 2021-06-09 2022-06-07 京セラ株式会社 Recognition device, terminal device, recognizer construction device, recognizer correction device, construction method, and correction method

Citations (29)

Publication number Priority date Publication date Assignee Title
JP2001160057A (en) * 1999-12-03 2001-06-12 Nippon Telegr & Teleph Corp <Ntt> Method for hierarchically classifying image and device for classifying and retrieving picture and recording medium with program for executing the method recorded thereon
US20020122596A1 (en) * 2001-01-02 2002-09-05 Bradshaw David Benedict Hierarchical, probabilistic, localized, semantic image classifier
US20130322740A1 (en) * 2012-05-31 2013-12-05 Lihui Chen Method of Automatically Training a Classifier Hierarchy by Dynamic Grouping the Training Samples
CN104200238A (en) * 2014-09-22 2014-12-10 北京酷云互动科技有限公司 Station caption recognition method and station caption recognition device
KR20160032533A (en) * 2014-09-16 2016-03-24 삼성전자주식회사 Feature extracting method of input image based on example pyramid and apparatus of face recognition
US20170193328A1 (en) * 2015-12-31 2017-07-06 Microsoft Technology Licensing, Llc Structure and training for image classification
CN107067022A (en) * 2017-01-04 2017-08-18 美的集团股份有限公司 The method for building up of image classification model, set up device and equipment
US9928448B1 (en) * 2016-09-23 2018-03-27 International Business Machines Corporation Image classification utilizing semantic relationships in a classification hierarchy
CN108171254A (en) * 2017-11-22 2018-06-15 北京达佳互联信息技术有限公司 Image tag determines method, apparatus and terminal
CN108664924A (en) * 2018-05-10 2018-10-16 东南大学 A kind of multi-tag object identification method based on convolutional neural networks
CN108681695A (en) * 2018-04-26 2018-10-19 北京市商汤科技开发有限公司 Video actions recognition methods and device, electronic equipment and storage medium
CN108875934A (en) * 2018-05-28 2018-11-23 北京旷视科技有限公司 A kind of training method of neural network, device, system and storage medium
CN109002845A (en) * 2018-06-29 2018-12-14 西安交通大学 Fine granularity image classification method based on depth convolutional neural networks
CN109189959A (en) * 2018-09-06 2019-01-11 腾讯科技(深圳)有限公司 A kind of method and device constructing image data base
CN109241880A (en) * 2018-08-22 2019-01-18 北京旷视科技有限公司 Image processing method, image processing apparatus, computer readable storage medium
CN109359566A (en) * 2018-09-29 2019-02-19 河南科技大学 The gesture identification method of hierarchical classification is carried out using finger characteristic
US20190171904A1 (en) * 2017-12-01 2019-06-06 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for training fine-grained image recognition model, fine-grained image recognition method and apparatus, and storage mediums
CN109919177A (en) * 2019-01-23 2019-06-21 西北工业大学 Feature selection approach based on stratification depth network
CN110163127A (en) * 2019-05-07 2019-08-23 国网江西省电力有限公司检修分公司 A kind of video object Activity recognition method from thick to thin
CN110287836A (en) * 2019-06-14 2019-09-27 北京迈格威科技有限公司 Image classification method, device, computer equipment and storage medium
CN110309888A (en) * 2019-07-11 2019-10-08 南京邮电大学 A kind of image classification method and system based on layering multi-task learning
CN110347870A (en) * 2019-06-19 2019-10-18 西安理工大学 The video frequency abstract generation method of view-based access control model conspicuousness detection and hierarchical clustering method
CN110390350A (en) * 2019-06-24 2019-10-29 西北大学 A kind of hierarchical classification method based on Bilinear Structure
CN110647907A (en) * 2019-08-05 2020-01-03 广东工业大学 Multi-label image classification algorithm using multi-layer classification and dictionary learning
CN110659378A (en) * 2019-09-07 2020-01-07 吉林大学 Fine-grained image retrieval method based on contrast similarity loss function
CN110738247A (en) * 2019-09-30 2020-01-31 中国科学院大学 fine-grained image classification method based on selective sparse sampling
CN110796183A (en) * 2019-10-17 2020-02-14 大连理工大学 Weak supervision fine-grained image classification algorithm based on relevance-guided discriminant learning
CN110929624A (en) * 2019-11-18 2020-03-27 西北工业大学 Construction method of multi-task classification network based on orthogonal loss function
CN110929730A (en) * 2019-11-18 2020-03-27 腾讯科技(深圳)有限公司 Image processing method, image processing device, computer equipment and storage medium

Non-Patent Citations (9)

Title
ANG LI 等: "Adaptive Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition", 2019 IEEE GLOBECOM WORKSHOPS (GC WKSHPS), pages 1 - 5 *
RUYI JI 等: "Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization", ARXIV, pages 1 - 10 *
XINQI ZHU 等: "B-CNN: Branch Convolutional Neural Network for Hierarchical Classification", ARXIV, pages 1 - 9 *
ZIHAO MAO 等: "Multi-branch Structure for Hierarchical Classification in Plant Disease Recognition", PATTERN RECOGNITION AND COMPUTER VISION. PRCV 2019. LECTURE NOTES IN COMPUTER SCIENCE, pages 528 *
LIU Peng et al.: "An Image Classification Method Based on Multi-level Abstract Semantic Decision", Acta Automatica Sinica, vol. 41, no. 05, pages 960 - 969 *
SUN Yanpeng et al.: "An Improved Method for Hierarchical Semantic Image Classification", Computer Applications and Software, vol. 30, no. 09, pages 263 - 265 *
YANG Huixian et al.: Laser & Optoelectronics Progress, vol. 56, no. 18, pages 134 - 142 *
HU Min et al.: "A Hierarchical Expression Classification Method Based on Geometric and Texture Features", Acta Electronica Sinica, vol. 45, no. 01, pages 164 - 172 *
DONG Xiongxiong: "Design and Implementation of a Fine-Grained Vehicle Classification System Based on an Attention-Mechanism Deep Neural Network", China Master's Theses Full-text Database, Engineering Science and Technology II, vol. 2019, no. 8, pages 034 - 197 *

Cited By (9)

Publication number Priority date Publication date Assignee Title
CN112364933A (en) * 2020-11-23 2021-02-12 北京达佳互联信息技术有限公司 Image classification method and device, electronic equipment and storage medium
CN113255766A (en) * 2021-05-25 2021-08-13 平安科技(深圳)有限公司 Image classification method, device, equipment and storage medium
CN113255766B (en) * 2021-05-25 2023-12-22 平安科技(深圳)有限公司 Image classification method, device, equipment and storage medium
JP7082239B1 (en) 2021-06-09 2022-06-07 京セラ株式会社 Recognition device, terminal device, recognizer construction device, recognizer correction device, construction method, and correction method
WO2022260148A1 (en) * 2021-06-09 2022-12-15 京セラ株式会社 Recognizing device, terminal device, recognizer constructing device, recognizer correcting device, construction method, and correction method
JP2022188727A (en) * 2021-06-09 2022-12-21 京セラ株式会社 Recognition device, terminal device, recognizer builder, recognizer corrector, building method and correction method
CN113836338A (en) * 2021-07-21 2021-12-24 北京邮电大学 Fine-grained image classification method and device, storage medium and terminal
CN113836338B (en) * 2021-07-21 2024-05-24 北京邮电大学 Fine granularity image classification method, device, storage medium and terminal
CN113673576A (en) * 2021-07-26 2021-11-19 浙江大华技术股份有限公司 Image detection method, terminal and computer readable storage medium thereof

Similar Documents

Publication Publication Date Title
CN111507403A (en) Image classification method and device, computer equipment and storage medium
Springenberg et al. Improving deep neural networks with probabilistic maxout units
CN111680672B (en) Face living body detection method, system, device, computer equipment and storage medium
CN110827236B (en) Brain tissue layering method, device and computer equipment based on neural network
CN111797326A (en) False news detection method and system fusing multi-scale visual information
CN114170516B (en) Vehicle weight recognition method and device based on roadside perception and electronic equipment
JP2022014776A (en) Activity detection device, activity detection system, and activity detection method
CN114418030A (en) Image classification method, and training method and device of image classification model
CN115512005A (en) Data processing method and device
CN117079299B (en) Data processing method, device, electronic equipment and storage medium
US11062141B2 (en) Methods and apparatuses for future trajectory forecast
Ousmane et al. Automatic recognition system of emotions expressed through the face using machine learning: Application to police interrogation simulation
CN115238888A (en) Training method, using method, device, equipment and medium of image classification model
CN112364828B (en) Face recognition method and financial system
CN112329735B (en) Training method of face recognition model and online education system
CN111178370A (en) Vehicle retrieval method and related device
CN115115910A (en) Training method, using method, device, equipment and medium of image processing model
CN113516182A (en) Visual question-answering model training method and device, and visual question-answering method and device
CN114639132A (en) Feature extraction model processing method, device and equipment in face recognition scene
CN116612466B (en) Content identification method, device, equipment and medium based on artificial intelligence
CN117437425B (en) Semantic segmentation method, semantic segmentation device, computer equipment and computer readable storage medium
CN113569887B (en) Picture recognition model training and picture recognition method, device and storage medium
CN115100419B (en) Target detection method and device, electronic equipment and storage medium
CN113255408B (en) Behavior recognition method, behavior recognition device, electronic equipment and storage medium
Andreș et al. Automatic License Plate Recognition and Real-Time Car Vignette Notifications

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40029152

Country of ref document: HK

SE01 Entry into force of request for substantive examination