CN113255766A - Image classification method, device, equipment and storage medium - Google Patents

Image classification method, device, equipment and storage medium Download PDF

Info

Publication number
CN113255766A
CN113255766A
Authority
CN
China
Prior art keywords
decoupling
class
image
feature map
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110570158.2A
Other languages
Chinese (zh)
Other versions
CN113255766B (en)
Inventor
陈凌智
高艳
王立龙
杜青
吕传峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANDONG EYE INSTITUTE
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202110570158.2A priority Critical patent/CN113255766B/en
Publication of CN113255766A publication Critical patent/CN113255766A/en
Application granted granted Critical
Publication of CN113255766B publication Critical patent/CN113255766B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

The application belongs to the technical field of image processing and provides an image classification method, apparatus, device and storage medium. The image classification method comprises the following steps: extracting a multi-channel feature map of an image to be classified; decoupling the multi-channel feature map to obtain a plurality of single-class decoupled feature maps; acquiring inter-class relations from the single-class decoupled feature maps obtained by decoupling; for each single-class decoupled feature map, calculating its classification probability according to the inter-class relations and the single-class decoupled feature map; and determining the category of the image to be classified according to the classification probability of each single-class decoupled feature map. Through feature decoupling and inter-class relation extraction, the method calculates the existence probability of potentially coexisting features independently and judges the presence of each feature label separately, improving the accuracy of image feature classification.

Description

Image classification method, device, equipment and storage medium
Technical Field
The present application belongs to the field of image processing technologies, and in particular, to an image classification method, apparatus, device, and storage medium.
Background
In image recognition scenarios, several similar features often appear simultaneously in one image. When an existing machine learning model learns from such samples, it also learns this feature-coexistence pattern. At inference time, for features that frequently co-occur in the training samples, the model tends to assume that when one feature appears the others appear as well. As a result, different features are difficult to recognize independently, results become confused, and classification accuracy suffers.
Taking the recognition of myopic fundus color photographs as an example, such recognition can currently be realized by training a multi-label classification convolutional neural network. A single sample in the training set may simultaneously carry one or more of the classification labels leopard-streak fundus, diffuse atrophy, plaque atrophy and macular atrophy. However, the training process does not consider the dependency relationships among the classification labels. When many classes coexist in the training set, the convolutional neural network learns a large number of coupled features, so that at inference time different feature labels are hard to distinguish individually and label classes with high co-occurrence frequency are easily confused. For example, when the training set contains many images in which leopard-streak fundus and plaque atrophy coexist, the network model tends to assume, for any image showing leopard-streak fundus, that plaque atrophy is present as well, and the classification prediction is therefore biased.
Disclosure of Invention
The embodiments of the application provide an image classification method, apparatus, device and storage medium that can improve the accuracy of image classification.
In a first aspect, an embodiment of the present application provides an image classification method, including:
extracting a multi-channel feature map of an image to be classified;
decoupling the multi-channel feature map to obtain a plurality of single-class decoupled feature maps;
acquiring inter-class relations from the single-class decoupled feature maps obtained by decoupling;
for each single-class decoupled feature map, calculating the classification probability of that single-class decoupled feature map according to the inter-class relations and the single-class decoupled feature map;
and determining the category of the image to be classified according to the classification probability of each single-class decoupled feature map.
Wherein the decoupling of the multi-channel feature map to obtain a plurality of single-class decoupled feature maps includes:
performing a convolution operation on the multi-channel feature map to obtain a convolution feature map;
processing the convolution feature map with a preset activation function to obtain a plurality of single-class feature maps;
and multiplying each single-class feature map with the multi-channel feature map to obtain a single-class decoupled feature map.
Specifically, the multi-channel feature map comprises a shallow feature map and a deep feature map, and the single-class decoupled feature maps comprise single-class shallow decoupled feature maps and single-class deep decoupled feature maps;
the decoupling of the multi-channel feature map to obtain a plurality of single-class decoupled feature maps includes:
decoupling the shallow feature map and the deep feature map separately to obtain the single-class shallow decoupled feature maps and the single-class deep decoupled feature maps.
Illustratively, the inter-class relations comprise an inter-class relation matrix;
the acquiring of the inter-class relations from the single-class decoupled feature maps obtained by decoupling includes:
fusing the single-class shallow decoupled feature maps and the single-class deep decoupled feature maps to generate the inter-class relation matrix;
the calculating, for each single-class decoupled feature map, of its classification probability according to the inter-class relations and the single-class decoupled feature map includes:
inputting the single-class deep decoupled feature maps and the inter-class relation matrix into a preset graph convolution network to obtain a first classification probability for each single-class decoupled feature map;
the determining of the category of the image to be classified according to the classification probability of each single-class decoupled feature map includes:
determining the category of the image to be classified according to the first classification probability of each single-class decoupled feature map.
As a possible implementation, after extracting the multi-channel feature map of the image to be classified, the method further includes:
calculating a second classification probability of the multi-channel feature map through a fully connected layer;
fusing the first classification probability and the second classification probability to obtain a third classification probability;
in this case, the determining of the category of the image to be classified according to the classification probability of each single-class decoupled feature map includes:
determining the category of the image to be classified according to the third classification probability of each single-class decoupled feature map.
In a second aspect, an embodiment of the present application provides an image classification apparatus, including:
a feature extraction module for extracting a multi-channel feature map of the image to be classified;
a feature decoupling module for decoupling the multi-channel feature map to obtain a plurality of single-class decoupled feature maps;
an inter-class relation extraction module for acquiring inter-class relations from the single-class decoupled feature maps obtained by decoupling, and for calculating, for each single-class decoupled feature map, its classification probability according to the inter-class relations and the single-class decoupled feature map;
and a classification module for determining the category of the image to be classified according to the classification probability of each single-class decoupled feature map.
Wherein the feature decoupling module comprises:
a convolution unit, comprising a convolution network with a 1×1 convolution kernel, for performing a convolution operation on the multi-channel feature map to obtain a convolution feature map;
an activation unit, comprising a preset activation function, for processing the convolution feature map with that function to obtain a plurality of single-class feature maps;
and a decoupling unit for multiplying each single-class feature map with the multi-channel feature map to obtain a single-class decoupled feature map.
Wherein the single-class decoupled feature maps comprise single-class shallow decoupled feature maps and single-class deep decoupled feature maps;
correspondingly, the inter-class relation extraction module comprises:
a fusion unit for fusing the single-class shallow decoupled feature maps and the single-class deep decoupled feature maps to generate an inter-class relation matrix;
and a graph convolution unit for inputting the single-class deep decoupled feature maps and the inter-class relation matrix into a preset graph convolution network to obtain a first classification probability for each single-class decoupled feature map.
In a third aspect, an embodiment of the present application provides an image classification apparatus, including: a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the image classification method as described in any of the first aspect above when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, including: the computer readable storage medium stores a computer program which, when executed by a processor, implements the image classification method as described above in the first aspect.
In a fifth aspect, the present application provides a computer program product, which when run on a terminal device, causes the terminal device to execute the image classification method according to any one of the first aspect.
It is understood that, for the beneficial effects of the second to fifth aspects, reference may be made to the related description of the first aspect; they are not repeated here.
Compared with the prior art, the embodiments of the application have the following advantages: the multi-channel feature map of an image is decoupled into a plurality of single-class feature maps, inter-class relations among the single-class feature maps are extracted, and the likelihood of each feature label of the image is determined according to those relations so as to classify the labels of the image. Through feature decoupling and inter-class relation extraction, the existence probability of potentially coexisting features is calculated independently and the presence of each feature label is judged separately, which improves the accuracy of image feature classification.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flowchart of an image classification method according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating an image classification method according to another embodiment of the present application;
fig. 3 is a schematic structural diagram of an image classification apparatus according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a convolutional network model provided in an embodiment of the present application;
fig. 5 is a schematic diagram of an image classification apparatus according to another embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
This embodiment provides an image classification method suitable for scenarios in which features of an image are recognized and labels are assigned according to those features: for example, classifying trees in landscape photographs, classifying the gender or age of people in photographs, or classifying food in a refrigerator. It is particularly suitable for classifying the image signs in fundus color photographs. The category of an image is represented by category labels corresponding to its different features; in fundus color photograph classification the category labels include leopard-streak fundus, diffuse atrophy, plaque atrophy, macular atrophy and the like, and several category labels may coexist in one fundus color photograph.
The image to be classified is acquired by an image acquisition device, which may be a digital camera, a mobile phone, a fundus camera or the like, depending on the application scenario. The image classification method is executed by a corresponding image classification apparatus. The image acquisition device and the image classification apparatus may be independent devices connected by a communication link; in that case the image classification apparatus may be any device with image processing capability, such as a PC, a tablet computer, a server or a cloud computing terminal, integrating a memory and a processor. In other implementations of the present application, the image classification apparatus may be integrated with the image acquisition device. In the fundus color photograph classification scenario, fundus color photographs are collected by a fundus camera and classified by a PC.
When the image acquisition device and the image classification apparatus are set up independently, each is also provided with a communication unit. The communication units provide wired or wireless communication between the two, and the image acquisition device sends the acquired image to be classified to the image classification apparatus.
Next, the classification of fundus color photographs is described as an example. Fig. 1 is a flowchart illustrating the image classification method according to this embodiment. As shown in fig. 1, the image classification method is performed by an image classification apparatus and includes the following steps.
S11: extracting the multi-channel feature map of the image to be classified.
The image to be classified is input into a backbone neural network (backbone) to extract spatial semantic features, yielding a multi-channel feature map. The input image to be classified is usually a three-channel color image, and the multi-channel feature map comprises a shallow feature map and a deep feature map.
The backbone adopts a ResNet-50 network structure. ResNet-50 comprises four groups of residual blocks: the image to be classified serves as the input of the 1st group, the output of the 1st group serves as the input of the 2nd group, the output of the 2nd group serves as the input of the 3rd group, and so on. The shallow feature map is taken from the convolution output of the 2nd group of residual blocks, and the deep feature map from the convolution output of the 4th group.
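For illustration only (the patent itself gives no code), a minimal PyTorch sketch of tapping the shallow and deep feature maps from a ResNet-50 backbone could look as follows; the use of torchvision's resnet50 and all names here are assumptions, not part of the disclosed method.

```python
import torch
from torchvision.models import resnet50

class Backbone(torch.nn.Module):
    """Taps the outputs of residual groups 2 and 4 as shallow/deep feature maps."""
    def __init__(self):
        super().__init__()
        net = resnet50(weights=None)
        self.stem = torch.nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool)
        self.layer1, self.layer2 = net.layer1, net.layer2  # residual groups 1 and 2
        self.layer3, self.layer4 = net.layer3, net.layer4  # residual groups 3 and 4

    def forward(self, x):
        x = self.stem(x)
        shallow = self.layer2(self.layer1(x))      # shallow multi-channel feature map
        deep = self.layer4(self.layer3(shallow))   # deep multi-channel feature map
        return shallow, deep

shallow, deep = Backbone()(torch.randn(1, 3, 224, 224))
print(shallow.shape, deep.shape)  # [1, 512, 28, 28] and [1, 2048, 7, 7]
```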
S12: decoupling the multi-channel feature map to obtain a plurality of single-class decoupled feature maps.
A convolution network with a 1×1 convolution kernel performs a convolution operation on the multi-channel feature map to extract features and obtain a convolution feature map; the 1×1 convolution kernel reduces the number of channels of the feature map to match the number of classification labels without changing the spatial size of the feature map. The feature map whose channel count matches the label count is then input into a preset activation function, which outputs a single-class feature map for each classification label. Each single-class feature map is multiplied with the multi-channel feature map to obtain a single-class decoupled feature map.
Correspondingly, the shallow feature map and the deep feature map are decoupled separately, so the resulting single-class decoupled feature maps comprise, for each category label, a single-class shallow decoupled feature map and a single-class deep decoupled feature map.
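A sketch of this decoupling step, under the stated assumptions: a 1×1 convolution maps the feature map to one channel per class label, a sigmoid turns each channel into a per-class attention map, and each attention map re-weights the original multi-channel feature map. Tensor shapes and the sigmoid choice are illustrative.

```python
import torch

class FeatureDecoupler(torch.nn.Module):
    """1x1 conv -> sigmoid gives one map per class label; each map then
    re-weights the full multi-channel feature map."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.reduce = torch.nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, feat):                      # feat: [B, C, H, W]
        masks = torch.sigmoid(self.reduce(feat))  # [B, K, H, W], K = number of labels
        # Multiply each single-class map with the multi-channel feature map,
        # yielding K decoupled feature maps per image.
        return feat.unsqueeze(1) * masks.unsqueeze(2)  # [B, K, C, H, W]

decoupled = FeatureDecoupler(2048, 4)(torch.randn(2, 2048, 7, 7))
print(decoupled.shape)  # torch.Size([2, 4, 2048, 7, 7])
```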
S13: acquiring the inter-class relations from the single-class decoupled feature maps obtained by decoupling.
For the single-class decoupled feature maps of each category label, the single-class shallow decoupled feature map and the single-class deep decoupled feature map are fused through a concat fusion operation to generate an inter-class relation matrix, which represents the inter-class relations of the category labels.
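The patent describes a concat fusion that yields a relation matrix but does not give the exact computation, so the following is one plausible reading, stated as an assumption: pool each class's shallow and deep decoupled maps into vectors, concatenate them, and take pairwise cosine similarity as the K×K inter-class relation matrix.

```python
import torch
import torch.nn.functional as F

def relation_matrix(shallow_dec: torch.Tensor, deep_dec: torch.Tensor) -> torch.Tensor:
    """shallow_dec: [B, K, Cs, Hs, Ws]; deep_dec: [B, K, Cd, Hd, Wd]."""
    s = shallow_dec.flatten(3).mean(-1)  # global average pool -> [B, K, Cs]
    d = deep_dec.flatten(3).mean(-1)     # global average pool -> [B, K, Cd]
    v = torch.cat([s, d], dim=-1)        # concat fusion -> [B, K, Cs + Cd]
    v = F.normalize(v, dim=-1)
    return v @ v.transpose(1, 2)         # [B, K, K] pairwise cosine similarity
```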
S14: for each single-class decoupled feature map, calculating its classification probability according to the inter-class relations and the single-class decoupled feature map.
The single-class deep decoupled feature maps and the inter-class relation matrix are input into a preset graph convolution network, which computes a first classification probability for each single-class decoupled feature map. The first classification probability is a classification probability based on the inter-class relations.
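The patent specifies only "a preset graph convolution network"; the sketch below follows the common Kipf–Welling form H' = act(A·H·W), with pooled single-class deep decoupled features as node features, the relation matrix as adjacency, and illustrative layer sizes.

```python
import torch

class RelationGCN(torch.nn.Module):
    """Two graph convolutions of the form H' = act(A @ H @ W)."""
    def __init__(self, in_dim: int, hid_dim: int = 256):
        super().__init__()
        self.w1 = torch.nn.Linear(in_dim, hid_dim)
        self.w2 = torch.nn.Linear(hid_dim, 1)

    def forward(self, nodes, adj):
        # nodes: [B, K, in_dim] pooled single-class deep decoupled features
        # adj:   [B, K, K] inter-class relation matrix
        h = torch.relu(adj @ self.w1(nodes))  # first graph convolution layer
        h = adj @ self.w2(h)                  # second graph convolution layer
        return torch.sigmoid(h).squeeze(-1)   # [B, K] first classification probability
```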
S15: determining the category of the image to be classified according to the classification probability of each single-class decoupled feature map.
A probability threshold is set separately for each category label; when the classification probability of a category label reaches its threshold, the image to be classified is judged to carry that label. For example, in fundus color photograph classification, when the classification probability of the single-class feature map for plaque atrophy reaches its probability threshold, the fundus color photograph is considered to carry the plaque atrophy label.
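A small sketch of this per-label thresholding; the threshold values and probabilities below are made-up placeholders, and in practice the thresholds would be tuned, e.g. on validation data.

```python
import torch

labels = ["leopard-streak fundus", "diffuse atrophy", "plaque atrophy", "macular atrophy"]
thresholds = torch.tensor([0.5, 0.5, 0.6, 0.6])  # one threshold per category label
probs = torch.tensor([0.81, 0.12, 0.64, 0.07])   # classification probabilities
predicted = [l for l, p, t in zip(labels, probs, thresholds) if p >= t]
print(predicted)  # ['leopard-streak fundus', 'plaque atrophy']
```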
In this embodiment, the multi-channel feature map of an image is decoupled into a plurality of single-class decoupled feature maps corresponding to the category labels, the inter-class relations of those feature maps are extracted, and the label-coexistence situation of the image is determined from those relations in order to classify the labels of the image. This classification approach takes the objective relations among category labels into account, avoids the misjudgment of label coexistence that sample imbalance causes in conventional convolutional network training, and improves the accuracy of image feature classification.
Fig. 2 is a schematic flowchart of an image classification method according to another embodiment. Building on the embodiments above, when the image features in the image to be classified differ only subtly and frequently co-occur, the inter-class relations and the image features need to be considered together for more accurate classification.
As shown in fig. 2, the image classification method includes the steps of:
and S21, extracting the multi-channel feature map of the image to be classified.
The multi-channel feature map includes a shallow feature map and a deep feature map.
S22: calculating a second classification probability of the multi-channel feature map through the fully connected layer.
The fully connected layer performs a classification task based on the features and calculates the second classification probability of the multi-channel feature map. The second classification probability is a classification probability based on the image features.
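A sketch of such a fully connected head, assuming global average pooling before the linear layer and a sigmoid output; the pooling choice and sizes are assumptions, not taken from the patent.

```python
import torch

class FCHead(torch.nn.Module):
    """Global average pool -> linear -> sigmoid: feature-based probabilities."""
    def __init__(self, in_channels: int = 2048, num_classes: int = 4):
        super().__init__()
        self.fc = torch.nn.Linear(in_channels, num_classes)

    def forward(self, deep):                  # deep: [B, C, H, W]
        pooled = deep.mean(dim=(2, 3))        # global average pool -> [B, C]
        return torch.sigmoid(self.fc(pooled)) # [B, K] second classification probability
```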
S23: decoupling the multi-channel feature map to obtain a plurality of single-class decoupled feature maps.
Decoupling the shallow feature map yields the single-class shallow decoupled feature maps, and decoupling the deep feature map yields the single-class deep decoupled feature maps.
S24: acquiring the inter-class relations from the single-class decoupled feature maps obtained by decoupling.
The single-class shallow decoupled feature maps and the single-class deep decoupled feature maps are fused to generate the inter-class relation matrix representing the inter-class relations.
S25: for each single-class decoupled feature map, calculating the first classification probability according to the inter-class relations and the single-class decoupled feature map.
The single-class deep decoupled feature maps and the inter-class relation matrix are input into the preset graph convolution network to obtain the first classification probability of each single-class decoupled feature map.
S26: fusing the first classification probability and the second classification probability to obtain a third classification probability.
For the category label corresponding to each single-class decoupled feature map, the simplest fusion is to take the average of the first and second classification probabilities as the third classification probability of that label.
Alternatively, weights can be assigned to the first and second classification probabilities according to the complexity of images in the actual application scenario, and the weighted probabilities fused.
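A sketch of the two fusion options just described: the plain average, or a weighted sum whose weight w would be chosen according to the scene's complexity (w itself is an assumed free parameter).

```python
import torch

def fuse(p1: torch.Tensor, p2: torch.Tensor, w=None) -> torch.Tensor:
    """Fuse relation-based (p1) and feature-based (p2) probabilities."""
    if w is None:
        return (p1 + p2) / 2        # simplest fusion: the plain average
    return w * p1 + (1.0 - w) * p2  # weighted fusion, with 0 <= w <= 1
```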
S27: determining the category of the image to be classified according to the third classification probability of each single-class decoupled feature map.
A probability threshold is set separately for each category label; when the classification probability of a category label reaches its threshold, the image to be classified is judged to carry that label.
Building on the preceding method embodiment, the first classification probability is obtained from the inter-class relations and the second from the image features; weighing the two according to the complexity of the image to be classified keeps the classification result from being distorted by feature coexistence while ensuring that frequently coexisting features are not missed.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present application in any way. Steps with the same content are not described repeatedly.
Further, the present embodiment also provides an image classification apparatus, which is composed of software and/or hardware and is used for executing the image classification method. Fig. 3 is a schematic structural diagram of the image classification apparatus provided in this embodiment. As shown in fig. 3, the image classification apparatus includes:
a feature extraction module for extracting a multi-channel feature map of the image to be classified, the multi-channel feature map comprising a shallow feature map and a deep feature map;
a feature decoupling module for decoupling the multi-channel feature map to obtain a plurality of single-class decoupled feature maps, comprising single-class shallow decoupled feature maps and single-class deep decoupled feature maps;
an inter-class relation extraction module for acquiring inter-class relations from the single-class decoupled feature maps obtained by decoupling, and for calculating, for each single-class decoupled feature map, its classification probability according to the inter-class relations and the single-class decoupled feature map;
and a classification module for determining the category of the image to be classified according to the classification probability of each single-class decoupled feature map.
Wherein the feature decoupling module comprises: convolution unit, activation unit and decoupling unit.
The convolution unit comprises a convolution network with a 1 x 1 convolution kernel and is used for performing convolution operation on the multi-channel feature map to obtain a convolution feature map;
the activation unit comprises a preset activation function, and the convolution characteristic graphs are processed based on the preset activation function to obtain a plurality of single-type characteristic graphs;
and the decoupling unit is used for multiplying each single-class characteristic diagram with the multi-channel characteristic diagram respectively to obtain a single-class decoupling characteristic diagram.
Correspondingly, the inter-class relation extraction module comprises a fusion unit and a graph convolution unit.
The fusion unit fuses the single-class shallow decoupled feature maps and the single-class deep decoupled feature maps to generate the inter-class relation matrix;
and the graph convolution unit inputs the single-class deep decoupled feature maps and the inter-class relation matrix into a preset graph convolution network to obtain the first classification probability of each single-class decoupled feature map.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
Specifically, a feature extraction module, a feature decoupling module and an inter-class relation extraction module of the image classification device form a convolution network model. Fig. 4 is a schematic structural diagram of the convolutional network model provided in this embodiment.
The feature extraction module extracts spatial semantic features through a convolutional backbone neural network (backbone) comprising several groups of residual blocks, each group containing several residual modules. In some embodiments the feature extraction module may employ a residual network (ResNet), such as the ResNet-50 structure; residual networks are more sensitive to data fluctuations, which helps in recognizing and extracting visually similar image features.
As a non-limiting example, and as shown in fig. 4, the ResNet-50 network includes four groups of residual blocks, containing 3, 4, 6 and 3 residual blocks respectively. The groups extract the multi-channel feature map of the image to be classified in sequence, yielding the shallow and deep feature maps. Specifically, the image to be classified serves as the input of the 1st group, the output of the 1st group serves as the input of the 2nd group, the output of the 2nd group serves as the input of the 3rd group, and so on; the shallow feature map and the deep feature map are output from the convolution results of the 2nd and 4th groups of residual blocks respectively.
The feature decoupling module decouples the shallow feature map and the deep feature map separately, obtaining for each class a single-class shallow decoupled feature map and a single-class deep decoupled feature map.
The feature decoupling module comprises a convolution network with a 1×1 convolution kernel, an activation function and a decoupling calculation unit. The convolution network performs a convolution operation on the multi-channel feature map to extract features and obtain a convolution feature map; the 1×1 convolution kernel reduces the number of channels without changing the spatial size of the feature map, producing as many feature maps as there are category labels. Each convolution feature map is input into the activation function, which outputs the single-class feature map corresponding to a category label; the activation function may be a sigmoid function, and in other application scenarios a Tanh or ReLU function may be chosen. The decoupling calculation unit multiplies each single-class feature map with the multi-channel feature map to obtain the single-class decoupled feature maps.
The inter-class relation extraction module comprises a fusion calculation unit and a two-layer Graph Convolutional Network (GCN). The fusion calculation unit fuses the single-class shallow decoupled feature maps and the single-class deep decoupled feature maps through a concat operation to generate a correlation matrix. The single-class deep decoupled feature maps and this inter-class relation matrix serve as the input of the graph convolution network, which computes the first classification probability of each single-class deep decoupled feature map. The first classification probability is a classification probability based on the inter-class relations.
In other embodiments, the feature extraction module further includes a fully connected layer, which performs a classification task and calculates the second classification probability of the multi-channel feature map. The second classification probability is a classification probability based on the image features.
The inter-class relation extraction module is further configured to fuse the first and second classification probabilities into a third classification probability. For the category label corresponding to each single-class decoupled feature map, the simplest fusion takes the average of the two as the classification probability of that label; in other embodiments, the two probabilities may be fused with separately assigned weights.
The inter-class relation extraction module determines the category of the image to be classified according to the first or the third classification probability: when the image features in the image to be classified are clearly distinguishable, the first classification probability alone suffices for label classification; otherwise the third classification probability is used.
A probability threshold is set separately for each category label; when the classification probability of a category label reaches its threshold, the image to be classified is judged to carry that label.
Before practical application, the convolutional network model needs to be trained. The model performs feature extraction and feature decoupling on sample images to obtain a plurality of single-class decoupled feature maps, each of which is labeled with its category label for the model to learn from. By learning from a large number of sample images, the convolutional network model acquires the inter-class relations among the category labels and, further, learns to output the classification probabilities.
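The patent does not name its loss function; a hedged sketch of multi-label training with binary cross-entropy, the standard choice for this kind of setup, is given below. Here `model` is assumed to return per-label probabilities and `loader` to yield (image, labels) pairs with labels as 0/1 vectors.

```python
import torch

def train_epoch(model, loader, optimizer):
    criterion = torch.nn.BCELoss()  # expects probabilities and 0/1 label vectors
    model.train()
    for images, targets in loader:  # targets: [B, K] multi-hot label vectors
        optimizer.zero_grad()
        probs = model(images)       # [B, K] per-label classification probabilities
        loss = criterion(probs, targets.float())
        loss.backward()
        optimizer.step()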
In application, the convolutional network model performs feature extraction and feature decoupling on the image to be classified, calculates and outputs the classification probability of each category label from the single-class decoupled feature maps on the basis of the learned inter-class relations, and then determines, from the classification probability and the probability threshold, whether the image to be classified carries the corresponding category label.
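Putting the earlier sketches together, a hedged end-to-end inference sketch mirroring the described flow (backbone → decoupling → relation matrix → GCN → optional fusion with the FC head → thresholding); every module name refers to an assumed earlier sketch, not to the patented implementation itself.

```python
import torch

@torch.no_grad()
def classify(image, backbone, decoupler_s, decoupler_d, gcn, fc_head, thresholds, w=0.5):
    shallow, deep = backbone(image)      # multi-channel shallow and deep feature maps
    dec_s = decoupler_s(shallow)         # [B, K, Cs, Hs, Ws] shallow decoupled maps
    dec_d = decoupler_d(deep)            # [B, K, Cd, Hd, Wd] deep decoupled maps
    adj = relation_matrix(dec_s, dec_d)  # [B, K, K] inter-class relation matrix
    nodes = dec_d.flatten(3).mean(-1)    # pooled per-class deep features [B, K, Cd]
    p1 = gcn(nodes, adj)                 # first classification probability
    p3 = w * p1 + (1 - w) * fc_head(deep)  # third probability (weighted fusion)
    return p3 >= thresholds              # per-label decisions [B, K]
```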
Existing machine learning models only extract image features; the classification probabilities they output ignore the relations between the category labels and are easily swayed by the training samples, so when the training set is not sufficiently comprehensive and balanced, the learned classification probabilities are inaccurate. The convolutional network model provided by this embodiment decouples the features and extracts the relations between coexisting features, weakening the influence of training-sample bias on coexisting features and making the classification more objective.
Fig. 5 is a schematic structural diagram of an image classification device according to another embodiment of the present application. As shown in fig. 5, the present embodiment also provides an image classification apparatus, including: a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the image classification method as described in the above method embodiments when executing the computer program.
Specifically, the memory stores the trained convolutional network model, and the processor calls the convolutional network model to implement the image classification method according to the above method embodiment, so as to determine the category of the image to be classified.
The Processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may in some embodiments be an internal storage unit of the image acquisition device or the image classification apparatus, such as a hard disk or an internal memory, and may in other embodiments be an external storage device of the image acquisition device or the image classification apparatus, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a Flash Card. Further, the memory may include both an internal storage unit and an external storage device. The memory stores an operating system, application programs, a boot loader (BootLoader), data and other programs, such as the program code of the computer program, and may also temporarily store data that has been or will be output, such as captured images to be classified and the various feature maps.
Through the trained convolutional network model described above, the processor can determine the category of the image to be classified, for example to classify fundus color photographs.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above-mentioned method embodiments.
The embodiments of the present application further provide a computer program product which, when run on a mobile terminal, enables the mobile terminal to implement the steps of the above method embodiments.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and can implement the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium. Such as a usb-disk, a removable hard disk, a magnetic or optical disk, etc. In certain jurisdictions, computer-readable media may not be an electrical carrier signal or a telecommunications signal in accordance with legislative and patent practice.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. An image classification method, comprising:
extracting a multi-channel feature map of an image to be classified;
decoupling the multi-channel feature map to obtain a plurality of single-class decoupled feature maps;
acquiring inter-class relations from the single-class decoupled feature maps obtained by decoupling;
for each single-class decoupled feature map, calculating the classification probability of that single-class decoupled feature map according to the inter-class relations and the single-class decoupled feature map;
and determining the category of the image to be classified according to the classification probability of each single-class decoupled feature map.
2. The image classification method of claim 1, wherein the decoupling of the multi-channel feature map to obtain a plurality of single-class decoupled feature maps comprises:
performing a convolution operation on the multi-channel feature map to obtain a convolution feature map;
processing the convolution feature map with a preset activation function to obtain a plurality of single-class feature maps;
and multiplying each single-class feature map with the multi-channel feature map to obtain a single-class decoupled feature map.
3. The image classification method according to claim 1, wherein the multi-channel feature map comprises a shallow feature map and a deep feature map, and the single-class decoupled feature maps comprise single-class shallow decoupled feature maps and single-class deep decoupled feature maps;
the decoupling of the multi-channel feature map to obtain a plurality of single-class decoupled feature maps comprises:
decoupling the shallow feature map and the deep feature map separately to obtain the single-class shallow decoupled feature maps and the single-class deep decoupled feature maps.
4. The image classification method according to claim 3, wherein the inter-class relations comprise an inter-class relation matrix;
the acquiring of the inter-class relations from the single-class decoupled feature maps obtained by decoupling comprises:
fusing the single-class shallow decoupled feature maps and the single-class deep decoupled feature maps to generate the inter-class relation matrix;
the calculating, for each single-class decoupled feature map, of the classification probability according to the inter-class relations and the single-class decoupled feature map comprises:
inputting the single-class deep decoupled feature maps and the inter-class relation matrix into a preset graph convolution network to obtain a first classification probability for each single-class decoupled feature map;
the determining of the category of the image to be classified according to the classification probability of each single-class decoupled feature map comprises:
determining the category of the image to be classified according to the first classification probability of each single-class decoupled feature map.
5. The image classification method according to claim 4, wherein after extracting the multi-channel feature map of the image to be classified, the method further comprises:
calculating a second classification probability of the multi-channel feature map through a fully connected layer;
fusing the first classification probability and the second classification probability to obtain a third classification probability;
the determining of the category of the image to be classified according to the classification probability of each single-class decoupled feature map comprises:
determining the category of the image to be classified according to the third classification probability of each single-class decoupled feature map.
6. An image classification apparatus, comprising:
a feature extraction module for extracting a multi-channel feature map of the image to be classified;
a feature decoupling module for decoupling the multi-channel feature map to obtain a plurality of single-class decoupled feature maps;
an inter-class relation extraction module for acquiring inter-class relations from the single-class decoupled feature maps obtained by decoupling, and for calculating, for each single-class decoupled feature map, the classification probability of that feature map according to the inter-class relations and the single-class decoupled feature map;
and a classification module for determining the category of the image to be classified according to the classification probability of each single-class decoupled feature map.
7. The image classification device of claim 6, wherein the feature decoupling module comprises:
a convolution unit, comprising a convolution network with a 1×1 convolution kernel, for performing a convolution operation on the multi-channel feature map to obtain a convolution feature map;
an activation unit, comprising a preset activation function, for processing the convolution feature map with that function to obtain a plurality of single-class feature maps;
and a decoupling unit for multiplying each single-class feature map with the multi-channel feature map to obtain a single-class decoupled feature map.
8. The image classification device according to claim 6, wherein the single-class decoupled feature maps comprise single-class shallow decoupled feature maps and single-class deep decoupled feature maps;
the inter-class relation extraction module comprises:
a fusion unit for fusing the single-class shallow decoupled feature maps and the single-class deep decoupled feature maps to generate an inter-class relation matrix;
and a graph convolution unit for inputting the single-class deep decoupled feature maps and the inter-class relation matrix into a preset graph convolution network to obtain a first classification probability for each single-class decoupled feature map.
9. An image classification apparatus characterized by comprising: memory, processor and computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the image classification method according to any one of claims 1 to 5.
CN202110570158.2A 2021-05-25 2021-05-25 Image classification method, device, equipment and storage medium Active CN113255766B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110570158.2A CN113255766B (en) 2021-05-25 2021-05-25 Image classification method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110570158.2A CN113255766B (en) 2021-05-25 2021-05-25 Image classification method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113255766A true CN113255766A (en) 2021-08-13
CN113255766B CN113255766B (en) 2023-12-22

Family

ID=77184151

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110570158.2A Active CN113255766B (en) 2021-05-25 2021-05-25 Image classification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113255766B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023068330A1 (en) * 2021-10-20 2023-04-27 DeepEyeVision株式会社 Information processing device, information processing method, and computer-readable recording medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180075628A1 (en) * 2016-09-12 2018-03-15 Zebra Medical Vision Ltd. Systems and methods for automated detection of an indication of malignancy in a mammographic image
US20180189325A1 (en) * 2016-12-29 2018-07-05 Shutterstock, Inc. Clustering search results based on image composition
CN109145743A (en) * 2018-07-19 2019-01-04 叶涵 A kind of image-recognizing method and device based on deep learning
CN109886273A (en) * 2019-02-26 2019-06-14 四川大学华西医院 A kind of CMR classification of image segmentation system
CN111488945A (en) * 2020-04-17 2020-08-04 上海眼控科技股份有限公司 Image processing method, image processing device, computer equipment and computer readable storage medium
CN111507403A (en) * 2020-04-17 2020-08-07 腾讯科技(深圳)有限公司 Image classification method and device, computer equipment and storage medium
CN111553419A (en) * 2020-04-28 2020-08-18 腾讯科技(深圳)有限公司 Image identification method, device, equipment and readable storage medium
CN111914775A (en) * 2020-08-06 2020-11-10 平安科技(深圳)有限公司 Living body detection method, living body detection device, electronic apparatus, and storage medium
US20210117687A1 (en) * 2019-10-22 2021-04-22 Sensetime International Pte. Ltd. Image processing method, image processing device, and storage medium
CN112749653A (en) * 2020-12-31 2021-05-04 平安科技(深圳)有限公司 Pedestrian detection method, device, electronic equipment and storage medium

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180075628A1 (en) * 2016-09-12 2018-03-15 Zebra Medical Vision Ltd. Systems and methods for automated detection of an indication of malignancy in a mammographic image
US20180189325A1 (en) * 2016-12-29 2018-07-05 Shutterstock, Inc. Clustering search results based on image composition
CN109145743A (en) * 2018-07-19 2019-01-04 叶涵 A kind of image-recognizing method and device based on deep learning
CN109886273A (en) * 2019-02-26 2019-06-14 四川大学华西医院 A kind of CMR classification of image segmentation system
US20210117687A1 (en) * 2019-10-22 2021-04-22 Sensetime International Pte. Ltd. Image processing method, image processing device, and storage medium
CN111488945A (en) * 2020-04-17 2020-08-04 上海眼控科技股份有限公司 Image processing method, image processing device, computer equipment and computer readable storage medium
CN111507403A (en) * 2020-04-17 2020-08-07 腾讯科技(深圳)有限公司 Image classification method and device, computer equipment and storage medium
CN111553419A (en) * 2020-04-28 2020-08-18 腾讯科技(深圳)有限公司 Image identification method, device, equipment and readable storage medium
CN111914775A (en) * 2020-08-06 2020-11-10 平安科技(深圳)有限公司 Living body detection method, living body detection device, electronic apparatus, and storage medium
CN112749653A (en) * 2020-12-31 2021-05-04 平安科技(深圳)有限公司 Pedestrian detection method, device, electronic equipment and storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023068330A1 (en) * 2021-10-20 2023-04-27 DeepEyeVision株式会社 Information processing device, information processing method, and computer-readable recording medium

Also Published As

Publication number Publication date
CN113255766B (en) 2023-12-22

Similar Documents

Publication Publication Date Title
CN110889312B (en) Living body detection method and apparatus, electronic device, computer-readable storage medium
CN111881707B (en) Image reproduction detection method, identity verification method, model training method and device
CN110188829B (en) Neural network training method, target recognition method and related products
CN109359196B (en) Text multi-modal representation method and device
CN112633297B (en) Target object identification method and device, storage medium and electronic device
CN114419363A (en) Target classification model training method and device based on label-free sample data
CN112949459A (en) Smoking image recognition method and device, storage medium and electronic equipment
CN112613508A (en) Object identification method, device and equipment
CN111784665A (en) OCT image quality assessment method, system and device based on Fourier transform
CN113255766B (en) Image classification method, device, equipment and storage medium
CN114091551A (en) Pornographic image identification method and device, electronic equipment and storage medium
CN112257628A (en) Method, device and equipment for identifying identities of outdoor competition athletes
CN115713669A (en) Image classification method and device based on inter-class relation, storage medium and terminal
CN114610942A (en) Image retrieval method and device based on joint learning, storage medium and electronic equipment
CN113420801A (en) Network model generation method, device, terminal and storage medium
Gavilan Ruiz et al. Image categorization using color blobs in a mobile environment
CN111881187A (en) Method for automatically establishing data processing model and related product
CN112580750A (en) Image recognition method and device, electronic equipment and storage medium
CN111144298A (en) Pedestrian identification method and device
CN113128262A (en) Target identification method and device, storage medium and electronic device
CN113542866B (en) Video processing method, device, equipment and computer readable storage medium
CN113255665B (en) Target text extraction method and system
CN111079468A (en) Method and device for robot to recognize object
CN113709563B (en) Video cover selecting method and device, storage medium and electronic equipment
CN110866532B (en) Object matching method and device, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221031

Address after: 518000 Guangdong, Shenzhen, Futian District Futian street Fu'an community Yitian road 5033, Ping An financial center, 23 floor.

Applicant after: PING AN TECHNOLOGY (SHENZHEN) Co.,Ltd.

Applicant after: SHANDONG EYE INSTITUTE

Address before: 518000 Guangdong, Shenzhen, Futian District Futian street Fu'an community Yitian road 5033, Ping An financial center, 23 floor.

Applicant before: PING AN TECHNOLOGY (SHENZHEN) Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant