CN114299343A - Multi-granularity information fusion fine-granularity image classification method and system - Google Patents

Multi-granularity information fusion fine-granularity image classification method and system

Info

Publication number
CN114299343A
Authority
CN
China
Prior art keywords
image
granularity
training
information fusion
fine
Prior art date
Legal status
Pending
Application number
CN202111664965.7A
Other languages
Chinese (zh)
Inventor
胡建国
杨学彬
肖辉敏
卢星宇
吴劲
王德明
Current Assignee
Development Research Institute Of Guangzhou Smart City
Sun Yat Sen University
Shenzhen Research Institute of Sun Yat Sen University
Original Assignee
Development Research Institute Of Guangzhou Smart City
Sun Yat Sen University
Shenzhen Research Institute of Sun Yat Sen University
Priority date
Filing date
Publication date
Application filed by Development Research Institute Of Guangzhou Smart City, Sun Yat Sen University, Shenzhen Research Institute of Sun Yat Sen University
Priority to CN202111664965.7A priority Critical patent/CN114299343A/en
Publication of CN114299343A publication Critical patent/CN114299343A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a multi-granularity information fusion fine-granularity image classification method and system. Input images form an image data set; a triplet is constructed through a global interference module; the triplet is input into a CNNs backbone network for training under a progressive multi-granularity information fusion training strategy to obtain an optimized classification model; and the input images are classified through the optimized classification model to obtain the final classification result. The method does not need any manual labeling and is realized through two mutually cooperative processes: global interference and progressive multi-granularity information fusion, which enable the network to fuse information of different granularities and thereby find local features with higher discriminability. Compared with the baseline network, the method greatly improves the recognition accuracy on all data sets.

Description

Multi-granularity information fusion fine-granularity image classification method and system
Technical Field
The invention belongs to the technical fields of machine learning and image classification, and particularly relates to a multi-granularity information fusion fine-granularity image classification method and system.
Background
Fine-grained image recognition is a very challenging task in the field of computer vision, usually aiming to distinguish sub-categories under the same super-category. Unlike general image recognition, different sub-categories share a similar structure and differ only in subtle details, which results in low inter-class variance between fine-grained image categories. In addition, due to uncertain factors such as illumination and occlusion, images in the same category have high intra-class variance. Therefore, a fine-grained image recognition method must be able to accurately find the subtle differences between different sub-classes of the same super-class, which is a challenging task.
Most existing fine-grained image recognition methods can be divided into two categories, strongly supervised recognition and weakly supervised recognition, according to the supervision information used. The two kinds of methods use different supervision information during training, so their algorithms differ greatly. Strongly supervised recognition methods can use additional manual annotations such as bounding boxes besides the category label information; such extensive annotation is time-consuming and labor-intensive, and distinguishing fine-grained categories requires sufficient professional knowledge, which is a huge bottleneck. With the rise of deep learning and transfer learning, weakly supervised recognition methods that only require image-level labels have become mainstream, but most of them depend on weights pre-trained on large-scale annotated data (such as the ImageNet data set).
Self-supervised learning has made great breakthroughs in recent years: no labeled information is needed during pre-training, and results similar to or even better than those of supervised learning can be obtained, so self-supervised learning has gradually become a trend. However, the self-supervised paradigm is still new in the field of fine-grained image classification. This patent studies the fine-grained image classification problem under self-supervised learning.
Existing fine-grained image classification methods can be roughly divided into strongly supervised recognition and weakly supervised recognition according to the amount of supervision information used in training. Strongly supervised recognition methods are time-consuming and labor-intensive because they need additional manual annotation, and judging fine-grained image categories requires strong expert knowledge. Current mainstream weakly supervised fine-grained image classification methods only use image-level label information to extract discriminative information from training data. Although they achieve good results, most of them heavily depend on weights pre-trained on a large-scale data set (such as the ImageNet data set); however, the ImageNet pre-training objective does not consider the characteristics of the downstream classification task, so the obtained model is suboptimal for fine-grained classification tasks. Therefore, for fine-grained image classification it is necessary to design a learning method that can successfully learn the visual representation of an image without manual labeling. In summary, the common problems of fine-grained image methods are as follows:
first, fine-grained image classification requires more expert knowledge than general image classification, and manual labeling of these data is cost prohibitive.
Secondly, most of the existing fine-grained image classification methods rely on a pre-training model on a large-scale data set (ImageNet), but the model does not consider the characteristics of downstream tasks, so that the model is suboptimal for fine-grained classification tasks.
Disclosure of Invention
The invention aims to provide a technical scheme of a multi-granularity information fusion fine-granularity image classification method and system, so as to solve one or more technical problems in the prior art and at least provide a beneficial alternative or creation condition.
In order to solve the above problems, based on the characteristics of fine-grained images, namely large intra-class variance and small inter-class variance, the invention provides a multi-granularity information fusion algorithm based on self-supervised contrastive learning for the fine-grained image classification task. The algorithm can improve the accuracy of self-supervised fine-grained image classification without using any artificial label and, without using ImageNet pre-trained weights, greatly narrows the gap between self-supervised learning and supervised learning in the fine-grained image classification field. The self-supervised algorithm mainly comprises two stages: pre-training and fine-tuning.
In order to achieve the above object, according to an aspect of the present invention, there is provided a multi-granularity information fusion fine-granularity image classification method, including:
s100, inputting N images to form an image data set;
s200, randomly extracting an image from the image data set, randomly cropping 2 pictures from the extracted image, and randomly extracting another image from the image data set;
s300, constructing a triple through a global interference module;
s400, inputting the triples [ a, p, n ] into a CNNs backbone network for training through a progressive multi-granularity information fusion training strategy to obtain an optimized classification model;
and S500, classifying the input images through the optimized classification model to obtain a final classification result.
Given an unlabeled input image data set batch = {x1, x2, …, xN} containing N images, for a certain picture c in each batch during training, two randomly cropped pictures of picture c are first generated through random data augmentation and are defined as c1 and c2, and any other picture in the batch is defined as c3.
The triplet [a, p, n] is constructed by means of a global perturbation module, defined as f(·), wherein the anchor sample (anchor) is recorded as a = f(c1), the positive sample (positive sample) is denoted as p = f(c2), and the negative sample is denoted as n = f(c3). For the triplet [a, p, n], the anchor sample a and the positive sample p form a positive sample pair, and the anchor sample a and the negative sample n form a negative sample pair; since the positive sample pair comes from the same image and the negative sample pair comes from different images, the positive sample pair carries similar semantic content or visual features;
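To make the triplet construction concrete, the following is a minimal sketch (not taken from the patent) of how [a, p, n] might be assembled for one picture c in a batch; the names build_triplet, random_crop and perturb are illustrative assumptions, and perturb stands for the global perturbation f(·) detailed in the global interference module below.

```python
import random
import torchvision.transforms as T

# Illustrative augmentation pipeline (crop size and flips are assumptions).
random_crop = T.Compose([T.RandomResizedCrop(224), T.RandomHorizontalFlip(), T.ToTensor()])

def build_triplet(batch_images, idx, M, perturb):
    """Build [a, p, n] for picture c = batch_images[idx].
    perturb is the global perturbation f(.) sketched in the global interference module below."""
    c = batch_images[idx]
    c1, c2 = random_crop(c), random_crop(c)                      # two random crops of picture c
    other = random.choice([j for j in range(len(batch_images)) if j != idx])
    c3 = random_crop(batch_images[other])                        # any other picture in the batch
    # c3 is cropped here only so all tensors share the same size (illustrative choice).
    a, p, n = perturb(c1, M), perturb(c2, M), perturb(c3, M)     # anchor, positive, negative
    return a, p, n
```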
inputting the triplet [a, p, n] into a CNNs backbone network for training, projecting the extracted features into a D-dimensional embedding space through a multilayer perceptron (MLP), then carrying out L2 normalization, and finally calculating the contrastive loss for back-propagation; the pre-training process does not need to use any manually labeled information such as labels;
wherein, the training is a progressive multi-granularity information fusion training strategy;
the method used in the pre-training process and the calculation of the contrast loss are described in detail below.
(1) Global interference module
Since the inter-class differences of fine-grained images are small, in most cases different fine-grained classes often share similar global information and differ only in local details. Therefore, we design a global perturbation module to generate the positive and negative sample pairs; by destroying the global structure, it makes the neural network focus better on the local discriminative features.
Specifically, the method comprises the following steps:
the specific method in the global interference module is as follows: the image is divided into M multiplied by M blocks of subareas, then each subarea is randomly arranged with the same probability and is combined into a new image, M is a hyper-parameter and is used for controlling the generation of the images with different granularity information, and the images with different granularity information can be obtained by setting different M values.
Through this global interference, the global semantic information of the image is destroyed and the neural network is forced to pay more attention to local semantic information to complete the discrimination; this makes the network more sensitive to local features without requiring an accurate bounding box.
(2) Progressive multi-granularity information fusion training strategy
Since the intra-class variance of fine-grained images is large, the discriminative information of a fine-grained class naturally exists at different visual granularities, and the local information of a single granularity is not sufficient to find the discriminative regions. In order to make the network fully utilize the information among different granularities and thus better find the discriminative information, a simple progressive multi-granularity information fusion training strategy is designed. This strategy cooperates with the global interference module: the global interference module encourages the network to learn information at a certain specific granularity, while the progressive multi-granularity training strategy encourages the network to fuse information at different granularities, so that information at different granularities can cooperate with each other and the influence caused by large intra-class variance is avoided.
Specifically, the progressive multi-granularity information fusion training strategy divides the pre-training process uniformly into S stages, and in each stage the images carrying different granularity information, automatically generated by the global interference module, are fed into the CNNs backbone network. In each stage, the emphasis of training is to let the network learn information at a certain granularity. The training method is similar to reinforcement learning: at the end of each training stage, the parameters trained in the current stage are transferred to the next training stage as its parameter initialization. This transfer essentially enables the network to mine information at further granularities based on the regions learned in the previous stage, so that the complementary relation between different granularity information is fully explored.
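The following sketch illustrates one possible form of this stage-wise loop, under the assumptions that each stage uses a single granularity value M, that the data loader yields (c1, c2, c3) triples, and that global_disturbance_batch and contrastive_step are hypothetical batched helpers corresponding to the perturbation, projection and loss sketched elsewhere in this description; the granularity schedule and optimizer settings are illustrative.

```python
import torch

def progressive_pretrain(encoder, projector, loader, granularities=(8, 4, 2, 1),
                         epochs_per_stage=10, lr=1e-3, device="cuda"):
    """Progressive multi-granularity pre-training sketch: each stage trains on
    images perturbed with one granularity M, and the weights learned in a stage
    initialize the next stage (they are simply kept in place and training continues)."""
    encoder.to(device); projector.to(device)
    for M in granularities:                                # S stages, one granularity per stage
        opt = torch.optim.SGD(list(encoder.parameters()) +
                              list(projector.parameters()), lr=lr, momentum=0.9)
        for _ in range(epochs_per_stage):
            for c1, c2, c3 in loader:                      # two crops of c and another image
                a = global_disturbance_batch(c1.to(device), M)   # anchor (hypothetical helper)
                p = global_disturbance_batch(c2.to(device), M)   # positive
                n = global_disturbance_batch(c3.to(device), M)   # negative
                # contrastive_step: encode, project, L2-normalize, compute the loss
                loss = contrastive_step(encoder, projector, a, p, n)
                opt.zero_grad(); loss.backward(); opt.step()
        # Parameters trained in this stage are carried over to the next stage.
    return encoder
```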
(3) Multi-layer perceptron MLP
Before the target loss function is calculated, an MLP-based nonlinear projection is adopted, so that the invariant features of each input image can be identified and the network's ability to recognize different transformations of the same image is maximized. The multi-layer perceptron (MLP) adopts two fully connected layers, so that the nonlinear information of the data can be learned and the features learned by the backbone network are enhanced; through this learning step, the common informative features of data of the same class can be obtained.
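A minimal sketch of such a two-layer projection head with L2 normalization follows; the input, hidden and output (D) dimensions are illustrative values, not figures given in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionMLP(nn.Module):
    """Two fully connected layers that project backbone features into a
    D-dimensional embedding space, followed by L2 normalization."""
    def __init__(self, in_dim=2048, hidden_dim=512, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.net(x)
        # L2-normalize so that dot products behave as cosine similarities.
        return F.normalize(z, dim=-1)
```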
(4) Loss calculation
NCE-Loss is selected first; the purpose of using NCE-Loss is to pull positive sample pairs with strong similarity close together in the latent space and to push negative sample pairs far apart. The NCE-Loss formula is as follows:
$$\mathcal{L}_{NCE} = -\log\frac{\exp(q \cdot k_{+}/\tau)}{\exp(q \cdot k_{+}/\tau) + \sum_{k_{-}}\exp(q \cdot k_{-}/\tau)}$$
where $q$ represents the query sample, $k_{+}$ represents the corresponding positive sample, $k_{-}$ represents the other negative samples, and $\tau$ is a hyper-parameter used to measure the distance distribution;
characteristic triplet (z) obtained by encoding triplet through CNNsa,zp,zn) Inserting into NCE-Loss, and setting the hyperparameter τ to 1, then NCE-Loss is optimized to the following form:
$$l_{c} = -\log\frac{\exp(z_{a}\cdot z_{p})}{\exp(z_{a}\cdot z_{p}) + \sum_{k=1}^{K}\exp\bigl(z_{a}\cdot z_{n}^{(k)}\bigr)}$$
wherein $z_{n}^{(k)}$ denotes the feature of the k-th negative sample encoded by the CNNs, and
k is the number of samples in the image dataset.
2. Fine tuning process
After the pre-training phase is completed, we transfer the pre-trained model weights to the downstream task for fine-tuning. The fine-tuning process follows common image classification practice: the weights of our pre-trained model are used as initialization during fine-tuning, the model is optimized with cross-entropy loss to obtain the optimized classification model, and the input images are classified through the optimized classification model to obtain the final classification result.
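A minimal fine-tuning sketch under these assumptions is shown below: the pre-trained encoder is reused as initialization, a linear classifier is attached, and the whole model is optimized with cross-entropy; the feature dimension, classifier, optimizer and training schedule are illustrative, not values specified in the patent.

```python
import torch
import torch.nn as nn

def finetune(encoder, num_classes, train_loader, epochs=30, lr=1e-2, device="cuda"):
    """Fine-tuning with cross-entropy, initialized from the pre-trained encoder."""
    feat_dim = 2048                                   # illustrative; depends on the backbone
    classifier = nn.Linear(feat_dim, num_classes)     # new classification head
    model = nn.Sequential(encoder, classifier).to(device)
    criterion = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            loss = criterion(model(images), labels)
            opt.zero_grad(); loss.backward(); opt.step()
    return model
```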
The invention also provides a multi-granularity information fusion fine-granularity image classification system, which comprises a processor, a memory and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the above multi-granularity information fusion fine-granularity image classification method are implemented. The multi-granularity information fusion fine-granularity image classification system can be run on computing devices such as desktop computers, notebook computers, palmtop computers and cloud data centers; the runnable system may include, but is not limited to, a processor, a memory and a server cluster. The processor executes the computer program to run in the following units of the system:
an image input unit for inputting N images to constitute an image data set;
the image selecting unit is used for randomly extracting an image from the image data set, randomly cropping 2 pictures from the extracted image, and randomly extracting another image from the image data set;
the global interference unit is used for constructing a triple through a global interference module;
the progressive training unit is used for inputting the triples [ a, p, n ] into the CNNs backbone network for training to obtain an optimized classification model through a progressive multi-granularity information fusion training strategy;
and the image classification unit is used for classifying the input images through the optimized classification model so as to obtain a final classification result.
The invention has the beneficial effects that: the invention provides a multi-granularity information fusion fine-granularity image classification method and system; the method does not need any manual labeling and adopts two mutually cooperative processes, global interference and progressive multi-granularity information fusion, which enable the network to fuse information of different granularities and thereby find local features with better discriminability. Extensive experiments are carried out on the classical fine-grained image classification data sets CUB, Stanford Cars and Aircraft; compared with the baseline network, the method greatly improves the recognition accuracy on all data sets, and the results on Stanford Cars and Aircraft even exceed those of ImageNet supervised learning.
Drawings
The above and other features of the present invention will become more apparent from the detailed description of its embodiments with reference to the attached drawings, in which like reference numerals designate the same or similar elements. It is apparent that the drawings in the following description are merely examples of the present invention, and other drawings can be obtained by those skilled in the art without inventive effort, wherein:
fig. 1 is a structural diagram of a multi-granularity information fusion fine-granularity image classification system.
Detailed Description
The conception, the specific structure and the technical effects of the present invention will be clearly and completely described in conjunction with the embodiments and the accompanying drawings to fully understand the objects, the schemes and the effects of the present invention. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
As shown in fig. 1, positive and negative samples need to be constructed for an input image. The algorithm constructs positive samples carrying different granularity information for each image through the global interference module, while other images are treated as negative samples of that image. Compared with the original input image, a positive sample is a transformation of the original image whose global information is destroyed but whose local information is retained, so the original image and the positive sample form a positive sample pair. Negative samples are transformations of other images, so the original image and the negative samples form a series of negative sample pairs. Then, the positive and negative samples with different granularity information generated by the global interference module are fed into the encoder; adopting the multi-granularity information fusion training mode, pre-training is carried out in stages on the positive and negative samples with different granularity information, with the weights trained in the previous stage used as the initialization of the next training stage. Through this incremental learning process the network fuses different granularity information, so the problem of large intra-class variance is effectively avoided. The whole process does not use any manual labeling information. Finally, the trained model is transferred to a fine-grained data set for fine-tuning to obtain the final classification result.
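Putting the pieces together, an illustrative end-to-end flow (reusing the ProjectionMLP, progressive_pretrain and finetune sketches given earlier, and assuming that pretrain_loader and finetune_loader exist) might look as follows; the ResNet-50 backbone, the granularity schedule and the 200-class head are example choices, not requirements of the patent.

```python
import torch
import torchvision

# Backbone without ImageNet pre-trained weights; expose its 2048-d pooled features.
encoder = torchvision.models.resnet50(weights=None)
encoder.fc = torch.nn.Identity()
projector = ProjectionMLP(in_dim=2048)

# Stage-wise self-supervised pre-training with the global interference module.
encoder = progressive_pretrain(encoder, projector, pretrain_loader,
                               granularities=(8, 4, 2, 1))

# Fine-tune on the fine-grained data set with cross-entropy (200 classes is an example).
model = finetune(encoder, num_classes=200, train_loader=finetune_loader)
```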
Given an unlabeled input image data set batch = {x1, x2, …, xN} containing N images, for a certain picture c in each batch during training, two randomly cropped pictures of picture c are first generated through random data augmentation and are defined as c1 and c2, and any other picture in the batch is defined as c3.
The triplet [a, p, n] is constructed by means of a global perturbation module, defined as f(·), wherein the anchor sample (anchor) is recorded as a = f(c1), the positive sample (positive sample) is denoted as p = f(c2), and the negative sample is denoted as n = f(c3). For the triplet [a, p, n], the anchor sample a and the positive sample p form a positive sample pair, and the anchor sample a and the negative sample n form a negative sample pair; since the positive sample pair comes from the same image and the negative sample pair comes from different images, the positive sample pair carries similar semantic content or visual features;
inputting the triplet [a, p, n] into a CNNs backbone network for training, projecting the extracted features into a D-dimensional embedding space through a multilayer perceptron (MLP), then carrying out L2 normalization, and finally calculating the contrastive loss for back-propagation; the pre-training process does not need to use any manually labeled information such as labels;
wherein, the training is a progressive multi-granularity information fusion training strategy;
the method used in the pre-training process and the calculation of the contrast loss are described in detail below.
(1) Global interference module
Since the inter-class differences of fine-grained images are small, in most cases different fine-grained classes often share similar global information and differ only in local details. Therefore, we design a global perturbation module to generate the positive and negative sample pairs; by destroying the global structure, it makes the neural network focus better on the local discriminative features.
Specifically, the method comprises the following steps:
the specific method in the global interference module is as follows: the image is divided into M × M blocks of sub-regions, and then each sub-region is randomly arranged with the same probability and merged into a new image, M is a hyper-parameter for controlling the generation of images with different granularity information, and setting different M values can obtain images with different granularity information, preferably, M is [1,8 ].
Through this global interference, the global semantic information of the image is destroyed and the neural network is forced to pay more attention to local semantic information to complete the discrimination; this makes the network more sensitive to local features without requiring an accurate bounding box.
(2) Progressive multi-granularity information fusion training strategy
Since the intra-class variance of fine-grained images is large, the discriminative information of a fine-grained class naturally exists at different visual granularities, and the local information of a single granularity is not sufficient to find the discriminative regions. In order to make the network fully utilize the information among different granularities and thus better find the discriminative information, a simple progressive multi-granularity information fusion training strategy is designed. This strategy cooperates with the global interference module: the global interference module encourages the network to learn information at a certain specific granularity, while the progressive multi-granularity training strategy encourages the network to fuse information at different granularities, so that information at different granularities can cooperate with each other and the influence caused by large intra-class variance is avoided.
Specifically, the progressive multi-granularity information fusion training strategy divides the pre-training process uniformly into S stages, and in each stage the images carrying different granularity information, automatically generated by the global interference module, are fed into the CNNs backbone network. In each stage, the emphasis of training is to let the network learn information at a certain granularity. The training method is similar to reinforcement learning: at the end of each training stage, the parameters trained in the current stage are transferred to the next training stage as its parameter initialization. This transfer essentially enables the network to mine information at further granularities based on the regions learned in the previous stage, so that the complementary relation between different granularity information is fully explored.
(3) Multi-layer perceptron MLP
Before the target loss function is calculated, an MLP-based nonlinear projection is adopted, so that the invariant features of each input image can be identified and the network's ability to recognize different transformations of the same image is maximized. The multi-layer perceptron (MLP) adopts two fully connected layers, so that the nonlinear information of the data can be learned and the features learned by the backbone network are enhanced; through this learning step, the common informative features of data of the same class can be obtained.
(4) Loss calculation
NCE-Loss is selected first; the purpose of using NCE-Loss is to pull positive sample pairs with strong similarity close together in the latent space and to push negative sample pairs far apart. The NCE-Loss formula is as follows:
$$\mathcal{L}_{NCE} = -\log\frac{\exp(q \cdot k_{+}/\tau)}{\exp(q \cdot k_{+}/\tau) + \sum_{k_{-}}\exp(q \cdot k_{-}/\tau)}$$
where $q$ represents the query sample, $k_{+}$ represents the corresponding positive sample, $k_{-}$ represents the other negative samples, and $\tau$ is a hyper-parameter used to measure the distance distribution;
characteristic triplet (z) obtained by encoding triplet through CNNsa,zp,zn) Inserting into NCE-Loss, and setting the hyperparameter τ to 1, then NCE-Loss is optimized to the following form:
$$l_{c} = -\log\frac{\exp(z_{a}\cdot z_{p})}{\exp(z_{a}\cdot z_{p}) + \sum_{k=1}^{K}\exp\bigl(z_{a}\cdot z_{n}^{(k)}\bigr)}$$
wherein $z_{n}^{(k)}$ denotes the feature of the k-th negative sample encoded by the CNNs, and
k is the number of images in the image dataset.
2. Fine tuning process
After the pre-training phase is completed, we transfer the pre-trained model weights to the downstream task for fine-tuning. The fine-tuning process follows common image classification practice: the weights of our pre-trained model are used as initialization during fine-tuning, the model is optimized with cross-entropy loss to obtain the optimized classification model, and the input images are classified through the optimized classification model to obtain the final classification result.
An embodiment of the present invention provides a multi-granularity information fusion fine-granularity image classification system; fig. 1 is a structural diagram of this system. The multi-granularity information fusion fine-granularity image classification system of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor; when executing the computer program, the processor implements the steps in the above-described embodiment of the multi-granularity information fusion fine-granularity image classification method.
The system comprises: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to run in the following units of the system:
an image input unit for inputting N images to constitute an image data set;
the image selecting unit is used for randomly extracting an image from the image data set, randomly cropping 2 pictures from the extracted image, and randomly extracting another image from the image data set;
the global interference unit is used for constructing a triple through a global interference module;
the progressive training unit is used for inputting the triples [ a, p, n ] into the CNNs backbone network for training to obtain an optimized classification model through a progressive multi-granularity information fusion training strategy;
and the image classification unit is used for classifying the input images through the optimized classification model so as to obtain a final classification result.
The multi-granularity information fusion fine-granularity image classification system can be run on computing devices such as desktop computers, notebooks, palmtop computers and cloud servers. The runnable system may include, but is not limited to, a processor and a memory. Those skilled in the art will appreciate that this example is merely an example of the multi-granularity information fusion fine-granularity image classification system and does not constitute a limitation of it; the system may include more or fewer components than those shown, or combine certain components, or use different components; for example, the multi-granularity information fusion fine-granularity image classification system may further include an input-output device, a network access device, a bus, etc.
The Processor may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. The general processor may be a microprocessor or the processor may be any conventional processor, and the processor is a control center of the multi-granularity information fusion fine-granularity image classification system operating system, and various interfaces and lines are used to connect various parts of the whole multi-granularity information fusion fine-granularity image classification system operable system.
The memory may be used to store the computer program and/or module, and the processor may implement various functions of the multi-granular information fusion fine-grained image classification system by running or executing the computer program and/or module stored in the memory and calling data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. In addition, the memory may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other volatile solid state storage device.
While the present invention has been described in considerable detail and with particular reference to a few illustrative embodiments thereof, it is not intended to be limited to any such details or embodiments or any particular embodiments, but it is to be construed as effectively covering the intended scope of the invention by providing a broad, potential interpretation of such claims in view of the prior art with reference to the appended claims. Furthermore, the foregoing describes the invention in terms of embodiments foreseen by the inventor for which an enabling description was available, notwithstanding that insubstantial modifications of the invention, not presently foreseen, may nonetheless represent equivalent modifications thereto.

Claims (8)

1. A multi-granularity information fusion fine-granularity image classification method is characterized by comprising the following steps:
s100, inputting N images to form an image data set;
s200, randomly extracting an image from the image data set, randomly cropping 2 pictures from the extracted image, and randomly extracting another image from the image data set;
s300, constructing a triple through a global interference module;
s400, inputting the triples [ a, p, n ] into a CNNs backbone network for training through a progressive multi-granularity information fusion training strategy to obtain an optimized classification model;
and S500, classifying the input images through the optimized classification model to obtain a final classification result.
2. The method for classifying the multi-granularity information fusion fine-granularity image according to claim 1, wherein in S200, the method of randomly extracting an image from the image data set, randomly cropping 2 pictures from the extracted image, and randomly extracting another image from the image data set is as follows: given an unlabeled input image data set batch = {x1, x2, …, xN} containing N images, for a certain picture c in each batch during training, two randomly cropped pictures of picture c are first generated through random data augmentation and are defined as c1 and c2, and any other picture in the batch is defined as c3.
3. The method for classifying the multi-granularity information fusion fine-granularity image according to claim 2, wherein in S300, the method for constructing the triplet through the global interference module is as follows: the triplet [a, p, n] is constructed through the global perturbation module, wherein the anchor sample is recorded as a = f(c1), the positive sample is denoted as p = f(c2), and the negative sample is denoted as n = f(c3); for the triplet [a, p, n], the anchor sample a and the positive sample p form a positive sample pair, and the anchor sample a and the negative sample n form a negative sample pair; the positive sample pair comes from the same image and the negative sample pair comes from different images, so the positive sample pair carries similar semantic content or visual features; the global interference module is defined as f(·).
4. The method for classifying the multi-granularity information fusion fine-granularity image according to claim 3, wherein in S400, the method for inputting the triples [ a, p, n ] into the CNNs backbone network for training to obtain the optimized classification model by using the progressive multi-granularity information fusion training strategy comprises the following steps: inputting the triples [ a, p, n ] into a CNNs backbone network for training, projecting the trained features to a D-dimensional embedding space through a multi-layer perceptron MLP, then carrying out L2 normalization, and finally calculating the contrast loss for back propagation, wherein the pre-training process does not need to use any manually labeled information such as labels; wherein, the training is a progressive multi-granularity information fusion training strategy.
5. The method for classifying the multi-granularity information fusion fine-granularity image according to claim 3, wherein in S300, the specific method in the global interference module is as follows: the image is divided into M × M sub-regions, then the sub-regions are randomly permuted with equal probability and recombined into a new image; M is a hyper-parameter used to control the generation of images with different granularity information, and images with different granularity information can be obtained by setting different M values.
6. The method of claim 4, wherein in S400, the progressive multi-granularity information fusion training strategy is to divide the pre-training process uniformly into a plurality of stages, and in each stage the images with different granularity information automatically generated by the global interference module are sent to the CNNs backbone network; at the end of each training stage, the parameters trained in the current stage are transferred to the next training stage as its parameter initialization.
7. The method of claim 4, wherein in S400, the loss function of the CNNs backbone network is $l_c$:
the feature triplet $(z_a, z_p, z_n)$ obtained by encoding the triplet [a, p, n] through the CNNs is inserted into the loss function, which then has the following form:
$$l_{c} = -\log\frac{\exp(z_{a}\cdot z_{p})}{\exp(z_{a}\cdot z_{p}) + \sum_{k=1}^{K}\exp\bigl(z_{a}\cdot z_{n}^{(k)}\bigr)}$$
wherein $z_{n}^{(k)}$ denotes the feature of the k-th negative sample encoded by the CNNs, and
k is the number of samples in the image dataset.
8. A multi-granularity information fusion fine-granularity image classification system, comprising: a processor, a memory and a computer program stored in the memory and running on the processor, wherein the processor, when executing the computer program, implements the steps in the multi-granularity information fusion fine-granularity image classification method of claim 1, and the multi-granularity information fusion fine-granularity image classification system can be run on computing devices including desktop computers, notebooks, palmtop computers and cloud data centers.
CN202111664965.7A 2021-12-31 2021-12-31 Multi-granularity information fusion fine-granularity image classification method and system Pending CN114299343A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111664965.7A CN114299343A (en) 2021-12-31 2021-12-31 Multi-granularity information fusion fine-granularity image classification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111664965.7A CN114299343A (en) 2021-12-31 2021-12-31 Multi-granularity information fusion fine-granularity image classification method and system

Publications (1)

Publication Number Publication Date
CN114299343A true CN114299343A (en) 2022-04-08

Family

ID=80972653

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111664965.7A Pending CN114299343A (en) 2021-12-31 2021-12-31 Multi-granularity information fusion fine-granularity image classification method and system

Country Status (1)

Country Link
CN (1) CN114299343A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115035389A (en) * 2022-08-10 2022-09-09 华东交通大学 Fine-grained image identification method and device based on reliability evaluation and iterative learning
CN115035389B (en) * 2022-08-10 2022-10-25 华东交通大学 Fine-grained image identification method and device based on reliability evaluation and iterative learning
CN116188916A (en) * 2023-04-17 2023-05-30 杰创智能科技股份有限公司 Fine granularity image recognition method, device, equipment and storage medium
CN116452896A (en) * 2023-06-16 2023-07-18 中国科学技术大学 Method, system, device and medium for improving fine-grained image classification performance
CN116452896B (en) * 2023-06-16 2023-10-20 中国科学技术大学 Method, system, device and medium for improving fine-grained image classification performance

Similar Documents

Publication Publication Date Title
Bansal et al. An efficient technique for object recognition using Shi-Tomasi corner detection algorithm
Taherkhani et al. Deep-FS: A feature selection algorithm for Deep Boltzmann Machines
Maji et al. Efficient classification for additive kernel SVMs
He et al. Supercnn: A superpixelwise convolutional neural network for salient object detection
Kishore et al. Indian classical dance action identification and classification with convolutional neural networks
CN114299343A (en) Multi-granularity information fusion fine-granularity image classification method and system
US8606022B2 (en) Information processing apparatus, method and program
CN111680678B (en) Target area identification method, device, equipment and readable storage medium
CN110781856A (en) Heterogeneous face recognition model training method, face recognition method and related device
Al-Garaawi et al. BRIEF-based face descriptor: an application to automatic facial expression recognition (AFER)
Raj et al. Optimal feature selection and classification of Indian classical dance hand gesture dataset
Chakraborty et al. Application of daisy descriptor for language identification in the wild
Tang et al. Learning extremely shared middle-level image representation for scene classification
Hameed et al. Content based image retrieval based on feature fusion and support vector machine
Qi et al. Supervised deep semantics-preserving hashing for real-time pulmonary nodule image retrieval
Du et al. MIL-SKDE: Multiple-instance learning with supervised kernel density estimation
Che et al. Image retrieval by information fusion based on scalable vocabulary tree and robust Hausdorff distance
Do et al. ImageNet challenging classification with the Raspberry Pis: a federated learning algorithm of local stochastic gradient descent models
US10534980B2 (en) Method and apparatus for recognizing object based on vocabulary tree
Huang et al. Image retrieval based on ASIFT features in a Hadoop clustered system
US20210209473A1 (en) Generalized Activations Function for Machine Learning
CN111767710B (en) Indonesia emotion classification method, device, equipment and medium
Dalara et al. Entity Recognition in Indian Sculpture using CLAHE and machine learning
Kejriwal et al. Multi instance multi label classification of restaurant images
Ali et al. Context awareness based Sketch-DeepNet architecture for hand-drawn sketches classification and recognition in AIoT

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination