CN112132004A - Fine-grained image identification method based on multi-view feature fusion - Google Patents
Fine-grained image identification method based on multi-view feature fusion
- Publication number
- CN112132004A (application CN202010992253.7A)
- Authority
- CN
- China
- Prior art keywords
- feature
- bilinear
- loss function
- image
- fine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V40/10 — Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
- G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/253 — Fusion techniques of extracted features
- G06N3/045 — Combinations of networks
- G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
Abstract
A fine-grained image recognition method based on multi-view feature fusion, in the technical field of image processing, addresses shortcomings of existing fine-grained recognition methods: neglected image detail information, poor adaptability to visual differences between images, and complex loss functions that inflate the model's parameter count. The method introduces a suppression branch that masks the most salient region of an image, forcing the network to seek the subtle discriminative features that separate easily confused categories. A similar-sample comparison module fuses the feature vectors of same-class samples, increasing the information exchanged between different images of the same category. A center loss function is also introduced to minimize the distance between each feature and its class center, making the learned features more discriminative. Together these measures improve the accuracy of fine-grained image recognition.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to a fine-grained image recognition method based on multi-view feature fusion.
Background
Fine-grained image classification assigns images to finer subclasses within an already-distinguished basic class, such as particular species of birds or breeds of dogs. The core problem is therefore to capture subtle inter-class differences and to fully mine the discriminative features of an image.
Fine-grained objects are ubiquitous in real life, and recognizing them is an important research topic in computer vision. Fine-grained image recognition currently faces three main challenges: (1) images of the same category can look very different because of variations in pose, background, and shooting angle; (2) different categories under the same parent class differ only in subtle regions, such as a bird's beak or tail; (3) collecting and annotating fine-grained images is time-consuming and labor-intensive. Examples are shown in fig. 1.
Existing methods pursue recognition mainly along three lines: (1) fine-grained image recognition based on localization-classification networks; (2) powerful deep models that directly learn more discriminative representations; (3) combining the global and local features of an image for fine-grained classification.
In prior art 1, bilinear pooling for fine-grained image classification extracts features with a pre-trained two-stream convolutional neural network and applies bilinear pooling across the channels of the two feature maps to obtain a high-order feature representation, strengthening the discriminative power of the features. This new pooling scheme improves fine-grained recognition accuracy.
However, the method does not address the relationships among fine-grained categories, the number of model parameters, or the number of detail regions. In other words, it ignores the rich detail information contained in fine-grained images and the combination of small inter-class and large intra-class variation.
In prior art 2, the multi-attention multi-class constraint network (MAMC) extracts multiple attention regions of an input image through a one-squeeze multi-excitation (OSME) module, then applies metric learning: the network is trained with a triplet loss and a softmax loss so that features of the same attention and class are pulled together while features of different attentions or classes are pushed apart. This strengthens the relationships among parts and improves fine-grained recognition accuracy.
The method relies on metric learning to reshape the sample distribution in feature space, so it adapts poorly to mining the visual differences between a pair of images. Moreover, the introduced loss function is complex, a large number of sample pairs must be constructed, and the model's parameter count grows substantially.
Disclosure of Invention
The invention provides a fine-grained image recognition method based on multi-view feature fusion that addresses the shortcomings of existing methods: neglected image detail information, poor adaptability to visual differences between images, and complex loss functions that inflate the model's parameter count.
A fine-grained image recognition method based on multi-view feature fusion is realized by the following steps:
step one, extracting bilinear features;
inputting the original image into a bilinear feature extraction network and fusing the feature maps output by different convolutional layers to obtain a bilinear feature vector; the feature extraction network adopts a network structure pre-trained on the ImageNet dataset;
step two, suppression branch learning, with the following specific process:
step 2.1, generating an attention map from the feature maps output by different convolutional layers of the feature extraction network of step one, using their values and a threshold;
step 2.2, generating a suppression mask from the attention map of step 2.1 and overlaying it on the original image to produce a suppressed image whose most salient local region is masked;
step 2.3, extracting bilinear features of the suppressed image of step 2.2 as in step one to obtain a bilinear feature vector, feeding it to a fully connected layer to obtain predicted class probabilities, and computing the multi-class cross entropy of these predictions;
step three, similar-sample comparison learning;
step 3.1, randomly selecting N other images of the same category as the original image as positive sample images;
step 3.2, feeding the target image and the positive sample images of step 3.1 into the feature extraction network of step one and fusing their bilinear feature vectors, obtaining a bilinear feature vector that merges the features of several same-class images;
step 3.3, averaging the bilinear feature vectors of the several same-class images from step 3.2 to obtain a fused feature vector, feeding it to a fully connected layer to obtain a predicted probability, and computing the multi-class cross entropy of the prediction;
step four, computing the center loss function $L_C$;
let $v_i$ be the bilinear feature of the $i$-th sample, $c_i$ the mean feature (class center) of all samples of the category of sample $i$, and $N$ the number of samples in the current batch; the center loss $L_C$ is then

$$L_C = \frac{1}{2N}\sum_{i=1}^{N}\lVert v_i - c_i\rVert_2^2;$$
step five, model optimization loss calculation;
the cross-entropy loss of the original image's bilinear feature vector, the cross-entropy loss of the suppressed image's bilinear feature vector, the cross-entropy loss of the fused feature, and the center loss are weighted and summed to obtain the model's optimization loss.
The beneficial effects of the invention are as follows. The method comprehensively accounts for the large intra-class variation, small inter-class variation, and heavy background noise of fine-grained images. It introduces a suppression branch that masks the most salient region of the image, forcing the network to seek subtle discriminative features between easily confused categories. A similar-sample comparison module fuses the feature vectors of same-class samples, increasing the information exchanged between different images of the same category. A center loss is also introduced to minimize the distance between each feature and its class center, making the learned features more discriminative.
Combining these points, the invention exploits both global and local features in its decision process, markedly improves performance on several fine-grained classification tasks, is more robust than existing methods, and is easy to deploy in practice. The accuracy of fine-grained image recognition is improved.
Drawings
FIG. 1 is a schematic diagram of 4 groups of existing fine-grained images, shown in FIG. 1a, FIG. 1b, FIG. 1c and FIG. 1d;
FIG. 2 is a schematic diagram of bilinear feature extraction in a fine-grained image recognition method based on multi-view feature fusion according to the present invention;
FIG. 3 is a schematic diagram of similar comparison learning in a fine-grained image recognition method based on multi-view feature fusion according to the present invention;
FIG. 4 is a schematic diagram of model optimization loss function calculation in a fine-grained image recognition method based on multi-view feature fusion according to the present invention;
fig. 5 is a feature visualization diagram obtained by the fine-grained image recognition method based on multi-view feature fusion according to the present invention.
Detailed Description
The embodiment is described with reference to fig. 2 to 5, and a fine-grained image recognition method based on multi-view feature fusion is implemented by the following steps:
Step one, bilinear feature extraction: an original image of fixed size is fed to a ResNet-50 pre-trained on ImageNet, and the feature maps output by different convolutional layers are fused into a bilinear feature vector.
In the feature extraction step, a network pre-trained on the ImageNet dataset serves as the backbone, and common image classification networks such as VGGNet, GoogLeNet, and ResNet can be fine-tuned to adapt the model to the specific task. Specifically, the original image is fed to the feature extraction network to obtain the feature maps of the last two convolutional layers, denoted $F_1 \in \mathbb{R}^{H \times W \times D_1}$ and $F_2 \in \mathbb{R}^{H \times W \times D_2}$, where $D_1$ and $D_2$ are the channel counts of the two features and $H$ and $W$ are the height and width of the feature maps. To keep the fused feature dimension manageable while retaining enough feature information, only $n$ randomly chosen channels of $F_2$ are fused with $F_1$. At each spatial position, the feature vectors of $F_1$ and the reduced $F_2$ along the channel axis are $f_1 \in \mathbb{R}^{D_1}$ and $f_2 \in \mathbb{R}^{n}$; their outer product gives the bilinear matrix $f_1 f_2^{\top} \in \mathbb{R}^{D_1 \times n}$. Summing the bilinear matrices over all positions of the feature map and flattening the result yields the bilinear vector $v \in \mathbb{R}^{D}$, where $D = D_1 \times n$. Bilinear vectors provide a stronger feature representation than a linear model.
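The bilinear fusion described above can be sketched in a few lines of numpy. This is a minimal illustration, not the patent's implementation: the backbone that produces the two feature maps is omitted, and the random channel selection is seeded only for reproducibility.

```python
import numpy as np

def bilinear_vector(f1, f2, n, seed=0):
    """Bilinear fusion of two conv feature maps (sketch of step one).

    f1: (H, W, D1) and f2: (H, W, D2) are the last two conv feature maps.
    n channels of f2 are sampled at random, then the per-position outer
    products are summed over all spatial locations and flattened.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(f2.shape[2], size=n, replace=False)  # random n channels of F2
    f2n = f2[:, :, idx]
    # sum over positions (x, y) of f1(x,y) f2n(x,y)^T  ->  (D1, n) bilinear matrix
    b = np.einsum('hwi,hwj->ij', f1, f2n)
    return b.reshape(-1)  # bilinear vector of length D1 * n
```

For feature maps with $D_1 = 512$ and $n = 128$, this yields a vector of dimension $65{,}536$, which is then fed to the fully connected classifier.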
Step two, a step of restraining branch learning:
A. note that the map generation step: and generating the attention drawing according to the size of the feature map and the threshold value.
B. And an image suppression step of generating a suppression mask according to the attention map and covering the suppression mask on the original image to generate a suppression image with a local area masked.
C. And (3) multi-classification cross entropy calculation: and (4) obtaining bilinear feature vectors from the inhibition image through the first step, inputting the bilinear feature vectors into the full-connection layer to obtain a prediction probability value, and calculating multi-classification cross entropy of the obtained class prediction value.
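The "predicted probability plus multi-class cross entropy" step recurs throughout (the original image, the suppressed image, and the fused feature all use it). A minimal sketch follows; the softmax over fully-connected-layer logits is an assumed construction, since the patent only speaks of predicted class probabilities.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multiclass_cross_entropy(logits, labels):
    """Mean multi-class cross entropy over a batch of logits."""
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels]))
```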
In the step of learning the suppression branch, the following three aspects are included:
step A, a characteristic diagram output by the convolution layer in the characteristic extraction networkIs averaged over the individual channels of pdSorting according to the average value, selecting the value of top-5 to calculate entropy:
attention A was constructed by comparing the entropy to the size of the threshold:
step B enlarges the attention map to the original image size, calculates the average value M thereof, sets the element larger than the threshold value in the attention map to 0 and the other elements to 1 with M × θ as the threshold value, thereby obtaining a suppression mask M:
calculating the average value m of the attention points, setting a threshold value theta in a range of 0-1,
step C, covering the inhibition mask on the original image, thereby obtaining an inhibition image with a local area masked:
Is(x,y)=I(x,y)*M(x,y)
in the formula, I (x, y) is the value of the (x, y) position in I in the original image.
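Step A's top-5 entropy can be sketched as follows. Normalizing the five largest channel means into a distribution before taking the entropy is an assumption; the translated text only states that the top-5 values are sorted and their entropy computed.

```python
import numpy as np

def top5_entropy(fmap):
    """Entropy of the five largest per-channel spatial means of fmap (H, W, D)."""
    p = fmap.mean(axis=(0, 1))   # p_d: spatial mean of each channel
    top5 = np.sort(p)[-5:]       # five largest channel means
    q = top5 / top5.sum()        # assumed normalization into a distribution
    return float(-(q * np.log(q)).sum())
```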
Suppressing the most salient regions of the image disperses the neural network's attention and forces it to learn discriminative information from other regions. This reduces the network's dependence on particular training samples, prevents overfitting, and further improves the model's robustness.
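Steps B and C can be sketched as below. The nearest-neighbour enlargement of the attention map is an assumed stand-in, as the patent does not fix the interpolation method; everything else follows the thresholding rule $M(x,y) = 0$ iff $A(x,y) > m\theta$.

```python
import numpy as np

def suppress_image(image, attention, theta=0.5):
    """Mask out the most salient region (steps B and C of the suppression branch)."""
    H, W = image.shape[:2]
    h, w = attention.shape
    ys = np.arange(H) * h // H           # nearest-neighbour index maps
    xs = np.arange(W) * w // W
    up = attention[np.ix_(ys, xs)]       # attention enlarged to image size
    m = up.mean()
    mask = (up <= m * theta).astype(image.dtype)  # 0 where A(x,y) > m*theta
    if image.ndim == 3:                  # broadcast over colour channels
        mask = mask[..., None]
    return image * mask
```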
Step three, learning by a similar comparison module:
A. an image sampling step: and randomly selecting other N images in the same category as positive samples.
B. And a characteristic fusion step, namely fusing the target image and the randomly sampled positive sample image by the bilinear characteristic vector obtained in the step one to obtain fusion characteristics, wherein the obtained fusion characteristics integrate the characteristic information of a plurality of images in the same category.
C. Calculating a fusion characteristic loss function: and directly inputting the fused feature vectors into a full-connection layer to obtain prediction probability, and calculating multi-class cross entropy of the obtained class prediction values.
With reference to fig. 3, step a randomly selects N images belonging to the same category as the input image, and all the N images are fed into the bilinear feature extraction network of step one.
Step B, averaging the bilinear feature vectors of the multiple images of the same type output in the step A to obtain a fused feature vector:
wherein j is the position of the characteristic vector, V (j) is the value of the characteristic vector at the jth position, and T is the number of the selected positive samples; vr(j) Is the value of the r-th positive sample at the j-th position;
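The fusion step reduces to a per-position average. A one-function sketch follows; including the target itself in the $1/(T+1)$ average is an assumption, since the text only says the target and positive images are fused by averaging.

```python
import numpy as np

def fuse_features(target_vec, positive_vecs):
    """Average the target's bilinear vector with those of T positive samples."""
    stack = np.vstack([target_vec] + list(positive_vecs))
    return stack.mean(axis=0)   # V(j) = (V_0(j) + sum_r V_r(j)) / (T + 1)
```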
Step four, center loss calculation:
A. Class center generation: the feature vector of each category's center, learned by the network, is updated continuously during training.
B. Center loss calculation: the distance between the bilinear feature vector of each input image and its class-center vector is taken as the center loss and minimized continuously during training.
In this embodiment, one feature vector is maintained per category as that category's class center and updated as training progresses. Penalizing the offset between each sample's bilinear feature vector and its class center pulls samples of the same class together while avoiding the costly construction of sample pairs. Let $v_i$ be the bilinear feature of the $i$-th sample, $c_i$ the mean feature (class center) of all samples of the category of sample $i$, and $N$ the number of samples in the current batch; then

$$L_C = \frac{1}{2N}\sum_{i=1}^{N}\lVert v_i - c_i\rVert_2^2.$$
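The center loss and the running center update can be sketched as below. The $\frac{1}{2N}$ form of the loss is the reconstruction given above; the exponential-moving-average update rule and its rate `alpha` are assumptions, since the patent only says the centers are continuously updated during training.

```python
import numpy as np

def center_loss(features, labels, centers):
    """L_C = (1/2N) * sum_i ||v_i - c_{y_i}||^2 over the current batch."""
    diffs = features - centers[labels]
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))

def update_centers(features, labels, centers, alpha=0.5):
    """Move each class center toward the batch mean of its samples (assumed rule)."""
    for c in np.unique(labels):
        batch_mean = features[labels == c].mean(axis=0)
        centers[c] = (1 - alpha) * centers[c] + alpha * batch_mean
    return centers
```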
Step five, model optimization loss calculation:
the cross-entropy loss of the original image's bilinear feature, the cross-entropy loss of the suppressed image's bilinear feature, the cross-entropy loss of the fused feature, and the center loss are weighted and summed to obtain the model's optimization loss.
With reference to fig. 4, denote the cross-entropy loss of the original image's bilinear feature vector as $L_{CE1}$, the cross-entropy loss of the suppressed image's bilinear feature vector as $L_{CE2}$, and the cross-entropy loss of the fused feature as $L_{CE3}$; together with the center loss $L_C$, the weighted sum gives the model's optimization loss $L$:

$$L = L_{CE1} + L_{CE2} + L_{CE3} + \lambda L_C,$$

where $\lambda$ is the weight of the center loss function.
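The combined objective is a one-line weighted sum; the sketch below makes the weighting explicit. The value of `lam` is illustrative only — the patent does not state a value for $\lambda$.

```python
def total_loss(l_ce1, l_ce2, l_ce3, l_center, lam=0.5):
    """L = L_CE1 + L_CE2 + L_CE3 + lambda * L_C (step five); lam is illustrative."""
    return l_ce1 + l_ce2 + l_ce3 + lam * l_center
```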
In fig. 5, the first row shows original images randomly selected from the dataset, the second row shows the class activation maps produced by the global branch for the original inputs, and the third row shows the class activation maps produced by the suppression branch. In the global branch the network learns the most salient regions of the image, such as a bird's beak or a car's headlights, while in the suppression branch it learns subtle features that aid fine-grained classification, such as a bird's torso or a car's wheels. Combining the two views gives the network model a more comprehensive basis for its decision: it captures both the most discriminative regions and the subtler fine-grained cues.
The fine-grained image recognition method of this embodiment introduces a new form of data augmentation: guided by the attention map, salient part regions of the image are suppressed, dispersing the network's attention so that it learns more complementary region features. The similar-sample comparison module fuses feature information from several images of the same category, so that same-class images lie as close as possible in the embedding space, improving classification performance.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (4)
1. A fine-grained image recognition method based on multi-view feature fusion, characterized in that the method is realized by the following steps:
step one, extracting bilinear features;
inputting the original image into a bilinear feature extraction network and fusing the feature maps output by different convolutional layers to obtain a bilinear feature vector; the feature extraction network adopts a network structure pre-trained on the ImageNet dataset;
step two, suppression branch learning, with the following specific process:
step 2.1, generating an attention map from the feature maps output by different convolutional layers of the feature extraction network of step one, using their values and a threshold;
step 2.2, generating a suppression mask from the attention map of step 2.1 and overlaying it on the original image to produce a suppressed image whose most salient local region is masked;
step 2.3, extracting bilinear features of the suppressed image of step 2.2 as in step one to obtain a bilinear feature vector, feeding it to a fully connected layer to obtain predicted class probabilities, and computing the multi-class cross entropy of these predictions;
step three, similar-sample comparison learning;
step 3.1, randomly selecting N other images of the same category as the original image as positive sample images;
step 3.2, feeding the target image and the positive sample images of step 3.1 into the feature extraction network of step one and fusing their bilinear feature vectors, obtaining a bilinear feature vector that merges the features of several same-class images;
step 3.3, averaging the bilinear feature vectors of the several same-class images from step 3.2 to obtain a fused feature vector, feeding it to a fully connected layer to obtain a predicted probability, and computing the multi-class cross entropy of the prediction;
step four, computing the center loss function $L_C$;
let $v_i$ be the bilinear feature of the $i$-th sample, $c_i$ the mean feature (class center) of all samples of the category of sample $i$, and $N$ the number of samples in the current batch; the center loss $L_C$ is then

$$L_C = \frac{1}{2N}\sum_{i=1}^{N}\lVert v_i - c_i\rVert_2^2;$$
step five, model optimization loss calculation;
the cross-entropy loss of the original image's bilinear feature vector, the cross-entropy loss of the suppressed image's bilinear feature vector, the cross-entropy loss of the fused feature, and the center loss are weighted and summed to obtain the model's optimization loss.
2. The fine-grained image recognition method based on multi-view feature fusion according to claim 1, characterized in that: in step 2.1, the specific process of generating the attention map is as follows:
for the feature map $F \in \mathbb{R}^{H \times W \times D}$ output by the last convolutional layer of the feature extraction network, where $D$ is the number of channels and $H$ and $W$ are the height and width of the feature map, compute the spatial mean $p_d$ of each channel, sort the channels by these means, and compute the entropy $E$ of the top-5 values:

$$E = -\sum_{d \in \text{top-5}} \hat p_d \log \hat p_d,$$

where $\hat p_d$ are the top-5 means normalized into a distribution; the attention map $A$ is constructed by comparing the entropy with a threshold and selecting among the channel-sorted two-dimensional feature maps $F_k$;
in step 2.2, the specific process of generating the suppression mask is as follows:
enlarge the attention map of step 2.1 to the original image size and compute its mean $m$; with a threshold $\theta$ set in the range 0–1, elements of the attention map larger than $m\theta$ are set to 0 and all others to 1, yielding the suppression mask $M$:

$$M(x, y) = \begin{cases} 0, & A(x, y) > m\theta \\ 1, & \text{otherwise,} \end{cases}$$

where $A(x, y)$ denotes the value of the attention map $A$ at position $(x, y)$;
overlay the suppression mask on the original image to obtain the suppressed image $I_s(x, y)$ with its local region masked:

$$I_s(x, y) = I(x, y) \cdot M(x, y),$$

where $I(x, y)$ is the value of the original image $I$ at position $(x, y)$.
3. The fine-grained image recognition method based on multi-view feature fusion according to claim 1, characterized in that: in step 3.3, the bilinear feature vectors of the several images of the same category are averaged to obtain the fused feature vector, expressed as:

$$V(j) = \frac{1}{T+1}\Bigl(V_0(j) + \sum_{r=1}^{T} V_r(j)\Bigr),$$

where $j$ indexes the positions of the feature vector, $V(j)$ is the value of the fused vector at position $j$, $T$ is the number of selected positive samples, $V_0(j)$ is the target image's bilinear vector at position $j$, and $V_r(j)$ is the value of the $r$-th positive sample at position $j$.
4. The fine-grained image recognition method based on multi-view feature fusion according to claim 1, characterized in that: in step five, the cross-entropy loss of the original image's bilinear feature vector $L_{CE1}$, the cross-entropy loss of the suppressed image's bilinear feature vector $L_{CE2}$, the cross-entropy loss of the fused feature $L_{CE3}$, and the center loss $L_C$ are weighted and summed to obtain the model's optimization loss $L$, finally realizing fine-grained image recognition, expressed as:

$$L = L_{CE1} + L_{CE2} + L_{CE3} + \lambda L_C,$$

where $\lambda$ is the weight of the center loss function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202010992253.7A (CN112132004B) | 2020-09-21 | 2020-09-21 | Fine granularity image recognition method based on multi-view feature fusion
Publications (2)
Publication Number | Publication Date |
---|---|
CN112132004A true CN112132004A (en) | 2020-12-25 |
CN112132004B CN112132004B (en) | 2024-06-25 |
Family
ID=73841694
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010992253.7A Active CN112132004B (en) | 2020-09-21 | 2020-09-21 | Fine granularity image recognition method based on multi-view feature fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112132004B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109685115A (en) * | 2018-11-30 | 2019-04-26 | 西北大学 | Fine-grained conceptual model and learning method based on bilinear feature fusion
CN110135502A (en) * | 2019-05-17 | 2019-08-16 | 东南大学 | Fine-grained image recognition method based on reinforcement learning strategy
CN110210550A (en) * | 2019-05-28 | 2019-09-06 | 东南大学 | Fine-grained image recognition method based on ensemble learning strategy
CN110222636A (en) * | 2019-05-31 | 2019-09-10 | 中国民航大学 | Pedestrian attribute recognition method based on background suppression
CN110807465A (en) * | 2019-11-05 | 2020-02-18 | 北京邮电大学 | Fine-grained image identification method based on channel loss function |
CN111523534A (en) * | 2020-03-31 | 2020-08-11 | 华东师范大学 | Image description method |
Non-Patent Citations (1)
Title |
---|
黄伟锋;张甜;常东良;闫冬;王嘉希;王丹;马占宇;: "Fine-grained image classification method based on multi-view fusion" (基于多视角融合的细粒度图像分类方法), Signal Processing (信号处理), no. 09 *
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112733912A (en) * | 2020-12-31 | 2021-04-30 | 华侨大学 | Fine-grained image recognition method based on multi-granularity adversarial loss
CN112733912B (en) * | 2020-12-31 | 2023-06-09 | 华侨大学 | Fine-grained image recognition method based on multi-granularity adversarial loss
CN112766378A (en) * | 2021-01-19 | 2021-05-07 | 北京工商大学 | Cross-domain small sample image classification model method focusing on fine-grained identification |
CN112712066A (en) * | 2021-01-19 | 2021-04-27 | 腾讯科技(深圳)有限公司 | Image recognition method and device, computer equipment and storage medium |
CN112766378B (en) * | 2021-01-19 | 2023-07-21 | 北京工商大学 | Cross-domain small sample image classification model method focusing on fine granularity recognition |
CN112800927A (en) * | 2021-01-25 | 2021-05-14 | 北京工业大学 | AM-Softmax loss-based butterfly image fine granularity identification method |
CN112800927B (en) * | 2021-01-25 | 2024-03-29 | 北京工业大学 | Butterfly image fine-granularity identification method based on AM-Softmax loss |
CN112990270B (en) * | 2021-02-10 | 2023-04-07 | 华东师范大学 | Automatic fusion method of traditional feature and depth feature |
CN112990270A (en) * | 2021-02-10 | 2021-06-18 | 华东师范大学 | Automatic fusion method of traditional feature and depth feature |
CN113065443A (en) * | 2021-03-25 | 2021-07-02 | 携程计算机技术(上海)有限公司 | Training method, recognition method, system, device and medium of image recognition model |
CN113255793B (en) * | 2021-06-01 | 2021-11-30 | 之江实验室 | Fine-grained ship identification method based on contrast learning |
CN113255793A (en) * | 2021-06-01 | 2021-08-13 | 之江实验室 | Fine-grained ship identification method based on contrast learning |
CN113449613A (en) * | 2021-06-15 | 2021-09-28 | 北京华创智芯科技有限公司 | Multitask long-tail distribution image recognition method, multitask long-tail distribution image recognition system, electronic device and medium |
CN113449613B (en) * | 2021-06-15 | 2024-02-27 | 北京华创智芯科技有限公司 | Multi-task long tail distribution image recognition method, system, electronic equipment and medium |
CN113642571A (en) * | 2021-07-12 | 2021-11-12 | 中国海洋大学 | Fine-grained image identification method based on saliency attention mechanism |
CN113642571B (en) * | 2021-07-12 | 2023-10-10 | 中国海洋大学 | Fine granularity image recognition method based on salient attention mechanism |
CN113705489A (en) * | 2021-08-31 | 2021-11-26 | 中国电子科技集团公司第二十八研究所 | Remote sensing image fine-grained airplane identification method based on priori regional knowledge guidance |
CN113705489B (en) * | 2021-08-31 | 2024-06-07 | 中国电子科技集团公司第二十八研究所 | Remote sensing image fine-granularity airplane identification method based on priori regional knowledge guidance |
CN114119979A (en) * | 2021-12-06 | 2022-03-01 | 西安电子科技大学 | Fine-grained image classification method based on segmentation mask and self-attention neural network |
CN114676777A (en) * | 2022-03-25 | 2022-06-28 | 中国科学院软件研究所 | Self-supervision learning fine-grained image classification method based on twin network |
CN115424086A (en) * | 2022-07-26 | 2022-12-02 | 北京邮电大学 | Multi-view fine-granularity identification method and device, electronic equipment and medium |
CN117725483A (en) * | 2023-09-26 | 2024-03-19 | 电子科技大学 | Supervised signal classification method based on neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112132004A (en) | Fine-grained image identification method based on multi-view feature fusion | |
CN112101150B (en) | Multi-feature fusion pedestrian re-identification method based on orientation constraint | |
CN113378632B (en) | Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method | |
CN108960140B (en) | Pedestrian re-identification method based on multi-region feature extraction and fusion | |
CN111881714B (en) | Unsupervised cross-domain pedestrian re-identification method | |
CN113516012B (en) | Pedestrian re-identification method and system based on multi-level feature fusion | |
Xie et al. | Multilevel cloud detection in remote sensing images based on deep learning | |
Lin et al. | RSCM: Region selection and concurrency model for multi-class weather recognition | |
CN108596211B (en) | Shielded pedestrian re-identification method based on centralized learning and deep network learning | |
Wang et al. | A survey of vehicle re-identification based on deep learning | |
Pasolli et al. | SVM active learning approach for image classification using spatial information | |
Awad et al. | Multicomponent image segmentation using a genetic algorithm and artificial neural network | |
CN107633226B (en) | Human body motion tracking feature processing method | |
CN114005096A (en) | Vehicle weight recognition method based on feature enhancement | |
CN109063649B (en) | Pedestrian re-identification method based on twin pedestrian alignment residual error network | |
CN114067143B (en) | Vehicle re-identification method based on double sub-networks | |
CN111639564B (en) | Video pedestrian re-identification method based on multi-attention heterogeneous network | |
CN113408492A (en) | Pedestrian re-identification method based on global-local feature dynamic alignment | |
CN110633708A (en) | Deep network significance detection method based on global model and local optimization | |
CN111709313B (en) | Pedestrian re-identification method based on local and channel combination characteristics | |
CN111274922A (en) | Pedestrian re-identification method and system based on multi-level deep learning network | |
CN109299668A (en) | Hyperspectral image classification method based on active learning and clustering | |
CN108596195B (en) | Scene recognition method based on sparse coding feature extraction | |
CN112329662B (en) | Multi-view saliency estimation method based on unsupervised learning | |
CN114037640A (en) | Image generation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||