CN112132004A - Fine-grained image identification method based on multi-view feature fusion - Google Patents

Fine-grained image identification method based on multi-view feature fusion

Info

Publication number
CN112132004A
Authority
CN
China
Prior art keywords
feature
bilinear
loss function
image
fine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010992253.7A
Other languages
Chinese (zh)
Other versions
CN112132004B (en)
Inventor
黄伟锋
张甜
常东良
马占宇
柳斐
王丹
刘念
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South To North Water Transfer Middle Route Information Technology Co ltd
Beijing University of Posts and Telecommunications
Original Assignee
South To North Water Transfer Middle Route Information Technology Co ltd
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South To North Water Transfer Middle Route Information Technology Co ltd and Beijing University of Posts and Telecommunications
Priority to CN202010992253.7A
Publication of CN112132004A
Application granted
Publication of CN112132004B
Status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; extraction of features in feature space; blind source separation
    • G06F 18/214 - Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/25 - Fusion techniques
    • G06F 18/253 - Fusion techniques of extracted features
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

A fine-grained image recognition method based on multi-view feature fusion relates to the technical field of image processing. It addresses the problems of existing fine-grained recognition methods: they ignore the detail information of images, adapt poorly to visual differences between images, and introduce complex loss functions that increase the model's parameter count. The method introduces a suppression branch that masks the most salient region of an image, forcing the network to seek subtle discriminative features among easily confused categories. A similar-comparison learning module fuses the feature vectors of same-class samples, increasing the interactive information among different images of the same category. A center loss function is also introduced to minimize the distance between each feature and its class center, making the learned features more discriminative. Together these measures improve the accuracy of fine-grained image recognition.

Description

Fine-grained image identification method based on multi-view feature fusion
Technical Field
The invention relates to the technical field of image processing, in particular to a fine-grained image identification method based on multi-view feature fusion.
Background
Fine-grained image classification performs finer subclass division within a basic category, for example distinguishing species of birds or breeds of dogs. The core problem is therefore to capture the subtle differences between classes and to fully mine the discriminative features of an image.
Fine-grained objects are ubiquitous in real life, and fine-grained image recognition is accordingly an important research topic in computer vision. It currently faces three main challenges, illustrated in FIG. 1: (1) images of the same category can look very different owing to variations in pose, background, and shooting angle; (2) different classes under the same parent class differ only in subtle regions, such as a bird's beak and tail; (3) collecting and labeling fine-grained images is time-consuming and labor-intensive.
Existing methods mainly pursue recognition along three lines: (1) fine-grained image recognition based on a localization-classification network; (2) learning more discriminative representations directly by developing powerful deep models; (3) combining the global and local features of an image to achieve fine-grained classification.
In prior art 1, bilinear pooling for fine-grained image classification (Bilinear Pooling) extracts features through a pair of pre-trained convolutional neural networks (bilinear CNNs) and applies bilinear pooling across the feature channels to obtain a high-order representation, strengthening the discriminative power of the features. Benefiting from this new pooling scheme, the method improves fine-grained recognition accuracy.
However, while the method contributes a new bilinear pooling scheme, it offers no effective design for the relationships among fine-grained categories, the number of model parameters, or the number of detail regions. That is, it does not account for the rich detail information in fine-grained images, the small inter-class differences, or the large intra-class differences.
In prior art 2, the Multi-Attention Multi-Class Constraint network extracts multiple attention regions of an input image through a one-squeeze multi-excitation (OSME) module and then introduces metric learning, training the network with a triplet loss and a softmax loss that pull together same-class features under the same attention and push apart features of different attentions or different classes. This strengthens the relationships among parts and improves fine-grained recognition accuracy.
This method relies mainly on metric learning to improve the sample distribution in feature space, so it adapts poorly when mining the visual differences within a pair of images. Moreover, the introduced loss function is complex, requires constructing a large number of sample pairs, and greatly increases the model's parameter count.
Disclosure of Invention
To solve the problems that existing fine-grained image recognition methods ignore the detail information of images, adapt poorly to visual differences between images, and introduce complex loss functions that increase the model's parameter count, the invention provides a fine-grained image recognition method based on multi-view feature fusion.
A fine-grained image recognition method based on multi-view feature fusion is realized by the following steps:
Step one, bilinear feature extraction:
inputting the original image into a bilinear feature extraction network and fusing the feature maps output by different convolutional layers to obtain a bilinear feature vector; the feature extraction network adopts a network structure pre-trained on the ImageNet dataset;
Step two, suppression branch learning, the specific process being as follows:
Step two-one, generating an attention map according to the size of the feature maps output by different convolutional layers of the feature extraction network in step one and a threshold;
Step two-two, generating a suppression mask from the attention map of step two-one and overlaying it on the original image to generate a suppressed image with the salient local region masked;
Step two-three, extracting bilinear features of the suppressed image obtained in step two-two by the method of step one to obtain a bilinear feature vector, inputting the bilinear feature vector into a fully connected layer to obtain predicted class probability values, and calculating a multi-class cross entropy from the predicted class probability values;
Step three, similar-comparison module learning:
Step three-one, randomly selecting T other images of the same category as the original image as positive sample images;
Step three-two, sending the target image and the positive sample images of step three-one into the feature extraction network of step one to obtain the bilinear feature vectors of the several same-class images;
Step three-three, averaging the bilinear feature vectors of the same-class images obtained in step three-two to obtain a fused feature vector, inputting the fused feature vector into a fully connected layer to obtain predicted probabilities, and calculating a multi-class cross entropy of the same-class predictions;
Step four, calculating a center loss function $L_C$:
let $v_i$ be the bilinear feature of the $i$-th sample and $c_i$ the mean feature of all samples of sample $i$'s category, i.e., the class center; with $N$ the number of samples in the current batch, the center loss function $L_C$ is given by:

$$L_C = \frac{1}{2} \sum_{i=1}^{N} \left\| v_i - c_i \right\|_2^2$$
Step five, calculating the model optimization loss function:
weighting and summing the cross-entropy loss function of the original image's bilinear feature vector, the cross-entropy loss function of the suppressed image's bilinear feature vector, the cross-entropy loss function of the fused feature, and the center loss function to obtain the model's optimization loss function.
The beneficial effects of the invention are as follows. The method comprehensively accounts for the large intra-class differences, small inter-class differences, and heavy background noise of fine-grained images. It introduces a suppression branch that masks the most salient region of an image, forcing the network to seek subtle discriminative features among easily confused categories. A similar-comparison learning module fuses the feature vectors of same-class samples, increasing the interactive information among different images of the same category. A center loss function is also introduced to minimize the distance between each feature and its class center, making the learned features more discriminative.
Taken together, the invention exploits both global and local features in its decision process, markedly improves performance on multiple fine-grained classification tasks, is more robust than existing methods, and is easy to deploy in practice. The accuracy of fine-grained image recognition is improved.
Drawings
FIG. 1 (comprising FIGS. 1a, 1b, 1c, and 1d) shows four groups of existing fine-grained images;
FIG. 2 is a schematic diagram of bilinear feature extraction in a fine-grained image recognition method based on multi-view feature fusion according to the present invention;
FIG. 3 is a schematic diagram of similar comparison learning in a fine-grained image recognition method based on multi-view feature fusion according to the present invention;
FIG. 4 is a schematic diagram of model optimization loss function calculation in a fine-grained image recognition method based on multi-view feature fusion according to the present invention;
FIG. 5 shows feature visualization results obtained by the fine-grained image recognition method based on multi-view feature fusion according to the present invention.
Detailed Description
This embodiment is described with reference to FIGS. 2 to 5. A fine-grained image recognition method based on multi-view feature fusion is realized by the following steps.
Step one, bilinear feature extraction: a ResNet-50 network structure pre-trained on ImageNet receives a fixed-size original image, and the feature maps output by different convolutional layers are fused to obtain a bilinear feature vector.
In the feature extraction step, a network pre-trained on the ImageNet dataset serves as the backbone, and common image classification networks such as VGGNet, GoogLeNet, or ResNet can be fine-tuned to adapt the model to the specific task. Specifically, the original image is input to the feature extraction network, and the feature maps output by the last two convolutional layers are recorded respectively as

$$F_1 \in \mathbb{R}^{D_1 \times H \times W}, \qquad F_2 \in \mathbb{R}^{D_2 \times H \times W}$$

where $D_1$ and $D_2$ are the channel counts of the two features, and $H$ and $W$ are the height and width of the feature maps. To avoid an excessively high feature dimension after fusion while keeping enough information in the generated feature vector, only the feature maps of $n$ randomly selected channels of $F_2$ are fused with $F_1$. Denote the feature vectors of $F_1$ and $F_2$ along the channel dimension at each position by

$$f_1(x, y) \in \mathbb{R}^{D_1}, \qquad f_2(x, y) \in \mathbb{R}^{D_2}$$

Multiplying the two feature vectors yields the bilinear matrix

$$B(x, y) = f_1(x, y)\, f_2(x, y)^{\mathsf{T}} \in \mathbb{R}^{D_1 \times D_2}$$

Adding the bilinear matrices of all positions in the feature map and spreading the result into a vector gives the bilinear vector

$$V = \operatorname{vec}\!\left(\sum_{x, y} B(x, y)\right) \in \mathbb{R}^{D}$$

where $D = D_1 \times D_2$ in the general case ($D_1 \times n$ after the random channel selection). The bilinear vector provides a stronger feature representation than a linear model.
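The bilinear pooling just described fits in a few lines. Below is a minimal sketch assuming PyTorch; the function name, the default n = 512, and the signed-square-root normalization at the end are illustrative assumptions rather than details fixed by the patent.

```python
import torch
import torch.nn.functional as F

def bilinear_pool(f1: torch.Tensor, f2: torch.Tensor, n: int = 512) -> torch.Tensor:
    """Fuse two convolutional feature maps into a bilinear vector.

    f1: (B, D1, H, W) and f2: (B, D2, H, W) are the feature maps of the
    last two convolutional layers, assumed spatially aligned.
    n is the number of randomly kept channels of F2, bounding the fused
    dimension at D1 * n.
    """
    B, D1, H, W = f1.shape
    idx = torch.randperm(f2.shape[1], device=f2.device)[:n]
    f2 = f2[:, idx]                                # keep n random channels of F2
    f1 = f1.reshape(B, D1, H * W)                  # channel vectors per position
    f2 = f2.reshape(B, f2.shape[1], H * W)
    # Summing the per-position outer products f1(x,y) f2(x,y)^T over all
    # positions equals a single batched matrix product.
    bilinear = torch.bmm(f1, f2.transpose(1, 2))   # (B, D1, n)
    v = bilinear.flatten(1)                        # spread the matrix into a vector
    # Signed sqrt + L2 normalization, common bilinear-CNN practice
    # (not mandated by the patent).
    v = torch.sign(v) * torch.sqrt(v.abs() + 1e-12)
    return F.normalize(v, dim=1)
```

For ResNet-50, f1 and f2 would typically be taken from the last two stages, with the smaller map upsampled (e.g., via F.interpolate) so the spatial sizes match before calling the function.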
Step two, suppression branch learning:
A. Attention map generation: generate an attention map from the feature map and a threshold.
B. Image suppression: generate a suppression mask from the attention map and overlay it on the original image to produce a suppressed image whose salient local region is masked.
C. Multi-class cross-entropy calculation: pass the suppressed image through step one to obtain its bilinear feature vector, input it into the fully connected layer to obtain predicted probability values, and calculate the multi-class cross entropy of the resulting class predictions.
The suppression branch learning step comprises the following three parts.

Step A: for a feature map output by a convolutional layer of the feature extraction network,

$$F \in \mathbb{R}^{D \times H \times W}$$

the average $p_d$ of each channel $d$ is computed and the channels are sorted by this average. The top-5 values are normalized and used to calculate an entropy:

$$E = -\sum_{k=1}^{5} \tilde{p}_k \log \tilde{p}_k, \qquad \tilde{p}_k = \frac{p_k}{\sum_{j=1}^{5} p_j}$$

The attention map $A$ is then constructed from $F_k$, the sorted per-channel feature maps, by comparing the entropy $E$ with a threshold.
Step B: the attention map is enlarged to the original image size and its average value $m$ is computed. Taking $m \times \theta$ as a threshold, with $\theta$ set in the range 0 to 1, the elements of the attention map larger than the threshold are set to 0 and all others to 1, yielding the suppression mask $M$:

$$M(x, y) = \begin{cases} 0, & A(x, y) > m\theta \\ 1, & \text{otherwise} \end{cases}$$

Step C: the suppression mask is overlaid on the original image, giving a suppressed image with the salient local region masked:

$$I_s(x, y) = I(x, y) \cdot M(x, y)$$

where $I(x, y)$ is the value of the original image $I$ at position $(x, y)$.
Suppressing the most salient regions of the image disperses the neural network's attention and forces it to learn discriminative information from other regions. This reduces the network's dependence on the training samples, prevents overfitting, and further improves the model's robustness.
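A compact sketch of steps A to C, again assuming PyTorch. Since the exact entropy-based construction of the attention map is only partially specified above, the sketch substitutes a simple stand-in (the mean of the five channels with the largest spatial averages); the thresholding and masking follow the formulas for M and I_s directly.

```python
import torch
import torch.nn.functional as F

def suppress_image(image: torch.Tensor, feature_map: torch.Tensor,
                   theta: float = 0.5) -> torch.Tensor:
    """Mask the most salient region of `image` (suppression branch).

    image:       (B, 3, H_img, W_img) original images.
    feature_map: (B, D, H, W) output of a late convolutional layer.
    theta:       threshold in the range 0-1, as in the patent.
    """
    B = feature_map.shape[0]
    p = feature_map.mean(dim=(2, 3))                     # channel averages p_d
    top5 = p.topk(5, dim=1).indices                      # top-5 channels per image
    att = torch.stack([feature_map[b, top5[b]].mean(0)   # stand-in attention map A
                       for b in range(B)]).unsqueeze(1)  # (B, 1, H, W)
    att = F.interpolate(att, size=image.shape[-2:],      # enlarge A to image size
                        mode='bilinear', align_corners=False)
    m = att.mean(dim=(2, 3), keepdim=True)               # average value m of A
    mask = (att <= m * theta).float()                    # M(x,y): 0 where A > m*theta
    return image * mask                                  # I_s = I * M
```

The suppressed image is then passed through the same bilinear feature extraction and fully connected classifier, and its cross entropy becomes the branch's training signal.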
Step three, similar-comparison module learning:
A. Image sampling: randomly select T other images of the same category as positive samples.
B. Feature fusion: fuse the bilinear feature vectors (obtained as in step one) of the target image and the randomly sampled positive images; the fused feature integrates the feature information of several images of the same category.
C. Fused-feature loss calculation: input the fused feature vector directly into the fully connected layer to obtain predicted probabilities, and calculate the multi-class cross entropy of the resulting class predictions.
With reference to FIG. 3, step A randomly selects T images belonging to the same category as the input image, and all of them are fed into the bilinear feature extraction network of step one.

Step B averages the bilinear feature vectors of the same-class images output in step A to obtain the fused feature vector:

$$\bar{V}(j) = \frac{1}{T + 1}\left(V(j) + \sum_{r=1}^{T} V_r(j)\right)$$

where $j$ indexes positions in the feature vector, $V(j)$ is the value of the target image's feature vector at the $j$-th position, $T$ is the number of selected positive samples, and $V_r(j)$ is the value of the $r$-th positive sample at the $j$-th position.
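The fusion itself is a plain average. A minimal sketch follows, under the reading that the target's own vector participates in the average (hence the T + 1 denominator):

```python
import torch

def fuse_same_class(v_target: torch.Tensor, v_pos: torch.Tensor) -> torch.Tensor:
    """Average the bilinear vectors of a target image and its positives.

    v_target: (B, D)    bilinear vector of the target image.
    v_pos:    (B, T, D) bilinear vectors of T same-class positive samples.
    Returns the (B, D) fused vector (V + sum_r V_r) / (T + 1).
    """
    T = v_pos.shape[1]
    return (v_target + v_pos.sum(dim=1)) / (T + 1)
```

Because the fused vector mixes evidence from several views of one category, the classifier trained on it sees less image-specific noise, which is the stated purpose of the similar-comparison module.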
Step four, center loss calculation:
A. Class-center generation: the feature vector of each category's center learned by the network is continuously updated during training.
B. Center loss calculation: the distance between the bilinear feature vector of each input image and its class-center vector is taken as the center loss and continuously minimized during training.

In this embodiment, one feature vector is maintained for each category as that category's class center and is continuously updated as training progresses. Penalizing the offset between each sample's bilinear feature vector and the center of its class pulls samples of the same class together while avoiding the complex construction of sample pairs. Let $v_i$ be the bilinear feature of the $i$-th sample and $c_i$ the mean feature of all samples of sample $i$'s category, i.e., the class center; with $N$ the number of samples in the current batch, the formula is as follows:

$$L_C = \frac{1}{2} \sum_{i=1}^{N} \left\| v_i - c_i \right\|_2^2$$
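A minimal PyTorch sketch of this center loss. Treating the class centers as a learnable parameter updated by the optimizer is one concrete way to realize the "continuously updated" centers described above; the original center-loss formulation instead updates them with a dedicated moving-average rule.

```python
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    """L_C = 1/2 * sum_i ||v_i - c_i||^2 over the current batch."""

    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        # One center vector per class, refined throughout training.
        self.centers = nn.Parameter(torch.zeros(num_classes, feat_dim))

    def forward(self, v: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # v: (N, D) bilinear features; labels: (N,) class indices.
        c = self.centers[labels]            # center of each sample's class
        return 0.5 * (v - c).pow(2).sum()
```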
Step five, model optimization loss calculation:
The cross-entropy loss of the original image's bilinear features, the cross-entropy loss of the suppressed image's bilinear features, the cross-entropy loss of the fused feature, and the center loss are weighted and summed to obtain the model's optimization loss.

With reference to FIG. 4, denote the cross-entropy loss of the original image's bilinear feature vector by $L_{CE1}$, the cross-entropy loss of the suppressed image's bilinear feature vector by $L_{CE2}$, the cross-entropy loss of the fused feature by $L_{CE3}$, and the center loss by $L_C$. Weighting and summing these loss functions gives the model's optimization loss $L$:

$$L = L_{CE1} + L_{CE2} + L_{CE3} + \lambda L_C$$

where $\lambda$ is the weight of the center loss function.
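Putting the four terms together, a sketch of the training objective. Unit weights on the three cross-entropy terms and the default lam = 0.5 are placeholder choices; the patent specifies only a weighted sum with λ on the center loss.

```python
import torch.nn.functional as F

def total_loss(logits_orig, logits_supp, logits_fused, labels,
               v_orig, center_loss_fn, lam: float = 0.5):
    """L = L_CE1 + L_CE2 + L_CE3 + lam * L_C (model optimization loss)."""
    l_ce1 = F.cross_entropy(logits_orig, labels)    # original-image branch
    l_ce2 = F.cross_entropy(logits_supp, labels)    # suppressed-image branch
    l_ce3 = F.cross_entropy(logits_fused, labels)   # fused-feature branch
    l_c = center_loss_fn(v_orig, labels)            # center loss on features
    return l_ce1 + l_ce2 + l_ce3 + lam * l_c
```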
In FIG. 5, the first row shows original images randomly selected from the dataset, the second row shows the class activation maps produced by the global branch for the original images, and the third row shows the class activation maps produced by the suppression branch. In the global branch the network learns the most salient regions of the image, such as a bird's beak or a car's headlights, while in the suppression branch it learns subtle features that aid fine-grained classification, such as a bird's torso or a car's wheels. Combining the multiple viewpoints gives the network model a more comprehensive basis for judgment, capturing both the most discriminative regions and the subtle fine-grained cues.
The fine-grained image recognition method of this embodiment introduces a new data augmentation scheme: guided by the attention map, salient part regions of the image are suppressed, dispersing the network's attention so that it learns more complementary regional feature information. It also introduces a similar-comparison module that fuses feature information from several images of the same category, so that same-class images are represented as closely as possible in the embedding space, improving classification performance.
The technical features of the embodiments described above may be combined arbitrarily. For brevity, not every possible combination is described; nevertheless, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The embodiments above express only several implementations of the invention, and while their description is specific and detailed, it should not be construed as limiting the scope of the invention. A person skilled in the art can make several variations and improvements without departing from the concept of the invention, and all of these fall within its protection scope. The protection scope of this patent is therefore subject to the appended claims.

Claims (4)

1. A fine-grained image recognition method based on multi-view feature fusion, characterized in that the method is realized by the following steps:
Step one, bilinear feature extraction:
inputting the original image into a bilinear feature extraction network and fusing the feature maps output by different convolutional layers to obtain a bilinear feature vector, the feature extraction network adopting a network structure pre-trained on the ImageNet dataset;
Step two, suppression branch learning, the specific process being as follows:
Step two-one, generating an attention map according to the size of the feature maps output by different convolutional layers of the feature extraction network in step one and a threshold;
Step two-two, generating a suppression mask from the attention map of step two-one and overlaying it on the original image to generate a suppressed image with the salient local region masked;
Step two-three, extracting bilinear features of the suppressed image obtained in step two-two by the method of step one to obtain a bilinear feature vector, inputting the bilinear feature vector into a fully connected layer to obtain predicted class probability values, and calculating a multi-class cross entropy from the predicted class probability values;
Step three, similar-comparison module learning:
Step three-one, randomly selecting T other images of the same category as the original image as positive sample images;
Step three-two, sending the target image and the positive sample images of step three-one into the feature extraction network of step one to obtain the bilinear feature vectors of the several same-class images;
Step three-three, averaging the bilinear feature vectors of the same-class images obtained in step three-two to obtain a fused feature vector, inputting the fused feature vector into a fully connected layer to obtain predicted probabilities, and calculating a multi-class cross entropy of the same-class predictions;
Step four, calculating a center loss function $L_C$:
let $v_i$ be the bilinear feature of the $i$-th sample and $c_i$ the mean feature of all samples of sample $i$'s category, i.e., the class center; with $N$ the number of samples in the current batch, the center loss function $L_C$ is given by:

$$L_C = \frac{1}{2} \sum_{i=1}^{N} \left\| v_i - c_i \right\|_2^2$$
Step five, calculating the model optimization loss function:
weighting and summing the cross-entropy loss function of the original image's bilinear feature vector, the cross-entropy loss function of the suppressed image's bilinear feature vector, the cross-entropy loss function of the fused feature, and the center loss function to obtain the model's optimization loss function.
2. The fine-grained image recognition method based on multi-view feature fusion according to claim 1, characterized in that in step two-one, the specific process of generating the attention map is as follows:
for the feature map output by the last convolutional layer of the feature extraction network,

$$F \in \mathbb{R}^{D \times H \times W}$$

the average $p_d$ of each channel $d$ is computed, where $D$ is the number of channels of the feature and $H$ and $W$ are respectively the height and width of the feature map; the channels are sorted by this average, and an entropy $E$ is obtained from the normalized top-5 averages according to the formula:

$$E = -\sum_{k=1}^{5} \tilde{p}_k \log \tilde{p}_k, \qquad \tilde{p}_k = \frac{p_k}{\sum_{j=1}^{5} p_j}$$

the attention map $A$ is then constructed from $F_k$, the two-dimensional feature maps of the channels sorted by their averages, by comparing the entropy $E$ with the threshold;
in step two-two, the specific process of generating the suppression mask is as follows:
the attention map from step two-one is enlarged to the size of the original image and its average value $m$ is calculated; with a threshold $\theta$ set in the range 0 to 1 and $m\theta$ taken as the cut-off, the elements of the attention map larger than $m\theta$ are set to 0 and all other elements to 1, yielding the suppression mask $M$:

$$M(x, y) = \begin{cases} 0, & A(x, y) > m\theta \\ 1, & \text{otherwise} \end{cases}$$

where $A(x, y)$ is the value of the attention map $A$ at position $(x, y)$;
covering the original image with the suppression mask yields the suppressed image $I_s(x, y)$ with the salient local region masked:

$$I_s(x, y) = I(x, y) \cdot M(x, y)$$

where $I(x, y)$ is the value of the original image $I$ at position $(x, y)$.
3. The fine-grained image recognition method based on multi-view feature fusion according to claim 1, characterized in that in step three-three, the bilinear feature vectors of the several same-class images are averaged to obtain the fused feature vector, expressed by the following formula:

$$\bar{V}(j) = \frac{1}{T + 1}\left(V(j) + \sum_{r=1}^{T} V_r(j)\right)$$

where $j$ indexes positions in the feature vector, $V(j)$ is the value of the target image's feature vector at the $j$-th position, $T$ is the number of selected positive samples, and $V_r(j)$ is the value of the $r$-th positive sample at the $j$-th position.
4. The fine-grained image recognition method based on multi-view feature fusion according to claim 1, characterized in that in step five, the cross-entropy loss function of the original image's bilinear feature vector $L_{CE1}$, the cross-entropy loss function of the suppressed image's bilinear feature vector $L_{CE2}$, the cross-entropy loss function of the fused feature $L_{CE3}$, and the center loss function $L_C$ are weighted and summed to obtain the model's optimization loss function $L$, finally realizing fine-grained image recognition, as expressed by the formula:

$$L = L_{CE1} + L_{CE2} + L_{CE3} + \lambda L_C$$

where $\lambda$ is the weight of the center loss function.
CN202010992253.7A 2020-09-21 2020-09-21 Fine granularity image recognition method based on multi-view feature fusion Active CN112132004B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010992253.7A CN112132004B (en) 2020-09-21 2020-09-21 Fine granularity image recognition method based on multi-view feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010992253.7A CN112132004B (en) 2020-09-21 2020-09-21 Fine granularity image recognition method based on multi-view feature fusion

Publications (2)

Publication Number Publication Date
CN112132004A (en) 2020-12-25
CN112132004B CN112132004B (en) 2024-06-25

Family

ID=73841694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010992253.7A Active CN112132004B (en) 2020-09-21 2020-09-21 Fine granularity image recognition method based on multi-view feature fusion

Country Status (1)

Country Link
CN (1) CN112132004B (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685115A (en) * 2018-11-30 2019-04-26 西北大学 A kind of the fine granularity conceptual model and learning method of bilinearity Fusion Features
CN110135502A (en) * 2019-05-17 2019-08-16 东南大学 A kind of image fine granularity recognition methods based on intensified learning strategy
CN110210550A (en) * 2019-05-28 2019-09-06 东南大学 Image fine granularity recognition methods based on integrated study strategy
CN110222636A (en) * 2019-05-31 2019-09-10 中国民航大学 The pedestrian's attribute recognition approach inhibited based on background
CN110807465A (en) * 2019-11-05 2020-02-18 北京邮电大学 Fine-grained image identification method based on channel loss function
CN111523534A (en) * 2020-03-31 2020-08-11 华东师范大学 Image description method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
黄伟锋; 张甜; 常东良; 闫冬; 王嘉希; 王丹; 马占宇: "基于多视角融合的细粒度图像分类方法" (Fine-grained image classification method based on multi-view fusion), 信号处理 (Journal of Signal Processing), no. 09

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112733912A (en) * 2020-12-31 2021-04-30 华侨大学 Fine-grained image recognition method based on multi-grained countermeasure loss
CN112733912B (en) * 2020-12-31 2023-06-09 华侨大学 Fine granularity image recognition method based on multi-granularity countering loss
CN112766378A (en) * 2021-01-19 2021-05-07 北京工商大学 Cross-domain small sample image classification model method focusing on fine-grained identification
CN112712066A (en) * 2021-01-19 2021-04-27 腾讯科技(深圳)有限公司 Image recognition method and device, computer equipment and storage medium
CN112766378B (en) * 2021-01-19 2023-07-21 北京工商大学 Cross-domain small sample image classification model method focusing on fine granularity recognition
CN112800927A (en) * 2021-01-25 2021-05-14 北京工业大学 AM-Softmax loss-based butterfly image fine granularity identification method
CN112800927B (en) * 2021-01-25 2024-03-29 北京工业大学 Butterfly image fine-granularity identification method based on AM-Softmax loss
CN112990270B (en) * 2021-02-10 2023-04-07 华东师范大学 Automatic fusion method of traditional feature and depth feature
CN112990270A (en) * 2021-02-10 2021-06-18 华东师范大学 Automatic fusion method of traditional feature and depth feature
CN113065443A (en) * 2021-03-25 2021-07-02 携程计算机技术(上海)有限公司 Training method, recognition method, system, device and medium of image recognition model
CN113255793B (en) * 2021-06-01 2021-11-30 之江实验室 Fine-grained ship identification method based on contrast learning
CN113255793A (en) * 2021-06-01 2021-08-13 之江实验室 Fine-grained ship identification method based on contrast learning
CN113449613A (en) * 2021-06-15 2021-09-28 北京华创智芯科技有限公司 Multitask long-tail distribution image recognition method, multitask long-tail distribution image recognition system, electronic device and medium
CN113449613B (en) * 2021-06-15 2024-02-27 北京华创智芯科技有限公司 Multi-task long tail distribution image recognition method, system, electronic equipment and medium
CN113642571A (en) * 2021-07-12 2021-11-12 中国海洋大学 Fine-grained image identification method based on saliency attention mechanism
CN113642571B (en) * 2021-07-12 2023-10-10 中国海洋大学 Fine granularity image recognition method based on salient attention mechanism
CN113705489A (en) * 2021-08-31 2021-11-26 中国电子科技集团公司第二十八研究所 Remote sensing image fine-grained airplane identification method based on priori regional knowledge guidance
CN113705489B (en) * 2021-08-31 2024-06-07 中国电子科技集团公司第二十八研究所 Remote sensing image fine-granularity airplane identification method based on priori regional knowledge guidance
CN114119979A (en) * 2021-12-06 2022-03-01 西安电子科技大学 Fine-grained image classification method based on segmentation mask and self-attention neural network
CN114676777A (en) * 2022-03-25 2022-06-28 中国科学院软件研究所 Self-supervision learning fine-grained image classification method based on twin network
CN115424086A (en) * 2022-07-26 2022-12-02 北京邮电大学 Multi-view fine-granularity identification method and device, electronic equipment and medium
CN117725483A (en) * 2023-09-26 2024-03-19 电子科技大学 Supervised signal classification method based on neural network

Also Published As

Publication number Publication date
CN112132004B (en) 2024-06-25

Similar Documents

Publication Publication Date Title
CN112132004A (en) Fine-grained image identification method based on multi-view feature fusion
CN112101150B (en) Multi-feature fusion pedestrian re-identification method based on orientation constraint
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN108960140B (en) Pedestrian re-identification method based on multi-region feature extraction and fusion
CN111881714B (en) Unsupervised cross-domain pedestrian re-identification method
CN113516012B (en) Pedestrian re-identification method and system based on multi-level feature fusion
Xie et al. Multilevel cloud detection in remote sensing images based on deep learning
Lin et al. RSCM: Region selection and concurrency model for multi-class weather recognition
CN108596211B (en) Shielded pedestrian re-identification method based on centralized learning and deep network learning
Wang et al. A survey of vehicle re-identification based on deep learning
Pasolli et al. SVM active learning approach for image classification using spatial information
Awad et al. Multicomponent image segmentation using a genetic algorithm and artificial neural network
CN107633226B (en) Human body motion tracking feature processing method
CN114005096A (en) Vehicle weight recognition method based on feature enhancement
CN109063649B (en) Pedestrian re-identification method based on twin pedestrian alignment residual error network
CN114067143B (en) Vehicle re-identification method based on double sub-networks
CN111639564B (en) Video pedestrian re-identification method based on multi-attention heterogeneous network
CN113408492A (en) Pedestrian re-identification method based on global-local feature dynamic alignment
CN110633708A (en) Deep network significance detection method based on global model and local optimization
CN111709313B (en) Pedestrian re-identification method based on local and channel combination characteristics
CN111274922A (en) Pedestrian re-identification method and system based on multi-level deep learning network
CN109299668A (en) A kind of hyperspectral image classification method based on Active Learning and clustering
CN108596195B (en) Scene recognition method based on sparse coding feature extraction
CN112329662B (en) Multi-view saliency estimation method based on unsupervised learning
CN114037640A (en) Image generation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant