CN114140657A - Image retrieval method based on multi-feature fusion - Google Patents

Image retrieval method based on multi-feature fusion

Info

Publication number: CN114140657A
Authority: CN (China)
Prior art keywords: feature, image, features, target image, vector
Legal status: Pending
Application number: CN202111017516.3A
Other languages: Chinese (zh)
Inventors: 张华熊, 江宁远
Current assignee: Zhejiang Sci Tech University ZSTU
Original assignee: Zhejiang Sci Tech University ZSTU
Application filed by Zhejiang Sci Tech University ZSTU
Priority to CN202111017516.3A (priority date 2021-08-30)
Publication of CN114140657A (2022-03-04)


Classifications

    • G06F18/253 — Pattern recognition; Analysing; Fusion techniques of extracted features (G: Physics; G06: Computing, Calculating or Counting; G06F: Electric Digital Data Processing)
    • G06F16/53 — Information retrieval of still image data; Querying
    • G06F18/2135 — Feature extraction, e.g. by transforming the feature space; based on approximation criteria, e.g. principal component analysis
    • G06N3/08 — Computing arrangements based on biological models; Neural networks; Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an image retrieval method based on multi-feature fusion, which extracts the content features of an image by fusing two complementary feature types at different levels, namely shallow visual features and deep learning features. The method describes image features accurately and improves the reliability and robustness of image retrieval. The designed fusion feature combines the geometric invariance of shallow visual features with the high-level semantic characteristics of deep learning features, and is therefore superior to traditional features and to any single feature. PCA dimension reduction is applied to the fusion feature, so the resulting feature dimension is low, which gives the method great advantages in feature-comparison speed and feature-storage space. The multi-feature fusion scheme is simple, the retrieval process is efficient, and the retrieval accuracy is high.

Description

Image retrieval method based on multi-feature fusion
Technical Field
The invention belongs to the technical field of image retrieval, and particularly relates to an image retrieval method based on multi-feature fusion.
Background
Image retrieval is one of the key research topics in the fields of information retrieval and machine vision. It refers to the process by which the user of a retrieval system searches an image database of a certain scope for the images he or she requires. Image retrieval techniques can be classified into two categories according to how images are described: text-based image retrieval and content-based image retrieval. Text-based image retrieval describes image content through manual text annotation, so that retrieval is realized by keyword search; it suffers from the huge workload of manual annotation, strong subjectivity, and the inability of text annotation to fully cover the content of an image. Content-based image retrieval, in contrast, starts from the content of the image itself and thus effectively overcomes the ambiguity inherent in text annotation.
The content features of an image can currently be divided into shallow visual features and deep learning features. Shallow visual features mainly refer to the visual content expressed by an image and generally comprise global features such as color, texture, and shape, as well as local features such as SIFT. SIFT local features are invariant to image rotation, scale change, brightness change, and other transformations, and are therefore widely applied in the field of computer vision. Deep learning features are image features extracted by a deep neural network, which can autonomously learn complex feature representations of images through data-driven training and extract high-level semantic information. Compared with shallow visual features, they effectively reduce the errors caused by the 'semantic gap' and achieve better retrieval results.
In [Babenko A, Slesarev A, Chigorin A, et al. Neural codes for image retrieval[C]//European Conference on Computer Vision. Springer, Cham, 2014: 584-599] it is proposed to extract image features from a fully connected layer of a CNN model pre-trained on ImageNet and use them in image retrieval scenarios; good results are obtained, but fully connected layer features lack a degree of geometric invariance, so certain problems remain in image retrieval. These shortcomings of retrieval methods based on single-feature extraction have promoted research on image retrieval methods based on multi-feature fusion. A coupled multi-index, in which complementary features such as SIFT and color are coupled at the index level, is proposed in [Zheng L, Wang S, Liu Z, et al. Packing and padding: Coupled multi-index for accurate image retrieval[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014: 1939-1946].
Existing image retrieval methods based on multi-feature fusion therefore still have drawbacks: the fusion pipeline is complex, feature extraction and fusion take more time, and the use and fusion of multiple features increases the dimensionality of the image features, which greatly increases retrieval time.
Disclosure of Invention
In view of the above, the present invention provides an image retrieval method based on multi-feature fusion, which extracts the content features of an image by fusing two complementary feature types at different levels, namely shallow visual features and deep learning features. The method describes image features accurately and improves the reliability and robustness of image retrieval.
An image retrieval method based on multi-feature fusion comprises the following steps:
(1) SIFT feature extraction is carried out on the target image, and the SIFT features are coded by utilizing a pre-trained visual dictionary and serve as the shallow visual features of the target image;
(2) inputting the preprocessed target image into a pre-trained Resnet50 neural network to extract convolutional layer characteristics as deep learning characteristics of the target image;
(3) respectively carrying out L2 norm normalization on the shallow visual feature and the deep learning feature of the target image, then carrying out weighted concatenation of the normalized features combined with PCA (Principal Component Analysis) dimension reduction, thereby obtaining the fusion feature of the target image;
(4) comparing the fusion feature of the target image with all image feature vectors in the feature library, and finally obtaining the retrieval result by means of query expansion.
Further, the pre-training process of the visual dictionary in step (1) is as follows: SIFT feature vectors are extracted from each image in an image data set, the set of feature vectors is clustered with the K-means clustering algorithm and finally divided into a number of clusters, and the cluster center of each cluster is regarded as a visual word of the visual dictionary.
Further, in step (1) the SIFT features of the target image are encoded with a local feature encoding algorithm that aggregates the SIFT local features by multi-neighbor soft assignment; the membership degree between a SIFT feature vector and its n neighbor visual words is calculated from a distance ratio as follows:
$$u_{ij} = \frac{\exp\left(-\beta \left\lVert x_i - b_j \right\rVert_2^2\right)}{\sum_{k=1}^{n} \exp\left(-\beta \left\lVert x_i - b_k \right\rVert_2^2\right)}$$
wherein: x_i is a SIFT feature vector of the target image, n is the number of neighbor visual words assigned to x_i, b_j is the j-th assigned neighbor visual word, u_ij is the membership degree of x_i with respect to the neighbor visual word b_j, and β is a smoothing factor controlling the rate of change of the function.
Further, step (2) is specifically implemented as follows: the target image is first scaled to 224 × 224 pixels and mean-subtracted; the processed image is then input into the pre-trained Resnet50 neural network, the feature map output by the 5th convolutional stage of the network is extracted, and the feature map is aggregated into a one-dimensional feature vector that serves as the deep learning feature of the target image.
Further, the pre-training process of the Resnet50 neural network is as follows: the Resnet50 network is initialized with weight parameters trained on the ImageNet data set and then transfer-trained on the image data set, i.e., Resnet50 is trained as a softmax classifier; during training, a cross-entropy loss function and a mini-batch optimizer are used to transfer-train the network batch by batch through forward propagation and backward propagation.
Further, the feature map is aggregated into a one-dimensional feature vector using RMAC (Regional Maximum Activation of Convolutions) coding, which is implemented as follows: for each two-dimensional channel of the feature map, uniform sampling is first performed with a multi-scale sliding-window strategy, where the side length of the square window at the l-th scale is 2 × min(W, H)/(l + 1), W and H being the width and height of the feature map; the square windows slide over the feature map with an overlap of no less than 40% between adjacent windows. The maximum feature responses of all local regions extracted at every scale are then summed to obtain the RMAC feature value of that channel. Finally, the RMAC feature values of all channels are assembled into a one-dimensional vector, which serves as the deep learning feature of the target image.
Further, step (4) is specifically implemented as follows: the fusion feature of the target image is first taken as the initial query vector F_0, its similarity to all image feature vectors in the feature library is computed, and the k most similar image feature vectors {F_1, F_2, …, F_k} are found; the mean F_avg of F_0 and {F_1, F_2, …, F_k} is then calculated by the following formula and taken as the new query vector; finally, the similarity between the new query vector and all image feature vectors in the feature library is computed, and the k most similar image feature vectors are returned as the final retrieval result;
$$F_{avg} = \frac{1}{k+1}\left(F_0 + \sum_{m=1}^{k} F_m\right)$$
wherein: k is a positive integer set by the user.
Based on the above technical scheme, the invention has the following beneficial technical effects:
1. By combining the geometric invariance of shallow visual features with the high-level semantic characteristics of deep learning features, the fusion feature designed in the invention is superior to traditional features and to single features.
2. PCA dimension reduction is applied to the fusion feature, so the resulting feature dimension is low, which gives great advantages in feature-comparison speed and feature-storage space.
3. The multi-feature fusion scheme designed in the invention is simple, the retrieval process is efficient, and the retrieval accuracy is high.
Drawings
FIG. 1 is a flowchart illustrating steps of an image retrieval method according to the present invention.
Fig. 2 is a schematic diagram of RMAC feature encoding in the present invention.
Fig. 3 shows an example of similar-image retrieval results obtained by the method of the present invention.
Detailed Description
In order to more specifically describe the present invention, the following detailed description is provided for the technical solution of the present invention with reference to the accompanying drawings and the specific embodiments.
As shown in fig. 1, the image retrieval method based on multi-feature fusion of the present invention includes the following steps:
(1) Feature extraction.
The invention extracts shallow visual features and deep learning features from the image to be retrieved.
For shallow visual features: SIFT feature vectors are first extracted from the image set; the set of feature vectors is clustered with the K-means clustering algorithm and finally divided into a number of clusters; each cluster center is regarded as a visual word, which yields the visual dictionary.
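By way of illustration, the dictionary construction can be sketched in a few lines of Python with OpenCV and scikit-learn; the dictionary size and function name here are illustrative assumptions, not values fixed by the invention:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def build_visual_dictionary(image_paths, num_words=1000):
    """Cluster SIFT descriptors of an image set; each cluster centre is a visual word."""
    sift = cv2.SIFT_create()
    descriptors = []
    for path in image_paths:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, desc = sift.detectAndCompute(img, None)
        if desc is not None:
            descriptors.append(desc)
    all_desc = np.vstack(descriptors)                  # (total_keypoints, 128)
    kmeans = KMeans(n_clusters=num_words, n_init=4, random_state=0).fit(all_desc)
    return kmeans.cluster_centers_                     # (num_words, 128)
```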
Then the SIFT local features of the query image are aggregated by multi-neighbor soft assignment, and the membership degree between each SIFT feature vector and its n neighbor visual words is calculated from a distance ratio:
$$u_{ij} = \frac{\exp\left(-\beta \left\lVert x_i - b_j \right\rVert_2^2\right)}{\sum_{k=1}^{n} \exp\left(-\beta \left\lVert x_i - b_k \right\rVert_2^2\right)}$$
wherein: u_ij denotes the membership degree of local feature x_i with respect to visual word b_j, β is a smoothing factor controlling the rate of change of the function, ‖x_i − b_j‖_2 denotes the Euclidean distance between local feature x_i and visual word b_j, and n denotes the number of neighbor visual words to which x_i is assigned.
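A minimal numpy sketch of this encoding step is given below; the exponential distance-ratio form of u_ij follows the formula above, while the parameter values (n = 5 neighbors, β = 0.1) are illustrative assumptions:

```python
import numpy as np

def soft_assign_bow(sift_descs, dictionary, n_neighbors=5, beta=0.1):
    """sift_descs: (m, 128) local features; dictionary: (K, 128) visual words.
    Returns an L2-normalized bag-of-words histogram of length K."""
    hist = np.zeros(dictionary.shape[0])
    for x in sift_descs:
        d2 = np.sum((dictionary - x) ** 2, axis=1)   # squared Euclidean distances
        nn = np.argsort(d2)[:n_neighbors]            # n nearest visual words
        u = np.exp(-beta * d2[nn])
        hist[nn] += u / u.sum()                      # memberships u_ij sum to 1
    return hist / (np.linalg.norm(hist) + 1e-12)
```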
For deep learning features: Resnet50 is first initialized with weight parameters trained on the ImageNet data set; transfer training is then performed on the target image set, with the Resnet50 network trained as a softmax classifier; during training, a cross-entropy loss function and a mini-batch optimizer are used to transfer-train the network batch by batch through forward propagation and backward propagation.
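The transfer-training step can be sketched as follows in PyTorch; mini-batch SGD with momentum stands in for the unspecified mini-batch optimizer, and the loader, class count, and hyper-parameters are assumptions of the sketch:

```python
import torch
import torch.nn as nn
from torchvision import models

def finetune_resnet50(train_loader, num_classes, epochs=10):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Initialize with ImageNet weights, replace the head with a new classifier.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()                  # softmax + cross entropy
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for images, labels in train_loader:            # mini-batch training
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)    # forward propagation
            loss.backward()                            # backward propagation
            optimizer.step()
    return model
```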
The query image is then scaled to 224 × 224 pixels, mean-subtracted, and input into the transfer-learned Resnet50 network; the feature map of the Conv5 stage is extracted, and the three-dimensional feature map is aggregated into a one-dimensional feature vector.
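A sketch of the preprocessing and Conv5 extraction follows; the normalization constants are the common ImageNet convention (the invention only specifies 224 × 224 scaling and mean subtraction), so treat them as an assumption:

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet means
                         std=[0.229, 0.224, 0.225]),
])

def extract_conv5(model, image_path):
    # Drop the average-pooling and fully connected layers, keeping conv1..layer4
    # (layer4 is the 5th convolutional stage, i.e. Conv5).
    backbone = nn.Sequential(*list(model.children())[:-2]).eval()
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        fmap = backbone(x)               # (1, 2048, 7, 7) for ResNet-50
    return fmap.squeeze(0).numpy()       # (2048, 7, 7)
```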
The aggregation uses RMAC feature coding. A sliding-window strategy first samples the feature map Ω uniformly: the side length of the square region at the 1st scale is min(W, H), the side length at the 2nd scale is 2 × min(W, H)/3, and in general the side length at the L-th scale is 2 × min(W, H)/(L + 1); the square regions slide over the feature map Ω with an overlap of no less than 40% between adjacent regions. Fig. 2 shows the region selection process at scales 1 to 3: 2 regions are extracted from the feature map when L = 1, 6 regions when L = 2, and 12 regions when L = 3, so 20 local regions in total are extracted from a single feature map.
The global maximum and the 20 regional maxima are then summed to obtain the RMAC feature of a single feature map, as follows:

$$F_i = f_{i,0} + \sum_{r=1}^{R} f_{i,r}$$

wherein R represents the number of local regions on the feature map, f_{i,0} represents the global maximum response of the i-th channel, and f_{i,r} represents the maximum feature response of region r in the i-th channel.
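A simplified numpy sketch of this aggregation is given below. It keeps the stated ingredients (multi-scale square windows with side 2 × min(W, H)/(l + 1), roughly 40% overlap, per-channel maxima summed with the global maximum); the exact window placement is an assumption of the sketch:

```python
import numpy as np

def _positions(dim, side):
    """Window offsets along one axis with roughly 40% overlap, covering the edges."""
    if dim <= side:
        return [0]
    n = int(np.ceil((dim - side) / (side * 0.6))) + 1
    return sorted({int(round(p)) for p in np.linspace(0, dim - side, n)})

def rmac(fmap, scales=(1, 2, 3)):
    """fmap: (C, H, W) conv feature map -> (C,) RMAC vector."""
    C, H, W = fmap.shape
    out = fmap.reshape(C, -1).max(axis=1)              # global maximum per channel
    for l in scales:
        side = max(1, int(round(2 * min(W, H) / (l + 1))))
        for top in _positions(H, side):
            for left in _positions(W, side):
                region = fmap[:, top:top + side, left:left + side]
                out = out + region.reshape(C, -1).max(axis=1)
    return out
```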
(2) Feature fusion.
The invention normalizes the shallow visual feature and the deep learning feature of the query image separately with the L2 norm. Suppose a feature vector X = (x_1, x_2, …, x_n).
First the L2 norm of the vector X is calculated:

$$\lVert X \rVert_2 = \sqrt{\sum_{i=1}^{n} x_i^2}$$
each dimension of vector X is then divided by | X |2To obtain a new normalized vector X' of L2 norm, i.e.:
Figure BDA0003236407610000063
The L2-normalized shallow visual feature and deep learning feature are then concatenated with weights, the fusion being:
$$F = [\gamma_1 F_B, \gamma_2 F_R]$$
wherein: fBSIFT-BOW shallow visual feature, F, for multi-neighbor soft allocationRResnet50 deep convolution feature, γ, for transfer learning1、γ2Are weight parameters of both, respectively, and γ12=1。
Finally, PCA (principal component analysis) dimension reduction is applied to the fused feature to remove redundant information and obtain the final fusion feature. PCA maps the original n-dimensional feature onto k dimensions, where the k dimensions are completely new orthogonal features, also called principal components. The main transformation steps, sketched in code after the list, are as follows:
first, the original data X (m, n) is de-averaged, i.e. the average value is subtracted from each feature dimension.
Secondly, the covariance matrix C(n, n) is calculated, together with its eigenvalues and eigenvectors.
And thirdly, selecting eigenvectors corresponding to the largest k eigenvalues in the covariance matrix C to form a dimension reduction matrix T (n, k).
Fourthly, by the formula X_new = X × T, the obtained X_new is the principal-component data of the original data X reduced to k dimensions.
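The four steps translate directly into numpy as below (in practice sklearn.decomposition.PCA fitted on the image library gives the same result; the function name is illustrative):

```python
import numpy as np

def pca_reduce(X, k):
    """X: (m, n) fused features of m images -> (m, k) principal components."""
    X_centered = X - X.mean(axis=0)                 # step 1: de-averaging
    C = np.cov(X_centered, rowvar=False)            # step 2: covariance matrix C(n, n)
    eigvals, eigvecs = np.linalg.eigh(C)            # its eigenvalues and eigenvectors
    T = eigvecs[:, np.argsort(eigvals)[::-1][:k]]   # step 3: top-k eigenvectors -> T(n, k)
    return X_centered @ T                           # step 4: X_new = X x T
```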
(3) Feature retrieval.
The fusion feature of the query image is compared against the vector library using Euclidean distance, and the results are retrieved and re-ranked with an average query expansion strategy to improve retrieval accuracy, yielding the final retrieval result. The average query expansion strategy is executed as follows:
① The content feature vector F_0 of the query image is extracted; F_0 is compared for similarity with the feature vectors in the feature library, and the k nearest feature vectors F = {F_1, F_2, …, F_k} are found.
② The mean F_avg of F_0 and the feature vector set F = {F_1, F_2, …, F_k} is calculated as:

$$F_{avg} = \frac{1}{k+1}\left(F_0 + \sum_{m=1}^{k} F_m\right)$$

wherein: F_0 is the initial query vector and F_m is the feature vector of the m-th result of the initial query.
③ Finally, the mean F_avg is used as the new query feature, and the query is executed again to obtain the final retrieval result (a code sketch of the whole strategy is given below). FIG. 3 shows an example of image retrieval results of the present invention, where the first boxed image is the target query image and the remaining images are the retrieved similar images.
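A compact sketch of retrieval with average query expansion, assuming a feature library held as a numpy matrix:

```python
import numpy as np

def retrieve_with_aqe(f0, library, k=10):
    """f0: (d,) query feature; library: (N, d) feature library.
    Returns the indices of the k final results."""
    dist = np.linalg.norm(library - f0, axis=1)          # Euclidean distances
    top_k = np.argsort(dist)[:k]                         # step 1: initial top-k
    f_avg = (f0 + library[top_k].sum(axis=0)) / (k + 1)  # step 2: mean query
    dist2 = np.linalg.norm(library - f_avg, axis=1)      # step 3: re-query
    return np.argsort(dist2)[:k]
```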
The embodiments described above are presented to enable a person of ordinary skill in the art to understand and use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without inventive effort. Therefore, the present invention is not limited to the above embodiments; improvements and modifications made by those skilled in the art based on the disclosure of the present invention fall within the protection scope of the present invention.

Claims (7)

1. An image retrieval method based on multi-feature fusion comprises the following steps:
(1) SIFT feature extraction is carried out on the target image, and the SIFT features are coded by utilizing a pre-trained visual dictionary and serve as the shallow visual features of the target image;
(2) inputting the preprocessed target image into a pre-trained Resnet50 neural network to extract convolutional layer characteristics as deep learning characteristics of the target image;
(3) respectively carrying out L2 norm normalization on the shallow visual feature and the deep learning feature of the target image, then carrying out weighted concatenation of the normalized features combined with PCA (Principal Component Analysis) dimension reduction, thereby obtaining the fusion feature of the target image;
(4) comparing the fusion feature of the target image with all image feature vectors in the feature library, and finally obtaining the retrieval result by means of query expansion.
2. The image retrieval method according to claim 1, characterized in that: the pre-training process of the visual dictionary in step (1) is as follows: SIFT feature vectors are extracted from each image in an image data set, the set of feature vectors is clustered with the K-means clustering algorithm and finally divided into a number of clusters, and the cluster center of each cluster is regarded as a visual word of the visual dictionary.
3. The image retrieval method according to claim 1, characterized in that: in step (1), the SIFT features of the target image are encoded with a local feature encoding algorithm that aggregates the SIFT local features by multi-neighbor soft assignment; the membership degree between a SIFT feature vector and its n neighbor visual words is calculated from a distance ratio as follows:
$$u_{ij} = \frac{\exp\left(-\beta \left\lVert x_i - b_j \right\rVert_2^2\right)}{\sum_{k=1}^{n} \exp\left(-\beta \left\lVert x_i - b_k \right\rVert_2^2\right)}$$

wherein: x_i is a SIFT feature vector of the target image, n is the number of neighbor visual words assigned to x_i, b_j is the j-th assigned neighbor visual word, u_ij is the membership degree of x_i with respect to the neighbor visual word b_j, and β is a smoothing factor controlling the rate of change of the function.
4. The image retrieval method according to claim 1, characterized in that: step (2) is specifically implemented as follows: the target image is first scaled to 224 × 224 pixels and mean-subtracted; the processed image is then input into the pre-trained Resnet50 neural network, the feature map output by the 5th convolutional stage of the network is extracted, and the feature map is aggregated into a one-dimensional feature vector that serves as the deep learning feature of the target image.
5. The image retrieval method according to claim 1, characterized in that: the pre-training process of the Resnet50 neural network is as follows: the Resnet50 network is initialized with weight parameters trained on the ImageNet data set and then transfer-trained on the image data set, i.e., Resnet50 is trained as a softmax classifier; during training, a cross-entropy loss function and a mini-batch optimizer are used to transfer-train the network batch by batch through forward propagation and backward propagation.
6. The image retrieval method according to claim 4, characterized in that: the feature map is aggregated into a one-dimensional feature vector using RMAC coding, which is implemented as follows: for each two-dimensional channel of the feature map, uniform sampling is first performed with a multi-scale sliding-window strategy, where the side length of the square window at the l-th scale is 2 × min(W, H)/(l + 1), W and H being the width and height of the feature map; the square windows slide over the feature map with an overlap of no less than 40% between adjacent windows. The maximum feature responses of all local regions extracted at every scale are then summed to obtain the RMAC feature value of that channel. Finally, the RMAC feature values of all channels are assembled into a one-dimensional vector, which serves as the deep learning feature of the target image.
7. The image retrieval method according to claim 1, characterized in that: step (4) is specifically implemented as follows: the fusion feature of the target image is first taken as the initial query vector F_0, its similarity to all image feature vectors in the feature library is computed, and the k most similar image feature vectors {F_1, F_2, …, F_k} are found; the mean F_avg of F_0 and {F_1, F_2, …, F_k} is then calculated by the following formula and taken as the new query vector; finally, the similarity between the new query vector and all image feature vectors in the feature library is computed, and the k most similar image feature vectors are returned as the final retrieval result;

$$F_{avg} = \frac{1}{k+1}\left(F_0 + \sum_{m=1}^{k} F_m\right)$$

wherein: k is a positive integer set by the user.
CN202111017516.3A (priority date 2021-08-30, filing date 2021-08-30): Image retrieval method based on multi-feature fusion; status: Pending; published as CN114140657A (en)

Priority Applications (1)

CN202111017516.3A | Priority date: 2021-08-30 | Filing date: 2021-08-30 | Title: Image retrieval method based on multi-feature fusion (published as CN114140657A, en)

Applications Claiming Priority (1)

CN202111017516.3A | Priority date: 2021-08-30 | Filing date: 2021-08-30 | Title: Image retrieval method based on multi-feature fusion (published as CN114140657A, en)

Publications (1)

Publication Number Publication Date
CN114140657A | 2022-03-04

Family

ID=80393797

Family Applications (1)

CN202111017516.3A | Priority date: 2021-08-30 | Filing date: 2021-08-30 | Title: Image retrieval method based on multi-feature fusion (CN114140657A, en)

Country Status (1)

Country Link
CN (1) CN114140657A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114550220A (en) * 2022-04-21 2022-05-27 中国科学技术大学 Training method of pedestrian re-recognition model and pedestrian re-recognition method
CN114550220B (en) * 2022-04-21 2022-09-09 中国科学技术大学 Training method of pedestrian re-recognition model and pedestrian re-recognition method
CN116150417A (en) * 2023-04-19 2023-05-23 上海维智卓新信息科技有限公司 Multi-scale multi-fusion image retrieval method and device
CN116150417B (en) * 2023-04-19 2023-08-04 上海维智卓新信息科技有限公司 Multi-scale multi-fusion image retrieval method and device

Similar Documents

Publication Publication Date Title
CN107239565B (en) Image retrieval method based on saliency region
CN108038122B (en) Trademark image retrieval method
CN110222218B (en) Image retrieval method based on multi-scale NetVLAD and depth hash
CN111125411B (en) Large-scale image retrieval method for deep strong correlation hash learning
CN108897791B (en) Image retrieval method based on depth convolution characteristics and semantic similarity measurement
Wang et al. A new SVM-based active feedback scheme for image retrieval
CN111597298A (en) Cross-modal retrieval method and device based on deep confrontation discrete hash learning
CN114140657A (en) Image retrieval method based on multi-feature fusion
CN110647907A (en) Multi-label image classification algorithm using multi-layer classification and dictionary learning
Xu et al. Iterative manifold embedding layer learned by incomplete data for large-scale image retrieval
CN110598022B (en) Image retrieval system and method based on robust deep hash network
CN107169117A (en) A kind of manual draw human motion search method based on autocoder and DTW
Alemu et al. Multi-feature fusion for image retrieval using constrained dominant sets
CN113032613B (en) Three-dimensional model retrieval method based on interactive attention convolution neural network
CN114693397A (en) Multi-view multi-modal commodity recommendation method based on attention neural network
CN112036511B (en) Image retrieval method based on attention mechanism graph convolution neural network
CN116610778A (en) Bidirectional image-text matching method based on cross-modal global and local attention mechanism
Xu et al. Weakly supervised facial expression recognition via transferred DAL-CNN and active incremental learning
Nakayama Linear distance metric learning for large-scale generic image recognition
CN105183845A (en) ERVQ image indexing and retrieval method in combination with semantic features
Lu et al. Image retrieval based on incremental subspace learning
CN116578734B (en) Probability embedding combination retrieval method based on CLIP
CN112084353A (en) Bag-of-words model method for rapid landmark-convolution feature matching
Li et al. Otcmr: Bridging heterogeneity gap with optimal transport for cross-modal retrieval
CN115100694A (en) Fingerprint quick retrieval method based on self-supervision neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination