CN114610941A - Cultural relic image retrieval system based on comparison learning - Google Patents
- Publication number: CN114610941A (application CN202210253589.0A)
- Authority: CN (China)
- Prior art keywords: feature, image, samples, query, retrieval
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F16/583 — Information retrieval of still image data; retrieval characterised by using metadata automatically derived from the content
- G06F16/51 — Information retrieval of still image data; indexing, data structures and storage structures therefor
- G06F18/22 — Pattern recognition; analysing; matching criteria, e.g. proximity measures
- G06N3/045 — Neural networks; architecture; combinations of networks
- G06N3/08 — Neural networks; learning methods
Description
Technical Field
The present invention relates to feature extraction and contrastive matching of cultural relic image data, and more particularly to an image retrieval system for cultural relic data based on a contrastive learning algorithm.
Background
Review work in each link of the trading and circulation of folk cultural relics relies heavily on empirical analysis and naked-eye judgment, leading to complicated procedures and low efficiency; this has created a need for computers to retrieve cultural relic images automatically. Image retrieval aims to build an index between a query image and an image database and, according to some similarity measure, output the images in the database that match or are similar to the query image. Given the large volume of image data and the high demand for retrieval, there is an urgent need for high-fidelity digital information acquisition techniques adapted to the diversity of folk cultural relics and the complexity of their scenes, and for methods that extract and match the key feature information of cultural relic data.
Summary of the Invention
To solve the problems in the prior art, the present invention provides a cultural relic image retrieval system based on contrastive learning, which addresses the low retrieval accuracy and efficiency and the high computational cost of existing approaches.
The technical scheme of the present invention is as follows:
A cultural relic image retrieval system based on contrastive learning comprises a feature extractor and a retrieval module. The feature extractor performs preprocessing and feature extraction; the retrieval module performs similarity calculation, sorting, and indexing. An image to be retrieved is input, and preprocessing and feature extraction yield its feature vector; at the same time, all images in the image database are preprocessed and their features extracted to form the corresponding image feature library. The retrieval module then computes the similarity between the feature vector of the query image and the feature vectors in the image feature library, and uses the similarities to sort and index the images in the database; the images matching the query image are returned as the final retrieval result.
The feature extractor is trained with a supervised contrastive learning algorithm. The contrastive learning model uses two fully symmetric, parameter-sharing branches; each branch consists of data augmentation, an encoder network, and a projection network, where the encoder network and the projection network together form the feature extractor. For any image x, two different data augmentations produce two augmented views x_i and x_j. Because the two branches are completely symmetric, in the upper branch x_i is first transformed by the encoder network into the feature representation h_i = f_θ(x_i); a nonlinear transformation structure, the projection network, then maps this representation to the final feature representation z_i = g_θ(h_i). Similarly, the augmented view in the lower branch undergoes the two nonlinear transformations to obtain the final feature representation z_j = g_θ(f_θ(x_j)).
Network training proceeds as follows: N samples are randomly drawn to form a batch, denoted {x_k, y_k}, k = 1, 2, ..., N, where y_k is the label of x_k; data augmentation yields 2N samples, in which each pair consists of two views of the same sample obtained by two random augmentations, and the label information never changes during augmentation. In supervised contrastive learning, one sample corresponds to multiple positive samples: the samples in the batch with the same label are taken as positives, and the samples with different labels as negatives. The known label information is thus used for supervision, so that samples of the same class move closer together in the representation space while samples of different classes move apart, improving the discriminative power of the feature representation. The supervised contrastive loss function is therefore defined as:
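(The formula itself did not survive this text extraction; the following is a LaTeX reconstruction assuming the standard supervised contrastive (SupCon) form, writing P(i) for the set of in-batch positives of z_i — the notation of the original equation (4) may differ.)

\mathcal{L}^{sup} = \sum_{i=1}^{2N} \frac{-1}{|P(i)|} \sum_{j(i) \in P(i)} \log \frac{\exp\left(z_i \cdot z_{j(i)} / \tau\right)}{\sum_{j=1}^{2N} \mathbb{1}_{i \neq j}\, \exp\left(z_i \cdot z_j / \tau\right)} \qquad (4)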
where 1_{i≠j} ∈ {0, 1} is the indicator function, equal to 1 if and only if i ≠ j and 0 otherwise; τ > 0 is the temperature parameter; z_{j(i)} denotes a positive sample of z_i, and z_i · z_{j(i)} denotes the inner product between the vectors; the size of the positive set denotes the total number of samples in the batch that have the same label information as sample z_i. The network is trained by optimizing the loss function in equation (4), and the trained encoder network and projection network are used as the feature extractor to extract features from the query image and the images in the image database.
The similarity between feature vectors is computed either as the dot product of the L2-normalized feature vectors or as the cosine similarity between the feature vectors:
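(The formula is omitted in this text version; a reconstruction assuming the usual cosine form, consistent with the symbols explained in the next sentence:)

s(z_i, z_j) = \frac{z_i \cdot z_j}{\left\| z_i \right\|_2 \left\| z_j \right\|_2}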
where z_i and z_j denote one-dimensional vectors, and ||·||_2 denotes the L2 norm of a vector.
During indexing and sorting, average query expansion and database-side feature augmentation are used to further improve the accuracy of the retrieval results.
Average query expansion first sorts the images in the database according to the similarity between the feature vector of the original query Q_0 and the feature vectors in the feature library, returns the top m results (m < 50), then averages the original query Q_0 with the m results to form a new query Q_avg, and uses the new query to generate the final retrieval result:
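(The averaging formula is omitted in this text version; a reconstruction assuming a plain arithmetic mean of the original query feature and the top-m result features, consistent with the symbols defined in the next sentence:)

Q_{avg} = \frac{1}{m+1} \left( z_0 + \sum_{i=1}^{m} z_i \right)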
where z_0 is the feature vector of the original query and z_i is the feature vector of the i-th result.
Database-side feature augmentation replaces each original image feature in the database with a combination of that image's feature and the features of its nearest neighbors, in order to exploit the image's neighborhood to improve the quality of its representation. The similarity is first computed between every pair of feature vectors in the image feature library; for any image, the features of its K nearest images are summed, or the sum is weighted according to the rank of each feature:
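(The weighting formula is omitted in this text version; a reconstruction assuming the commonly used linear rank weighting, where z_r denotes the feature ranked r by similarity and rank 0 is the image itself — an assumed convention, not taken verbatim from the patent:)

\hat{z} = \sum_{r=0}^{k} \frac{k-r}{k}\, z_r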
where r is the rank of an image feature and k is the total number of neighboring images considered.
Beneficial Effects
In the cultural relic image retrieval system based on contrastive learning proposed by the present invention, a feature extractor is obtained by training the network with a supervised contrastive learning algorithm; it extracts effective, discriminative feature representations from images, and average query expansion together with database-side feature augmentation further improves retrieval accuracy. When a user submits a cultural relic image as a query, the retrieval system accurately searches the image database and returns the results matching the query image (a single image or several ranked images). The quantitative and qualitative results obtained on the common image dataset CIFAR-10 demonstrate the effectiveness of the retrieval system.
Brief Description of the Drawings
Fig. 1: the cultural relic image retrieval system based on contrastive learning;
Fig. 2: the contrastive learning model;
Fig. 3: quantitative and qualitative results of the system.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The cultural relic image retrieval system based on contrastive learning of the present invention is shown in Fig. 1. An image to be retrieved is input and preprocessed (e.g., scale transformation and random flipping), and the corresponding query image features are extracted; at the same time, all images in the image database are preprocessed and their features extracted to form the corresponding image feature library. The similarity between the query features and all features in the image feature library is then computed, and the similarities are used to sort and index the images in the database; the images matching the query image (one image or several ranked images) are returned as the final retrieval result. The training of the feature extractor and the implementation of the retrieval module are described in detail below:
1. Feature Extractor
The feature extractor extracts features from the query image and from the images in the image database so as to represent the image information effectively. This representation is the basis on which the subsequent retrieval module measures the similarity between images via the similarity between their feature representations, and it is key to the accuracy of the whole retrieval system. Here, the feature extractor is trained with a supervised contrastive learning algorithm.
The core idea of contrastive learning is to pull a sample closer to its positive samples while pushing it away from its negative samples. In the supervised contrastive learning algorithm, the label information in the dataset serves as supervision, and each sample corresponds to multiple positive and negative samples; training the feature extractor in this way makes it generate discriminative feature representations, which benefits the image retrieval task. The contrastive learning model is shown in Fig. 2; it adopts two fully symmetric, parameter-sharing branches, each consisting of data augmentation, an encoder network, and a projection network, where the encoder network and the projection network together form the feature extractor.
For any image x, two different data augmentations produce two augmented views x_i and x_j. Because the upper and lower branches are completely symmetric, take the upper branch as an example: x_i is first transformed by the encoder network (typically a ResNet model) into the feature representation h_i = f_θ(x_i). A nonlinear transformation structure, the projection network (a two-layer MLP of the form [FC->BN->ReLU->FC]), then maps this representation to the final feature representation z_i = g_θ(h_i). Similarly, the augmented view in the lower branch undergoes the two nonlinear transformations to obtain the final feature representation z_j = g_θ(f_θ(x_j)). The aim of contrastive learning is to make positive samples lie close together in the representation space and negative samples lie far apart.
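A minimal PyTorch sketch of the feature extractor just described — encoder f_θ followed by the [FC->BN->ReLU->FC] projection g_θ, shared by the two symmetric branches. The ResNet-50 backbone, the projection dimension, and the augmentations are illustrative assumptions rather than values fixed by this description:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import transforms

class FeatureExtractor(nn.Module):
    """Encoder f_theta followed by projection g_theta (shared by both branches)."""
    def __init__(self, proj_dim: int = 128):
        super().__init__()
        backbone = torchvision.models.resnet50(weights=None)
        feat_dim = backbone.fc.in_features            # 2048 for ResNet-50
        backbone.fc = nn.Identity()                   # drop the classification head
        self.encoder = backbone                       # f_theta
        self.projector = nn.Sequential(               # g_theta: FC -> BN -> ReLU -> FC
            nn.Linear(feat_dim, feat_dim),
            nn.BatchNorm1d(feat_dim),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, proj_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.encoder(x)                           # h_i = f_theta(x_i)
        z = self.projector(h)                         # z_i = g_theta(h_i)
        return F.normalize(z, dim=1)                  # unit-length features for dot-product similarity

# Two independent random augmentations of the same image give the two views x_i and x_j.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.ToTensor(),
])
```

Because the two branches share parameters, both views are simply passed through the same module during training.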
During network training, N samples are randomly drawn to form a batch, denoted {x_k, y_k}, k = 1, 2, ..., N, where y_k is the label of x_k; data augmentation yields 2N samples, in which each pair consists of two views of the same sample obtained by two random augmentations, and the label information never changes during augmentation. If the class supervision is ignored, the two views of a pair are positives of each other, while each of them and any of the other 2N - 2 samples in the batch are negatives of each other. This is the self-supervised contrastive learning algorithm, whose loss function is defined as:
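(The formula, presumably equation (3) of the original, is likewise missing from this text version; the following is a LaTeX reconstruction assuming the standard NT-Xent form, consistent with the terms defined in the next sentence.)

\ell_i = -\log \frac{\exp\left(z_i \cdot z_{j(i)} / \tau\right)}{\sum_{k=1}^{2N} \mathbb{1}_{i \neq k}\, \exp\left(z_i \cdot z_k / \tau\right)} \qquad (3)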
where 1_{i≠k} ∈ {0, 1} is the indicator function, equal to 1 if and only if i ≠ k and 0 otherwise; τ > 0 is the temperature parameter; z_{j(i)} denotes the positive sample of z_i, and z_i · z_{j(i)} denotes the inner product between the vectors. The numerator of the loss encourages the similarity between a sample and its positive to be as high as possible, i.e., their representations to be as close as possible; the denominator encourages the similarity between the sample and its negatives to be as low as possible, i.e., their representations to be as far apart as possible.
The loss function of self-supervised contrastive learning treats every sample as a separate class, and therefore cannot exploit labels, i.e., the case where several samples are known to belong to the same class. In supervised contrastive learning, by contrast, one sample corresponds to multiple positive samples: the samples in the batch with the same label are taken as positives and the samples with different labels as negatives. The known label information is thus used for supervision, so that samples of the same class move closer together in the representation space while samples of different classes move apart, improving the discriminative power of the feature representation. The supervised contrastive loss function is therefore defined as:
where the size of the positive set denotes the total number of samples in the batch that have the same label information as sample z_i. The network is trained by optimizing the loss function in equation (4), and the trained encoder network and projection network are used as the feature extractor to extract features from the query image and the images in the image database.
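A compact PyTorch sketch of the supervised contrastive loss as described above (in-batch positives are the samples with the same label, the anchor itself is excluded from the denominator, and each anchor's loss is averaged over its positives). It is an illustrative implementation written from these definitions, not code taken from the patent:

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(z: torch.Tensor, labels: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """z: (2N, d) projected features of the augmented batch; labels: (2N,) class labels.

    Every anchor has at least one positive (its other augmented view), so the
    positive count below is never zero.
    """
    z = F.normalize(z, dim=1)                              # work with unit-length features
    sim = z @ z.t() / tau                                  # z_i . z_j / tau for all pairs
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))        # denominator excludes the anchor itself
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # average the positives' log-probabilities per anchor, then average over anchors
    per_anchor = -(log_prob.masked_fill(~pos_mask, 0.0)).sum(dim=1) / pos_mask.sum(dim=1)
    return per_anchor.mean()
```

Training then amounts to passing both augmented views of each batch through the shared feature extractor, concatenating their projections and labels, and minimizing this loss with a standard optimizer.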
2. Retrieval Module
The feature extractor is applied to all images in the database to build the corresponding image feature library. At retrieval time, a query image is input and its features are extracted to obtain the corresponding feature vector. The retrieval module then computes the similarity between the feature vector of the query image and the feature vectors in the image feature library, indexes and sorts the database images according to the similarity, and outputs the ranked images as the final result (their number is set manually).
The similarity between feature vectors is generally computed as the dot product of the L2-normalized feature vectors or as the cosine similarity between the feature vectors, as given above; a sketch of this ranking step follows.
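A NumPy sketch of the ranking step (offline feature library, cosine-similarity ranking at query time); the array shapes and function names are illustrative assumptions:

```python
import numpy as np

def l2_normalize(x: np.ndarray, axis: int = -1, eps: float = 1e-12) -> np.ndarray:
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def retrieve(query_feat: np.ndarray, library: np.ndarray, top_n: int = 10) -> np.ndarray:
    """query_feat: (d,) feature of the query image; library: (M, d) image feature library.

    Returns the indices of the top_n database images ranked by cosine similarity.
    """
    sims = l2_normalize(library) @ l2_normalize(query_feat)   # cosine similarity to every database image
    return np.argsort(-sims)[:top_n]                          # indices sorted by decreasing similarity
```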
During indexing and sorting, average query expansion and database-side feature augmentation are used to further improve the accuracy of the retrieval results. Average query expansion first sorts the images in the database according to the similarity between the feature vector of the original query Q_0 and the feature vectors in the feature library, returns the top m results (m < 50), then averages the original query Q_0 with the m results to form a new query Q_avg, and uses the new query to generate the final retrieval result.
Here z_0 is the feature vector of the original query and z_i is the feature vector of the i-th result. Database-side feature augmentation replaces each original image feature with a combination of that image's feature and the features of its nearest neighbors in the database, in order to exploit the image's neighborhood to improve the quality of its representation. The similarity is first computed between every pair of feature vectors in the image feature library; for any image, the features of its K nearest images are summed, or the sum is weighted according to the rank of each feature:
where r is the rank of an image feature and k is the total number of neighboring images considered.
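A NumPy sketch of the two refinements just described — average query expansion and database-side feature augmentation with a linear rank weighting. The (k - r)/k weighting and the treatment of rank 0 as the image itself are assumptions consistent with the reconstruction given earlier, not values taken verbatim from the patent:

```python
import numpy as np

def l2_normalize(x: np.ndarray, axis: int = -1, eps: float = 1e-12) -> np.ndarray:
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def average_query_expansion(query_feat: np.ndarray, library: np.ndarray, m: int = 5) -> np.ndarray:
    """Form the expanded query Q_avg from the original query and its top-m results."""
    sims = l2_normalize(library) @ l2_normalize(query_feat)
    top = np.argsort(-sims)[:m]
    return (query_feat + library[top].sum(axis=0)) / (m + 1)

def database_side_augmentation(library: np.ndarray, k: int = 5) -> np.ndarray:
    """Replace each library feature by a rank-weighted sum of itself and its k nearest neighbors."""
    lib = l2_normalize(library)
    sims = lib @ lib.T                                   # pairwise similarities within the library
    weights = (k - np.arange(k + 1)) / k                 # linear rank weighting; rank 0 is the image itself
    augmented = np.empty_like(library)
    for i in range(library.shape[0]):
        order = np.argsort(-sims[i])[:k + 1]             # the image itself plus its k nearest neighbors
        augmented[i] = (weights[:, None] * library[order]).sum(axis=0)
    return augmented
```

The defaults m = 5 and k = 5 mirror the top-five results used in the embodiment described next.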
In the proposed cultural relic image retrieval system based on contrastive learning, the encoder network in the feature extractor adopts the ResNet-50 architecture, the similarity between image feature vectors is computed with cosine similarity, and when the retrieval module performs average query expansion and database-side feature augmentation it uses the five results with the highest feature similarity. The feature extractor is first trained with the supervised contrastive learning algorithm, and the trained feature extractor is then applied to the images in the image database to build the feature database. When a user issues a query, a query image is input to the system; the system preprocesses it and extracts its features with the feature extractor to obtain the query feature, measures the similarity between the query feature and the features in the feature database, uses the feature similarities to sort and index the results, and finally outputs the images matching the query image (one image or several ranked images) to the user. The retrieval system achieves fast and effective retrieval: Table 1 shows its retrieval accuracy on the common image dataset CIFAR-10, and Fig. 3 shows its retrieval results when ten images are returned.
Table 1. Retrieval accuracy on the CIFAR-10 dataset.
The technical solutions disclosed and proposed in the present invention can be implemented by those skilled in the art by drawing on the content herein and appropriately adapting such elements as conditions and routes. Although the method and preparation technique of the present invention have been described through preferred embodiments, it is evident that the methods and technical routes described herein can be modified or recombined without departing from the content, spirit, and scope of the present invention. In particular, all similar substitutions and modifications apparent to those skilled in the art are deemed to be included within the spirit, scope, and content of the present invention. Matters not addressed in the present invention belong to the known art.
Claims (7)
Priority Applications (1)
- CN202210253589.0A (granted as CN114610941B), priority/filing date 2022-03-15 — Cultural Relics Image Retrieval System Based on Contrastive Learning
Publications (2)
- CN114610941A (application), published 2022-06-10
- CN114610941B (granted patent), published 2025-01-14
Family ID: 81862722
Patent Citations (5)
- WO2019148898A1 (2018-02-01 / 2019-08-08, 北京大学深圳研究生院) — Adversarial cross-media retrieving method based on restricted text space
- CN110851645A (2019-11-08 / 2020-02-28, 吉林大学) — A Similarity Preserving Image Retrieval Method Based on Deep Metric Learning
- CN113127661A (2021-04-06 / 2021-07-16, 中国科学院计算技术研究所) — Multi-supervision medical image retrieval method and system based on cyclic query expansion
- CN113743251A (2021-08-17 / 2021-12-03, 华中科技大学) — Target searching method and device based on weak supervision scene
- CN113822368A (2021-09-29 / 2021-12-21, 成都信息工程大学) — Anchor-free incremental target detection method
Non-Patent Citations (2)
- 陈阳; 周圆: "一种基于深度学习模型的图像模糊自动分析处理算法" (An automatic image-blur analysis and processing algorithm based on a deep learning model), 小型微型计算机系统, no. 03, 15 March 2018
- 项圣凯; 曹铁勇; 方正; 洪施展: "使用密集弱注意力机制的图像显著性检测" (Image saliency detection using a dense weak-attention mechanism), 中国图象图形学报, no. 01, 16 January 2020
Cited By (2)
- CN116580268A (2023-07-11 / 2023-08-11, 腾讯科技(深圳)有限公司) — Training method of image target positioning model, image processing method and related products
- CN116580268B (2023-07-11 / 2023-10-03, 腾讯科技(深圳)有限公司) — Training method of image target positioning model, image processing method and related products
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114926746B (en) | SAR image change detection method based on multiscale differential feature attention mechanism | |
Beikmohammadi et al. | SWP-LeafNET: A novel multistage approach for plant leaf identification based on deep CNN | |
CN107066599A (en) | A kind of similar enterprise of the listed company searching classification method and system of knowledge based storehouse reasoning | |
CN110188225B (en) | Image retrieval method based on sequencing learning and multivariate loss | |
CN102004786B (en) | Acceleration method in image retrieval system | |
CN105718960A (en) | Image ordering model based on convolutional neural network and spatial pyramid matching | |
Champ et al. | A comparative study of fine-grained classification methods in the context of the LifeCLEF plant identification challenge 2015 | |
CN113392191B (en) | Text matching method and device based on multi-dimensional semantic joint learning | |
CN110866134B (en) | A Distribution Consistency Preserving Metric Learning Method for Image Retrieval | |
CN109902714A (en) | A Multimodal Medical Image Retrieval Method Based on Multi-Graph Regularized Deep Hashing | |
CN111338950A (en) | Software defect feature selection method based on spectral clustering | |
CN105320764A (en) | 3D model retrieval method and 3D model retrieval apparatus based on slow increment features | |
CN107291895A (en) | A kind of quick stratification document searching method | |
CN114676769A (en) | A small-sample insect image recognition method based on visual Transformer | |
Ahmed et al. | Prediction of COVID-19 disease severity using machine learning techniques | |
CN106250925A (en) | A kind of zero Sample video sorting technique based on the canonical correlation analysis improved | |
WO2021128704A1 (en) | Open set classification method based on classification utility | |
CN114610941A (en) | Cultural relic image retrieval system based on comparison learning | |
CN116258938A (en) | Image Retrieval and Recognition Method Based on Autonomous Evolutionary Loss | |
CN114090813B (en) | Variable self-encoder balanced hash remote sensing image retrieval method based on multichannel feature fusion | |
CN103514276A (en) | Graphic target retrieval positioning method based on center estimation | |
Ni et al. | The analysis and research of clustering algorithm based on PCA | |
Xiang et al. | Wool fabric image retrieval based on soft similarity and listwise learning | |
CN117576471A (en) | Method and device for classifying few-sample images by introducing local feature alignment and prototype correction mechanisms | |
Lin et al. | Multi-stage network with geometric semantic attention for two-view correspondence learning |
Legal Events
- PB01 — Publication
- SE01 — Entry into force of request for substantive examination
- TA01 — Transfer of patent application right; effective date of registration: 2022-11-01; applicant before: Tianjin University; applicants after: Tianjin University and Yiyuan digital (Beijing) Technology Group Co., Ltd.; address (before and after): No. 92 Wei Jin Road, Nankai District, Tianjin 300072
- GR01 — Patent grant