CN110457514A

CN110457514A - A multi-label image retrieval method based on deep hashing

Info

Publication number: CN110457514A
Application number: CN201910741839.3A
Authority: CN
Inventors: 谢武; 刘满意; 强保华; 王培雷; 曹亚伟
Original assignee: Guilin University of Electronic Technology
Current assignee: Guilin University of Electronic Technology
Priority date: 2019-08-12
Filing date: 2019-08-12
Publication date: 2019-11-15

Abstract

The invention discloses a multi-label image retrieval method based on deep hashing, which aims to improve the accuracy of multi-label image retrieval. The method introduces the cosine distance between the label vectors of pairs of multi-label images as supervision information for model training, extracts features from multi-label images with a residual network, introduces a binary coding mechanism to reduce the dimensionality of the extracted high-dimensional features, and trains the residual-network-based deep hashing model on a multi-label image data set. After training, the model is called to perform image retrieval on a multi-label image query data set, and the generalization ability and retrieval accuracy of the model are assessed.

Description

A multi-label image retrieval method based on deep hashing

Technical field

The present invention relates to the technical field of image retrieval, and in particular to a multi-label image retrieval method based on deep hashing.

Background art

Content-based image retrieval is a retrieval technique that takes an image as input, as opposed to text-based image retrieval. Single-label image hashing uses a hashing algorithm to encode images annotated with a single label; the current mainstream approach combines deep learning with hashing algorithms. The advantage of multi-label image hashing is that it can fully learn the multiple semantic concepts contained in an image and is closer to real application scenarios, but the performance of current multi-label image hashing, for example its accuracy, still needs to be improved.

Summary of the invention

In order to improve the accuracy of multi-label image retrieval, the present invention provides a multi-label image retrieval method based on deep hashing. The method introduces the cosine distance between the label vectors of pairs of multi-label images as supervision information for model training, extracts features from multi-label images with a residual network, introduces a binary coding mechanism to reduce the dimensionality of the extracted high-dimensional features, and trains the residual-network-based deep hashing model on a multi-label image data set. After training, the model is called to perform image retrieval on a multi-label image query data set, and the generalization ability and retrieval accuracy of the model are assessed.

The technical solution of the present invention includes the following steps:

(1) A multi-label image training data set is given.

(2) A software environment based on the Keras deep learning framework is set up in preparation for training the subsequent network model.

(3) Data preprocessing is performed on the input multi-label image training data set so that images and labels are in one-to-one correspondence.

(4) A multi-label image retrieval model based on deep hashing is built from a residual network plus a hash loss layer: the first 49 layers of a 50-layer residual network extract the high-dimensional semantic features of multi-label images, and the final classification layer of the residual network is replaced with a hash loss layer that reduces the dimensionality of the extracted high-dimensional semantic features.

(5) The preprocessed multi-label image training set is fed into the multi-label image retrieval model. The cosine distance between multi-label image label vectors is used to measure the similarity between multi-label images; a weighted cross-entropy loss and a minimum mean square error loss together form a loss function that preserves the similarity ranking information of multi-label images; and a quantization loss function controls the quality of the hash codes output by the model. These three loss functions are then combined into a joint loss function with which the constructed model is trained, producing a model file in HDF5 format.

(6) The trained multi-label image retrieval model hash-encodes the images in the multi-label image library, and the resulting hash codes are stored in a hash table, thereby building the index of the multi-label image library.

(7) An image is taken from the multi-label image query data set as the query image, the trained deep hashing model is called to obtain the hash code of the query image, the similarity between the query image and the images in the multi-label image database is computed, and mean average precision is used to assess the generalization ability of the model.
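The preprocessing in step (3) can be illustrated with a minimal sketch. It assumes that each image comes with a list of tag names and that a label is represented as a multi-hot vector over a fixed class list; the function name `build_label_vectors`, the class names and the file names are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Sequence, Tuple


def build_label_vectors(
    annotations: Dict[str, Sequence[str]],
    classes: Sequence[str],
) -> List[Tuple[str, List[int]]]:
    """Pair every image path with a multi-hot label vector over `classes`."""
    index = {name: k for k, name in enumerate(classes)}
    pairs = []
    for image_path, tags in sorted(annotations.items()):
        vector = [0] * len(classes)
        for tag in tags:
            vector[index[tag]] = 1  # raises KeyError for a tag outside `classes`
        pairs.append((image_path, vector))
    return pairs


# Hypothetical toy annotations: image file -> tags present in the image.
classes = ["person", "dog", "car", "tree"]
annotations = {
    "img_001.jpg": ["person", "dog"],
    "img_002.jpg": ["car"],
}
pairs = build_label_vectors(annotations, classes)
```

Each entry of `pairs` holds one image and its label vector, which is the one-to-one form that the later similarity computation on label vectors relies on.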

Brief description of the drawings

Fig. 1 is a flow chart of the present invention.

Fig. 2 is a diagram of the model training and online retrieval process of the present invention.

Detailed description of the embodiments

The embodiment provided by the present invention designs a new multi-label image retrieval model. On the one hand, the model uses a residual network to strengthen the extraction of low-dimensional features of multi-label images, and the identity mapping modules of the residual network accelerate model convergence. On the other hand, the model applies a fine-tuning strategy, which increases its retrieval accuracy. Specifically, the method is divided into two stages: offline training and online retrieval.

Offline training stage: first, a deep hashing model based on a residual network is built, and the parameters of the model are initialized with pre-trained weights; second, a cross-entropy loss, a minimum mean square error loss and a quantization loss are combined into a joint loss function that serves as supervision; finally, the model is fine-tuned on the target multi-label image training set.

Online retrieval stage: the trained deep hashing model encodes the content of each multi-label image into a binary code, and the binary code and the image identifier are then stored in a hash table as a key-value pair. When a user inputs a query image, the hash model generates the binary retrieval code $H_q$ of the query image, the Hamming distance between $H_q$ and every binary retrieval code in the multi-label image database is computed, and the results are sorted by Hamming distance in ascending order. The top n images with the highest similarity in the sorted result list (1 ≤ n ≤ length of the sorted result list) are returned as the retrieval result.
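The online retrieval stage described above can be sketched as follows. This is a toy illustration only: binary codes are stored as Python integers, the index maps an image identifier to its code (the patent does not state which side of the key-value pair is the key), and the names `hamming_distance` and `retrieve` are hypothetical.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query_code: int, index: Dict[str, int], n: int) -> List[Tuple[str, int]]:
    """Return the n image ids whose codes are closest to the query code."""
    ranked = sorted(
        ((image_id, hamming_distance(query_code, code))
         for image_id, code in index.items()),
        key=lambda item: (item[1], item[0]),  # distance first, id breaks ties
    )
    return ranked[:n]


# Toy index: image identifier -> 8-bit binary retrieval code.
index = {
    "img_a": 0b10110010,
    "img_b": 0b10110011,
    "img_c": 0b01001101,
}
h_q = 0b10110110  # binary retrieval code of the query image
top2 = retrieve(h_q, index, n=2)
```

A linear scan over all codes is used here for clarity; it matches the description of computing the Hamming distance to every code in the database and sorting in ascending order.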

Referring to Fig. 1, the present invention includes the following steps:

(1) A multi-label image training data set is given.

(2) A software environment based on the Keras deep learning framework is set up in preparation for training the subsequent network model.

(3) Data preprocessing is performed on the input multi-label image training data set so that images and labels are in one-to-one correspondence.

(4) A multi-label image retrieval model based on deep hashing is built from a residual network plus a hash loss layer. Specifically, to balance model size against multi-label image retrieval precision, the present invention uses a 50-layer residual network to extract the high-level semantic features of multi-label images. To address the efficiency of multi-label image retrieval, the present invention replaces the final classification layer of the residual network with a hash loss layer, which reduces the dimensionality of the extracted high-dimensional semantic features. On the one hand, the model uses a high-performing network structure to obtain a better feature representation of multi-label images, and the hash coding mechanism reduces the dimensionality of the high-dimensional features, cutting retrieval time while maintaining retrieval accuracy, which yields an efficient and accurate multi-label image retrieval method. On the other hand, the model applies a fine-tuning strategy during training to increase its retrieval accuracy.

(5) The preprocessed multi-label image training set is fed into the model. To preserve the multi-level similarity information contained in multi-label images, the present invention uses the cosine distance between multi-label image label vectors to measure the similarity between multi-label images. On this basis, the present invention combines a cross-entropy loss and a minimum mean square error loss into a loss function that preserves the similarity ranking information of multi-label images, and uses a quantization loss function to control the quality of the hash codes output by the model. These three loss functions are then combined into a joint loss function with which the constructed model is trained to obtain a better feature representation of multi-label images, producing a model file in HDF5 format.

(6) The trained model hash-encodes the images in the multi-label image library, and the resulting hash codes are stored in a hash table, thereby building the index of the multi-label image library.

(7) An image is taken from the multi-label image query data set as the query image, the trained model is called to obtain the hash code of the query image, the similarity between the query image and the images in the multi-label image database is computed, and mean average precision (MAP) is used to assess the generalization ability of the model.
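The evaluation metric in step (7) can be illustrated with a small sketch of mean average precision. The patent does not spell out the relevance criterion or the normalization, so the sketch assumes that each ranked result is already marked 1 (relevant) or 0 (not relevant) and that average precision is normalized by the number of relevant results in the ranked list; the function names are hypothetical.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """Average precision of one ranked list of 0/1 relevance flags."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_relevance: Sequence[Sequence[int]]) -> float:
    """Mean of the per-query average precision values."""
    if not all_relevance:
        return 0.0
    return sum(average_precision(r) for r in all_relevance) / len(all_relevance)


# Two toy queries, each with four ranked results.
rankings = [[1, 0, 1, 0], [0, 1, 0, 0]]
map_score = mean_average_precision(rankings)
```

For multi-label data a common convention is to treat a retrieved image as relevant when it shares at least one label with the query, but that choice is an assumption here, not something the text confirms.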

Referring to Fig. 2, the specific steps of training the multi-label image retrieval model and retrieving with it are as follows:

(1) A 50-layer residual network is first trained on the large-scale ImageNet image data set to obtain a pre-trained model.

(2) A multi-label image retrieval model based on deep hashing is built from a residual network plus a hash loss layer. The model uses the first 49 layers of the 50-layer residual network and replaces the last classification layer of the residual network with a hash layer. The weights of the pre-trained model obtained in step (1) initialize the parameters of the constructed deep hashing model, and the parameters of the constructed model are then fine-tuned on the target multi-label image data set.

(3) The fine-tuned model generates hash features for the images in the multi-label image database, and the features are binarized. Images from the multi-label image query data set are then used as query images, the similarity between each query image and the images in the multi-label image database is computed, and mean average precision is used to evaluate the generalization ability of the model. On two multi-label image query data sets, the highest mean average precision reached by the model is 87.39% and 82.41%, respectively.
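The feature binarization in step (3) can be sketched as below. The patent does not give the thresholding rule, so this sketch assumes that the real-valued outputs of the hash layer lie near -1 and +1 and are thresholded at zero; `binarize` and `pack_bits` are hypothetical helper names.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Map each real-valued hash feature to a bit (assumed threshold: 0)."""
    return [1 if value >= 0 else 0 for value in features]


def pack_bits(bits: Sequence[int]) -> int:
    """Pack a bit sequence into one integer, first bit most significant."""
    code = 0
    for bit in bits:
        code = (code << 1) | bit
    return code


# Toy 4-bit output of the hash layer for one image.
features = [0.93, -0.88, 0.12, -0.97]
bits = binarize(features)
code = pack_bits(bits)
```

Packing the bits into an integer keeps each code compact and lets the Hamming distance between two codes be computed with a single XOR followed by a bit count.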

The construction of the multi-label image retrieval model mainly includes the following steps:

(1) The first 49 layers of the 50-layer residual network are retained as the feature extraction module of the model, and the classification layer of the original residual network is replaced with a hash layer, which reduces the dimensionality of the extracted high-dimensional semantic features of multi-label images.

(2) The label similarity matrix $S$ of pairs of multi-label images is computed as follows:

$$S = \{\, s_{ij} \mid i, j = 1, 2, \ldots, n \,\}$$

$$s_{ij} = \begin{cases} r_{ij}, & \text{if } I_i \text{ and } I_j \text{ share at least one class label} \\ 0, & \text{otherwise} \end{cases} \qquad r_{ij} = \frac{\langle l_i, l_j \rangle}{\lVert l_i \rVert_2 \, \lVert l_j \rVert_2}$$

where $s_{ij}$ is the element in row $i$ and column $j$ of the matrix $S$. Its value falls into two cases: when the multi-label images $I_i$ and $I_j$ share at least one class label, $s_{ij}$ takes the value $r_{ij}$; otherwise $s_{ij}$ is 0. $r_{ij}$ is the cosine distance between the label vectors of the pair of multi-label images, where $l_i$ and $l_j$ are the label vectors of the multi-label images $I_i$ and $I_j$, $\langle l_i, l_j \rangle$ is the inner product of $l_i$ and $l_j$, and $\lVert l_i \rVert_2$ and $\lVert l_j \rVert_2$ are the L2 norms of the label vectors $l_i$ and $l_j$. The present invention uses the cosine distance between the label vectors of pairs of multi-label images as supervision information in model training.

To let the hash codes better represent the multi-label image features extracted by the model, the present invention uses a quantization loss to control the quality of the hash codes. At the same time, to improve multi-label image retrieval performance, the present invention trains the model with a cross-entropy loss and a minimum mean square error loss, thereby preserving the multi-level semantic similarity that exists between pairs of multi-label images.

The quantization loss function is expressed as follows:

$$Q = \big\lVert\, |u_i| - \mathbf{1} \,\big\rVert_1 + \big\lVert\, |u_j| - \mathbf{1} \,\big\rVert_1$$

where $u_i$ is the q-bit hash code of a multi-label image output by the deep hashing model, $|u_i|$ is an element-wise vector operation that takes the absolute value of every element of the vector $u_i$, and $\mathbf{1}$ is the all-ones vector. The quantization loss function pushes the model outputs to concentrate near -1 and +1.

The cross-entropy loss function and the minimum mean square error loss function are selected according to the value of $s_{ij}$:

The cross-entropy loss function is used when $s_{ij} = 0$ or $s_{ij} = 1$, that is, when the class labels of the pair of multi-label images are completely different or completely identical. The minimum mean square error loss function is used when $0 < s_{ij} < 1$, that is, when the pair of multi-label images shares only part of its class labels. Both losses operate on the weighted inner product between the q-bit hash codes $u_i$ and $u_j$ of the pair of multi-label images, and $\sigma(\cdot)$ is the sigmoid activation function.

(3) Finally, the model is trained with the joint loss function composed of the cross-entropy loss, the minimum mean square error loss and the quantization loss, and the fine-tuning strategy is applied during training to guide the residual-network-based deep hashing model toward the expected performance.
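The construction steps above can be sketched in plain Python. The similarity matrix and the quantization loss follow the definitions in the text; the pairwise term is only an assumed form, because the exact cross-entropy and mean square error expressions are not given in this text. The scaling factor `alpha` for the weighted inner product, the balance weight `lam`, the omission of any cross-entropy weighting, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized inner product <a, b> / (||a||_2 ||b||_2)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def similarity_matrix(labels: Sequence[Sequence[int]]) -> List[List[float]]:
    """s_ij = r_ij when images i and j share a class label, otherwise 0."""
    n = len(labels)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            shares = any(a and b for a, b in zip(labels[i], labels[j]))
            S[i][j] = cosine(labels[i], labels[j]) if shares else 0.0
    return S


def quantization_loss(u_i: Sequence[float], u_j: Sequence[float]) -> float:
    """Q = || |u_i| - 1 ||_1 + || |u_j| - 1 ||_1."""
    return sum(abs(abs(x) - 1.0) for x in u_i) + sum(abs(abs(x) - 1.0) for x in u_j)


def sigmoid(x: float) -> float:
    """Numerically stable sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def pairwise_loss(u_i, u_j, s_ij: float, alpha: float = 0.5, tol: float = 1e-9) -> float:
    """ASSUMED form: cross-entropy when s_ij is 0 or 1, squared error otherwise."""
    omega = alpha * sum(x * y for x, y in zip(u_i, u_j))  # weighted inner product
    if s_ij < tol or s_ij > 1.0 - tol:
        target = 1.0 if s_ij > 0.5 else 0.0
        # Cross-entropy between sigmoid(omega) and the 0/1 target, stable form.
        return max(omega, 0.0) + math.log1p(math.exp(-abs(omega))) - target * omega
    return (sigmoid(omega) - s_ij) ** 2


def joint_loss(codes, labels, lam: float = 0.1) -> float:
    """Sum the pairwise and quantization terms over all image pairs."""
    S = similarity_matrix(labels)
    total = 0.0
    for i in range(len(codes)):
        for j in range(i + 1, len(codes)):
            total += pairwise_loss(codes[i], codes[j], S[i][j])
            total += lam * quantization_loss(codes[i], codes[j])
    return total


# Toy batch: three images with 3-class label vectors and 2-bit relaxed codes.
labels = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]
codes = [[0.9, -0.8], [0.7, -0.9], [-0.6, 0.8]]
S = similarity_matrix(labels)
loss = joint_loss(codes, labels)
```

The tolerance `tol` is needed because the cosine of two identical label vectors can differ from 1 by floating-point rounding. In a real training setup these terms would be written with the tensor operations of the deep learning framework so that gradients can flow through them; the sketch only shows how the three losses fit together.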

Claims

1. A multi-label image retrieval method based on deep hashing, characterized in that the method comprises:

(1) building a multi-label image retrieval model based on deep hashing from a residual network plus a hash loss layer, wherein the first 49 layers of a 50-layer residual network extract the high-dimensional semantic features of multi-label images, and the final classification layer of the residual network is replaced with a hash loss layer that reduces the dimensionality of the extracted high-dimensional semantic features;

(2) feeding the preprocessed multi-label image training set into the multi-label image retrieval model, using the cosine distance between multi-label image label vectors to measure the similarity between multi-label images, combining a weighted cross-entropy loss and a minimum mean square error loss into a loss function that preserves the similarity ranking information of multi-label images, and using a quantization loss function to control the quality of the hash codes output by the model; then combining these three loss functions into a joint loss function with which the constructed model is trained, and generating a model file in HDF5 format.

2. The method according to claim 1, further comprising the following steps:

(1) giving a multi-label image training data set;

(2) setting up a software environment based on the Keras deep learning framework in preparation for training the subsequent network model;

(3) performing data preprocessing on the input multi-label image training data set so that images and labels are in one-to-one correspondence;

(4) hash-encoding the images in the multi-label image library with the trained multi-label image retrieval model, and storing the resulting hash codes in a hash table to build the index of the multi-label image library;

(5) taking an image from the multi-label image query data set as the query image, calling the trained deep hashing model to obtain the hash code of the query image, computing the similarity between the query image and the images in the multi-label image database, and using mean average precision to assess the generalization ability of the model.