CN110457503A - Fast-optimizing deep hash image coding method and target image retrieval method - Google Patents
- Publication number: CN110457503A (application CN201910701690.6A)
- Authority
- CN
- China
- Prior art keywords
- coding
- image
- formula
- depth
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
Abstract
The invention discloses a fast-optimizing deep hash image coding method and a target image retrieval method. Based on a greedy strategy, a hash image coding model is built for a large-scale image dataset, and the binary codes of all images are generated by the deep hash coding network obtained after optimization. During target image retrieval, images similar to the query image are found quickly by computing the Hamming distance between the query image's code and the database image codes. Combined with a neural network, the method of the present invention better resolves the vanishing-gradient and quantization-error problems, yielding better coding performance; it completes the training of the deep network in fewer iterations, so training is faster; it applies to a wide range of problems with discrete constraints; and it further improves the optimization speed of the deep neural network and the retrieval performance of the generated image codes, effectively improving retrieval precision on large-scale image databases.
Description
Technical field
The invention belongs to the technical field of information retrieval, and relates to image processing and fast image retrieval techniques, in particular to a greedy-strategy-based fast-optimizing deep hash image coding method and its target image retrieval method.
Background technique
With the arrival of the big-data era, data in every field are growing explosively. In such a data tide, how to retrieve the information one needs has become an important and urgent research topic. Hash algorithms are a class of algorithms for fast target-image retrieval on large-scale image datasets. Their main idea is to encode every image into a string of binary values (i.e., each image is represented by a fixed-length binary code), obtain Hamming distances between codes via fast XOR operations, and then complete approximate nearest-neighbor image retrieval by sorting (finding the images in the database most similar to the query image). This binary image representation brings very low memory requirements (binary storage) and very fast retrieval speed (the Hamming distance between features, which measures the similarity of two images, can be obtained with the computer's simplest bitwise XOR operation), so it has great research potential and a wide range of applications.
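The XOR-and-popcount trick can be sketched in a few lines. This is an illustrative example, not code from the patent; `pack_code` and `hamming_distance` are hypothetical helper names, and ±1 codes are packed into Python integers so that a single XOR compares all bits at once.

```python
def pack_code(bits):
    """Pack a +/-1 binary code into one integer (one bit per code position)."""
    value = 0
    for b in bits:
        value = (value << 1) | (1 if b > 0 else 0)
    return value

def hamming_distance(a, b):
    """Hamming distance between two packed codes via a single XOR + popcount."""
    return bin(a ^ b).count("1")

# Two 8-bit codes that differ in exactly 3 positions.
code1 = pack_code([+1, -1, +1, +1, -1, -1, +1, -1])
code2 = pack_code([+1, +1, +1, -1, -1, -1, -1, -1])
print(hamming_distance(code1, code2))  # -> 3
```

With K-bit codes packed this way, comparing a query against millions of database codes reduces to integer XORs, which is what makes hash-based retrieval fast.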
The rapid development of deep learning in recent years, especially of its flagship, the convolutional neural network, has brought substantial performance improvements to most image-related applications (such as classification, object detection, and image retrieval). Deep neural networks are mainly trained by stochastic gradient descent: an input image is forward-propagated through the network to obtain image features; the corresponding loss function (also called the objective function; for retrieval, the goal is that images of the same class should have similar features) is computed; and the loss is back-propagated to compute the gradient of each layer's neurons (the direction that decreases the loss function), thereby completing the parameter updates and network training that minimize the loss function (as shown in Fig. 1). The power of deep learning has prompted hashing researchers to propose combining hash algorithms with deep networks, i.e., deep hashing algorithms, to further improve the retrieval performance of image codes.
Deep hashing algorithms tackle the fast image retrieval task in two steps. First, the powerful feature-learning ability of convolutional neural networks is used to learn a deep feature representation for each image in the database; compared with raw pixel values or features extracted by traditional feature-extraction algorithms, the deep network outputs image features that better characterize the input image. Second, a hash algorithm further encodes these continuous-valued image features into binary features, greatly reducing memory requirements and rapidly improving retrieval efficiency, thus genuinely meeting the demand for fast and accurate retrieval. By effectively merging these two steps into a single deep network framework, deep hashing lets deep feature learning and hash coding mutually reinforce each other during learning and training, so as to obtain optimal binary image codes and the corresponding coding network.
However, actually achieving true end-to-end training of a deep hashing network remains an extremely challenging problem. The main difficulty lies in the sign function used to encode images into binary codes (as shown in Fig. 2): its gradient (derivative) is zero everywhere, which is fatal for a deep neural network trained by gradient descent, because the layers before the sign function cannot receive any gradient update information, so training fails.
By encoding images into short, compact binary codes, hash algorithms make it possible to quickly complete the target image retrieval task on large-scale image datasets (given a query image, the algorithm finds similar same-class images in a large-scale image database and returns them to the user), and they have a very wide range of applications, such as search-by-image and face verification. Deep hashing algorithms aim to combine the two currently powerful tools, deep learning and hashing, to further improve the performance of image retrieval systems. The very stubborn problem facing deep hashing research is the sign function used to encode images into binary codes (it outputs +1 for inputs greater than 0 and -1 for inputs less than 0): its gradient is zero everywhere, which is fatal for deep neural networks trained by gradient descent. This vanishing-gradient problem leaves the front layers of the network without any update information; training ultimately fails and no effective image codes are obtained.
Most existing deep hashing algorithms first relax the original problem: during the training stage they do not strictly require binary codes in {-1, +1}, but relax them to continuous values between -1 and +1 (whose generating function is differentiable everywhere), so that the network can be trained; then, at the final test stage, the continuous features are re-quantized to obtain real binary codes. Although this approach solves the vanishing-gradient problem, the relaxation strategy introduces a quantization-error problem: the real binary codes forced out at test time differ from the image features generated during training, which causes a drop in performance.
Although HashNet and DSDH manage to complete network training while avoiding obstacles such as vanishing gradients and quantization error in deep hashing, their problems are also quite evident. On the one hand, both require many training iterations: DSDH updates only one bit of the binary code at a time and therefore must cycle through the different code positions, while HashNet must repeatedly retrain after tightening the constraint parameter β, so both need more training iterations and training is costly. On the other hand, the discrete cyclic coordinate descent used by DSDH applies only to discrete quadratic programming problems, limiting its range of application; and since β in HashNet cannot grow to infinity, some quantization error remains in its results, so its performance is not optimal.
Most deep hashing algorithms apply a relaxation strategy to solve the vanishing-gradient problem. For example, document [1] uses the tanh function (whose derivative is nonzero) to output continuous values in [-1, +1] in place of the discrete values {-1, +1} of the sign function, and then strictly binarizes the resulting image features after training to obtain real binary image features. Although this solves the vanishing-gradient problem, the relaxation introduces quantization error: the real binary codes generated at test time differ from the continuous-valued image features generated during training, so the resulting binary image codes and coding network are suboptimal. For the quantization-error problem, documents [2] and [3] present their own solutions and currently achieve the better performance in this field. Document [2] proposes the HashNet algorithm: it first uses the relaxed coding function y = tanh(βx) and then gradually increases β during training to approach the original sign function y = sgn(x) and reduce quantization error; this stepwise increase of training difficulty keeps the network from failing to train at the very start. Document [3] proposes the DSDH algorithm, which uses discrete cyclic coordinate descent to solve the hash discrete-coding optimization problem; the whole solving and training process maintains the discrete-value constraint without relaxation, thereby avoiding quantization error. Although retrieval performance improves, HashNet and DSDH still suffer from problems such as slow network training and limited range of application.
Bibliography:
[1] Zhao F, Huang Y, Wang L, et al. Deep semantic ranking based hashing for multi-label image retrieval[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1556-1564.
[2] Cao Z, Long M, Wang J, et al. HashNet: Deep learning to hash by continuation[C]//ICCV. 2017: 5609-5618.
[3] Li Q, Sun Z, He R, et al. Deep supervised discrete hashing[C]//Advances in Neural Information Processing Systems. 2017: 2482-2491.
[4] Goodfellow I, Bengio Y, Courville A, et al. Deep Learning[M]. Cambridge: MIT Press, 2016.
[5] Li W J, Wang S, Kang W C. Feature learning based deep supervised hashing with pairwise labels[J]. arXiv preprint arXiv:1511.03855, 2015.
[6] Zhu H, Long M, Wang J, et al. Deep hashing network for efficient similarity retrieval[C]//AAAI. 2016: 2415-2421.
[7] Lai H, Pan Y, Liu Y, et al. Simultaneous feature learning and hash coding with deep neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3270-3278.
[8] Xia R, Pan Y, Lai H, et al. Supervised hashing for image retrieval via image representation learning[C]//AAAI. 2014, 1(2014): 2.
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention provides a greedy-strategy-based fast-optimizing deep hash image coding method and target image retrieval method. Combined with a neural network, it better resolves the vanishing-gradient and quantization-error problems, yielding better coding performance; it completes the training of the deep network in fewer iterations, so training is faster; it applies to a wide range of problems with discrete constraints; and it further improves the optimization speed of the deep neural network and the retrieval performance of the generated image codes, effectively improving the retrieval precision of large-scale image databases.
The technical scheme provided by the present invention is as follows:
A target image retrieval method based on fast-optimizing deep hash coding, comprising a greedy-strategy-based fast-optimizing deep hash image coding method and a target image retrieval method. Based on the greedy strategy, a hash image coding model is built for a large-scale image dataset, and the binary codes of all images are generated by the deep hash coding network obtained after optimization. During target image retrieval, images similar to the query image are obtained quickly by computing the Hamming distance between the query image's code and the database image codes. The method comprises the following steps:
1) Model the hash image coding problem to obtain the hash image coding model.
Modeling the hash image coding problem yields the optimization problem, i.e., the hash image coding model, expressed as formula (1):

min_B L(B)  s.t. B ∈ {-1, +1}^K    Formula (1)

where B denotes the binary code generated by forward-propagating the input image through the deep network, and the constraint restricts each of the K bits of code B to be chosen from {-1, +1} (i.e., each image is encoded into K binary bits). L(B) denotes the loss function computed on B. Because the proposed algorithm, unlike DSDH, does not require L to be of quadratic programming form, L in the model of the invention can be any of the common loss functions of the deep learning field, such as the mean squared error function or the cross-entropy function.
2) Solve the hash image coding model (formula (1)) using the greedy strategy to obtain the optimal binary code B, with the following operations:
21) In the solving process, first ignore the discrete constraint B ∈ {-1, +1}^K, compute the gradient ∂L/∂B of L with respect to B, and then iterate updates with the gradient descent rule of formula (2):

B^(t+1) = B^t - lr · ∂L/∂B^t    Formula (2)

where t denotes the t-th round of updating during training, lr denotes the learning rate set in advance by the algorithm, B^t denotes code B after the t-th round of updating, B^(t+1) denotes code B after the (t+1)-th round of updating, and L denotes the loss function of the model.
The B^(t+1) found by the gradient update formula (2) is the optimal update direction chosen for L(B) in the current iteration when the discrete constraint is ignored.
22) Obtain the solution nearest to the continuous value B^(t+1) that satisfies the discrete-value constraint, namely sgn(B^(t+1)), where sgn(·) denotes the element-wise sign function.
23) Update the parameters toward the solution sgn(B^(t+1)), i.e., solve the hash-code optimization problem of formula (1) using formula (3):

B^(t+1) = sgn(B^t - lr · ∂L/∂B^t)    Formula (3)
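The greedy update of formula (3) can be sketched on a toy loss. This is an illustrative example under assumed values (the squared-error loss, the target code T, and the learning rate are made up for illustration, not taken from the patent):

```python
def greedy_hash_update(B, grad, lr):
    """One greedy update of formula (3): take a gradient step as if B were
    continuous, then snap each bit back to the nearest point of {-1, +1}."""
    return [1.0 if (b - lr * g) > 0 else -1.0 for b, g in zip(B, grad)]

# Toy loss L(B) = sum((B - T)^2) for a target code T, so dL/dB = 2 * (B - T).
T = [1.0, -1.0, 1.0, -1.0]
B = [-1.0, -1.0, -1.0, 1.0]
grad = [2 * (b - t) for b, t in zip(B, T)]
B_next = greedy_hash_update(B, grad, lr=0.6)
print(B_next)  # -> [1.0, -1.0, 1.0, -1.0]: every bit lands on the target code
```

Note that every bit is updated in the same step, in contrast to coordinate-wise schemes such as DSDH that update one bit at a time.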
3) Design the deep hash coding module in the deep network and train the hash image coding model, realizing the update rule of formula (3), with the following operations:
31) Use a convolutional neural network to represent the input image as a string of continuous-valued image features H.
32) Design a completely new deep hash image coding module at the last layer of the convolutional neural network. Its input is the continuous-valued image feature H, and its output is exactly the code B; the update rule of formula (3) is realized inside the deep hash image coding module. In the module's forward propagation, the sign function is applied to H bit by bit, yielding a code B that strictly takes binary values. In the module's backward propagation, the gradient information obtained for code B is directly assigned to H, i.e., the gradient of H is set equal to the gradient of B, so that the gradient is passed back unimpeded to the preceding layers.
The deep hash image coding module solves the vanishing-gradient and quantization-error problems of deep hashing and achieves fast optimization and accurate coding. The deep hash image coding module performs the following operations:
321) First introduce the variable H into formula (3), obtaining formula (4):

B^(t+1) = sgn(H^(t+1))          Formula (4a)
H^(t+1) = B^t - lr · ∂L/∂B^t    Formula (4b)

where H^(t+1) denotes variable H after the (t+1)-th round of updating.
322) Realizing formula (4a) amounts to applying the sign function to the input H in the forward propagation of the deep hash image coding module, expressed as formula (5):

B = sgn(H)    Formula (5)

323) To realize formula (4b), add a penalty term to the objective function of network training to assist the deep hash image coding module.
For (4b), a penalty term penalizing the gap between H and sgn(H) must first be added to the loss function L, so that the network approximately satisfies H ≈ sgn(H) = B during training. For the variable H, the update formula then becomes formula (6):

H^(t+1) = H^t - lr · ∂L/∂H^t    Formula (6)

where H^t denotes variable H after the t-th round of updating.
Comparing formula (6) with formula (4b) shows that the way to realize (4b) is to perform direct gradient assignment in the backward propagation of the deep hash coding module, expressed as formula (7):

∂L/∂H = ∂L/∂B    Formula (7)

Formula (7) states that in the back-propagation process of the newly designed coding module, the gradient of the loss function with respect to B is passed back completely and directly to the preceding-layer variable H, and finally back to the front of the network, completing parameter learning and network updating.
In the above network training process, the sign function is used strictly in forward propagation so that the discrete constraint always holds; in backward propagation, the gradient is passed back completely to the preceding layers, solving the vanishing-gradient problem while updating all code bits simultaneously, so that effective neural-network training and convergence are completed quickly.
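The forward/backward behavior described by formulas (5) and (7) can be sketched without a deep-learning framework; in practice this would typically be a custom autograd function in a framework such as PyTorch. The toy loss, target code, and learning rate below are illustrative assumptions, not values from the patent:

```python
def sgn(x):
    return 1.0 if x > 0 else -1.0

def coding_module_forward(H):
    """Forward pass of the coding module, formula (5): B = sgn(H), bit by bit."""
    return [sgn(h) for h in H]

def coding_module_backward(grad_B):
    """Backward pass, formula (7): the gradient w.r.t. B is passed to H
    unchanged (dL/dH = dL/dB), so earlier layers still receive updates."""
    return list(grad_B)

# One training step on a toy loss L(B) = sum((B - T)^2), so dL/dB = 2 * (B - T).
H = [0.3, -0.2, 0.1]   # continuous features from the CNN
T = [-1.0, -1.0, 1.0]  # target code (illustrative)
B = coding_module_forward(H)
grad_B = [2 * (b - t) for b, t in zip(B, T)]
grad_H = coding_module_backward(grad_B)  # would be all zeros with plain sgn
H = [h - 0.1 * g for h, g in zip(H, grad_H)]
print(coding_module_forward(H))  # -> [-1.0, -1.0, 1.0]: code now matches T
```

The forward pass never relaxes the code (no quantization error), while the backward pass ignores sgn's zero derivative and copies the gradient straight through (no vanishing gradient), updating all bits at once.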
4) After training is complete, obtain the trained deep image hash coding network.
5) Using the trained deep hash coding network, encode all database images to generate the database image codes.
Through the above process, greedy-strategy-based fast-optimizing deep hash image coding is realized.
When performing target image retrieval, carry out the following operations:
6) Whenever a user provides a query image, first encode the query image with the above deep hash coding network to generate the query image code.
7) Then compute the Hamming distances between the query image code and all database image codes, sort them, and return the M database images with the smallest distances (M being the number of returned images set by the user) as the similar-image retrieval result for the query image.
Through the above process, fast retrieval of database images similar to the query image is realized.
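Steps 6) and 7) can be sketched as a small ranking routine. This is an illustrative example: `retrieve` is a hypothetical helper, and the 8-bit integer codes stand in for K-bit image codes produced by the trained network.

```python
def hamming(a, b):
    """Hamming distance between two packed integer codes (XOR + popcount)."""
    return bin(a ^ b).count("1")

def retrieve(query_code, database_codes, M):
    """Return the indices of the M database images closest to the query,
    ranked by Hamming distance between their binary codes."""
    ranked = sorted(range(len(database_codes)),
                    key=lambda i: hamming(query_code, database_codes[i]))
    return ranked[:M]

database = [0b10110010, 0b10110011, 0b01001101, 0b11111111]
query = 0b10110010
print(retrieve(query, database, M=2))  # -> [0, 1]: exact match, then 1 bit away
```

In a real system the database codes would be precomputed once by step 5), so each query costs only one forward pass plus the XOR-based ranking shown here.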
Compared with the prior art, the beneficial effects of the present invention are as follows:
The present invention provides a greedy-strategy-based target image retrieval method with fast-optimizing deep hash coding. It uses a greedy strategy to solve the deep hash discrete optimization problem, iteratively updating the network toward the approximately optimal solution satisfying the discrete constraint in the current situation, so as to complete training quickly and effectively. The concrete implementation designs a completely new deep hash coding module: its forward propagation strictly uses the sign function so that the discrete constraint always holds, avoiding the quantization-error problem, while its backward propagation passes the gradient back completely to the preceding layers, solving the vanishing-gradient problem while updating all code bits simultaneously, so that effective neural-network training and convergence are completed quickly. In addition, the present invention adds a penalty term to the objective function of network training to assist the proposed coding module; this effectively reduces the gradient deviation produced in the coding module's backward propagation and guarantees the accuracy of the parameter-update direction and the stability of network optimization, making the generated image codes more accurate and robust. The experimental section demonstrates that, compared with previous algorithms, the proposed deep hash coding method possesses faster training speed and better retrieval performance, fully showing the application potential of the present invention for large-scale image-database retrieval problems.
The technical advantages of the method of the present invention are embodied as follows:
(1) A completely new deep hash coding module is proposed based on the greedy strategy. In forward propagation, the coding module strictly uses the sign function to maintain the discrete constraint, thereby avoiding the introduction of quantization error; in backward propagation, it passes the gradient back completely, avoiding the vanishing-gradient problem and updating all code bits simultaneously, helping the network to complete training quickly and to improve coding performance. For large-scale image retrieval applications, this module can effectively reduce training cost and significantly improve retrieval precision.
(2) A penalty term that drives H toward sgn(H) is added to the objective function of network training to assist the proposed coding module. Its effect is to make the value of the continuous variable H closer to the discrete coding variable B, reducing the gradient deviation in the coding module's backward propagation and guaranteeing the accuracy of the parameter-update direction and the stability of network optimization, so that the generated image codes are more accurate and robust.
Brief description of the drawings
Fig. 1 is a schematic diagram of the gradient descent method for existing neural networks, where the abscissa is the network parameter w and the ordinate is the loss function of the trained network.
Fig. 2 is a schematic diagram of the sign function.
Fig. 3 is a flow diagram of the target image retrieval method provided by the present invention.
Fig. 4 is a structural schematic diagram of the network model used in the concrete implementation of the present invention, where the front-layer basic framework of the network adopts the AlexNet structure; an image passes through AlexNet to obtain the feature H, and then through the hash coding module proposed by the present invention to obtain the code B.
Fig. 5 is a comparison of the fast-optimization performance of the method of the present invention and other existing methods in the concrete implementation of the present invention.
Specific embodiments
The present invention is further described below by embodiments with reference to the accompanying drawings, without limiting the scope of the invention in any way.
The present invention provides a greedy-strategy-based fast-optimizing deep hash image coding method and target image retrieval method. It uses a greedy strategy to solve the deep hash discrete optimization problem, iteratively updating the network toward the approximately optimal solution satisfying the discrete constraint in the current situation, so as to complete training quickly and effectively. A completely new deep hash coding module is designed whose forward propagation strictly uses the sign function so that the discrete constraint always holds, avoiding the quantization-error problem, and whose backward propagation passes the gradient back completely to the preceding layers, solving the vanishing-gradient problem while updating all code bits simultaneously, so that effective neural-network training and convergence are completed quickly. In addition, the present invention adds a penalty term to the objective function of network training to assist the proposed coding module; this effectively reduces the gradient deviation produced in the coding module's backward propagation and guarantees the accuracy of the parameter-update direction and the stability of network optimization, making the generated image codes more accurate and robust.
The process of the method of the present invention is shown in Fig. 3, and the overall structure of the network is shown in Fig. 4.
In the concrete implementation, the present invention generates image codes for all query images and database images, comprising the following steps.
First, modeling the hash coding problem yields the following optimization problem:

min_B L(B)  s.t. B ∈ {-1, +1}^K    Formula (1)

where B denotes the binary code generated by forward-propagating the input image through the deep network, and the constraint restricts each of the K bits of code B to be chosen from {-1, +1} (i.e., each image is encoded into K binary bits). L(B) denotes the loss function computed on B. Because the proposed algorithm, unlike DSDH, does not require L to be of quadratic programming form, L in the model of the invention can be any of the common loss functions of the deep learning field, such as the mean squared error function or the cross-entropy function.
This optimization problem is solved to obtain the optimal binary code B. Ignoring the discrete constraint B ∈ {-1, +1}^K, the gradient ∂L/∂B of L with respect to B can first be computed, and updates iterated with the following gradient descent rule:

B^(t+1) = B^t - lr · ∂L/∂B^t    Formula (2)

where t denotes the t-th round of updating during training, lr denotes the learning rate set in advance by the algorithm, B^t denotes code B after the t-th round of updating, B^(t+1) denotes code B after the (t+1)-th round of updating, and L denotes the loss function of the model. However, the B computed by this formula almost never satisfies the constraint B ∈ {-1, +1}^K, and once the discrete-value constraint is imposed, problem (1) becomes an NP-hard problem. A fast and effective way to approach an NP-hard problem is a greedy algorithm: each iterative update makes the optimal choice for the current situation, and finally a feasible solution very close to the global optimum can be obtained. Since the B^(t+1) found by gradient update formula (2) is the optimal update direction chosen for L(B) in the current iteration when the discrete constraint is ignored, by the greedy principle the solution nearest to the continuous value B^(t+1) that satisfies the discrete-value constraint, namely sgn(B^(t+1)), is very likely the optimal discrete solution in the current iteration. The invention therefore "greedily" updates the parameters toward the solution sgn(B^(t+1)), i.e., solves the problem in (1) using the following formula:

B^(t+1) = sgn(B^t - lr · ∂L/∂B^t)    Formula (3)
Although formula (3) may not be the most efficient method for solving a discrete optimization problem on its own, it is one of the ways best able to merge with a neural network to solve discrete optimization problems, for the following three reasons:
1. Deep neural networks complete parameter updating and learning by stochastic gradient descent, and gradient descent is itself a greedy strategy: it iteratively takes one step in the direction of steepest descent for the current situation to reach final convergence. The feasibility of greedy strategies in neural networks is therefore very high, and using this idea to solve the discrete-value optimization problem in a deep network is both reasonable and promising.
2. The update form of equation (3) is in fact a variant of the neural-network gradient update formula (it likewise computes the parameter gradient and then performs a parameter update); the similarity of the two update patterns lays a solid foundation for realizing formula (3) inside a neural network (see the coding-module design below).
3. As pointed out in document [4], the stochastic gradient descent algorithm (which computes gradients with part of the samples) naturally amounts to adding noise to the true gradient information (computed with all training samples); appropriate noise, besides bringing some regularization effect to the neural network, can also allow the neural network to escape local minima and saddle points during optimization. Observing formula (3), one finds that it acts exactly like introducing such "noise" into formula (2) through the sign function sgn(·). Therefore using formula (3) in a neural network not only helps the network solve the discrete-value optimization problem but also, to a certain degree, improves the performance of the network's optimization process.
Based on formula (3) of the greedy strategy for solving the discrete coding optimization problem in a neural network, the deep hash coding module proposed by the invention is now described in detail, realizing the update rule of formula (3) in a deep network.
Following the basic procedure of deep hashing algorithms introduced in the background section, a convolutional neural network (any common deep network may be used, such as AlexNet or ResNet) first represents the input image as a string of continuous-valued image features, denoted H. A completely new hash coding module is then designed whose input is the continuous-valued image feature H and whose output is exactly the code B. The update rule of equation (3) must be realized in this module, and the module's construction intuitively shows how it solves the vanishing-gradient and quantization-error problems of deep hashing and achieves the goals of fast optimization and accurate coding.
Variable H is introduced into formula (3) first, available:
Ht+1Indicate that t+1 takes turns updated variable H;
The task has thus become how to realize (4a) and (4b). Observing formula (4a), it is quickly found that realizing it is very direct: the sign function is simply applied to the input H in the forward propagation of the newly designed coding module:
B = sgn(H)    formula (5)
As for (4b), a penalty term ‖H − sgn(H)‖ is first added to the loss function L. With this penalty, the network is made to approximately satisfy H ≈ sgn(H) = B during training, and the update formula for the variable H becomes:
H^(t+1) = H^t − lr·∂L/∂H^t    formula (6)
wherein H^t denotes the variable H after the t-th round of updates;
Contrasting (4b) with formula (6), it can be found that (4b) is realized in the backpropagation of the newly designed depth Hash coding module by:
∂L/∂H = ∂L/∂B    formula (7)
That is, in the back-propagation process of the newly designed coding module, the gradient of the loss function with respect to B is passed back completely and directly to the front layer H, and finally back to the network front end, where parameter learning and network updating are completed.
Intuitively, the two most crucial parts of the Hash coding module's realization (formulas 5 and 7) respectively solve the two difficult problems of the depth Hash field: gradient vanishing and quantization error. First, the Hash coding module uses equation (5) in the forward propagation process, which means it strictly maintains the discrete-value property of the code throughout training without any relaxation, thereby fundamentally preventing the quantization-error problem. In the back-propagation process, the coding layer uses formula (7), so that the gradient of the loss function with respect to B is passed back completely and directly to the variable H; on the one hand this solves the gradient-vanishing problem caused by using the sign function sgn() in forward propagation, and on the other hand every code bit obtains its gradient and is updated simultaneously, enabling fast training and learning of the network.
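The forward rule of formula (5) and the backward rule of formula (7) can be sketched as a minimal NumPy layer (the patent's actual implementation is a PyTorch module; this is only an illustrative analogue):

```python
import numpy as np

class SignStraightThrough:
    """Hash coding layer sketch: the forward pass applies the strict sign
    function (formula 5) with no relaxation; the backward pass copies the
    gradient of B unchanged onto H (formula 7), so sgn() never blocks
    gradient flow to the preceding layers."""

    def forward(self, H):
        self.H = H                           # cached continuous feature
        return np.where(H >= 0, 1.0, -1.0)   # B = sgn(H), strictly binary

    def backward(self, dL_dB):
        return dL_dB                         # dL/dH := dL/dB, passed straight through
```

In a PyTorch implementation this pair of rules would typically live in a custom `autograd.Function`; the class above only mirrors the two formulas.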
In addition, the present invention adds a penalty term ‖H − sgn(H)‖ to the objective function of network training to assist the proposed coding module. Its effect can intuitively be understood as bringing the value of the variable H closer to the variable B, so that the gradient deviation is reduced (i.e., the two sides of formula (6) are made as equal as possible), guaranteeing the accuracy of the parameter update direction and the stability of the network optimization. Therefore, the use of formula (7) enables the algorithm of the present invention to reach a faster optimization speed than other algorithms, and the cooperation of formulas (5) and (7) with the penalty term allows the Hash coding layer of the present invention to merge better with the deep neural network, thus achieving better coding performance.
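The penalty term can be sketched as follows (the exponent p is an assumed hyperparameter; the text only specifies that the penalty pulls H toward B = sgn(H)):

```python
import numpy as np

def quantization_penalty(H, p=3):
    """Illustrative penalty pulling the continuous feature H toward its
    binarization B = sgn(H). The norm exponent p is an assumption, not
    stated in the text; p = 3 is shown only as one possible choice."""
    B = np.where(H >= 0, 1.0, -1.0)     # target binary code
    return np.sum(np.abs(H - B) ** p)   # zero exactly when H is already binary
```

The penalty vanishes when every component of H already lies on {−1, +1}, which is precisely the condition H ≈ sgn(H) = B used to justify formula (6).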
The method of the present invention is verified below:
The experimental details are explained first. The code implementing the algorithm of the invention was written on the PyTorch framework. AlexNet was selected as the convolutional neural network basis of the invention, i.e., it performs the operation of extracting the continuous-valued features of the image; the Hash coding layer of the invention is then appended to the output H of AlexNet to generate the image binary code B. The code B is then classified using the most common cross-entropy loss; the sum of the misclassification losses is computed, backpropagation is started, the parameters are updated, and the training of the network is completed. The invention uses a mini-batch update scheme with a batch size of 32, together with a stochastic gradient descent optimizer with momentum; the learning rate lr is set to 0.001 and the momentum parameter to 0.9.
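The described optimizer step (lr = 0.001, momentum = 0.9) corresponds to the classic momentum update; a NumPy sketch under the assumption of the standard momentum form:

```python
import numpy as np

def sgd_momentum_step(w, grad, v, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update matching the described settings
    (lr = 0.001, momentum = 0.9). The classic velocity form is assumed;
    the patent only names the optimizer and its two hyperparameters."""
    v = momentum * v - lr * grad   # accumulate velocity
    return w + v, v                # updated parameters and velocity

# Two toy steps with a constant (assumed) gradient of 10.
w, v = np.array([1.0]), np.zeros(1)
w, v = sgd_momentum_step(w, np.array([10.0]), v)
w, v = sgd_momentum_step(w, np.array([10.0]), v)
```

In the actual implementation this role is played by PyTorch's built-in momentum SGD optimizer rather than a hand-written loop.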
The rapid-optimization performance of the algorithm is shown first. The invention is compared with the DSDH algorithm, which likewise maintains the discrete constraint throughout, and with some common relaxation-strategy Hash coding methods ("ours" in the figure is the algorithm proposed by the present invention; "tanh" refers to replacing the sgn function with the relaxed tanh function; "penalty" refers to pushing the network output toward the values −1 and +1 through a penalty, without a hard constraint; the number 12 indicates coding with 12 bits, and 48 indicates 48 bits). The results are shown in figure 5.
It is clear from the figure that the algorithm proposed by the present invention uses fewer training iterations (epochs) to reach a better retrieval performance (MAP) faster, so that the training cost is lower than that of previous algorithms.
Next, the excellent retrieval performance of the invention on large-scale image datasets is shown. The invention is compared with several depth hash algorithms that currently possess top performance in this field; the results are shown in table 1 (tested on the dataset CIFAR10) and table 2 (tested on the dataset ImageNet).
Table 1: comparison of retrieval performance (MAP) on the CIFAR10 dataset
Table 2: comparison of retrieval performance (MAP) on the ImageNet dataset
It is apparent from the tables that, compared with previous depth hash algorithms such as DSDH and HashNet, the present invention possesses better coding performance, meaning that the image binary codes generated by the invention achieve better retrieval performance and are very suitable for application in large-scale image retrieval systems.
It should be noted that the purpose of disclosing the embodiments is to help further understand the present invention, but those skilled in the art will understand that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the contents disclosed in the embodiments, and the scope of protection of the invention is subject to the scope defined by the claims.
Claims (6)
1. A rapid-optimization depth hashing image coding method, which establishes a Hash coding model based on a Greedy strategy for a large-scale image dataset and generates the binary codes of all images through the depth Hash coding network obtained after optimization; comprising the following steps:
1) modeling the hashing image coding problem to obtain the hashing image coding model;
the hashing image coding model is expressed as formula (1):
min_B L(B),  s.t. B ∈ {−1,+1}^K    formula (1)
wherein B denotes the binary code generated by performing forward propagation on the input image with the depth network; the constraint condition restricts each bit of the code B to be selected only from {−1,+1}, with K bits in total, i.e., each image is encoded as a K-bit binary code; L(B) denotes the loss function computed on B;
2) solving the hashing image coding model using the Greedy strategy to obtain the optimal binary code B; comprising the following operations:
21) in the solving process, first ignoring the discrete constraint B ∈ {−1,+1}^K, computing the gradient of L with respect to B, and then performing iterative updates with the gradient descent method of formula (2):
B^(t+1) = B^t − lr·∂L/∂B^t    formula (2)
wherein t denotes the t-th round of updates in the training process, lr denotes the learning rate set in advance by the algorithm, B^t denotes the code B after the t-th round of updates, B^(t+1) denotes the code B after the (t+1)-th round of updates, and L denotes the loss function of the model; the B^(t+1) found with the gradient update formula (2) is the optimal update direction for the current iteration chosen by L(B) when the discrete constraint is not considered;
22) obtaining the solution that is nearest to the continuous value B^(t+1) and satisfies the discrete-value constraint, i.e., sgn(B^(t+1)), wherein sgn() denotes the element-wise sign function;
23) updating the parameters in the direction of the solution sgn(B^(t+1)), i.e., solving formula (1) using formula (3):
B^(t+1) = sgn(B^t − lr·∂L/∂B^t)    formula (3)
3) designing the depth hashing image coding module in the depth network and training the hashing image coding model to realize the update mode of formula (3); comprising the following operations:
31) expressing the input image as a string of continuous-valued image features H using a convolutional neural network;
32) designing a completely new depth hashing image coding module at the last layer of the convolutional neural network:
the input is the continuous-valued image feature H, and the output is the code B;
the update mode of formula (3) is realized in the depth hashing image coding module: in the forward propagation of the module, the sign function is applied to H element-wise to obtain the binary-valued code B;
in the backpropagation of the module, the gradient information obtained for the code B is assigned directly to H, i.e., the gradient information of H is set equal to the gradient information of B, so that the gradient passes smoothly back to the preceding layers of the network;
4) after the training of the neural network is completed and has converged, obtaining the trained image depth Hash coding network;
5) encoding all database images with the trained depth Hash coding network to generate the database image codes;
through the above process, rapid-optimization depth hashing image coding based on the Greedy strategy is realized.
2. The rapid-optimization depth hashing image coding method according to claim 1, characterized in that the loss function comprises the squared-error function or the cross-entropy function.
3. The rapid-optimization depth hashing image coding method according to claim 1, characterized in that, in step 32), the depth hashing image coding module performs the following operations:
321) first introducing the variable H into formula (3) to obtain formula (4), comprising formulas (4a) and (4b):
B^(t+1) = sgn(H^(t+1))    formula (4a)
H^(t+1) = H^t − lr·∂L/∂B^t    formula (4b)
wherein H^(t+1) denotes the variable H after the (t+1)-th round of updates;
322) realizing formula (4a) by applying the sign function to the input H in the forward propagation of the depth hashing image coding module, expressed as formula (5):
B = sgn(H)    formula (5)
323) adding a penalty term to the objective function of network training to assist the depth hashing image coding module, so as to realize formula (4b);
for (4b), a penalty term ‖H − sgn(H)‖ is first added to the loss function L, so that the network approximately satisfies H ≈ sgn(H) = B during training; for the variable H, the update formula is formula (6):
H^(t+1) = H^t − lr·∂L/∂H^t    formula (6)
wherein H^t denotes the variable H after the t-th round of updates;
contrasting formula (6) with formula (4b), in the backpropagation of the depth Hash coding module the gradient is assigned back directly, expressed as formula (7):
∂L/∂H = ∂L/∂B    formula (7)
formula (7) represents that, in the back-propagation process of the newly designed coding module, the gradient of the loss function with respect to B is passed back completely and directly to the front layer H, and finally back to the network front end, thereby completing parameter learning and network updating.
4. The rapid-optimization depth hashing image coding method according to claim 1, characterized in that the convolutional neural network uses the depth network AlexNet or ResNet.
5. A target image retrieval method with rapid-optimization depth Hash coding, which establishes a hashing image coding model using the rapid-optimization depth hashing image coding method according to any one of claims 1 to 4, generates the binary codes of all images through the depth Hash coding network obtained after optimization, and then obtains images similar to the query image by computing the Hamming distance between the query image code and the database image codes, thereby realizing rapid retrieval of database images similar to the target query image.
6. The target image retrieval method with rapid-optimization depth Hash coding according to claim 5, characterized in that the following operations are specifically executed:
6) when the user provides a query image, first encoding the query image with the rapid-optimization depth hashing image coding method to generate the code of the query image;
7) then computing the Hamming distance between the query image code and all database image codes, sorting, and returning the M database images with the smallest distances as the similar-image retrieval result of the query image; M is the number of returned images set by the user.
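The retrieval described in claims 5 and 6 can be sketched as follows, assuming {−1,+1} codes so that the Hamming distance of K-bit codes equals (K − q·d)/2 (an illustrative NumPy sketch, not the patented implementation):

```python
import numpy as np

def hamming_top_m(query_code, db_codes, M):
    """Return indices of the M database codes nearest to the query code
    under Hamming distance. For {-1,+1} codes of length K, the Hamming
    distance reduces to (K - dot(q, d)) / 2, computed here as a single
    matrix-vector product over the whole database."""
    K = query_code.shape[0]
    dists = (K - db_codes @ query_code) / 2   # Hamming distance to every image
    return np.argsort(dists, kind="stable")[:M]

# Toy 4-bit database with an assumed query.
q = np.array([1, 1, -1, 1])
db = np.array([[1, 1, -1, 1],     # distance 0
               [-1, -1, 1, -1],   # distance 4
               [1, -1, -1, 1]])   # distance 1
top = hamming_top_m(q, db, 2)
```

In production systems the same distances are usually computed with packed bits and XOR/popcount; the dot-product identity above is mathematically equivalent for ±1 codes.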
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910701690.6A CN110457503B (en) | 2019-07-31 | 2019-07-31 | Method for quickly optimizing depth hash image coding and target image retrieval |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110457503A true CN110457503A (en) | 2019-11-15 |
CN110457503B CN110457503B (en) | 2022-03-25 |
Family
ID=68484255
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910701690.6A Active CN110457503B (en) | 2019-07-31 | 2019-07-31 | Method for quickly optimizing depth hash image coding and target image retrieval |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110457503B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102799881A (en) * | 2012-07-05 | 2012-11-28 | 哈尔滨理工大学 | Fingerprint direction information obtaining method based on binary image encoding model |
US9351007B1 (en) * | 2005-07-28 | 2016-05-24 | Teradici Corporation | Progressive block encoding using region analysis |
CN108932314A (en) * | 2018-06-21 | 2018-12-04 | 南京农业大学 | A kind of chrysanthemum image content retrieval method based on the study of depth Hash |
CN110069644A (en) * | 2019-04-24 | 2019-07-30 | 南京邮电大学 | A kind of compression domain large-scale image search method based on deep learning |
Non-Patent Citations (1)
Title |
---|
Liu Yuying et al., "Application of deep hashing in large-scale image processing", Proceedings of the 21st Annual Conference on New Network Technologies and Applications, Network Application Branch of the China Computer Users Association, 2017 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111506761A (en) * | 2020-04-22 | 2020-08-07 | 上海极链网络科技有限公司 | Similar picture query method, device, system and storage medium |
CN111506761B (en) * | 2020-04-22 | 2021-05-14 | 上海极链网络科技有限公司 | Similar picture query method, device, system and storage medium |
WO2022055905A1 (en) * | 2020-09-08 | 2022-03-17 | Kla Corporation | Unsupervised pattern synonym detection using image hashing |
CN116018615A (en) * | 2020-09-08 | 2023-04-25 | 科磊股份有限公司 | Unsupervised pattern equivalence detection using image hashing |
US11748868B2 (en) | 2020-09-08 | 2023-09-05 | Kla Corporation | Unsupervised pattern synonym detection using image hashing |
CN112862096A (en) * | 2021-02-04 | 2021-05-28 | 百果园技术(新加坡)有限公司 | Model training and data processing method, device, equipment and medium |
CN113034626A (en) * | 2021-03-03 | 2021-06-25 | 中国科学技术大学 | Optimization method for alignment of target object in feature domain in structured image coding |
CN113034626B (en) * | 2021-03-03 | 2024-04-02 | 中国科学技术大学 | Optimization method for alignment of target object in feature domain in structured image coding |
CN113343020A (en) * | 2021-08-06 | 2021-09-03 | 腾讯科技(深圳)有限公司 | Image processing method and device based on artificial intelligence and electronic equipment |
CN115495546A (en) * | 2022-11-21 | 2022-12-20 | 中国科学技术大学 | Similar text retrieval method, system, device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110457503B (en) | 2022-03-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110457503A (en) | A kind of rapid Optimum depth hashing image coding method and target image search method | |
CN111914085B (en) | Text fine granularity emotion classification method, system, device and storage medium | |
CN106874898A (en) | Extensive face identification method based on depth convolutional neural networks model | |
CN107423376A (en) | One kind has the quick picture retrieval method of supervision depth Hash and system | |
Shao et al. | Intern: A new learning paradigm towards general vision | |
Du et al. | Real-time detection of vehicle and traffic light for intelligent and connected vehicles based on YOLOv3 network | |
CN110188263B (en) | Heterogeneous time interval-oriented scientific research hotspot prediction method and system | |
Li et al. | Vision-language models in remote sensing: Current progress and future trends | |
CN114357148A (en) | Image text retrieval method based on multi-level network | |
CN113705099A (en) | Social platform rumor detection model construction method and detection method based on contrast learning | |
CN113011396A (en) | Gait recognition method based on deep learning cascade feature fusion | |
CN117236492A (en) | Traffic demand prediction method based on dynamic multi-scale graph learning | |
CN110197521B (en) | Visual text embedding method based on semantic structure representation | |
CN117235216A (en) | Knowledge reasoning method based on heterogeneous knowledge fusion | |
Liu et al. | Improvement of pruning method for convolution neural network compression | |
Sairam et al. | Image Captioning using CNN and LSTM | |
CN117496388A (en) | Cross-modal video description model based on dynamic memory network | |
CN116401353A (en) | Safe multi-hop question-answering method and system combining internal knowledge patterns and external knowledge patterns | |
Syed et al. | STGT: Forecasting pedestrian motion using spatio-temporal graph transformer | |
CN114332701B (en) | Target tracking method based on task distinguishing detection and re-identification combined network | |
CN115424004A (en) | Target detection method based on attention mechanism and comparative learning loss function | |
CN114911930A (en) | Global and local complementary bidirectional attention video question-answering method and system | |
Xue et al. | Integrating multi-network topology via deep semi-supervised node embedding | |
Li et al. | Automated deep learning system for power line inspection image analysis and processing: Architecture and design issues | |
Li et al. | Improved YOLOV3 surveillance device object detection method based on federated learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||