CN109299306B - Image retrieval method and device - Google Patents

Image retrieval method and device

Info

Publication number
CN109299306B
CN109299306B
Authority
CN
China
Prior art keywords
image
feature vector
neural network
network model
model
Prior art date
Legal status
Active
Application number
CN201811533583.9A
Other languages
Chinese (zh)
Other versions
CN109299306A (en)
Inventor
张勇
朱立松
Current Assignee
Cntv Wuxi Co ltd
Original Assignee
Cntv Wuxi Co ltd
Priority date
Filing date
Publication date
Application filed by Cntv Wuxi Co ltd
Priority to CN201811533583.9A
Publication of CN109299306A
Application granted
Publication of CN109299306B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an image retrieval method and an image retrieval device. The image retrieval method comprises the following steps: S10, constructing a neural network model; S20, training the neural network model for the first time with a training set; S30, randomly selecting two similar images from the training set as input, training the neural network model a second time using an adjacent component minimization method and a back propagation algorithm, and dividing the feature vector of an image into a first-stage feature vector and a second-stage feature vector; S40, inputting the image to be retrieved into the neural network model to obtain the first-stage feature vector and the second-stage feature vector of the image to be retrieved; S50, finding images similar to the image to be retrieved in a database by calculating the distance between the first-stage feature vector of the image to be retrieved and those of the images in the database. This greatly accelerates matching during image search while improving search precision, and excellent performance is obtained without relying on manual design based on the experience of technical personnel.

Description

Image retrieval method and device
Technical Field
The invention relates to the technical field of computer vision, in particular to an image retrieval method and device.
Background
Image data is typical unstructured data, and at present querying, retrieving and comparing the similarity of image data in a database still presents certain difficulties, for the following reasons: 1) image data has high dimensionality; the resolution of an ordinary high-definition image can reach about 2 million pixels, and that of an ultra-high-definition image can reach 8 million pixels; 2) the semantics contained in an image are difficult to obtain directly from the image data; for example, if an image contains a car, a person viewing the image observes this easily, whereas a computer does not, and the specific semantics that the image contains a car can only be recognized by algorithms such as artificial intelligence.
To make images easier to query, retrieve and compare, a common approach is to extract image features; for example, the Scale-Invariant Feature Transform (SIFT) algorithm or the SURF algorithm (an improvement of SIFT) is used to extract local feature points of the image. Both algorithms describe the distribution of pixel values in the local region around a feature point: each SIFT feature point corresponds to a 128-dimensional description vector, and each SURF feature point corresponds to a 64-dimensional description vector. However, the dimensionality of the feature vectors calculated by feature algorithms such as SURF or SIFT is still high and cannot meet the requirements of fast image retrieval.
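For reference, extracting such local feature points might look roughly as follows; this is a minimal sketch that assumes OpenCV's SIFT implementation is available ("query.jpg" is only a placeholder file name) and is not part of the method disclosed below:

```python
import cv2  # OpenCV, assumed to be built with SIFT support

img = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
# Each detected keypoint yields a 128-dimensional SIFT descriptor, so even a
# single image produces many high-dimensional vectors that must be matched.
print(descriptors.shape)  # (number_of_keypoints, 128)
```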
Disclosure of Invention
In view of the defects of the prior art, the invention provides an image retrieval method and device, which effectively solve the technical problem that rapid image retrieval cannot be achieved in the prior art.
To achieve this purpose, the invention is realized by the following technical solution:
an image retrieval method, comprising:
s10, constructing a neural network model, which comprises an input layer positioned at the data input side of the model, a characteristic output layer positioned in the middle of the model and a model output layer positioned at the output side of the model, wherein the neural network model is symmetrically arranged along the characteristic output layer positioned in the middle, the number of neurons from the input layer to the characteristic output layer is gradually reduced, and the number of neurons from the characteristic output layer to the model output layer is gradually increased;
s20, training the neural network model for the first time by using a training set, recovering and obtaining an image input by an input layer on a model output layer, and updating the weight connected with each layer of neural network and the feature vector of the image output from a feature output layer;
s30 randomly selecting two similar images from a training set as input, carrying out secondary training on the neural network model by using an adjacent component minimization method and a back propagation algorithm, and dividing the feature vector of the image into a first-level feature vector and a second-level feature vector, wherein the first-level feature vector consists of the same features of the two images, and the second-level feature vector consists of features different from the other image, so as to complete the training of the neural network model;
s40 reserving the part from the input layer to the feature output layer in the neural network model as the neural network model for image retrieval, and inputting the image to be retrieved to be inquired into the neural network model to obtain the first-stage feature vector and the second-stage feature vector of the image to be retrieved;
s50, finding out the image similar to the image to be searched in the database by calculating the distance between the first-stage characteristic vector of the image to be searched and the image in the database.
Further preferably, in step S20, the neural network model is trained for the first time according to the objective function O, and the weight connected to each layer of neural network and the feature vector of the image output from the feature output layer are updated;
O = min Σ_{i=1..Q} ||y_i - x_i||²
wherein y_i is the output vector of the model, x_i is the input vector corresponding to the image input to the model, and Q is the number of images in the training set.
Further preferably, in step S30, the objective function O1 is propagated backward from the model output layer to update the weights of each layer of the neural network; when the backward propagation reaches the feature output layer, the gradient information of the objective function O2 is superimposed and the propagation continues;
O1 = min Σ_{i=1..Q} ||y_i - x_i||²
wherein λ is a weight parameter, and λ ∈ (0, 1);
O2 = max λO_NCA
O_NCA = Σ_{i=1..Q} Σ_{j: c(x_j)=c(x_i)} p_ij
wherein x_i denotes the vector representing image i, x_j denotes the vector representing image j, c(x_i) denotes the class to which image i belongs, c(x_j) denotes the class to which image j belongs, and p_ij denotes the probability that image i takes image j as a neighbour,
p_ij = exp(-d(i, j)) / Σ_{q≠i} exp(-d(i, q))
the image q is an image in the database other than image i, d(i, j) denotes the Euclidean distance between the first-level feature vectors of image i and image j, d(i, j) = d(x_i, x_j) = ||f(x_i) - f(x_j)||², f(x_i) denotes the first-level feature vector of image i, and f(x_j) denotes the first-level feature vector of image j.
Further preferably, in step S50, the method includes:
s51, calculating the Euclidean distance between the first-level feature vector of the image to be retrieved and the first-level feature vector of the image in the database;
s52, comparing the calculation results to obtain an image similar to the image to be retrieved in the database;
s53 judges whether or not there are a plurality of images similar to the image to be retrieved in the database, and if so,
s54, calculating the Euclidean distance between the second-level feature vector of the image to be retrieved and the second-level feature vector of the similar image in the database;
s55, comparing the calculation results to obtain the image most similar to the image to be searched in the database.
Further preferably, the neural network model is a binary neural network, and in step S50, a hamming distance between the first-level feature vector and the image in the database is calculated.
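For the binary neural network variant, comparing two first-level feature vectors reduces to a Hamming distance between bit vectors; a minimal sketch, assuming the feature values are already binarized to 0/1:

```python
import numpy as np

def hamming_distance(a, b):
    """Number of positions at which two binary feature vectors differ."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))

print(hamming_distance([1, 0, 1, 1], [1, 1, 1, 0]))  # -> 2
```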
The present invention also provides an image retrieval apparatus comprising:
the network model building module is used for building a neural network model and comprises an input layer positioned at the data input side of the model, a characteristic output layer positioned in the middle of the model and a model output layer positioned at the output side of the model, wherein the neural network model is symmetrically arranged along the characteristic output layer positioned in the middle, the number of neurons from the input layer to the characteristic output layer is gradually reduced, and the number of neurons from the characteristic output layer to the model output layer is gradually increased;
the network model training module is used for carrying out first training on the neural network model constructed by the network model construction module by using a training set, recovering and obtaining an image input by an input layer on a model output layer, and updating the weight connected with each layer of neural network and the feature vector of the image output from a feature output layer; randomly selecting two similar images from a training set as input, carrying out second training on the neural network model by using an adjacent component minimization device and a back propagation algorithm, and dividing the feature vector of the image into a first-stage feature vector and a second-stage feature vector, wherein the first-stage feature vector consists of the same features of the two images, and the second-stage feature vector consists of features different from the other image, so as to finish the training of the neural network model;
the neural network model is used for outputting and obtaining a first-stage feature vector and a second-stage feature vector of the image to be retrieved according to the input image to be retrieved;
and the retrieval module is used for searching the image similar to the image to be retrieved in the database by calculating the distance between the first-stage characteristic vector of the image to be retrieved output by the neural network model and the image in the database.
Further preferably, in the network model training module, the neural network model is trained for the first time according to the target function O, and the weight connected to each layer of neural network and the feature vector of the image output from the feature output layer are updated;
O = min Σ_{i=1..Q} ||y_i - x_i||²
wherein y_i is the output vector of the model, x_i is the input vector corresponding to the image input to the model, and Q is the number of images in the training set.
Further preferably, in the process of the second training of the neural network model by the network model training module, the objective function O1 is propagated backward from the model output layer to update the weights of each layer of the neural network; when the backward propagation reaches the feature output layer, the gradient information of the objective function O2 is superimposed and the propagation continues;
O1 = min Σ_{i=1..Q} ||y_i - x_i||²
wherein λ is a weight parameter, and λ ∈ (0, 1);
O2 = max λO_NCA
O_NCA = Σ_{i=1..Q} Σ_{j: c(x_j)=c(x_i)} p_ij
wherein x_i denotes the vector representing image i, x_j denotes the vector representing image j, c(x_i) denotes the class to which image i belongs, c(x_j) denotes the class to which image j belongs, and p_ij denotes the probability that image i takes image j as a neighbour,
p_ij = exp(-d(i, j)) / Σ_{q≠i} exp(-d(i, q))
the image q is an image in the database other than image i, d(i, j) denotes the Euclidean distance between the first-level feature vectors of image i and image j, d(i, j) = d(x_i, x_j) = ||f(x_i) - f(x_j)||², f(x_i) denotes the first-level feature vector of image i, and f(x_j) denotes the first-level feature vector of image j.
Further preferably, the retrieval module includes:
the calculation unit is used for calculating a first Euclidean distance between a first-level feature vector of the image to be retrieved and a first-level feature vector of the image in the database; when a plurality of images similar to the image to be retrieved exist in the database, further calculating a second Euclidean distance between a second-level feature vector of the image to be retrieved and a second-level feature vector of the similar image in the database;
the comparison unit is used for comparing the first Euclidean distances calculated by the calculation unit to obtain an image similar to the image to be retrieved in the database, and for comparing the calculated second Euclidean distances to obtain the image in the database that is most similar to the image to be retrieved.
Further preferably, the neural network model is a binary neural network, and in the retrieval module, the hamming distance between the first-level feature vector and the image in the database is calculated.
In the image retrieval method and device of the invention, a symmetric neural network model is constructed, and through training the feature vector of an image is divided into a first-level feature vector and a second-level feature vector, where the first-level feature vector consists of the features two similar images share and the second-level feature vector consists of the features that distinguish an image from the other. Therefore, in retrieving an image, similar images can be found by calculating the distance between first-level feature vectors and, where necessary, between second-level feature vectors; the matching speed in the image search process is greatly accelerated, the search precision is improved, and very good performance is obtained without relying on manual design based on the experience of technicians.
Drawings
A more complete understanding of the present invention, and the attendant advantages and features thereof, will be more readily understood by reference to the following detailed description when considered in conjunction with the accompanying drawings wherein:
FIG. 1 is a schematic flow chart of an image retrieval method according to the present invention;
FIG. 2 is a schematic diagram of a neural network model constructed in the present invention;
FIG. 3 is a schematic diagram of an image retrieving device according to the present invention.
Reference numerals:
100-image retrieval device, 110-network model construction module, 120-network model training module, 130-neural network model and 140-retrieval module.
Detailed Description
In order to make the contents of the present invention more comprehensible, the present invention is further described below with reference to the accompanying drawings. The invention is of course not limited to this particular embodiment, and general alternatives known to those skilled in the art are also covered by the scope of the invention.
As shown in fig. 1, which is a schematic flow chart of the image retrieval method provided by the present invention, it can be seen that the image retrieval method includes:
s10 a neural network model is built, the neural network model comprises an input layer located at the data input side of the model, a characteristic output layer located in the middle of the model and a model output layer located at the output side of the model, the neural network model is symmetrically arranged along the characteristic output layer located in the middle, the number of neurons from the input layer to the characteristic output layer is gradually reduced, and the number of neurons from the characteristic output layer to the model output layer is gradually increased;
s20, training the neural network model for the first time by using a training set, recovering and obtaining the image input by the input layer on the model output layer, updating the weight connected with each layer of neural network and obtaining the feature vector of the image output from the feature output layer;
s30 randomly selecting two similar images from the training set as input, carrying out second training on the neural network model by using an adjacent component minimization method and a back propagation algorithm, and dividing the feature vector of the image into a first-level feature vector and a second-level feature vector, wherein the first-level feature vector consists of the same features of the two images, and the second-level feature vector consists of features different from the other image, so as to complete the training of the neural network model;
s40 reserving the part from the input layer to the feature output layer in the neural network model as the neural network model for image retrieval, and inputting the image to be retrieved to be inquired into the neural network model to obtain the first-stage feature vector and the second-stage feature vector of the image to be retrieved;
s50, finding out the image similar to the image to be searched in the database by calculating the distance between the first-stage characteristic vector of the image to be searched and the image in the database.
Fig. 2 shows the constructed neural network model, which includes an input layer, first hidden layers, a feature output layer, second hidden layers and a model output layer. The layers are arranged symmetrically about the feature output layer located in the middle: the input layer corresponds to the model output layer, and each of the first hidden layers corresponds to one of the second hidden layers, with corresponding layers having the same number of neuron nodes. For example, in one example the first hidden layers comprise 3 hidden layers, the numbers of neurons in the input layer and these 3 hidden layers are, in order, [3072, 500, 400, 300], the feature output layer has 100 neurons, and the numbers of neurons in the corresponding second hidden layers (also 3 hidden layers) and the model output layer are [300, 400, 500, 3072].
In constructing the neural network model, the image is first processed to reduce its resolution to a smaller size, e.g., 32×32. For a color image with three components, YUV (or RGB), encoded with 8-bit pixels, a 32×32 image then consists of 3072 integers in [0, 255]. This 3072-dimensional vector is divided by 255 to normalize it to real numbers in [0, 1] and is fed into the neural network model as the states of the input-layer neurons; the neuron states of the hidden layers are then calculated in turn according to formula (1):
y = σ(Wx + b)   (1)
where the vector x represents the states of the input-layer neurons, W represents the weight matrix from the input layer to the first of the first hidden layers (there may be several hidden layers), b represents the bias values of that hidden layer, y represents the states of the first hidden layer, and σ(·) denotes the neuron activation function (for example, a sigmoid). After the state values of the first hidden layer are obtained, the states of the second layer, the third layer and so on within the first hidden layers, then the feature output layer, the second hidden layers and the model output layer are calculated in turn in the same way. The feature output layer contains N neurons, and the N real numbers they output form the N-dimensional feature vector of the image. It should be noted that the second hidden layers and the model output layer are only used while training the neural network; after training is completed, they may be removed from the neural network model, and the input layer, the first hidden layers and the feature output layer are retained for retrieving images.
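As a concrete illustration of this layer-by-layer computation, here is a minimal NumPy sketch of the symmetric network, assuming the example layer sizes [3072, 500, 400, 300, 100, 300, 400, 500, 3072] and a sigmoid activation; random weights and a random stand-in image are used, so it only illustrates the structure, not the patented implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Example layer sizes from the text: input, first hidden layers,
# feature output layer (100 neurons), second hidden layers, model output layer.
sizes = [3072, 500, 400, 300, 100, 300, 400, 500, 3072]
rng = np.random.default_rng(0)
weights = [rng.normal(0.0, 0.01, size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

def forward(x):
    """Propagate a normalized 3072-dim image vector through every layer.

    Returns the list of layer states; states[4] is the 100-neuron feature
    output layer and states[-1] is the model output layer."""
    states = [x]
    for W, b in zip(weights, biases):
        states.append(sigmoid(W @ states[-1] + b))
    return states

# Preprocessing as described: a 32x32 three-component image flattened to
# 3072 integers in [0, 255], then divided by 255.
image = rng.integers(0, 256, size=3072) / 255.0  # stand-in for a real image
feature_vector = forward(image)[4]               # the N = 100 feature values
```

After training, only the input layer, the first hidden layers and the feature output layer (states[0] to states[4] here) would be kept for retrieval, as described above.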
In the process of training the constructed neural network model, firstly, a training set is used for carrying out first training on the neural network model:
assume that the training set contains a total of Q images, each image corresponding to [0,1] of a particular dimension (e.g., 3072)]The real number vector between, noted as { xiTraining a neural network model according to an objective function O shown in a formula (2), updating weights connected with each layer of neural network, and outputting a feature vector of an obtained image from a feature output layer;
O = min Σ_{i=1..Q} ||y_i - x_i||²   (2)
where y_i is the output vector of the model (the output of the model output layer) and x_i is the input vector corresponding to the image fed into the model (the input of the input layer).
As can be seen from the objective function O, the aim of the first training is that the output of the model output layer be as close as possible to the input of the input layer, i.e. the image fed to the input layer is reconstructed at the model output layer. If the objective function O is driven to zero during the first training, the model output layer can reproduce the image input to the neural network model; in other words, the feature vector (N features) output by the feature output layer contains all the information of the image. Other objective functions may be chosen according to the actual situation as long as the purpose of the invention is achieved; for example, a cross-entropy function may also be used.
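A minimal sketch of this first-training objective, continuing the NumPy example above and assuming O is the summed squared reconstruction error between the model output y_i and the input x_i; a real implementation would minimize it by back-propagation rather than merely evaluate it:

```python
def reconstruction_objective(batch):
    """Sum over the batch of ||y_i - x_i||^2, with y_i the model output layer state."""
    total = 0.0
    for x in batch:
        y = forward(x)[-1]                    # model output layer state
        total += float(np.sum((y - x) ** 2))
    return total

Q = 8
batch = rng.integers(0, 256, size=(Q, 3072)) / 255.0  # stand-in training images
print(reconstruction_objective(batch))
```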
After the first training of the neural network model, the model is trained a second time using the adjacent component minimization method (based on neighbourhood component analysis, NCA) and the back propagation algorithm, so that similar images have first-level feature vectors that are as identical as possible. Assume that all Q images in the training set carry a class label, i.e. {x_i, c(x_i)}, where c(x_i) ∈ {1, 2, ..., C} denotes the class to which the image belongs, and images with the same label are considered similar. In addition, the first K of the N real numbers output by the feature output layer are defined as the first-level feature vector, denoted f(x_i) = f_k(x_i), k = 1, ..., K.
During this training, the back propagation algorithm is used: back propagation starts from the model output layer with the objective function O1 of formula (3), the weights of each layer of the neural network are updated, and the objective function O1 is minimized by adjusting the weights;
O1 = min Σ_{i=1..Q} ||y_i - x_i||²   (3)
When propagation reaches the feature output layer, the gradient information of the objective function O2 given by formulas (4) and (5) is superimposed and the propagation continues; in this process, λO_NCA (which involves only the first K neurons of the feature output layer) is maximized by adjusting the network weights;
O2 = max λO_NCA   (4)
O_NCA = Σ_{i=1..Q} Σ_{j: c(x_j)=c(x_i)} p_ij   (5)
where λ is a weight parameter that balances the objective functions O1 and O2, λ ∈ (0, 1); x_i is the vector representing image i, x_j is the vector representing image j, c(x_i) denotes the class to which image i belongs, c(x_j) denotes the class to which image j belongs, and p_ij, the probability that image i takes image j as a neighbour, is given by formula (6):
p_ij = exp(-d(i, j)) / Σ_{q≠i} exp(-d(i, q))   (6)
where image q is any image in the database other than image i, p_ii = 0, and d(i, j) denotes the Euclidean distance between the first-level feature vectors of images i and j, d(i, j) = d(x_i, x_j) = ||f(x_i) - f(x_j)||², with f(x_i) the first-level feature vector of image i and f(x_j) the first-level feature vector of image j.
The probability that image i belongs to class c is then given by formula (7):
p(c(x_i) = c) = Σ_{j: c(x_j)=c} p_ij   (7)
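Continuing the NumPy sketch above (reusing forward, rng, batch and Q from it), the neighbour probabilities p_ij of formula (6) and the quantity λO_NCA maximized during the second training can be illustrated as follows; K = 50, λ = 0.5 and the random class labels are placeholder assumptions:

```python
def first_level_features(batch, K=50):
    """First K values of the feature output layer for each image in the batch."""
    return np.stack([forward(x)[4][:K] for x in batch])

def nca_objective(batch, labels, K=50, lam=0.5):
    """lambda * O_NCA = lambda * sum_i sum_{j: c(x_j)=c(x_i)} p_ij, formulas (5) and (6)."""
    f = first_level_features(batch, K)
    d = np.sum((f[:, None, :] - f[None, :, :]) ** 2, axis=-1)  # d(i, j) = ||f(x_i)-f(x_j)||^2
    expd = np.exp(-d)
    np.fill_diagonal(expd, 0.0)                  # p_ii = 0
    p = expd / expd.sum(axis=1, keepdims=True)   # p_ij, formula (6)
    same = labels[:, None] == labels[None, :]    # pairs with c(x_j) = c(x_i)
    return lam * float(np.sum(p[same]))          # formula (5), scaled by lambda

labels = rng.integers(0, 3, size=len(batch))     # stand-in class labels c(x_i)
print(nca_objective(batch, labels))
```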
the training of the neural network model is completed in this way, and N real numbers output by the feature output layer after the second training is completed are divided into two stages, wherein the first-stage feature vector contains K real numbers, the K feature real numbers of two similar images are identical under an ideal condition, and the Euclidean distance between the first-stage feature vectors of the two images is calculated to compare in practical application; the second-level feature vector contains N-K real numbers, which are the features of the image itself (features different from other images in the similar set). Then, reserving a part from an input layer to a feature output layer in the neural network model as the neural network model for image retrieval, and when an image which is the same as or similar to the image to be retrieved needs to be queried in a database, inputting the image to be retrieved which needs to be queried into the neural network model to obtain a first-stage feature vector and a second-stage feature vector of the image to be retrieved; then, calculating the Euclidean distance between the first-stage feature vector of the image to be retrieved and the first-stage feature vector of the image in the database; and selecting the image with the minimum Euclidean distance as a similar image according to the calculation result, and completing the retrieval of the image. In practical application, the Euclidean distance between the first-level feature vector of the image to be retrieved and the first-level feature vector of the image in the database is calculated and compared to possibly obtain a large number of similar image sets, at the moment, the Euclidean distance between the second-level feature vector of the image to be retrieved and the second-level feature vector of the similar image in the database is further calculated, and the image which is most similar to the image to be retrieved in the database can be obtained through comparison. In another example, the neural network model is a binary neural network, and in step S50, the hamming distance between the first-level feature vector and the image in the database is calculated, thereby completing the data retrieval.
As shown in fig. 3, which is a schematic structural diagram of the image retrieval apparatus provided by the present invention, it can be seen that the image retrieval apparatus 100 includes: the neural network model comprises a network model building module 110, a network model training module 120, a neural network model 130 and a retrieval module 140, wherein the network model building module 110 is used for building the neural network model 130 and comprises an input layer positioned at the input side of model data, a characteristic output layer positioned in the middle of the model and a model output layer positioned at the output side of the model, the neural network model 130 is symmetrically arranged along the characteristic output layer positioned in the middle, the number of neurons from the input layer to the characteristic output layer is gradually reduced, and the number of neurons from the characteristic output layer to the model output layer is gradually increased; the network model training module 120 is configured to perform first training on the neural network model constructed by the network model construction module 110 by using a training set, recover and obtain an image input by the input layer on the model output layer, and update weights connected to each layer of the neural network and feature vectors of the image output from the feature output layer; randomly selecting two similar images from a training set as input, carrying out second training on the neural network model by using an adjacent component minimization method and a back propagation algorithm, and dividing the feature vector of the image into a first-stage feature vector and a second-stage feature vector, wherein the first-stage feature vector consists of the same features of the two images, and the second-stage feature vector consists of features different from the other image, so as to finish the training of the neural network model; the neural network model 130 is used for obtaining a first-level feature vector and a second-level feature vector of the image to be retrieved according to the input image to be retrieved; the retrieval module 140 is configured to search for an image similar to the image to be retrieved in the database by calculating a distance between the first-level feature vector of the image to be retrieved output by the neural network model 130 and the image in the database.
As shown in fig. 2, the neural network model 130 constructed by the network model construction module 110 includes an input layer, first hidden layers, a feature output layer, second hidden layers and a model output layer, arranged symmetrically about the feature output layer located in the middle: the input layer corresponds to the model output layer, and each of the first hidden layers corresponds to one of the second hidden layers, with corresponding layers having the same number of neuron nodes. For example, in one example the first hidden layers comprise 3 hidden layers, the numbers of neurons in the input layer and these 3 hidden layers are, in order, [3072, 500, 400, 300], the feature output layer has 100 neurons, and the numbers of neurons in the corresponding second hidden layers (also 3 hidden layers) and the model output layer are [300, 400, 500, 3072].
In constructing the neural network model 130, the network model building module 110 first processes the image to reduce its resolution to a smaller size, e.g., 32×32. For a color image with three components, YUV (or RGB), encoded with 8-bit pixels, a 32×32 image then consists of 3072 integers in [0, 255]. This 3072-dimensional vector is divided by 255 to normalize it to real numbers in [0, 1] and is fed into the neural network model 130 as the states of the input-layer neurons; the neuron states of the hidden layers are then calculated in turn according to formula (1):
y = σ(Wx + b)   (1)
where the vector x represents the states of the input-layer neurons, W represents the weight matrix from the input layer to the first of the first hidden layers (there may be several hidden layers), b represents the bias values of that hidden layer, y represents the states of the first hidden layer, and σ(·) denotes the neuron activation function (for example, a sigmoid). After the state values of the first hidden layer are obtained, the states of the second layer, the third layer and so on within the first hidden layers, then the feature output layer, the second hidden layers and the model output layer are calculated in turn in the same way. The feature output layer contains N neurons, and the N real numbers they output form the N-dimensional feature vector of the image. It should be noted that the second hidden layers and the model output layer are only used while training the neural network; after training is completed, they may be removed from the neural network model, and the input layer, the first hidden layers and the feature output layer are retained for retrieving images.
In the process of training the constructed neural network model 130 by using the network model training module 120, firstly, the neural network model is trained for the first time by using a training set:
assume that the training set contains a total of Q images, each image corresponding to [0,1] of a particular dimension (e.g., 3072)]The real number vector between, noted as { xiQ, and according to an objective function
Figure GDA0003155760870000112
Training the neural network model, updating the weight of each layer of neural network connection and obtaining the feature vector of the image output from the feature output layer, wherein yiIs the output vector of the model (corresponding to the output of the output layer of the model), xiIs the input vector corresponding to the image of the input model (corresponding to the input of the input layer).
As can be seen from the objective function O, the aim of the first training is that the output of the model output layer be as close as possible to the input of the input layer, i.e. the image fed to the input layer is reconstructed at the model output layer. If the objective function O is driven to zero during the first training, the model output layer can reproduce the image input to the neural network model 130; in other words, the feature vector (N features) output by the feature output layer contains all the information of the image. Other objective functions may be chosen according to the actual situation as long as the purpose of the invention is achieved; for example, a cross-entropy function may also be used.
After the first training of the neural network model 130, the model is trained a second time using the neighbor component minimization method and the back propagation algorithm, so that similar images have first-level feature vectors that are as identical as possible. Assume that all Q images in the training set carry a class label, i.e. {x_i, c(x_i)}, where c(x_i) ∈ {1, 2, ..., C} denotes the class to which the image belongs, and images with the same label are considered similar. In addition, the first K of the N real numbers output by the feature output layer are defined as the first-level feature vector, denoted f(x_i) = f_k(x_i), k = 1, ..., K.
During this training, the back propagation algorithm is used: back propagation starts from the model output layer with the objective function

O1 = min Σ_{i=1..Q} ||y_i - x_i||²

the weights of each layer of the neural network are updated, and the objective function O1 is minimized by adjusting the weights. When propagation reaches the feature output layer, the gradient information of O2 = max λO_NCA is superimposed and the propagation continues; in this process, λO_NCA (which involves only the first K neurons of the feature output layer) is maximized by adjusting the network weights. Specifically,

O_NCA = Σ_{i=1..Q} Σ_{j: c(x_j)=c(x_i)} p_ij
where λ is a weight parameter that balances the objective functions O1 and O2, λ ∈ (0, 1); x_i is the vector representing image i, x_j is the vector representing image j, c(x_i) denotes the class to which image i belongs, c(x_j) denotes the class to which image j belongs, and p_ij is the probability that image i takes image j as a neighbour,
p_ij = exp(-d(i, j)) / Σ_{q≠i} exp(-d(i, q))
where image q is any image in the database other than image i, p_ii = 0, and d(i, j) denotes the Euclidean distance between the first-level feature vectors of images i and j, d(i, j) = d(x_i, x_j) = ||f(x_i) - f(x_j)||², with f(x_i) the first-level feature vector of image i and f(x_j) the first-level feature vector of image j. In addition, the probability that image i belongs to class c is determined as

p(c(x_i) = c) = Σ_{j: c(x_j)=c} p_ij
Thus, the training of the neural network model 130 is completed. After the second training, the N real numbers output by the feature output layer are divided into two levels: the first-level feature vector contains K real numbers, and in the ideal case the K feature values of two similar images are identical, so in practice two images are compared by calculating the Euclidean distance between their first-level feature vectors; the second-level feature vector contains the remaining N-K real numbers, which are features of the image itself (features that distinguish it from the other images in the similar set). Then the part of the neural network model 130 from the input layer to the feature output layer is retained as the neural network model 130 used for image retrieval. When an image identical or similar to an image to be retrieved needs to be queried in the database through the retrieval module 140, the image to be retrieved is input into the neural network model 130 to obtain its first-level and second-level feature vectors; the calculation unit then calculates the Euclidean distance between the first-level feature vector of the image to be retrieved and the first-level feature vectors of the images in the database, and the comparison unit selects the image with the smallest Euclidean distance as the similar image, completing the retrieval. In practice, comparing the first-level feature vectors in this way may yield a large set of similar images; in that case the calculation unit further calculates the Euclidean distance between the second-level feature vector of the image to be retrieved and the second-level feature vectors of the similar images in the database, and the comparison unit obtains by comparison the image in the database that is most similar to the image to be retrieved. In another example, the neural network model 130 is a binary neural network, and in the retrieval module 140 the Hamming distance between the first-level feature vectors of the image to be retrieved and of the images in the database is calculated, completing the retrieval.
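Continuing the NumPy sketch above (reusing forward and rng), the two-stage retrieval just described can be illustrated as follows: candidates are ranked by the Euclidean distance between first-level feature vectors, and when several candidates are similarly close the second-level feature vectors break the tie. The in-memory database, K = 50 and the tie tolerance are illustrative assumptions; a binary-network variant would compare bit vectors with the Hamming distance instead:

```python
def split_features(x, K=50):
    """Return the (first-level, second-level) parts of the feature output layer."""
    v = forward(x)[4]
    return v[:K], v[K:]

def retrieve(query, database_images, K=50, tie_tol=1e-6):
    """Index of the database image most similar to the query image."""
    q1, q2 = split_features(query, K)
    feats = [split_features(x, K) for x in database_images]
    d1 = np.array([np.sum((q1 - f1) ** 2) for f1, _ in feats])   # first-level distances
    candidates = np.flatnonzero(d1 <= d1.min() + tie_tol)        # several similar images?
    if len(candidates) == 1:
        return int(candidates[0])
    # Break the tie with the second-level (image-specific) feature vectors.
    d2 = np.array([np.sum((q2 - feats[i][1]) ** 2) for i in candidates])
    return int(candidates[np.argmin(d2)])

database = rng.integers(0, 256, size=(20, 3072)) / 255.0  # stand-in image database
print(retrieve(database[3], database))
```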

Claims (6)

1. An image retrieval method, comprising:
s10, constructing a neural network model, which comprises an input layer positioned at the data input side of the model, a characteristic output layer positioned in the middle of the model and a model output layer positioned at the output side of the model, wherein the neural network model is symmetrically arranged along the characteristic output layer positioned in the middle, the number of neurons from the input layer to the characteristic output layer is gradually reduced, and the number of neurons from the characteristic output layer to the model output layer is gradually increased;
s20, training the neural network model for the first time by using a training set, recovering and obtaining an image input by an input layer on a model output layer, and updating the weight connected with each layer of neural network and the feature vector of the image output from a feature output layer;
s30 randomly selecting two similar images from a training set as input, carrying out secondary training on the neural network model by using an adjacent component minimization method and a back propagation algorithm, and dividing the feature vector of the image into a first-level feature vector and a second-level feature vector, wherein the first-level feature vector consists of the same features of the two images, and the second-level feature vector consists of features different from the other image, so as to complete the training of the neural network model;
s40 reserving the part from the input layer to the feature output layer in the neural network model as the neural network model for image retrieval, and inputting the image to be retrieved to be inquired into the neural network model to obtain the first-stage feature vector and the second-stage feature vector of the image to be retrieved;
s50, searching an image similar to the image to be retrieved in the database by calculating the distance between the first-stage feature vector of the image to be retrieved and the image in the database;
in step S20, training the neural network model for the first time according to the objective function O, updating weights of connections of each layer of neural network, and obtaining feature vectors of the image output from the feature output layer;
O = min Σ_{i=1..Q} ||y_i - x_i||²
wherein y_i is the output vector of the model, x_i is the input vector corresponding to the image input to the model, and Q is the number of images in the training set;
in step S30, the objective function O1 is propagated backward from the model output layer to update the weights of each layer of the neural network; when the backward propagation reaches the feature output layer, the gradient information of the objective function O2 is superimposed and the propagation continues;
O1 = min Σ_{i=1..Q} ||y_i - x_i||²
wherein λ is a weight parameter, and λ ∈ (0, 1);
O2 = max λO_NCA
O_NCA = Σ_{i=1..Q} Σ_{j: c(x_j)=c(x_i)} p_ij
wherein x_i denotes the vector representing image i, x_j denotes the vector representing image j, c(x_i) denotes the class to which image i belongs, c(x_j) denotes the class to which image j belongs, and p_ij denotes the probability that image i takes image j as a neighbour,
p_ij = exp(-d(i, j)) / Σ_{q≠i} exp(-d(i, q))
the image q is an image in the database other than image i, d(i, j) denotes the Euclidean distance between the first-level feature vectors of image i and image j, d(i, j) = d(x_i, x_j) = ||f(x_i) - f(x_j)||², f(x_i) denotes the first-level feature vector of image i, and f(x_j) denotes the first-level feature vector of image j.
2. The image retrieval method according to claim 1, wherein in step S50, comprising:
s51, calculating the Euclidean distance between the first-level feature vector of the image to be retrieved and the first-level feature vector of the image in the database;
s52, comparing the calculation results to obtain an image similar to the image to be retrieved in the database;
s53 judges whether or not there are a plurality of images similar to the image to be retrieved in the database, and if so,
s54, calculating the Euclidean distance between the second-level feature vector of the image to be retrieved and the second-level feature vector of the similar image in the database;
s55, comparing the calculation results to obtain the image most similar to the image to be searched in the database.
3. The image retrieval method of claim 1, wherein the neural network model is a binary neural network, and in step S50, a hamming distance between the first-level feature vector and the image in the database is calculated.
4. An image retrieval apparatus, comprising:
the network model building module is used for building a neural network model and comprises an input layer positioned at the data input side of the model, a characteristic output layer positioned in the middle of the model and a model output layer positioned at the output side of the model, wherein the neural network model is symmetrically arranged along the characteristic output layer positioned in the middle, the number of neurons from the input layer to the characteristic output layer is gradually reduced, and the number of neurons from the characteristic output layer to the model output layer is gradually increased;
the network model training module is used for carrying out first training on the neural network model constructed by the network model construction module by using a training set, recovering and obtaining an image input by an input layer on a model output layer, and updating the weight connected with each layer of neural network and the feature vector of the image output from a feature output layer; randomly selecting two similar images from a training set as input, carrying out second training on the neural network model by using an adjacent component minimization device and a back propagation algorithm, and dividing the feature vector of the image into a first-stage feature vector and a second-stage feature vector, wherein the first-stage feature vector consists of the same features of the two images, and the second-stage feature vector consists of features different from the other image, so as to finish the training of the neural network model;
the neural network model is used for outputting and obtaining a first-stage feature vector and a second-stage feature vector of the image to be retrieved according to the input image to be retrieved;
the retrieval module is used for searching images similar to the images to be retrieved in the database by calculating the distance between the first-stage characteristic vector of the images to be retrieved output by the neural network model and the images in the database;
in a network model training module, performing first training on the neural network model according to an objective function O, updating weights connected with each layer of neural network and outputting feature vectors of the obtained image from a feature output layer;
O = min Σ_{i=1..Q} ||y_i - x_i||²
wherein y_i is the output vector of the model, x_i is the input vector corresponding to the image input to the model, and Q is the number of images in the training set;
in the process of the second training of the neural network model by the network model training module, the objective function O1 is propagated backward from the model output layer to update the weights of each layer of the neural network; when the backward propagation reaches the feature output layer, the gradient information of the objective function O2 is superimposed and the propagation continues;
O1 = min Σ_{i=1..Q} ||y_i - x_i||²
wherein λ is a weight parameter, and λ ∈ (0, 1);
O2 = max λO_NCA
O_NCA = Σ_{i=1..Q} Σ_{j: c(x_j)=c(x_i)} p_ij
wherein x_i denotes the vector representing image i, x_j denotes the vector representing image j, c(x_i) denotes the class to which image i belongs, c(x_j) denotes the class to which image j belongs, and p_ij denotes the probability that image i takes image j as a neighbour,
p_ij = exp(-d(i, j)) / Σ_{q≠i} exp(-d(i, q))
the image q is an image in the database other than image i, d(i, j) denotes the Euclidean distance between the first-level feature vectors of image i and image j, d(i, j) = d(x_i, x_j) = ||f(x_i) - f(x_j)||², f(x_i) denotes the first-level feature vector of image i, and f(x_j) denotes the first-level feature vector of image j.
5. The image retrieval apparatus according to claim 4, wherein in the retrieval module, comprising:
the calculation unit is used for calculating a first Euclidean distance between a first-level feature vector of the image to be retrieved and a first-level feature vector of the image in the database; when a plurality of images similar to the image to be retrieved exist in the database, further calculating a second Euclidean distance between a second-level feature vector of the image to be retrieved and a second-level feature vector of the similar image in the database;
the comparison unit is used for comparing the first Euclidean distances calculated by the calculation unit to obtain an image similar to the image to be retrieved in the database, and for comparing the calculated second Euclidean distances to obtain the image in the database that is most similar to the image to be retrieved.
6. The image retrieval device of claim 4, wherein the neural network model is a binary neural network, and in the retrieval module, a Hamming distance between the first-level feature vector and the image in the database is calculated.
CN201811533583.9A 2018-12-14 2018-12-14 Image retrieval method and device Active CN109299306B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811533583.9A CN109299306B (en) 2018-12-14 2018-12-14 Image retrieval method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811533583.9A CN109299306B (en) 2018-12-14 2018-12-14 Image retrieval method and device

Publications (2)

Publication Number Publication Date
CN109299306A CN109299306A (en) 2019-02-01
CN109299306B (en) 2021-09-07

Family

ID=65141727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811533583.9A Active CN109299306B (en) 2018-12-14 2018-12-14 Image retrieval method and device

Country Status (1)

Country Link
CN (1) CN109299306B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111640051B (en) * 2019-03-01 2023-07-25 浙江大学 Image processing method and device
CN113780304B (en) * 2021-08-09 2023-12-05 国网安徽省电力有限公司超高压分公司 Substation equipment image retrieval method and system based on neural network
CN113792339A (en) * 2021-09-09 2021-12-14 浙江数秦科技有限公司 Bidirectional privacy secret neural network model sharing method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103186538A (en) * 2011-12-27 2013-07-03 阿里巴巴集团控股有限公司 Image classification method, image classification device, image retrieval method and image retrieval device
CN106529395A (en) * 2016-09-22 2017-03-22 文创智慧科技(武汉)有限公司 Signature image recognition method based on deep brief network and k-means clustering
CN107463932A (en) * 2017-07-13 2017-12-12 央视国际网络无锡有限公司 A kind of method that picture feature is extracted using binary system bottleneck neutral net
WO2018170671A1 (en) * 2017-03-20 2018-09-27 Intel Corporation Topic-guided model for image captioning system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103186538A (en) * 2011-12-27 2013-07-03 阿里巴巴集团控股有限公司 Image classification method, image classification device, image retrieval method and image retrieval device
CN106529395A (en) * 2016-09-22 2017-03-22 文创智慧科技(武汉)有限公司 Signature image recognition method based on deep brief network and k-means clustering
WO2018170671A1 (en) * 2017-03-20 2018-09-27 Intel Corporation Topic-guided model for image captioning system
CN107463932A (en) * 2017-07-13 2017-12-12 央视国际网络无锡有限公司 A kind of method that picture feature is extracted using binary system bottleneck neutral net

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hakan Çevikalp et al. "Feature extraction with convolutional neural networks for aerial image retrieval." 2017 25th Signal Processing and Communications Applications Conference (SIU), 2017. *

Also Published As

Publication number Publication date
CN109299306A (en) 2019-02-01

Similar Documents

Publication Publication Date Title
CN109299306B (en) Image retrieval method and device
CN109740541B (en) Pedestrian re-identification system and method
CN110852168A (en) Pedestrian re-recognition model construction method and device based on neural framework search
CN107463932B (en) Method for extracting picture features by using binary bottleneck neural network
CN110738146A (en) target re-recognition neural network and construction method and application thereof
CN107315795B (en) The instance of video search method and system of joint particular persons and scene
CN111738048B (en) Pedestrian re-identification method
US9842279B2 (en) Data processing method for learning discriminator, and data processing apparatus therefor
US20220253977A1 (en) Method and device of super-resolution reconstruction, computer device and storage medium
CN108763295A (en) A kind of video approximate copy searching algorithm based on deep learning
CN109472282B (en) Depth image hashing method based on few training samples
JP2020173562A (en) Objection recognition system and objection recognition method
CN109255043B (en) Image retrieval method based on scene understanding
CN113011444B (en) Image identification method based on neural network frequency domain attention mechanism
JP5004743B2 (en) Data processing device
JP2004086737A (en) Method and device for similarity determination and program
KR100671099B1 (en) Method for comparing similarity of two images and method and apparatus for searching images using the same
CN111079585A (en) Image enhancement and pseudo-twin convolution neural network combined pedestrian re-identification method based on deep learning
JP6220737B2 (en) Subject area extraction apparatus, method, and program
CN116416649A (en) Video pedestrian re-identification method based on multi-scale resolution alignment
CN115641449A (en) Target tracking method for robot vision
CN109711454A (en) A kind of feature matching method based on convolutional neural networks
Amiri et al. Copy-move forgery detection using a bat algorithm with mutation
CN115457269A (en) Semantic segmentation method based on improved DenseNAS
Phookronghin et al. 2 Level simplified fuzzy ARTMAP for grape leaf disease system using color imagery and gray level co-occurrence matrix

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant