CN115344728A - Image retrieval model training method and using method, apparatus, device, and medium - Google Patents

Image retrieval model training method and using method, apparatus, device, and medium

Info

Publication number
CN115344728A
Authority
CN
China
Prior art keywords
sample
image
features
query
retrieval model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211265842.0A
Other languages
Chinese (zh)
Inventor
毕晓鹏
孙逸鹏
姚锟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202211265842.0A priority Critical patent/CN115344728A/en
Publication of CN115344728A publication Critical patent/CN115344728A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/53 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The disclosure provides an image retrieval model training method, a using method, corresponding apparatuses, a device, and a medium, and relates to the technical field of artificial intelligence, in particular to deep learning, image processing, and computer vision; it can be used in scenes such as optical character recognition (OCR). The specific implementation scheme is as follows: acquiring a query sample image together with a positive sample image and a negative sample image of the query sample image; performing feature extraction on the query sample image and the positive sample image through a first feature extraction network in the image retrieval model to obtain corresponding query sample features and positive sample features; performing feature extraction on the negative sample image through a second feature extraction network in the image retrieval model to obtain corresponding negative sample features; determining difference metric data between the positive sample features and the negative sample features; and training the image retrieval model according to the difference metric data. The technical solution of the disclosure improves the accuracy of the image retrieval model.

Description

Image retrieval model training method and using method, apparatus, device, and medium
Technical Field
The present disclosure relates to the technical field of artificial intelligence, in particular to the technical fields of deep learning, image processing, and computer vision, which can be used in scenes such as optical character recognition (OCR), and more particularly to a method, an apparatus, a device, and a medium for training and using an image retrieval model.
Background
With the continuous development of computer technology, more and more fields involve image retrieval, the ways of performing image retrieval are increasingly diverse, and image retrieval may be performed by adopting an image retrieval model.
When an image retrieval model is used for image retrieval, the performance of the image retrieval model determines the accuracy of a retrieval result.
Disclosure of Invention
The present disclosure provides an image retrieval model training method, an image retrieval model using method, and corresponding apparatuses, devices, and media.
According to an aspect of the present disclosure, there is provided an image retrieval model training method, including:
acquiring a query sample image and a base library to be queried; the base library to be queried comprises at least one positive sample image of a query sample image and at least one negative sample image of the query sample image;
respectively extracting the characteristics of the query sample image and the positive sample image through a first characteristic extraction network in the image retrieval model to obtain corresponding query sample characteristics and positive sample characteristics;
performing feature extraction on the negative sample image through a second feature extraction network in the image retrieval model to obtain corresponding negative sample features;
determining difference metric data between the positive sample features and the negative sample features;
and training the image retrieval model according to the difference measurement data.
According to another aspect of the present disclosure, there is provided an image retrieval model using method, including:
acquiring a query predicted image and a base to be queried;
respectively inputting the query predicted image and the candidate images in the base library to be queried into a first feature extraction network in a trained image retrieval model to obtain query prediction features and candidate features; the image retrieval model is obtained by training with the image retrieval model training method provided by the present disclosure;
and selecting a retrieval result image of the query predicted image from the candidate images according to the similarity between the query predicted features and the candidate features.
According to still another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any of the image retrieval model training methods provided by the embodiments of the present disclosure and/or to perform any of the image retrieval model using methods provided by the embodiments of the present disclosure.
According to still another aspect of the present disclosure, a non-transitory computer-readable storage medium is provided, in which computer instructions are stored, wherein the computer instructions are configured to cause a computer to perform any one of the image retrieval model training methods provided by the embodiments of the present disclosure, and/or perform any one of the image retrieval model using methods provided by the embodiments of the present disclosure.
According to the technical solution of the present disclosure, the accuracy of the image retrieval model is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of an image retrieval model training method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of another image retrieval model training method provided by the embodiment of the disclosure;
FIG. 3 is a schematic diagram of another image retrieval model training method provided by the embodiment of the present disclosure;
FIG. 4 is a schematic diagram of an image retrieval model using method provided by the embodiment of the disclosure;
FIG. 5A is a schematic diagram of an image retrieval model training method provided by an embodiment of the present disclosure;
FIG. 5B is a schematic diagram of retrieval result comparisons for image retrieval models provided by an embodiment of the disclosure;
FIG. 6 is a block diagram of an image retrieval model training apparatus according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of an apparatus for using an image retrieval model according to an embodiment of the present disclosure;
FIG. 8 is a block diagram of an electronic device for implementing an image retrieval model training method, and/or an image retrieval model using method, according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The image retrieval model training method and the image retrieval model training device provided by the embodiment of the disclosure are suitable for a training scene aiming at an image retrieval model. The image retrieval model training method provided by the embodiment of the present disclosure may be executed by an image retrieval model training apparatus, which may be implemented by software and/or hardware and is specifically configured in an electronic device, which may be a computer, a server, a mobile terminal, or the like, and the present disclosure does not limit this.
For ease of understanding, the image retrieval model training method will be described in detail first. The image retrieval model training method shown in fig. 1 specifically includes the following steps:
s110, acquiring a query sample image and a base library to be queried; the base library to be queried comprises at least one positive sample image of the query sample image and at least one negative sample image of the query sample image.
Wherein the query sample image may be a training sample image used for an image query. The base library to be queried may be an image database used for image retrieval, and it comprises positive sample images and/or negative sample images of the query sample image. A positive sample image may be an image having the same category label as the query sample image; a negative sample image may be an image having a category label different from that of the query sample image. It can be understood that, in order to improve the retrieval accuracy of the image retrieval model, the base library to be queried may include both positive sample images and negative sample images. All positive sample images and/or all negative sample images of the query sample image may be obtained directly from the base library to be queried, or partial positive sample images and/or partial negative sample images may be obtained in batches in a batch processing (i.e., batch) manner.
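As a non-authoritative illustration, the following Python sketch shows one way such query, positive, and negative sample sets might be assembled from a labeled base library; the gallery structure, the function name `sample_triplets`, and the batch-wise sampling are assumptions made for illustration, not the patent's prescribed implementation.

```python
import random
from collections import defaultdict

def sample_triplets(gallery, query_label, batch_size=None):
    """gallery: iterable of (image, label) pairs; returns (positives, negatives) for query_label."""
    by_label = defaultdict(list)
    for image, label in gallery:
        by_label[label].append(image)

    positives = by_label[query_label]                    # same category label as the query sample
    negatives = [img for lbl, imgs in by_label.items()   # any different category label
                 if lbl != query_label for img in imgs]

    if batch_size is not None:                           # optional batch-wise (partial) selection
        negatives = random.sample(negatives, min(batch_size, len(negatives)))
    return positives, negatives
```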
And S120, respectively extracting the characteristics of the query sample image and the positive sample image through a first characteristic extraction network in the image retrieval model to obtain corresponding query sample characteristics and positive sample characteristics.
And S130, performing feature extraction on the negative sample image through a second feature extraction network in the image retrieval model to obtain corresponding negative sample features.
The image retrieval model comprises a first feature extraction network and a second feature extraction network. The first feature extraction network is used for performing feature extraction on the query sample image to obtain query sample features corresponding to the query sample image, and performing feature extraction on the positive sample image to obtain positive sample features; the second feature extraction network is used for carrying out feature extraction on the negative sample image to obtain negative sample features.
The first feature extraction network and the second feature extraction network may be implemented by any one of the feature extraction networks in the prior art, and the present disclosure does not limit the specific network structures of the first feature extraction network and the second feature extraction network. The network structure of the first feature extraction network and the second feature extraction network may be the same or different. In an alternative embodiment, in order to simplify the model structure of the image retrieval model, the first feature extraction network and the second feature extraction network are typically implemented using the same network structure.
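For concreteness, the sketch below instantiates the first and second feature extraction networks as two encoders sharing one backbone architecture. The ResNet-18 backbone, the 128-dimensional embedding, and the use of a recent torchvision API are illustrative assumptions, not the patent's specified design.

```python
import torch.nn as nn
from torchvision.models import resnet18

def build_encoder(embed_dim=128):
    backbone = resnet18(weights=None)                       # backbone choice is illustrative
    backbone.fc = nn.Linear(backbone.fc.in_features, embed_dim)
    return backbone

first_net = build_encoder()                                 # encodes query and positive sample images
second_net = build_encoder()                                # encodes negative sample images
second_net.load_state_dict(first_net.state_dict())          # start both networks from identical weights
```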
It should be noted that, when feature extraction is performed on the positive sample images, feature extraction may be performed on all positive sample images in the base library to be queried, or on a part of the positive sample images selected from the base library to be queried according to a preset positive sample selection rule. Correspondingly, when feature extraction is performed on the negative sample images, feature extraction may be performed on all negative sample images in the base library to be queried, or on a part of the negative sample images selected from the base library to be queried according to a preset negative sample selection rule. The preset positive sample selection rule and the preset negative sample selection rule may be determined by relevant technicians according to a large number of experiments or manual experience, which is not limited in the embodiments of the present disclosure.
It is to be noted that S120 and S130 may be executed sequentially, in parallel, or alternately, and the present disclosure does not limit the specific execution order of the two.
And S140, determining difference measurement data between the positive sample characteristics and the negative sample characteristics.
Wherein the difference metric data can be used to quantify the difference between the feature information characterizing the positive sample image and the feature information characterizing the negative sample image.
Optionally, a preset loss function may be introduced, and the difference metric data may be determined directly according to a first similarity between the negative sample features and the query sample features and a second similarity between the positive sample features and the query sample features.
Alternatively, the first similarity and the second similarity may first be preprocessed, and a preset loss function may then be introduced to determine the difference metric data according to the preprocessing result; for example, an AP-Loss (Average Precision Loss) may be used.
And S150, training the image retrieval model according to the difference measurement data.
The network parameters in the image retrieval model are adjusted according to the difference metric data between the positive sample features and the negative sample features determined in the previous steps, so as to improve the image retrieval capability of the image retrieval model. The parameter adjustment may be performed for the first feature extraction network and the second feature extraction network respectively based on the difference metric data. It should be noted that the parameter adjustment manners of the first feature extraction network and of the second feature extraction network may be the same or different, which is not limited in this embodiment of the disclosure.
According to the technical solution of this embodiment, when extracting negative sample features, unlike the prior-art practice of extracting features only from the negative sample images in the current training batch, the selection range of negative sample images is expanded to the base library to be queried. This improves the diversity of the negative sample images, and thus the richness of the negative sample features and the comprehensiveness of the features related to the negative sample images. Determining the difference metric data by combining the negative sample features obtained in this way with the positive sample features allows the difference metric data to truly reflect the difference between the negative sample images in the base library to be queried and the positive sample images, which improves the accuracy of the difference metric data. As a result, when the image retrieval model is trained based on this difference metric data, it acquires a better image retrieval capability, and the retrieval accuracy of the image retrieval model is improved.
On the basis of the above technical solutions, the present disclosure also provides an optional embodiment, in which the operation of determining the difference metric data between the positive sample feature and the negative sample feature is further refined. It should be noted that, for parts not described in detail in the embodiments of the present disclosure, reference may be made to related expressions in other embodiments, and details are not described herein again. The image retrieval model training method shown in fig. 2 specifically includes the following steps:
s210, acquiring a query sample image and a base library to be queried; the base library to be queried comprises at least one positive sample image of the query sample image and at least one negative sample image of the query sample image.
And S220, respectively carrying out feature extraction on the query sample image and the positive sample image through a first feature extraction network in the image retrieval model to obtain corresponding query sample features and positive sample features.
And S230, performing feature extraction on the negative sample image through a second feature extraction network in the image retrieval model to obtain corresponding negative sample features.
S240, sorting the second similarity between the positive sample characteristics and the query sample characteristics according to a preset sorting mode to obtain the positive sample serial number of the positive sample characteristics.
The preset sorting mode may be ascending (from small to large) or descending (from large to small). The second similarity quantifies the degree of similarity between a positive sample image and the query sample image, for example, how close their image features are. Specifically, the similarity between a positive sample feature and the query sample feature may be determined as the second similarity of that positive sample feature; the second similarities of the positive sample features are then sorted according to the preset sorting mode, and the sorting number of each positive sample feature is taken as its positive sample serial number.
And S250, sequencing the first similarity between the negative sample characteristics and the query sample characteristics according to a preset sequencing mode to obtain the negative sample serial number of the negative sample characteristics.
The first similarity quantifies the degree of similarity between a negative sample image and the query sample image, for example, how close their image features are. Specifically, the similarity between a negative sample feature and the query sample feature may be determined as the first similarity of that negative sample feature; the negative sample features are then arranged by their first similarities in the same preset sorting mode, and the sorting number of each negative sample feature is taken as its negative sample serial number.
And S260, determining difference measurement data between the positive sample characteristics and the negative sample characteristics according to the positive sample serial numbers and the negative sample serial numbers.
The difference metric data is determined from the positive sample serial numbers and the negative sample serial numbers obtained in the previous steps. Any algorithm in the prior art may be used to determine the difference metric data, which is not limited in this embodiment of the present disclosure.
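A minimal sketch of the serial-number computation, assuming a descending sort by similarity and PyTorch tensors of similarities; the function name and the tie handling are illustrative only.

```python
import torch

def serial_numbers(pos_sim, neg_sim):
    """pos_sim: (P,) second similarities; neg_sim: (N,) first similarities (to the query feature)."""
    # serial number = 1 + number of features in the same group with strictly larger similarity
    pos_serial = 1 + (pos_sim.unsqueeze(1) < pos_sim.unsqueeze(0)).sum(dim=1)
    neg_serial = 1 + (neg_sim.unsqueeze(1) < neg_sim.unsqueeze(0)).sum(dim=1)
    return pos_serial, neg_serial
```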
The following describes a specific determination process of the difference metric data by taking the difference metric data as the AP loss.
Illustratively, the AP value of the q-th query sample image may be calculated by the following formula:

$$\mathrm{AP}_q = \frac{1}{|P_q|} \sum_{i \in P_q} \frac{R(i, P_q)}{R(i, \Omega_q)}$$

where $P_q$ is the positive sample set composed of the positive sample features, $|P_q|$ is the number of positive sample features, $\Omega_q$ is the full sample set composed of the positive sample features and the negative sample features, $i$ is the i-th positive sample feature, and $R(i, S)$ denotes the serial number (rank) of feature $i$ when the features in set $S$ are sorted by their similarity to the query sample feature. The larger the AP value is, the more similar the positive sample images corresponding to the query sample image are to the query sample image and the less similar the negative sample images are, i.e., the average precision of the image retrieval model is higher.

The AP loss is then calculated from the AP values:

$$L_{AP} = 1 - \frac{1}{m} \sum_{q=1}^{m} \mathrm{AP}_q$$

where $L_{AP}$ is the AP loss, $m$ is the number of query sample images, and $\mathrm{AP}_q$ is the AP value of the q-th query sample image. Naturally, the larger the AP values are, the smaller the AP loss is, indicating better retrieval accuracy of the image retrieval model.
And S270, training the image retrieval model according to the difference measurement data.
Illustratively, the network parameters of the first feature extraction network and the second feature extraction network may be adjusted according to the variation of the difference metric data in different iterative training processes.
It should be noted that, in the process of performing parameter adjustment on the first feature extraction network and the second feature extraction network, the same or different parameter adjustment mechanisms may be adopted.
Optionally, a gradient update method may be adopted, and based on the difference metric data, network parameters of the first feature extraction network and the second feature extraction network are adjusted respectively.
Alternatively, a gradient update method may be used to adjust the network parameters of the first feature extraction network based on the difference metric data, and a momentum update method may be used to adjust the network parameters of the second feature extraction network based on the difference metric data.
In an optional embodiment, in the case that the network structures of the first feature extraction network and the second feature extraction network are the same, the network parameters of the first feature extraction network may be adjusted based on the difference metric data, and the network parameters of the second feature extraction network may be adaptively adjusted according to the parameter adjustment condition of the first feature extraction network.
Illustratively, the network structure of the first feature extraction network is the same as that of the second feature extraction network; the training the image retrieval model according to the difference metric data may include: updating network parameters in the first feature extraction network according to the difference measurement data; and updating the corresponding network parameters in the second characteristic extraction network according to the network parameters and the preset updating amplitude of the first characteristic extraction network.
The preset update amplitude may be the proportion in which the network parameters of the first feature extraction network and the network parameters of the second feature extraction network contribute to the update. That is, the network parameters of the second feature extraction network may be updated jointly from the second feature extraction network's own parameters and the network parameters of the first feature extraction network, with the two contributing in different proportions. According to actual requirements, when the network parameters of the second feature extraction network are updated, the proportion contributed by the first feature extraction network's parameters may be smaller, or much smaller, than the proportion contributed by the second feature extraction network's own parameters; for example, the proportion of the first feature extraction network's parameters may be preset to 0.1%, and the proportion of the second feature extraction network's own parameters to 99.9%.
Illustratively, a gradient update method may be adopted to optimize and adjust the network parameters of the first feature extraction network. Then, for any network parameter of the first feature extraction network, the corresponding network parameter in the second feature extraction network is adjusted accordingly based on the adjusted network parameter of the first feature extraction network and the preset update amplitude.
In the above embodiment, the network parameters of the first feature extraction network are used as the update basis for the network parameters of the second feature extraction network. This keeps the difference between the second feature extraction network's parameters at different moments small, so that the negative sample features extracted by the second feature extraction network for the same negative sample image at different moments differ only slightly. Feature drift is thereby avoided, the consistency of the second feature extraction network's negative sample feature extraction results at different moments is ensured, and the retrieval accuracy of the image retrieval model is improved.
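As a rough illustration of the preset update amplitude, the sketch below moves each parameter of the second feature extraction network a small step toward the corresponding parameter of the first network after the latter's gradient update; the 0.001 amplitude mirrors the 0.1% example above and is otherwise an assumption.

```python
import torch

@torch.no_grad()
def update_second_network(first_net, second_net, amplitude=0.001):
    """Move each second-network parameter slightly toward the corresponding first-network parameter."""
    for p_first, p_second in zip(first_net.parameters(), second_net.parameters()):
        # keep (1 - amplitude) of the second network's own value, take `amplitude` from the first
        p_second.mul_(1.0 - amplitude).add_(p_first, alpha=amplitude)
```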
According to the technical solution of this embodiment of the present disclosure, the positive sample serial numbers and the negative sample serial numbers are introduced to determine the difference metric data, without operating on a large amount of high-precision data. This reduces the amount of computation in determining the difference metric data and lowers the precision requirement on the computing device. At the same time, the numerical difference between the positive sample serial numbers and the negative sample serial numbers accurately reflects the feature difference between the positive sample features and the negative sample features, which ensures the accuracy of the difference metric data and improves the retrieval accuracy of the image retrieval model.
On the basis of the above technical solutions, the present disclosure also provides an optional embodiment, in which the operation of determining the difference metric data between the positive sample feature and the negative sample feature is further refined. It should be noted that, for parts not described in detail in the embodiments of the present disclosure, reference may be made to related expressions in other embodiments, and details are not described herein again.
The image retrieval model training method shown in fig. 3 specifically includes the following steps:
s310, acquiring a query sample image and a base library to be queried; the base library to be queried comprises at least one positive sample image of the query sample image and at least one negative sample image of the query sample image.
And S320, respectively performing feature extraction on the query sample image and the positive sample image through a first feature extraction network in the image retrieval model to obtain corresponding query sample features and positive sample features.
And S330, performing feature extraction on the negative sample image through a second feature extraction network in the image retrieval model to obtain corresponding negative sample features.
S340, selecting the difficult-to-negative sample features from the negative sample features according to the first similarity between the negative sample features and the query sample features.
Wherein the first similarity is used for quantifying and characterizing the similarity degree between the negative sample image and the query sample image, such as the approximate situation between the image characteristics of the negative sample image and the query sample image. The image of the hard negative sample can be understood as an image of the negative sample carrying features related to features of the positive sample, namely an image which is easily identified as the positive sample in the retrieval process. Correspondingly, the difficult-to-negative sample features are the feature extraction results of the difficult-to-negative sample images.
For example, a negative sample feature that is closer to a positive sample feature or a query sample feature may be considered a difficult-to-negative sample feature.
Specifically, the similarity between the negative sample feature and the query sample feature may be determined as a first similarity corresponding to the negative sample feature; and taking the negative sample characteristics corresponding to the first similarity greater than the preset similarity threshold as the difficult negative sample characteristics. Wherein the preset similarity threshold can be determined by a person skilled in the art according to a large number of experiments or manual experience.
Optionally, the preset similarity threshold may also be determined according to the similarity between the positive sample feature and the query sample feature, so as to improve the reasonability of the determined preset similarity threshold, which is beneficial to improving the accuracy of the selected difficult-to-negative sample feature.
In an alternative embodiment, the selecting, according to the first similarity, a hard negative sample feature from negative sample features may include: determining a similarity threshold according to a second similarity between the positive sample characteristics and the query sample characteristics; and selecting the hard negative sample characteristics from the negative sample characteristics according to the first similarity and the similarity threshold.
Wherein the second similarity is used for quantifying and characterizing the similarity degree between the positive sample image and the query sample image, such as the approximate situation between the image characteristics of the positive sample image and the query sample image.
For example, a similarity between the positive sample feature and the query sample feature may be determined as a second similarity of the positive sample feature; taking the statistical value of each second similarity determined between each positive sample characteristic and the query sample characteristic as a similarity threshold value; and taking the negative sample characteristic corresponding to the first similarity greater than the similarity threshold as the difficult negative sample characteristic. The statistical value may be a random value, a median, a minimum value or an average value.
It can be understood that, since the similarity threshold is determined based on the second similarities corresponding to the positive sample features, the determined similarity threshold can better reflect the similarity between the positive sample features and the query sample feature. Therefore, the hard negative sample features selected based on the similarity threshold determined in the above manner are closer to the positive sample features, and the accuracy of the hard negative sample feature selection result is improved. Training the image retrieval model with these hard negative sample features helps the model learn a better ability to distinguish positive and negative samples, thereby improving the retrieval accuracy of the image retrieval model.
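A small sketch of this selection step, assuming the statistic used for the threshold is the mean of the second similarities (the text above also allows a random value, median, or minimum).

```python
import torch

def select_hard_negatives(neg_feats, neg_sim, pos_sim):
    """neg_sim: first similarities of the negative features; pos_sim: second similarities of the positives."""
    threshold = pos_sim.mean()        # similarity threshold derived from the positive sample statistics
    keep = neg_sim > threshold        # negatives that look at least as similar as a typical positive
    return neg_feats[keep], neg_sim[keep]
```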
And S350, determining difference measurement data between the positive sample characteristics and the difficult-to-negative sample characteristics.
Illustratively, the difference metric data may be determined directly from the first similarity and the second similarity.
In order to reduce the amount of computation in determining the difference metric data and lower the precision requirement on the computing device, in an optional implementation, the second similarities between the positive sample features and the query sample features may be sorted according to a preset sorting mode to obtain the positive sample serial numbers of the positive sample features; the first similarities of the hard negative sample features may be sorted together with the second similarities according to the preset sorting mode to obtain the hard negative sample serial numbers of the hard negative sample features; and the difference metric data between the positive sample features and the hard negative sample features is determined according to the positive sample serial numbers and the hard negative sample serial numbers.
The preset sorting mode may be a small-to-large sorting mode or a large-to-small sorting mode.
Specifically, the second similarities of the positive sample features may be sorted according to the preset sorting mode, and the sorting number of each positive sample feature is taken as its positive sample serial number; the first similarities of the hard negative sample features are sorted together with the second similarities of the positive sample features in the same preset sorting mode, and the sorting number assigned to each hard negative sample feature is taken as its hard negative sample serial number; and the difference metric data is determined according to the positive sample serial numbers and the hard negative sample serial numbers.
The following describes a specific determination process of the difference metric data by taking the difference metric data as the AP loss.
On the basis of the above embodiments, further, after the hard negative sample images and the hard negative sample features are screened out, the AP value of the k-th query sample image may be calculated by the following formula:

$$\mathrm{AP}_k = \frac{1}{|P_k|} \sum_{i \in P_k} \frac{R(i, P_k)}{R(i, \Omega_k)}$$

where $P_k$ is the positive sample set composed of the positive sample features, $|P_k|$ is the number of positive sample features, $\Omega_k$ is the full sample set composed of the positive sample features and the hard negative sample features, $i$ is the i-th positive sample feature, and $R(i, S)$ denotes the serial number (rank) of feature $i$ when the features in set $S$ are sorted by their similarity to the query sample feature. The larger the AP value is, the more similar the positive sample images corresponding to the query sample image are to the query sample image and the less similar the negative sample images are, i.e., the average precision of the image retrieval model is higher.

The AP loss is then calculated from the AP values:

$$L_{AP} = 1 - \frac{1}{m} \sum_{k=1}^{m} \mathrm{AP}_k$$

where $L_{AP}$ is the AP loss, $m$ is the number of query sample images, and $\mathrm{AP}_k$ is the AP value of the k-th query sample image. Naturally, the larger the AP values are, the smaller the AP loss is, indicating better retrieval accuracy of the image retrieval model.
In this embodiment, the positive sample serial numbers and the hard negative sample serial numbers are introduced to determine the difference metric data, without operating on a large amount of high-precision data. This reduces the amount of computation in determining the difference metric data and lowers the precision requirement on the computing device. At the same time, the numerical difference between the positive sample serial numbers and the hard negative sample serial numbers accurately reflects the feature difference between the positive sample features and the hard negative sample features, which ensures the accuracy of the difference metric data and improves the retrieval accuracy of the image retrieval model.
And S360, training the image retrieval model according to the difference measurement data.
According to the technical solution of this embodiment of the present disclosure, the hard negative sample features are screened out from all the negative sample features, and the difference metric data is then determined according to the hard negative sample features, so that the number of negative sample features participating in the difference metric data is reduced; the amount of computation in determining the difference metric data is thereby reduced and the computation efficiency is improved. Furthermore, since the hard negative sample features selected from all negative samples have image features closer to the positive sample features, the trained image retrieval model can learn to distinguish positive and negative samples more quickly, which helps shorten the training period of the image retrieval model.
For ease of understanding, the method of using the image retrieval model will be described first. As shown in fig. 4, the method specifically includes the following steps:
and S410, acquiring a query predicted image and a base to be queried.
The query predicted image can be an image which needs to be subjected to image retrieval, and the retrieval aims to find other images with the same type as the query predicted image from the base to be queried through an image retrieval model. The base to be queried may be an image database for image retrieval.
And S420, inputting the query predicted image and the candidate image in the base to be queried into a first feature extraction network in the trained image retrieval model respectively to obtain a query predicted feature and a candidate feature.
The image retrieval model is obtained by training by using the image retrieval model training method provided by the above embodiments.
Wherein, the candidate image in the base library to be queried can be at least part of image in the base library to be queried. And taking the query predicted image obtained in the previous step and the candidate images of the base to be queried as the input of a feature extraction network of an image retrieval model, so as to obtain the image features (namely query prediction features) of the query predicted image and the image features (namely candidate features) of each candidate image.
The feature extraction network used for extracting the image features may be the first feature extraction network or the second feature extraction network.
And S430, selecting a retrieval result image of the query predicted image from the candidate images according to the similarity between the query predicted features and the candidate features.
Illustratively, by determining the similarity between the query predicted feature and the candidate features, candidate features similar to the query predicted feature are selected from the candidate features, and candidate images corresponding to the selected candidate features are used as retrieval result images.
Optionally, a threshold of the similarity may be preset, candidate features with the similarity satisfying the threshold of the similarity may be selected, and the candidate images corresponding to the selected candidate features may be used as the retrieval result images.
Or optionally, the similarity corresponding to different candidate features may be sorted from large to small; selecting candidate features with a preset number or a preset percentage of similarity ranking in the front; and taking the candidate image corresponding to the selected candidate feature as a retrieval result image. The preset number and the preset percentage may also be set by a relevant technician according to needs or experience values, or through a large number of experiments.
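For illustration only, the following sketch performs the retrieval described above with cosine similarity and either a similarity threshold or a top-k cut; `encoder` stands for the trained first feature extraction network, and the default values are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def retrieve(encoder, query_image, candidate_images, top_k=5, sim_threshold=None):
    """Return indices of candidate images selected as retrieval results for the query image."""
    query_feat = F.normalize(encoder(query_image.unsqueeze(0)), dim=1)        # (1, D) query prediction feature
    cand_feats = F.normalize(encoder(torch.stack(candidate_images)), dim=1)   # (N, D) candidate features
    sims = (cand_feats @ query_feat.T).squeeze(1)                             # cosine similarities
    if sim_threshold is not None:                                             # threshold-based selection
        idx = (sims >= sim_threshold).nonzero(as_tuple=True)[0]
        return idx[sims[idx].argsort(descending=True)]
    return sims.topk(min(top_k, sims.numel())).indices                        # top-k selection
```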
According to the technical scheme of the embodiment of the disclosure, the query prediction features of the query prediction image and the candidate features of the candidate images are obtained through a pre-trained image retrieval model, and then the retrieval result image is determined according to the similarity of the query prediction features and the candidate features. Because the pre-trained image retrieval model has better positive and negative sample resolution capability, the image retrieval is carried out by the trained image retrieval model, and the reasonability and the accuracy of the image retrieval result are improved.
On the basis of the above technical solutions, the present disclosure also provides a preferred embodiment, in which a training process of the image retrieval model is exemplified. For parts which are not described in detail in the embodiments of the present disclosure, reference may be made to related expressions in other embodiments, which are not described herein again. As shown in fig. 5A, the method specifically includes:
fig. 5A shows a training method of an image retrieval model using a Dynamic Rank Learning (DRL) framework, which can use any AP-Loss function-based method for model training.
The image retrieval model comprises a first feature extraction network fq model, a second feature extraction network frm model, a Hard Negative sample sorting module HNR (Hard Negative Rank) and an AP-Loss calculation module.
The fq model is used for extracting the characteristics of the query sample image to obtain the characteristics of the query sample; and extracting the characteristics of the positive sample images of the query sample images, and sequencing according to the similarity between each positive sample characteristic and the query sample characteristic to obtain a positive sample queue.
The frm model is used for extracting negative sample characteristics of the negative sample images corresponding to the query sample images in the base library to be queried, and adding the negative sample characteristics to the sorting queue.
The HNR is used for respectively determining the similarity between each positive sample characteristic and each negative sample characteristic and the query characteristic; taking the similarity mean value of each positive sample characteristic as a reference similarity; if the similarity of the negative sample features is larger than the reference similarity, the difficult negative sample features can be further screened out, and the difficult negative sample images corresponding to the difficult negative sample features are sorted into a difficult negative sample queue.
The AP-Loss calculation module is used for calculating an AP value and an AP Loss function, and finally adjusting model parameters through a gradient (grad) updating method so as to achieve the purpose of training the image retrieval model. Illustratively, the AP value may be determined using the following equation:
$$\mathrm{AP}_k = \frac{1}{|P_k|} \sum_{i \in P_k} \frac{R(i, P_k)}{R(i, \Omega_k)}$$

where $\mathrm{AP}_k$ is the AP value of the k-th query sample image, $P_k$ is the positive sample set corresponding to the positive sample queue, $|P_k|$ is the number of positive sample features, $\Omega_k$ is the full sample set consisting of the positive sample queue and the hard negative sample queue, $i$ is the i-th positive sample feature, and $R(i, S)$ denotes the serial number (rank) of feature $i$ when the features in set $S$ are sorted by their similarity to the query feature. The larger the AP value is, the more similar the positive sample images corresponding to the query sample image are to the query sample image and the less similar the negative sample images are, i.e., the average precision of the image retrieval model is higher.

Illustratively, the AP-Loss may be determined using the following equation:

$$L_{AP} = 1 - \frac{1}{m} \sum_{k=1}^{m} \mathrm{AP}_k$$

where $m$ is the number of query samples and $\mathrm{AP}_k$ is the AP value of the k-th query sample. The larger the AP values are, the smaller the AP loss is, which indicates better accuracy of the image retrieval model.
It can be understood that, since the computation complexity of the AP-Loss is proportional to the length of the sorting queue, the computation complexity of the AP-Loss in a long sorting queue becomes high, thereby affecting the training efficiency. Therefore, the features of the hard negative samples are obtained through screening, the length of the sorting queue is reduced, the complexity and the time of calculation are reduced, and the calculation efficiency is improved.
Specifically, in the batch-based method of computing the AP-Loss, the computational complexity of the AP-Loss algorithm is on the order of the square of the batch size (i.e., the number of samples used for training in a single batch). After an extra long sorting queue is introduced, the computational complexity of the AP-Loss grows with the length of the sorting queue, which is much larger than the batch size. The dynamic hard negative sample sorting proposed herein reduces this complexity so that it depends on K, the number of selected hard samples, where K is much smaller than the length of the sorting queue. Dynamic hard negative sample sorting adds a dynamic hard sample selection step, but this additional step is computationally inexpensive compared with the AP-Loss calculation and can be ignored. In conclusion, the present disclosure narrows the gap between the computed AP and the theoretical AP at the cost of a small amount of extra computation, so the optimization effect is better. It is worth noting that the dynamic sorting queue is introduced only in the training phase in the examples of the present disclosure, so there is no additional memory or computational consumption in the testing phase.
In addition, for the DRL, the number of hard negative samples in the set is related to the distribution of the base library to be queried, and the hard negative sample images corresponding to different query sample images are different. The hard negative sample images therefore change dynamically during model training, which improves the retrieval accuracy of the image retrieval model in the training process.
In the whole training process of the DRL, as shown in fig. 5A, the input query sample images may be processed with the same data enhancement method. For each query sample image, the fq model extracts feature data from the query sample image and from the positive sample images of the query sample image to obtain the query sample feature and the positive sample features. The frm model extracts feature data from the negative sample images of the query sample image to obtain the negative sample features, which are pushed into the sorting queue. The features in the positive sample queue and the features in the sorting queue are input together into the hard negative sample sorting module (HNR), which dynamically selects the hard negative sample features for each query sample image. The HNR module can filter out simple samples that contribute nothing to the loss function and keep only the hard negative samples for the image retrieval model. In this process, the sorting queue may suffer from feature drift, i.e., the network model parameters at different moments differ considerably, so the features extracted by the model at different moments also differ considerably.
To alleviate this problem, the model that feeds the sorting queue is no longer updated with gradients; instead, a momentum update is used, i.e., each time the model parameters are updated, a small portion of the weight comes from the fq model and a large portion comes from the frm model itself. Because the model behind the sorting queue then changes only slowly over time, the consistency of the features in the sorting queue is ensured, and with it the stability of model retrieval. Illustratively, the fq model is updated with a gradient update method, while the frm model is updated with a momentum update method. In particular, any parameter $\theta_{\mathrm{frm}}$ of the frm model is updated jointly from the corresponding parameter $\theta_{\mathrm{fq}}$ of the fq model and the frm model's own parameter:

$$\theta_{\mathrm{frm}} \leftarrow m \cdot \theta_{\mathrm{frm}} + (1 - m) \cdot \theta_{\mathrm{fq}}$$

where $m$ is a preset percentage, which may be an empirical or experimental value, with $m \gg (1 - m)$; for example, $m$ = 99.9%.
It is worth noting that the dynamic ordering queue is introduced only in the training phase in the examples of the present disclosure, so there is no additional memory and computational consumption in the testing phase.
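Pulling the pieces together, the sketch below outlines one DRL-style training step under the assumptions used in the earlier snippets. The sigmoid-smoothed rank is borrowed from the general smooth-AP idea to keep the loss differentiable and is not necessarily the patent's exact AP-Loss formulation; the temperature, the momentum m = 0.999, and the queue handling are illustrative.

```python
import torch
import torch.nn.functional as F

def soft_rank(scores, reference, tau=0.01):
    """Differentiable (sigmoid-smoothed) 1-based rank of each score within `reference`."""
    diff = (reference.unsqueeze(0) - scores.unsqueeze(1)) / tau
    return 1 + torch.sigmoid(diff).sum(dim=1) - 0.5      # subtract the self-comparison term

def drl_step(fq, frm, optimizer, query_img, pos_imgs, neg_queue_feats, m=0.999):
    q_feat = F.normalize(fq(query_img.unsqueeze(0)), dim=1)       # query sample feature
    pos_feats = F.normalize(fq(pos_imgs), dim=1)                  # positive sample features
    pos_sim = (pos_feats @ q_feat.T).squeeze(1)
    neg_sim = (neg_queue_feats @ q_feat.T).squeeze(1)             # queue features, assumed L2-normalized

    hard_sim = neg_sim[neg_sim > pos_sim.mean()]                  # HNR: keep only the hard negatives

    all_sim = torch.cat([pos_sim, hard_sim])
    ap = (soft_rank(pos_sim, pos_sim) / soft_rank(pos_sim, all_sim)).mean()
    loss = 1.0 - ap                                               # AP loss for this query

    optimizer.zero_grad()
    loss.backward()                                               # gradient update of the fq model
    optimizer.step()

    with torch.no_grad():                                         # momentum update of the frm model
        for p_q, p_m in zip(fq.parameters(), frm.parameters()):
            p_m.mul_(m).add_(p_q, alpha=1.0 - m)
    return loss.item()
```

In a complete loop, the frm model would also be run on new negative sample images after each step to refresh the sorting queue with detached features, matching the queue-based flow described above.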
In an actual image retrieval example, fig. 5B shows a comparison between the retrieval results of a model trained with the DRL architecture of this embodiment of the present disclosure and those of a model trained with a prior-art baseline method (AP-Loss computed within a batch). The first column is the query predicted image (marked X in the figure), and the second to sixth columns are the first to fifth images in the image base library that are most similar to the query predicted image. Images marked Y represent positive sample retrieval results that are similar to the query predicted image, and images marked Z represent negative sample false detections that are not similar to the query predicted image. The first, third, and fifth rows show the retrieval results of the baseline method, and the second, fourth, and sixth rows show the retrieval results of the method of this embodiment. It can be seen from the retrieval results that the model trained with the DRL of this embodiment retrieves more accurately than the model trained with the baseline method.
The image retrieval model using method and apparatus provided by the embodiments of the present disclosure are suitable for scenarios in which the image retrieval model is used. Each image retrieval model using method provided by the embodiments of the present disclosure may be executed by an image retrieval model using apparatus, which may be implemented by software and/or hardware and is specifically configured in an electronic device, such as a computer, a server, or a mobile terminal. The electronic device that executes the image retrieval model using method may be the same as or different from the electronic device that executes the image retrieval model training method, which is not limited by the present disclosure. As an implementation of each of the above image retrieval model training methods, the present disclosure also provides an optional embodiment of an execution apparatus implementing each of the above image retrieval model training methods. As shown in fig. 6, the image retrieval model training apparatus 600 specifically includes:
an image and base acquisition module 610, configured to acquire a query sample image and a base to be queried; the base library to be queried comprises at least one positive sample image of a query sample image and at least one negative sample image of the query sample image;
the first feature extraction network 620 is configured to perform feature extraction on the query sample image and the positive sample image respectively to obtain corresponding query sample features and positive sample features;
the second feature extraction network 630 is configured to perform feature extraction on the negative sample image to obtain corresponding negative sample features;
a difference metric data determination module 640 for determining difference metric data between the positive and negative sample features;
a model training module 650 for training an image retrieval model comprising a first feature extraction network and a second feature extraction network according to the difference metric data.
According to the above technical solution, when extracting negative sample features, unlike the prior-art practice of extracting features only from the negative sample images in the current training batch, the selection range of negative sample images is expanded to the base library to be queried. This improves the diversity of the negative sample images, and thus the richness of the negative sample features and the comprehensiveness of the features related to the query sample images. Determining the difference metric data by combining the negative sample features obtained in this way with the positive sample features allows the difference metric data to truly reflect the difference between the negative sample images in the base library to be queried and the positive sample images, which improves the accuracy of the difference metric data. As a result, when the image retrieval model is trained based on this difference metric data, it acquires a better image retrieval capability, and the retrieval accuracy of the image retrieval model is improved.
In an alternative embodiment, the difference metric data determining module 640 may include:
the hard negative sample determining unit is used for selecting hard negative sample features from the negative sample features according to a first similarity between the negative sample features and the query sample features;
and the difference data determining unit is used for determining difference metric data between the positive sample features and the hard negative sample features.
In an optional implementation, the hard negative sample determining unit may include:
the similarity threshold determining subunit is used for determining a similarity threshold according to a second similarity between the positive sample features and the query sample features;
and the hard negative sample feature determining subunit is used for selecting the hard negative sample features from the negative sample features according to the first similarity and the similarity threshold.
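A minimal sketch of such a hard negative selection is given below, assuming L2-normalized features so that a dot product equals cosine similarity, and assuming (since this embodiment only states that the threshold is derived from the second similarity) that the threshold is taken as the minimum positive similarity; both the function name and the threshold rule are illustrative, not fixed by the disclosure.

```python
import torch

def select_hard_negatives(query_feat, pos_feats, neg_feats):
    """Pick hard negative sample features: negatives whose similarity to the query
    reaches a threshold derived from the positive similarities (rule assumed)."""
    first_sim = neg_feats @ query_feat     # first similarity: negatives vs. query, shape (N,)
    second_sim = pos_feats @ query_feat    # second similarity: positives vs. query, shape (P,)
    threshold = second_sim.min()           # assumed rule: similarity of the weakest positive
    hard_mask = first_sim >= threshold
    return neg_feats[hard_mask], first_sim[hard_mask]
```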
In an alternative embodiment, the difference data determining unit may include:
the positive sample sequence number determining subunit is used for sorting the second similarities between the positive sample features and the query sample features in a preset sorting order to obtain positive sample sequence numbers of the positive sample features;
the hard negative sample sequence number determining subunit is used for sorting the second similarities and the first similarities of the hard negative sample features in the preset sorting order to obtain hard negative sample sequence numbers of the hard negative sample features;
and the difference metric data determining subunit is used for determining the difference metric data between the positive sample features and the hard negative sample features according to the positive sample sequence numbers and the hard negative sample sequence numbers.
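The exact form of the difference metric is not fixed by this embodiment; the sketch below assumes an average-precision-style quantity computed from the positive sample sequence numbers and the hard negative sample sequence numbers, i.e., for each positive feature it counts how many hard negative features are sorted ahead of it in descending order of similarity. In practice a smoothed, differentiable surrogate of the comparisons would be needed for back-propagation; the exact counting version is shown only to make the sequence-number computation concrete.

```python
import torch

def rank_based_difference_metric(second_sim, hard_first_sim):
    """Assumed AP-style difference metric computed from sequence numbers.

    second_sim:     similarities of positive sample features to the query, shape (P,)
    hard_first_sim: similarities of hard negative sample features to the query, shape (H,)
    """
    # Positive sample sequence number: rank of each positive among the positives alone
    # when sorted in descending order of similarity (1-based; ties ignored in this sketch).
    pos_seq = 1.0 + (second_sim.unsqueeze(1) < second_sim.unsqueeze(0)).float().sum(dim=1)
    # Effect of the hard negative sample sequence numbers: how many hard negatives are
    # sorted ahead of each positive when first and second similarities are sorted together.
    negs_ahead = (second_sim.unsqueeze(1) < hard_first_sim.unsqueeze(0)).float().sum(dim=1)
    mixed_seq = pos_seq + negs_ahead
    # Average-precision-like score; its complement serves as the difference metric (loss).
    return 1.0 - (pos_seq / mixed_seq).mean()
```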
In an alternative embodiment, the difference metric data determining module 640 may include:
the positive sample sequence number determining unit is used for sorting the second similarities between the positive sample features and the query sample features in a preset sorting order to obtain positive sample sequence numbers of the positive sample features;
the negative sample sequence number determining unit is used for sorting the first similarities between the negative sample features and the query sample features in the preset sorting order to obtain negative sample sequence numbers of the negative sample features;
and the difference metric unit is used for determining difference metric data between the positive sample features and the negative sample features according to the positive sample sequence numbers and the negative sample sequence numbers.
In an alternative embodiment, the first feature extraction network and the second feature extraction network have the same network structure; the model training module 650 may include:
a first parameter updating unit, configured to update a network parameter in the first feature extraction network according to the difference metric data;
and the second parameter updating unit is used for updating the corresponding network parameters in the second feature extraction network according to the network parameters of the first feature extraction network and the preset updating amplitude.
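One common realization of such a preset-amplitude update, consistent with the requirement that the two networks share the same structure, is an exponential moving average of the first network's parameters; the EMA form and the coefficient value 0.999 below are assumptions for illustration, not values given by the disclosure.

```python
import torch

@torch.no_grad()
def momentum_update(first_net, second_net, m=0.999):
    """Move the second network's parameters toward the first network's parameters
    with a preset update amplitude (exponential-moving-average form assumed)."""
    for p_first, p_second in zip(first_net.parameters(), second_net.parameters()):
        p_second.data.mul_(m).add_(p_first.data, alpha=1.0 - m)
```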
The image retrieval model training device can execute the image retrieval model training method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of executing each image retrieval model training method.
As an implementation of the above image retrieval model using method, the present disclosure also provides an optional embodiment of an execution device implementing the above image retrieval model using method. As shown in fig. 7, the image retrieval model using apparatus 700 specifically includes:
an image and base library acquisition module 710, configured to acquire a query predicted image and a base library to be queried;
the image input module 720 is configured to input the query predicted image and the candidate images in the base library to be queried into the first feature extraction network of a trained image retrieval model, respectively, to obtain a query prediction feature and candidate features; wherein the image retrieval model is obtained by training with any image retrieval model training apparatus provided by the embodiments of the present disclosure;
and the image retrieval module 730 is configured to select a retrieval result image of the query predicted image from the candidate images according to the similarity between the query predicted feature and the candidate features.
According to the technical solution of the embodiment of the present disclosure, the query prediction feature of the query predicted image and the candidate features of the candidate images are obtained through a pre-trained image retrieval model, and the retrieval result image is then determined according to the similarity between the query prediction feature and the candidate features. Because the pre-trained image retrieval model can better distinguish positive samples from negative samples, performing image retrieval with the trained model improves the reasonableness and accuracy of the retrieval results.
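A minimal sketch of this retrieval flow is given below, assuming a PyTorch encoder as the first feature extraction network, L2-normalized features, and top-k selection by cosine similarity; the function name and the value of top_k are illustrative.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def retrieve(first_net, query_image, candidate_images, top_k=5):
    """Rank candidate images in the base library by cosine similarity to the
    query predicted image and return the top-k retrieval result indices."""
    query_feat = F.normalize(first_net(query_image.unsqueeze(0)), dim=-1)   # (1, D)
    cand_feats = F.normalize(first_net(candidate_images), dim=-1)           # (M, D)
    sims = (cand_feats @ query_feat.t()).squeeze(1)                         # (M,)
    scores, indices = sims.topk(min(top_k, sims.numel()))
    return indices, scores
```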
The image retrieval model using device can execute the image retrieval model using method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of executing each image retrieval model using method.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and other processing of the query sample images, query predicted images, and base libraries to be queried involved all comply with the relevant laws and regulations and do not violate public order and good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801, which can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to one another by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 801 executes the methods and processes described above, such as the image retrieval model training method and/or the image retrieval model using method. For example, in some embodiments, the image retrieval model training method and/or the image retrieval model using method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the image retrieval model training method and/or the image retrieval model using method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the image retrieval model training method and/or the image retrieval model using method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/acts specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs running on the respective computers and having a client-server relationship with each other. The server may be a cloud server (also called a cloud computing server or cloud host), a host product in the cloud computing service system that overcomes the defects of difficult management and weak service scalability found in traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system, or a server combined with a blockchain.
Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), covering both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph technologies.
It should be understood that the various forms of flow shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions provided by the present disclosure can be achieved; no limitation is imposed herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (16)

1. An image retrieval model training method comprises the following steps:
acquiring a query sample image and a base library to be queried; wherein the base library to be queried comprises at least one positive sample image of the query sample image and at least one negative sample image of the query sample image;
respectively extracting features of the query sample image and the positive sample image through a first feature extraction network in an image retrieval model to obtain corresponding query sample features and positive sample features; and
performing feature extraction on the negative sample image through a second feature extraction network in the image retrieval model to obtain corresponding negative sample features;
determining difference metric data between the positive sample features and the negative sample features;
and training the image retrieval model according to the difference measurement data.
2. The method of claim 1, wherein the determining difference metric data between the positive and negative sample features comprises:
selecting hard negative sample features from the negative sample features according to a first similarity between the negative sample features and the query sample features;
determining difference metric data between the positive sample features and the hard negative sample features.
3. The method according to claim 2, wherein the selecting hard negative sample features from the negative sample features according to the first similarity comprises:
determining a similarity threshold according to a second similarity between the positive sample features and the query sample features;
and selecting the hard negative sample features from the negative sample features according to the first similarity and the similarity threshold.
4. The method of claim 2 or 3, wherein the determining difference metric data between the positive sample features and the hard negative sample features comprises:
sorting second similarities between the positive sample features and the query sample features in a preset sorting order to obtain positive sample sequence numbers of the positive sample features;
sorting the second similarities and the first similarities of the hard negative sample features in the preset sorting order to obtain hard negative sample sequence numbers of the hard negative sample features;
and determining difference metric data between the positive sample features and the hard negative sample features according to the positive sample sequence numbers and the hard negative sample sequence numbers.
5. The method of claim 1, wherein the determining difference metric data between the positive sample features and the negative sample features comprises:
sorting second similarities between the positive sample features and the query sample features in a preset sorting order to obtain positive sample sequence numbers of the positive sample features;
sorting first similarities between the negative sample features and the query sample features in the preset sorting order to obtain negative sample sequence numbers of the negative sample features;
and determining difference metric data between the positive sample features and the negative sample features according to the positive sample sequence numbers and the negative sample sequence numbers.
6. The method according to any of claims 1-3 and 5, wherein the first feature extraction network and the second feature extraction network have the same network structure; the training the image retrieval model according to the difference metric data comprises:
updating network parameters in the first feature extraction network according to the difference metric data;
and updating the corresponding network parameters in the second feature extraction network according to the network parameters of the first feature extraction network and a preset updating amplitude.
7. An image retrieval model using method, comprising:
acquiring a query predicted image and a base library to be queried;
inputting the query predicted image and the candidate images in the base library to be queried into the first feature extraction network of a trained image retrieval model, respectively, to obtain a query prediction feature and candidate features; wherein the image retrieval model is obtained by training with the method of any one of claims 1-6;
and selecting a retrieval result image of the query predicted image from the candidate images according to the similarity between the query prediction feature and the candidate features.
8. An image retrieval model training apparatus comprising:
the image and base library acquisition module is used for acquiring a query sample image and a base library to be queried; wherein the base library to be queried comprises at least one positive sample image of the query sample image and at least one negative sample image of the query sample image;
the first feature extraction network is used for respectively extracting features of the query sample image and the positive sample image to obtain corresponding query sample features and positive sample features; and
the second feature extraction network is used for carrying out feature extraction on the negative sample image to obtain corresponding negative sample features;
a difference metric data determination module for determining difference metric data between the positive sample features and the negative sample features;
and the model training module is used for training an image retrieval model comprising a first feature extraction network and a second feature extraction network according to the difference measurement data.
9. The apparatus of claim 8, wherein the difference metric data determination module comprises:
the hard negative sample determining unit is used for selecting hard negative sample features from the negative sample features according to a first similarity between the negative sample features and the query sample features;
and the difference data determining unit is used for determining difference metric data between the positive sample features and the hard negative sample features.
10. The apparatus of claim 9, wherein the hard negative sample determining unit comprises:
a similarity threshold determining subunit, configured to determine a similarity threshold according to a second similarity between the positive sample features and the query sample features;
and the hard negative sample feature determining subunit is used for selecting the hard negative sample features from the negative sample features according to the first similarity and the similarity threshold.
11. The apparatus according to claim 9 or 10, wherein the difference data determining unit comprises:
a positive sample sequence number determining subunit, configured to sort second similarities between the positive sample features and the query sample features in a preset sorting order to obtain positive sample sequence numbers of the positive sample features;
a hard negative sample sequence number determining subunit, configured to sort the second similarities and the first similarities of the hard negative sample features in the preset sorting order to obtain hard negative sample sequence numbers of the hard negative sample features;
and the difference metric data determining subunit is used for determining the difference metric data between the positive sample features and the hard negative sample features according to the positive sample sequence numbers and the hard negative sample sequence numbers.
12. The apparatus of claim 8, wherein the difference metric data determination module comprises:
a positive sample sequence number determining unit, configured to sort second similarities between the positive sample features and the query sample features in a preset sorting order to obtain positive sample sequence numbers of the positive sample features;
a negative sample sequence number determining unit, configured to sort first similarities between the negative sample features and the query sample features in the preset sorting order to obtain negative sample sequence numbers of the negative sample features;
and the difference metric unit is used for determining difference metric data between the positive sample features and the negative sample features according to the positive sample sequence numbers and the negative sample sequence numbers.
13. The apparatus according to any of claims 8-10 and 12, wherein the network structure of the first feature extraction network is the same as the network structure of the second feature extraction network; the model training module comprises:
a first parameter updating unit, configured to update a network parameter in the first feature extraction network according to the difference metric data;
and the second parameter updating unit is used for updating the corresponding network parameters in the second feature extraction network according to the network parameters of the first feature extraction network and the preset updating amplitude.
14. An image retrieval model using apparatus comprising:
the image and base library acquisition module is used for acquiring a query predicted image and a base library to be queried;
the image input module is used for respectively inputting the query predicted image and the candidate images in the base library to be queried into the first feature extraction network of a trained image retrieval model to obtain a query prediction feature and candidate features; wherein the image retrieval model is obtained by training with the apparatus of any one of claims 8-13;
and the image retrieval module is used for selecting a retrieval result image of the query predicted image from the candidate images according to the similarity between the query prediction feature and the candidate features.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image retrieval model training method of any one of claims 1-6 and/or the image retrieval model using method of claim 7.
16. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the image retrieval model training method of any one of claims 1 to 6 and/or the image retrieval model using method of claim 7.
CN202211265842.0A 2022-10-17 2022-10-17 Image retrieval model training method, image retrieval model using method, image retrieval model training device, image retrieval model using device, image retrieval model equipment and image retrieval model medium Pending CN115344728A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211265842.0A CN115344728A (en) 2022-10-17 2022-10-17 Image retrieval model training method, image retrieval model using method, image retrieval model training device, image retrieval model using device, image retrieval model equipment and image retrieval model medium

Publications (1)

Publication Number Publication Date
CN115344728A (en) 2022-11-15

Family

ID=83957330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211265842.0A Pending CN115344728A (en) 2022-10-17 2022-10-17 Image retrieval model training method, image retrieval model using method, image retrieval model training device, image retrieval model using device, image retrieval model equipment and image retrieval model medium

Country Status (1)

Country Link
CN (1) CN115344728A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090287620A1 (en) * 2008-05-16 2009-11-19 Samsung Electronics Co., Ltd. System and method for object detection and classification with multiple threshold adaptive boosting
US20150095017A1 (en) * 2013-09-27 2015-04-02 Google Inc. System and method for learning word embeddings using neural language models
CN110674881A (en) * 2019-09-27 2020-01-10 长城计算机软件与系统有限公司 Trademark image retrieval model training method, system, storage medium and computer equipment
CN110750672A (en) * 2019-09-18 2020-02-04 吉林大学 Image retrieval method based on depth metric learning and structure distribution learning loss
CN110851645A (en) * 2019-11-08 2020-02-28 吉林大学 Image retrieval method based on similarity maintenance under depth metric learning
CN113821670A (en) * 2021-07-23 2021-12-21 腾讯科技(深圳)有限公司 Image retrieval method, device, equipment and computer readable storage medium
CN114119989A (en) * 2021-11-29 2022-03-01 北京百度网讯科技有限公司 Training method and device for image feature extraction model and electronic equipment
CN114358109A (en) * 2021-10-26 2022-04-15 腾讯科技(深圳)有限公司 Feature extraction model training method, feature extraction model training device, sample retrieval method, sample retrieval device and computer equipment

Similar Documents

Publication Publication Date Title
CN113657465B (en) Pre-training model generation method and device, electronic equipment and storage medium
CN112906502A (en) Training method, device and equipment of target detection model and storage medium
CN115082920B (en) Deep learning model training method, image processing method and device
CN112560985A (en) Neural network searching method and device and electronic equipment
CN112508126B (en) Deep learning model training method and device, electronic equipment and readable storage medium
CN114648676A (en) Point cloud processing model training and point cloud instance segmentation method and device
CN115063875A (en) Model training method, image processing method, device and electronic equipment
CN112949433B (en) Method, device and equipment for generating video classification model and storage medium
CN113379059B (en) Model training method for quantum data classification and quantum data classification method
CN114882315B (en) Sample generation method, model training method, device, equipment and medium
CN114972877B (en) Image classification model training method and device and electronic equipment
CN113420792A (en) Training method of image model, electronic equipment, road side equipment and cloud control platform
CN113642727B (en) Training method of neural network model and processing method and device of multimedia information
CN113961765A (en) Searching method, device, equipment and medium based on neural network model
CN108428234B (en) Interactive segmentation performance optimization method based on image segmentation result evaluation
CN113449778A (en) Model training method for quantum data classification and quantum data classification method
CN113327227A (en) Rapid wheat head detection method based on MobilenetV3
CN115344728A (en) Image retrieval model training method, image retrieval model using method, image retrieval model training device, image retrieval model using device, image retrieval model equipment and image retrieval model medium
CN114610953A (en) Data classification method, device, equipment and storage medium
CN115579069A (en) Construction method and device of scRNA-Seq cell type annotation database and electronic equipment
CN115359322A (en) Target detection model training method, device, equipment and storage medium
CN114861800A (en) Model training method, probability determination method, device, equipment, medium and product
CN114330576A (en) Model processing method and device, and image recognition method and device
CN114461923B (en) Community discovery method, device, electronic equipment and storage medium
CN113409288B (en) Image definition detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20221115)