CN106407281B - Image retrieval method and device

Info

Publication number: CN106407281B
Application number: CN201610743934.3A
Authority: CN
Inventors: 李甫
Original assignee: Beijing QIYI Century Science and Technology Co Ltd
Current assignee: Beijing QIYI Century Science and Technology Co Ltd
Priority date: 2016-08-26
Filing date: 2016-08-26
Publication date: 2019-12-24
Anticipated expiration: 2036-08-26
Also published as: CN106407281A

Abstract

The invention provides an image retrieval method and device. The method comprises: acquiring a query image; inputting the query image and a plurality of preset evaluation images into a pre-established same-style retrieval model and a pre-established same-category retrieval model, and obtaining as output a query image feature and a plurality of evaluation image features; calculating the similarity between the query image feature and each of the evaluation image features; and sorting the evaluation images by similarity from high to low and selecting a preset number of top-ranked evaluation images as the retrieval result. The invention can significantly improve retrieval accuracy.

Description

Image retrieval method and device

Technical Field

The present invention relates to the field of image retrieval technologies, and in particular, to a content-based image retrieval method and apparatus.

Background

Content-based image retrieval (CBIR) is an image retrieval technique that analyzes the content of an image (e.g., its color, texture, and layout) and retrieves images on that basis. In the current multimedia age, image and video resources are increasingly abundant, and when processing image data it is important to be able to quickly retrieve pictures that are the same as or similar to a query image. For example, an e-commerce platform that can quickly find items of the same style as, or similar to, the item in a query picture can push the corresponding commodities to users quickly and accurately, which improves the user experience and reduces manual workload.

At present, most image retrieval methods first extract features from an image, then calculate similarity using those features, and select the most similar images according to the similarity. For the sake of retrieval speed, many image retrieval algorithms describe images with simple features, so their description of an image is not fine-grained enough. Other image retrieval algorithms rely on a single deep learning model, so the model's description of an image is not comprehensive. How to describe an image both finely and comprehensively is therefore a technical problem in image retrieval.

Disclosure of Invention

In order to improve the image retrieval accuracy, embodiments of the present invention provide an image retrieval method and apparatus.

According to an aspect of the present invention, there is provided an image retrieval method including: acquiring a query image; inputting the query image and a plurality of preset evaluation images into a pre-established same-style retrieval model and a pre-established same-category retrieval model, and obtaining as output a query image feature and a plurality of evaluation image features; calculating the similarity between the query image feature and each of the evaluation image features; and sorting the evaluation images by similarity from high to low and selecting a preset number of top-ranked evaluation images as the retrieval result.

Preferably, the process of establishing the same-style retrieval model includes: constructing a training set, wherein the training set comprises a plurality of query images and corresponding same-style images, the same-style images being images similar in appearance to the query images; and inputting the training set into a deep learning network model for training to establish the same-style retrieval model.

Preferably, the process of establishing the same-category retrieval model includes: constructing a training set, wherein the training set comprises a plurality of query images and corresponding same-category images, the same-category images being images that belong to the same category as the query images; and inputting the training set into a deep learning network model for training to establish the same-category retrieval model.

Preferably, after the query image feature and the evaluation image features are obtained as output, and before the similarity between the query image feature and the evaluation image features is calculated, the method further includes: performing dimension reduction on the query image feature and the evaluation image features, and mapping the query image feature and the evaluation image features after the dimension reduction.

Preferably, principal component analysis is used to perform the dimension reduction on the query image feature and the evaluation image features, and linear discriminant analysis is used to map the query image feature and the evaluation image features after the dimension reduction.

According to another aspect of the present invention, there is provided an image retrieval apparatus comprising: an image acquisition unit, configured to acquire a query image; a model operation unit, configured to input the query image and a plurality of preset evaluation images into a pre-established same-style retrieval model and a pre-established same-category retrieval model, and to obtain as output a query image feature and a plurality of evaluation image features; a similarity calculation unit, configured to calculate the similarity between the query image feature and each of the evaluation image features; and a retrieval result determining unit, configured to sort the evaluation images by similarity from high to low and select a preset number of top-ranked evaluation images as the retrieval result.

Preferably, the apparatus further comprises: a same-style retrieval model establishing unit, configured to construct a training set comprising a plurality of query images and corresponding same-style images, the same-style images being images similar in appearance to the query images, and to input the training set into a deep learning network model for training so as to establish the same-style retrieval model.

Preferably, the apparatus further comprises: a same-category retrieval model establishing unit, configured to construct a training set comprising a plurality of query images and corresponding same-category images, the same-category images being images that belong to the same category as the query images, and to input the training set into a deep learning network model for training so as to establish the same-category retrieval model.

Preferably, the apparatus further comprises: a dimension reduction processing unit, configured to perform dimension reduction on the query image feature and the evaluation image features; and a mapping unit, configured to map the query image feature and the evaluation image features after the dimension reduction.

Preferably, the dimension reduction processing unit uses principal component analysis to perform the dimension reduction on the query image feature and the evaluation image features, and the mapping unit uses linear discriminant analysis to map the query image feature and the evaluation image features after the dimension reduction.

Therefore, the embodiments of the invention provide an image retrieval method that fuses multiple models, so that the returned results are both similar in appearance to the query image and of the same category as it; fusing these two complementary features can significantly improve retrieval accuracy. In addition, the invention uses PCA and LDA to perform dimension reduction and space mapping on the fused features, which greatly improves retrieval speed with only a small loss of precision.

Drawings

FIG. 1 is a flowchart of an image retrieval method according to an embodiment of the present invention;

FIG. 2 is a flowchart of an image retrieval method according to another embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an image retrieval apparatus according to an embodiment of the present invention.

Detailed Description

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

In the prior art, for the sake of retrieval speed, many image retrieval algorithms describe images with simple features, so their description of an image is not fine-grained enough. Other image retrieval algorithms rely on a single deep learning model, so the model's description of an image is not comprehensive. Therefore, targeting the specific retrieval purpose of finding commodities of the same style, the invention provides a retrieval strategy that fuses a same-style model and a same-category model, so that the description of the image is finer and more comprehensive. In a preferred mode, the feature dimension is reduced through principal component analysis (PCA) and the features are mapped through linear discriminant analysis (LDA), thereby improving the retrieval speed.

Referring to fig. 1, a flowchart of an image retrieval method provided in an embodiment of the present invention is shown, where the method includes:

S101: acquiring a query image;

S102: inputting the query image and a plurality of preset evaluation images into a pre-established same-style retrieval model and a pre-established same-category retrieval model, and obtaining as output a query image feature and a plurality of evaluation image features;

S103: calculating the similarity between the query image feature and each of the evaluation image features;

S104: sorting the evaluation images by similarity from high to low, and selecting a preset number of top-ranked evaluation images as the retrieval result.
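The flow of steps S101 to S104 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the feature vectors are made-up toy values standing in for the outputs of the two retrieval models, cosine similarity is assumed as the similarity measure (the text does not prescribe one), and the names `cosine_similarity` and `retrieve_top_k` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero feature as dissimilar to everything
    return dot / (norm_a * norm_b)

def retrieve_top_k(query_feature, evaluation_features, k):
    """Rank evaluation images by similarity (high to low) and keep the first k.

    evaluation_features maps an image identifier to its feature vector.
    Returns a list of (image_id, similarity) pairs.
    """
    scored = [
        (image_id, cosine_similarity(query_feature, feature))
        for image_id, feature in evaluation_features.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy features (assumed values) standing in for model outputs.
query = [0.9, 0.1, 0.4]
gallery = {
    "striped_tshirt": [0.8, 0.2, 0.5],
    "striped_jacket": [0.7, 0.9, 0.1],
    "plain_dress": [0.0, 1.0, 0.0],
}
top2 = retrieve_top_k(query, gallery, k=2)
print([image_id for image_id, _ in top2])
```

Sorting the full list is adequate for a small evaluation set; for a large one, a partial selection such as `heapq.nlargest` would avoid sorting every candidate.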

For image retrieval, the final goal is to retrieve images whose appearance is the same as or similar to that of the query image. Based on deep learning, two models can be established in advance: a same-style retrieval model, used to find commodities of the same style or pattern so that the returned results are similar in appearance to the query image; and a same-category retrieval model, used to find commodities of the same category so that the returned results belong to the same category as the query image. Fusing the same-style retrieval model and the same-category retrieval model allows image features to be described more finely and comprehensively, which improves retrieval accuracy. As those skilled in the art will appreciate, deep learning derives from artificial neural networks: it combines low-level features into more abstract high-level representations (attribute classes or features) in order to discover a distributed feature representation of the data. The features produced by a model established through deep learning can be understood as expressing the same amount of information as the original data with a lower-dimensional data vector.

In the invention, "same style" means that the appearances of two images are the same or similar, and "same category" means that the articles in two images belong to the same category. Taking image queries for clothing as an example, two garments with black and white horizontal stripes are of the same style; the same-style retrieval model can match garments of that style, which may include T-shirts, shirts, and coats. The same-category retrieval model can further match clothing images of the same category as the query image: for example, if the query image shows a T-shirt, only T-shirts are matched by the same-category retrieval model.

For simplicity and intuition of description, the embodiment of the invention is described by taking image retrieval of clothing articles as an example. However, it should be noted that the embodiment of the present invention is not limited to the search of the clothing product image, and may also be used for searching other product images, such as electric appliances, furniture, daily necessities, and the like.

Another embodiment of the present invention will be described below by taking the clothing image retrieval as an example.

The difference from the embodiment of fig. 1 is that the present embodiment describes the building processes of two retrieval models in detail (S201, S202), and adds the step of dimension reduction and mapping for features (S205). Referring to fig. 2, a flowchart of an image retrieval method according to another embodiment of the present invention includes the following steps S201 to S207.

S201: establishing a same-style retrieval model.

In summary, the process of establishing the same-style retrieval model includes two steps: constructing a training set, wherein the training set comprises a plurality of query images and corresponding same-style images, the same-style images being images similar in appearance to the query images; and inputting the training set into a deep learning network for training to establish the same-style retrieval model.

Specifically, when the training set is constructed, a plurality of query images and their corresponding same-style images are collected, and each group of images with similar appearance is stored as one category for training. Meanwhile, in order to capture more details of the commodity and remove noise at the image edges, all images can be preprocessed: each original image is enlarged, and a region of predetermined size at the center of the enlarged image is then cropped out as additional data, thereby increasing the number of training samples.
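The enlarge-then-center-crop preprocessing can be sketched as below. This is only an illustration under assumptions: images are represented as plain nested lists of pixel values, enlargement uses nearest-neighbour scaling by an integer factor (the text does not specify the interpolation, scale factors, or crop size), and the helper names `enlarge_nearest`, `center_crop`, and `augment` are hypothetical.

```python
def enlarge_nearest(image, factor):
    """Enlarge a 2-D image (list of rows) by an integer factor, nearest neighbour."""
    enlarged = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged

def center_crop(image, crop_h, crop_w):
    """Cut a crop_h x crop_w region out of the center of the image."""
    height, width = len(image), len(image[0])
    if crop_h > height or crop_w > width:
        raise ValueError("crop size exceeds image size")
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

def augment(image, factors, crop_h, crop_w):
    """Return one center crop per enlargement factor as extra training samples."""
    return [center_crop(enlarge_nearest(image, f), crop_h, crop_w) for f in factors]

# A 4 x 4 toy "image" whose pixel values encode their position.
original = [[r * 4 + c for c in range(4)] for r in range(4)]
extra_samples = augment(original, factors=[2, 3], crop_h=6, crop_w=6)
print(len(extra_samples), len(extra_samples[0]), len(extra_samples[0][0]))
```

A larger enlargement factor with the same crop size keeps a smaller central part of the original image, which is how the crop discards border content while magnifying detail.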

Then, the constructed training set is input into a deep learning network for classification training. For example, the deep learning network may be a GoogLeNet network.

For deep learning algorithms, improving the performance of image classification or retrieval generally requires increasing the depth of the model or the number of filters and neurons. The GoogLeNet network adopts a 22-layer structure; two auxiliary loss layers are added at different depths to keep gradient values from vanishing, which avoids the vanishing-gradient problem caused by too many layers. Meanwhile, the GoogLeNet network uses convolution kernels of several sizes, such as 1 × 1, 3 × 3, and 5 × 5. In addition, 1 × 1 convolution kernels are added before the 3 × 3 and 5 × 5 convolutions and after the pooling layer to reduce the thickness (number of channels) of the feature map and prevent the dimension of the final concatenated feature from becoming too large.
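How a 1 × 1 convolution reduces the thickness of a feature map can be illustrated with a small sketch. It is not GoogLeNet code: the feature map is a toy nested list of shape H × W × C_in, the weights are made-up values, bias and nonlinearity are omitted, and the function name `conv1x1` is hypothetical.

```python
def conv1x1(feature_map, weights):
    """Apply a 1 x 1 convolution to an H x W x C_in feature map.

    weights is a C_out x C_in matrix; each output pixel is a linear
    combination of the input channels at the same spatial position.
    """
    return [
        [
            [sum(w * x for w, x in zip(kernel, pixel)) for kernel in weights]
            for pixel in row
        ]
        for row in feature_map
    ]

# 2 x 2 feature map with 4 channels, reduced to 2 channels.
feature_map = [
    [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]],
    [[2.0, 2.0, 2.0, 2.0], [1.0, 0.0, 0.0, 0.0]],
]
weights = [
    [0.25, 0.25, 0.25, 0.25],  # output channel 0: mean of the input channels
    [1.0, 0.0, 0.0, -1.0],     # output channel 1: first minus last channel
]
reduced = conv1x1(feature_map, weights)
print(len(reduced), len(reduced[0]), len(reduced[0][0]))
```

The spatial size stays 2 × 2 while the channel count drops from 4 to 2, so any larger convolution applied afterwards operates on fewer input channels.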

Therefore, the GoogLeNet network is preferably used for classification training. After many iterations, the loss of the network remains essentially unchanged, and the final classification accuracy is high.

Although the GoogLeNet network is preferably used for model training in the present invention, the invention is not limited to it, and other deep learning networks may also be used for model training.

In practical operation, it was found that retrieving images with the same-style retrieval model alone causes an obvious problem: the returned results are images similar in appearance to the query, but whether they belong to the same category as the queried commodity is not considered. For example, when an image of a striped T-shirt is input and only the same-style retrieval model is used, the returned commodities all have stripe patterns but include not only the desired T-shirts but also undesired items such as jackets and polo shirts, which is unreasonable. The expected retrieval results should be both similar in appearance to the query image and of the same category, so a same-category retrieval model is added.

S202: establishing a same-category retrieval model.

In summary, the process of establishing the same-category retrieval model includes the following two steps: constructing a training set, wherein the training set comprises a plurality of query images and corresponding same-category images, the same-category images being images that belong to the same category as the query images; and inputting the training set into a deep learning network for training to establish the same-category retrieval model.

To make the number of training categories sufficiently large, a publicly available data source in the field may be used, for example the public competition data set of the ImageNet database, which contains 1000 categories and 1.26 million training images in total. It should be noted that a category in this training set is a different concept from a category in the same-style retrieval model. In the same-style retrieval model, one category contains images of the same commodity (or of commodities of the same style) in different poses and from different angles. In the same-category retrieval model, the images in each category belong to the same narrow category, such as T-shirts, down jackets, or one-piece dresses, regardless of whether their patterns and styles are the same. Thus, the same-style retrieval model focuses on similar appearance, while the same-category retrieval model constrains results to the same category.

Features were first extracted with a GoogLeNet network pre-trained on this training set, but experiments showed that the effect was not ideal. The training data set and the number of categories can therefore be increased further, for example by using all the data of the ImageNet database, which has more images and more categories, i.e., a finer classification. During training, the model can then capture subtler differences between categories, and the learned features describe the image more finely. In specific operation, features can be extracted with the model trained on the ImageNet database; the extracted features and the features of the same-style retrieval model are each L2-normalized and then concatenated into a longer feature, which serves as the final feature describing the image.
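The fusion step, normalizing each model's feature by its two-norm and then concatenating the results, might look like the following sketch. The input vectors are made-up toy values rather than real model outputs, and the names `l2_normalize` and `fuse_features` are hypothetical.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean (two-norm) length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # leave an all-zero vector unchanged
    return [x / norm for x in vector]

def fuse_features(style_feature, category_feature):
    """Normalize each feature separately, then concatenate them."""
    return l2_normalize(style_feature) + l2_normalize(category_feature)

# Toy stand-ins for the outputs of the two retrieval models.
style_feature = [3.0, 4.0]
category_feature = [0.0, 5.0, 12.0]
fused = fuse_features(style_feature, category_feature)
print(len(fused), fused)
```

Normalizing before concatenation gives both parts unit length, so neither model's feature dominates the later similarity calculation merely because its raw values are larger.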

S203: acquiring a query image.

The query image is generally a commodity image input by a user who wants to find the same or similar commodities.

S204: inputting the query image and a plurality of preset evaluation images into the same-style retrieval model and the same-category retrieval model, and obtaining as output a query image feature and a plurality of evaluation image features.

The evaluation images are preset and stored in advance, and may be stored separately for different kinds of commodities. For example, for clothing items, clothing images are preset as a base database for query and retrieval.

As described above, both the same-style retrieval model and the same-category retrieval model learn to extract features through a deep learning algorithm when they are established. Therefore, after the query image and the evaluation images are input into the two models, the query image feature and the evaluation image features can be obtained.

S205: performing dimension reduction on the query image feature and the plurality of evaluation image features, and mapping the query image feature and the plurality of evaluation image features after the dimension reduction.

Because the invention uses both the same-style retrieval model and the same-category retrieval model, the number of features increases, and combining several features clearly increases the feature dimension, which affects the efficiency of the subsequent similarity calculation. Therefore, dimension reduction can be performed on the query image feature and the evaluation image features, and the reduced features can then be mapped, thereby improving processing efficiency.

Specifically, Principal Component Analysis (PCA) can be used to perform dimension reduction on the features, and Linear Discriminant Analysis (LDA) can be used to map the features after dimension reduction.

PCA is a statistical method that uses an orthogonal transformation to convert a set of possibly correlated variables into a set of linearly uncorrelated variables; the transformed variables are called principal components. Starting from all the original variables, principal component analysis eliminates redundant (closely correlated) variables and constructs as few new variables as possible, such that the new variables are pairwise uncorrelated and retain as much of the original information as possible.

LDA is a classification algorithm. It projects historical (already labeled) data so that, after projection, data of the same category are as close together as possible and data of different categories are as far apart as possible, producing a linear discriminant model that can be used to separate and predict newly generated data.

In the embodiment of the invention, because the extracted query image feature and evaluation image features are sparse, PCA dimension reduction can remove unnecessary noise and retain the more valuable feature dimensions. LDA can further increase the distance between feature classes and reduce the intra-class distance, so that the projected samples can be distinguished more clearly.
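The PCA-then-LDA processing can be sketched as follows. This is an illustrative sketch only and relies on several assumptions that are not in the text: NumPy is used for the linear algebra, the fused features are replaced by randomly generated toy data with three labeled classes, the target dimensions (5 after PCA, 2 after LDA) are arbitrary, and the function names `fit_pca` and `fit_lda` are hypothetical.

```python
import numpy as np

def fit_pca(features, n_components):
    """Return (mean, components) so that (x - mean) @ components reduces x."""
    mean = features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components].T

def fit_lda(features, labels, n_components):
    """Return a projection that favours between-class over within-class scatter."""
    overall_mean = features.mean(axis=0)
    dim = features.shape[1]
    within = np.zeros((dim, dim))
    between = np.zeros((dim, dim))
    for label in np.unique(labels):
        group = features[labels == label]
        centered = group - group.mean(axis=0)
        within += centered.T @ centered
        diff = (group.mean(axis=0) - overall_mean).reshape(-1, 1)
        between += len(group) * (diff @ diff.T)
    # Solve the generalized eigenproblem via pinv(within) @ between.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(within) @ between)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs.real[:, order]

rng = np.random.default_rng(0)
# Toy "fused features": 3 classes, 20 samples each, 10 dimensions.
centers = rng.normal(scale=5.0, size=(3, 10))
labels = np.repeat(np.arange(3), 20)
features = centers[labels] + rng.normal(size=(60, 10))

mean, pca_matrix = fit_pca(features, n_components=5)
reduced = (features - mean) @ pca_matrix        # PCA dimension reduction
lda_matrix = fit_lda(reduced, labels, n_components=2)
mapped = reduced @ lda_matrix                   # LDA mapping
print(reduced.shape, mapped.shape)
```

Both projections are fitted once on labeled data and then applied unchanged to the query image feature and to every evaluation image feature, so that all vectors being compared live in the same low-dimensional space.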

S206: calculating the similarity between the query image feature and the evaluation image features after the dimension reduction and mapping.

Similarity calculation measures how similar objects are. Common similarity calculation methods include vector-space-based methods, hash-based methods, topic-based methods, and the like. The embodiment of the present invention may use any existing or future similarity calculation method and is not limited in this respect.
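As one possible instance of the hash-based family mentioned above, features can be binarized and compared by the fraction of matching bits. This is a simple illustrative scheme chosen for the example, not a method prescribed by the text: the sign-threshold hashing, the toy feature values, and the names `binarize` and `hamming_similarity` are all assumptions.

```python
def binarize(feature, threshold=0.0):
    """Turn a real-valued feature into a bit list: 1 where value > threshold."""
    return [1 if value > threshold else 0 for value in feature]

def hamming_similarity(bits_a, bits_b):
    """Fraction of positions where two equal-length bit lists agree."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit lists must have the same length")
    matches = sum(1 for a, b in zip(bits_a, bits_b) if a == b)
    return matches / len(bits_a)

# Toy features (assumed values) after dimension reduction and mapping.
query_bits = binarize([0.7, -0.2, 0.1, -0.9])
close_bits = binarize([0.5, -0.1, 0.3, -0.4])
far_bits = binarize([-0.6, 0.8, -0.2, 0.3])
print(hamming_similarity(query_bits, close_bits),
      hamming_similarity(query_bits, far_bits))
```

Binary codes trade some precision for compact storage and cheap comparison, which is why hash-based measures are often considered when the evaluation set is large.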

S207: sorting the evaluation images by similarity from high to low, and selecting a preset number of top-ranked evaluation images as the retrieval result.

It can be understood that the higher the similarity between an evaluation image and the query image, the closer the two are, in both style and category. By sorting the evaluation images by similarity from high to low, a predetermined number of images closest to the query image can be selected. For example, the 100 evaluation images with the highest similarity are selected.

It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or a combination of acts, but those skilled in the art will recognize that the present invention is not limited by the described order of acts, as some steps may be performed in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily all required by the invention.

Fig. 3 is a schematic structural diagram of an image retrieval apparatus according to an embodiment of the present invention. The device includes:

an image acquisition unit 301, configured to acquire a query image;

a model operation unit 302, configured to input the query image and a plurality of preset evaluation images into a pre-established same-style retrieval model and a pre-established same-category retrieval model, and to obtain as output a query image feature and a plurality of evaluation image features;

a similarity calculation unit 303, configured to calculate the similarity between the query image feature and each of the evaluation image features;

a retrieval result determining unit 304, configured to sort the evaluation images by similarity from high to low and select a preset number of top-ranked evaluation images as the retrieval result.

Preferably, the apparatus further comprises:

a same-style retrieval model establishing unit 305, configured to construct a training set, where the training set includes a plurality of query images and corresponding same-style images, the same-style images being images similar in appearance to the query images; and to input the training set into a deep learning network model for training to establish the same-style retrieval model.

In summary, the process of establishing the same-style retrieval model includes two steps: constructing a training set, wherein the training set comprises a plurality of query images and corresponding same-style images, the same-style images being images similar in appearance to the query images; and inputting the training set into a deep learning network model for training to establish the same-style retrieval model.

Specifically, when the training set is constructed, a plurality of query images and their corresponding same-style images are collected, and each group of pictures with similar appearance is stored as one category for training. Meanwhile, in order to capture more details of the commodity and remove noise at the image edges, all images can be preprocessed: each original image is enlarged, and a region of predetermined size at the center of the enlarged image is then cropped out as additional data, thereby increasing the number of training samples.

For deep learning algorithms, improving the performance of image classification or retrieval generally requires increasing the depth of the model or the number of filters and neurons. The GoogLeNet network adopts a 22-layer structure; two auxiliary loss layers are added at different depths to keep gradient values from vanishing, which avoids the vanishing-gradient problem caused by too many layers. Meanwhile, the GoogLeNet network uses convolution kernels of several sizes, such as 1 × 1, 3 × 3, and 5 × 5. In addition, 1 × 1 convolution kernels are added before the 3 × 3 and 5 × 5 convolutions and after the pooling layer to reduce the thickness (number of channels) of the feature map and prevent the dimension of the final concatenated feature from becoming too large.

In practical operation, it was found that retrieving images with the same-style retrieval model alone causes an obvious problem: the returned results are images similar in appearance to the query, but whether they belong to the same category as the queried commodity is not considered. For example, when an image of a striped T-shirt is input and only the same-style retrieval model is used, the returned commodities all have stripe patterns but include not only the desired T-shirts but also undesired items such as jackets and polo shirts, which is unreasonable. The expected retrieval results should be both similar in appearance to the query image and of the same category, so a same-category retrieval model is added.

Preferably, the apparatus further comprises:

a same-category retrieval model establishing unit 306, configured to construct a training set, where the training set includes a plurality of query images and corresponding same-category images, the same-category images being images that belong to the same category as the query images; and to input the training set into a deep learning network model for training to establish the same-category retrieval model.

Features were first extracted with a GoogLeNet network pre-trained on a training set, but experiments showed that the effect was not ideal. The training data set and the number of categories can therefore be increased further, for example by using all the data of the ImageNet database, which has more images and more categories, i.e., a finer classification. During training, the model can then capture subtler differences between categories, and the learned features describe the image more finely.

Preferably, the apparatus further comprises:

a dimension reduction processing unit 307, configured to perform dimension reduction on the query image feature and the plurality of evaluation image features;

a mapping unit 308, configured to map the query image feature and the plurality of evaluation image features after the dimension reduction.

Specifically, principal component analysis (PCA) can be used to perform dimension reduction on the features, and linear discriminant analysis (LDA) can be used to map the features after dimension reduction. Because the extracted features are sparse, PCA dimension reduction can remove unnecessary noise and retain the more valuable feature dimensions. LDA can further increase the distance between feature classes and reduce the intra-class distance, so that the projected samples can be distinguished more clearly.

Because the device embodiment is basically similar to the method embodiment, it is described only briefly; for the relevant details, refer to the corresponding description of the method embodiment.

The embodiments in the present specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.

Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.

The image retrieval method and device provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present invention, and the descriptions of the foregoing examples are only intended to help understand the method and core ideas of the present invention. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present invention. In summary, the content of the present specification should not be construed as a limitation on the present invention.

Claims

1. An image retrieval method, comprising:

acquiring an image to be queried;

inputting the image to be queried and a plurality of preset evaluation images into a pre-established same-style retrieval model and a pre-established same-class retrieval model, and outputting an image feature to be queried and a plurality of evaluation image features;

performing similarity calculation between the image feature to be queried and the plurality of evaluation image features;

sorting the plurality of evaluation images from high to low according to the similarity, and selecting a preset number of top-ranked evaluation images as a retrieval result;

wherein the establishment process of the same-style retrieval model comprises:

constructing a training set, wherein the training set comprises a plurality of images to be queried and corresponding same-style images, and a region of a preset size at the center is cropped after the images to be queried and the same-style images are enlarged, wherein the same-style images are images similar in appearance to the images to be queried;

and inputting the training set into a deep learning network model for training, to obtain the same-style retrieval model.

2. The method according to claim 1, wherein the establishment process of the same-class retrieval model comprises:

constructing a training set, wherein the training set comprises a plurality of images to be queried and corresponding same-class images, and the same-class images belong to the same class as the images to be queried;

and inputting the training set into a deep learning network model for training, to obtain the same-class retrieval model.

3. The method according to any one of claims 1-2, wherein after outputting the image feature to be queried and the plurality of evaluation image features, and before performing the similarity calculation between the image feature to be queried and the plurality of evaluation image features, the method further comprises:

performing dimension reduction processing on the image feature to be queried and the plurality of evaluation image features, and mapping the image feature to be queried and the plurality of evaluation image features after the dimension reduction processing.

4. The method according to claim 3, wherein a principal component analysis method is used to perform the dimension reduction processing on the image feature to be queried and the plurality of evaluation image features; and a linear discriminant analysis method is used to map the image feature to be queried and the plurality of evaluation image features after the dimension reduction processing.

5. An image retrieval apparatus, comprising:

an image acquisition unit, configured to acquire an image to be queried;

a model operation unit, configured to input the image to be queried and a plurality of preset evaluation images into a pre-established same-style retrieval model and a pre-established same-class retrieval model, and to output an image feature to be queried and a plurality of evaluation image features;

a similarity calculation unit, configured to calculate the similarity between the image feature to be queried and the plurality of evaluation image features;

a retrieval result determining unit, configured to sort the plurality of evaluation images from high to low according to the similarity, and to select a preset number of top-ranked evaluation images as a retrieval result;

and further comprising:

a same-style retrieval model establishing unit, configured to construct a training set, wherein the training set comprises a plurality of images to be queried and corresponding same-style images, and a region of a preset size at the center is cropped after the images to be queried and the same-style images are enlarged, wherein the same-style images are images similar in appearance to the images to be queried; and to input the training set into a deep learning network model for training, to obtain the same-style retrieval model.

6. The apparatus of claim 5, further comprising:

a same-class retrieval model establishing unit, configured to construct a training set, wherein the training set comprises a plurality of images to be queried and corresponding same-class images, and the same-class images belong to the same class as the images to be queried; and to input the training set into a deep learning network model for training, to obtain the same-class retrieval model.

7. The apparatus of any of claims 5-6, further comprising:

a dimension reduction processing unit, configured to perform dimension reduction processing on the image feature to be queried and the plurality of evaluation image features;

and a mapping unit, configured to map the image feature to be queried and the plurality of evaluation image features after the dimension reduction processing.

8. The apparatus according to claim 7, wherein the dimension reduction processing unit performs the dimension reduction processing on the image feature to be queried and the plurality of evaluation image features by using a principal component analysis method; and the mapping unit maps the image feature to be queried and the plurality of evaluation image features after the dimension reduction processing by using a linear discriminant analysis method.
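As a non-limiting illustration of the similarity calculation and ranking steps recited in the claims, the selection of the retrieval result might be sketched as below. This is a minimal sketch under stated assumptions: cosine similarity is assumed as the similarity measure, each image is assumed to be represented by a single feature vector (how the outputs of the two retrieval models are combined is not addressed here), and the function names `cosine_similarity` and `retrieve_top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_feature, evaluation_features, k):
    """Return up to k (index, similarity) pairs, sorted from high to low similarity."""
    scored = [
        (cosine_similarity(query_feature, feature), index)
        for index, feature in enumerate(evaluation_features)
    ]
    # The sort is stable, so evaluation images with equal similarity
    # keep their original relative order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(index, score) for score, index in scored[:k]]
```

Here `k` plays the role of the preset number of evaluation images, and the returned indices identify the evaluation images selected as the retrieval result.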