WO2020125100A1 - Image search method, apparatus, and device

Image search method, apparatus, and device

Info

Publication number
WO2020125100A1
Authority
WO
WIPO (PCT)
Prior art keywords: matching, image, query image, local, features
Application number
PCT/CN2019/106737
Other languages
French (fr)
Chinese (zh)
Inventor
党琪
索勃
顾明
钟伟才
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2020125100A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/53 Querying

Definitions

  • The present application relates to the field of computer technology, and in particular to an image retrieval method and corresponding apparatus and devices.
  • Image retrieval is a technique for retrieving matching images that are similar or identical to a query image input by a user.
  • The technology has a wide range of applications, such as search engines and infringing-image recognition. Taking infringing-image identification as an example: in today's copyright market, a large number of copyrighted images are used at will without authorization, so technical means are needed to retrieve infringing images on the network and help image agencies and photographers combat infringement and piracy. The volume of image data on today's networks is enormous, which places higher demands on the efficiency of image retrieval.
  • The present application provides an image retrieval method that improves the efficiency of image retrieval.
  • In a first aspect, an image retrieval method is provided. The method is executed by at least one computing device and is used to retrieve matching images that are the same as or similar to a query image input by a user. The method includes the following three-stage matching process:
  • First-stage matching: obtain the hash value of the global feature of the query image and match it against the hash values of the global features of the matching images in the matching image library; based on the matching result, select a first candidate set from the matching images in the matching image library.
  • Second-stage matching: obtain the global features of the query image and match them against the global features of the matching images in the first candidate set; based on the matching result, select a second candidate set from the matching images in the first candidate set.
  • Third-stage matching: obtain the local features of the query image and match them against the local features of the matching images in the second candidate set; based on the matching result, determine the retrieval result from the matching images in the second candidate set.
  • The hash values of the global features, the global features, and the local features of the matching images in the matching image library are extracted in advance and stored in a feature library before the method is executed.
  • The first-stage matching quickly filters out a large number of matching images through hash-value matching, which improves retrieval efficiency.
  • The second-stage and third-stage matching then perform finer matching on the remaining images, ensuring retrieval accuracy.
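  • As a rough illustration of this three-stage cascade, the following Python sketch treats the stages as successive filters over a feature library. All helper functions, field names, and threshold values are hypothetical placeholders for illustration, not the patent's actual implementation.

```python
import numpy as np

def hamming_similarity(a, b):
    """Fraction of equal bits between two equal-length binary hash sequences."""
    return float(np.mean(np.asarray(a) == np.asarray(b)))

def cosine_similarity(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def count_local_matches(q_desc, m_desc, ratio=0.8):
    """Count query descriptors whose nearest neighbour passes a distance ratio test."""
    q_desc, m_desc = np.asarray(q_desc, dtype=float), np.asarray(m_desc, dtype=float)
    matches = 0
    for q in q_desc:
        d = np.linalg.norm(m_desc - q, axis=1)
        if len(d) < 2:
            continue
        d1, d2 = np.partition(d, 1)[:2]
        if d1 < ratio * d2:
            matches += 1
    return matches

def three_stage_search(q_hash, q_global, q_local, feature_db,
                       t1=0.9, t2=0.8, min_matches=10):
    # Stage 1: coarse filter on the hash values of the global features.
    first_set = [i for i, rec in feature_db.items()
                 if hamming_similarity(q_hash, rec["hash"]) > t1]
    # Stage 2: finer filter on the global features themselves.
    second_set = [i for i in first_set
                  if cosine_similarity(q_global, feature_db[i]["global"]) > t2]
    # Stage 3: local-feature matching on the few remaining candidates.
    return [i for i in second_set
            if count_local_matches(q_local, feature_db[i]["local"]) >= min_matches]
```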
  • In a possible implementation, the method further includes obtaining position information of the local features of the query image, where the position information includes the coordinate information of the local features of the query image or the region information of the local features of the query image within the query image. When the position information includes coordinate information, the region information of the local features is further derived from that coordinate information.
  • In this implementation, determining the retrieval result from the matching images of the second candidate set includes: matching the region information of the local features of the query image against the region information of the local features of the matching images in the second candidate set, and determining the retrieval result according to this matching result, where the similarity between the region information of the local features of a matching image in the retrieval result and the region information of the local features of the query image is greater than a first threshold.
  • Using the position information of local features to filter the matching images used as the retrieval result reduces erroneous results caused by mismatched local features and further improves retrieval accuracy.
  • In another possible implementation, the method further includes obtaining position information of the local features of the query image, where the position information includes the region information of the local features of the query image within the query image. When the position information includes coordinate information, the region information of the local features is further derived from that coordinate information.
  • In this implementation, determining the retrieval result from the matching images of the second candidate set includes: matching the local features of the query image against the local features of the matching images in the second candidate set; based on the region information of the matched local features of the query image, obtaining a first region distribution vector of those local features over the query image; based on the region information of the matched local features of a matching image in the second candidate set, obtaining a second region distribution vector of those local features over that matching image; and determining that the matching image in the second candidate set is included in the retrieval result if the similarity between the first region distribution vector and the second region distribution vector is greater than a second threshold.
  • In a second aspect, an image retrieval apparatus is provided, including a first matching module, a second matching module, and a third matching module. The three modules can be implemented as software running on a computing device, and they may run on different computing devices.
  • In a third aspect, a computing device system is provided, including at least one computing device, each including a processor and a memory. The processor of the at least one computing device executes program code in the memory to perform the method provided in the first aspect or any possible implementation of the first aspect.
  • In a fourth aspect, a non-transitory readable storage medium is provided. When the program stored on the medium is executed by at least one computing device, the at least one computing device performs the method provided in the first aspect or any possible implementation of the first aspect.
  • The storage medium stores the program. The types of storage media include, but are not limited to, volatile memory such as random access memory, and non-volatile memory such as flash memory, hard disk drives (HDD), and solid-state drives (SSD).
  • In a fifth aspect, a computing device program product is provided. When the computing device program product is executed by at least one computing device, the at least one computing device performs the method provided in the first aspect or any possible implementation of the first aspect.
  • The computer program product may be a software installation package; when the method provided in the first aspect or any possible implementation of the first aspect needs to be used, the computer program product can be downloaded to and executed on a computing device.
  • In a sixth aspect, another image retrieval method is provided. When matching the local features of the query image against the local features of a matching image, the method represents the region information of the local features of the query image and of the matching image as region distribution vectors, and determines whether the query image and the matching image match by comparing the similarity between the region distribution vectors. This reduces the negative impact of mismatched local features on the retrieval result and improves retrieval accuracy.
  • The method includes: obtaining position information of at least one local feature of the query image and position information of at least one local feature of the matching image, where the position information of each local feature of the query image includes the region information of that local feature within the query image, and the position information of each local feature of the matching image includes the region information of that local feature within the matching image; when the position information includes coordinate information, the region information is further derived from the coordinates; based on the region information of the at least one local feature of the query image, obtaining a first region distribution vector of the at least one local feature over the query image, where the first region distribution vector records the number of local features contained in each region of the query image; based on the region information of the at least one local feature of the matching image, obtaining a second region distribution vector of the at least one local feature over the matching image, where the second region distribution vector records the number of local features contained in each region of the matching image; and determining whether the query image and the matching image match by comparing the similarity between the first and second region distribution vectors.
  • A computing device system is also provided, including at least one computing device, each including a processor and a memory. The processor of the at least one computing device executes program code in the memory to perform the method provided in the sixth aspect.
  • A non-transitory readable storage medium is also provided. When it is executed by at least one computing device, the at least one computing device performs the method provided in the sixth aspect.
  • A computing device program product is also provided. When it is executed by at least one computing device, the at least one computing device performs the method provided in the sixth aspect.
  • FIG. 1 is a schematic diagram of the system architecture provided by this application.
  • FIG. 2 is a schematic diagram of a system operating environment provided by this application.
  • FIG. 3 is a schematic diagram of another system operating environment provided by this application.
  • FIG. 6 is a schematic diagram of the position information of feature points provided by this application.
  • FIG. 10 is a schematic structural diagram of a preprocessing apparatus provided by this application.
  • FIG. 11 is a schematic structural diagram of an image retrieval apparatus provided by this application.
  • FIG. 12 is a schematic structural diagram of an image retrieval system provided by this application.
  • Global features: features extracted with the entire image as input, generally used to express the overall characteristics of the image.
  • Global features include any one or a combination of the following: pre-trained deep learning features, GIST features, local binary pattern (LBP) features, or color distribution features.
  • A pre-trained deep learning feature is an image feature computed using a pre-trained convolutional neural network model, where the pre-trained model refers to a model obtained by training on the images of a data set.
  • One such data set is the ImageNet project, a large visual database for visual object recognition research, containing 1,000 categories and tens of millions of images.
  • Color distribution features refer to image features represented by color statistical distribution vectors, such as color histograms, color moments, and color coherence vectors. A color histogram describes the distribution of the image's colors over different intervals of a color space, such as the red-green-blue (RGB) color space; color moments include the first-order moment, second-order moment, third-order moment, and so on.
  • The GIST feature is an abstract scene representation that naturally evokes the concept of different scene categories, such as cities and mountains. GIST features were first described in Friedman, A., "Framing pictures: The role of knowledge in automatized encoding and memory", Journal of Experimental Psychology: General, 1979, 108(3): 316.
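  • As one concrete, hedged illustration of the color distribution features mentioned above, the sketch below computes a normalized RGB color histogram with NumPy; it only stands in for whichever global feature extractor is actually used.

```python
import numpy as np

def rgb_histogram(image, bins=8):
    """Normalized RGB color histogram used as a simple global feature.

    `image` is an H x W x 3 uint8 array; the result is a vector of length
    3 * bins describing the color distribution of the whole image.
    """
    channels = []
    for c in range(3):
        hist, _ = np.histogram(image[:, :, c], bins=bins, range=(0, 256))
        channels.append(hist)
    feature = np.concatenate(channels).astype(float)
    return feature / (feature.sum() + 1e-12)
```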
  • Local features: features extracted with part of the image as input, generally used to express the local characteristics of the image.
  • Local features include any one or a combination of the following: scale-invariant feature transform (SIFT) features, speeded-up robust features (SURF), affine SIFT (ASIFT) features, principal component analysis SIFT (PCA-SIFT) features, histogram of oriented gradients (HOG) features, oriented FAST and rotated BRIEF (ORB) features (which build on features from accelerated segment test, FAST), binary robust independent elementary features (BRIEF), binary robust invariant scalable keypoints (BRISK) features, and so on.
  • Local features are generally extracted with a pixel in the image as the center. Such pixels are also called points of interest, key points, or feature points.
  • The descriptors extracted from the local range around a feature point generally have local scale invariance: if the content in the local range of a feature point is subjected to processing such as zooming, rotation, added noise, occlusion, or changes in illumination, the descriptor extracted from the local range after processing is the same as or similar to the descriptor extracted from the local range before processing. Therefore, if a large number of local features match between two images, that is, a large number of feature points match, the two images can generally be considered highly similar.
  • For each kind of local feature, multiple local features can generally be extracted from an image.
  • Similarity refers to the degree of similarity between two hash values, features, or vectors.
  • The similarity may be the distance between two hash values, features, or vectors, or a value obtained by normalizing that distance.
  • The distance here can be a Euclidean distance, cosine distance, edit distance, Hamming distance, and so on. The smaller the distance between two hash values, features, or vectors, the greater their similarity.
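  • The following small sketch illustrates these distance and similarity notions on toy inputs; the normalization scheme shown is just one possible choice.

```python
import numpy as np

def euclidean_distance(a, b):
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))

def cosine_distance(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def hamming_distance(h1, h2):
    return int(np.sum(np.asarray(h1) != np.asarray(h2)))

def to_similarity(distance, max_distance):
    """Normalize a distance into [0, 1]; a smaller distance gives a larger similarity."""
    return 1.0 - distance / max_distance

# Two 8-bit hashes that differ in 2 positions have similarity 1 - 2/8 = 0.75.
h1 = [1, 0, 1, 1, 0, 0, 1, 0]
h2 = [1, 0, 1, 0, 0, 1, 1, 0]
print(to_similarity(hamming_distance(h1, h2), len(h1)))  # 0.75
```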
  • FIG. 1 shows a system architecture diagram provided by the present application.
  • The system is divided into a preprocessing part 100 and an online processing part 200.
  • The preprocessing part 100 is used to extract the corresponding hash values, global features, local features, and position information for the matching images in the matching image library 105.
  • The online processing part 200 is used to retrieve, from the matching image library 105, images similar to the query image input by the user and to provide the retrieval result to the user.
  • The matching images in the matching image library 105 come from the network or from an image library prepared in advance. Since the number of matching images stored in the matching image library 105 may be huge, the preprocessing part 100 generally needs to finish its work before the online processing part 200 operates.
  • The preprocessing part 100 may periodically extract the corresponding hash values, global features, local features, and position information for matching images newly added to the matching image library, so as to keep the retrieval results for query images up to date.
  • The online processing part 200 and the preprocessing part 100 can also run in parallel to further improve the working efficiency of the system.
  • The matching image library 105 stores matching images, and each matching image is input to the local feature extraction module 101 and the global feature extraction module 102, respectively.
  • The local feature extraction module 101 extracts the local features and position information of the matching image and stores them in the feature library 104.
  • The global feature extraction module 102 extracts the global features of the matching image and stores them in the feature library 104.
  • The global feature extraction module 102 also inputs the extracted global features into the hash module 103; the hash module 103 hashes the input global features to obtain the corresponding hash value and stores the obtained hash value in the feature library 104.
  • The query image input by the user through the query interface/interface 207 is input to the local feature extraction module 201 and the global feature extraction module 202, respectively.
  • The global feature extraction module 202 extracts the global features of the query image and inputs them into the hash module 203 and the second matching module 205.
  • The hash module 203 generates a hash value from the global features input by the global feature extraction module 202 and inputs the generated hash value into the first matching module 204.
  • The first matching module 204 matches the hash values of the matching images 106 in the feature library 104 against the hash value generated by the hash module 203 and, according to the matching result, selects the first candidate set from the matching image library 105.
  • The similarity between the hash value of each matching image in the first candidate set and the hash value of the query image is higher than a threshold.
  • The matching images in the first candidate set are input to the second matching module 205. The second matching module 205 matches the global features of the matching images in the first candidate set against the global features generated by the global feature extraction module 202 and, according to the matching result, selects some or all of the matching images in the first candidate set as the second candidate set.
  • The similarity between the global features of each matching image in the second candidate set and the global features of the query image is higher than a threshold.
  • The images in the second candidate set are input to the third matching module 206.
  • The local feature extraction module 201 extracts the local features and position information of the query image and inputs them into the third matching module 206.
  • The third matching module 206 matches the local features and position information of the images in the second candidate set against the local features and position information generated by the local feature extraction module 201 and, according to the matching result, selects some or all of the images in the second candidate set to be returned to the user through the query interface/interface 207 as the retrieval result.
  • The local feature extraction module 201 of the online processing part is the same as the local feature extraction module 101 of the preprocessing part.
  • The global feature extraction module 202 of the online processing part is the same as the global feature extraction module 102 of the preprocessing part.
  • The hash module 203 of the online processing part is the same as the hash module 103 of the preprocessing part.
  • The system shown in FIG. 1 can run in at least the following two operating environments. As shown in FIG. 2, both the preprocessing part 100 and the online processing part 200 are deployed on the cloud.
  • In this environment, the system provides the image retrieval function to the user as a cloud service.
  • The user transmits the query image to the cloud through the query interface/interface 207 (an application programming interface or a browser). After a matching image that is the same as or similar to the query image is found on the cloud, the retrieval result is returned to the user.
  • In the other environment, the preprocessing part 100 is deployed on the cloud, and the cloud's abundant computing and storage resources are used to extract features from the huge matching image library 105.
  • The online processing part 200 is deployed on the user equipment, which completes the feature extraction and matching functions for the query image. Since the online processing part 200 requires fewer computing and storage resources, placing it on the user equipment in the operating environment shown in FIG. 3 makes more rational use of the resources of the cloud and the user equipment. In the operating environment shown in FIG. 3, the online processing part 200 may be provided to the user by a cloud operator in the form of software modules and installed on the user device.
  • Alternatively, the preprocessing part 100 is deployed on the cloud, and the first matching module 204, the second matching module 205, and the third matching module 206 are also deployed on the cloud.
  • The remaining modules of the online processing part 200 are deployed on the user equipment.
  • Deploying the first matching module 204, the second matching module 205, the third matching module 206, and the feature library 104 on the cloud avoids the network resource consumption caused by transmitting hash values, global features, and local features between the feature library 104 and the first matching module 204, the second matching module 205, and the third matching module 206.
  • Placing the remaining modules of the online processing part 200 on the user equipment makes further rational use of the resources of the cloud and the user equipment.
  • The local feature extraction module 201, the global feature extraction module 202, the hash module 203, and the query interface/interface 207 can be provided to users by a cloud operator as software modules and installed on the user devices.
  • FIG. 5 provides a flowchart of the preprocessing method performed by the preprocessing part 100.
  • The local feature extraction module 101 obtains a matching image and the identification (ID) of the matching image from the matching image library 105.
  • Each matching image stored in the matching image library 105 has a unique ID.
  • The local feature extraction module 101 extracts local features from the matching image and sends the extracted local features, their corresponding position information, and the matching image ID to the feature library 104.
  • The local feature extraction module 101 may include at least one sub-module, each used to extract one kind of local feature; for example, a SIFT sub-module extracts SIFT local features from the matching image, and a SURF sub-module extracts SURF local features from the matching image.
  • The various extracted local features are sent to the feature library 104 together with the matching image ID.
  • The position information of the feature point of each local feature, such as its coordinates or region information in the matching image, is also sent to the feature library 104 as the position information of that local feature.
  • The matching image library 105 sends the matching image and the ID of the matching image to the global feature extraction module 102.
  • The global feature extraction module 102 extracts global features from the matching image and sends the extracted global features and the matching image ID to the hash module 103 and the feature library 104.
  • The global feature extraction module 102 may include at least one sub-module, each used to extract one kind of global feature; for example, an RGB sub-module extracts RGB global features from the matching image, and an LBP sub-module extracts LBP global features from the matching image.
  • The various extracted global features are sent to the feature library 104 together with the matching image ID.
  • S304 and S306 may be executed in parallel or in any order.
  • The hash module 103 hashes the global features, generates a hash value, and sends the generated hash value and the matching image ID to the feature library 104.
  • S301-S302 and S303-S306 can be executed in parallel or in any order.
  • After S301-S306 have been performed for each matching image in the matching image library 105, the feature library 104 stores a series of information for each matching image, including the matching image's ID, hash value, global features, local features, and position information, as shown in Table 1:
| Image ID | Hash value | Global features | Local features | Position information |
| --- | --- | --- | --- | --- |
| ID 1 | Hash value 1 | Global feature 1 | Local feature 1-1 | Position information 1-1 |
| | | Global feature 2 | Local feature 1-2 | Position information 1-2 |
| | | Global feature 3 | ... | ... |
| | | ... | Local feature 2-1 | Position information 2-1 |
| | | Global feature M | ... | ... |
| | | | Local feature 3-1 | Position information 3-1 |
| | | | ... | ... |
| | | | Local feature N-1 | Position information N-1 |
| ID 2 | ... | ... | ... | ... |
  • Table 1 records global feature 1 to global feature M extracted from matching image ID 1.
  • Each global feature here refers to one kind of global feature.
  • For example, global feature 1 may be an RGB global feature, and global feature 2 may be an LBP global feature.
  • Table 1 also records hash value 1, which is generated by hashing global feature 1 to global feature M.
  • Hash value 1 may be composed of one or more hash values: global feature 1 to global feature M may be combined and hashed into a single hash value, or they may be divided into multiple segments that are hashed separately, with the resulting hash values together forming hash value 1.
  • For example, if global feature i has m_i dimensions, hash value 1 of matching image ID 1 may be composed of Σm_i hash values, that is, each dimension of each global feature is hashed into one hash value. It is also possible to combine the M global features arbitrarily and hash the combined result, in which case hash value 1 is composed of fewer than Σm_i hash values.
  • The hash algorithm adopted by the hash module 103 may work as follows: map a positive number to 1 and a non-positive number to 0.
  • When this hash algorithm is used and each dimension of each global feature is hashed into one hash value, the Σm_i-dimensional global features of matching image ID 1 are mapped to a Σm_i-bit binary sequence, which greatly reduces the complexity of the global features and speeds up the subsequent matching used to produce the first candidate set.
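  • A minimal sketch of this sign-based hash, assuming the global feature is a real-valued vector:

```python
import numpy as np

def sign_hash(global_feature):
    """Map each dimension to 1 if it is positive and to 0 otherwise,
    turning an m-dimensional feature into an m-bit binary sequence."""
    return (np.asarray(global_feature, dtype=float) > 0).astype(np.uint8)

# A 6-dimensional global feature becomes a 6-bit binary sequence.
print(sign_hash([0.3, -1.2, 0.0, 2.5, -0.1, 0.7]))  # [1 0 0 1 0 1]
```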
  • Table 1 also records local feature 1-1 to local feature N-1 extracted from matching image ID 1, together with the position information of each local feature. Multiple feature points can generally be extracted from matching image ID 1 for each kind of local feature; for example, local feature 1-1 and local feature 1-2 are SIFT local features, and local feature 2-1 is a SURF local feature. In total, M kinds of global features and N kinds of local features are recorded in Table 1, where M and N are both positive integers.
  • FIG. 6 provides a schematic diagram of the position information of each local feature recorded in the feature library 104.
  • The image in FIG. 6 includes 10 feature points; each feature point corresponds to a local feature, and different feature points may correspond to the same or different kinds of local features.
  • The position information of the local feature corresponding to each feature point includes the coordinate information of the feature point in the image or the region information of the feature point in the image.
  • Each image is divided into H regions, where H is a positive integer greater than 1.
  • Both the query image and the matching image can be divided into n rows and m columns, where n*m equals H and m and n are positive integers; for example, the image in FIG. 6 is divided into 3*3 = 9 regions. Therefore, the position information corresponding to each feature point may be the ID of the region in which the feature point is located.
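  • A hedged sketch of how a feature point's coordinates could be mapped to such a region ID for an image divided into a grid of regions; the grid size and the 0-based numbering are illustrative (the text numbers regions from 1).

```python
def region_id(x, y, width, height, rows=3, cols=3):
    """Map a feature point's pixel coordinates to the grid cell (region)
    it falls in, for an image divided into rows x cols regions."""
    col = min(int(x * cols / width), cols - 1)
    row = min(int(y * rows / height), rows - 1)
    return row * cols + col  # 0-based region index, 0 .. rows*cols - 1

# In a 300 x 300 image split into 3 x 3 regions, the point (250, 40)
# lands in the top-right cell (index 2).
print(region_id(250, 40, width=300, height=300))  # 2
```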
  • In this case, Table 1 takes the form shown in Table 2:
| Image ID | Hash value | Global features | Local features | Position information |
| --- | --- | --- | --- | --- |
| ID 1 | Hash value 1 | Global feature 1 | Local feature 1-1 | Region 1 |
| | | Global feature 2 | Local feature 1-2 | Region 6 |
| | | Global feature 3 | ... | ... |
| | | ... | Local feature 2-1 | Region 7 |
| | | Global feature M | ... | ... |
| | | | Local feature 3-1 | Region 2 |
| | | | ... | ... |
| | | | Local feature N-1 | Region 3 |
| ID 2 | ... | ... | ... | ... |
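  • Purely as an illustration, one record of Table 2 could be represented in memory roughly as sketched below; the field names and structure are hypothetical, not the patent's storage schema.

```python
# Hypothetical in-memory form of one feature-library record (cf. Table 2).
feature_record = {
    "image_id": "ID 1",
    "hash": [1, 0, 1, 1, 0, 1],                          # hash value 1 (binary sequence)
    "global": {"feature_1": [...], "feature_2": [...]},  # global feature 1 .. global feature M
    "local": [
        {"kind": "SIFT", "descriptor": [...], "region": 1},  # local feature 1-1
        {"kind": "SIFT", "descriptor": [...], "region": 6},  # local feature 1-2
        {"kind": "SURF", "descriptor": [...], "region": 7},  # local feature 2-1
    ],
}
```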
  • The preprocessing part 100 may further include a matching image acquisition module for acquiring new matching images from other image libraries or from the network and storing them in the matching image library 105. Whenever a new matching image is stored in the matching image library 105, the modules of the preprocessing part 100 execute the procedure shown in FIG. 5 and store the hash value, the various features, and the position information of the new matching image in the feature library 104, so that the new matching image can be matched against query images.
  • FIG. 8 provides a flowchart of the online processing method performed by the online processing part 200.
  • The query interface/interface 207 acquires a query image input by the user and sends the query image to the local feature extraction module 201.
  • The local feature extraction module 201 extracts the local features of the query image and the position information of each local feature, and sends them to the third matching module 206.
  • S402 refers to S302.
  • The query interface/interface 207 sends the query image to the global feature extraction module 202.
  • The global feature extraction module 202 extracts the global features of the query image and sends them to the hash module 203 and the second matching module 205.
  • S404/S406 refer to S304/S306.
  • The hash module 203 hashes the global features, generates a hash value, and sends the hash value to the first matching module 204.
  • S405 refers to S305 and the related description of the hash operation above.
  • The first matching module 204 obtains the hash values of the matching images and the IDs of the matching images from the feature library 104. It then matches the hash value of the query image against the hash values of the matching images and, according to the matching result, sends the IDs of the matched matching images to the second matching module 205 as the first candidate set.
  • The first matching module 204 can match the hash value of the query image against the hash values of the matching images by calculating the similarity between the hash value of the query image and the hash value of each matching image.
  • The first matching module 204 uses as the first candidate set the IDs of the matching images whose hash-value similarity to the hash value of the query image is greater than the first threshold. Alternatively, the first matching module 204 uses as the first candidate set the IDs of the matching images corresponding to the Q hash values in the matching image library 105 with the highest similarity to the hash value of the query image, where Q is a positive integer.
  • In this way, the matching images in the matching image library 105 are preliminarily screened, and a large number of images with low similarity to the query image are eliminated, which improves the subsequent matching speed and matching accuracy for the query image.
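  • A small illustrative sketch of the two selection rules just described (keeping every matching image whose hash similarity exceeds the first threshold, or keeping the Q most similar matching images); the function and parameter names are placeholders.

```python
import numpy as np

def select_first_candidates(q_hash, db_hashes, db_ids,
                            first_threshold=None, top_q=None):
    """Return candidate image IDs either by thresholding the hash similarity
    or by keeping the Q matching images with the highest similarity."""
    q = np.asarray(q_hash)
    sims = np.array([float(np.mean(q == np.asarray(h))) for h in db_hashes])
    if first_threshold is not None:
        keep = np.flatnonzero(sims > first_threshold)
    else:
        keep = np.argsort(-sims)[:top_q]
    return [db_ids[i] for i in keep]
```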
  • The second matching module 205 obtains the global features corresponding to each matching image ID in the first candidate set from the feature library 104.
  • The second matching module 205 matches the global features of the query image against the obtained global features of the matching images and, according to the matching result, sends the IDs of the matched matching images to the third matching module 206 as the second candidate set.
  • The second matching module 205 matches the global features of the query image against the global features of the matching images in the first candidate set by calculating the similarity between them; that is, the matching result includes the similarity between the global features of the query image and the global features of each matching image.
  • The second matching module 205 uses as the second candidate set the IDs of the matching images whose global-feature similarity to the global features of the query image is greater than the second threshold.
  • Alternatively, the second matching module 205 uses as the second candidate set the IDs of the matching images corresponding to the W groups of global features in the first candidate set with the highest similarity to the global features of the query image, where W is a positive integer.
  • The second matching module 205 thus further screens the matching images in the first candidate set, filtering out images with low similarity to the query image, which reduces the amount of computation in the subsequent matching and further improves the matching speed and matching accuracy for the query image.
  • The third matching module 206 obtains the local features and position information corresponding to each matching image ID in the second candidate set from the feature library 104.
  • The third matching module 206 matches the local features and corresponding position information of the query image against the obtained local features and corresponding position information of the matching images, determines the IDs of the matched matching images based on the matching result, and obtains the corresponding matching images from the matching image library 105 according to these IDs as the retrieval result.
  • S412 includes two parts: the first part is the matching of the local features, and the second part is the matching of the position information of the matched local features.
  • The first-part and second-part matching is performed between the query image and each matching image in the second candidate set, to determine which of the matching images corresponding to the IDs in the second candidate set can be used as the retrieval result.
  • In the first part, the third matching module 206 calculates the similarity between each local feature of the query image and each local feature of matching image x in the second candidate set, where matching image x is any matching image whose ID is in the second candidate set. Suppose that, among the local features of matching image x, the local feature most similar to local feature 1 of the query image is local feature 1-1 and the second most similar is local feature 1-2, and that the similarity between local feature 1 and local feature 1-1 is d1 while the similarity between local feature 1 and local feature 1-2 is d2. If the ratio of d1 to d2 is less than the third threshold, local feature 1 is successfully matched to local feature 1-1.
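  • One common way to realize such a ratio test is with OpenCV's SIFT descriptors and a brute-force matcher, sketched below on the assumption that an OpenCV build with SIFT support is installed; the 0.75 ratio is an illustrative value, not the patent's third threshold.

```python
import cv2

def matched_local_features(query_img, match_img, ratio=0.75):
    """Return the SIFT matches between two images that pass the d1/d2 ratio test."""
    sift = cv2.SIFT_create()
    kp_q, des_q = sift.detectAndCompute(query_img, None)
    kp_m, des_m = sift.detectAndCompute(match_img, None)

    # For every query descriptor, find its two nearest neighbours in the other image.
    matcher = cv2.BFMatcher()
    pairs = matcher.knnMatch(des_q, des_m, k=2)

    # Keep a match only when the best distance d1 is clearly smaller than the
    # second-best distance d2, i.e. d1 / d2 < ratio.
    good = [m for m, n in pairs if m.distance < ratio * n.distance]
    return good, kp_q, kp_m
```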
  • In this way, the third matching module 206 obtains the set of matched local features between the query image and each matching image.
  • For example, suppose 10 local features of the query image are matched to matching image x and are denoted v1 to v10, and the 10 corresponding local features of matching image x are denoted V1 to V10, where v1 to v10 and V1 to V10 are matched one to one.
  • In the second part, based on the position information of v1 to v10 in the query image and the position information of V1 to V10 in matching image x, the regional distribution of v1 to v10 over the query image and the regional distribution of V1 to V10 over matching image x are determined. If the position information acquired in S402 and S411 consists of coordinates, it is necessary to determine from the coordinates which region of the query image each of v1 to v10 belongs to and which region of matching image x each of V1 to V10 belongs to.
  • For example, with the 3*3 division of FIG. 6, the region distribution vector of v1 to v10 over the query image may be f_i = [2, 1, 1, 1, 0, 1, 1, 3, 0], and the region distribution vector of V1 to V10 over matching image x may be F_i = [2, 2, 0, 1, 0, 1, 1, 2, 1], where each element is the number of matched local features falling in the corresponding region.
  • The similarity between f_i and F_i is then calculated as the matching result. If the similarity between f_i and F_i is greater than the fourth threshold, matching image x can be used as part of the retrieval result.
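  • A hedged sketch of this second-part check using the two vectors quoted above; cosine similarity is only one possible choice, and the fourth threshold would be set by the implementer.

```python
import numpy as np

def region_distribution(region_ids, num_regions=9):
    """Count how many matched feature points fall into each region (regions numbered 1..num_regions)."""
    vec = np.zeros(num_regions)
    for r in region_ids:
        vec[r - 1] += 1
    return vec

def region_similarity(f, F):
    f, F = np.asarray(f, dtype=float), np.asarray(F, dtype=float)
    return float(f @ F / (np.linalg.norm(f) * np.linalg.norm(F) + 1e-12))

f_i = [2, 1, 1, 1, 0, 1, 1, 3, 0]   # distribution of v1..v10 over the query image
F_i = [2, 2, 0, 1, 0, 1, 1, 2, 1]   # distribution of V1..V10 over matching image x
print(region_similarity(f_i, F_i))  # ~0.88; compared against the fourth threshold
```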
  • The third matching module 206 obtains the corresponding matching image from the matching image library 105 according to the ID of matching image x and uses it as a retrieval result.
  • In this way, the retrieval results are further selected from the second candidate set, and their accuracy is further improved.
  • Using the position information of local features ensures not only that the query image and the images in the retrieval result share many matched local features, but also that the positions of the feature points corresponding to the matched local features are similarly distributed in the two images, which guarantees the similarity between the query image and the retrieved matching images.
  • The retrieval result may be selected from the second candidate set using both the first part and the second part. It is also possible to select the retrieval result from the second candidate set without the second part, that is, without using the position information of the local features. In that case, none of the aforementioned local feature extraction modules 101/201, the third matching module 206, or the feature library 104 in FIG. 5 and FIG. 8 needs to generate, transmit, or store the position information of the local features.
  • Without the second part, after the number of matched local features between the query image and matching image x is obtained in the first part, it is determined whether this number is greater than the fifth threshold, or whether the ratio of this number to the total number of local features in the query image is greater than the sixth threshold. If either condition holds, matching image x may be used as a retrieval result.
  • The third matching module 206 returns the retrieval result to the user through the query interface/interface 207.
  • S401 to S402 and S403 to S406 can be executed in parallel.
  • S404 to S405 and S406 can be executed in parallel.
  • S407, S409, and S411 can be executed in parallel or combined into one step, that is, the hash values, matching image IDs, global features, local features, and position information are all sent to the first matching module 204 in S407; in S408, in addition to sending the first candidate set, the first matching module 204 also sends the IDs, global features, local features, and position information of the matching images to the second matching module 205.
  • Likewise, in addition to sending the second candidate set to the third matching module 206, the second matching module 205 also sends the IDs, local features, and position information of the matching images to the third matching module 206.
  • The present application also provides a preprocessing apparatus 500 for performing the preprocessing part 100.
  • The preprocessing apparatus 500 includes the local feature extraction module 101, the global feature extraction module 102, the hash module 103, the feature library 104, and the matching image library 105.
  • The present application also provides an image retrieval apparatus 600 for performing the online processing part 200.
  • The image retrieval apparatus 600 includes the local feature extraction module 201, the global feature extraction module 202, the hash module 203, the first matching module 204, the second matching module 205, and the third matching module 206.
  • Each module in the preprocessing apparatus 500 and the image retrieval apparatus 600 may be a software module.
  • The feature library 104 may be a database or a storage space provided by a cloud storage service.
  • The present application also provides an image retrieval system 700.
  • The image retrieval system 700 includes a preprocessing cluster and an online processing cluster.
  • The preprocessing cluster is used to execute the preprocessing part 100, and the online processing cluster is used to execute the online processing part 200.
  • Each cluster includes at least one computing device 705.
  • The computing device 705 includes a bus 703, a processor 701, a communication interface 702, and a memory 704.
  • The processor 701, the memory 704, and the communication interface 702 communicate via the bus 703.
  • The processor 701 may be a central processing unit (CPU).
  • The memory 704 may include volatile memory, for example, random access memory (RAM).
  • The memory 704 may also include non-volatile memory, such as read-only memory (ROM), flash memory, an HDD, or an SSD.
  • Executable program code is stored in the memory 704, and the processor 701 executes this code to perform the methods shown in FIG. 5 and FIG. 8.
  • The memory 704 may further include software modules required by other running processes, such as an operating system.
  • The operating system can be LINUX™, UNIX™, WINDOWS™, and so on.
  • The memory 704 of the computing device 705 stores the code required to run each module of the preprocessing apparatus 500 and the image retrieval apparatus 600, and the processor 701 executes this code to realize the functions of the preprocessing apparatus 500 and the image retrieval apparatus 600, that is, to perform the methods shown in FIG. 5 and FIG. 8.
  • The computing devices 705 in the preprocessing cluster may be computing devices in a cloud environment.
  • The computing devices 705 in the online processing cluster may be computing devices in a cloud environment, an edge environment, or a user environment.
  • The feature library 104 and the matching image library 105 require relatively large storage resources and may also be stored in a storage cluster within the preprocessing cluster.
  • The online processing cluster communicates with the user's query interface/interface 207 to obtain the query image input by the user and to return the retrieval result to the user.
  • The computer program product includes one or more computer instructions.
  • The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line) or wireless means (such as infrared, radio, or microwave).
  • The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media.
  • The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVD), or semiconductor media (e.g., SSD), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Disclosed in the present application is an image search method. First, first-stage matching is performed between the hash value of a global feature of a query image input by a user and the hash values of the matching images in a matching image library, to obtain a first candidate set as a rough matching result. Then, second-stage matching is performed between the global features of the matching images in the first candidate set and the global features of the query image, to obtain a second candidate set as a fine matching result. Finally, third-stage matching is performed between the local features of the matching images in the second candidate set and the local features of the query image, to obtain the final search result. The method thus obtains matching images similar to the query image through a three-stage search: matching is fast in the coarse-grained stage, where most matching images are filtered out, so the method avoids repeatedly matching multiple features of every matching image and improves the speed of image search.

Description

Image retrieval method, apparatus, and device

Technical Field

The present application relates to the field of computer technology, and in particular to an image retrieval method and corresponding apparatus and devices.

Background

Image retrieval is a technique for retrieving matching images that are similar or identical to a query image input by a user. The technology has a wide range of applications, such as search engines and infringing-image recognition. Taking infringing-image identification as an example: in today's copyright market, a large number of copyrighted images are used at will without authorization, so technical means are needed to retrieve infringing images on the network and help image agencies and photographers combat infringement and piracy. The volume of image data on today's networks is enormous, which places higher demands on the efficiency of image retrieval.

Summary of the Invention

The present application provides an image retrieval method that improves the efficiency of image retrieval.
In a first aspect, an image retrieval method is provided. The method is executed by at least one computing device and is used to retrieve matching images that are the same as or similar to a query image input by a user. The method includes the following three-stage matching process:

First-stage matching: obtain the hash value of the global feature of the query image and match it against the hash values of the global features of the matching images in the matching image library; based on the matching result, select a first candidate set from the matching images in the matching image library.

Second-stage matching: obtain the global features of the query image and match them against the global features of the matching images in the first candidate set; based on the matching result, select a second candidate set from the matching images in the first candidate set.

Third-stage matching: obtain the local features of the query image and match them against the local features of the matching images in the second candidate set; based on the matching result, determine the retrieval result from the matching images in the second candidate set. The hash values of the global features, the global features, and the local features of the matching images in the matching image library are extracted in advance and stored in a feature library before the method is executed.

The first-stage matching quickly filters out a large number of matching images through hash-value matching, which improves retrieval efficiency. The second-stage and third-stage matching then perform finer matching on the remaining images, ensuring retrieval accuracy.
In a possible implementation, the method further includes: obtaining position information of the local features of the query image, where the position information of the local features of the query image includes the coordinate information of the local features of the query image or the region information of the local features of the query image within the query image.

When the position information of the local features of the query image includes coordinate information, the region information of the local features of the query image is further obtained from that coordinate information.

In this implementation, obtaining the local features of the query image, matching them against the local features of the matching images in the second candidate set, and determining the retrieval result from the matching images of the second candidate set includes: matching the region information of the local features of the query image against the region information of the local features of the matching images in the second candidate set, and determining the retrieval result from the matching images of the second candidate set according to this matching result, where the similarity between the region information of the local features of a matching image in the retrieval result and the region information of the local features of the query image is greater than a first threshold.

Using the position information of local features to filter the matching images used as the retrieval result reduces erroneous retrieval results caused by mismatched local features and further improves retrieval accuracy.
In another possible implementation, the method further includes: obtaining position information of the local features of the query image, where the position information includes the region information of the local features of the query image within the query image.

When the position information of the local features of the query image includes coordinate information, the region information of the local features of the query image is further obtained from that coordinate information.

In this implementation, determining the retrieval result from the matching images of the second candidate set includes: matching the local features of the query image against the local features of the matching images in the second candidate set; based on the region information of the matched local features of the query image, obtaining a first region distribution vector of those local features over the query image; based on the region information of the matched local features of a matching image in the second candidate set, obtaining a second region distribution vector of those local features over that matching image; and determining that the matching image in the second candidate set is included in the retrieval result if the similarity between the first region distribution vector and the second region distribution vector is greater than a second threshold.

Projecting the matched local features back onto the query image and the matching image to obtain their region information, and comparing the region information of the matched local features in the query image with that in the matching image, further reduces the negative impact of mismatched local features on the retrieval result and improves retrieval accuracy.
In a second aspect, an image retrieval apparatus is provided, including a first matching module, a second matching module, and a third matching module. The three modules can be implemented as software running on a computing device, and they may run on different computing devices.

In a third aspect, a computing device system is provided, including at least one computing device, each including a processor and a memory. The processor of the at least one computing device executes program code in the memory to perform the method provided in the first aspect or any possible implementation of the first aspect.

In a fourth aspect, a non-transitory readable storage medium is provided. When the program stored on the non-transitory readable storage medium is executed by at least one computing device, the at least one computing device performs the method provided in the first aspect or any possible implementation of the first aspect. The types of storage media include, but are not limited to, volatile memory such as random access memory, and non-volatile memory such as flash memory, hard disk drives (HDD), and solid-state drives (SSD).

In a fifth aspect, a computing device program product is provided. When the computing device program product is executed by at least one computing device, the at least one computing device performs the method provided in the first aspect or any possible implementation of the first aspect. The computer program product may be a software installation package; when the method provided in the first aspect or any possible implementation of the first aspect needs to be used, the computer program product can be downloaded to and executed on a computing device.
第六方面,还提供了一种图像检索方法,该方法在匹配查询图像的局部特征和匹配图像的局部特征的过程中,将查询图像的局部特征和匹配图像的局部特征所在的区域信息用区域分布向量表示,通过比较区域分布向量之间的相似度来判断查询图像和匹配图像是否匹配。该方法降低了匹配错误的局部特征对检索结果的负面影响,提升了检索精度。该方法包括:In the sixth aspect, an image retrieval method is also provided. In the process of matching the local features of the query image and the local features of the matching image, the method uses the region information where the local features of the query image and the local features of the matching image are located. The distribution vector indicates that whether the query image matches the matching image is determined by comparing the similarity between the regional distribution vectors. This method reduces the negative impact of matching mismatched local features on the retrieval results and improves the retrieval accuracy. The method includes:
acquiring position information of at least one local feature of the query image and position information of at least one local feature of the matching image, where the position information of each local feature of the query image includes region information of the local feature of the query image within the query image, and the position information of each local feature of the matching image includes region information of the local feature of the matching image within the matching image;

in a case where the position information of the local features of the query image includes coordinate information of the local features of the query image, further obtaining the region information of the local features of the query image according to the coordinate information of the local features of the query image;

acquiring, according to the region information of the at least one local feature of the query image, a first region distribution vector of the at least one local feature of the query image within the query image, where the first region distribution vector includes the number of local features included in each region of the query image;

acquiring, according to the region information of the at least one local feature of the matching image, a second region distribution vector of the at least one local feature of the matching image within the matching image, where the second region distribution vector includes the number of local features included in each region of the matching image;

determining that the similarity between the first region distribution vector and the second region distribution vector is greater than a threshold; and

determining that the query image matches the matching image.

In a seventh aspect, a computing device system is further provided, including at least one computing device, each computing device including a processor and a memory. The processor of the at least one computing device is configured to execute program code in the memory to perform the method provided in the sixth aspect.

In an eighth aspect, a non-transitory readable storage medium is further provided. When the non-transitory readable storage medium is executed by at least one computing device, the at least one computing device performs the method provided in the sixth aspect.

In a ninth aspect, a computing device program product is further provided. When the computing device program product is executed by at least one computing device, the at least one computing device performs the method provided in the sixth aspect.
Brief Description of the Drawings

To explain the technical methods in the embodiments of the present application more clearly, the drawings required in the embodiments are briefly introduced below.
FIG. 1 is a schematic diagram of the system architecture provided by this application;
FIG. 2 is a schematic diagram of a system operating environment provided by this application;
FIG. 3 is a schematic diagram of another system operating environment provided by this application;
FIG. 4 is a schematic diagram of yet another system operating environment provided by this application;
FIG. 5 is a flowchart of the preprocessing method provided by this application;
FIG. 6 is a schematic diagram of the position information of feature points provided by this application;
FIG. 7 is a schematic diagram of the region division for feature points provided by this application;
FIG. 8 is a flowchart of the online processing method provided by this application;
FIG. 9 is a schematic diagram of position information provided by this application;
FIG. 10 is a schematic structural diagram of the preprocessing apparatus provided by this application;
FIG. 11 is a schematic structural diagram of the image retrieval apparatus provided by this application;
FIG. 12 is a schematic structural diagram of the image retrieval system provided by this application.
Detailed Description

The technical methods in the embodiments of the present application are described below with reference to the drawings in the embodiments of the present application. In this application, there is no logical or temporal dependency among the terms "first", "second", and "nth".
Global feature: a feature extracted with the whole image as input, generally used to express the overall characteristics of the image. Global features include any one or a combination of the following: pre-trained deep learning features, GIST features, local binary pattern (LBP) features, or color distribution features. A pre-trained deep learning feature is an image feature computed by a pre-trained convolutional neural network model, where a pre-trained model is a model obtained by training on the images of a data set. As an example of a data set, the ImageNet project is a large visual database for visual object recognition research, containing 1000 categories and tens of millions of images. After an image is input to the pre-trained convolutional neural network model, the features in any convolutional layer of the model can be used as global features of the image. A color distribution feature represents an image feature by a color statistical distribution vector, such as a color histogram, color moments, or a color coherence vector; a color histogram describes the distribution of the image's color space over different color intervals, for example a red-green-blue (RGB) color space; color moments include the first-order moment, second-order moment, third-order moment, and so on. The GIST feature is an abstract scene representation that naturally evokes the concepts of different scene categories, such as cities or mountains. The GIST feature was first published in Friedman A. Framing pictures: The role of knowledge in automatized encoding and memory for gist[J]. Journal of Experimental Psychology: General, 1979, 108(3):316.
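As an illustrative sketch only, the snippet below shows one common way to obtain two of the global features mentioned above: a pre-trained deep learning feature taken from the last pooling layer of a convolutional network, and an RGB color-histogram feature. The choice of ResNet-50, the preprocessing constants, and the function names are assumptions made for illustration; the application does not prescribe a particular model or library.

```python
# Sketch: two global features of an image (assumes PyTorch/torchvision and
# OpenCV are available; ResNet-50 merely stands in for "a pre-trained
# convolutional neural network model").
import cv2
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

backbone = torch.nn.Sequential(
    *list(models.resnet50(pretrained=True).children())[:-1]).eval()
preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

def cnn_global_feature(path):
    """2048-dimensional feature taken from the last pooling layer."""
    img = Image.open(path).convert("RGB")
    with torch.no_grad():
        return backbone(preprocess(img).unsqueeze(0)).flatten().numpy()

def rgb_histogram_feature(path, bins=8):
    """Color-distribution global feature: an 8x8x8 RGB histogram."""
    img = cv2.imread(path)
    hist = cv2.calcHist([img], [0, 1, 2], None, [bins] * 3, [0, 256] * 3)
    return cv2.normalize(hist, hist).flatten()
```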
Local feature: a feature extracted with a partial region of the image as input, generally used to express the local characteristics of the image. Local features include any one or a combination of the following: scale-invariant feature transform (SIFT) features, speeded up robust features (SURF), affine SIFT features (also called ASIFT features), principal component analysis-scale invariant feature transform (PCA-SIFT) features, histogram of oriented gradients (HOG) features, or oriented FAST and rotated BRIEF (ORB) features, where FAST refers to features from accelerated segment test and BRIEF refers to binary robust independent elementary features, as well as binary robust invariant scalable keypoints (BRISK), and so on. A local feature is generally extracted around a certain pixel of the image; such pixels are also called interest points, key points, or feature points. The descriptors extracted within the local neighborhood of a feature point generally have local scale invariance, that is, if the content within the local neighborhood of the feature point undergoes various kinds of processing, such as scaling, rotation, added noise, occlusion, or illumination change, the descriptor extracted from the processed local neighborhood is the same as or similar to the descriptor extracted from the local neighborhood before processing. Therefore, if a large number of local features of two images match, that is, a large number of feature points match, the two images can generally be considered highly similar. For each kind of local feature, multiple local features can generally be extracted from one image.
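As a non-authoritative sketch, the snippet below extracts SIFT local features together with the coordinates of their feature points; ORB is noted as a binary alternative. The specific OpenCV calls are assumptions for illustration, not part of the described method.

```python
# Sketch: local features and their feature-point coordinates (assumes OpenCV;
# SIFT is one of the local feature kinds listed above).
import cv2

def sift_local_features(path):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    # Each keypoint's (x, y) coordinates are the position information that is
    # later stored alongside its descriptor in the feature library.
    coords = [kp.pt for kp in keypoints]
    return descriptors, coords

# A binary alternative:
# orb = cv2.ORB_create(); keypoints, descriptors = orb.detectAndCompute(gray, None)
```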
Similarity: the degree of similarity between two hash values, features, or vectors. The similarity may be the distance between two hash values, features, or vectors, or a value obtained by normalizing that distance. The distance here may be a Euclidean distance, cosine distance, edit distance, Hamming distance, and so on. The smaller the distance between two hash values, features, or vectors, the greater the similarity.
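For illustration only, the helpers below show two of the measures named above, normalized so that a smaller distance corresponds to a larger similarity; the exact normalization is an assumption.

```python
# Sketch: similarity from Hamming distance (for binary hash values) and
# cosine similarity (for feature vectors); names are illustrative.
import numpy as np

def hamming_similarity(a, b):
    # a, b: equal-length 0/1 arrays; returns a value in [0, 1]
    return 1.0 - np.count_nonzero(np.asarray(a) != np.asarray(b)) / len(a)

def cosine_similarity(u, v):
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))
```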
FIG. 1 shows a system structure diagram provided by this application. The system is divided into a preprocessing part 100 and an online processing part 200. The preprocessing part 100 is used to extract, for the matching images in the matching image library 105, the corresponding hash values, global features, and local features together with their position information. The online processing part 200 is used to retrieve, in the matching image library 105, images similar to the query image input by the user, and to provide the retrieval results to the user. The matching images in the matching image library 105 come from the network or from a previously prepared image library. Since the number of matching images stored in the matching image library 105 may be huge, the preprocessing part 100 generally needs to be completed before the online processing part 200 starts working. The preprocessing part 100 may periodically extract the corresponding hash values, global features, and local features and their position information for matching images newly added to the matching image library, so as to keep the retrieval results of the query image up to date. The online processing part 200 and the preprocessing part 100 may also run in parallel to further improve the working efficiency of the system.
The matching image library 105 stores matching images, and each matching image is input into the local feature extraction module 101 and the global feature extraction module 102, respectively. The local feature extraction module 101 extracts the local features of the matching image and their position information, and stores the local features and their position information in the feature library 104. The global feature extraction module 102 extracts the global features of the matching image and stores the global features in the feature library 104. The global feature extraction module 102 also inputs the extracted global features into the hash module 103; the hash module 103 hashes the input global features, obtains the hash values corresponding to the global features, and stores the obtained hash values in the feature library 104.
The query image input by the user through the query interface/UI 207 is input into the local feature extraction module 201 and the global feature extraction module 202, respectively. The global feature extraction module 202 extracts the global features of the query image and inputs the global features into the hash module 203 and the second matching module 205. The hash module 203 generates a hash value according to the global features input by the global feature extraction module 202 and inputs the generated hash value into the first matching module 204. The first matching module 204 matches the hash values of the matching images 106 in the feature library 104 against the hash value generated by the hash module 203, and selects a first candidate set from the matching image library 105 according to the matching result. The similarity between the hash value of a matching image in the first candidate set and the hash value of the query image is higher than a threshold. The matching images in the first candidate set are input into the second matching module 205; the second matching module 205 matches the global features of the matching images in the first candidate set against the global features generated by the global feature extraction module 202, and according to the matching result selects some or all of the matching images in the first candidate set as a second candidate set. The similarity between the global features of a matching image in the second candidate set and the global features of the query image is higher than a threshold. The images in the second candidate set are input into the third matching module 206. The local feature extraction module 201 extracts the local features of the query image and their position information, and inputs the local features and their position information into the third matching module 206. The third matching module 206 matches the local features and position information of the images in the second candidate set against the local features and position information generated by the local feature extraction module 201, selects some or all of the matching images from the second candidate set as the retrieval result according to the matching result, and returns the retrieval result to the user through the query interface/UI 207.
Optionally, the local feature extraction module 201 of the online processing part is the same as the local feature extraction module 101 of the preprocessing part, the global feature extraction module 202 of the online processing part is the same as the global feature extraction module 102 of the preprocessing part, and the hash module 203 of the online processing part is the same as the hash module 103 of the preprocessing part.
The system shown in FIG. 1 includes at least the following two operating environments. As shown in FIG. 2, both the preprocessing part 100 and the online processing part 200 are deployed on the cloud. The system provides the image retrieval function to users as a cloud service; the user transmits the query image to the cloud through the query interface/UI 207 (an application programming interface or a browser). After a matching image that is the same as or similar to the query image is found on the cloud, the retrieval result is returned to the user.
As shown in FIG. 3, the preprocessing part 100 is deployed on the cloud, and the abundant computing and storage resources on the cloud are used to perform feature extraction on the huge matching image library 105. The online processing part 200 is deployed on the user equipment, and the user equipment completes the feature extraction and matching functions for the query image. Since the online processing part 200 requires relatively few computing and storage resources, in the operating environment shown in FIG. 3 the online processing part 200 is placed on the user equipment, making more rational use of the resources of the cloud and the user equipment. In the operating environment shown in FIG. 3, the online processing part 200 may be provided to the user by the cloud operator as a software module and installed on the user equipment.
As shown in FIG. 4, the preprocessing part 100 is deployed on the cloud, and the first matching module 204, the second matching module 205, and the third matching module 206 are also deployed on the cloud. The other modules of the online processing part 200 are deployed on the user equipment. In the operating environment shown in FIG. 4, the first matching module 204, the second matching module 205, the third matching module 206, and the feature library 104 are all deployed on the cloud, which avoids the network resource consumption caused by transmitting hash values, global features, and local features between the feature library 104 and the first matching module 204, the second matching module 205, and the third matching module 206. At the same time, the remaining modules of the online processing part 200 are placed on the user equipment, making more rational use of the resources of the cloud and the user equipment. In the operating environment shown in FIG. 4, the local feature extraction module 201, the global feature extraction module 202, the hash module 203, and the query interface/UI 207 may be provided to the user by the cloud operator as software modules and installed on the user equipment.
FIG. 5 provides a flowchart of the preprocessing method performed by the preprocessing part 100.

S301: the local feature extraction module 101 obtains a matching image and the identifier (ID) of the matching image from the matching image library 105. Each matching image stored in the matching image library 105 has a unique ID.

S302: the local feature extraction module 101 extracts local features for the matching image, and sends the extracted local features together with their corresponding position information and the matching image ID to the feature library 104.
The local feature extraction module 101 may include at least one submodule, each submodule being used to extract one kind of local feature; for example, a SIFT submodule is used to extract SIFT local features from the matching image, and a SURF submodule is used to extract SURF local features from the matching image. The various extracted local features are sent to the feature library 104 together with the matching image ID. The position information of the feature point of each local feature of each kind, for example its coordinates or region information within the matching image, is also sent to the feature library 104 as the position information of that local feature.
S303: the matching image library 105 sends the matching image and the ID of the matching image to the global feature extraction module 102.

S304/S306: the global feature extraction module 102 extracts global features for the matching image, and sends the extracted global features and the matching image ID to the hash module 103 and the feature library 104.
The global feature extraction module 102 may include at least one submodule, each submodule being used to extract one kind of global feature; for example, an RGB submodule is used to extract RGB global features from the matching image, and an LBP submodule is used to extract LBP global features from the matching image. The various extracted global features are sent to the feature library 104 together with the matching image ID. After the global feature extraction module 102 has extracted the global features, S304 and S306 may be executed in parallel or in any order.
S305: the hash module 103 hashes the global features to generate a hash value, and sends the generated hash value and the matching image ID to the feature library 104.

S301-S302 and S303-S306 may be executed in parallel or in any order.

After S301-S306 have been performed for each matching image in the matching image library 105, the feature library 104 stores a series of information for each matching image, including the matching image's ID, hash value, global features, and local features with their position information, as shown in Table 1:
| Image ID | Hash value | Global features | Local features | Position information |
|---|---|---|---|---|
| ID 1 | Hash value 1 | Global feature 1 | Local feature 1-1 | Position information 1-1 |
| | | Global feature 2 | Local feature 1-2 | Position information 1-2 |
| | | Global feature 3 | ... | ... |
| | | ... | Local feature 2-1 | Position information 2-1 |
| | | Global feature M | ... | ... |
| | | | Local feature 3-1 | Position information 3-1 |
| | | | ... | ... |
| | | | Local feature N-1 | Position information N-1 |
| ID 2 | ... | ... | ... | ... |

Table 1
Table 1 records global feature 1 to global feature M extracted from matching image ID 1, where each global feature refers to one kind of global feature; for example, global feature 1 may be an RGB global feature and global feature 2 may be an LBP global feature. Table 1 also records hash value 1, generated by hashing global feature 1 to global feature M. Hash value 1 may consist of at least one hash value; that is, global feature 1 to global feature M may be combined and hashed into a single hash value 1, or global feature 1 to global feature M may be divided into multiple segments that are hashed separately, the multiple resulting hash values together forming hash value 1. Taking as an example the case where matching image ID 1 has M global features and each global feature has m_i dimensions, with 1 ≤ i ≤ M, hash value 1 of matching image ID 1 may consist of Σm_i hash values, that is, each dimension of each global feature is hashed into one hash value. The M global features may also be combined in any manner and the combined result hashed to obtain hash values; in this case hash value 1 consists of fewer than Σm_i hash values.
As an example, the hash algorithm used by the hash module 103 may work as follows: a positive number is mapped to 1 and a non-positive number is mapped to 0. When this hash algorithm is used and each dimension of each global feature is hashed into one hash value, the global features of matching image ID 1, with Σm_i dimensions in total, are mapped into a Σm_i-bit binary sequence, which greatly reduces the complexity of the global features and speeds up the subsequent matching for the first candidate set.
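A minimal sketch of this sign-based hash follows, assuming the global features are available as numeric vectors; the concatenation order is an illustrative choice.

```python
# Sketch: map every dimension of every global feature to one bit
# (positive -> 1, non-positive -> 0), giving a sum(m_i)-bit binary sequence.
import numpy as np

def sign_hash(global_features):
    # global_features: list of 1-D arrays, one per kind of global feature
    concatenated = np.concatenate([np.asarray(f).ravel() for f in global_features])
    return (concatenated > 0).astype(np.uint8)
```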
Table 1 also records local feature 1-1 to local feature N-1 extracted from matching image ID 1, together with the position information of each local feature. For each kind of local feature, multiple feature points can generally be extracted from matching image ID 1; for example, local feature 1-1 and local feature 1-2 are both SIFT local features, while local feature 2-1 is a SURF local feature. Table 1 records M kinds of global features and N kinds of local features in total, where M and N are both positive integers.
FIG. 6 provides a schematic diagram of the position information of each local feature recorded in the feature library 104. The image in FIG. 6 includes 10 feature points, each feature point corresponding to one local feature; different feature points may correspond to the same or different kinds of local features. The position information of the local feature corresponding to each feature point includes the coordinate information of the feature point in the image or the region information of the feature point in the image. Each image is divided into H regions, where H is a positive integer greater than 1. As shown in FIG. 7, both the query image and the matching image can be divided into n rows and m columns of regions, where n*m equals H and both m and n are positive integers; for example, the image in FIG. 6 is divided into 3*3, that is 9, regions. Therefore, the position information corresponding to each feature point may be the ID of the region in which the feature point is located; in this case, the position information of Table 1 is as shown in Table 2 (a sketch of the coordinate-to-region mapping is given after Table 2):
| Image ID | Hash value | Global features | Local features | Position information |
|---|---|---|---|---|
| ID 1 | Hash value 1 | Global feature 1 | Local feature 1-1 | Region 1 |
| | | Global feature 2 | Local feature 1-2 | Region 6 |
| | | Global feature 3 | ... | ... |
| | | ... | Local feature 2-1 | Region 7 |
| | | Global feature M | ... | ... |
| | | | Local feature 3-1 | Region 2 |
| | | | ... | ... |
| | | | Local feature N-1 | Region 3 |
| ID 2 | ... | ... | ... | ... |

Table 2
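The mapping from a feature point's coordinates to the region IDs used in Table 2 can be sketched as follows; the row-major numbering of regions from 1 to H and the 3x3 default grid are assumptions consistent with FIG. 7, not a mandated layout.

```python
# Sketch: convert feature-point coordinates (x, y) to a region ID for an
# image divided into n rows and m columns (H = n * m regions, numbered 1..H).
def region_id(x, y, width, height, n_rows=3, n_cols=3):
    col = min(int(x * n_cols / width), n_cols - 1)
    row = min(int(y * n_rows / height), n_rows - 1)
    return row * n_cols + col + 1
```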
The preprocessing part 100 may further include a matching image acquisition module, which is used to obtain new matching images from other image libraries or from the network and store the new matching images in the matching image library 105. Whenever a new matching image is stored in the matching image library 105, the modules of the preprocessing part 100 execute the flow shown in FIG. 5 and store the hash value, the various features, and the position information of the new matching image in the feature library 104, so that the new matching image can be matched against query images.

FIG. 8 provides a flowchart of the online processing method performed by the online processing part 200.
S401: the query interface/UI 207 obtains the query image input by the user and sends the query image to the local feature extraction module 201.

S402: the local feature extraction module 201 extracts the local features of the query image and the position information of each local feature, and sends the local features of the query image and the position information of each local feature to the third matching module 206. For S402, refer to S302.

S403: the query interface/UI 207 sends the query image to the global feature extraction module 202.

S404/S406: the global feature extraction module 202 extracts the global features of the query image and sends the global features to the hash module 203 and the second matching module 205. For S404/S406, refer to S304/S306.

S405: the hash module 203 hashes the global features to generate a hash value and sends the hash value to the first matching module 204. For S405, refer to S305 and the foregoing description of the hash operation.
S407/S408: the first matching module 204 obtains the hash values of the matching images and the IDs of the matching images from the feature library 104. After obtaining the hash values of the matching images and the IDs of the matching images, the first matching module 204 matches the hash value of the query image against the hash values of the matching images, and according to the matching result sends the IDs of the matched matching images to the second matching module 205 as the first candidate set. The first matching module 204 may match the hash value of the query image against the hash values of the matching images by computing the similarity between the hash value of the query image and the hash value of each matching image. The first matching module 204 takes as the first candidate set the IDs of the matching images whose hash values have a similarity to the hash value of the query image greater than a first threshold. Alternatively, the first matching module 204 takes as the first candidate set the IDs of the matching images corresponding to the Q hash values in the matching image library 105 with the greatest similarity to the hash value of the query image, where Q is a positive integer.
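A sketch of this first-stage screening is given below, assuming the hash values are binary sequences compared through a Hamming-distance-based similarity; both the threshold variant and the top-Q variant are shown, and the parameter names are illustrative.

```python
# Sketch: first-stage screening over the hash values of the matching images.
import numpy as np

def first_candidate_set(query_hash, image_ids, image_hashes,
                        first_threshold=None, q=None):
    sims = np.array([
        1.0 - np.count_nonzero(query_hash != h) / len(query_hash)
        for h in image_hashes])
    if first_threshold is not None:
        keep = np.where(sims > first_threshold)[0]      # threshold variant
    else:
        keep = np.argsort(-sims)[:q]                    # top-Q variant
    return [image_ids[i] for i in keep]
```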
The first matching module 204 thus performs a preliminary screening of the matching images in the matching image library 105, filtering out a large number of images with low similarity to the query image, which improves the matching speed and matching accuracy for the query image.

S409: the second matching module 205 obtains, from the feature library 104, the global features corresponding to the matching image IDs in the first candidate set.
S410: the second matching module 205 matches the global features of the query image against the obtained global features of the matching images, and according to the matching result sends the IDs of the matched matching images to the third matching module 206 as the second candidate set. The second matching module 205 may match the global features of the query image against the global features of the matching images in the first candidate set by computing the similarity between the global features of the query image and the global features of the matching images, that is, the matching result includes the similarity between the global features of the query image and the global features of the matching images. The second matching module 205 takes as the second candidate set the IDs of the matching images whose global features have a similarity to the global features of the query image greater than a second threshold. Alternatively, the second matching module 205 takes as the second candidate set the IDs of the matching images corresponding to the W groups of global features in the first candidate set with the greatest similarity to the global features of the query image, where W is a positive integer.
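Analogously, a sketch of the second-stage screening over the first candidate set follows, using cosine similarity between global features as one of the permissible similarity measures; the threshold and W are illustrative parameters.

```python
# Sketch: second-stage screening using the global features of the first candidate set.
import numpy as np

def second_candidate_set(query_feat, candidate_ids, candidate_feats,
                         second_threshold=None, w=None):
    q = np.asarray(query_feat, float)
    sims = np.array([
        float(np.dot(q, f) / (np.linalg.norm(q) * np.linalg.norm(f) + 1e-12))
        for f in candidate_feats])
    if second_threshold is not None:
        keep = np.where(sims > second_threshold)[0]     # threshold variant
    else:
        keep = np.argsort(-sims)[:w]                    # top-W variant
    return [candidate_ids[i] for i in keep]
```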
The second matching module 205 further screens the matching images in the first candidate set, removing from the first candidate set images with low similarity to the query image, which reduces the amount of computation in the subsequent matching and further improves the matching speed and matching accuracy for the query image.

S411: the third matching module 206 obtains, from the feature library 104, the local features and position information corresponding to the matching image IDs in the second candidate set.

S412: the third matching module 206 matches the local features and corresponding position information of the query image against the obtained local features and corresponding position information of the matching images, determines the IDs of the matched matching images according to the matching result, and, according to the IDs of the matched matching images, obtains the corresponding matching images from the matching image library 105 as the retrieval result.

S412 includes two parts: the first part is the matching of local features, and the second part is the matching of the position information of the matched local features. The first-part and second-part matching is performed between the query image and each matching image in the second candidate set, to determine which IDs in the second candidate set correspond to matching images that can serve as the retrieval result.
In the first part, the third matching module 206 computes the similarity between each local feature of the query image and each local feature of a matching image x in the second candidate set, where the ID of matching image x is any ID in the second candidate set. Suppose that, among the local features of matching image x, the local feature with the greatest similarity to local feature 1 of the query image is local feature 1-1, the local feature with the second-greatest similarity to local feature 1 of the query image is local feature 1-2, the similarity between local feature 1 and local feature 1-1 is d_1, and the similarity between local feature 1 and local feature 1-2 is d_2. If the ratio of d_1 to d_2 is smaller than a third threshold, local feature 1 is successfully matched with local feature 1-1. If the ratio of d_1 to d_2 is not smaller than the third threshold, local feature 1 of the query image has no matching local feature among the local features of matching image x. After the first part is completed, the third matching module 206 has obtained, for the query image and each matching image, the set of matched local features. Taking as an example the case where the query image and matching image x have 10 matched local features, the 10 local features of the query image matched with matching image x are denoted v_1 to v_10, and the 10 local features of matching image x matched with the query image are denoted V_1 to V_10, where v_1 to v_10 and V_1 to V_10 are matched in one-to-one correspondence.
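Reading d_1 and d_2 as the distances to the nearest and second-nearest descriptors (the usual ratio-test reading of this step; since the paragraph above speaks of similarities, this interpretation and the threshold value are assumptions), the first part can be sketched as follows.

```python
# Sketch: ratio-test matching between the query image's local descriptors and
# those of matching image x; the third threshold value is illustrative.
import numpy as np

def ratio_test_matches(query_descs, image_descs, third_threshold=0.7):
    matches = []                                  # (query index, image index)
    image_descs = np.asarray(image_descs, float)
    if len(image_descs) < 2:
        return matches
    for qi, qd in enumerate(np.asarray(query_descs, float)):
        dists = np.linalg.norm(image_descs - qd, axis=1)
        order = np.argsort(dists)
        d1, d2 = dists[order[0]], dists[order[1]]
        if d2 > 0 and d1 / d2 < third_threshold:
            matches.append((qi, int(order[0])))
    return matches
```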
In the second part, the distribution of v_1 to v_10 over the regions of the query image and the distribution of V_1 to V_10 over the regions of matching image x are determined according to the position information of v_1 to v_10 in the query image and the position information of V_1 to V_10 in matching image x, respectively. If the position information obtained in S402 and S411 is coordinates, it is then necessary to determine, from the coordinates, which region of the query image each of v_1 to v_10 belongs to and which region of matching image x each of V_1 to V_10 belongs to. A region distribution vector f = [c_1, c_2, ..., c_H] is generated from the region distribution of v_1 to v_10 in the query image, where c_i represents the number of feature points corresponding to v_1 to v_10 included in region i of the query image, 1 ≤ i ≤ H. A region distribution vector F = [C_1, C_2, ..., C_H] is generated from the region distribution of V_1 to V_10 in the matching image, where C_i represents the number of feature points corresponding to V_1 to V_10 included in region i of the matching image, 1 ≤ i ≤ H. Taking FIG. 9 as an example, f = [2,1,1,1,0,1,1,3,0] and F = [2,2,0,1,0,1,1,2,1]. The similarity between f and F is computed as the matching result; if the similarity between f and F is greater than a fourth threshold, matching image x can serve as a retrieval result. The third matching module 206 obtains the corresponding matching image from the matching image library 105 according to the ID of matching image x as the retrieval result.
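The second part can be illustrated with the values of FIG. 9; the region IDs below are assumed values chosen so as to reproduce the two vectors given above, and cosine similarity is used as one of the permissible similarity measures.

```python
# Sketch: region distribution vectors of the matched feature points and their
# similarity; the region IDs are illustrative and reproduce the FIG. 9 vectors.
import numpy as np

def region_vector(region_ids, H=9):
    v = np.zeros(H)
    for r in region_ids:               # region IDs of the matched feature points
        v[r - 1] += 1
    return v

f = region_vector([1, 1, 2, 3, 4, 6, 7, 8, 8, 8])   # -> [2,1,1,1,0,1,1,3,0]
F = region_vector([1, 1, 2, 2, 4, 6, 7, 8, 8, 9])   # -> [2,2,0,1,0,1,1,2,1]
similarity = float(np.dot(f, F) / (np.linalg.norm(f) * np.linalg.norm(F)))
# Matching image x is kept as a retrieval result when this similarity is
# greater than the fourth threshold.
```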
Through the first part and the second part above, the retrieval results are further screened out of the second candidate set, which further improves the accuracy of the retrieval results. The use of the position information of local features means that the query image and the images in the retrieval result must not only have a relatively large number of matched local features, but the positions in the images of the feature points corresponding to the matched local features must also be similar, which guarantees the similarity between the query image and the matching image.

Optionally, in S412, the retrieval result may be selected from the second candidate set through the first part and the second part. Alternatively, the second part may be omitted, that is, the position information of local features is not used to select the retrieval result from the second candidate set. In this case, the local feature extraction modules 101/201, the third matching module 206, and the feature library 104 in FIG. 5 and FIG. 8 described above do not need to generate, transmit, or store the position information of local features. When only the first part is used, after the number of local features matched between the query image and matching image x is obtained, it is determined whether the number of matched local features is greater than a fifth threshold, or whether the ratio of the number of matched local features to the total number of local features in the query image is greater than a sixth threshold. If the number of matched local features is greater than the fifth threshold, or the ratio of the number of matched local features to the total number of local features in the query image is greater than the sixth threshold, matching image x can serve as a retrieval result.

S413: the third matching module 206 returns the retrieval result to the user through the query interface/UI 207.

Among the above steps, S401-S402 and S403-S406 may be executed in parallel. S404-S405 and S406 may be executed in parallel. S407, S409, and S411 may be executed in parallel, or may be combined into one step, that is, in S407 the hash values, matching image IDs, global features, local features, and position information are all sent to the first matching module 204; in S408 the first matching module 204, in addition to sending the first candidate set to the second matching module 205, also sends the IDs, global features, local features, and position information of the matching images to the second matching module 205; and in S410 the second matching module 205, in addition to sending the second candidate set to the third matching module 206, also sends the IDs, local features, and position information of the matching images to the third matching module 206.
This application further provides a preprocessing apparatus 500 for performing the preprocessing part 100. As shown in FIG. 10, the preprocessing apparatus 500 includes the local feature extraction module 101, the global feature extraction module 102, the hash module 103, the feature library 104, and the matching image library 105. This application further provides an image retrieval apparatus 600 for performing the online processing part 200. As shown in FIG. 11, the image retrieval apparatus 600 includes the local feature extraction module 201, the global feature extraction module 202, the hash module 203, the first matching module 204, the second matching module 205, and the third matching module 206. The modules in the preprocessing apparatus 500 and the image retrieval apparatus 600 may be software modules, and the feature library 104 may be a database or storage space provided by a cloud storage service.

This application further provides an image retrieval system 700. As shown in FIG. 12, the image retrieval system 700 includes a preprocessing cluster and an online processing cluster; the preprocessing cluster is used to perform the preprocessing part 100, and the online processing cluster is used to perform the online processing part 200. Each cluster includes at least one computing device 705. The computing device 705 includes a bus 703, a processor 701, a communication interface 702, and a memory 704. The processor 701, the memory 704, and the communication interface 702 communicate through the bus 703.
The processor 701 may be a central processing unit (CPU). The memory may include volatile memory, for example random access memory (RAM). The memory 704 may also include non-volatile memory, for example read-only memory (ROM), flash memory, an HDD, or an SSD. The memory 704 stores executable program code, and the processor 701 executes the executable code to perform the methods of FIG. 5 and FIG. 8 described above. The memory 704 may also include software modules required by other running processes, such as an operating system. The operating system may be LINUX™, UNIX™, WINDOWS™, or the like.
The memory 704 of the computing device 705 stores the code required to run the modules of the preprocessing apparatus 500 and the image retrieval apparatus 600, and the processor 701 executes this code to implement the functions of the preprocessing apparatus 500 and the image retrieval apparatus 600, that is, to perform the methods shown in FIG. 5 and FIG. 8. The computing devices 705 in the preprocessing cluster may be computing devices in a cloud environment. The computing devices 705 in the online processing cluster may be computing devices in a cloud environment, in an edge environment, or in a user environment. The feature library 104 and the matching image library 105 require large storage resources and may also be stored in a storage cluster within the preprocessing cluster. The online processing system communicates with the user's query interface/UI 207 to obtain the query image input by the user and to return the retrieval result to the user.
The descriptions of the flows corresponding to the above drawings each have their own emphasis. For a part not described in detail in one flow, refer to the related descriptions of the other flows.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are generated in whole or in part. The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to the computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, an SSD), and the like.

Claims (10)

  1. An image retrieval method performed by a computing device, comprising:
    acquiring a hash value of global features of a query image, and matching the hash value of the global features of the query image against hash values of global features of matching images in a matching image library;
    selecting a first candidate set from the matching images of the matching image library according to a matching result of the hash value of the global features of the query image and the hash values of the global features of the matching images of the matching image library;
    acquiring the global features of the query image, and matching the global features of the query image against the global features of the matching images in the first candidate set;
    selecting a second candidate set from the matching images of the first candidate set according to a matching result of the global features of the query image and the global features of the matching images in the first candidate set;
    acquiring local features of the query image, and matching the local features of the query image against local features of the matching images in the second candidate set; and
    determining a retrieval result from the matching images of the second candidate set according to a matching result of the local features of the query image and the local features of the matching images in the second candidate set.
  2. The method according to claim 1, further comprising: acquiring region information of the local features of the query image within the query image;
    wherein the acquiring the local features of the query image, matching the local features of the query image against the local features of the matching images in the second candidate set, and determining the retrieval result from the matching images of the second candidate set according to the matching result of the local features of the query image and the local features of the matching images in the second candidate set comprises:
    matching the region information of the local features of the query image against region information of the local features of the matching images in the second candidate set; and
    determining the retrieval result from the matching images of the second candidate set according to a matching result of the region information of the local features of the query image and the region information of the local features of the matching images in the second candidate set, wherein the similarity between the region information of the local features of a matching image in the retrieval result and the region information of the local features of the query image is greater than a first threshold.
  3. 如权利要求1所述的方法,其特征在于,还包括:获取所述查询图像的局部特征在所述查询图像的区域信息;The method according to claim 1, further comprising: acquiring regional information of local features of the query image in the query image;
    所述获取所述查询图像的局部特征,匹配所述查询图像的局部特征和所述第二候选集中的匹配图像的局部特征;根据所述查询图像的局部特征和所述第二候选集中的匹配图像的局部特征的匹配结果,从所述第二候选集的匹配图像中确定检索结果包括:Acquiring the local features of the query image, matching the local features of the query image and the local features of the matching image in the second candidate set; according to the matching of the local features of the query image and the second candidate set The matching result of the local features of the image, and determining the retrieval result from the matching images of the second candidate set includes:
    匹配所述查询图像的局部特征和所述第二候选集中的匹配图像的局部特征;Matching the local features of the query image and the local features of the matching image in the second candidate set;
    根据所述查询图像的局部特征的区域信息,获取匹配上的所述查询图像的局部特征在所述查询图像的第一区域分布向量;Acquiring, according to the region information of the local features of the query image, the matching local features of the query image in the first region of the query image;
    根据所述第二候选集的匹配图像的局部特征的区域信息,获取匹配上的所述第二候选集的匹配图像的局部特征在所述第二候选集的匹配图像的第二区域分布向量;Acquiring, according to the region information of the local features of the matching images of the second candidate set, the local features of the matching images of the second candidate set on the match in the second region of the matching images of the second candidate set;
    判断所述第一区域分布向量和所述第二区域分布向量的相似度是否大于第二阈值,如果所述第一区域分布向量和所述第二区域分布向量的相似度大于第二阈值,确定所述第二候选集中的匹配图像包括于所述检索结果。Determine whether the similarity between the first area distribution vector and the second area distribution vector is greater than a second threshold, and if the similarity between the first area distribution vector and the second area distribution vector is greater than a second threshold, determine The matching image in the second candidate set is included in the retrieval result.
  4. 一种图像检索装置,其特征在于,包括:An image retrieval device is characterized by comprising:
    第一匹配模块,用于获取查询图像的全局特征的哈希值,匹配所述查询图像的全局特征的哈希值和匹配图像库的匹配图像的全局特征的哈希值;根据所述查询图像的全局特征的哈希值和所述匹配图像库的匹配图像的全局特征的哈希值的匹配结果,从所述匹配图像库的匹配图像中选取第一候选集;将所述第一候选集发送至第二匹配模块;The first matching module is used to obtain the hash value of the global feature of the query image, match the hash value of the global feature of the query image and the hash value of the global feature of the matching image of the matching image library; according to the query image A matching result of the hash value of the global feature of the global feature and the hash value of the global feature of the matching image of the matching image library, selecting a first candidate set from the matching images of the matching image library; Send to the second matching module;
    所述第二匹配模块,用于获取所述查询图像的全局特征,匹配所述查询图像的全局特征和所述第一候选集中的匹配图像的全局特征;根据所述查询图像的全局特征和所述第一候选集中的匹配图像的全局特征的匹配结果,从所述第一候选集的匹配图像中选取第二候选集;将所述第二候选集发送至第三匹配模块;The second matching module is used to obtain the global features of the query image, match the global features of the query image and the global features of the matching images in the first candidate set; according to the global features and the query image A matching result of the global features of the matching images in the first candidate set, selecting a second candidate set from the matching images in the first candidate set; sending the second candidate set to a third matching module;
    所述第三匹配模块,用于获取所述查询图像的局部特征,匹配所述查询图像的局部特征和所述第二候选集中的匹配图像的局部特征;根据所述查询图像的局部特征和所述第二候选集中的匹配图像的局部特征的匹配结果,从所述第二候选集的匹配图像中确定检索结果。The third matching module is used to obtain the local features of the query image, match the local features of the query image and the local features of the matching image in the second candidate set; according to the local features of the query image and all The matching result of the local features of the matching images in the second candidate set is determined from the matching images in the second candidate set.
  5. 如权利要求4所述的装置,其特征在于,所述第三匹配模块用于:The apparatus of claim 4, wherein the third matching module is used to:
    获取所述查询图像的局部特征在所述查询图像的区域信息;匹配所述查询图像的局部特征的区域信息和所述第二候选集中的匹配图像的局部特征的区域信息;根据所述查询图像的局部特征的区域信息和所述第二候选集中的匹配图像的局部特征的区域信息的匹配结果,从所述第二候选集的匹配图像中确定所述检索结果,所述检索结果中的匹配图像的局部特征的区域信息和所述查询图像的局部特征的区域信息的相似度大于第一阈值。Acquiring the region information of the local feature of the query image in the query image; the region information matching the local feature of the query image and the region information of the local feature of the matching image in the second candidate set; according to the query image The matching result of the local information of the local feature and the local feature of the matching image in the second candidate set, the search result is determined from the matching image of the second candidate set, and the match in the search result The similarity between the area information of the local features of the image and the area information of the local features of the query image is greater than the first threshold.
  6. 如权利要求4所述的装置,其特征在于,所述第三匹配模块用于:The apparatus of claim 4, wherein the third matching module is used to:
    获取所述查询图像的局部特征在所述查询图像的区域信息;匹配所述查询图像的局部特征和所述第二候选集中的匹配图像的局部特征;根据所述查询图像的局部特征的区域信息,获取匹配上的所述查询图像的局部特征在所述查询图像的第一区域分布向量;根据所述第二候选集的匹配图像的局部特征的区域信息,获取匹配上的所述第二候选集的匹配图像的局部特征在所述第二候选集的匹配图像的第二区域分布向量;判断所述第一区域分布向量和所述第二区域分布向量的相似度是否大于第二阈值,如果所述第一区域分布向量和所述第二区域分布向量的相似度大于第二阈值,确定所述第二候选集中的匹配图像包括于所述检索结果。Obtain the local information of the local features of the query image in the query image; match the local features of the query image and the local features of the matching image in the second candidate set; and regional information based on the local features of the query image , Obtain the local features of the matching image on the query image in the first region of the query image distribution; according to the second candidate set of matching image local features of the regional information, get the matching on the second candidate The local features of the matching image of the set are distributed in the second region of the matching image of the second candidate set; determining whether the similarity between the first region distribution vector and the second region distribution vector is greater than the second threshold, if The similarity between the first area distribution vector and the second area distribution vector is greater than a second threshold, and it is determined that the matching images in the second candidate set are included in the retrieval result.
  7. 一种计算设备系统,其特征在于,包括至少一个计算设备,每个计算设备包括处理器和存储器,所述至少一个计算设备的处理器用于执行存储器中的程序代码以执行权利要求1至3任一所述的方法。A computing device system, characterized in that it includes at least one computing device, each computing device includes a processor and a memory, and the processor of the at least one computing device is used to execute the program code in the memory to execute any of claims 1 to 3 One of the methods.
  8. 一种非瞬态的可读存储介质,其特征在于,所述非瞬态的可读存储介质被计算设备执行时,所述计算设备执行上述权利要求1至3任一所述的方法。A non-transitory readable storage medium, characterized in that, when the non-transitory readable storage medium is executed by a computing device, the computing device performs the method according to any one of claims 1 to 3 above.
  9. 一种计算设备执行的图像匹配方法,其特征在于,包括:An image matching method performed by a computing device is characterized in that it includes:
    获取查询图像的至少一个局部特征的区域信息和匹配图像的至少一个局部特征的区域信息;Acquiring area information of at least one local feature of the query image and area information of at least one local feature of the matching image;
    根据所述查询图像的至少一个局部特征的区域信息,获取所述查询图像的至少一个局部特征在所述查询图像的第一区域分布向量,所述第一区域分布向量包括所述查 询图像的每个区域内包括的局部特征的数量;According to the region information of at least one local feature of the query image, obtain a distribution vector of at least one local feature of the query image in a first region of the query image, where the first region distribution vector includes each of the query images The number of local features included in each area;
    根据所述匹配图像的至少一个局部特征的区域信息,获取所述匹配图像的至少一个局部特征在所述匹配图像的第二区域分布向量,所述第二区域分布向量包括所述匹配图像的每个区域内包括的局部特征的数量;According to the region information of the at least one local feature of the matching image, obtain the distribution vector of the at least one local feature of the matching image in the second region of the matching image, where the second region distribution vector includes each of the matching images The number of local features included in each area;
    确定所述第一区域分布向量和所述第二区域分布向量的相似度大于阈值;Determining that the similarity between the first area distribution vector and the second area distribution vector is greater than a threshold;
    确定所述查询图像与所述匹配图像匹配。It is determined that the query image matches the matching image.
  10. 一种计算设备系统,其特征在于,包括至少一个计算设备,每个计算设备包括处理器和存储器,所述至少一个计算设备的处理器用于执行存储器中的程序代码以执行权利要求9所述的方法。A computing device system, characterized in that it includes at least one computing device, each computing device includes a processor and a memory, and the processor of the at least one computing device is used to execute the program code in the memory to execute the program of claim 9 method.
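Claims 3, 6, and 9 recite comparing region distribution vectors of local features. The following Python sketch illustrates one way such vectors could be built and compared; the 4x4 grid, the (x, y) keypoint representation, and the cosine-similarity threshold are illustrative assumptions of this sketch, not limitations drawn from the claims.

```python
# Illustrative sketch of region-distribution-vector matching (cf. claims 3, 6 and 9).
# Grid size, keypoint format and threshold are assumptions for demonstration only.
import numpy as np

def region_distribution_vector(keypoints, width, height, grid=4):
    """keypoints: iterable of (x, y) positions of local features in the image.
    Returns a vector whose entries count the local features falling in each
    cell of a grid x grid partition of the image."""
    counts = np.zeros((grid, grid), dtype=float)
    for x, y in keypoints:
        col = min(int(x / width * grid), grid - 1)
        row = min(int(y / height * grid), grid - 1)
        counts[row, col] += 1
    return counts.ravel()

def distributions_match(query_kps, query_size, match_kps, match_size, threshold=0.7):
    """Decide whether two images match based on the similarity of their
    region distribution vectors (cosine similarity as one possible measure)."""
    v1 = region_distribution_vector(query_kps, *query_size)
    v2 = region_distribution_vector(match_kps, *match_size)
    denom = np.linalg.norm(v1) * np.linalg.norm(v2)
    if denom == 0:
        return False
    return float(v1 @ v2) / denom > threshold

# Example usage with made-up keypoints and image sizes (width, height):
# distributions_match([(10, 20), (200, 150)], (640, 480),
#                     [(12, 25), (205, 140)], (640, 480))
```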
PCT/CN2019/106737 2018-12-21 2019-09-19 Image search method, apparatus, and device WO2020125100A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811568022.2 2018-12-21
CN201811568022.2A CN111353062A (en) 2018-12-21 2018-12-21 Image retrieval method, device and equipment

Publications (1)

Publication Number Publication Date
WO2020125100A1 true WO2020125100A1 (en) 2020-06-25

Family

ID=71102024

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/106737 WO2020125100A1 (en) 2018-12-21 2019-09-19 Image search method, apparatus, and device

Country Status (2)

Country Link
CN (1) CN111353062A (en)
WO (1) WO2020125100A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329889A (en) * 2020-11-26 2021-02-05 Oppo广东移动通信有限公司 Image processing method and device and electronic equipment
CN113160167B (en) * 2021-04-16 2022-01-14 深圳市铱硙医疗科技有限公司 Medical image data extraction working method through deep learning network model
WO2023272659A1 (en) * 2021-06-30 2023-01-05 东莞市小精灵教育软件有限公司 Method and apparatus for recognizing cover image, storage medium, and recognition device
CN114359590A (en) * 2021-12-06 2022-04-15 支付宝(杭州)信息技术有限公司 NFT image work infringement detection method and device and computer storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102542058B (en) * 2011-12-29 2013-04-03 天津大学 Hierarchical landmark identification method integrating global visual characteristics and local visual characteristics
CN105069089B (en) * 2015-08-04 2019-02-12 小米科技有限责任公司 Picture detection method and device
CN107239535A (en) * 2017-05-31 2017-10-10 北京小米移动软件有限公司 Similar pictures search method and device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100290708A1 (en) * 2009-05-12 2010-11-18 Canon Kabushiki Kaisha Image retrieval apparatus, control method for the same, and storage medium
CN102368237A (en) * 2010-10-18 2012-03-07 中国科学技术大学 Image retrieval method, device and system
CN103177105A (en) * 2013-03-26 2013-06-26 新浪网技术(中国)有限公司 Method and device of image search
CN103995864A (en) * 2014-05-19 2014-08-20 深圳先进技术研究院 Image retrieval method and device
CN105608230A (en) * 2016-02-03 2016-05-25 南京云创大数据科技股份有限公司 Image retrieval based business information recommendation system and image retrieval based business information recommendation method
CN106383891A (en) * 2016-09-22 2017-02-08 重庆理工大学 Deep hash-based medical image distributed retrieval method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU, XUETING: "Research of Image Retrieval Based on the Fusion of Local and Global Features", ELECTRONIC TECHNOLOGY & INFORMATION SCIENCE, CHINA MASTER’S THESES FULL-TEXT DATABASE, no. 8, 15 August 2016 (2016-08-15) *

Also Published As

Publication number Publication date
CN111353062A (en) 2020-06-30

Similar Documents

Publication Publication Date Title
WO2020125100A1 (en) Image search method, apparatus, and device
EP3477506B1 (en) Video detection method, server and storage medium
Simonyan et al. Learning local feature descriptors using convex optimisation
CN109710780B (en) Archiving method and device
US10061999B1 (en) System and method for using segmentation to identify object location in images
JP6099793B2 (en) Method and system for automatic selection of one or more image processing algorithms
EP2805262B1 (en) Image index generation based on similarities of image features
CN111651636B (en) Video similar segment searching method and device
US10997459B2 (en) Video content indexing and searching
Zhang et al. Panorama: a data system for unbounded vocabulary querying over video
JP2020525935A (en) Method and apparatus for determining duplicate video
Zhi et al. Two-stage pooling of deep convolutional features for image retrieval
US11714921B2 (en) Image processing method with ash code on local feature vectors, image processing device and storage medium
WO2019080908A1 (en) Image processing method and apparatus for implementing image recognition, and electronic device
Wang et al. Duplicate discovery on 2 billion internet images
CN109800318B (en) Filing method and device
US20180247152A1 (en) Method and apparatus for distance measurement
US20160232428A1 (en) Efficient local feature descriptor filtering
US20220139085A1 (en) Method and apparatus for video frame processing
Xie et al. Bag-of-words feature representation for blind image quality assessment with local quantized pattern
CN109697240B (en) Image retrieval method and device based on features
WO2019100348A1 (en) Image retrieval method and device, and image library generation method and device
Reta et al. Color uniformity descriptor: An efficient contextual color representation for image indexing and retrieval
US20230222762A1 (en) Adversarially robust visual fingerprinting and image provenance models
CN114860991A (en) Short video de-duplication method and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19901374; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19901374; Country of ref document: EP; Kind code of ref document: A1)
Kind code of ref document: A1