CN113886628A - Image retrieval method, device and storage medium - Google Patents

Info

Publication number
CN113886628A
Authority
CN
China
Prior art keywords
image
retrieved
cluster
images
retrieval
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111176709.3A
Other languages
Chinese (zh)
Inventor
张超群
周斌
孙鑫焱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Shizhuang Information Technology Co ltd
Original Assignee
Shanghai Shizhuang Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shizhuang Information Technology Co ltd filed Critical Shanghai Shizhuang Information Technology Co ltd
Priority to CN202111176709.3A priority Critical patent/CN113886628A/en
Publication of CN113886628A publication Critical patent/CN113886628A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53Querying
    • G06F16/535Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses an image retrieval method, an image retrieval device and a storage medium, wherein the method comprises the following steps: receiving a retrieval request; primarily screening images in an image library according to the image to be retrieved to obtain a first image set similar to the image to be retrieved; calculating a first image feature vector of the image to be retrieved, and determining a second image set according to the first image feature vector and a second image feature vector of the image in the first image set; and calculating a first image pixel clustering vector of the image to be retrieved and a second image pixel clustering vector of the image in the second image set, and obtaining a final similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector. With the embodiments of the invention, image retrieval can return the images in the image library whose similarity to the image to be retrieved meets the required standard, together with their related information.

Description

Image retrieval method, device and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to an image retrieval method, an image retrieval device, and a storage medium.
Background
Image retrieval means inputting a picture to be retrieved and then retrieving, from a picture library, the pictures that are the same as or similar to it, together with their related information. Image retrieval has wide practical application. For example, a user who sees an unfamiliar animal or plant can photograph it and upload the picture through image retrieval software, a website, an applet or the like to identify its species; on an e-commerce platform APP, a user can photograph a commodity and upload the picture to search for the target commodity.
At present, some application scenarios place high demands on retrieval precision. For example, an e-commerce platform retrieves whether an identical commodity display picture already exists, so as to prevent problems such as promotional images being copied and infringed or the same commodity being listed on the platform repeatedly. As another example, when goods on an e-commerce platform are damaged or flawed, pictures of the flaw are used for after-sale claims; the customer service system automatically performs flaw image retrieval and judges whether the claimant is committing fraud by reusing pictures from an earlier claim to obtain compensation a second time. These image retrieval scenarios require the retrieved pictures to be identical or highly similar to the picture to be retrieved; the precision requirement is high, and neither missed results nor wrong results are allowed.
However, the technique commonly used in current image retrieval is based on a convolutional neural network: the network extracts semantic features of the image, the features are vectorized to obtain feature vectors, and the similarity between images is compared by calculating the distance between the vectors. This method has low precision and only returns the top-ranked similar images that meet a set standard, so the user must screen the results further or a background manual auditor must make a secondary judgment. User experience is poor, auditing efficiency is low, and the high-precision retrieval requirement cannot be met.
Disclosure of Invention
The invention mainly aims to provide an image retrieval method, an image retrieval device and a storage medium, and aims to solve the technical problems that the image retrieval method in the prior art is low in precision and cannot meet application scenes with high retrieval precision requirements.
In order to achieve the above object, the present invention provides an image retrieval method, including the steps of:
receiving a retrieval request, wherein the retrieval request carries an image to be retrieved;
primarily screening images in an image library according to the image to be retrieved to obtain a first image set similar to the image to be retrieved;
calculating a first image feature vector of the image to be retrieved, and determining a second image set according to the first image feature vector and a second image feature vector of the image in the first image set;
and calculating a first image pixel clustering vector of the image to be retrieved and a second image pixel clustering vector of the image in the second image set, and obtaining a final similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector.
Optionally, the step of performing preliminary screening on the images in the image library according to the image to be retrieved to obtain a first image set similar to the image to be retrieved includes:
calculating the Hash code of the image to be retrieved according to a preset algorithm;
acquiring a central cluster hash code of an image cluster in the image library;
determining whether the image cluster is a similar cluster of the image to be retrieved according to the hash code of the image to be retrieved and the hash code of the central cluster of the image cluster;
and if the cluster is determined to be similar, determining the first image set according to the Hash codes of all the images in the image cluster and the Hash code of the image to be retrieved.
Optionally, the step of determining whether the image cluster is a similar cluster of the image to be retrieved according to the hash code of the image to be retrieved and the hash code of the center cluster of the image cluster includes:
calculating a first Hamming distance between the central Hash code of the image clustering cluster and the Hash code of the image to be retrieved;
and if the first Hamming distance is smaller than a first preset threshold value, determining the image cluster as a similar cluster of the image to be retrieved.
Optionally, the step of determining the first image set according to the hash codes of all the images in the image cluster and the hash code of the image to be retrieved includes:
traversing all images in the image cluster;
calculating a second Hamming distance between the Hash code of the first target image in the image cluster and the Hash code of the image to be retrieved;
and if the second Hamming distance is smaller than a second preset threshold value, determining that the first target image is a similar image of the image to be retrieved, until all images in the image cluster are traversed, to obtain the first image set.
Optionally, the step of determining a second image set according to the first image feature vector and a second image feature vector of an image in the first image set includes:
acquiring second image feature vectors of all images in the first image set;
respectively calculating a first cosine distance between a first image feature vector of the image to be retrieved and a second image feature vector of a second target image in the first image set;
and if the first cosine distance is smaller than a third preset threshold value, determining that the second target image is a similar image of the image to be retrieved, until all images in the first image set are traversed, to obtain the second image set.
Optionally, the step of calculating a first image pixel cluster vector of the image to be retrieved and a second image pixel cluster vector of the image in the second image set includes:
respectively carrying out image pixel value clustering processing on the RGB three channels of the image to be retrieved and the RGB three channels of the image in the second image set to obtain a first pixel value cluster of the image to be retrieved and a second pixel value cluster of the image in the second image set;
and calculating the central value of the first pixel value cluster and the central value of the second pixel value cluster to obtain the first image pixel cluster vector and the second image pixel cluster vector.
Optionally, the second image set includes a third target image, and the step of obtaining a similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector includes:
respectively calculating a second cosine distance between the first image pixel cluster vector and the second image pixel cluster vector of the third target image;
and if the second cosine distance is smaller than a fourth preset threshold, determining that the third target image is a similar image of the image to be retrieved, until all images in the second image set are traversed, to obtain a final similar image of the image to be retrieved.
Optionally, the method further comprises the steps of:
and outputting the index number of the final similar image so as to obtain the image corresponding to the index number and the related information thereof from the image library according to the index number and display the image and the related information to a user.
Further, to achieve the above object, the present invention also proposes an image retrieval apparatus comprising:
the receiving unit is used for receiving a retrieval request, wherein the retrieval request carries an image to be retrieved;
the preliminary screening unit is used for preliminarily screening the images in the image library according to the images to be retrieved received by the receiving unit to obtain a first image set similar to the images to be retrieved;
the first processing unit is used for calculating a first image feature vector of the image to be retrieved and determining a second image set according to the first image feature vector and a second image feature vector of the image in the first image set;
and the second processing unit is used for calculating a first image pixel clustering vector of the image to be retrieved and a second image pixel clustering vector of the image in the second image set, and obtaining a final similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector.
In addition, in order to achieve the above object, the present invention further provides an electronic device, which includes a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the image retrieval method as described above when executing the computer program.
Furthermore, to achieve the above object, the present invention also proposes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the image retrieval method as described above.
According to the technical scheme, when an image retrieval request is received, the retrieval range in the image library is narrowed by comparing the image to be retrieved with the images in the image library several times along different dimensions, so that the images in the library whose similarity to the image to be retrieved meets the required standard can be returned together with their related information. The retrieval precision is high, all images meeting the condition can be retrieved, and the false recognition rate and the missed recognition rate of the retrieval result are low.
Drawings
FIG. 1 is a flowchart illustrating an image retrieval method according to a first embodiment of the present invention;
FIG. 2 is a schematic flow chart of the embodiment of step 102 in FIG. 1;
FIG. 3 is a schematic flow chart of the embodiment of step 103 in FIG. 1;
FIG. 4 is a schematic flow chart of an embodiment of step 104 in FIG. 1;
FIG. 5 is a flowchart illustrating an image retrieval method according to a second embodiment of the present invention;
FIG. 6 is a block diagram of an image retrieval apparatus according to the present invention;
FIG. 7 is a schematic structural diagram of an electronic device provided in the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the technical problems to be solved, the technical solutions and the advantageous effects of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only to facilitate the explanation of the present invention and have no specific meaning in themselves. Thus, "module", "component" and "unit" may be used interchangeably.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
In one embodiment, as shown in fig. 1, the present invention provides an image retrieval method, the method comprising:
step 101, receiving a retrieval request, wherein the retrieval request carries an image to be retrieved.
In the embodiment of the application, a user or an operator uploads an image to be retrieved on an APP or a website, a retrieval system acquires the image to be retrieved and stores the image and related information to a server, and then the retrieval system sends a retrieval request to the server. The server runs an image retrieval program after receiving a retrieval request from the retrieval system to obtain similar images of the image to be retrieved. Therefore, it can be understood that the retrieval request should carry an image to be retrieved, so that the server can run an image retrieval program according to the image to be retrieved to obtain a similar image.
Step 102, primarily screening the images in the image library according to the image to be retrieved to obtain a first image set similar to the image to be retrieved.
When the step is executed, the server performs coarse screening on all images stored in the image library by using a perceptual hash algorithm according to the image to be retrieved so as to reduce the range of the retrieved image and obtain a first image set. For example, the prescreening method may be: the server can firstly calculate the Hash codes of the images to be retrieved, then obtains the center cluster Hash codes of all the image clustering clusters in the image library, determines similar clusters of the images to be retrieved according to the Hash codes of the images to be retrieved and the center cluster Hash codes of the image clustering clusters, and then finds out the images similar to the images to be retrieved from the similar clusters, so that the retrieval range is narrowed, the calculation amount of retrieval work is reduced, and the retrieval efficiency can be improved to a certain extent.
Step 103, calculating a first image feature vector of the image to be retrieved, and determining a second image set according to the first image feature vector and a second image feature vector of the image in the first image set.
In this step, when calculating the image feature vector of the image to be retrieved, the server may, for example, extract the image feature vector by using a trained convolutional neural network, where the convolutional neural network may be VGGNet, ResNet, DenseNet, EfficientNet, or the like, and the size of the feature vector is 1 × 512. The second image feature vectors of all the images in the first image set are extracted by the server in advance by using the same trained convolutional neural network and are stored in the database in which the image library is located.
In this step, when the second image set is determined, for example, cosine distances between feature vectors of images to be retrieved and feature vectors of all images in the first image set may be calculated one by one, and when the cosine distances are smaller than a set value, the images are determined to be similar images of the images to be retrieved, and finally, the determined similar images are combined to form the second image set.
Step 104, calculating a first image pixel clustering vector of the image to be retrieved and a second image pixel clustering vector of the image in the second image set, and obtaining a final similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector.
In the step, image pixel value clustering is carried out on RGB three channels of an image to be retrieved by respectively using K-Means to obtain a set number of pixel value clustering clusters, the central value of each pixel value cluster is calculated to obtain a pixel clustering vector of the image to be retrieved, each channel corresponds to a clustering vector, and the vector length is 10. Using the same method, image pixel cluster vectors for all images in the second image set can be calculated. Then, when the similar images are determined, cosine distances between the clustering vectors of the images to be retrieved and the clustering vectors of all the images in the second image set can be calculated one by one, and when the cosine distances are smaller than a set value, the similar images which are finally retrieved are determined.
In this embodiment of the invention, the images in the image library are first preliminarily screened according to the image to be retrieved to obtain a first image set; a second image set is then determined according to the image feature vector of the image to be retrieved and the image feature vectors of the images in the first image set; finally, the final similar images are obtained according to the image pixel clustering vector of the image to be retrieved and the image pixel clustering vectors of the images in the second image set. By comparing the image to be retrieved with the images in the image library several times along different dimensions, the retrieval range is narrowed step by step, so that the images in the library whose similarity to the image to be retrieved meets the required standard can be returned together with their related information. All images meeting the condition can be retrieved, the false recognition rate and the missed recognition rate of the retrieval result are low, and the precision of the retrieval result is high. In addition, retrieval is fast: a single retrieval takes less than 200 milliseconds, which meets the real-time retrieval performance requirement. The user does not need to further screen or re-judge the retrieval result, which improves the user experience.
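The step-by-step narrowing described above can be pictured as a cascade of filters. The following is only a minimal sketch under stated assumptions: the name `cascade_retrieve`, the `(distance, threshold)` stage pairs and the toy numeric data are hypothetical and do not come from the patent. In the described method the three stages would be the Hamming distance on hash codes, the cosine distance on CNN feature vectors and the cosine distance on pixel clustering vectors.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# A stage pairs a distance function with its threshold (hypothetical structure).
Stage = Tuple[Callable[[object, object], float], float]


def cascade_retrieve(query: object, candidates: Iterable[object],
                     stages: Sequence[Stage]) -> List[object]:
    """Keep only the candidates whose distance to the query is below
    the threshold of every stage, applying the stages in order."""
    survivors = list(candidates)
    for distance, threshold in stages:
        # Each stage only sees what the previous stage let through,
        # so later (more expensive) comparisons run on fewer items.
        survivors = [c for c in survivors if distance(query, c) < threshold]
        if not survivors:
            break
    return survivors


# Toy usage with numbers standing in for images: a coarse stage, then a strict one.
stages = [(lambda q, c: abs(q - c), 10), (lambda q, c: abs(q - c), 3)]
print(cascade_retrieve(5, [1, 4, 6, 20, 9], stages))
```

Ordering the stages from cheapest to most expensive is what keeps the total cost low, because each later comparison is evaluated on a smaller set.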
Referring to fig. 2, fig. 2 provides a specific implementation of step 102 in the embodiment of fig. 1, that is, when an image in an image library is subjected to preliminary screening according to the image to be retrieved to obtain a first image set similar to the image to be retrieved, a specific flow may be as shown in fig. 2, and includes the following steps:
Step 1021, calculating the hash code of the image to be retrieved according to a preset algorithm.
In the specific operation of this step, the server converts the image to be retrieved into a grayscale image, so that the number of channels becomes 1, and scales the grayscale image to 64x64. A discrete cosine transform is performed on the scaled grayscale image to obtain a frequency-domain data matrix, and the data matrix is flattened into a one-dimensional vector from left to right and from top to bottom. The mean of all values in the vector is then calculated and each value in the vector is compared with it: a value greater than or equal to the mean is recorded as 1, and a value smaller than the mean is recorded as 0. The resulting sequence is the hash code of the image to be retrieved.
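The hash computation of step 1021 can be sketched in plain Python as follows. This is an illustrative sketch, not the patented implementation: the function names, the BT.601 luma weights used for the grayscale conversion, the nearest-neighbour scaling and the orthonormal DCT-II normalisation are all assumptions, since the text only specifies grayscale conversion, scaling to 64x64, a discrete cosine transform, flattening and thresholding at the mean.

```python
import math
from typing import List, Sequence

Gray = List[List[float]]


def to_gray(rgb: Sequence[Sequence[Sequence[int]]]) -> Gray:
    """H x W x 3 RGB image -> single channel (BT.601 weights, an assumption)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb]


def resize_nearest(img: Gray, size: int) -> Gray:
    """Scale to size x size by nearest-neighbour sampling (an assumption)."""
    h, w = len(img), len(img[0])
    return [[img[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def dct_matrix(n: int) -> List[List[float]]:
    """Orthonormal DCT-II basis; row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a: Gray, b: Gray) -> Gray:
    bt = list(zip(*b))  # columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def dct2(img: Gray) -> Gray:
    """2-D DCT of a square matrix X, computed as C * X * C^T."""
    c = dct_matrix(len(img))
    ct = [list(col) for col in zip(*c)]
    return matmul(matmul(c, img), ct)


def perceptual_hash(rgb: Sequence[Sequence[Sequence[int]]], size: int = 64) -> List[int]:
    """Bits are 1 where the DCT coefficient is >= the mean coefficient, else 0."""
    freq = dct2(resize_nearest(to_gray(rgb), size))
    flat = [v for row in freq for v in row]  # left to right, top to bottom
    mean = sum(flat) / len(flat)
    return [1 if v >= mean else 0 for v in flat]
```

With the default `size=64` the hash has 64 × 64 = 4096 bits. A production system would more likely use an optimised DCT from a numerical library than these pure-Python loops.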
Step 1022, acquiring the hash code of the center cluster of the image cluster in the image library.
Before this step is performed, the hash codes of all the images in the image library have been calculated in advance according to the method described in step 1021 and stored in the database of the image library. And clustering by adopting K-means to obtain a plurality of clustering clusters, wherein each clustering cluster is provided with a central cluster hash code. The purpose of this step is to obtain the central cluster hash code of each image cluster from the database of the image library.
Step 1023, determining whether the image cluster is a similar cluster of the image to be retrieved according to the hash code of the image to be retrieved and the hash code of the central cluster of the image cluster.
Step 1024, if the cluster is determined to be a similar cluster, determining the first image set according to the hash codes of all the images in the image cluster and the hash code of the image to be retrieved.
Steps 1023 to 1024 may be executed as follows: the Hamming distance between the hash code of the image to be retrieved and the center cluster hash code of each image cluster is calculated; if the Hamming distance is smaller than a preset threshold, the corresponding image cluster is a similar cluster. The Hamming distance between the hash code of the image to be retrieved and the hash code of each image in the similar cluster is then calculated; if the Hamming distance is smaller than a preset threshold, the image is a similar image.
Specifically, when the first image set is determined according to the hash codes of all the images in the image cluster and the hash code of the image to be retrieved, the operation mode may be:
in the image cluster determined to be a similar cluster, sequentially traversing all images in the image cluster, and calculating the Hamming distance between the Hash code of each target image in the image cluster and the Hash code of the image to be retrieved one by one; and if the Hamming distance is smaller than a second preset threshold value, determining that the target image is a similar image of the image to be retrieved until all images in the image cluster are traversed. And finally, combining all the retrieved similar images into a first image set.
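The two-level Hamming screening of steps 1023 and 1024 can be sketched as below. The data layout (a list of clusters, each a dictionary with a `center` hash code and an `images` mapping from index number to hash code) and the function names are hypothetical, chosen only to make the example self-contained.

```python
from typing import Dict, List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit sequences differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def prescreen(query_hash: Sequence[int], clusters: List[Dict],
              cluster_threshold: int, image_threshold: int) -> List[str]:
    """Return the index numbers of images forming the first image set."""
    first_set = []
    for cluster in clusters:
        # Level 1: skip clusters whose center hash is not close to the query.
        if hamming(query_hash, cluster["center"]) >= cluster_threshold:
            continue
        # Level 2: traverse every image in a similar cluster.
        for index, code in cluster["images"].items():
            if hamming(query_hash, code) < image_threshold:
                first_set.append(index)
    return first_set


clusters = [
    {"center": [1, 0, 1, 1], "images": {"a": [1, 0, 1, 0], "b": [0, 1, 1, 0]}},
    {"center": [0, 1, 0, 1], "images": {"c": [0, 1, 0, 0]}},
]
print(prescreen([1, 0, 1, 0], clusters, cluster_threshold=2, image_threshold=1))
```

Only images inside clusters that pass the first level are ever compared individually, which is where the reduction in computation comes from; the first threshold therefore has to be loose enough that clusters containing true matches are not skipped.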
In this embodiment, when an image retrieval request is received, the retrieval range is narrowed step by step by comparing the image to be retrieved with the images in the image library several times along different dimensions, so that the images in the library whose similarity to the image to be retrieved meets the required standard can be returned together with their related information. The retrieval precision is high, all images meeting the condition can be retrieved, and the false recognition rate and the missed recognition rate of the retrieval result are low. Moreover, similar clusters are found by calculating the Hamming distance between the center hash code of each image cluster and the hash code of the image to be retrieved, and image similarity retrieval is carried out only within the similar clusters; this narrows the retrieval range in the image library, reduces the amount of calculation, shortens the retrieval time, improves the retrieval efficiency, and meets the real-time retrieval requirement.
Referring to fig. 3, fig. 3 provides a specific implementation of step 103 in the embodiment of fig. 1, that is, when determining the second image set according to the first image feature vector and the second image feature vector of the image in the first image set, a specific process can be shown in fig. 3, which includes the following steps:
Step 1031, obtaining second image feature vectors of all images in the first image set.
Wherein all images in the image library have extracted image feature vectors using the trained convolutional neural network described above and are stored in the database in which the image library is located, which of course also includes the image feature vectors of all images in the first image set. Therefore, when this step is performed, it is only necessary to read the image feature vectors of all the images in the first image set from the database of the image library.
Step 1032, respectively calculating a first cosine distance between the first image feature vector of the image to be retrieved and a second image feature vector of a second target image in the first image set.
Step 1033, if the first cosine distance is smaller than a third preset threshold, determining that the second target image is a similar image of the image to be retrieved, until all images in the first image set are traversed, to obtain the second image set.
Step 1032-step 1033, during the specific operation, sequentially traversing all the images in the first image set, and calculating cosine distances between the feature vectors of each target image in the first image set and the feature vectors of the images to be retrieved one by one; and if the cosine distance is smaller than a third preset threshold value, determining that the target image is a similar image of the image to be retrieved until all images in the first image set are traversed. Finally, all the similar images which are searched out are combined into a second image set.
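A minimal sketch of the cosine-distance filtering in steps 1032 and 1033 follows. It assumes that "cosine distance" means one minus the cosine similarity, which the text does not define explicitly, and it uses short toy vectors in place of the 1 × 512 feature vectors; `filter_by_features` and the dictionary of stored vectors are hypothetical names.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """1 - cosine similarity (assumed definition); 0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: treat an all-zero vector as dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def filter_by_features(query_vec: Sequence[float],
                       stored: Dict[str, Sequence[float]],
                       threshold: float) -> List[str]:
    """Traverse the first image set and keep the indexes below the threshold."""
    return [index for index, vec in stored.items()
            if cosine_distance(query_vec, vec) < threshold]


stored = {"x": [2.0, 0.0], "y": [0.0, 1.0], "z": [1.0, 1.0]}
print(filter_by_features([1.0, 0.0], stored, threshold=0.2))
```

Because the stored vectors are precomputed and read from the database, only the query vector has to be extracted at retrieval time.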
In this embodiment, when an image retrieval request is received, the retrieval range in the image library is narrowed step by step through several comparisons along different dimensions, so that the images in the library whose similarity to the image to be retrieved meets the required standard can be returned together with their related information. The retrieval precision is high, all images meeting the condition can be retrieved, and the false recognition rate and the missed recognition rate of the retrieval result are low. Moreover, by calculating the cosine distances between the feature vector of the image to be retrieved and the feature vectors of all the images in the first image set, the retrieval range can be further narrowed, which reduces the amount of calculation, shortens the retrieval time, improves the retrieval efficiency, and meets the real-time retrieval requirement.
Referring to fig. 4, fig. 4 provides a specific implementation manner of step 104 in the embodiment of fig. 1, that is, when calculating a first image pixel cluster vector of the image to be retrieved and a second image pixel cluster vector of an image in the second image set, and obtaining a similar image of the image to be retrieved according to the first image pixel cluster vector and the second image pixel cluster vector, a specific process may be as shown in fig. 4, and includes the following steps:
Specifically, calculating the first image pixel cluster vector of the image to be retrieved and the second image pixel cluster vector of the image in the second image set includes steps 1041 and 1042, and obtaining a similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector includes steps 1043 and 1044.
Step 1041, performing image pixel value clustering processing on the RGB three channels of the image to be retrieved and the RGB three channels of the image in the second image set respectively, and obtaining a first pixel value cluster of the image to be retrieved and a second pixel value cluster of the image in the second image set.
Step 1042, calculating a central value of the first pixel value cluster and a central value of the second pixel value cluster to obtain the first image pixel cluster vector and the second image pixel cluster vector.
In steps 1041 to 1042, in a specific implementation, K-Means may be used to cluster the image pixel values of each of the three RGB channels of the image to be retrieved, to obtain a set number of pixel value clusters. The central value of each pixel value cluster is then calculated to obtain the pixel cluster vector of the image to be retrieved, where each channel corresponds to one cluster vector and the vector length is 10. Using the same method, the image pixel cluster vectors of all images in the second image set can be calculated.
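As an illustration of steps 1041 to 1042, the following is a minimal sketch in Python. It assumes NumPy is available and that the image is an H x W x 3 RGB array; the function names `kmeans_1d` and `pixel_cluster_vectors`, the quantile-based initialization, the fixed iteration count, and the sorting of the cluster centers are all assumptions made for the example and are not specified by the embodiment.

```python
import numpy as np

def kmeans_1d(values, k=10, iters=20):
    """Plain Lloyd-style K-Means on scalar pixel values; returns k sorted centers."""
    values = np.asarray(values, dtype=np.float64).ravel()
    # Assumption: initialise centers at evenly spaced quantiles (deterministic).
    centers = np.quantile(values, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        # Assign every pixel value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            members = values[labels == j]
            if members.size:  # keep the previous center if a cluster is empty
                centers[j] = members.mean()
    # Assumption: sort centers so vectors of different images are comparable.
    return np.sort(centers)

def pixel_cluster_vectors(image, k=10):
    """image: H x W x 3 RGB array -> 3 x k array, one cluster vector per channel."""
    return np.stack([kmeans_1d(image[..., c], k) for c in range(3)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
    print(pixel_cluster_vectors(demo).shape)  # one length-10 vector per channel
```

The assignment step here builds a pixels-by-clusters matrix, which is acceptable for small images; for large images a library implementation of K-Means or a downsampled image would likely be preferable.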
Step 1043, calculating a second cosine distance between the first image pixel cluster vector and the second image pixel cluster vector of the third target image.
Step 1044 of determining that the third target image is a similar image of the image to be retrieved if the second cosine distance is smaller than a fourth preset threshold value until all images in the second image set are traversed to obtain the similar image of the image to be retrieved.
In steps 1043 to 1044, since the second image set includes a plurality of images, when similar images are determined, the cosine distances between the cluster vector of the image to be retrieved and the cluster vectors of all the images in the second image set (i.e., a plurality of third target images) may be calculated one by one. If the cosine distance is smaller than the fourth preset threshold, the third target image is determined to be a similar image of the image to be retrieved, until all the images in the second image set are traversed. Finally, all the retrieved similar images are output.
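The traversal and threshold comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the cosine distance is taken to be one minus the cosine similarity (the embodiment does not give a formula), a zero vector is treated as maximally dissimilar, and the names `cosine_distance` and `filter_by_cosine` as well as the dictionary keyed by index number are hypothetical.

```python
import numpy as np

def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity of the flattened vectors."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 1.0  # assumption: a zero vector matches nothing
    return 1.0 - float(np.dot(a, b) / denom)

def filter_by_cosine(query_vec, candidates, threshold):
    """candidates: dict mapping index number -> vector.

    Returns the index numbers whose distance to query_vec is below threshold.
    """
    return [idx for idx, vec in candidates.items()
            if cosine_distance(query_vec, vec) < threshold]

if __name__ == "__main__":
    second_set = {1: [2.0, 0.0], 2: [0.0, 1.0], 3: [1.0, 1.0]}
    print(filter_by_cosine([1.0, 0.0], second_set, threshold=0.1))
```

The same helper could serve the earlier feature-vector comparison, since both stages compare a query vector with stored vectors against a preset threshold; only the vectors and the threshold differ.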
In this embodiment, when an image retrieval request is received, the retrieval range within the image library is narrowed step by step through multiple rounds of multi-dimensional comparison, so that the images in the image library whose similarity to the image to be retrieved meets the required standard can be returned together with their related information. The retrieval precision is high, all images meeting the conditions can be retrieved, and the false recognition rate and the missing recognition rate of the retrieval result are low. Moreover, by calculating the cosine distances between the cluster vector of the image to be retrieved and the cluster vectors of all the images in the second image set, the retrieval range can be further narrowed, which reduces the amount of calculation, shortens the retrieval time, improves the retrieval efficiency, and meets the real-time retrieval requirement.
Referring to fig. 5, fig. 5 provides a schematic flow chart after step 104 in the embodiment of fig. 1, that is, after step 104 is executed, the method may further include:
Step 105, outputting the index number of the final similar image, so that the image corresponding to the index number and its related information can be obtained from the image library according to the index number and displayed to the user.
The server may output the retrieval result to the retrieval system. In a specific operation, the server returns the index number of the final similar image found in the image library to the retrieval system. The retrieval system accesses the image library by using the index number, acquires the image and the image-related information corresponding to the index number, and displays them on the display interface of the APP or website where the retrieval initiator is located.
In this embodiment, when an image retrieval request is received, the retrieval range within the image library is narrowed step by step through multiple rounds of multi-dimensional comparison, so that the images in the image library whose similarity to the image to be retrieved meets the required standard can be returned together with their related information. The retrieval precision is high, all images meeting the conditions can be retrieved, and the false recognition rate and the missing recognition rate of the retrieval result are low. The retrieved similar images are returned to the retrieval system in the form of index numbers. The retrieval system can therefore access the image library according to the index number, acquire the image and the image-related information corresponding to the index number, and display them on the display interface of the APP or website where the retrieval initiator is located.
Further, an embodiment of the present invention further provides an image retrieval apparatus, and with reference to fig. 6, the image retrieval apparatus includes:
a receiving unit 10, configured to receive a retrieval request, where the retrieval request carries an image to be retrieved;
a preliminary screening unit 20, configured to perform preliminary screening on images in an image library according to the image to be retrieved received by the receiving unit 10, so as to obtain a first image set similar to the image to be retrieved;
the first processing unit 30 is configured to calculate a first image feature vector of the image to be retrieved, and determine a second image set according to the first image feature vector and a second image feature vector of an image in the first image set;
and the second processing unit 40 is configured to calculate a first image pixel cluster vector of the image to be retrieved and a second image pixel cluster vector of the image in the second image set, and obtain a final similar image of the image to be retrieved according to the first image pixel cluster vector and the second image pixel cluster vector.
According to this scheme, the images in the image library are first preliminarily screened according to the image to be retrieved to obtain a first image set similar to the image to be retrieved. A second image set is then determined according to the image feature vector of the image to be retrieved and the image feature vectors of the images in the first image set, and the final similar image is obtained according to the image pixel cluster vector of the image to be retrieved and the image pixel cluster vectors of the images in the second image set. Because the retrieval range is narrowed step by step through multi-dimensional comparison between the image to be retrieved and the images in the image library, the images in the image library whose similarity to the image to be retrieved meets the required standard can be returned together with their related information, all images meeting the requirement can be retrieved, the false recognition rate and the missing recognition rate of the retrieval result are low, and the accuracy of the retrieval result is high. In addition, the retrieval process takes little time: a single retrieval takes less than 200 milliseconds, which meets the real-time retrieval performance requirement. The user does not need to further screen or re-evaluate the retrieval result, which improves the user experience.
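The step-by-step narrowing summarized above can be viewed as a cascade in which each stage keeps only the candidates whose distance to the query is below that stage's threshold. The following is a minimal, generic sketch of that idea; the function name `cascade_retrieve`, the representation of a stage as a (distance function, threshold) pair, and the use of plain numbers as stand-ins for images in the demo are assumptions for illustration and do not reproduce the hash, feature vector, or cluster vector computations themselves.

```python
from typing import Callable, Iterable, List, Sequence, Tuple, TypeVar

Q = TypeVar("Q")
C = TypeVar("C")
# A stage is a (distance function, threshold) pair; both are hypothetical here.
Stage = Tuple[Callable[[Q, C], float], float]

def cascade_retrieve(query: Q, candidates: Iterable[C],
                     stages: Sequence[Stage]) -> List[C]:
    """Apply the stages in order, keeping candidates with distance < threshold."""
    survivors = list(candidates)
    for distance, threshold in stages:
        survivors = [c for c in survivors if distance(query, c) < threshold]
        if not survivors:  # nothing left to compare in later stages
            break
    return survivors

if __name__ == "__main__":
    # Toy stand-in: numbers instead of images, absolute difference as distance.
    coarse = (lambda q, c: abs(q - c), 20)
    fine = (lambda q, c: abs(q - c), 10)
    print(cascade_retrieve(50, range(0, 100, 5), [coarse, fine]))
```

Ordering the stages from cheapest to most expensive means the costlier comparisons run only on the candidates that survived the earlier ones, which is consistent with the reduced calculation amount described for this scheme.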
It should be noted that each unit in the apparatus may be configured to implement each step in the method, and achieve the corresponding technical effect, which is not described herein again.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
As shown in fig. 7, the electronic device may include: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., WI-FI, 4G, or 5G interfaces). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 7 does not constitute a limitation of the electronic device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 7, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and an image retrieval program.
In the electronic apparatus shown in fig. 7, the network interface 1004 is mainly used for data communication with an external network; the user interface 1003 is mainly used for receiving input instructions of a user; the electronic device calls an image retrieval program stored in the memory 1005 by the processor 1001, and performs the following operations:
receiving a retrieval request, wherein the retrieval request carries an image to be retrieved;
primarily screening images in an image library according to the image to be retrieved to obtain a first image set similar to the image to be retrieved;
calculating a first image feature vector of the image to be retrieved, and determining a second image set according to the first image feature vector and a second image feature vector of the image in the first image set;
and calculating a first image pixel clustering vector of the image to be retrieved and a second image pixel clustering vector of the image in the second image set, and obtaining a final similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector.
Optionally, the step of performing preliminary screening on the images in the image library according to the image to be retrieved to obtain a first image set similar to the image to be retrieved includes:
calculating the Hash code of the image to be retrieved according to a preset algorithm;
acquiring a central cluster hash code of an image cluster in the image library;
determining whether the image cluster is a similar cluster of the image to be retrieved according to the hash code of the image to be retrieved and the hash code of the central cluster of the image cluster;
and if the cluster is determined to be similar, determining the first image set according to the Hash codes of all the images in the image cluster and the Hash code of the image to be retrieved.
Optionally, the step of determining whether the image cluster is a similar cluster of the image to be retrieved according to the hash code of the image to be retrieved and the hash code of the center cluster of the image cluster includes:
calculating a first Hamming distance between the central Hash code of the image clustering cluster and the Hash code of the image to be retrieved;
and if the first Hamming distance is smaller than a first preset threshold value, determining the image cluster as a similar cluster of the image to be retrieved.
Optionally, the step of determining the first image set according to the hash codes of all the images in the image cluster and the hash code of the image to be retrieved includes:
traversing all images in the image cluster;
calculating a second Hamming distance between the Hash code of the first target image in the image cluster and the Hash code of the image to be retrieved;
and if the second Hamming distance is smaller than a second preset threshold value, determining that the first target image is a similar image of the image to be retrieved until all images in the image cluster are traversed, and obtaining the first image set.
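The two-level preliminary screening described in the steps above can be sketched as follows. The sketch assumes that hash codes are stored as Python integers whose bits form the code, and that each image cluster is a dictionary with a center hash and a mapping from index number to image hash; the names `hamming_distance` and `preliminary_screen` and this data layout are hypothetical, and the preset hashing algorithm itself is not shown.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded hash codes."""
    return bin(a ^ b).count("1")

def preliminary_screen(query_hash, clusters, first_threshold, second_threshold):
    """Return index numbers of images forming the first image set.

    clusters: list of {"center": int, "images": {index_number: hash_int}}
    (an assumed layout for precomputed hash codes).
    """
    first_set = []
    for cluster in clusters:
        # First Hamming distance: query vs. the cluster's central hash code.
        if hamming_distance(cluster["center"], query_hash) >= first_threshold:
            continue  # not a similar cluster, skip all of its images
        for index_number, image_hash in cluster["images"].items():
            # Second Hamming distance: query vs. each image in the cluster.
            if hamming_distance(image_hash, query_hash) < second_threshold:
                first_set.append(index_number)
    return first_set

if __name__ == "__main__":
    clusters = [
        {"center": 0b11110001, "images": {1: 0b11110000, 2: 0b00001111}},
        {"center": 0b00001111, "images": {3: 0b11110000}},
    ]
    print(preliminary_screen(0b11110000, clusters, 3, 2))
```

In the demo, image 3 is never compared even though its hash equals the query, because its cluster center is far from the query; this is the cost of comparing against cluster centers first, and it depends on how the first preset threshold and the clustering are chosen.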
Optionally, the step of determining a second image set according to the first image feature vector and a second image feature vector of an image in the first image set includes:
acquiring second image feature vectors of all images in the first image set;
respectively calculating a first cosine distance between a first image feature vector of the image to be retrieved and a second image feature vector of a second target image in the first image set;
and if the first cosine distance is smaller than a third preset threshold value, determining that the second target image is a similar image of the image to be retrieved until all images in the first image set are traversed to obtain the second image set.
Optionally, the step of calculating a first image pixel cluster vector of the image to be retrieved and a second image pixel cluster vector of the image in the second image set includes:
respectively carrying out image pixel value clustering processing on the RGB three channels of the image to be retrieved and the RGB three channels of the image in the second image set to obtain a first pixel value cluster of the image to be retrieved and a second pixel value cluster of the image in the second image set;
and calculating the central value of the first pixel value cluster and the central value of the second pixel value cluster to obtain the first image pixel cluster vector and the second image pixel cluster vector.
Optionally, the second image set includes a third target image, and the step of obtaining a final similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector includes:
respectively calculating a second cosine distance between the first image pixel cluster vector and a second image feature vector of the third target image;
and if the second cosine distance is smaller than a fourth preset threshold, determining that the third target image is a similar image of the image to be retrieved until all images in the second image set are traversed, and obtaining a final similar image of the image to be retrieved.
Optionally, the method further comprises the steps of:
and outputting the index number of the final similar image so as to obtain the image corresponding to the index number and the related information thereof from the image library according to the index number and display the image and the related information to a user.
According to this scheme, when an image retrieval request is received, the retrieval range within the image library is narrowed step by step through multiple rounds of multi-dimensional comparison, so that the images in the image library whose similarity to the image to be retrieved meets the required standard can be returned together with their related information. The retrieval precision is high, all images meeting the requirement can be retrieved, and the false recognition rate and the missing recognition rate of the retrieval result are low. The retrieved similar images are returned to the retrieval system in the form of index numbers. The retrieval system can therefore access the image library according to the index number, acquire the image and the image-related information corresponding to the index number, and display them on the display interface of the APP or website where the retrieval initiator is located.
Furthermore, an embodiment of the present invention further provides a computer-readable storage medium, where an image retrieval program is stored on the computer-readable storage medium, and when executed by a processor, the image retrieval program implements the following operations:
receiving a retrieval request, wherein the retrieval request carries an image to be retrieved;
primarily screening images in an image library according to the image to be retrieved to obtain a first image set similar to the image to be retrieved;
calculating a first image feature vector of the image to be retrieved, and determining a second image set according to the first image feature vector and a second image feature vector of the image in the first image set;
and calculating a first image pixel clustering vector of the image to be retrieved and a second image pixel clustering vector of the image in the second image set, and obtaining a final similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector.
Optionally, the step of performing preliminary screening on the images in the image library according to the image to be retrieved to obtain a first image set similar to the image to be retrieved includes:
calculating the Hash code of the image to be retrieved according to a preset algorithm;
acquiring a central cluster hash code of an image cluster in the image library;
determining whether the image cluster is a similar cluster of the image to be retrieved according to the hash code of the image to be retrieved and the hash code of the central cluster of the image cluster;
and if the cluster is determined to be similar, determining the first image set according to the Hash codes of all the images in the image cluster and the Hash code of the image to be retrieved.
Optionally, the step of determining whether the image cluster is a similar cluster of the image to be retrieved according to the hash code of the image to be retrieved and the hash code of the center cluster of the image cluster includes:
calculating a first Hamming distance between the central Hash code of the image cluster and the Hash code of the image to be retrieved;
and if the first Hamming distance is smaller than a first preset threshold value, determining the image cluster as a similar cluster of the image to be retrieved.
Optionally, the step of determining the first image set according to the hash codes of all the images in the image cluster and the hash code of the image to be retrieved includes:
traversing all images in the image cluster;
calculating a second Hamming distance between the Hash code of the first target image in the image cluster and the Hash code of the image to be retrieved;
and if the second Hamming distance is smaller than a second preset threshold value, determining that the first target image is a similar image of the image to be retrieved until all images in the image cluster are traversed, and obtaining the first image set.
Optionally, the step of determining a second image set according to the first image feature vector and a second image feature vector of an image in the first image set includes:
acquiring second image feature vectors of all images in the first image set;
respectively calculating a first cosine distance between a first image feature vector of the image to be retrieved and a second image feature vector of a second target image in the first image set;
and if the first cosine distance is smaller than a third preset threshold value, determining that the second target image is a similar image of the image to be retrieved until all images in the first image set are traversed to obtain the second image set.
Optionally, the step of calculating a first image pixel cluster vector of the image to be retrieved and a second image pixel cluster vector of the image in the second image set includes:
respectively carrying out image pixel value clustering processing on the RGB three channels of the image to be retrieved and the RGB three channels of the image in the second image set to obtain a first pixel value cluster of the image to be retrieved and a second pixel value cluster of the image in the second image set;
and calculating the central value of the first pixel value cluster and the central value of the second pixel value cluster to obtain the first image pixel cluster vector and the second image pixel cluster vector.
Optionally, the second image set includes a third target image, and the step of obtaining a final similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector includes:
respectively calculating a second cosine distance between the first image pixel cluster vector and a second image feature vector of the third target image;
and if the second cosine distance is smaller than a fourth preset threshold, determining that the third target image is a similar image of the image to be retrieved until all images in the second image set are traversed, and obtaining a final similar image of the image to be retrieved.
Optionally, the method further comprises the steps of:
and outputting the index number of the final similar image so as to obtain the image corresponding to the index number and the related information thereof from the image library according to the index number and display the image and the related information to a user.
According to this scheme, when an image retrieval request is received, the retrieval range within the image library is narrowed step by step through multiple rounds of multi-dimensional comparison, so that the images in the image library whose similarity to the image to be retrieved meets the required standard can be returned together with their related information. The retrieval precision is high, all images meeting the requirement can be retrieved, and the false recognition rate and the missing recognition rate of the retrieval result are low. The retrieved similar images are returned to the retrieval system in the form of index numbers. The retrieval system can therefore access the image library according to the index number, acquire the image and the image-related information corresponding to the index number, and display them on the display interface of the APP or website where the retrieval initiator is located.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, a controller, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An image retrieval method, characterized in that the method comprises the steps of:
the retrieval server receives a retrieval request sent by a retrieval system, wherein the retrieval request is received by the retrieval system from a user and carries an image to be retrieved;
the retrieval server calculates the Hash codes of the images to be retrieved according to a preset algorithm, obtains the Hash codes of the center clusters of the image clusters in the image library, determines whether the image clusters are similar clusters of the images to be retrieved according to the Hash codes of the images to be retrieved and the Hash codes of the center clusters of the image clusters, and determines a first image set according to the Hash codes of all the images in the image clusters and the Hash codes of the images to be retrieved if the image clusters are similar clusters; wherein a central cluster hash code of an image cluster in the image library has been pre-computed and stored in a database of the image library;
the retrieval server calculates a first image feature vector of the image to be retrieved, and determines a second image set according to the first image feature vector and a second image feature vector of the image in the first image set;
and the retrieval server calculates a first image pixel clustering vector of the image to be retrieved and a second image pixel clustering vector of the image in the second image set, and obtains a final similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector.
2. The method according to claim 1, wherein the step of determining whether the image cluster is a similar cluster of the image to be retrieved according to the hash code of the image to be retrieved and the hash code of the center cluster of the image cluster comprises:
the retrieval server calculates a first Hamming distance between the central Hash codes of the image clustering clusters and the Hash codes of the images to be retrieved one by one;
and if the first Hamming distance is smaller than a first preset threshold value, the retrieval server determines the image cluster as a similar cluster of the image to be retrieved.
3. The method according to claim 2, wherein the step of determining the first image set according to the hash codes of all the images in the image cluster and the hash code of the image to be retrieved comprises:
the retrieval server acquires hash codes of all images in the image clustering cluster; the Hash codes of all the images in the image clustering cluster are calculated in advance and stored in a database of the image database;
the retrieval server calculates a second Hamming distance between the Hash code of each target image in the image clustering cluster and the Hash code of the image to be retrieved one by one;
if the second Hamming distance is smaller than a second preset threshold value, the retrieval server determines that the first target image is an initial similar image of the image to be retrieved until all images in the image cluster are traversed; and combining all the initial similar images to obtain the first image set.
4. A method according to any one of claims 1 to 3, wherein the step of determining a second image set from the first image feature vector and a second image feature vector of an image in the first image set comprises:
the retrieval server acquires second image feature vectors of all images in the first image set; second image feature vectors of all images in the first image set are calculated in advance and stored in a database of the image library;
the retrieval server calculates a first cosine distance between a first image feature vector of the image to be retrieved and a second image feature vector of a second target image in the first image set one by one;
and if the first cosine distance is smaller than a third preset threshold value, the retrieval server determines that the second target image is an intermediate similar image of the image to be retrieved until all images in the first image set are traversed, and combines all intermediate similar images to obtain the second image set.
5. The method according to any one of claims 1 to 4, wherein the step of calculating a first image pixel cluster vector for the image to be retrieved and a second image pixel cluster vector for the image in the second image set comprises:
the retrieval server carries out image pixel value clustering processing on the RGB three channels of the image to be retrieved and the RGB three channels of the image in the second image set respectively to obtain a first pixel value cluster of the image to be retrieved and a second pixel value cluster of the image in the second image set;
and the retrieval server calculates the central value of the first pixel value cluster and the central value of the second pixel value cluster to obtain the first image pixel cluster vector and the second image pixel cluster vector.
6. The method according to claim 5, wherein the second image set comprises a third target image, and the step of obtaining a final similar image of the image to be retrieved according to the first image pixel cluster vector and the second image pixel cluster vector comprises:
the retrieval server respectively calculates a second cosine distance between the first image pixel clustering vector and a second image feature vector of the third target image;
if the second cosine distance is smaller than a fourth preset threshold, the retrieval server determines that the third target image is a similar image of the image to be retrieved until all images in the second image set are traversed, and a final similar image of the image to be retrieved is obtained.
7. The method according to claim 1, characterized in that the method further comprises the steps of:
and the retrieval server outputs the index number of the final similar image so that the retrieval system can acquire the image corresponding to the index number and the related information thereof from the image library according to the index number and display the image and the related information to the user.
8. An image retrieval apparatus, characterized in that the apparatus comprises:
the receiving unit is used for receiving a retrieval request sent by a retrieval system, wherein the retrieval request is received by the retrieval system from a user and carries an image to be retrieved;
the preliminary screening unit is used for calculating the Hash codes of the images to be retrieved according to a preset algorithm, acquiring the Hash codes of the center clusters of the images in the image library, determining whether the image cluster is a similar cluster of the images to be retrieved according to the Hash codes of the images to be retrieved and the Hash codes of the center clusters of the images, and determining a first image set according to the Hash codes of all the images in the image cluster and the Hash codes of the images to be retrieved if the image cluster is determined to be a similar cluster; wherein a central cluster hash code of an image cluster in the image library has been pre-computed and stored in a database of the image library;
the first processing unit is used for calculating a first image feature vector of the image to be retrieved and determining a second image set according to the first image feature vector and a second image feature vector of the image in the first image set;
and the second processing unit is used for calculating a first image pixel clustering vector of the image to be retrieved and a second image pixel clustering vector of the image in the second image set, and obtaining a final similar image of the image to be retrieved according to the first image pixel clustering vector and the second image pixel clustering vector.
9. An electronic device, characterized in that the electronic device comprises: a memory, a processor, and an image retrieval program stored on the memory and executable on the processor, the image retrieval program being configured to implement the steps of the image retrieval method according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the image retrieval method according to any one of claims 1 to 7.
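The coarse-to-fine screening recited in the claims above can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The claims only name "a preset algorithm" for the hash code, so an average hash is assumed here as a stand-in; the Hamming-distance radii, the cosine-similarity threshold, and the names `Cluster`, `prescreen` and `refine` are likewise hypothetical. The feature vectors and pixel clustering vectors are taken as given inputs, since the claims do not say how they are computed.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence

Image = List[List[int]]  # grayscale pixel rows, values 0..255


def average_hash(img: Image) -> int:
    """Assumed stand-in for the 'preset algorithm': one bit per pixel,
    set when the pixel is at least the mean brightness."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash codes."""
    return bin(a ^ b).count("1")


@dataclass
class Cluster:
    center_hash: int              # pre-computed and stored with the image library
    image_hashes: Dict[str, int]  # image id -> pre-computed hash code


def prescreen(query_hash: int, clusters: Sequence[Cluster],
              cluster_radius: int, image_radius: int) -> List[str]:
    """Preliminary screening: skip clusters whose center hash is far from
    the query, then compare the query against every image in the remaining
    (similar) clusters to build the first image set."""
    first_set: List[str] = []
    for cluster in clusters:
        if hamming(query_hash, cluster.center_hash) > cluster_radius:
            continue  # not a similar cluster; none of its images are examined
        for image_id, image_hash in cluster.image_hashes.items():
            if hamming(query_hash, image_hash) <= image_radius:
                first_set.append(image_id)
    return first_set


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0


def refine(query_vec: Sequence[float],
           candidates: Dict[str, Sequence[float]],
           min_similarity: float) -> List[str]:
    """Shared shape of the two later stages (feature vectors, then pixel
    clustering vectors): keep candidates whose vector is close to the query's."""
    return [image_id for image_id, vec in candidates.items()
            if cosine(query_vec, vec) >= min_similarity]
```

In this sketch the output of `prescreen` plays the role of the first image set. Calling `refine` on the feature vectors of those images would give the second image set, and calling it again on the pixel clustering vectors of the second set would give the final similar images. Comparing against cluster centers first means that most library images are never compared individually.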
CN202111176709.3A 2021-10-09 2021-10-09 Image retrieval method, device and storage medium Pending CN113886628A (en)

Priority Applications (1)

Application Number Publication Number Priority Date Filing Date Title
CN202111176709.3A CN113886628A (en) 2021-10-09 2021-10-09 Image retrieval method, device and storage medium

Applications Claiming Priority (1)

Application Number Publication Number Priority Date Filing Date Title
CN202111176709.3A CN113886628A (en) 2021-10-09 2021-10-09 Image retrieval method, device and storage medium

Publications (1)

Publication Number Publication Date
CN113886628A (en) 2022-01-04

Family

ID=79005841

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202111176709.3A Pending CN113886628A (en) 2021-10-09 2021-10-09 Image retrieval method, device and storage medium

Country Status (1)

Country Link
CN (1) CN113886628A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116910296A (en) * 2023-09-08 2023-10-20 上海任意门科技有限公司 Method, system, electronic device and medium for identifying transport content
CN116910296B (en) * 2023-09-08 2023-12-08 上海任意门科技有限公司 Method, system, electronic device and medium for identifying transport content

Similar Documents

Publication Publication Date Title
CN110019896B (en) Image retrieval method and device and electronic equipment
US9940655B2 (en) Image processing
CN109992601B (en) To-do information pushing method and device and computer equipment
US11310559B2 (en) Method and apparatus for recommending video
CN107590255B (en) Information pushing method and device
CN109168047B (en) Video recommendation method and device, server and storage medium
CN109922379B (en) Advertisement video optimization method, device and equipment and computer readable storage medium
CN112527972A (en) Intelligent customer service chat robot implementation method and system based on deep learning
CN110825894A (en) Data index establishing method, data index retrieving method, data index establishing device, data index retrieving device, data index establishing equipment and storage medium
CN111078842A (en) Method, device, server and storage medium for determining query result
CN113806588B (en) Method and device for searching video
KR101896404B1 (en) Product Recommendation System Using Computer Vision
CN113627411A (en) Super-resolution-based commodity identification and price matching method and system
CN114419501A (en) Video recommendation method and device, computer equipment and storage medium
CN110910215A (en) Product recommendation method, device, equipment and computer-readable storage medium
CN111143555A (en) Big data-based customer portrait generation method, device, equipment and storage medium
CN113139816A (en) Information processing method, device, electronic equipment and storage medium
CN113886628A (en) Image retrieval method, device and storage medium
CN113657504A (en) Image retrieval method, image retrieval device, computer equipment and storage medium
CN116127105B (en) Data collection method and device for big data platform
CN110191124B (en) Web front-end development data-based website identification method and device and storage equipment
JP4883719B2 (en) Similar image retrieval method and apparatus
CN113111206A (en) Image searching method and device, electronic equipment and storage medium
CN111460268B (en) Method and device for determining database query request and computer equipment
CN113641855A (en) Video recommendation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination