CN111753111A - Picture searching method and device - Google Patents

Info

Publication number
CN111753111A
Authority
CN
China
Prior art keywords
picture
group
cluster
pictures
pushed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910284715.7A
Other languages
Chinese (zh)
Inventor
李耀强
石元伟
王春明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201910284715.7A
Publication of CN111753111A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the application disclose a picture searching method and device. One embodiment of the method comprises: selecting at least one picture group from a preset picture group set based on a picture for search uploaded by a user, wherein each picture group is associated with a cluster group set obtained by product quantization, each cluster in the cluster groups of the cluster group set is associated with an inverted linked list, and the inverted linked list comprises the picture identifiers of the pictures in the picture group that are related to the cluster; selecting at least one picture, as a picture to be pushed, from the picture set formed by the pictures in the selected picture groups, based on the matching relation between the picture for search and the center points of the clusters in the cluster group set associated with each selected picture group, and on the inverted linked lists associated with those clusters; and generating a search result comprising the determined picture to be pushed. By introducing inverted linked lists, this embodiment can save picture searching time and improve picture searching efficiency.

Description

Picture searching method and device
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a picture searching method and device.
Background
With the rapid development of internet technology, users can perform not only text search but also picture search. For example, after obtaining a picture for search uploaded by a user, a search system may calculate the similarity between the picture for search and each picture in a database, then select at least one picture from the database as a search result according to the similarity calculation and return it to the user.
Disclosure of Invention
Embodiments of the application provide a picture searching method and device.
In a first aspect, an embodiment of the present application provides a picture searching method, where the method includes: selecting at least one picture group from a preset picture group set based on a picture for search uploaded by a user, wherein each picture group is associated with a cluster group set obtained by product quantization, each cluster in the cluster groups of the cluster group set is associated with an inverted linked list, and the inverted linked list comprises the picture identifiers of the pictures in the picture group that are related to the cluster; selecting at least one picture, as a picture to be pushed, from the picture set formed by the pictures in the selected picture groups, based on the matching relation between the picture for search and the center points of the clusters in the cluster group set associated with each selected picture group, and on the inverted linked lists associated with those clusters; and generating a search result comprising the determined picture to be pushed.
In some embodiments, the picture group is further associated with a first cluster, where the first cluster includes the picture feature vectors corresponding to the respective pictures in the picture group; and selecting at least one picture group from the preset picture group set based on the picture for search uploaded by the user includes: acquiring a first picture feature vector corresponding to the picture for search; determining the distance between the first picture feature vector and the center point of the first cluster associated with each picture group in the picture group set; and selecting at least one picture group from the picture group set based on the determined distances.
In some embodiments, selecting at least one picture group from the picture group set based on the determined distances includes: selecting, from the picture group set, a first preset number of picture groups whose associated first clusters have center points at the smallest distances from the first picture feature vector.
In some embodiments, selecting at least one picture as a picture to be pushed from the picture set formed by the pictures in the at least one picture group, based on the matching relation between the picture for search and the center points of the clusters in the cluster group set associated with each picture group in the at least one picture group and on the inverted linked lists associated with those clusters, includes: for each cluster included in each cluster group in the cluster group set associated with each picture group in the at least one picture group, determining the distance between the center point of the cluster and the first picture feature vector; traversing each picture identifier in the inverted linked list associated with the cluster, and performing a distance accumulation operation on the picture indicated by the picture identifier based on that distance; and selecting at least one picture from the picture set as a picture to be pushed based on the accumulated distance obtained for each picture in the picture set.
In some embodiments, selecting at least one picture from the picture set as a picture to be pushed based on the accumulated distance obtained for each picture in the picture set includes: selecting, from the picture set, a second preset number of pictures with the smallest accumulated distances as the pictures to be pushed.
In some embodiments, generating a search result including the determined pictures to be pushed includes: determining the distance between the picture feature vector corresponding to each picture to be pushed and the first picture feature vector; sorting the determined pictures to be pushed in ascending order of distance to obtain a sequence of pictures to be pushed; and generating a search result including the sequence of pictures to be pushed.
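The re-ranking step above can be sketched as follows. This is a minimal illustration rather than the application's implementation: it assumes Euclidean distance and in-memory feature vectors, and the function name `rerank_by_exact_distance` and the sample picture identifiers are hypothetical.

```python
import math

def rerank_by_exact_distance(query_vector, candidates):
    """Sort candidate pictures by exact distance to the query, smallest first.

    `candidates` maps a picture identifier to its full picture feature vector.
    Euclidean distance is an assumption; other metrics could be substituted.
    """
    def distance(picture_id):
        return math.dist(query_vector, candidates[picture_id])
    return sorted(candidates, key=distance)

# Hypothetical pictures to be pushed and their picture feature vectors.
to_push = {"pic_7": [0.9, 0.1], "pic_3": [0.2, 0.2], "pic_5": [0.5, 0.4]}
first_picture_feature_vector = [0.0, 0.0]

# The search result carries the sequence of pictures to be pushed.
search_result = {"pictures": rerank_by_exact_distance(first_picture_feature_vector, to_push)}
```

Because only the pictures already chosen for pushing are compared exactly, this step touches a small candidate list rather than the whole picture library.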
In some embodiments, the above method further comprises: and returning the search result to the terminal equipment used by the user.
In a second aspect, an embodiment of the present application provides a picture searching device, including: a first selecting unit configured to select at least one picture group from a preset picture group set based on a picture for search uploaded by a user, wherein each picture group is associated with a cluster group set obtained by product quantization, each cluster in the cluster groups of the cluster group set is associated with an inverted linked list, and the inverted linked list comprises the picture identifiers of the pictures in the picture group that are related to the cluster; a second selecting unit configured to select at least one picture, as a picture to be pushed, from the picture set formed by the pictures in the selected picture groups, based on the matching relation between the picture for search and the center points of the clusters in the cluster group set associated with each selected picture group, and on the inverted linked lists associated with those clusters; and a generating unit configured to generate a search result including the determined picture to be pushed.
In some embodiments, the group of pictures is further associated with a first cluster, where the first cluster includes picture feature vectors corresponding to respective pictures in the group of pictures; and the first selecting unit comprises: an acquisition subunit configured to acquire a first picture feature vector corresponding to a search picture; a determining subunit configured to determine a distance between a center point of a first class cluster associated with each group of pictures in the group of pictures set and the first picture feature vector; a first selecting sub-unit configured to select at least one group of pictures from the group of pictures set based on the determined distance.
In some embodiments, the first selecting subunit is further configured to: select, from the picture group set, a first preset number of picture groups whose associated first clusters have center points at the smallest distances from the first picture feature vector.
In some embodiments, the second selecting unit includes: a distance accumulation subunit configured to determine, for each class cluster included in each class cluster group in the class cluster group set associated with each picture group in the at least one picture group, a distance between a center point of the class cluster and the first picture feature vector; traversing each picture identifier in the inverted list associated with the cluster, and performing distance accumulation operation on the picture indicated by the picture identifier based on the distance between the center point of the cluster and the first picture feature vector; and the second selecting subunit is configured to select at least one picture from the picture set as a picture to be pushed based on the distance obtained by the distance accumulation operation corresponding to each picture in the picture set.
In some embodiments, the second selecting subunit is further configured to: select, from the picture set, a second preset number of pictures with the smallest accumulated distances as the pictures to be pushed.
In some embodiments, the generating unit is further configured to: determine the distance between the picture feature vector corresponding to each picture to be pushed and the first picture feature vector; sort the determined pictures to be pushed in ascending order of distance to obtain a sequence of pictures to be pushed; and generate a search result including the sequence of pictures to be pushed.
In some embodiments, the above apparatus further comprises: and the sending unit is configured to return the search result to the terminal equipment used by the user.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage device having one or more programs stored thereon, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.
In a fourth aspect, the present application provides a computer-readable medium, on which a computer program is stored, which when executed by a processor implements the method described in any implementation manner of the first aspect.
The picture searching method and device provided by the embodiments of the application select at least one picture group from a preset picture group set based on a picture for search uploaded by a user, wherein each picture group is associated with a cluster group set obtained by product quantization, each cluster in the cluster groups of the cluster group set is associated with an inverted linked list, and the inverted linked list comprises the picture identifiers of the pictures in the picture group that are related to the cluster. Then, based on the matching relation between the picture for search and the center points of the clusters in the cluster group set associated with each selected picture group, and on the inverted linked lists associated with those clusters, at least one picture is selected from the picture set formed by the pictures in the selected picture groups as a picture to be pushed, so as to generate a search result including the determined picture to be pushed. Because inverted linked lists are introduced, the matching relation between each picture in the picture set and the picture for search can be determined quickly, so pictures similar to the picture for search (such as the determined pictures to be pushed) can be found quickly, which saves picture searching time and improves picture searching efficiency.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram to which some embodiments of the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a picture search method according to the present application;
FIG. 3 is a diagram illustrating an application scenario of the picture search method according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of a picture search method according to the present application;
FIG. 5 is a schematic diagram of an embodiment of an image search apparatus according to the present application;
FIG. 6 is a schematic block diagram of a computer system suitable for use with the electronic device to implement some embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the picture search method or picture search apparatus of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a web browser application, an information search application, and the like.
The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smart phones, tablet computers, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. No particular limitation is imposed here.
The server 105 may be a server providing various services, such as a picture search server for performing picture search, and the picture search server may receive pictures for search uploaded by users through the terminal devices 101, 102, and 103, perform corresponding search operations based on the pictures for search, and return search results to the terminal devices.
It should be noted that, the image search method provided by some embodiments of the present application is generally executed by the server 105, and accordingly, the image search apparatus is generally disposed in the server 105.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. No particular limitation is imposed here.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a picture search method according to the present application is shown. The process 200 of the picture searching method includes the following steps:
step 201, based on the picture for search uploaded by the user, at least one picture group is selected from a preset picture group set.
In the present embodiment, the execution subject of the picture search method may be a server (e.g., the server 105 shown in fig. 1). The server can receive, in real time, a picture for search uploaded by a user through a terminal device (such as the terminal devices 101, 102 and 103 shown in fig. 1), and then select at least one picture group from a preset picture group set based on the picture for search. Each picture group in the picture group set may be associated with a cluster group set obtained by product quantization. Each cluster included in each cluster group in the cluster group set may be associated with an inverted linked list. The inverted linked list may include the picture identifiers of the pictures in the picture group that are related to the cluster.
The picture group set may be stored locally on the server or in a picture storage device connected to the server; no particular limitation is imposed here. In addition, the picture groups may be obtained by clustering the pictures in a picture library with a clustering algorithm (e.g., the K-means clustering algorithm). Here K denotes the number of clusters and "means" refers to the mean: K-means is an algorithm that clusters data points around cluster means.
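As an illustration of how picture groups might be formed, the following sketch clusters picture feature vectors with a plain K-means loop. It makes several assumptions not stated in the application: feature vectors are short Python lists, the first k vectors seed the centers so the run is deterministic, and all names (`kmeans`, `feature_vectors`, the `p1`..`p4` identifiers) are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def kmeans(vectors, k, max_iterations=50):
    """Lloyd's algorithm, seeded with the first k vectors (an assumption)."""
    centers = [list(v) for v in vectors[:k]]
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each vector joins its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(v, centers[c]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # converged: centers already match this assignment
        assignment = new_assignment
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centers[c] = mean_vector(members)
    return centers, assignment

# Hypothetical picture library: picture identifier -> picture feature vector.
feature_vectors = {"p1": [0.0, 0.1], "p2": [5.0, 5.1], "p3": [0.2, 0.0], "p4": [5.2, 4.9]}
picture_ids = list(feature_vectors)
centers, assignment = kmeans([feature_vectors[i] for i in picture_ids], k=2)

# Each cluster index becomes one picture group.
groups = {}
for picture_id, label in zip(picture_ids, assignment):
    groups.setdefault(label, []).append(picture_id)
```

In this toy run the two nearby pairs end up in separate picture groups, and each group's center is the mean of its members' feature vectors.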
The cluster group set associated with each picture group in the picture group set may be obtained by applying a Product Quantization (PQ) algorithm to the pictures in the picture group. Product quantization builds on Vector Quantization (VQ) and is used to speed up picture retrieval.
In particular, for each picture group in the picture group set, each picture in the picture group may correspond to a picture feature vector. The picture feature vector may be used to characterize the picture to which it corresponds. When performing product quantization on the pictures in the picture group, the vector space in which the picture feature vectors are located may first be divided into a plurality of subspaces. Then, the picture feature vector corresponding to each picture can be divided into a plurality of sub-feature vectors based on the plurality of subspaces. Then, for each subspace of the plurality of subspaces, the sub-feature vectors belonging to the subspace may be clustered by a clustering algorithm, thereby obtaining a cluster group corresponding to the subspace. Finally, the cluster groups respectively corresponding to the plurality of subspaces may be combined into the cluster group set corresponding to the picture group.
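The product quantization steps above can be sketched as follows. This is a simplified illustration under stated assumptions: subspaces are equal-length contiguous slices of the feature vector, the per-subspace clustering is a tiny K-means seeded with the first k sub-feature vectors, and every function name is hypothetical.

```python
def split_vector(vector, num_subspaces):
    """Cut a feature vector into equal-length sub-feature vectors."""
    size = len(vector) // num_subspaces
    return [vector[i * size:(i + 1) * size] for i in range(num_subspaces)]

def nearest(point, centers):
    return min(range(len(centers)),
               key=lambda c: sum((x - y) ** 2 for x, y in zip(point, centers[c])))

def cluster(points, k, rounds=10):
    """Tiny K-means seeded with the first k points (an assumption)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(rounds):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    # Final labels are recomputed so they agree with the returned centers.
    return centers, [nearest(p, centers) for p in points]

def build_cluster_group_set(vectors, num_subspaces, k):
    """Return one cluster group, as (centers, labels), per subspace."""
    split = [split_vector(v, num_subspaces) for v in vectors]
    cluster_group_set = []
    for s in range(num_subspaces):
        sub_feature_vectors = [parts[s] for parts in split]
        cluster_group_set.append(cluster(sub_feature_vectors, k))
    return cluster_group_set

# Four hypothetical 4-dimensional picture feature vectors in one picture group.
vectors = [[0.0, 0.0, 9.0, 9.0],
           [0.1, 0.0, 1.0, 1.0],
           [5.0, 5.0, 9.1, 9.0],
           [5.1, 5.0, 1.0, 1.1]]
cluster_group_set = build_cluster_group_set(vectors, num_subspaces=2, k=2)
```

Note that the two subspaces group the same four pictures differently: the first halves pair pictures 0 and 1, while the second halves pair pictures 0 and 2.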
For the inverted linked list associated with each class cluster included in each class cluster group in the class cluster group set, the inverted linked list may be formed by picture identifiers of pictures to which each sub-feature vector included in the class cluster belongs respectively. In addition, the cluster may also correspond to a center point. The center point may for example be the average of the individual sub-feature vectors comprised by the class cluster.
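A minimal sketch of building the inverted linked lists and center points for one subspace is shown below. It assumes the cluster label of every sub-feature vector is already known, and it uses ordinary Python lists in place of linked lists; the function name `build_inverted_lists` and the sample data are hypothetical.

```python
from collections import defaultdict

def build_inverted_lists(picture_ids, sub_feature_vectors, labels):
    """For one subspace, map each cluster index to the identifiers of the
    pictures whose sub-feature vector belongs to that cluster, and compute
    each cluster's center point as the mean of its sub-feature vectors."""
    inverted = defaultdict(list)
    members = defaultdict(list)
    for picture_id, sub_vector, label in zip(picture_ids, sub_feature_vectors, labels):
        inverted[label].append(picture_id)
        members[label].append(sub_vector)
    centers = {label: [sum(col) / len(vs) for col in zip(*vs)]
               for label, vs in members.items()}
    return dict(inverted), centers

picture_ids = ["pic_1", "pic_2", "pic_3", "pic_4"]
sub_feature_vectors = [[0.0, 0.0], [4.0, 4.0], [2.0, 0.0], [6.0, 4.0]]
labels = [0, 1, 0, 1]  # cluster index of each sub-feature vector
inverted_lists, center_points = build_inverted_lists(picture_ids, sub_feature_vectors, labels)
```

Each picture identifier appears in exactly one inverted list per subspace, which is what later allows a per-cluster distance to be applied to all of that cluster's pictures in one pass.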
In this embodiment, the server may adopt various selection methods to select at least one group of pictures from the group of pictures set based on the pictures for search uploaded by the user. As an example, each group of pictures in the group of pictures set may correspond to a picture category. The server may analyze the content of the picture for search by using various analysis methods, and determine the picture type to which the picture for search belongs based on the analysis result. Then, the server can select a group of pictures corresponding to the same picture category as the picture for searching from the group of pictures set.
Optionally, each picture group in the picture group set may also be associated with a first cluster. The first cluster may include the picture feature vectors corresponding to the respective pictures in the picture group. In addition, the first cluster may correspond to a center point, which may be, for example, the average of the picture feature vectors in the first cluster. To improve the relevance of the picture groups selected from the picture group set, the server may select as follows: acquiring a first picture feature vector corresponding to the picture for search; determining the distance between the first picture feature vector and the center point of the first cluster associated with each picture group in the picture group set; and selecting at least one picture group from the picture group set based on the determined distances. For example, a first preset number (e.g., 1, 2, or 3) of picture groups whose associated first clusters have center points at the smallest distances from the first picture feature vector are selected from the picture group set. Here, the first preset number may be set according to actual needs, and is not particularly limited.
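The group selection described above amounts to a nearest-center lookup, sketched below. The sketch assumes Euclidean distance and a dictionary from group identifier to the center point of the group's first cluster; the function name and the group identifiers are hypothetical.

```python
import heapq
import math

def select_picture_groups(first_picture_feature_vector, group_centers, first_preset_number):
    """Return the identifiers of the picture groups whose first-cluster center
    points are closest to the query's picture feature vector, nearest first."""
    return heapq.nsmallest(
        first_preset_number,
        group_centers,
        key=lambda group_id: math.dist(first_picture_feature_vector,
                                       group_centers[group_id]),
    )

# Hypothetical picture groups and the center points of their first clusters.
group_centers = {"shoes": [0.0, 1.0], "bags": [4.0, 4.0], "hats": [1.0, 1.0]}
selected_groups = select_picture_groups([0.2, 0.9], group_centers, first_preset_number=2)
```

Only the pictures in the selected groups are examined in the later steps, so a small first preset number narrows the search at the cost of possibly missing similar pictures that fell into other groups.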
It should be noted that the server may use various picture feature extraction algorithms (such as a convolutional neural network or a local binary pattern algorithm) to extract features from the picture for search, and assemble the extracted features into the first picture feature vector corresponding to the picture for search. In addition, the server may determine the distance between the center point of the first cluster and the first picture feature vector by using various distance measures (e.g., Euclidean distance, Manhattan distance, Minkowski distance, etc.).
Note that a greater distance between two vectors indicates lower similarity between them, and a smaller distance indicates higher similarity.
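The distance measures named above are closely related: the Minkowski distance of order p reduces to the Manhattan distance when p = 1 and to the Euclidean distance when p = 2. A short sketch, with hypothetical function names and sample vectors:

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def manhattan_distance(a, b):
    return minkowski_distance(a, b, 1)

def euclidean_distance(a, b):
    return minkowski_distance(a, b, 2)

center_point = [0.0, 0.0]
first_picture_feature_vector = [3.0, 4.0]

manhattan = manhattan_distance(center_point, first_picture_feature_vector)
euclidean = euclidean_distance(center_point, first_picture_feature_vector)
```

For these two vectors the Manhattan distance is 7 and the Euclidean distance is 5; whichever measure is chosen, it should be used consistently for group selection and for the later distance accumulation.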
Step 202, based on the matching relationship between the picture for search and the center point of the cluster included in the cluster group set associated with each picture group in at least one picture group and the inverted linked list associated with the cluster, selecting at least one picture from the picture set formed by the pictures in each picture group as the picture to be pushed.
In this embodiment, the server may select at least one picture from the picture sets formed by the pictures in the at least one picture group as the picture to be pushed based on a matching relationship between the picture for search and a central point of a class cluster included in a class cluster group set associated with each picture group in the at least one picture group and an inverted linked list associated with the class cluster.
As an example, the server may determine the spatial distance between each picture in the picture set and the picture for search, based on the matching relationship between the picture for search and the center points of the clusters included in the cluster group set associated with each picture group in the at least one picture group and on the inverted linked lists associated with those clusters. This distance can be used to measure the similarity between the picture and the picture for search. The server can then select at least one picture from the picture set as a picture to be pushed based on the determined distances.
Specifically, for each cluster included in each cluster group in the cluster group set associated with each group of pictures in the at least one group of pictures, the server may determine the distance between the center point of the cluster and the first picture feature vector by using the distance measurement method described above. Then, for each class cluster included in each class cluster group in the class cluster group set associated with each group of pictures in the at least one group of pictures, the server may traverse the inverted linked list associated with the class cluster, and read the distance between the previously determined center point of the class cluster and the first picture feature vector. When traversing to each picture identifier in the inverted list, the server may perform a distance accumulation operation on the picture indicated by the picture identifier based on the distance. Then, the server may select at least one picture from the picture set as a picture to be pushed based on the distance obtained by the distance accumulation operation corresponding to each picture in the picture set. For example, a second preset number (e.g., 50, 100, or 200, etc.) of pictures with the smallest distance obtained by the distance accumulation operation are selected from the picture set as the pictures to be pushed. Here, the second preset number may be set according to actual needs, and is not particularly limited herein.
The distance obtained by the distance accumulation operation for each picture in the picture set may serve as the spatial distance between that picture and the picture for search. In effect, it is the sum, over the sub-feature vectors of the picture's feature vector, of the distance between the first picture feature vector and the center point of the cluster to which each sub-feature vector belongs.
It should be noted that, with inverted linked lists, the distance between a cluster's center point and the first picture feature vector is read once per cluster rather than repeatedly for every picture when determining the distances that measure the similarity between each picture in the picture set and the picture for search. This effectively reduces the time consumed by picture searching, shortens the user's waiting time, and improves the user experience.
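The distance accumulation over inverted linked lists can be sketched as follows for a single picture group. Several points are assumptions of this sketch rather than statements of the application: the query vector is split into sub-vectors and each is compared with the center points of its own subspace, squared Euclidean distance is used, Python lists stand in for linked lists, and all names are hypothetical.

```python
import heapq
from collections import defaultdict

def split_vector(vector, num_subspaces):
    size = len(vector) // num_subspaces
    return [vector[i * size:(i + 1) * size] for i in range(num_subspaces)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def accumulate_distances(query_vector, cluster_group_set):
    """cluster_group_set holds one cluster group per subspace; each cluster
    group is a list of (center_point, inverted_list) pairs."""
    query_parts = split_vector(query_vector, len(cluster_group_set))
    accumulated = defaultdict(float)
    for query_part, cluster_group in zip(query_parts, cluster_group_set):
        for center_point, inverted_list in cluster_group:
            # Computed once per cluster, then reused for every picture
            # identifier found in that cluster's inverted list.
            d = squared_distance(query_part, center_point)
            for picture_id in inverted_list:
                accumulated[picture_id] += d
    return dict(accumulated)

def pictures_to_push(accumulated, second_preset_number):
    """Pick the pictures with the smallest accumulated distances."""
    return heapq.nsmallest(second_preset_number, accumulated, key=accumulated.get)

cluster_group_set = [
    [([0.0, 0.0], ["pic_1", "pic_2"]), ([4.0, 4.0], ["pic_3"])],  # subspace 0
    [([1.0, 1.0], ["pic_1", "pic_3"]), ([5.0, 5.0], ["pic_2"])],  # subspace 1
]
accumulated = accumulate_distances([0.0, 0.0, 2.0, 2.0], cluster_group_set)
to_push = pictures_to_push(accumulated, second_preset_number=2)
```

When several picture groups are selected, the same accumulation would be run over each group's cluster group set and the results merged before the smallest accumulated distances are chosen.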
Step 203, generating a search result comprising the determined picture to be pushed.
In this embodiment, the server may generate a search result including the determined picture to be pushed. As an example, the server may first sort the determined pictures to be pushed, and obtain a picture sequence to be pushed. For example, the server may sort the pictures to be pushed based on an order from small to large of distances obtained through the distance accumulation operation corresponding to the pictures to be pushed. Then, the server may generate a search result including the sequence of pictures to be pushed.
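A minimal sketch of generating the search result from accumulated distances is given below. The result layout (a dictionary with a `pictures` key) and the JSON serialization are assumptions made for illustration; the application does not specify a result format.

```python
import json

def generate_search_result(accumulated_distances):
    """Order the pictures to be pushed by accumulated distance, smallest
    first, and wrap the sequence in a result structure."""
    sequence = sorted(accumulated_distances, key=accumulated_distances.get)
    return {"pictures": sequence}

# Hypothetical accumulated distances of the pictures to be pushed.
accumulated_distances = {"pic_9": 18.0, "pic_4": 2.0, "pic_6": 34.0}
search_result = generate_search_result(accumulated_distances)

# One possible payload to return to the user's terminal device.
payload = json.dumps(search_result)
```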
In some optional implementation manners of this embodiment, the server may further return the search result to the terminal device used by the user through a wired connection manner or a wireless connection manner.
With continuing reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the picture search method according to the present embodiment. In the application scenario of fig. 3, a user may send picture A for search to a picture search server through a terminal device, as indicated by reference numeral 301. The picture search server may locally store a picture group set. Each picture group in the picture group set may be associated with a cluster group set obtained by product quantization. Each cluster included in each cluster group in the cluster group set may be associated with an inverted linked list. The inverted linked list may include the picture identifiers of the pictures in the picture group that are related to the cluster. After receiving picture A for search, the picture search server may select at least one picture group from the picture group set based on picture A, as indicated by reference numeral 302. Then, as indicated by reference numeral 303, the picture search server may determine the spatial distance between picture A and each picture in the picture set formed by the pictures in the selected picture groups, based on the matching relationship between picture A and the center points of the clusters included in the cluster group set associated with each selected picture group and on the inverted linked lists associated with those clusters. Then, as shown by reference numeral 304, the picture search server may select at least one picture from the picture set as a picture to be pushed, taking the determined distances in ascending order. Then, as indicated by reference numeral 305, the picture search server may generate a search result including the determined picture to be pushed. Finally, as indicated by reference numeral 306, the picture search server may return the search result to the terminal device.
The method provided by the above embodiment of the application selects at least one picture group from a preset picture group set based on a picture for search uploaded by a user. Each picture group is associated with a cluster group set obtained by product quantization, each cluster included in a cluster group of the cluster group set is associated with an inverted linked list, and the inverted linked list includes the picture identifiers of the pictures in the picture group that are related to the cluster. The method then selects at least one picture from the picture set formed by the pictures in the selected picture groups as a picture to be pushed, based on the matching relationship between the picture for search and the center points of the clusters included in the cluster group set associated with each selected picture group, and on the inverted linked lists associated with those clusters, and generates a search result including the determined picture to be pushed. Because the scheme introduces inverted linked lists, the matching relationship between each picture in the picture set and the picture for search can be determined quickly, so pictures similar to the picture for search (such as the determined picture to be pushed) can be found quickly, which saves picture search time and improves picture search efficiency.
With further reference to fig. 4, a flow 400 of yet another embodiment of a picture search method is shown. The process 400 of the picture searching method includes the following steps:
step 401, obtaining a first picture feature vector corresponding to a picture for search uploaded by a user.
In the present embodiment, the execution subject of the picture search method may be a server (e.g., the server 105 shown in fig. 1). The server can receive, in real time, a picture for search uploaded by the user through a terminal device (such as the terminal devices 101, 102, 103 shown in fig. 1). The server may then acquire a first picture feature vector corresponding to the picture for search. Here, the server may perform feature extraction on the picture for search by using various picture feature extraction algorithms (e.g., a convolutional neural network, a local binary pattern algorithm, etc.), and generate the first picture feature vector corresponding to the picture for search from the extracted features.
It should be noted that the picture feature vector may be used to characterize the picture corresponding to the picture feature vector.
Step 402, determining a distance between a center point of a first cluster associated with each group of pictures in a preset group of pictures set and a first picture feature vector.
In this embodiment, each picture group in the preset picture group set may be associated with a first cluster. The first cluster may include the picture feature vectors corresponding to the respective pictures in the picture group. In addition, the picture group may also be associated with a cluster group set obtained by product quantization. Each cluster included in the cluster group set may be associated with an inverted linked list, and the inverted linked list may include the picture identifiers of the pictures in the picture group that are related to the cluster. For an explanation of the first cluster, the cluster group set, and the inverted linked list, reference may be made to the relevant description in the embodiment shown in fig. 2; details are not repeated here.
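The association between clusters and inverted linked lists described above can be illustrated with a minimal sketch. It rests on assumptions that the text does not state: each feature vector is split into equal-length segments, each cluster group holds the center points for one segment, and a picture counts as "related to" a cluster when that cluster's center point is the nearest one for the corresponding segment. The names split, nearest, and build_inverted_lists are hypothetical.

```python
import math
from collections import defaultdict

def split(vec, m):
    """Split a vector into m equal-length segments (assumed PQ layout)."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]

def nearest(sub, centers):
    """Index of the center point closest to the segment (Euclidean)."""
    return min(range(len(centers)), key=lambda j: math.dist(sub, centers[j]))

def build_inverted_lists(pictures, cluster_groups):
    """pictures: {picture_id: feature_vector}.
    cluster_groups: one list of center points per segment.
    Returns {(group_index, cluster_index): [picture_id, ...]}."""
    inverted = defaultdict(list)
    for pid, vec in pictures.items():
        for g, sub in enumerate(split(vec, len(cluster_groups))):
            inverted[(g, nearest(sub, cluster_groups[g]))].append(pid)
    return dict(inverted)

# Toy picture group with three 4-dimensional feature vectors.
pictures = {
    "p1": [0.0, 0.1, 1.0, 1.0],
    "p2": [0.9, 1.0, 0.0, 0.1],
    "p3": [0.1, 0.0, 0.9, 1.1],
}
cluster_groups = [
    [[0.0, 0.0], [1.0, 1.0]],  # cluster group for segment 0
    [[0.0, 0.0], [1.0, 1.0]],  # cluster group for segment 1
]
inverted = build_inverted_lists(pictures, cluster_groups)
```

Each inverted list here is a plain Python list keyed by (cluster group index, cluster index); a linked list or any other sequence of picture identifiers would serve the same purpose.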
In this embodiment, after obtaining the first picture feature vector, the server may determine the distance between the center point of the first cluster associated with each picture group in the picture group set and the first picture feature vector by using various distance measurement methods (e.g., Euclidean distance, Manhattan distance, Minkowski distance, etc.).
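As an illustration of the distance measures mentioned above, the following sketch uses the fact that the Minkowski distance of order p reduces to the Manhattan distance for p = 1 and to the Euclidean distance for p = 2. The function names are illustrative only and do not come from the application.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def manhattan(a, b):
    """Manhattan distance: Minkowski distance with p = 1."""
    return minkowski(a, b, 1)

def euclidean(a, b):
    """Euclidean distance: Minkowski distance with p = 2."""
    return minkowski(a, b, 2)

center_point = [0.0, 0.0]
first_picture_feature_vector = [3.0, 4.0]
d_euclidean = euclidean(center_point, first_picture_feature_vector)
d_manhattan = manhattan(center_point, first_picture_feature_vector)
```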
Step 403, selecting at least one group of pictures from the group of pictures set based on the determined distance.
In this embodiment, after performing step 402, the server may select at least one picture group from the picture group set based on the determined distances. For example, the server may select, from the picture group set, a first preset number (e.g., 1, 2, or 3) of picture groups for which the distance between the center point of the associated first cluster and the first picture feature vector is smallest. Here, the first preset number may be set according to actual needs, and is not particularly limited herein.
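Steps 402 and 403 can be sketched as a nearest-center selection. The sketch assumes Euclidean distance and represents each picture group only by the center point of its first cluster; select_groups and the group identifiers are hypothetical names.

```python
import heapq
import math

def select_groups(query, group_centers, first_preset_number):
    """Return the ids of the picture groups whose first-cluster center
    points are closest to the query feature vector."""
    return heapq.nsmallest(
        first_preset_number,
        group_centers,  # iterates over group ids
        key=lambda gid: math.dist(query, group_centers[gid]),
    )

# Center point of the first cluster associated with each picture group.
group_centers = {"g1": [0.0, 0.0], "g2": [5.0, 5.0], "g3": [1.0, 1.0]}
selected = select_groups([0.2, 0.1], group_centers, first_preset_number=2)
```

A larger first preset number examines more picture groups, which tends to improve recall at the cost of more work in the later steps.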
Step 404, for each cluster included in each cluster group in the cluster group set associated with each group of pictures in the at least one group of pictures, determining a distance between a center point of the cluster and the first picture feature vector, traversing each picture identifier in the inverted linked list associated with the cluster, and performing a distance accumulation operation on the pictures indicated by the picture identifiers based on the distance between the center point of the cluster and the first picture feature vector.
In this embodiment, for each cluster included in each cluster group in the cluster group set associated with each group of pictures in the at least one group of pictures, the server may determine the distance between the center point of the cluster and the first picture feature vector by using the distance measurement method described above. Then, the server may traverse each picture identifier in the inverted link list associated with the cluster, and perform distance accumulation operation on the picture indicated by the picture identifier based on the distance between the center point of the cluster and the first picture feature vector.
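The distance accumulation of step 404 can be sketched as follows. Two choices here are assumptions rather than statements from the text: the center point of each cluster is compared with the matching segment of the first picture feature vector (as in common product quantization schemes), and squared Euclidean distances are accumulated so that the total approximates the squared distance to the whole vector. The function names are hypothetical.

```python
from collections import defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def accumulate_distances(query, cluster_groups, inverted):
    """For every cluster, compute its distance to the query once, then add
    that distance to each picture listed in the cluster's inverted list."""
    step = len(query) // len(cluster_groups)
    accumulated = defaultdict(float)
    for g, centers in enumerate(cluster_groups):
        sub = query[g * step:(g + 1) * step]  # assumed: matching segment
        for c, center in enumerate(centers):
            d = sq_dist(sub, center)
            for pid in inverted.get((g, c), []):
                accumulated[pid] += d
    return dict(accumulated)

cluster_groups = [
    [[0.0, 0.0], [1.0, 1.0]],  # cluster group for segment 0
    [[0.0, 0.0], [1.0, 1.0]],  # cluster group for segment 1
]
# Inverted lists keyed by (cluster group index, cluster index).
inverted = {
    (0, 0): ["p1", "p3"], (0, 1): ["p2"],
    (1, 0): ["p2"], (1, 1): ["p1", "p3"],
}
accumulated = accumulate_distances([0.0, 0.0, 1.0, 1.0], cluster_groups, inverted)
```

The point of the inverted lists is that each center-point distance is computed once per cluster and reused for every picture in that cluster's list, instead of being recomputed per picture.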
Step 405, for a picture set formed by pictures in each picture group in at least one picture group, selecting at least one picture from the picture set as a picture to be pushed based on the distance obtained by the distance accumulation operation corresponding to each picture in the picture set.
In this embodiment, for a picture set formed by pictures in each of the at least one picture group, the server may select at least one picture from the picture set as a picture to be pushed based on a distance obtained by a distance accumulation operation corresponding to each picture in the picture set. For example, the server may select, from the picture set, a second preset number (e.g., 50, 100, or 200, etc.) of pictures with the smallest distance obtained through the distance accumulation operation, as the pictures to be pushed. Here, the second preset number may be set according to actual needs, and is not particularly limited herein.
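Selecting the second preset number of pictures with the smallest accumulated distance can be sketched with a heap-based selection. The accumulated distances below are made-up values for illustration, and select_candidates is a hypothetical name.

```python
import heapq

def select_candidates(accumulated, second_preset_number):
    """Return the picture ids with the smallest accumulated distances."""
    return heapq.nsmallest(second_preset_number, accumulated, key=accumulated.get)

# Illustrative accumulated distances per picture identifier.
accumulated = {"p1": 0.5, "p2": 4.0, "p3": 0.1}
to_push = select_candidates(accumulated, second_preset_number=2)
```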
Step 406, determining a distance between the picture feature vector corresponding to each to-be-pushed picture and the first picture feature vector.
In this embodiment, after determining the to-be-pushed picture corresponding to the search picture, the server may determine the distance between the picture feature vector corresponding to each to-be-pushed picture and the first picture feature vector by using the distance measurement method described above.
Step 407, sorting the determined pictures to be pushed in ascending order of distance to obtain a picture sequence to be pushed.
In this embodiment, the server may sort the determined pictures to be pushed in ascending order of the distance determined in step 406, so as to obtain a picture sequence to be pushed. In this way, the pictures to be pushed that are most similar to the picture for search are placed first.
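Steps 406 and 407 amount to re-ranking the candidates by their exact distance to the first picture feature vector. A minimal sketch, assuming Euclidean distance and that the full feature vector of each picture to be pushed is available in a lookup table (here the hypothetical dictionary features):

```python
import math

def rerank(query, candidates, features):
    """Sort candidate picture ids by exact distance to the query, ascending."""
    return sorted(candidates, key=lambda pid: math.dist(query, features[pid]))

features = {
    "p1": [0.0, 0.1, 1.0, 1.0],
    "p3": [0.1, 0.0, 0.9, 1.1],
}
sequence = rerank([0.0, 0.0, 1.0, 1.0], ["p3", "p1"], features)
```

The accumulated distances from the earlier step are approximations, so a candidate order such as ["p3", "p1"] may change once exact distances are used.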
Step 408, generating a search result including the picture sequence to be pushed.
In this embodiment, the server may generate a search result including a sequence of pictures to be pushed. The search result can facilitate the user to quickly obtain the required picture, save the browsing time of the user and improve the user experience.
In some optional implementation manners of this embodiment, the server may further return the search result to the terminal device used by the user through a wired connection manner or a wireless connection manner.
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the picture search method in this embodiment elaborates on how the at least one picture group is selected, how the pictures to be pushed are selected, and how the search result is generated. The scheme described in this embodiment can therefore achieve diversity of information processing. First, the first picture feature vector corresponding to the picture for search uploaded by the user is obtained, the distance between the center point of the first cluster associated with each picture group in the preset picture group set and the first picture feature vector is determined, and at least one picture group is selected from the picture group set based on the determined distances. This improves the effectiveness of the selected picture groups, so that they are more strongly correlated with the picture for search. Second, for each cluster included in each cluster group in the cluster group set associated with each selected picture group, the distance between the center point of the cluster and the first picture feature vector is determined, each picture identifier in the inverted linked list associated with the cluster is traversed, and a distance accumulation operation is performed on the picture indicated by each picture identifier based on that distance. In this way, the spatial distance between the picture for search and each picture included in the selected picture groups can be obtained quickly, which effectively reduces the time consumed by the picture search.
In addition, the distance between the picture feature vector corresponding to each picture to be pushed and the first picture feature vector is determined, the determined pictures to be pushed are sorted in ascending order of distance to obtain a picture sequence to be pushed, and a search result including the picture sequence to be pushed is generated. The pictures ranked first in the search result are therefore those most similar to the picture for search, which helps the user quickly obtain the required picture, saves browsing time, and improves the user experience.
With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a picture search apparatus. This apparatus embodiment corresponds to the method embodiment shown in fig. 2, and the apparatus can be applied to various electronic devices.
As shown in fig. 5, the picture search apparatus 500 of the present embodiment includes: the first selecting unit 501 is configured to select at least one group of pictures from a preset group of pictures based on a picture for search uploaded by a user, where the group of pictures may be associated with a group set of clusters obtained by product quantization, a cluster included in a group of clusters in the group set of clusters may be associated with an inverted link table, and the inverted link table may include picture identifiers of pictures in the group of pictures related to the cluster; the second selecting unit 502 is configured to select at least one picture from a picture set formed by pictures in each picture group as a picture to be pushed based on a matching relationship between the picture for search and a center point of a cluster included in a cluster group set associated with each picture group in the at least one picture group and an inverted link table associated with the cluster; the generating unit 503 is configured to generate a search result including the determined picture to be pushed.
In the present embodiment, in the picture search apparatus 500: the detailed processing of the first selecting unit 501, the second selecting unit 502, and the generating unit 503 and the technical effects thereof can refer to the related descriptions of step 201, step 202, and step 203 in the corresponding embodiment of fig. 2, which are not repeated herein.
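The division of the apparatus 500 into three units can be pictured as a simple composition of callables. This is only a structural sketch under stated assumptions: the class name, the field names, and the use of a feature vector as input are hypothetical, and the lambda bodies are placeholders rather than the processing of steps 201 to 203.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = Sequence[float]

@dataclass
class PictureSearchApparatus:
    # Hypothetical stand-ins for units 501, 502, and 503.
    first_selecting_unit: Callable[[Vector], List[str]]
    second_selecting_unit: Callable[[Vector, List[str]], List[str]]
    generating_unit: Callable[[List[str]], dict]

    def search(self, query: Vector) -> dict:
        groups = self.first_selecting_unit(query)            # select picture groups
        to_push = self.second_selecting_unit(query, groups)  # select pictures to push
        return self.generating_unit(to_push)                 # build the search result

apparatus = PictureSearchApparatus(
    first_selecting_unit=lambda q: ["g1"],                   # placeholder logic
    second_selecting_unit=lambda q, groups: ["p3", "p1"],    # placeholder logic
    generating_unit=lambda pics: {"pictures": pics},
)
result = apparatus.search([0.0, 0.0])
```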
In some optional implementation manners of this embodiment, the group of pictures may further be associated with a first class of clusters, where the first class of clusters may include picture feature vectors corresponding to respective pictures in the group of pictures; and the first selecting unit 501 may include: an acquisition subunit (not shown in the figure) configured to acquire a first picture feature vector corresponding to a search picture; a determining subunit (not shown in the figures) configured to determine a distance between a center point of a first cluster associated with each group of pictures in the group of pictures set and the first picture feature vector; a first selecting sub-unit (not shown in the figures) configured to select at least one group of pictures from the group of pictures set based on the determined distance.
In some optional implementations of this embodiment, the first selecting subunit may be further configured to: select, from the picture group set, a first preset number of picture groups for which the distance between the center point of the associated first cluster and the first picture feature vector is smallest.
In some optional implementations of this embodiment, the second selecting unit 502 may include: a distance accumulation subunit (not shown in the figure) configured to determine, for each cluster included in each cluster group in the cluster group set associated with each group of pictures in the at least one group of pictures, a distance between a center point of the cluster and the first picture feature vector; traversing each picture identifier in the inverted list associated with the cluster, and performing distance accumulation operation on the picture indicated by the picture identifier based on the distance between the center point of the cluster and the first picture feature vector; and a second selecting subunit (not shown in the figure) configured to select at least one picture from the picture set as a picture to be pushed based on the distance obtained by the distance accumulation operation corresponding to each picture in the picture set.
In some optional implementations of this embodiment, the second selecting subunit may be further configured to: select, from the picture set, a second preset number of pictures for which the distance obtained by the distance accumulation operation is smallest, as the pictures to be pushed.
In some optional implementations of this embodiment, the generating unit 503 may be further configured to: determining the distance between the picture characteristic vector corresponding to each picture to be pushed and the first picture characteristic vector; sequencing the determined pictures to be pushed according to the sequence of the distances from small to large to obtain a picture sequence to be pushed; and generating a search result comprising a picture sequence to be pushed.
In some optional implementations of this embodiment, the apparatus 500 may further include: and a sending unit (not shown in the figure) configured to return the search result to the terminal device used by the user.
The apparatus provided in the foregoing embodiment of the present application selects at least one picture group from a preset picture group set based on a picture for search uploaded by a user. Each picture group is associated with a cluster group set obtained by product quantization, each cluster included in a cluster group of the cluster group set is associated with an inverted linked list, and the inverted linked list includes the picture identifiers of the pictures in the picture group that are related to the cluster. The apparatus then selects at least one picture from the picture set formed by the pictures in the selected picture groups as a picture to be pushed, based on the matching relationship between the picture for search and the center points of the clusters included in the cluster group set associated with each selected picture group, and on the inverted linked lists associated with those clusters, and generates a search result including the determined picture to be pushed. Because the scheme introduces inverted linked lists, the matching relationship between each picture in the picture set and the picture for search can be determined quickly, so pictures similar to the picture for search (such as the determined picture to be pushed) can be found quickly, which saves picture search time and improves picture search efficiency.
Referring now to FIG. 6, a block diagram of a computer system 600 suitable for use in implementing an electronic device (e.g., server 105 of FIG. 1) of an embodiment of the present application is shown. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The above-described functions defined in the system of the present application are executed when the computer program is executed by the Central Processing Unit (CPU) 601.
It should be noted that the computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a first selecting unit, a second selecting unit, and a generating unit. The names of these units do not in some cases form a limitation on the units themselves, and for example, the first selection unit may also be described as a "unit that selects at least one group of pictures from a preset group of pictures based on pictures for search uploaded by a user".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to: selecting at least one picture group from a preset picture group set based on a picture for searching uploaded by a user, wherein the picture group can be associated with a cluster group set obtained by product quantization, clusters included in the cluster groups in the cluster group set can be associated with an inverted link table, and the inverted link table can include picture identifiers of pictures related to the clusters in the picture group; selecting at least one picture from a picture set formed by pictures in each picture group as a picture to be pushed based on a matching relation between a picture for searching and a center point of a class cluster included in the class cluster group set associated with each picture group in the at least one picture group and an inverted chain table associated with the class cluster; generating a search result comprising the determined picture to be pushed.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (10)

1. A picture searching method, comprising:
selecting at least one picture group from a preset picture group set based on a picture for searching uploaded by a user, wherein the picture group is associated with a cluster group set obtained after product quantization, the clusters included in the cluster groups in the cluster group set are associated with an inverted chain table, and the inverted chain table comprises picture identifiers of pictures related to the clusters in the picture group;
selecting at least one picture from a picture set formed by pictures in each picture group as a picture to be pushed based on a matching relation between the picture for searching and a center point of a class cluster included in a class cluster group set associated with each picture group in the at least one picture group and an inverted chain table associated with the class cluster;
generating a search result comprising the determined picture to be pushed.
2. The method according to claim 1, wherein a group of pictures is further associated with a first class of clusters, and the first class of clusters includes picture feature vectors corresponding to respective pictures in the group of pictures; and
the selecting at least one picture group from a preset picture group set based on a picture for searching uploaded by a user comprises the following steps:
acquiring a first picture characteristic vector corresponding to the picture for searching;
determining a distance between a center point of a first cluster associated with each group of pictures in the group of pictures set and the first picture feature vector;
selecting at least one group of pictures from the group of pictures set based on the determined distance.
3. The method of claim 2, wherein the selecting at least one group of pictures from the group of pictures set based on the determined distance comprises:
selecting, from the picture group set, a first preset number of picture groups for which the distance between the center point of the associated first cluster and the first picture feature vector is smallest.
4. The method according to claim 2, wherein the selecting, based on a matching relationship between the picture for search and a center point of a cluster included in a cluster group set associated with each of the at least one picture group and an inverted link table associated with the cluster, at least one picture from a picture set formed by pictures in each of the picture groups as a picture to be pushed comprises:
determining, for each class cluster included in each class cluster group in a class cluster group set associated with each picture group in the at least one picture group, a distance between a center point of the class cluster and the first picture feature vector; traversing each picture identifier in the inverted list associated with the cluster, and performing distance accumulation operation on the picture indicated by the picture identifier based on the distance between the center point of the cluster and the first picture feature vector;
and selecting at least one picture from the picture set as a picture to be pushed based on the distance obtained by the distance accumulation operation corresponding to each picture in the picture set.
5. The method of claim 4, wherein the selecting at least one picture from the picture set as the picture to be pushed based on the distance obtained by the distance accumulation operation corresponding to each picture in the picture set comprises:
and selecting a second preset number of corresponding pictures with the minimum distance obtained by distance accumulation operation from the picture set as the pictures to be pushed.
6. The method of any of claims 2-5, wherein the generating search results including the determined picture to be pushed comprises:
determining the distance between the picture characteristic vector corresponding to each picture to be pushed and the first picture characteristic vector;
sequencing the determined pictures to be pushed according to the sequence of the distances from small to large to obtain a picture sequence to be pushed;
and generating a search result comprising the picture sequence to be pushed.
7. The method according to one of claims 1-5, wherein the method further comprises:
and returning the search result to the terminal equipment used by the user.
8. An image search apparatus comprising:
a first selecting unit, configured to select at least one picture group from a preset picture group set based on a picture for searching uploaded by a user, wherein the picture group is associated with a cluster group set obtained after product quantization, the clusters included in the cluster groups in the cluster group set are associated with an inverted linked list, and the inverted linked list comprises picture identifiers of pictures in the picture group that are related to the clusters;
a second selecting unit, configured to select at least one picture from a picture set formed by pictures in each picture group as a picture to be pushed based on a matching relationship between the picture for search and a center point of a cluster included in a cluster group set associated with each picture group in the at least one picture group and an inverted link table associated with the cluster;
a generating unit configured to generate a search result including the determined picture to be pushed.
9. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-7.
CN201910284715.7A 2019-04-10 2019-04-10 Picture searching method and device Pending CN111753111A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910284715.7A CN111753111A (en) 2019-04-10 2019-04-10 Picture searching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910284715.7A CN111753111A (en) 2019-04-10 2019-04-10 Picture searching method and device

Publications (1)

Publication Number Publication Date
CN111753111A true CN111753111A (en) 2020-10-09

Family

ID=72672658

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910284715.7A Pending CN111753111A (en) 2019-04-10 2019-04-10 Picture searching method and device

Country Status (1)

Country Link
CN (1) CN111753111A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114840762A (en) * 2022-05-19 2022-08-02 马上消费金融股份有限公司 Recommended content determining method and device and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294813A (en) * 2013-06-07 2013-09-11 北京捷成世纪科技股份有限公司 Sensitive image search method and device
US20160062731A1 (en) * 2014-08-29 2016-03-03 Adobe Systems Incorporated Shortlist computation for searching high-dimensional spaces
CN106021362A (en) * 2016-05-10 2016-10-12 百度在线网络技术(北京)有限公司 Query picture characteristic representation generation method and device, and picture search method and device
CN107944046A (en) * 2017-12-15 2018-04-20 清华大学 Extensive high dimensional data method for quickly retrieving and system
CN108427745A (en) * 2018-03-08 2018-08-21 南京理工大学 The image search method of visual dictionary and adaptive soft distribution based on optimization

Similar Documents

Publication Publication Date Title
CN107506495B (en) Information pushing method and device
CN108830235B (en) Method and apparatus for generating information
CN109492772B (en) Method and device for generating information
CN107944481B (en) Method and apparatus for generating information
CN108520470B (en) Method and apparatus for generating user attribute information
CN108960316B (en) Method and apparatus for generating a model
CN109447156B (en) Method and apparatus for generating a model
CN110619078B (en) Method and device for pushing information
CN112650841A (en) Information processing method and device and electronic equipment
CN110555451A (en) Information identification method and device
CN107908662B (en) Method and device for realizing search system
CN110866040A (en) User portrait generation method, device and system
CN109062560B (en) Method and apparatus for generating information
CN110188113B (en) Method, device and storage medium for comparing data by using complex expression
CN110059172B (en) Method and device for recommending answers based on natural language understanding
CN109992719B (en) Method and apparatus for determining push priority information
CN109582854B (en) Method and apparatus for generating information
CN111680799A (en) Method and apparatus for processing model parameters
CN109377284B (en) Method and electronic equipment for pushing information
CN112749323A (en) Method and device for constructing user portrait
CN111125502B (en) Method and device for generating information
CN111753111A (en) Picture searching method and device
CN112860999B (en) Information recommendation method, device, equipment and storage medium
CN109857838B (en) Method and apparatus for generating information
CN109308299B (en) Method and apparatus for searching information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination