CN113971225A - Image retrieval system, method and device - Google Patents

Info

Publication number
CN113971225A
Authority
CN
China
Prior art keywords
image
distance
retrieved
engine
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011621979.6A
Other languages
Chinese (zh)
Inventor
彭席汉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202211628292.4A (patent CN115878824B)
Priority to EP21837926.1A (patent EP4167107A4)
Priority to PCT/CN2021/099890 (patent WO2022007596A1)
Publication of CN113971225A
Priority to US18/151,433 (patent US20230161811A1)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/56Information retrieval; Database structures therefor; File system structures therefor of still image data having vectorial format
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/544Buffers; Shared memory; Pipes

Abstract

An image retrieval system comprises an interface and a computing engine set. The interface is configured to acquire an image to be retrieved. The computing engine set comprises a first computing engine and a second computing engine: the first computing engine is configured to calculate the similarity between the image to be retrieved and a plurality of images contained in a gallery, and the second computing engine is configured to determine a retrieval result according to that similarity, wherein the retrieval result comprises a target image similar to the image to be retrieved. This increases the speed of image retrieval processing and reduces data processing latency.

Description

Image retrieval system, method and device
Technical Field
The present application relates to the field of computers, and in particular, to an image retrieval system, method and apparatus.
Background
With the development of computer technology, image retrieval has become a research hotspot with increasingly broad application scenarios. For example, in a commodity image search function, an image provided by a user is searched for in a gallery to find similar or identical commodities, which are then recommended to the user. Alternatively, in a search service for copyright determination, an image is taken as the retrieval object, identical or similar images are searched for in a massive gallery, and it is then determined whether the retrieval object raises an infringement problem.
In a conventional image retrieval process, a central processing unit (CPU) providing the image retrieval function loads the images in a gallery directly into main memory in response to a user retrieval request, and then compares the query image with the images in the gallery to determine an image that is the same as or similar to the query image. However, when the gallery contains a large amount of data, the CPU must perform operations such as data processing and distance calculation during retrieval. This processing is complex and time-consuming, and cannot meet the real-time requirement of image retrieval. How to provide an efficient image retrieval method has therefore become an urgent technical problem.
Disclosure of Invention
The application provides an image retrieval method in which a heterogeneous system comprising a plurality of computing engines executes the image retrieval operation. Each operation in the retrieval process is executed by an engine dedicated to that type of operation, following the principle of using dedicated hardware for dedicated tasks. This avoids the complex processing and long processing time caused by executing the image retrieval process on a single CPU, and improves the efficiency of image retrieval.
In a first aspect, an image retrieval system is provided. The system includes an interface configured to acquire an image to be retrieved, and a computing engine set including a first computing engine and a second computing engine. The first computing engine is configured to compute similarities between the image to be retrieved and a plurality of images included in a gallery. The second computing engine is configured to determine a retrieval result according to those similarities, where the retrieval result includes a target image similar to the image to be retrieved. With this image retrieval system, the image retrieval operation is executed by the computing engine set and the retrieval result is determined according to the similarity between the image to be retrieved and the images in the gallery, which addresses the long processing time and low efficiency that result when a single CPU executes the image retrieval operation.
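The division of labor between the two engines can be sketched in software. The following is a minimal illustration only, assuming squared Euclidean distance as the similarity measure and plain Python functions standing in for the engines; the names `first_engine`, `second_engine`, and `retrieve` are hypothetical and do not appear in the application.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def first_engine(query: Vector, gallery: List[Vector]) -> List[float]:
    """Hypothetical first engine: one distance per gallery vector."""
    return [squared_distance(query, g) for g in gallery]


def second_engine(distances: List[float], top_k: int) -> List[int]:
    """Hypothetical second engine: gallery indices of the top_k smallest distances."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    return order[:top_k]


def retrieve(query: Vector, gallery: List[Vector], top_k: int = 1) -> List[int]:
    """Chain the two stages: similarity computation, then result selection."""
    return second_engine(first_engine(query, gallery), top_k)


# Toy gallery of three 2-dimensional feature vectors (made-up values).
gallery = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
result = retrieve([0.9, 1.2], gallery, top_k=2)
print(result)
```

In this toy run the query is closest to gallery entry 1, then entry 0. Keeping the two stages as separate functions mirrors the idea that each stage could be placed on a different engine.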
In a possible implementation manner, the similarity between the image to be retrieved and the plurality of images included in the gallery is obtained according to the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images. Alternatively, the similarity is itself the distance between the query vector and the center vectors. In the image retrieval system provided in this application, the similarity of images is measured by the distance between vectors, and the images matching the image to be retrieved are determined accordingly.
In another possible implementation manner, in the process of image retrieval processing, the first computing engine is specifically configured to convert the query vector into a query matrix, convert the center vectors into a center matrix, and obtain a distance table according to the query matrix and the center matrix, where the distance table is used to determine the distances between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images. The first computing engine is suited to matrix-related calculation. When the distance between vectors is calculated, the distance calculation can be mapped to matrix processing and then completed by the first computing engine, which speeds up the distance calculation and improves the efficiency of the whole image retrieval process.
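One way to map the distance calculation to matrix-style operations is the expansion ||q - c||^2 = ||q||^2 - 2 q·c + ||c||^2, in which the cross term is a matrix product. The sketch below is an assumption about the layout, not the layout defined by the application: the query vector is reshaped into one row per segment, `centers[m]` holds the K center sub-vectors of segment m, and `table[m][k]` is the squared distance between query segment m and center k. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def to_query_matrix(query: List[float], num_segments: int) -> Matrix:
    """Reshape a flat query vector into num_segments rows, one per segment."""
    if len(query) % num_segments != 0:
        raise ValueError("query length must be divisible by num_segments")
    width = len(query) // num_segments
    return [query[i * width:(i + 1) * width] for i in range(num_segments)]


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def distance_table(query_matrix: Matrix, centers: List[Matrix]) -> Matrix:
    """table[m][k] = ||q_m||^2 - 2 * q_m . c_mk + ||c_mk||^2 (assumed layout)."""
    table = []
    for q_row, segment_centers in zip(query_matrix, centers):
        q_norm = dot(q_row, q_row)
        table.append(
            [q_norm - 2.0 * dot(q_row, c) + dot(c, c) for c in segment_centers]
        )
    return table


# Toy data: a 4-dimensional query, 2 segments, K = 2 centers per segment.
query = [1.0, 0.0, 0.0, 2.0]
centers = [
    [[0.0, 0.0], [1.0, 0.0]],  # centers for segment 0
    [[0.0, 2.0], [1.0, 1.0]],  # centers for segment 1
]
table = distance_table(to_query_matrix(query, 2), centers)
print(table)
```

On a matrix engine the inner products for all centers of a segment would be computed as one matrix multiplication; the nested loops here only show which quantities end up in the table.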
In another possible implementation manner, the image retrieval system further includes a shared memory, and the shared memory is connected to the first computing engine through an internal bus; the first computing engine is further configured to store the distance table in the shared memory via the internal bus. With the shared memory added to the image retrieval system, the first computing engine and the second computing engine can share access to data in the storage medium. For example, the first computing engine can store the distance table in the shared memory, and the second computing engine can acquire the distance table from the shared memory and complete the other processing operations of image retrieval based on it. This avoids the slow access and cache misses that would result from storing the distance table in a storage medium outside the image retrieval system, and further increases the speed of image retrieval.
In another possible implementation manner, the shared memory is connected to the second computing engine through an internal bus; the second computing engine is specifically configured to acquire the distance table from the shared memory, and to determine, according to the distance table, whether a target image similar to the image to be retrieved is contained in the gallery. Because both the first computing engine and the second computing engine can access the shared memory, the second computing engine can acquire the distance table from the shared memory and execute subsequent processing based on it. This avoids the time lost to cache misses when data is acquired from the memory of the host system, and increases the data processing speed.
In another possible implementation manner, the image retrieval system further includes a third computing engine configured to obtain an index list corresponding to the images in the gallery. The third computing engine may also be referred to as a direct memory access engine, and is configured to obtain data from a memory connected to the host system by direct memory access. The second computing engine is specifically configured to obtain a distance matrix according to the distance table and the index list, where the distance matrix indicates the distance value associated with each index in the index list, and to determine, according to the distance matrix, the distances between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images. The second computing engine can thus calculate the similarity between vectors according to the distance table and the index list, and determine the images matching the image to be retrieved. Moreover, because the distance table and the index list are stored in the shared memory, the random reads performed while generating the distance matrix are faster, which improves data processing efficiency.
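The lookup that produces a distance matrix from a distance table and an index list can be sketched as follows. This is an illustration under assumed layouts, not the format defined by the application: `table[m][k]` is the distance for segment m and center k, and `index_list[n][m]` is the center index stored for segment m of gallery image n. The function names are hypothetical.

```python
from typing import List


def build_distance_matrix(
    table: List[List[float]], index_list: List[List[int]]
) -> List[List[float]]:
    """matrix[n][m] = table[m][index_list[n][m]]: one table lookup per stored index."""
    return [
        [table[m][code] for m, code in enumerate(codes)]
        for codes in index_list
    ]


def accumulate(matrix: List[List[float]]) -> List[float]:
    """Sum the per-segment distances of each image into one distance per image."""
    return [sum(row) for row in matrix]


# Toy distance table (2 segments x 2 centers) and index list for 3 gallery images.
table = [[1.0, 0.0], [0.0, 2.0]]
index_list = [[0, 0], [1, 0], [1, 1]]

matrix = build_distance_matrix(table, index_list)
totals = accumulate(matrix)
print(matrix)
print(totals)
```

Each entry of the matrix is a random read into the table, which is why keeping both the table and the index list in fast shared memory matters for this step.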
In another possible implementation manner, the second computing engine is specifically configured to determine the retrieval result according to a ranking of the distances, determined from the distance matrix, between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images. The distance matrix identifies the distance between each segment of the query vector of the image to be retrieved and the corresponding segment of a center vector. From the distance matrix, the distance between the query vector and each center vector can be determined; these distances are then sorted, and the retrieval result is finally determined.
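Ranking-based selection can be sketched with a partial sort. The example below assumes that a smaller distance means a more similar image and that the caller wants the `k` best matches; the function name `top_k` and the sample distances are illustrative only.

```python
import heapq
from typing import List, Tuple


def top_k(distances: List[float], k: int) -> List[Tuple[int, float]]:
    """Return (image index, distance) pairs for the k smallest distances, closest first."""
    return heapq.nsmallest(k, enumerate(distances), key=lambda pair: pair[1])


# One accumulated distance per gallery image (made-up values).
distances = [1.0, 0.0, 2.0]
best = top_k(distances, 2)
print(best)
```

`heapq.nsmallest` avoids sorting the whole list when only a few results are needed, which is the usual situation when the gallery is large and `k` is small.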
In another possible implementation manner, the second computing engine is specifically configured to determine the retrieval result according to a preset threshold, where the distance between the query vector and the vector associated with each image included in the retrieval result satisfies a preset condition. The preset condition is that the distance is greater than or equal to the preset threshold, or that the distance is smaller than the preset threshold. In addition to obtaining the retrieval result by ranking, one or more images satisfying the condition can thus be selected by a preset threshold as the retrieval result matching the image to be retrieved.
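Threshold-based selection can be sketched as a simple filter. Both forms of the preset condition named above are shown; which one applies depends on whether the stored value is a distance or a similarity score, and the flag `keep_below` is a hypothetical parameter introduced here for illustration.

```python
from typing import List


def filter_by_threshold(
    distances: List[float], threshold: float, keep_below: bool = True
) -> List[int]:
    """Return indices of images whose distance satisfies the preset condition.

    keep_below=True  keeps distances strictly smaller than the threshold.
    keep_below=False keeps distances greater than or equal to the threshold.
    """
    if keep_below:
        return [i for i, d in enumerate(distances) if d < threshold]
    return [i for i, d in enumerate(distances) if d >= threshold]


# One accumulated distance per gallery image (made-up values).
distances = [1.0, 0.0, 2.0]
close_images = filter_by_threshold(distances, 1.5)
far_images = filter_by_threshold(distances, 1.5, keep_below=False)
print(close_images, far_images)
```

Unlike ranking, this returns a variable number of results, possibly none, so a caller would need to handle an empty list.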
In another possible implementation, the image retrieval system is a chip or PCIe card, and the first compute engine and the second compute engine are connected by an internal bus. By using the connection mode of the internal bus, the data transmission path between the first computing engine and the second computing engine is shorter, and the data transmission speed is higher.
In another possible implementation, the first compute engine and the second compute engine in the set of compute engines are each logic circuits. By using the first and second compute engines constructed using dedicated logic circuits, different hardware engines can be used for different types of arithmetic operations, and data processing can be completed faster.
In another possible implementation manner, the system is a virtualization system, and the first computing engine and the second computing engine in the computing engine set are respectively virtual machines or containers running on devices in the virtualization system. In addition to hardware engines built from hardware circuits and dedicated to matrix, vector, and scalar operations, the function of each hardware engine can be implemented in virtualized form by a virtual machine or a container, so that the efficient image retrieval method provided by the present application can be implemented in a data center.
In another possible implementation manner, the image to be retrieved comes from a retrieval request, where the retrieval request includes at least one image to be retrieved, or the retrieval request includes a video to be retrieved, where the video to be retrieved includes at least one frame of image to be retrieved. The present application can be applied to both image retrieval and video retrieval. In video retrieval, each frame is processed according to the image retrieval process, and the retrieval result can be obtained quickly by the computing engine set.
In a second aspect, the present application provides a method for image retrieval. The method is applied to an image retrieval system that comprises an interface and a computing engine set, the computing engine set comprising a first computing engine and a second computing engine. The retrieval process comprises the following steps: the first computing engine calculates the similarity between the image to be retrieved and a plurality of images contained in the gallery; and the second computing engine determines a retrieval result according to that similarity, wherein the retrieval result indicates whether a target image similar to the image to be retrieved is contained in the gallery.
In a possible implementation manner, the similarity between the image to be retrieved and the plurality of images included in the gallery is obtained according to the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images. Alternatively, the similarity is itself the distance between the query vector and the center vectors. In the image retrieval method provided in this application, the similarity of images is measured by the distance between vectors, and the images matching the image to be retrieved are determined accordingly.
In another possible implementation manner, in the process of image retrieval processing, the first calculation engine converts the query vector into a query matrix, converts the center vector into a center matrix, and obtains a distance table according to the query matrix and the center matrix, where the distance table is used to determine distances between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images. The first calculation engine can be used for processing matrix correlation calculation, the distance calculation process can be converted into a matrix processing process in the process of calculating the distance between the vectors, and then the matrix correlation processing is completed by the special first calculation engine, so that the speed of the distance calculation process is increased, and the processing efficiency of the whole image retrieval process is improved.
In another possible implementation manner, the image retrieval system further includes a shared memory, and the shared memory is connected to the first computing engine through an internal bus; the first calculation engine stores the distance table to the shared memory through the internal bus. By adding the shared memory in the image retrieval system, the first calculation engine and the second calculation engine can access the data in the storage medium in a shared manner, for example, the first calculation engine can store the distance table in the shared memory, and the second calculation engine can acquire the distance table from the shared memory and complete other processing operations of image retrieval based on the distance table, thereby avoiding the problems of slow access speed and complex processing caused by using a storage medium other than the image retrieval system to store the distance table.
In another possible implementation manner, the shared memory is connected to the second computing engine through an internal bus; the second computing engine acquires the distance table from the shared memory, and determines, according to the distance table, whether a target image similar to the image to be retrieved is contained in the gallery. Because both the first computing engine and the second computing engine can access the shared memory, the second computing engine can acquire the distance table from the shared memory and execute subsequent processing based on it. This avoids the time lost to cache misses when data is acquired from the memory of the host system, and increases the data processing speed.
In another possible implementation manner, the image retrieval system further includes a third computing engine, which obtains an index list corresponding to the images in the gallery. The third computing engine may also be referred to as a direct memory access engine, and is configured to obtain data from a memory connected to the host system by direct memory access. The second computing engine obtains a distance matrix according to the distance table and the index list, where the distance matrix indicates the distance value associated with each index in the index list, and determines, according to the distance matrix, the distances between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images. The second computing engine can thus calculate the similarity between vectors according to the distance table and the index list, and determine the images matching the image to be retrieved.
In another possible implementation manner, the second computing engine determines the retrieval result according to a ranking of the distances, determined from the distance matrix, between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images. The distance matrix identifies the distance between each segment of the query vector of the image to be retrieved and the corresponding segment of a center vector. From the distance matrix, the distance between the query vector and each center vector can be determined; these distances are then sorted, and the retrieval result is finally determined.
In another possible implementation manner, the second computing engine determines the retrieval result according to a preset threshold, where the distance between the query vector and the vector associated with each image included in the retrieval result satisfies a preset condition. The preset condition is that the distance is greater than or equal to the preset threshold, or that the distance is smaller than the preset threshold. In addition to obtaining the retrieval result by ranking, one or more images satisfying the condition can thus be selected by a preset threshold as the retrieval result matching the image to be retrieved.
In another possible implementation, the image retrieval system is a chip or PCIe card, and the first compute engine and the second compute engine are connected by an internal bus.
In another possible implementation, the first compute engine and the second compute engine in the set of compute engines are each logic circuits.
In another possible implementation, the image retrieval system is a virtualization system, and the first compute engine and the second compute engine in the set of compute engines are virtual machines or containers, respectively, running on devices in the virtualization system.
In another possible implementation manner, the image to be retrieved comes from a retrieval request, where the retrieval request includes at least one image to be retrieved, or the retrieval request includes a video to be retrieved, where the video to be retrieved includes at least one frame of image to be retrieved. The present application can be applied to both image retrieval and video retrieval. In video retrieval, each frame is processed according to the image retrieval process, and the retrieval result can be obtained quickly by the computing engine set.
In a third aspect, the present application provides an apparatus for image retrieval, which includes various modules for performing the method for image retrieval in the second aspect or any one of the possible implementations of the second aspect.
In a fourth aspect, the present application provides a data processing system that includes a host system and an image retrieval system, which communicate via a network. The host system is configured to obtain a retrieval request, determine an image to be retrieved according to the retrieval request, and send the image to be retrieved to the image retrieval system. The image retrieval system is configured to execute the functions implemented by each computing engine of the image retrieval system in the first aspect or any possible implementation manner of the first aspect.
In a fifth aspect, the present application provides a computer-readable storage medium storing instructions which, when executed on a computer, cause the computer to perform the method of the above aspects.
In a sixth aspect, the present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the above aspects.
On the basis of the implementations provided in the above aspects, the present application may further combine them to provide additional implementations.
Drawings
FIG. 1 is a schematic flow chart of a method for product quantization provided herein;
FIG. 2 is a block diagram illustrating an architecture of a data processing system 100 according to the present application;
FIG. 3 is a schematic flow chart of an image retrieval method provided in the present application;
FIG. 4 is a schematic flow chart of another image retrieval method provided in the present application;
FIG. 5 is a schematic flow chart of a method for calculating a distance table according to the present application;
FIG. 6 is a schematic flow chart diagram illustrating another method for calculating a distance table according to the present application;
FIG. 7 is a schematic flow chart illustrating a method for calculating a distance matrix according to the present disclosure;
FIG. 8 is a schematic flow chart of another method for calculating a distance matrix provided herein;
FIG. 9 is a schematic flow chart illustrating a method for calculating accumulated vectors according to the present application;
FIG. 10 is a schematic flow chart of another method for calculating a distance matrix provided herein;
FIG. 11 is a schematic diagram of a parallel processing flow provided herein;
FIG. 12 is a schematic architecture diagram of an image retrieval apparatus 600 according to the present application.
Detailed Description
For ease of understanding, terms referred to in the present embodiment are first explained.
Image retrieval refers to a process of searching a gallery (also referred to as a base gallery, a sample gallery, or a to-be-searched gallery) for an image matching a query image input by a user. Generally, image retrieval is based on deep learning and image recognition technology, and uses feature vectorization and search capabilities to find the same or similar images in a specified gallery, in combination with different application services and industry scenarios. The query image and each image in the gallery can be represented by feature data. A vector of an image can be obtained from a feature extraction model (for example, the deep learning feature extraction model ResNet). The feature data of one image comprises at least one feature vector (also called a vector), and each vector can be represented by a multi-dimensional array; for example, vector 1 can be represented by (a, b, c), where a, b, and c are floating-point numbers. The image retrieval process described above can also be understood as retrieval based on the feature data of images. For convenience of description, the following embodiments are described using the example in which the feature data of each image is one feature vector.
A long feature is the feature data of an image obtained from the feature extraction model. It is also called long feature data, uncoded data, long coded data, or original feature data. One long feature can be obtained for each image.
A short feature is the data obtained by decomposing a long feature after an artificial intelligence (AI) feature training platform has been trained according to a model. It is also called short feature data, encoded data, or short encoded data. In general, product quantization (PQ), a method for accelerating retrieval, can be used to convert long features into short features.
For example, FIG. 1 is a schematic flowchart of a product quantization method provided in an embodiment of the present application. As shown in the figure, taking the conversion of the long feature of one image into a short feature as an example, the long feature is first divided into segments. Each feature vector in a segment has an in-segment sequence number that is unique within that segment. For example, a long feature of 512 dimensions is divided into 64 segments, each segment including 8 feature vectors F whose in-segment sequence numbers are 0 to 7. Then, in each segment, the distances between the feature vectors and K cluster center vectors obtained by pre-training are calculated to determine the feature vector closest to the K cluster center vectors, and the in-segment sequence number of that feature vector is used as the identifier of the segment in the short feature. K is the preset number of cluster centers and is a positive integer greater than or equal to 2. Specifically, the K cluster center vectors may be obtained by using the long features of all or some of the images in the gallery as training data and applying a clustering algorithm (e.g., the K-Means algorithm) to obtain K clusters, where each cluster has a cluster center, and the vector identifying the cluster center is referred to as a cluster center vector (also referred to as a center vector). For example, if F1 is the feature vector closest to the K cluster center vectors in segment 1, which includes F0 through F7, then segment 1 is identified in the short feature by 1, the in-segment sequence number of F1. After the distance calculation and comparison, each segment of the long feature is identified by the sequence number of one feature vector in that segment.
The above product quantization process can also be understood as a way of simplifying the recording of long features of images in a gallery, by calculating the distance between the vector in each segment and the central vector, to construct an index of the original features. The collection of identifications of the segments that identify long features for all images in the gallery is referred to as a short feature list (which may also be referred to as an index list) and the order of the short features in the short feature list is the same as the order of the segments of the long features. The subsequent image retrieval processing is retrieval processing performed based on the short feature.
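As a rough illustration of how a long feature can be reduced to a short list of identifiers, the sketch below follows the conventional product quantization formulation, in which each segment of the long feature is replaced by the index of its nearest cluster center. This is an assumption about the general technique and may differ in detail from the embodiment of fig. 1; the function names and the toy codebook are hypothetical, and the pre-training of the centers (e.g., by K-Means) is not shown.

```python
def split_segments(feature, num_segments):
    """Split a long feature into equal-length segments."""
    seg_len = len(feature) // num_segments
    return [feature[i * seg_len:(i + 1) * seg_len] for i in range(num_segments)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(feature, codebooks):
    """Return one identifier per segment: the index of the nearest center.

    codebooks[m] holds the K center vectors for segment m (assumed to have
    been obtained by pre-training; the training step is not shown here).
    """
    segments = split_segments(feature, len(codebooks))
    return [
        min(range(len(centers)), key=lambda k: squared_distance(seg, centers[k]))
        for seg, centers in zip(segments, codebooks)
    ]


# Toy data: a 4-dimensional long feature, M = 2 segments, K = 2 centers each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # centers for segment 0
    [[0.0, 1.0], [1.0, 0.0]],  # centers for segment 1
]
short_feature = encode([0.9, 1.1, 0.1, 0.8], codebooks)
```

Applying `encode` to every image in a gallery would produce one row of identifiers per image, which corresponds to the short feature list (index list) described above.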
The data processing system 100 provided herein is further described with reference to fig. 2, and as shown in the figure, the data processing system 100 includes a host system 10, a heterogeneous system 20 (which may also be referred to as an image retrieval system), a first memory 30 and an interface bus 40, wherein the host system 10, the heterogeneous system 20 and the first memory 30 are connected via the interface bus 40. The interface bus 40 includes Peripheral Component Interconnect Express (PCIe), and the heterogeneous system 20 may be a chip or a PCIe card.
A host system 10, configured to provide an Application Programming Interface (API) or an image retrieval framework to an application running in the host system 10 or another system, device, or application outside the host system 10, where the API or the image retrieval framework is configured to receive a retrieval request of a user, where the retrieval request includes an image to be retrieved; and further, sending the retrieval request to the heterogeneous system 20 through the interface bus 40, where the heterogeneous system 20 completes processing of the retrieval request, for example, obtaining a query vector associated with the image to be retrieved in the retrieval request. Optionally, the host system may also complete processing of the retrieval request, and obtain a query vector associated with the image to be retrieved. In addition, the host system 10 is also used to run an operating system. The host system 10 includes a processor 101, and the processor 101 may be a Central Processing Unit (CPU).
A heterogeneous system 20 for retrieving images in the gallery that are the same as or similar to the query image. The heterogeneous system 20 includes two or more compute engines. For example, as shown in fig. 2, the heterogeneous system 20 includes a matrix calculation engine 201, a vector calculation engine 202, a scalar calculation engine 203, an internal bus 206, and an interface 207. Optionally, the heterogeneous system 20 further includes a direct memory access (DMA) engine 204 and a second storage 205. The matrix calculation engine 201, the vector calculation engine 202, the scalar calculation engine 203, the direct memory access engine 204, the second storage 205, and the interface 207 communicate with each other through the internal bus 206. The internal bus 206 may be a parallel bus used to implement point-to-point (P2P) communication; it may also be PCIe, a front side bus (FSB), or another form of connection capable of linking the components.
For convenience of description, the matrix computation engine 201 may also be referred to as a first computation engine, the vector computation engine 202 and the scalar computation engine 203 may be referred to as a second computation engine, the direct memory access engine 204 may be referred to as a third computation engine, and the second storage 205 may be referred to as a shared memory.
As a possible embodiment, the first memory 30 and the second memory 205 are two different memories. Optionally, the first storage 30 and the second storage 205 may also be different storages whose mapping relationship is established by a memory mapped input/output (MMIO) technology; alternatively, the first memory 30 and the second memory 205 may be the same memory.
The matrix calculation engine 201 is configured to perform matrix-related operations, including matrix multiplication and other operations involving matrices, for example, calculating the distance table in S201 of fig. 3. The vector calculation engine 202 is configured to perform vector-related operations, including vector accumulation and other operations involving vectors, for example, accumulating distance vectors in S204 of fig. 3 to obtain the distance results between vectors. The scalar calculation engine 203 is configured to perform scalar-related operations, including querying the short feature list and the distance table to generate the distance matrix, and sorting the results of the distance calculation, for example, traversing the short feature list in S202, generating the distance matrix in S203, and sorting in S205 of fig. 3. A scalar is a quantity that has only magnitude and no direction. The direct memory access engine 204 is configured to load the short feature list from the first storage 30 to the second storage 205 by using direct memory access, so that each computing engine in the heterogeneous system can quickly obtain the content of the short feature list and complete its processing. The second storage 205 is configured to provide a shared storage area for the matrix calculation engine 201, the vector calculation engine 202, the scalar calculation engine 203, and the direct memory access engine 204.
The matrix calculation engine 201, the vector calculation engine 202, the scalar calculation engine 203, and the direct memory access engine 204 may be implemented by hardware circuits having specific functions. For example, each computing engine is composed of a chip or a circuit structure constructed by electronic components, respectively, for performing computing processing on data. In the embodiment, the matrix calculation engine 201 is taken as an example for explanation, and the matrix calculation engine 201 has universality and programmability of the CPU, but has more specificity, and can efficiently execute correlation operations, such as matrix correlation operations. The matrix computation engine 201 may be replaced with a Graphics Processing Unit (GPU), an embedded neural Network Processor (NPU), or other processing chip.
Alternatively, the matrix calculation engine 201, the vector calculation engine 202, the scalar calculation engine 203, and the direct memory access engine 204 may be implemented by software modules, or by a combination of software and hardware. Optionally, the above computing engines may also be deployed in a unified or partially unified manner; for example, the matrix calculation engine 201, the vector calculation engine 202, and the scalar calculation engine 203 are deployed together to implement matrix operations and vector operations.
As a possible embodiment, the heterogeneous system shown in fig. 2 can also be implemented in software. For example, the heterogeneous system is a virtualization system in which the matrix calculation engine 201, the vector calculation engine 202, the scalar calculation engine 203, and the direct memory access engine 204 are implemented by virtualized computing devices such as virtual machines or containers. In this case, the heterogeneous system is connected to the host system through a network, which may be an Ethernet network, a fiber network, a wireless network, or the like, to implement communication between the host system and the heterogeneous system. The virtualization system includes one or more devices, each of which includes a logic circuit or component for implementing a corresponding computing function; for example, the device may include the PCIe card or chip described above, in which case each computing engine is implemented by one or more virtual machines or containers obtained by virtualizing the device that has the PCIe card or chip. For example, the virtualization system includes a device with a component (e.g., a GPU) that implements the functions of the matrix calculation engine; after the resources of that device are virtualized, a plurality of virtual machines or containers with virtual graphics processors (vGPU) are obtained, and each virtual machine or container is used to implement the functions of one of the above computing engines.
For convenience of description, the following embodiment further explains the image retrieval method provided in the present application by taking the data processing system 100 shown in fig. 2 as an example. The heterogeneous system 20 may obtain the image to be retrieved from the host system 10 through the interface 207, determine the similarity between the image to be retrieved and the images in the gallery, and determine the retrieval result of the image to be retrieved according to the similarity calculation result. In this image retrieval process, the heterogeneous system 20 may follow the principle of assigning each kind of task to a dedicated engine. Referring to fig. 3, different types of calculation engines in the heterogeneous system 20 respectively perform the operations of S201 distance table calculation, S202 short feature list traversal, S203 distance matrix generation, S204 distance vector accumulation, and S205 sorting, so as to determine a retrieval result, where the retrieval result includes an image (also referred to as a target image) similar to the image to be retrieved. In this way, operations of different types are processed by different calculation engines, which improves data processing efficiency.
Next, the image retrieval method provided by the present application is further described with reference to the accompanying drawings, and fig. 4 is a schematic flow chart of an image retrieval method provided by the present application, and as shown in the figure, the method includes:
S401, the matrix calculation engine determines a distance table according to the query vector corresponding to the image to be retrieved in the retrieval request and the central vector in the gallery.
In order to accelerate the retrieval speed of mass data, before performing image retrieval, the host system 10 or other devices or systems except the system 100 shown in fig. 2 obtain the long features of the images in the gallery in advance through a feature extraction model; and all or part of data in the image library is taken as a training sample, K clustering centers are obtained by utilizing a clustering algorithm, each clustering center comprises a center vector (also called a product quantization clustering center vector), each center vector comprises vectors with the same number as the number of segments in the long feature, and each vector can be represented by a multi-dimensional array. Illustratively, as in fig. 4, after the long features of the images in the gallery are subjected to product quantization and clustering, K central vectors 1 to K are obtained, each central vector including M segments, where M is the number of segments of the long features, for example, the central vector 1 includes a vector 11 to a vector 1M. Then, referring to the procedure shown in fig. 1, a product quantization process is performed on the long features of the respective images in the gallery, creating a short feature list.
When the retrieval request is obtained, the image to be retrieved in the retrieval request may be obtained first, and then a preprocessing operation is performed on the image to be retrieved, including: obtaining the long feature of the image to be retrieved by using the same feature extraction model as that used to preprocess the images in the gallery, and then dividing the long feature of the image to be retrieved into M segments, where M is the same as the number of segments of the long features used when product quantization was performed on the gallery. For example, in fig. 4, query vector 1 is divided into vector 1 to vector M.
It should be noted that the above preprocessing of the image to be retrieved may be performed by the heterogeneous system, or may be performed by the host system or another device or system, in which case the preprocessing result is sent to the heterogeneous system for further processing. In addition, the present application does not limit the specific clustering process or feature extraction model; in a specific implementation, a suitable algorithm may be selected according to business requirements to perform clustering and image feature extraction.
Further, in order to determine a vector similar to the query vector in the gallery, it is necessary to determine the similarity of the query vector and each center vector of the gallery. Specifically, each segment in the query vector may be used as a search object, and the distance between the vector of the segment corresponding to the search object and each vector of the corresponding segment in the K central vectors may be calculated, so as to obtain the distance table according to the distance calculation result. For example, fig. 5 is a schematic flowchart of a distance calculation method provided in an embodiment of the present application, and as shown in the figure, a distance between vector 11 of segment 1 of query vector 1 and vector 11 of segment 1 of center vector 1 is calculated to obtain D11, and then D11 is used to indicate a distance between vector 11 indicated by segment 1 of query vector and vector 11 indicated by segment 1 of center vector 1.
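The construction of the distance table can be sketched as follows. This is a minimal illustration that assumes the squared Euclidean distance as the metric and stores the result as a nested list indexed by segment and then by central vector; the function names and the toy data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_distance_table(query_segments, centers):
    """table[m][k] = distance between query segment m and segment m of center k.

    query_segments: the M segment vectors of the query vector.
    centers: K central vectors, each given as a list of M segment vectors.
    """
    return [
        [squared_distance(q_seg, center[m]) for center in centers]
        for m, q_seg in enumerate(query_segments)
    ]


# Toy example with M = 2 segments and K = 2 central vectors.
query = [[1.0, 0.0], [0.0, 2.0]]
centers = [
    [[1.0, 0.0], [0.0, 0.0]],  # central vector 1, segments 1 and 2
    [[0.0, 0.0], [0.0, 2.0]],  # central vector 2, segments 1 and 2
]
table = build_distance_table(query, centers)
```

Each entry of `table` plays the role of one distance value such as D11 in the description; the table has M rows and K columns regardless of how many images the gallery contains.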
Specifically, the distance between two vectors may be calculated in any one of the following ways:
Mode 1: the distance between two vectors can be calculated using the following formula 1:
[Formula 1: equation image BDA0002874139940000081, not reproduced in this text]
where X1 is the query vector and X1k is the vector of the k-th segment of the query vector; X2 is the central vector to be compared in the gallery and X2k is the k-th segment of the central vector; k takes integer values from 1 to n, and n is a positive integer greater than 1.
Mode 2: on the basis of formula 1, a square-root operation may be performed on both sides of the equal sign of formula 1, that is, the following formula 2 is used to calculate the distance between two vectors:
[Formula 2: equation image BDA0002874139940000082, not reproduced in this text]
Mode 3: on the basis of formula 2, the square-root operation can be removed, and the following formula 3 is used to calculate the distance between two vectors:
[Formula 3: equation image BDA0002874139940000083, not reproduced in this text]
Mode 4: the first term in formula 2 is the accumulated result over the vectors of all segments of the query vector, and it is the same in every distance calculation between the vector of any segment of the query vector and the vector of any segment of a central vector. It can therefore be regarded as a constant in the distance calculation and removed from the distance formula to simplify the calculation. Specifically, the following formula 4 can be used to calculate the distance between two vectors:
[Formula 4: equation image BDA0002874139940000084, not reproduced in this text]
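The formula images are not available in this text. As background for modes 1 to 4, the standard expansion of the squared Euclidean distance is given below. It is offered as an assumption about what the formulas express, consistent with the surrounding description, and is not a transcription of the original formulas; the symbol d_reduced is hypothetical.

```latex
% Standard identity; X_{1k} and X_{2k} follow the definitions in the text,
% and the products are inner products of the segment vectors.
\[
\begin{aligned}
\sum_{k=1}^{n} \lVert X_{1k} - X_{2k} \rVert^{2}
  &= \sum_{k=1}^{n} \lVert X_{1k} \rVert^{2}
   + \sum_{k=1}^{n} \lVert X_{2k} \rVert^{2}
   - 2 \sum_{k=1}^{n} X_{1k} \cdot X_{2k} \\
d_{\mathrm{reduced}}
  &= \sum_{k=1}^{n} \lVert X_{2k} \rVert^{2}
   - 2 \sum_{k=1}^{n} X_{1k} \cdot X_{2k}
\end{aligned}
\]
```

The square root is monotonic, so omitting it does not change the ordering of candidates. The term that depends only on the query vector is the same for every central vector, so ranking by the reduced expression gives the same order as ranking by the full squared distance.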
Further, to simplify distance calculation, the calculation in the above formulas may be converted into a matrix calculation that is then executed by the matrix calculation engine, thereby avoiding the latency and inefficiency caused by fetching data items one by one and operating on each separately.
Specifically, the query vector is converted into a row matrix according to the segment sequence number, for example, the vectors corresponding to the segments of the query vector are respectively used as row elements in the first matrix according to the segment sequence number in the query vector; the central vectors are converted into column matrixes according to the segment serial numbers, for example, according to the segment serial number of each central vector, the vectors corresponding to the segments of each central vector are respectively used as elements of one column in the second matrix, and then the matrix calculation engine executes matrix correlation operation, so that the data processing efficiency is improved. For convenience of description, the row matrix may be referred to as a first matrix, and the column matrix may be referred to as a second matrix.
For example, referring to fig. 6 and taking the matrix mapping of query vector 1 and central vector 1 as an example, first, according to the segment numbers of query vector 1, the vector of each segment in query vector 1 is used as the element of one row in the first matrix (for example, query vector 11 is used as the element of the first row of the first matrix, and so on), so that the first matrix is an M × 1 matrix in which each segment is the element of one row. Then, according to the segment numbers of central vector 1, the vectors of the segments of central vector 1 are used in turn as the elements of the columns of the second matrix, so that the second matrix is a 1 × M matrix in which each segment is the element of one column. When the distance between query vector 1 and central vector 1 is calculated by using formula 4, the first matrix and the second matrix are multiplied, and the product is multiplied by 2 to obtain the second term of formula 4. The first term of formula 4 is the accumulation over the segments of central vector 1 and can be calculated in advance: the accumulated sums of the vectors with the same segment number in each central vector are obtained, and each accumulated sum is used as a column element of the third matrix. The calculation of formula 4 is thus converted into the multiplication of the first matrix and the second matrix shown in fig. 5, followed by accumulation of the product with the third matrix, which yields the distance result of formula 4.
It should be noted that the third matrix in fig. 6 may also be referred to as an offset matrix. Although fig. 5 shows the distance result being obtained by adding the product of the first matrix and the second matrix to the third matrix, in a specific implementation, to simplify the operation and improve data processing efficiency, all elements of the matrix product may first be multiplied by -1 and then added to the third matrix to obtain the result of formula 4.
Alternatively, in addition to the above formula 4, formulas 1 to 3 may also adopt a similar concept, and each formula is used as a row element or a column element of a matrix according to a segment sequence number, so as to convert a distance calculation process into a matrix calculation process, and perform the above matrix multiplication and addition operations by using a matrix calculation engine, thereby increasing the speed of matrix calculation processing.
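The idea of turning the distance calculation into a matrix product plus a precomputed offset can be sketched as follows. The matrix layout used here (one query segment as a row, one center segment per column) is an assumption chosen for clarity and is not necessarily the layout of fig. 6; the names `matmul`, `reduced_distances` and the toy data are hypothetical. A real matrix engine would perform the product in hardware, whereas plain Python is used only to show the arithmetic.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matmul(a, b):
    """Plain-Python product of matrix a (r x n) and matrix b (n x c)."""
    cols = list(zip(*b))
    return [[dot(row, col) for col in cols] for row in a]


def reduced_distances(query_seg, center_segs):
    """Distances from one query segment to K center segments, up to a constant.

    Computes bias[k] - 2 * (q . c_k), where bias[k] = c_k . c_k can be
    precomputed once for the gallery. The term q . q is omitted because it
    is the same for every center and does not change their relative order.
    """
    first = [query_seg]                                # 1 x d row matrix
    second = [list(col) for col in zip(*center_segs)]  # d x K, one center per column
    bias = [dot(c, c) for c in center_segs]            # offset ("third matrix")
    product = matmul(first, second)[0]                 # K inner products
    return [b - 2.0 * p for b, p in zip(bias, product)]


q = [1.0, 2.0]
centers = [[1.0, 2.0], [0.0, 0.0], [3.0, 2.0]]
reduced = reduced_distances(q, centers)
# Adding back q . q recovers the full squared Euclidean distances.
full = [r + dot(q, q) for r in reduced]
```

Because the offset depends only on the centers, it can be computed when the gallery is built, leaving a single matrix multiplication and one addition per query.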
Based on any of the above modes, the distance calculation result of any two vectors can be obtained, and the results can be stored in a distance table as shown in fig. 5, where D11 indicates the distance between the vector 11 indicated by segment 1 of query vector 1 and the vector 11 indicated by segment 1 of central vector 1, D21 indicates the distance between the vector 11 indicated by segment 2 of query vector 1 and the vector 21 indicated by segment 2 of central vector 2, and so on. The distance table thus includes the distance calculation results between the vector indicated by each segment of the query vector and the vector of the corresponding segment of each central vector.
After generating the distance table, the matrix calculation engine may store the distance table in the second memory 205 of the heterogeneous system 20 and send a notification to the scalar calculation engine, which then performs the subsequent operations. When the distance table is stored in the second memory 205, the scalar calculation engine 203 can read the distance table directly from the second memory 205 and generate the distance matrix; see steps S402 and S403 for the specific process.
Because calculating the distance table mainly involves matrix distance calculation, it can be processed directly as matrix operations by the matrix calculation engine and completed more quickly. Moreover, the matrix calculation engine supports floating-point operations, so the conversions between floating-point numbers and integers that are needed when a CPU performs the distance calculation are unnecessary, which simplifies data processing. In addition, with the optimized distance calculation, the distance between any two vectors can be determined quickly and the similarity between the vectors judged, further improving image retrieval efficiency.
S402, the scalar calculation engine generates a distance matrix according to the short feature list and the distance table.
The short feature list is typically generated when the gallery is created and is stored in the first memory 30 shown in fig. 2. As shown in fig. 8, in the image retrieval process, before generating the distance matrix, the scalar calculation engine may notify the direct memory access engine 204 to load the short feature list of the gallery into the second storage 205 of the heterogeneous system 20. The scalar calculation engine then traverses the short feature list entry by entry, obtains a distance value from the distance table according to the identifier of each segment in the short feature list, and writes the distance value into the corresponding position of the distance matrix. The resulting distance matrix may be stored in the second storage of the heterogeneous system.
It should be noted that the distance matrix is a logical structure for storing data, and in a specific implementation, a storage space of a specified size may be divided in the second memory 205, and the storage of the distance matrix is implemented by using the storage space, for example, storage space 1 is specified for storing data of D11, and storage space 2 is specified for storing data of D12.
For example, as shown in fig. 7, the short feature list includes the short features of all images in the gallery, and the distance values corresponding to the segment identifiers in the short feature of each image are used as the elements of one row of the distance matrix. For example, the short feature of image 1 includes 64 identifiers. If the first identifier is 1, the distance value D11 in the first row and first column of the distance table is used as the element in the first row and first column of the distance matrix; similarly, if the second identifier of image 1 is 2, the distance value D22 in the second row and second column of the distance table is used as the element in the second row and second column of the distance matrix, and so on. Each row of the distance matrix represents the distance values of one image, which yields an N × M matrix, where N is the number of images included in the short feature list and M is the number of segments of the long feature.
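The table lookup that produces the distance matrix can be sketched as follows. The indexing convention (zero-based identifiers, `distance_table[m][k]` for segment m and identifier k) is an assumption made for this illustration, and the names and toy values are hypothetical.

```python
def build_distance_matrix(short_feature_list, distance_table):
    """Row i holds, for image i, the table distance selected by each identifier.

    short_feature_list[i][m] is the identifier stored for segment m of image i;
    distance_table[m][k] is the distance for segment m and identifier k.
    """
    return [
        [distance_table[m][ident] for m, ident in enumerate(short_feature)]
        for short_feature in short_feature_list
    ]


distance_table = [
    [0.0, 1.0, 4.0],  # segment 0: distances for identifiers 0, 1, 2
    [9.0, 0.5, 2.0],  # segment 1: distances for identifiers 0, 1, 2
]
short_feature_list = [
    [0, 1],  # image 0
    [2, 2],  # image 1
    [1, 0],  # image 2
]
distance_matrix = build_distance_matrix(short_feature_list, distance_table)
```

The reads from `distance_table` jump around according to the identifiers, while the rows of `distance_matrix` are written in order, which matches the access pattern of random reads and sequential writes.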
Alternatively, the direct memory access engine may load all or part of the short feature list into the second storage 205 of the heterogeneous system 20 according to the amount of free space remaining in the second storage 205.
The second memory 205 may be a storage medium with low read/write latency, such as a static random access memory (SRAM), and the scalar calculation engine may read the short feature list and the distance table directly from the second memory, avoiding the data transmission latency caused by reading from the first memory. In the conventional technology, the CPU loads the short feature list into memory and loads frequently accessed data into the CPU cache. Because the cache space is limited, the entire short feature list cannot be held in the cache, yet generating the distance matrix requires frequent reads of the short feature list and the distance table, so the data to be read is often absent from the cache (a cache miss), which slows data reads. In the present application, the second memory 205 is added to the heterogeneous system, the short feature list is loaded into the second memory 205, and the scalar calculation engine is connected to the second memory through the internal bus, so data does not need to be moved frequently during processing. In addition, the identifiers in the short feature list do not follow the segment order, whereas the elements of the distance matrix are written sequentially row by row, and the distance table and the short feature list are both stored in the second memory 205 of the heterogeneous system; the process can therefore be understood as random reads of the distance table combined with sequential writes of the distance matrix. Storing the short feature list in the second memory increases the speed of reading its data and further improves the efficiency of image retrieval.
After generating the distance matrix, the scalar calculation engine may notify the vector calculation engine to continue to perform the distance accumulation operation, see step S403.
S403, the vector calculation engine determines the accumulated distance vector according to the distance matrix.
Each row in the distance matrix obtained in step S402 has a plurality of distance values, each distance value represents a distance between a vector corresponding to a segment of the query vector and a vector corresponding to a segment of the central vector in the gallery, and the distance between the query vector and the same central vector in the gallery can be obtained only by performing an accumulation operation on all distances in the same row.
As shown in fig. 9, the vector calculation engine can accumulate multiple floating-point numbers with a single accumulation instruction. The conventional process involves a large number of floating-point accumulations: if the gallery includes 100 million vectors, 100 million accumulation operations are generated, each over M elements (M being the number of PQ segments). Executing the accumulation directly on the vector calculation engine greatly increases the speed of vector accumulation.
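In terms of arithmetic, the accumulation step reduces each row of the distance matrix to a single value. The sketch below shows only that semantics in plain Python; a vector engine would perform each row sum with a vector instruction rather than a loop, and the function name and sample values are hypothetical.

```python
def accumulate_rows(distance_matrix):
    """Sum the M per-segment distances of each image into one value."""
    return [sum(row) for row in distance_matrix]


distance_matrix = [[0.0, 0.5], [4.0, 2.0], [1.0, 9.0]]
accumulated = accumulate_rows(distance_matrix)
```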
S404, the scalar calculation engine determines a retrieval result according to the accumulated distance result.
The scalar calculation engine may rank the accumulated distance results first and then select at least one image matching the query vector as a search result based on the ranking results.
The sorting method may adopt Top X sorting, that is, X distance results are screened out from the distance results in descending order, where X is a positive integer greater than or equal to 1, and a specific value of X may be specified manually. Specifically, the sorting process of the accumulated distance results may adopt a binary heap manner to screen the distance results of Top X. For example, the scalar calculation engine may divide the data to be sorted into at least two groups, then sort the data in each group according to size, finally sort the sorting results of the different groups for the second time, and then select at least one accumulated distance result in the final sorting result as a retrieval result.
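A heap-based Top X selection can be sketched with the Python standard library. The sketch assumes that a smaller accumulated distance means a closer match, so it keeps the X smallest values; the function name and sample data are hypothetical, and `heapq.nsmallest` stands in for whatever binary heap implementation the scalar engine would use.

```python
import heapq


def top_x(accumulated, x):
    """Return (distance, image_index) pairs for the x smallest distances."""
    return heapq.nsmallest(x, ((d, i) for i, d in enumerate(accumulated)))


accumulated = [0.5, 6.0, 10.0, 0.2, 3.0]
best = top_x(accumulated, 2)
```

Keeping the image index next to each distance makes it possible to look up the matching gallery image once the selection is done.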
Further, the heterogeneous system may also determine the image in the associated gallery according to the search result of the query vector, specifically, since the short feature list is generated according to the image in the gallery, as shown in fig. 7, the short feature list may record a relationship between the identifier of the short feature and the image in the gallery, and then, the identifier of the image may also be added to the distance matrix generated according to the short feature list and the distance table. For example, in the distance matrix shown in fig. 9, a field for identifying an image may be added to each vector, and accordingly, a field for identifying an image may also be included in the accumulated distance result, and when the accumulated distance result is determined according to Top X or the like, an identifier of a matched image may be determined according to the field of the image, so as to obtain the matched image from a memory for storing images in the gallery, and then present the matched image as a retrieval result of the image to be retrieved to the user.
As a possible embodiment, after the heterogeneous system obtains the sorting result, the direct memory access engine may transfer the sorting result to the first storage, and then the host system searches for the image in the gallery associated with the sorting result as the search result according to the sorting result, and presents the search result of the search request to the user through an interface or a network (web) interface.
It should be noted that the images in the gallery may be stored in the system shown in fig. 2, for example, in the first memory 30 of the host system 10, or may be stored in a device outside the system. When the retrieval result of the query vector of the image to be retrieved is determined, the matching image may be obtained from the device outside the system according to the image identifier indicated in the retrieval result.
As one possible embodiment, after the distance table is generated in step S401, the image retrieval process may be performed in a multi-stage parallel manner. Specifically, the parallel processing may be performed in either of the following two modes:
Mode 1, parallel processing of multiple query requests.
When the heterogeneous system obtains a plurality of query requests, the query requests may be executed in parallel according to steps S401 to S404. Because the processing of each query request is relatively independent, each computing engine in the heterogeneous system may use a plurality of processor cores, processes, or threads to process the plurality of query requests at the same time, thereby implementing parallel processing of the plurality of query requests.
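The independence of query requests can be illustrated with a thread pool. In this sketch the function `search` is a hypothetical brute-force stand-in for the whole retrieval procedure, and the threads only illustrate the structure of concurrent request handling; they do not model the cores or engines of an actual heterogeneous system.

```python
from concurrent.futures import ThreadPoolExecutor


def search(query, gallery, x=1):
    """Stand-in for the retrieval steps: brute-force nearest gallery items."""
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(query, item)), idx)
        for idx, item in enumerate(gallery)
    )
    return [idx for _, idx in scored[:x]]


gallery = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
queries = [[0.1, 0.0], [4.0, 4.5], [1.2, 0.9]]

# Each request is independent, so the requests can be handled concurrently.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(lambda q: search(q, gallery), queries))
```

`pool.map` returns results in the order of the input queries, so each result can be matched back to its request without extra bookkeeping.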
Mode 2, parallel processing of the same query request.
For a single query request, parallel processing can be arranged according to the processing capacities of the matrix calculation engine, the vector calculation engine, and the scalar calculation engine as determined from statistical data. Specifically, fig. 11 is a schematic diagram of a parallel processing flow. As shown in the figure, after the matrix calculation engine generates the distance table, the scalar calculation engine may generate a part of the distance matrix, the vector calculation engine accumulates that part to obtain an accumulated distance vector, and the scalar calculation engine then performs Top X sorting on the accumulated distance vector to obtain the sorting result for that part of the distance matrix. This process may be referred to as an iterative process based on a partial distance matrix. The processing of one distance matrix may be divided into two or more sub-processes, each of which executes the iterative process on a partial distance matrix. Across these iterative processes, the final result for the whole distance matrix is obtained simply by updating the sorting result in the sorting stage. For example, the sorting result may be stored in the form of a stack, and each time an iterative process based on a partial distance matrix completes, the sorting result in the stack is updated to obtain the latest sorting result, until the entire distance matrix has been processed.
Alternatively, as shown in fig. 9, the distance calculation engine 201 may divide the gallery into a plurality of groups, and then sequentially calculate the distance between each segment of the query vector and the central vector corresponding to each group in the gallery, so as to generate a distance sub-table, where a distance sub-table is a part of the data of the distance table. Next, the scalar calculation engine 203 generates a distance matrix according to the distance sub-table; the vector calculation engine 202 determines the accumulated distance results according to the distance matrix; and the scalar calculation engine 203 determines the sorting order of the distance results according to the accumulated distance results. Each time the distance calculation engine 201 generates a distance sub-table, the scalar calculation engine 203 and the vector calculation engine 202 execute the corresponding processing in this order, until the whole distance table has been processed. The sorting process may maintain X vectors in a structure similar to a first-in first-out stack: each time X new vectors are acquired, they are compared with the vectors in the stack, and any newly acquired vector with a smaller distance replaces a vector in the original stack. With this implementation, the data can be segmented at a fine granularity so that distance matrix generation, vector accumulation, and sorting can be performed simultaneously, further improving the efficiency of image retrieval.
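The staged flow over distance sub-tables can be sketched as a small pipeline. The sketch makes several assumptions that are not stated in the application: each group carries its own center vectors (`centers`), each gallery entry is stored as a list of center indices (`codes`), squared Euclidean distance is used, and Python threads connected by queues stand in for the separate engines. It illustrates how the stages overlap, not how fast they run.

```python
import heapq
import queue
import threading

_DONE = object()  # sentinel that marks the end of a stream


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_sub_tables(query_segments, groups, out_q):
    """Stage 1: emit one distance sub-table per gallery group."""
    for group in groups:
        sub_table = [[squared_distance(segment, center) for center in centers]
                     for segment, centers in zip(query_segments, group["centers"])]
        out_q.put((group, sub_table))
    out_q.put(_DONE)


def lookup_and_accumulate(in_q, out_q):
    """Stages 2 and 3: build distance-matrix rows and sum each row."""
    for group, sub_table in iter(in_q.get, _DONE):
        for image_id, code in zip(group["ids"], group["codes"]):
            row = [sub_table[m][k] for m, k in enumerate(code)]
            out_q.put((sum(row), image_id))
    out_q.put(_DONE)


def pipelined_top_x(query_segments, groups, x):
    """Stage 4 runs here: keep the X smallest accumulated distances."""
    tables, distances = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=build_sub_tables, args=(query_segments, groups, tables)),
        threading.Thread(target=lookup_and_accumulate, args=(tables, distances)),
    ]
    for worker in workers:
        worker.start()
    best = []
    for candidate in iter(distances.get, _DONE):
        best = heapq.nsmallest(x, best + [candidate])
    for worker in workers:
        worker.join()
    return best
```

While one group's sub-table is being looked up and accumulated, the first stage can already compute the sub-table of the next group, which is the overlap the paragraph above describes.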
As described above, the image retrieval method provided by the application follows the idea of assigning dedicated tasks to dedicated engines: image retrieval is executed by a heterogeneous system comprising a plurality of dedicated computing engines, which addresses the low retrieval speed that results when a CPU (central processing unit) executes the image retrieval processing. Furthermore, a plurality of calculation modes are provided for vector distance calculation, which simplifies the distance calculation. Moreover, by mapping the distance calculation to a matrix calculation, the matrix calculation engine can complete the vector calculation quickly, further improving image retrieval efficiency and reducing data processing latency. In addition, by adding a shared memory to the heterogeneous system and using it to store the distance table and the short feature list, the cache miss problem can be avoided and data processing efficiency is improved. Finally, during image retrieval, a plurality of query requests, or different steps of the same query request, can be executed in parallel according to the processing capacity of each computing engine in the heterogeneous system, which further reduces retrieval latency and meets the requirement of real-time image retrieval.
As a possible embodiment, in addition to image retrieval, the method of the present application may also be applied to scenarios such as video retrieval, face recognition, and text retrieval. In a video retrieval scenario, retrieval processing may be performed on each frame of the video by using the methods of fig. 3 to fig. 11, and the retrieval results of all frames together are used as the retrieval result of the video, so that video retrieval can also be completed quickly. In a face recognition scenario, the face image to be recognized is used as the image to be retrieved, and the methods of fig. 3 to fig. 11 are executed to quickly obtain the matching image.
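The per-frame treatment of a video can be sketched as follows. This is an illustration only: `retrieve_image` is a hypothetical callable standing in for the methods of fig. 3 to fig. 11, and frames are assumed to be already decoded.

```python
def retrieve_video(frames, retrieve_image):
    """Run image retrieval on every frame and collect the per-frame results.

    Returns a dict that maps the frame number to that frame's retrieval result.
    """
    return {frame_no: retrieve_image(frame) for frame_no, frame in enumerate(frames)}
```

Since each frame is retrieved independently, the frames could equally be submitted as separate query requests and processed in parallel.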
It should be noted that, for simplicity of description, the above method embodiments are described as a series of action combinations. However, those skilled in the art should understand that the present application is not limited by the described order of actions, that the embodiments described in the specification are preferred embodiments, and that the actions mentioned are not necessarily required by the present application.
The method for image retrieval provided by the embodiment of the present application is described in detail above with reference to fig. 1 to 11, and the apparatus for image retrieval provided by the embodiment of the present application is described below with reference to fig. 12.
Fig. 12 is a schematic structural diagram of an image retrieval apparatus 600 provided in the present application, and as shown in the figure, the apparatus 600 includes a first calculating unit 601 and a second calculating unit 602, wherein,
a first calculating unit 601, configured to calculate similarities between the image to be retrieved and a plurality of images included in a gallery;
a second calculating unit 602, configured to determine a retrieval result according to similarities between the image to be retrieved and the multiple images included in the gallery, where the retrieval result includes whether a target image similar to the image to be retrieved is included in the gallery.
It should be understood that the first computing unit 601 in the apparatus 600 of the embodiment of the present application may correspond to the matrix computing engine 201 in fig. 2, and the second computing unit 602 may correspond to the vector computing engine 202 and the scalar computing engine 203 in fig. 2. The first computing unit 601 and the second computing unit 602 may also implement the image retrieval methods shown in fig. 3 to fig. 11 by software, in which case the apparatus 600 and its modules may be software modules. For example, a GPU calls instructions in a memory to implement the functions of the first computing unit 601 and the second computing unit 602, respectively. Alternatively, when the first computing unit 601 and the second computing unit 602 are implemented by software, computer instructions in a memory may be called by a CPU or another type of processor to implement their functions. Such a processor may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor or any conventional processor.
Optionally, the similarity between the image to be retrieved and the plurality of images included in the gallery is obtained according to distances between the query vector corresponding to the image to be retrieved and center vectors corresponding to the plurality of images, or the similarity between the image to be retrieved and the plurality of images included in the gallery is a distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images.
Optionally, the first calculating unit 601 is further configured to convert the query vector into a query matrix, convert the center vector into a center matrix, and obtain a distance table according to the query matrix and the center matrix, where the distance table is used to determine distances between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images.
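One way to map the distance calculation onto a matrix calculation is sketched below. It rests on assumptions that this passage does not spell out: the distance is the squared Euclidean distance, expanded as ‖q − c‖² = ‖q‖² − 2 q·c + ‖c‖²; the query matrix has one row per segment of the query vector; and, for brevity, all segments share one hypothetical set of center vectors. The matrix product is written in plain Python where a matrix calculation engine would be used.

```python
def matmul_transposed(a, b):
    """Return a @ b^T for two row-major matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b] for row_a in a]


def distance_table(query_vector, centers, segment_len):
    """Squared distances between every query segment and every center vector."""
    # Query matrix: one row per segment of the query vector.
    query_matrix = [query_vector[i:i + segment_len]
                    for i in range(0, len(query_vector), segment_len)]
    cross = matmul_transposed(query_matrix, centers)  # Q C^T, the matrix step
    q_norms = [sum(x * x for x in row) for row in query_matrix]
    c_norms = [sum(x * x for x in row) for row in centers]
    return [[q_norms[m] - 2.0 * cross[m][k] + c_norms[k]
             for k in range(len(centers))]
            for m in range(len(query_matrix))]
```

Entry `[m][k]` of the returned table is the distance between segment `m` of the query vector and center `k`; the dominant cost is the single matrix product.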
Optionally, the apparatus 600 further includes a shared memory (not shown in fig. 12), and the second calculating unit 602 is further configured to obtain the distance table from the shared memory, and to determine whether the gallery contains a target image similar to the image to be retrieved according to the distance table.
Optionally, the apparatus 600 further includes a third calculating unit 603, configured to obtain an index list corresponding to the images in the gallery, where the second calculating unit 602 is specifically configured to obtain the distance table from the shared memory, and to determine whether the gallery contains a target image similar to the image to be retrieved according to the distance table.
The second calculating unit 602 is further configured to obtain a distance matrix according to the distance table and the index list, where the distance matrix is used to indicate a distance value associated with each index in the index list; and to determine the distances between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images according to the distance matrix.
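Building the distance matrix from the distance table and the index list amounts to a table lookup, sketched below. The layout is an assumption made for illustration: row `m` of the distance table holds the distances for segment `m` of the query vector, and each entry of the index list holds one center index per segment.

```python
def build_distance_matrix(distance_table, index_list):
    """Row n holds the distance values selected by the n-th index entry."""
    return [[distance_table[m][k] for m, k in enumerate(indices)]
            for indices in index_list]


def accumulated_distances(distance_matrix):
    """Sum each row to get one distance per gallery entry."""
    return [sum(row) for row in distance_matrix]
```

For example, with the table `[[1, 0, 2], [4, 5, 1]]` and the index entry `[1, 2]`, the matrix row is `[0, 1]` and the accumulated distance is 1.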
Optionally, the second calculating unit 602 is further configured to determine a retrieval result according to the result of sorting, based on the distance matrix, the distances between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images.
Optionally, the second calculating unit 602 is further configured to determine the retrieval result according to a preset threshold, where the distance between the query vector and the vector associated with each image included in the retrieval result satisfies a preset condition. The preset condition is that the distance between the vector associated with the image and the query vector is greater than or equal to the preset threshold, or that the distance between the vector associated with the image and the query vector is smaller than the preset threshold.
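The two ways of determining the retrieval result, by sorting and by a preset threshold, can be sketched as follows. The function names and the `keep_below` flag are hypothetical; which side of the threshold counts as a match depends on whether the stored value is a distance or a similarity, which is why both conditions are offered.

```python
def select_by_rank(distances, x):
    """Indices of the X smallest accumulated distances, closest first."""
    return sorted(range(len(distances)), key=distances.__getitem__)[:x]


def select_by_threshold(distances, threshold, keep_below=True):
    """Indices whose distance is < threshold, or >= threshold if keep_below is False."""
    if keep_below:
        return [i for i, d in enumerate(distances) if d < threshold]
    return [i for i, d in enumerate(distances) if d >= threshold]
```

An empty list from `select_by_threshold` corresponds to the case in which the gallery contains no target image similar to the image to be retrieved.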
Optionally, the image to be retrieved comes from a retrieval request, where the retrieval request includes at least one image to be retrieved, or the retrieval request includes a video to be retrieved, where the video to be retrieved includes at least one frame of the image to be retrieved.
The apparatus 600 according to the embodiment of the present application may correspond to performing the method described in the embodiment of the present application, and the above and other operations and/or functions of each unit in the apparatus 600 are respectively for implementing corresponding flows of each method in fig. 3 to fig. 11, and are not described herein again for brevity.
The device 600 provided by the application can utilize a plurality of computing units to respectively execute different data processing processes, and the data processing efficiency is improved. In the process of calculating the distance of the vector, a plurality of calculation modes are adopted to simplify the processing process of distance calculation. Moreover, the device 600 may map the distance calculation process to a matrix calculation process, and utilize the first calculation unit to quickly complete the distance calculation processing process, thereby further improving the efficiency of image retrieval processing and reducing the time delay of data processing. In addition, in the image retrieval process, a plurality of query requests or different steps in the same query request can be executed in parallel based on the processing capacity of each computing engine in the heterogeneous system, so that the processing time delay of the image retrieval is further reduced, and the requirement of real-time image retrieval is met.
The image retrieval system is shown as a heterogeneous system 20 in fig. 1, and the heterogeneous system 20 is used to implement the functions of the corresponding processes of the methods in fig. 3 to fig. 11, and for brevity, no further description is given here.
The present application further provides a data processing system, which includes a host system 10 and a heterogeneous system 20 shown in fig. 1, where the host system is configured to obtain a retrieval request, send an image to be retrieved in the retrieval request to the heterogeneous system, and then the heterogeneous system executes the functions of the corresponding processes of the methods shown in fig. 3 to fig. 11, which are not described herein again for brevity.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded or executed on a computer, the processes or functions described in the embodiments of the application are generated in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that contains one or more collections of available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state drive (SSD).
While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (33)

1. An image retrieval system, comprising: an interface and a set of compute engines;
the interface is used for acquiring an image to be retrieved;
the set of compute engines includes a first compute engine and a second compute engine, wherein,
the first calculation engine is used for calculating the similarity between the image to be retrieved and a plurality of images contained in the gallery;
the second calculation engine is used for determining a retrieval result according to the similarity between the image to be retrieved and a plurality of images contained in the gallery, and the retrieval result comprises a target image similar to the image to be retrieved.
2. The system according to claim 1, wherein the similarity between the image to be retrieved and the plurality of images included in the gallery is obtained according to a distance between a query vector corresponding to the image to be retrieved and a center vector corresponding to the plurality of images, or the similarity between the image to be retrieved and the plurality of images included in the gallery is a distance between a query vector corresponding to the image to be retrieved and a center vector corresponding to the plurality of images.
3. The system of claim 2, wherein the first computing engine is specifically configured to convert the query vector into a query matrix, convert the center vector into a center matrix, and obtain a distance table according to the query matrix and the center matrix, where the distance table is used to determine distances between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images.
4. The system of claim 3, further comprising a shared memory, wherein the shared memory is coupled to the first compute engine via an internal bus;
the first compute engine is further to store the distance table to the shared memory via the internal bus.
5. The system of claim 4, wherein the shared memory is coupled to the second compute engine via an internal bus;
the second calculation engine is specifically configured to obtain the distance table from the shared memory; and determining whether the gallery contains a target image similar to the image to be retrieved or not according to the distance table.
6. The system of claim 5, further comprising a third computing engine, wherein the third computing engine is configured to obtain an index list corresponding to the images in the gallery;
the second calculation engine is specifically configured to obtain a distance matrix according to the distance table and the index list, where the distance matrix is used to indicate a distance value associated with each index in the index list; and determining the distance between the query vector corresponding to the image to be retrieved and the central vector corresponding to the plurality of images according to the distance matrix.
7. The system of claim 6,
the second calculation engine is specifically configured to determine, according to the distance matrix, a search result according to a result of ranking of distances between the query vector corresponding to the image to be searched and the center vectors corresponding to the plurality of images.
8. The system of claim 6,
the second calculation engine is specifically configured to determine the search result according to a preset threshold, where a distance between a vector associated with the image and the query vector included in the search result meets a preset condition, where the preset condition includes that the distance between the vector associated with the image and the query vector is greater than or equal to the preset threshold, or the distance between the vector associated with the image and the query vector is smaller than the preset threshold.
9. The system of any of claims 1 to 8, wherein the system is a chip or a PCIe card, and the first compute engine and the second compute engine are connected by an internal bus.
10. The system of claim 9, wherein the first compute engine and the second compute engine in the set of compute engines are each logic circuits.
11. The system of any one of claims 1 to 8, wherein the system is a virtualization system, and the first compute engine and the second compute engine in the set of compute engines are each a virtual machine or a container running on a device in the virtualization system.
12. The system according to any one of claims 1 to 11, wherein the image to be retrieved comes from a retrieval request, the retrieval request includes at least one image to be retrieved, or the retrieval request includes a video to be retrieved, and the video to be retrieved includes at least one frame of the image to be retrieved.
13. A method of image retrieval, the method being applicable to an image retrieval system, the image retrieval system comprising an interface and a set of compute engines, the set of compute engines comprising a first compute engine and a second compute engine, the method comprising:
the first calculation engine calculates the similarity between the image to be retrieved and a plurality of images contained in a gallery;
and the second calculation engine determines a retrieval result according to the similarity between the image to be retrieved and a plurality of images contained in the gallery, wherein the retrieval result comprises whether a target image similar to the image to be retrieved is contained in the gallery.
14. The method according to claim 13, wherein the similarity between the image to be retrieved and the plurality of images included in the gallery is obtained according to a distance between a query vector corresponding to the image to be retrieved and a center vector corresponding to the plurality of images, or the similarity between the image to be retrieved and the plurality of images included in the gallery is a distance between a query vector corresponding to the image to be retrieved and a center vector corresponding to the plurality of images.
15. The method of claim 14, wherein the calculating, by the first computing engine, similarities between the image to be retrieved and the plurality of images included in the gallery comprises:
and the first calculation engine converts the query vector into a query matrix, converts the central vector into a central matrix, and obtains a distance table according to the query matrix and the central matrix, wherein the distance table is used for determining the distance between the query vector corresponding to the image to be retrieved and the central vector corresponding to the plurality of images.
16. The method of claim 15, wherein the system further comprises a shared memory, the shared memory being coupled to the first compute engine via an internal bus;
the first compute engine stores the distance table to the shared memory via the internal bus.
17. The method of claim 16, wherein the shared memory is connected to the second compute engine via an internal bus;
the second calculation engine acquires the distance table from the shared memory; and determining whether the gallery contains a target image similar to the image to be retrieved according to the distance table.
18. The method according to claim 17, wherein the system further comprises a third calculation engine, the third calculation engine is configured to obtain an index list corresponding to the images in the gallery, and the second calculation engine is specifically configured to obtain the distance table from the shared memory; determining whether the gallery contains a target image similar to the image to be retrieved or not according to the distance table;
the second calculation engine obtains a distance matrix according to the distance table and the index list, wherein the distance matrix is used for indicating a distance value associated with each index in the index list; and determining the distance between the query vector corresponding to the image to be retrieved and the central vector corresponding to the plurality of images according to the distance matrix.
19. The method of claim 18, wherein the determining, by the second computing engine, whether a target image similar to the image to be retrieved is included in the gallery according to the distance table comprises:
and the second calculation engine determines a retrieval result according to the distance matrix and the sequencing result of the distances between the query vector corresponding to the image to be retrieved and the central vectors corresponding to the plurality of images.
20. The method of claim 18, wherein the determining, by the second computing engine, whether a target image similar to the image to be retrieved is included in the gallery according to the distance table comprises:
the second calculation engine determines the retrieval result according to a preset threshold, and the distance between the vector associated with the image and the query vector included in the retrieval result meets a preset condition, where the preset condition includes that the distance between the vector associated with the image and the query vector is greater than or equal to the preset threshold, or the distance between the vector associated with the image and the query vector is smaller than the preset threshold.
21. The method of any of claims 13 to 18, wherein the system is a chip or PCIe card, and the first compute engine and the second compute engine are connected via an internal bus.
22. The method of claim 21, wherein the first compute engine and the second compute engine in the set of compute engines are each logic circuits.
23. The method of any one of claims 13 to 18, wherein the system is a virtualization system, and wherein the first compute engine and the second compute engine in the set of compute engines are each a virtual machine or a container running on a device in the virtualization system.
24. The method according to any one of claims 13 to 23, wherein the image to be retrieved comes from a retrieval request, the retrieval request includes at least one image to be retrieved, or the retrieval request includes a video to be retrieved, and the video to be retrieved includes at least one frame of the image to be retrieved.
25. An apparatus for image retrieval, the apparatus comprising a first computing unit and a second computing unit,
the first calculating unit is used for calculating the similarity between the image to be retrieved and a plurality of images contained in the gallery;
the second computing unit is configured to determine a retrieval result according to similarities between the image to be retrieved and the plurality of images included in the gallery, where the retrieval result includes whether a target image similar to the image to be retrieved is included in the gallery.
26. The apparatus according to claim 25, wherein the similarity between the image to be retrieved and the plurality of images included in the gallery is obtained according to a distance between a query vector corresponding to the image to be retrieved and a center vector corresponding to the plurality of images, or the similarity between the image to be retrieved and the plurality of images included in the gallery is a distance between a query vector corresponding to the image to be retrieved and a center vector corresponding to the plurality of images.
27. The apparatus of claim 26,
the first calculating unit is further configured to convert the query vector into a query matrix, convert the center vector into a center matrix, and obtain a distance table according to the query matrix and the center matrix, where the distance table is used to determine distances between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the plurality of images.
28. The apparatus of claim 27, wherein the apparatus further comprises a shared memory;
the second computing unit is further configured to obtain the distance table from the shared memory; and determining whether the gallery contains a target image similar to the image to be retrieved according to the distance table.
29. The apparatus according to claim 28, further comprising a third computing unit,
the third computing unit is configured to obtain an index list corresponding to the images in the gallery, and the second computing unit is specifically configured to obtain the distance table from the shared memory; determining whether the gallery contains a target image similar to the image to be retrieved or not according to the distance table;
the second calculating unit is further configured to obtain a distance matrix according to the distance table and the index list, where the distance matrix is used to indicate a distance value associated with each index in the index list; and determining the distance between the query vector corresponding to the image to be retrieved and the central vector corresponding to the plurality of images according to the distance matrix.
30. The apparatus of claim 28,
the second calculating unit is further configured to determine, according to the distance matrix, a search result according to a result of ranking the distances between the query vector corresponding to the image to be searched and the center vectors corresponding to the plurality of images.
31. The apparatus of claim 28,
the second calculating unit is further configured to determine the retrieval result according to a preset threshold, where a distance between a vector associated with the image and the query vector included in the retrieval result meets a preset condition, where the preset condition includes that the distance between the vector associated with the image and the query vector is greater than or equal to the preset threshold, or the distance between the vector associated with the image and the query vector is smaller than the preset threshold.
32. The apparatus according to any one of claims 25 to 31, wherein the image to be retrieved is from a retrieval request, the retrieval request includes at least one image to be retrieved, or the retrieval request includes a video to be retrieved, and the video to be retrieved includes at least one frame of the image to be retrieved.
33. A computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the operational steps of the method of any one of claims 13 to 24.
CN202011621979.6A 2020-07-07 2020-12-30 Image retrieval system, method and device Pending CN113971225A (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202211628292.4A CN115878824B (en) 2020-07-07 2020-12-30 Image retrieval system, method and device
EP21837926.1A EP4167107A4 (en) 2020-07-07 2021-06-12 Image retrieval system, method and apparatus
PCT/CN2021/099890 WO2022007596A1 (en) 2020-07-07 2021-06-12 Image retrieval system, method and apparatus
US18/151,433 US20230161811A1 (en) 2020-07-07 2023-01-07 Image search system, method, and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010645032 2020-07-07
CN2020106450322 2020-07-07

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202211628292.4A Division CN115878824B (en) 2020-07-07 2020-12-30 Image retrieval system, method and device

Publications (1)

Publication Number Publication Date
CN113971225A true CN113971225A (en) 2022-01-25

Family

ID=79586122

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202011621979.6A Pending CN113971225A (en) 2020-07-07 2020-12-30 Image retrieval system, method and device
CN202011620577.4A Pending CN113971224A (en) 2020-07-07 2020-12-30 Image retrieval system, method and related equipment

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202011620577.4A Pending CN113971224A (en) 2020-07-07 2020-12-30 Image retrieval system, method and related equipment

Country Status (1)

Country Link
CN (2) CN113971225A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116777727B (en) * 2023-06-21 2024-01-09 北京忆元科技有限公司 Integrated memory chip, image processing method, electronic device and storage medium

Also Published As

Publication number Publication date
CN113971224A (en) 2022-01-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination