CN112214627A

CN112214627A - Search method, readable storage medium and electronic device

Info

Publication number: CN112214627A
Application number: CN201910630478.5A
Authority: CN
Inventors: 谢超; 竺知茹; 易小萌
Original assignee: Shanghai Zerui Information Technology Co ltd
Current assignee: Shanghai Zerui Information Technology Co ltd
Priority date: 2019-07-12
Filing date: 2019-07-12
Publication date: 2021-01-12

Abstract

The embodiments of the invention disclose a search method, a readable storage medium, and an electronic device. The method searches data on a heterogeneous computer architecture formed by a central processing unit (CPU) and a graphics processing unit (GPU), so that data is screened according to attribute tags during the search. The parallel acceleration of the graphics processing unit is exploited, while the performance bottleneck caused by processing large-scale data in the graphics processing unit is avoided. In addition, a pipeline is constructed that comprises a data screening step in the central processing unit and a data transmission step from the central processing unit to the graphics processor. Because the components of the pipeline perform data screening, transmission, and search computation in parallel, the extra overhead of screening and copying data in the central processing unit is effectively hidden.

Description

Search method, readable storage medium and electronic device

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a search method, a readable storage medium, and an electronic device.

Background

A similarity query generally refers to searching for data objects by comparing similarity, for example searching a picture library for the one or more pictures most similar to a specified picture. In modern big data applications, data volumes keep growing and data types are increasingly diverse, so efficient similarity queries over complex data such as text, pictures, speech, and video are becoming more important. At present, similarity search is usually performed by nearest neighbor search, which finds the one or more data items closest to a specified search object in a search space and can be implemented by various algorithms. There are many ways to measure the distance to the specified search object, the most classical being the Euclidean distance. However, when the data set carries additional attribute tags, existing nearest neighbor search algorithms do not support screening the data by those attribute tags during the search. For example, an existing algorithm can find the picture in a picture library that is most similar to a given face, but it cannot restrict the search to pictures carrying attribute tags such as "age: 20" and "likes basketball" and return the one most similar to the searched face.

Disclosure of Invention

In view of this, embodiments of the present invention provide a search method, a readable storage medium, and an electronic device, which are intended to filter data through an attribute tag when performing a nearest neighbor search.

In a first aspect, an embodiment of the present invention provides a search method, including:

determining a retrieval feature vector and a first label according to an input object;

determining a first data set according to the first label and the retrieval feature vector, the first data set comprising at least one data subset, each data subset comprising a plurality of clustered feature vectors and second labels corresponding to the feature vectors;

determining a screening mode for each data subset, the screening mode comprising screening in a central processor or screening in a graphics processor;

screening a data subset in the first data set according to the first label and the determined screening mode;

matching the retrieval feature vector with feature vectors in the screened data subset;

and determining output data according to the matching result.

Further, the determining a first data set according to the first label and the retrieval feature vector comprises:

determining a second data set, the second data set containing all searchable data subsets;

determining screening strength according to the first label and a second label corresponding to the feature vector contained in each data subset, wherein the screening strength is used for judging the quantity of the feature vectors contained in each data subset after screening;

determining the number N of the data subsets contained in the first data set according to the number of the data subsets contained in the second data set and the screening strength corresponding to each data subset;

calculating the distance between the retrieval feature vector and the center of each data subset contained in the second data set, wherein the center of each data subset is used for representing the corresponding data subset;

and determining N data subsets to form a first data set according to the distance.

Further, the determining a screening mode for each data subset comprises:

in response to the screening strength corresponding to the data subset being higher than a first threshold, determining the screening mode of the data subset as screening in the central processing unit.

Further, the determining a screening mode for each data subset comprises:

in response to the screening strength corresponding to the data subset being lower than the first threshold, determining the screening mode of the data subset as screening in the graphics processor.

Further, the screening a data subset in the first data set according to the first label and the determined screening mode comprises:

in response to the screening mode of the data subset being screening in the graphics processor, transmitting the data subset to the graphics processor for screening;

in response to the screening mode of the data subset being screening in the central processing unit, performing screening of the data subset in the central processing unit by a first processing unit of the central processing unit; and

after the screening is completed, transmitting the screened data subset to the graphics processor through a second processing unit of the central processing unit.

Further, the screening of a next data subset is processed by the first processing unit while the second processing unit transmits the screened data subset to the graphics processor.

Further, the transmitting, by the second processing unit of the central processing unit, the filtered subset of data to the graphics processor comprises:

in response to the number of feature vectors contained in a plurality of screened data subsets being less than a second threshold, aggregating the plurality of data subsets into one data subset and transmitting that data subset to the graphics processor.

In a second aspect, embodiments of the present invention also provide an electronic device, including a memory and a processor, where the memory is used to store one or more computer program instructions, and the processor executes the one or more computer program instructions to implement the method according to any one of the first aspect.

In a third aspect, the present invention also provides a computer-readable storage medium for storing computer program instructions, which when executed by a processor implement the method according to any one of the first aspect.

According to the embodiments of the invention, data is searched on a heterogeneous computer architecture formed by a central processing unit (CPU) and a graphics processing unit (GPU), and the data is screened according to attribute tags during the search. The parallel acceleration of the graphics processing unit is exploited, while the performance bottleneck caused by processing large-scale data in the graphics processing unit is avoided. In addition, by constructing a pipeline and performing data screening, transmission, and search computation in parallel, the extra overhead of screening and copying data in the central processing unit is effectively hidden.

Drawings

The above and other objects, features and advantages of the present invention will become more apparent from the following description of the embodiments of the present invention with reference to the accompanying drawings, in which:

FIG. 1 is a schematic diagram of a heterogeneous computer architecture;

FIG. 2 is a flow chart of a search method according to an embodiment of the present invention;

FIG. 3 is a flow chart of an optional implementation, in an embodiment of the present invention, of determining a first data set according to the first label and the retrieval feature vector;

FIG. 4 is a diagram illustrating a process of determining a second data set according to an embodiment of the invention;

FIG. 5 is a schematic diagram of a pipeline formed by the data screening step and the data transmission step according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of an electronic device according to an embodiment of the invention.

Detailed Description

The present invention will be described below based on examples, but the present invention is not limited to these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods and procedures have not been described in detail so as not to obscure the present invention.

Further, those of ordinary skill in the art will appreciate that the drawings provided herein are for illustrative purposes and are not necessarily drawn to scale.

Unless the context clearly requires otherwise, throughout the description and the claims, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, what is meant is "including, but not limited to".

In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified.

FIG. 1 is a schematic diagram of a heterogeneous computer architecture. As shown in FIG. 1, the architecture is composed of a central processing unit (CPU) and a graphics processing unit (GPU) connected via a high-speed serial bus (PCIe bus).

Specifically, the central processing unit and the graphics processor each include a control unit (Control) 10, an arithmetic logic unit (ALU) 11, a cache (Cache) 12, and a dynamic random access memory (DRAM) 13. As can be seen, the central processing unit has fewer computing cores and the graphics processor has more, which makes the graphics processor better suited to tasks that are computationally simple but highly parallel, and the central processing unit better suited to tasks that are computationally complex but less parallel. In the search method provided by the embodiment of the invention, tasks of the latter kind, such as determining the first data set and determining the screening mode of each data subset, can be processed by the central processing unit, while tasks of the former kind, such as matching the retrieval feature vector with the feature vectors in the screened data subsets, can be processed by the graphics processor.

However, searching on the heterogeneous computer architecture raises the following problems. Transferring a large amount of data from the computer main memory to the global memory of the graphics processor creates a performance bottleneck in data processing. Moreover, once the data volume exceeds the global memory capacity, a large amount of data must be swapped between the computer main memory and the video memory of the graphics processor during processing, which further increases the data transmission overhead and degrades processing performance.

Therefore, the search method provided by the embodiment of the invention searches data on a heterogeneous computer architecture formed by a central processing unit (CPU) and a graphics processing unit (GPU), screens the data according to attribute tags during the search, and makes reasonable use of the parallel acceleration of the graphics processing unit to perform feature screening and search computation on the data set, thereby avoiding the performance bottleneck caused by processing very large data in the graphics processing unit. In addition, a pipeline is constructed that comprises a data screening step in the central processing unit and a data transmission step from the central processing unit to the graphics processor. Because the components of the pipeline perform data screening, transmission, and search computation in parallel, the extra overhead of screening and copying data in the central processing unit is effectively hidden.

FIG. 2 is a flowchart of a search method according to an embodiment of the present invention. As shown in FIG. 2, the search method includes:

Step S100: determining a retrieval feature vector and a first label according to an input object.

Specifically, the input object is the search data, such as a picture or text. Before the search is performed, the input object is converted into a form suitable for data analysis, processing, and calculation, for example into a retrieval feature vector. The conversion may be performed, for example, by a convolutional neural network (CNN). The first label comprises at least one attribute tag corresponding to the input object and is used for screening the data subsets during the search; the first label is not converted when the input object is converted. The first label represents certain attributes of the input object. For example, when the input object is a picture, the first label may represent features of the picture such as its shooting location, or attributes of a person in the picture or of the photographer, including hobbies, age, educational background, and the like.

Step S200: determining a first data set according to the first label and the retrieval feature vector.

Specifically, the first data set comprises at least one data subset, each data subset comprising a plurality of clustered feature vectors and second labels corresponding to the feature vectors. The data represented by the feature vectors in the first data set has the same data type as the input object; for example, when the input object is a picture, the feature vectors in the first data set all represent pictures. Each second label includes at least one attribute tag characterizing certain features of the corresponding feature vector. For example, when a picture shows a boy, the second label may include "birthplace: Beijing", "age: twenties", "occupation: student", and the like. The first data set consists of part of the searchable data. The screening strength of the first label can be determined according to the first label and the second labels corresponding to the feature vectors contained in the first data set. The screening strength may be estimated from the distribution of attribute tags in the second data set, that is, the set of all searchable data subsets. For example, the frequency of occurrence of each attribute tag in the second data set is determined, and the attribute tags with higher frequencies are identified. It is then determined whether each attribute tag in the first label is one of these higher-frequency attribute tags, and how frequently each attribute tag in the first label occurs in the second data set. From this, the percentage that the first label accounts for among the attribute tags of the second data set is estimated, and 1 minus that percentage is taken as the screening strength.

The first label is used for screening each data subset, and the screening strength indicates how many feature vectors each data subset contains after screening: the higher the screening strength, the fewer feature vectors remain, and the lower the screening strength, the more remain. Accordingly, when the screening strength is low, the search is inefficient if the first data set contains too much data; when the screening strength is high, a predetermined number of results cannot be obtained if the first data set contains too little data. Therefore, to reduce the amount of calculation, improve search efficiency, and still obtain sufficient search results, the size of the first data set, that is, the number of feature vectors it contains, needs to be determined according to the first label and the retrieval feature vector.

Step S300: determining a screening mode for each data subset.

Specifically, the screening mode is either screening in the central processor or screening in the graphics processor. If screening is performed in the central processing unit, memory space must be reallocated for the screened data subset, which incurs the extra overhead of a data copy. If screening is performed in the graphics processor, the data subset transferred to the global memory is accessed and processed directly. Screening in the graphics processor is highly concurrent and faster, but transferring the unscreened data subset from the central processor to the graphics processor incurs additional performance overhead. The screening mode of a data subset is therefore determined according to the current usage of computing resources on the central processing unit and the graphics processor, the overhead of transmitting the data subset from the central processing unit to the graphics processor, and the overhead of the data copy caused by screening in the central processing unit.

Further, the screening mode of a data subset may also be determined according to the screening strength. Each data subset corresponds to a screening strength, which is determined according to the first label and the second labels corresponding to the feature vectors contained in that data subset. When the screening strength is high, the screened data subset contains few feature vectors, so the extra data copy overhead of screening in the central processing unit is small and much of the overhead of transmitting the data subset from the central processing unit to the graphics processor is saved. When the screening strength is low, the screened data subset contains many feature vectors, so the extra data copy overhead of screening in the central processing unit is large and little transmission overhead is saved.

Further, in an optional implementation manner, the step S300 includes:

Step S310: in response to the screening strength corresponding to the data subset being higher than a first threshold, determining that the screening mode of the data subset is screening in the central processing unit.

Step S320: in response to the screening strength corresponding to the data subset being lower than the first threshold, determining that the screening mode of the data subset is screening in the graphics processor.

The first threshold is determined according to the overhead of transmitting the data subset from the central processor to the graphics processor and the overhead of the data copy generated by screening in the central processor.
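The threshold rule of steps S310 and S320 can be sketched as follows. This is a minimal illustration only: the names `ScreeningMode` and `choose_screening_mode` are hypothetical, and sending a strength exactly equal to the threshold to the graphics processor is an assumption, since the text does not specify that case.

```python
from enum import Enum


class ScreeningMode(Enum):
    CPU = "cpu"
    GPU = "gpu"


def choose_screening_mode(strength: float, first_threshold: float) -> ScreeningMode:
    """Pick where a data subset is screened, following steps S310 and S320."""
    if strength > first_threshold:
        # Few vectors survive: screen on the CPU and transfer only the survivors.
        return ScreeningMode.CPU
    # Many vectors survive: transfer the whole subset and screen on the GPU.
    # Assumption: a strength equal to the threshold is also sent to the GPU.
    return ScreeningMode.GPU


print(choose_screening_mode(0.9, 0.5))  # ScreeningMode.CPU
print(choose_screening_mode(0.2, 0.5))  # ScreeningMode.GPU
```

In practice the threshold value itself would have to be calibrated from the measured transfer and copy overheads mentioned above; the value 0.5 here is arbitrary.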

Step S400: screening the data subsets in the first data set according to the first label and the determined screening mode.

Specifically, the screening mode is the one determined in step S300. In response to the screening mode of a data subset being screening in the graphics processor, the data subset is transmitted to the graphics processor for screening. In response to the screening mode of a data subset being screening in the central processing unit, the data subset is screened in the central processing unit by a first processing unit of the central processing unit, and after the screening is completed, the screened data subset is transmitted to the graphics processor by a second processing unit of the central processing unit. The first processing unit and the second processing unit are both program modules of the central processing unit. When screening is performed in the central processing unit, the screening step and the transmission step operate as a pipeline: while the second processing unit transmits a screened data subset to the graphics processor, the first processing unit screens the next data subset. When screening is performed in the graphics processor, the unscreened data subset is also transmitted to the graphics processor through the pipeline, that is, the central processing unit screens the next data subset while the unscreened data subset is being transmitted.

In the search method, each data subset must be computed and searched independently. To improve the efficiency of the pipeline, in response to the number of feature vectors contained in a plurality of screened data subsets being smaller than a second threshold during data transmission, those data subsets are aggregated into one data subset before being transmitted to the graphics processor. In other words, when some data subsets contain too few feature vectors after screening, they may be combined into a single data subset to save subsequent data transmission time and computation time in the graphics processor.
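The aggregation of small screened data subsets can be sketched as follows. The function name `aggregate_small_subsets` is hypothetical, and merging all undersized subsets into a single batch is an assumption; the text does not specify how the subsets to be combined are grouped.

```python
from typing import List, Sequence

Vector = Sequence[float]


def aggregate_small_subsets(
    subsets: List[List[Vector]], second_threshold: int
) -> List[List[Vector]]:
    """Merge screened subsets with fewer than second_threshold vectors into one."""
    large = [s for s in subsets if len(s) >= second_threshold]
    small = [s for s in subsets if len(s) < second_threshold]
    merged = [vector for subset in small for vector in subset]
    # Empty subsets contribute nothing; skip the merged batch if it is empty.
    return large + ([merged] if merged else [])


screened = [[(0.0,), (1.0,), (2.0,)], [(3.0,)], [], [(4.0,), (5.0,)]]
batches = aggregate_small_subsets(screened, second_threshold=3)
print(len(batches))  # 2: one large subset plus one merged batch
```

Four screened subsets become two transfers here, which is the saving in transmission and per-subset computation that the paragraph above describes.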

Further, a screened data cache area is maintained in the memory of the central processing unit, and a video memory data cache area is maintained in the video memory of the graphics processor. When a data subset is screened in the central processing unit, the screened data cache area is first checked for free storage space. If free storage space is available, storage of a suitable size is allocated for copying and screening the data.
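The pipelined screening and transmission, together with the bounded screened data cache area, can be sketched with two threads and a bounded queue. This is a structural illustration under stated assumptions: `run_pipeline`, `screen`, `transfer`, and `buffer_slots` are hypothetical names, the queue stands in for the screened data cache area, and a plain Python list stands in for the video memory. In CPython the two threads show the structure only; a real implementation would overlap CPU screening with an asynchronous host-to-device copy.

```python
import queue
import threading
from typing import Callable, List, Sequence

Subset = List[Sequence[float]]
_DONE = object()  # sentinel marking the end of the stream


def run_pipeline(
    subsets: List[Subset],
    screen: Callable[[Subset], Subset],
    transfer: Callable[[Subset], None],
    buffer_slots: int = 2,
) -> None:
    """Screen subsets in one thread while another thread transfers finished ones."""
    # Bounded queue standing in for the screened data cache area: put() blocks
    # while no slot is free, mirroring the check for free storage space.
    screened = queue.Queue(maxsize=buffer_slots)

    def screening_stage() -> None:  # plays the role of the first processing unit
        for subset in subsets:
            screened.put(screen(subset))
        screened.put(_DONE)

    def transfer_stage() -> None:  # plays the role of the second processing unit
        while True:
            item = screened.get()
            if item is _DONE:
                break
            transfer(item)

    stages = [
        threading.Thread(target=screening_stage),
        threading.Thread(target=transfer_stage),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()


gpu_side: List[Subset] = []  # stand-in for the video memory
run_pipeline(
    subsets=[[(1.0,), (-1.0,)], [(2.0,)], [(-3.0,)]],
    screen=lambda subset: [v for v in subset if v[0] > 0],
    transfer=gpu_side.append,
)
print(gpu_side)  # [[(1.0,)], [(2.0,)], []]
```

Because there is a single producer and a single consumer on a first-in, first-out queue, the subsets arrive in the order in which they were screened.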

Further, each data subset includes a plurality of clustered feature vectors and second labels corresponding to the feature vectors. The data subsets in the first data set are screened according to a set screening condition, the relation between the first label and the second labels, and the determined screening mode. For example, when the set screening condition is to retain the feature vectors whose second label is the same as the first label, the feature vectors in the data subset whose second label is the same as the first label are retained, and the feature vectors whose second label differs from the first label are removed.
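The example screening condition can be sketched as follows. The representation of a label as a dictionary of attribute name to value and the name `screen_subset` are illustrative assumptions, as is reading "the same as the first label" to mean that the second label contains every attribute of the first label with the same value.

```python
from typing import Dict, List, Sequence, Tuple

Label = Dict[str, str]  # hypothetical representation: attribute name -> value
Vector = Sequence[float]


def screen_subset(
    subset: List[Tuple[Vector, Label]], first_label: Label
) -> List[Tuple[Vector, Label]]:
    """Keep the vectors whose second label contains every first-label attribute."""
    return [
        (vector, label)
        for vector, label in subset
        if all(label.get(key) == value for key, value in first_label.items())
    ]


subset = [
    ((0.1, 0.2), {"pet": "puppy", "age": "twenty"}),
    ((0.3, 0.4), {"pet": "kitten", "age": "twenty"}),
    ((0.5, 0.6), {"pet": "puppy", "age": "thirty"}),
]
kept = screen_subset(subset, {"pet": "puppy", "age": "twenty"})
print(kept)  # [((0.1, 0.2), {'pet': 'puppy', 'age': 'twenty'})]
```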

Step S500: matching the retrieval feature vector with the feature vectors in the screened data subsets.

Specifically, the data subsets screened by the central processing unit are stored in the screened data cache area, and the data subsets screened by the graphics processor are stored in the video memory data cache area. When the retrieval feature vector is matched with the feature vectors in the screened data subsets, the screened data cache area and the video memory data cache area are checked in turn. If the screened data cache area holds data subsets and the video memory data cache area has free space, the data subsets are read from the screened data cache area, transmitted to and stored in the video memory data cache area, and the retrieval feature vector is matched with the feature vectors of the screened data subsets in the video memory data cache area. In this embodiment, the matching process is a similarity search between the retrieval feature vector and the feature vectors in the data subset: the more similar a feature vector is to the retrieval feature vector, the higher its matching degree. The matching method may include, for example, calculating the Euclidean distance between the retrieval feature vector and each feature vector in the screened data subset, where a smaller Euclidean distance means a higher matching degree. When the preset number of output feature vectors is P, the P feature vectors with the smallest Euclidean distance to the retrieval feature vector are taken as the matching result.
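The Euclidean matching within one screened data subset can be sketched as a brute-force scan. This CPU-only sketch stands in for the parallel computation that the text assigns to the graphics processor; the name `top_p_matches` is hypothetical.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def top_p_matches(
    query: Vector, candidates: List[Vector], p: int
) -> List[Tuple[float, int]]:
    """Return (distance, index) pairs for the p candidates nearest to query."""
    scored = (
        (math.dist(query, vector), index) for index, vector in enumerate(candidates)
    )
    return heapq.nsmallest(p, scored)


candidates = [(3.0, 4.0), (1.0, 0.0), (0.0, 2.0), (6.0, 8.0)]
matches = top_p_matches((0.0, 0.0), candidates, p=2)
print(matches)  # [(1.0, 1), (2.0, 2)]
```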

Step S600: determining output data according to the matching result.

Specifically, the matching results obtained from the individual data subsets are processed to determine the output data. For example, when matching is based on the Euclidean distance between the retrieval feature vector and the feature vectors in the screened data subsets, the feature vectors in all matching results are sorted in ascending order of their Euclidean distance to the retrieval feature vector. When the predetermined output data contains Q feature vectors, the first Q feature vectors in the sorted order are taken as the output data.
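Combining the per-subset matching results into the final output can be sketched as follows, assuming each result is a pair of distance and item identifier. The name `merge_matches` and the string identifiers are hypothetical.

```python
import heapq
from typing import Iterable, List, Tuple

Match = Tuple[float, str]  # (distance to the retrieval vector, hypothetical item id)


def merge_matches(per_subset: Iterable[List[Match]], q: int) -> List[Match]:
    """Combine per-subset matching results and keep the q smallest distances."""
    all_matches = (match for matches in per_subset for match in matches)
    return heapq.nsmallest(q, all_matches)


results = [
    [(0.5, "a"), (1.2, "b")],  # matches found in the first data subset
    [(0.3, "c"), (2.0, "d")],  # matches found in the second data subset
]
output = merge_matches(results, q=3)
print(output)  # [(0.3, 'c'), (0.5, 'a'), (1.2, 'b')]
```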

The method of this embodiment searches data on a heterogeneous computer architecture formed by a central processing unit (CPU) and a graphics processing unit (GPU), screens the data according to attribute tags during the search, and exploits the parallel acceleration of the graphics processing unit while avoiding the performance bottleneck caused by processing large-scale data in the graphics processing unit. In addition, a pipeline is constructed that comprises a data screening step in the central processing unit and a data transmission step from the central processing unit to the graphics processor. Because the components of the pipeline perform data screening, transmission, and search computation in parallel, the extra overhead of screening and copying data in the central processing unit is effectively hidden.

FIG. 3 is a flowchart of an optional implementation, in an embodiment of the present invention, of determining a first data set according to the first label and the retrieval feature vector. As shown in FIG. 3, the method includes:

Step S210: determining a second data set.

Specifically, the second data set contains all searchable data subsets. For example, when the search is performed on a readable storage device A, the second data set contains the feature vectors converted from all readable data in the readable storage device A.

Step S220: determining the screening strength according to the first label and the second labels corresponding to the feature vectors contained in each data subset.

Specifically, the first label and each second label include at least one attribute tag. The screening strength may be estimated from the distribution, in the second data set, of the attribute tags in the second labels corresponding to the feature vectors. For example, the frequency with which each attribute tag of the second labels occurs in the second data set is determined, and the attribute tags with higher frequencies are identified. It is then determined whether each attribute tag in the first label is one of these higher-frequency attribute tags, and how frequently each attribute tag in the first label occurs in the second data set. From this, the percentage that the attribute tags contained in the first label account for among the attribute tags of the second data set is estimated, and the value obtained by subtracting that percentage from 1 is taken as the screening strength.
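One simple way to obtain such an estimate is sketched below. It is an assumption for illustration, not the estimator of the source: instead of working from per-tag frequencies, it directly counts the feature vectors whose second label contains every attribute of the first label, and it represents each label as a dictionary of attribute name to value. The name `screening_strength` is hypothetical.

```python
from typing import Dict, List

Label = Dict[str, str]  # hypothetical representation: attribute name -> value


def screening_strength(first_label: Label, second_labels: List[Label]) -> float:
    """Return 1 minus the fraction of vectors whose label matches first_label."""
    if not second_labels:
        return 1.0  # nothing can survive screening of an empty subset
    matching = sum(
        1
        for label in second_labels
        if all(label.get(key) == value for key, value in first_label.items())
    )
    return 1.0 - matching / len(second_labels)


labels = [
    {"age": "twenties", "occupation": "student"},
    {"age": "twenties", "occupation": "teacher"},
    {"age": "thirties", "occupation": "student"},
    {"age": "twenties", "occupation": "student"},
]
strength = screening_strength({"age": "twenties", "occupation": "student"}, labels)
print(strength)  # 2 of 4 labels match, so the strength is 0.5
```

Exact counting costs a full pass over the labels; an estimate based on precomputed tag frequencies, as the text suggests, trades some accuracy for a cheaper lookup.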

Step S230: determining the number N of data subsets contained in the first data set according to the number of data subsets contained in the second data set and the screening strength corresponding to each data subset.

Specifically, assuming that the screening strength calculated in step S220 is p and the number of data subsets contained in the second data set is a, a number N of data subsets positively correlated with the screening strength can be determined directly by the formula N = a × C / (1 - p), where C is a constant positively correlated with the size of the search space and the search time, and 0 < C < (1 - p). The value of C can be further adjusted according to other factors. For example, when the memory of the central processing unit and the video memory of the graphics processor have a large amount of free storage space, C can be increased appropriately; C can also be adjusted according to the desired search performance and accuracy, for example increased, within the limits of the performance requirements, when the accuracy requirement is high.

Furthermore, the value of N can be adjusted according to other factors. For example, the larger of the N value recommended or set by default by the search algorithm and the currently calculated N value is selected.
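The formula of step S230 and the adjustment by a default value can be sketched as follows. The name `subset_count` and the parameter `default_n` are hypothetical; rounding up to a whole number of data subsets and capping the result at a are assumptions, since the text does not say how a fractional N is handled.

```python
import math


def subset_count(a: int, p: float, c: float, default_n: int = 1) -> int:
    """Number N of data subsets to search, N = a * C / (1 - p), 0 < C < 1 - p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("screening strength p must satisfy 0 <= p < 1")
    if not 0.0 < c < 1.0 - p:
        raise ValueError("constant C must satisfy 0 < C < 1 - p")
    n = math.ceil(a * c / (1.0 - p))  # assumption: round up to a whole subset
    # Adjustment from the text: take the larger of the default and computed N.
    # Assumption: never request more subsets than the second data set holds.
    return min(a, max(n, default_n))


print(subset_count(a=1000, p=0.5, c=0.125))   # 250
print(subset_count(a=1000, p=0.75, c=0.125))  # 500
```

With C fixed, raising the screening strength from 0.5 to 0.75 doubles N, which reflects the positive correlation stated above: the more vectors screening removes, the more data subsets must be searched to obtain enough results.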

Step S240: calculating the distance between the retrieval feature vector and the center of each data subset contained in the second data set.

Specifically, the center of a data subset is used for characterizing that data subset, and may be, for example, the average of all feature vectors in the data subset. Since each data subset comprises a plurality of clustered feature vectors and the second labels corresponding to the feature vectors, the closer the retrieval feature vector is to the center of a data subset in the second data set, the higher the matching degree between the feature vectors in that data subset and the retrieval feature vector. In this embodiment, the distance between the retrieval feature vector and the center of each data subset contained in the second data set is the Euclidean distance.

Step S250: determining, according to the distance, N data subsets to form the first data set.

Specifically, the first data set is determined according to the number N of data subsets determined in step S230 and the distances, determined in step S240, between the retrieval feature vector and the centers of the data subsets contained in the second data set. That is, the N data subsets whose centers are closest to the retrieval feature vector are selected to form the first data set.
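Steps S240 and S250 can be sketched as a selection of the N nearest centers by Euclidean distance. The name `nearest_subsets` is hypothetical, and the centers are assumed to be available as plain tuples.

```python
import heapq
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_subsets(query: Vector, centers: List[Vector], n: int) -> List[int]:
    """Return the indices of the n subset centers nearest to query."""
    return heapq.nsmallest(
        n, range(len(centers)), key=lambda index: math.dist(query, centers[index])
    )


centers = [(0.0, 0.0), (10.0, 10.0), (1.0, 1.0), (5.0, 5.0)]
selected = nearest_subsets((0.9, 0.9), centers, n=2)
print(selected)  # [2, 0]
```

The returned indices identify the data subsets that make up the first data set, ordered from the nearest center outward.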

The method reduces the calculation amount in the searching process, improves the searching efficiency, and can enable the searching process to obtain enough searching results.

FIG. 4 is a diagram illustrating a process of determining a second data set according to an embodiment of the invention. The flow shown in fig. 4 may be used for determining the second data set in step S210 described above. As shown in fig. 4, the second data set is obtained by converting and clustering the data to be searched.

Specifically, each piece of data to be searched corresponds to a second tag. The second tag includes at least one attribute tag corresponding to the input object and is used for screening the data subsets during the search. The second tag is not converted while the data is converted and clustered. The conversion process converts all the data to be searched into a form suitable for data analysis, processing and calculation, for example into feature vectors. The conversion may be performed, for example, by a Convolutional Neural Network (CNN) or the like. In the clustering process, the data set consisting of all the feature vectors is subjected to cluster analysis by a clustering algorithm and divided into K data subsets. The clustering algorithm may be, for example, a K-means clustering algorithm, in which case each clustered data subset corresponds to a mean.
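The clustering step can be sketched in Python as follows, purely for illustration. The function name kmeans_with_tags, the dictionary layout of a data subset, the fixed iteration count, and the naive initialization from the first K vectors are all assumptions of this sketch; the description only states that a K-means algorithm may be used and that the second tags are left unconverted.

```python
import math


def kmeans_with_tags(vectors, tags, k, iterations=10):
    """Cluster feature vectors into k data subsets (plain Lloyd iterations).

    The tags are never transformed; each tag simply travels with its vector.
    """
    # Assumption: initialize the centers with the first k vectors.
    centers = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest center.
        for idx, v in enumerate(vectors):
            assignment[idx] = min(
                range(k), key=lambda j: math.dist(v, centers[j]))
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [vectors[i] for i in range(len(vectors))
                       if assignment[i] == j]
            if members:
                centers[j] = [sum(col) / len(members)
                              for col in zip(*members)]
    # Build the data subsets: center, feature vectors, untouched tags.
    subsets = [{"center": centers[j], "vectors": [], "tags": []}
               for j in range(k)]
    for idx, j in enumerate(assignment):
        subsets[j]["vectors"].append(vectors[idx])
        subsets[j]["tags"].append(tags[idx])
    return subsets
```

A production system would more likely rely on an existing clustering library and a better initialization; the point here is only that each data subset ends up with a mean that serves as its center, alongside the original tags.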

When the input object is a picture, the data to be searched is image data. Each piece of image data corresponds to a second tag, which includes at least one attribute tag corresponding to the input object; the attribute tag may be, for example, "identity: student", "place of birth: Beijing", and the like. The second tag is not processed while the data to be searched is converted and clustered. When a data subset in the first data set is screened, the contents of the second tag are compared with those of the first tag. For example, when the first tag includes the attribute information "pet: puppy" and "age: twenty years old", the feature vectors in the data subset whose second tag includes "pet: puppy" and "age: twenty years old" are retained. The retained feature vectors are then matched against the retrieval feature vector, for example by calculating the Euclidean distance between each retained feature vector and the retrieval feature vector; the shorter the distance, the higher the matching degree. Finally, a preset number of feature vectors is output.
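The screening and matching described above might look like the following Python sketch. The function name filter_and_rank, the representation of tags as dictionaries, and the rule that a second tag matches when it contains every attribute of the first tag are assumptions made for illustration.

```python
import math


def filter_and_rank(subset, first_tag, query, top_k):
    """Screen one data subset by tag, then rank the survivors by distance.

    subset:    dict with parallel lists "vectors" and "tags" (hypothetical layout)
    first_tag: attributes requested by the search, e.g. {"pet": "puppy"}
    query:     the retrieval feature vector
    top_k:     preset number of results to output
    """
    kept = [
        (math.dist(query, vector), index)
        for index, (vector, tag) in enumerate(
            zip(subset["vectors"], subset["tags"]))
        # Keep a vector only if its second tag contains every requested attribute.
        if all(tag.get(key) == value for key, value in first_tag.items())
    ]
    # A shorter Euclidean distance means a higher matching degree.
    kept.sort()
    return [index for _, index in kept[:top_k]]
```

The function returns positions within the data subset rather than the vectors themselves, so that the caller can look up the original pictures and their tags.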

The method is used for preprocessing data to be searched, and comprises two steps of converting the data to be searched and clustering the converted data to obtain a data subset suitable for data analysis, processing and calculation, so that data search is facilitated.

Fig. 5 is a schematic view of a pipeline formed by the data filtering step and the data transmission step according to an embodiment of the present invention. As shown in fig. 5, when the filtering mode is filtering in the central processing unit, the filtering step and the transmission step operate in a pipelined manner; that is, while the second processing unit transmits a filtered data subset to the graphics processor, the first processing unit filters the next data subset.

Specifically, taking the time T1 in the figure as an example, while the data subset 0 is transmitted to the graphics processor through the second processing unit, the data subset 1 is screened by the first processing unit. At the time T2, the data subset 0 enters the graphics processor for data search, the data subset 1 is transmitted through the second processing unit, and the data subset 2 is filtered by the first processing unit.

Further, when the data subset 3 is to be screened in the graphics processor, at time T3 the unscreened data subset 3 and the screened data subset 2 are transmitted to the graphics processor simultaneously through the second processing unit, while the data subset 1 enters the graphics processor for data search and the data subset 4 is screened by the first processing unit. At the next time T4, the data subset 2 is stored in the video memory data buffer while the data subset 3 is screened in the graphics processor. Once its screening is complete, the data subset 3 is also stored in the video memory data buffer, and the screened data subset 3 and the data subset 2 are then searched in the graphics processor at the same time. Meanwhile, the data subset 4 is transmitted through the second processing unit and the data subset 5 is screened by the first processing unit.

In another optional implementation manner of this embodiment, at time T4, the data subset 3 is filtered in the graphics processor, the data subset 2 is searched, the data subset 4 is transmitted through the second processing unit, and the data subset 5 is filtered by the first processing unit. At the next time, the filtered data subsets 3 and 4 are searched concurrently within the graphics processor.

In yet another alternative implementation of this embodiment, the step of filtering the subset of data and the step of searching in the graphics processor may be performed together.

In this process, the components of the pipeline perform the data screening, transmission, and search calculation operations in parallel, so that the additional overhead of screening and copying data in the central processing unit is effectively hidden.
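The overlap between the three stages can be illustrated with a small Python sketch built on threads and bounded queues. This is only a stand-in under stated assumptions: no graphics processor is involved, the callables screen, transfer and search are hypothetical placeholders for the first processing unit, the second processing unit and the search in the graphics processor, and the function name run_pipeline is invented for the example.

```python
import queue
import threading


def run_pipeline(subsets, screen, transfer, search):
    """Run screen -> transfer -> search as three overlapping stages."""
    to_transfer = queue.Queue(maxsize=1)  # screened subsets awaiting transfer
    to_search = queue.Queue(maxsize=1)    # transferred subsets awaiting search
    results = []
    done = object()                       # sentinel marking the end of the stream

    def screening_stage():
        # Stand-in for the first processing unit: screen subset i+1
        # while subset i is being transferred.
        for subset in subsets:
            to_transfer.put(screen(subset))
        to_transfer.put(done)

    def transfer_stage():
        # Stand-in for the second processing unit: copy to the device.
        while True:
            item = to_transfer.get()
            if item is done:
                to_search.put(done)
                break
            to_search.put(transfer(item))

    def search_stage():
        # Stand-in for the search performed in the graphics processor.
        while True:
            item = to_search.get()
            if item is done:
                break
            results.append(search(item))

    threads = [threading.Thread(target=stage)
               for stage in (screening_stage, transfer_stage, search_stage)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because each queue holds at most one item, a stage can run at most one data subset ahead of the next stage, which mirrors the time slots T1, T2, ... of fig. 5. In a real system the transfer stage would typically be an asynchronous host-to-device copy rather than a Python function call.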

Fig. 6 is a schematic diagram of an electronic device according to an embodiment of the present invention. In this embodiment, the electronic device may be a server, a terminal, or the like. As shown in fig. 6, the electronic device includes: a heterogeneous computer architecture consisting of at least one first processor 62, which may be, for example, a Central Processing Unit (CPU), and one second processor 63, which may be, for example, a Graphics Processing Unit (GPU); a memory 61 communicatively coupled to the at least one heterogeneous computer architecture; and a communication component 64 communicatively coupled to the storage medium, the communication component 64 receiving and transmitting data under control of the heterogeneous computer architecture. The memory 61 stores instructions executable by the at least one heterogeneous computer architecture, and the instructions are executed by the at least one heterogeneous computer architecture to implement the search method in the above embodiments.

In particular, the memory 61, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. By running the non-volatile software programs, instructions, and modules stored in the memory 61, the heterogeneous computer architecture executes the various functional applications and data processing of the device, that is, implements the above-described search method.

The memory 61 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store a list of options, etc. Further, the memory 61 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 61 may optionally include memory located remotely from the processor, which may be connected to an external device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

One or more modules are stored in memory 61 that, when executed by the heterogeneous computer architecture, perform the search method in any of the method embodiments described above.

The product can execute the method provided by the embodiments of the present application and has the functional modules and beneficial effects corresponding to that method. For technical details not described in this embodiment, reference may be made to the method provided by the embodiments of the present application.

The present invention also relates to a computer-readable storage medium for storing a computer-readable program for causing a computer to perform some or all of the above-described method embodiments.

That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method of searching, comprising:

and determining output data according to the matching result.

2. The method of claim 1, wherein said determining a first data set from a first tag and the retrieved feature vector comprises:

3. The method of claim 2, wherein said determining the screening pattern for each of the subsets of data comprises:

4. The method of claim 2, wherein said determining the screening pattern for each of the subsets of data comprises:

5. The method of claim 1, wherein the screening a subset of data in the first data set according to the first tag and the determined screening pattern comprises:

6. The method of claim 1, wherein the screening a subset of data in the first data set according to the first tag and the determined screening pattern comprises:

7. The method of claim 6, wherein the screening of a next subset of data is processed by the first processing unit while the second processing unit transmits the screened subset of data to the graphics processor.

8. The method of claim 6, wherein said transmitting, by a second processing unit of the central processor, the filtered subset of data to the graphics processor comprises:

9. An electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to implement the method of any of claims 1-8.

10. A computer readable storage medium storing computer program instructions, which when executed by a processor implement the method of any one of claims 1-8.