CN109992687B - Face data searching method and device, electronic equipment and readable storage medium - Google Patents

Face data searching method and device, electronic equipment and readable storage medium

Info

Publication number
CN109992687B
Authority
CN
China
Prior art keywords
sub
face data
data
maximum value
libraries
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910285920.5A
Other languages
Chinese (zh)
Other versions
CN109992687A (en
Inventor
梁喆
朱雨
李磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Megvii Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Megvii Technology Co Ltd
Priority to CN201910285920.5A
Publication of CN109992687A
Application granted
Publication of CN109992687B
Legal status: Active
Anticipated expiration

Abstract

The application provides a method and a device for searching face data, an electronic device, and a computer-readable storage medium, and relates to the field of data processing. The method comprises: when face data to be searched is received, randomly and evenly dividing a preset face database into M parts to obtain M first sub-libraries; for each first sub-library, screening out the N pieces of face data with the highest similarity to the face data to be searched as a second sub-library, thereby generating M second sub-libraries; and selecting, from the M second sub-libraries, the T pieces of face data with the highest similarity to the face data to be searched as the search result, wherein M, N, and T are positive integers. By grouping the data set into sub-libraries and then merging the sub-libraries for a final screening, the method can ensure the same accuracy as the traditional search method while reducing resource consumption and improving search efficiency.

Description

Face data searching method and device, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for searching face data, an electronic device, and a computer-readable storage medium.
Background
Modern society is developing rapidly, with advanced science and technology and a free flow of information. People communicate ever more closely, life grows ever more convenient, and big data is a product of this high-tech era.
Today, many industries use big data; for example, public security agencies rely on it for face recognition. China has a population of about 1.4 billion, so a public security agency may need to search a face database of more than 1 billion records for a single person. The existing method first extracts a 512-dimensional feature, one byte per dimension, from each face in the database, and then calculates the L2 distance between the feature of the face to be detected and each feature in the database. In theory, the smaller the L2 distance, the more similar the face to be detected is to the face in the database. However, to guard against inaccurate feature extraction and misidentification, a public security system usually takes one face to be detected, screens the 1024 closest faces out of the face database, and has public security personnel screen those manually.
The existing method specifically comprises: calculating, one by one on a GPU, the L2 distance between every feature in the database and the feature to be compared. The first 1024 L2 distances are cached. After each subsequent L2 distance is calculated, it is compared with the maximum of the 1024 cached L2 distances; if the newly calculated L2 distance is smaller than that maximum, it is inserted into the cache and the maximum L2 distance is deleted from it. In this way the cache always holds the 1024 smallest L2 distances calculated so far.
However, this method has disadvantages: a list of length 1024 must be maintained at all times, and in the worst case the maximum of the cached top-1024 list must be searched for again after every new L2 result. The process is complicated and very time-consuming (the time complexity of each update is at least O(log n)), resources are insufficient, the screening result cannot be returned in less than 1 s, and efficiency is low.
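The baseline described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the GPU implementation the text refers to; the function name `baseline_top_k`, the unordered cache, and the small sizes used in the demonstration are assumptions made for readability.

```python
import random


def baseline_top_k(distances, k):
    """Stream over distances and keep the k smallest in one unordered cache."""
    cache = []
    for d in distances:
        if len(cache) < k:
            cache.append(d)              # the first k distances fill the cache
            continue
        worst = max(cache)               # re-scan the cache for its maximum
        if d < worst:
            cache[cache.index(worst)] = d  # the new distance displaces the maximum
    return sorted(cache)


# Illustrative data only: 5,000 random distances, k = 16 instead of 1024.
random.seed(0)
sample = [random.randint(0, 10_000) for _ in range(5_000)]
result = baseline_top_k(sample, 16)
```

Every incoming distance touches the single shared cache, which is the cost the background section objects to.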
Disclosure of Invention
The application provides a method and a device for searching face data, electronic equipment and a computer readable storage medium, which can solve the problems of complex process, time consumption and low efficiency in the existing face recognition process. The technical scheme is as follows:
in a first aspect, a method for searching face data is provided, and the method includes:
when face data to be searched is received, randomly and evenly dividing a preset face database into M parts to obtain M first sub-libraries;
for each first sub-library, screening out the N pieces of face data with the highest similarity to the face data to be searched as a second sub-library, thereby generating M second sub-libraries;
selecting, from the M second sub-libraries, the T pieces of face data with the highest similarity to the face data to be searched as a search result; wherein M, N, and T are positive integers.
Preferably, the step of screening, for each first sub-library, N pieces of face data with the highest similarity to the face data to be searched as a second sub-library to generate M pieces of second sub-libraries includes:
randomly selecting N pieces of face data from each first sub-library, calculating the N L2 distances between the N pieces of face data and the face data to be searched, and generating a first data list;
screening out a first maximum value of the first data list;
and storing the first maximum value into a preset maximum value variable, and storing an address corresponding to the first maximum value into a preset index variable.
Preferably, the step of screening, for each first sub-library, N pieces of face data with the highest similarity to the face data to be searched out as a second sub-library to generate M pieces of second sub-libraries further includes:
step A, for each first sub-library, selecting any piece of face data other than the N pieces of face data, calculating the L2 distance between that face data and the face data to be searched, and comparing the calculated L2 distance with the maximum value of the first data list;
step B, if the calculated L2 distance is smaller than the maximum value, replacing the maximum value with the calculated L2 distance to generate a second data list;
step C, screening out a second maximum value of the second data list;
step D, replacing the first maximum value in the maximum value variable with the second maximum value, and replacing the address corresponding to the first maximum value in the preset index variable with the address corresponding to the second maximum value;
step E, taking the second data list as the first data list;
and repeating step A to step E until all the face data in the M first sub-libraries have been screened.
Preferably, the step of screening, for each first sub-library, N pieces of face data with the highest similarity to the face data to be searched as a second sub-library to generate M pieces of second sub-libraries specifically includes:
dividing all the first sub-libraries evenly into K parts based on a preset field programmable gate array (FPGA), wherein each part corresponds to one set of preset comparators;
feeding each first sub-library in each part into the corresponding set of comparators in turn by means of a time-division multiplexing mechanism, and screening out the N pieces of face data with the highest similarity to the face data to be searched as a second sub-library;
merging the second sub-libraries of the K parts to obtain the M second sub-libraries; wherein the FPGA is provided with M RAMs, each RAM holds a data list with a capacity of N, and M is an integer multiple of K.
Preferably, the step of selecting, from the M second sub-libraries, T pieces of face data with the highest similarity to the face data to be searched as a search result includes:
storing the M second sub-libraries into memory, and sorting the M × N pieces of face data by similarity to obtain a sorted array;
and selecting the T pieces of face data with the highest similarity from the sorted array as the search result.
In a second aspect, an apparatus for searching face data is provided, the apparatus comprising:
the dividing module is used for randomly and evenly dividing a preset face database into M parts to obtain M first sub-libraries when face data to be searched is received;
the first screening module is used for screening out, for each first sub-library, the N pieces of face data with the highest similarity to the face data to be searched as a second sub-library, thereby generating M second sub-libraries;
the second screening module is used for selecting, from the M second sub-libraries, the T pieces of face data with the highest similarity to the face data to be searched as a search result; wherein M and T are positive integers.
Preferably, the first screening module comprises:
the data list generation submodule is used for randomly selecting N pieces of face data from each first sub-library, calculating the N L2 distances between the N pieces of face data and the face data to be searched, and generating a first data list; wherein N is a positive integer;
the maximum value screening submodule is used for screening out a first maximum value of the first data list;
and the storage submodule is used for storing the first maximum value into a preset maximum value variable and storing an address corresponding to the first maximum value into a preset index variable.
Preferably, the first screening module further comprises:
the first comparison submodule is used for selecting, for each first sub-library, any piece of face data other than the N pieces of face data, calculating the L2 distance between that face data and the face data to be searched, and comparing the calculated L2 distance with the maximum value of the first data list;
the replacing submodule is used for replacing the maximum value with the calculated L2 distance if the calculated L2 distance is smaller than the maximum value, to generate a second data list;
the second comparison submodule is used for screening out a second maximum value of the second data list;
the replacing submodule is further configured to replace the first maximum value in the maximum value variable with the second maximum value, and replace the address corresponding to the first maximum value in the preset index variable with the address corresponding to the second maximum value;
the updating submodule is used for taking the second data list as the first data list;
and the first comparison submodule, the replacing submodule, the second comparison submodule and the updating submodule are invoked repeatedly until all the face data in the M first sub-libraries have been screened.
Preferably, the first screening module further comprises:
the calling submodule is used for calling a preset field programmable gate array (FPGA) and dividing the first sub-libraries evenly into K parts, wherein each part corresponds to one set of preset comparators;
the input submodule is used for feeding each first sub-library in each part into the corresponding set of comparators in turn by means of a time-division multiplexing mechanism, and screening out the N pieces of face data with the highest similarity to the face data to be searched as a second sub-library;
the merging submodule is used for merging the second sub-libraries of the K parts to obtain the M second sub-libraries; wherein the FPGA is provided with M RAMs, each RAM holds a data list with a capacity of N, and M is an integer multiple of K.
Preferably, the second screening module comprises:
the sorting submodule is used for storing the M second sub-libraries into memory and sorting the M × N pieces of face data by similarity to obtain a sorted array;
and the selecting submodule is used for selecting the T pieces of face data with the highest similarity from the sorted array as a search result.
In a third aspect, an electronic device is provided, which includes:
a processor, a memory, and a bus;
the bus is used for connecting the processor and the memory;
the memory is used for storing operation instructions;
the processor is configured to call the operation instructions, which cause the processor to perform the operations corresponding to the method for searching face data according to the first aspect of the present application.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the method for searching human face data shown in the first aspect of the present application.
The technical solution provided by the present application brings the following beneficial effects:
when face data to be searched is received, a preset face database is randomly and evenly divided into M parts to obtain M first sub-libraries; for each first sub-library, the N pieces of face data with the highest similarity to the face data to be searched are screened out as a second sub-library, generating M second sub-libraries; and the T pieces of face data with the highest similarity to the face data to be searched are then selected from the M second sub-libraries as the search result, wherein M, N, and T are positive integers. By grouping the data set into sub-libraries and then merging the sub-libraries for a final screening, the method can ensure the same accuracy as the traditional search method while reducing resource consumption and improving search efficiency.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic flowchart of a method for searching face data according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a face data search apparatus according to yet another embodiment of the present application;
fig. 3 is a schematic structural diagram of an electronic device for searching face data according to yet another embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The application provides a method and a device for searching face data, electronic equipment and a computer-readable storage medium, and aims to solve the above technical problems in the prior art.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
In one embodiment, a method for searching face data is provided, as shown in fig. 1, the method includes:
Step S101, when face data to be searched is received, randomly and evenly dividing a preset face database into M parts to obtain M first sub-libraries;
The face database may include more than 1 billion pieces of face data. In the embodiment of the present invention, a field programmable gate array (FPGA) is preset and is provided with M random access memories (RAMs). The number of parts into which the face database is divided equals the number of RAMs in the FPGA; that is, the face database is randomly and evenly divided into as many parts as there are RAMs in the FPGA.
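A random, even split of this kind might look like the sketch below. The shuffle-then-stride approach, the function name `split_evenly`, and the `seed` parameter are assumptions for illustration; the text does not say how the random division is carried out.

```python
import random


def split_evenly(items, m, seed=None):
    """Shuffle the items, then deal them into m parts whose sizes differ by at most one."""
    rng = random.Random(seed)      # seed is only here to make the sketch repeatable
    shuffled = list(items)
    rng.shuffle(shuffled)
    return [shuffled[i::m] for i in range(m)]


# Illustrative sizes: 100 record ids dealt into M = 8 first sub-libraries.
parts = split_evenly(range(100), 8, seed=1)
sizes = [len(p) for p in parts]
```

Dealing with a stride of `m` keeps the parts balanced even when the record count is not a multiple of M.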
Step S102, for each first sub-library, screening out the N pieces of face data with the highest similarity to the face data to be searched as a second sub-library, thereby generating M second sub-libraries;
In practical applications, when a public security agency matches face data to be searched against a face database, the face data in the database is not screened directly. Instead, a 512-dimensional face feature with one byte per dimension is extracted from each piece of face data, and likewise from the face data to be searched. The L2 distance between the face feature to be searched and each face feature in the face database is then calculated, and all the L2 distances are screened (the smaller the L2 distance, the more similar the face to be searched is to the face in the database) to obtain a certain number of the smallest L2 distances. Finally, the face data corresponding to those smallest distances is screened out as the target data.
Therefore, in the embodiment of the present invention, for each first sub-library, the feature of each piece of face data is extracted, the L2 distance between the face feature to be searched and each face feature is calculated, and all the L2 distances are screened to obtain the N L2 distances that meet the preset rule. The face data corresponding to those L2 distances is screened out as the second sub-library, and so on until all M first sub-libraries have been screened and M second sub-libraries have been generated.
For example, if a first sub-library includes 100 million pieces of face data, then after the L2 distance between each piece of face data and the face data to be searched is calculated, the 100 million L2 distances are arranged from small to large, and the face data corresponding to the first N (for example, 128) L2 distances is selected to generate the second sub-library.
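The L2 distance used throughout this screening is the Euclidean distance between two feature vectors. A minimal sketch, assuming each feature is a sequence of 512 integers in the range 0 to 255 (the function name `l2_distance` and the sample vectors are illustrative):

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two 512-dimensional features that differ only in the last two dimensions.
query = [0] * 512
candidate = [0] * 510 + [3, 4]
distance = l2_distance(query, candidate)  # sqrt(3**2 + 4**2)
```

Because the square root is monotonic, ranking by the squared distance gives the same ordering, which may be convenient when only the N nearest entries are needed.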
In a preferred embodiment of the present invention, the step of screening, for each first sub-library, N pieces of face data with the highest similarity to the face data to be searched out as a second sub-library, and generating M pieces of second sub-libraries includes:
randomly selecting N pieces of face data from each first sub-library, calculating the N L2 distances between the N pieces of face data and the face data to be searched, and generating a first data list;
screening out a first maximum value of the first data list;
and storing the first maximum value into a preset maximum value variable, and storing an address corresponding to the first maximum value into a preset index variable.
Specifically, the preset FPGA is provided with M RAMs, each RAM is provided with a data list with a capacity of N, that is, each RAM can store N L2 distance values, and is further provided with a maximum variable and an index variable, the maximum variable records a maximum value of the L2 distance in the RAM, and the index variable records an address corresponding to the maximum value of the L2 distance in the RAM; wherein the data list is an unordered list.
After M first sub-libraries are obtained, N pieces of face data are randomly selected for each first sub-library, L2 distance values (namely N L2 distances) between the N pieces of face data and the face data to be searched are respectively calculated, all the L2 distance values are stored in an RAM to generate a first data list, then the maximum value of the L2 distance values is selected and stored in a preset maximum value variable, and the address of the maximum value is stored in a preset index variable.
In a preferred embodiment of the present invention, the step of screening, for each first sub-library, N pieces of face data with the highest similarity to the face data to be searched out as a second sub-library to generate M pieces of second sub-libraries further includes:
step A, for each first sub-library, selecting any piece of face data other than the N pieces of face data, calculating the L2 distance between that face data and the face data to be searched, and comparing the calculated L2 distance with the maximum value of the first data list;
step B, if the calculated L2 distance is smaller than the maximum value, replacing the maximum value with the calculated L2 distance to generate a second data list;
step C, screening out a second maximum value of the second data list;
step D, replacing the first maximum value in the maximum value variable with the second maximum value, and replacing the address corresponding to the first maximum value in the preset index variable with the address corresponding to the second maximum value;
step E, taking the second data list as the first data list;
and repeating step A to step E until all the face data in the M first sub-libraries have been screened.
Specifically, after a first data list is generated for each first sub-library, any data (except for N data corresponding to the first data list) is continuously selected from the first sub-library, the L2 distance value between the data and the data to be searched is calculated, if the L2 distance value is smaller than the maximum value in the first data list, the L2 distance value is substituted for the maximum value in the first data list, a second data list is generated, then the maximum value in the second data list is re-screened, after the maximum value is screened, a new maximum value is stored in a preset maximum value variable, and meanwhile, the address of the new maximum value is stored in a preset index variable.
In a preferred embodiment of the present invention, the step of screening, for each first sub-library, N pieces of face data with the highest similarity to the face data to be searched as a second sub-library, and generating M pieces of second sub-libraries specifically includes:
dividing all the first sub-libraries evenly into K parts based on a preset field programmable gate array (FPGA), wherein each part corresponds to one set of preset comparators;
feeding each first sub-library in each part into the corresponding set of comparators in turn by means of a time-division multiplexing mechanism, and screening out the N pieces of face data with the highest similarity to the face data to be searched as a second sub-library;
merging the second sub-libraries of the K parts to obtain the M second sub-libraries; wherein the FPGA is provided with M RAMs, each RAM holds a data list with a capacity of N, and M is an integer multiple of K.
For ease of understanding, a simple example follows. Suppose the first data list is {1, 4, 7}, the maximum value variable stores "7", and the index variable stores "2" ("1" is at address "0", "4" at address "1", and "7" at address "2"). When new data "3" arrives, it is smaller than the maximum "7", so "7" is replaced by "3" and the second data list {1, 4, 3} is generated. The maximum is now "4", so "4" is stored in the maximum value variable and its address "1" is stored in the index variable.
These steps are repeated until all the data in the first sub-library has been screened, yielding a second sub-library of N entries; once all M first sub-libraries have been screened, M second sub-libraries are obtained.
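The worked example above can be traced with a short sketch of steps A to E. The class name `TopNList`, the helper `find_max`, and the method name `offer` are hypothetical; the linear scan stands in for the maximum search of step C, which the embodiment performs in hardware.

```python
def find_max(values):
    """Return (maximum, address of the maximum) by a linear scan."""
    best = 0
    for i in range(1, len(values)):
        if values[i] > values[best]:
            best = i
    return values[best], best


class TopNList:
    """Unordered list of N distances plus a maximum value variable and an index variable."""

    def __init__(self, initial):
        self.values = list(initial)
        self.max_value, self.max_index = find_max(self.values)

    def offer(self, distance):
        # Step A: compare the new distance with the recorded maximum.
        if distance >= self.max_value:
            return False
        # Step B: overwrite the maximum in place.
        self.values[self.max_index] = distance
        # Steps C and D: find the new maximum and record its value and address.
        self.max_value, self.max_index = find_max(self.values)
        return True


top = TopNList([1, 4, 7])   # maximum 7 at address 2
accepted = top.offer(3)     # 3 < 7, so 7 is replaced
rejected = top.offer(9)     # 9 is not smaller than the new maximum
```

After the two calls the list is `[1, 4, 3]`, the maximum value variable holds 4, and the index variable holds 1, matching the example.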
In the embodiment of the present invention, a comparator may be used to perform the data screening in step C. Specifically, one set of comparators is provided, comprising 128+64+32+16+8+4+2 comparators, and the M first sub-libraries are numbered 1, 2, 3, …, M. Comparing two or more data items to determine whether they are equal, or to determine their magnitude relationship and order, is called comparison, and a circuit or apparatus capable of performing this function is called a comparator. A comparator is a circuit that compares an analog voltage signal with a reference voltage: its two inputs are analog signals and its output is a binary signal, 0 or 1; while the difference between the input voltages increases or decreases without changing sign, the output stays constant.
In the 1st clock cycle, when a new L2 distance arrives, it is compared with the maximum value of the 128 entries in the data list of first sub-library No. 1. In the 2nd clock cycle, 64 comparators compare the 128 numbers pairwise in parallel to obtain the 64 larger numbers. In the 3rd clock cycle, 32 comparators compare those 64 numbers pairwise in parallel to obtain the 32 larger numbers. By analogy, the new maximum value of first sub-library No. 1 is obtained at the 8th clock cycle. If the new L2 distance is smaller than the current maximum, the new L2 distance is written to the address recorded in the index variable of first sub-library No. 1.
Meanwhile, in the 2nd clock cycle, a new L2 distance is input for first sub-library No. 2 and compared with the maximum value of the 128 entries in its data list. In the 3rd clock cycle, 64 comparators compare the 128 numbers pairwise in parallel to obtain the 64 larger numbers. In the 4th clock cycle, 32 comparators compare those 64 numbers pairwise in parallel to obtain the 32 larger numbers. By analogy, the new maximum value of first sub-library No. 2 is obtained at the 9th clock cycle. If the new L2 distance is smaller than the current maximum, the new L2 distance is written to the address recorded in the index variable of first sub-library No. 2.
Therefore, by establishing a pipeline, multiple first sub-libraries can time-division multiplex one set of comparators, and each set of pipeline-multiplexed comparators can save 7 times the comparator resources, which improves data processing efficiency. Furthermore, by establishing K pipelines, the M first sub-libraries can time-division multiplex several sets of comparators. For example, 16 first sub-libraries may share two sets of comparators (K = 2), each set processing 8 first sub-libraries, or four sets of comparators (K = 4), each set processing 4 first sub-libraries, which further improves data processing efficiency.
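The pairwise reduction and the pipeline timing can be modelled in software as below. This is only a behavioural sketch, not a description of the FPGA circuit: it counts reduction levels rather than comparators, and it assumes that first sub-library No. i enters the pipeline at clock cycle i and that each level takes one cycle. The names `tree_max`, `finish_cycle`, and `assign_to_pipelines` are hypothetical.

```python
import random


def tree_max(values):
    """Reduce a list to its maximum by pairwise comparison; return (maximum, levels)."""
    level = list(values)
    levels = 0
    while len(level) > 1:
        nxt = [max(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # an unpaired value passes through unchanged
        level = nxt
        levels += 1
    return level[0], levels


def finish_cycle(sub_library_no, levels):
    """Clock cycle in which a sub-library entering at cycle sub_library_no has its new maximum."""
    return sub_library_no + levels


def assign_to_pipelines(m, k):
    """Assign sub-libraries 1..m to k comparator sets, m // k per set."""
    if m % k:
        raise ValueError("M must be an integer multiple of K")
    per_set = m // k
    return [list(range(s * per_set + 1, (s + 1) * per_set + 1)) for s in range(k)]


random.seed(0)
entries = random.sample(range(1000), 128)   # 128 illustrative distances
maximum, levels = tree_max(entries)
groups = assign_to_pipelines(16, 2)
```

With 128 entries the reduction takes 7 levels, so under these assumptions sub-library No. 1 finishes at cycle 8 and No. 2 at cycle 9, in line with the timing described above; `groups` places 8 sub-libraries on each of the two comparator sets.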
Step S103, selecting, from the M second sub-libraries, the T pieces of face data with the highest similarity to the face data to be searched as a search result; wherein M, N, and T are positive integers.
In a preferred embodiment of the present invention, the step of selecting, from M second sub-libraries, T pieces of face data with the highest similarity to the face data to be searched as a search result includes:
storing the M second sub-libraries into memory, and sorting the M × N pieces of face data by similarity to obtain a sorted array;
and selecting the T pieces of face data with the highest similarity from the sorted array as the search result.
Specifically, the M second sub-libraries, each with a capacity of N, are merged to obtain M × N pieces of face data. Since the data volume is now very small, the target data may be screened by an application program rather than by comparators: the L2 distance values corresponding to the M × N pieces of face data are arranged from small to large, the first T L2 distance values are selected, and the face data corresponding to those T L2 distance values is taken as the target data. The target data is the data with the highest similarity to the data to be searched; that is, these T faces are the most similar to the face to be searched.
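The final merge can be sketched as a plain sort over the M × N surviving entries. The representation of each entry as a `(distance, face_id)` pair, the function name `merge_top_t`, and the sample values are assumptions for illustration.

```python
def merge_top_t(second_sub_libraries, t):
    """Merge M second sub-libraries and return the t entries with the smallest L2 distance."""
    merged = [entry for sub in second_sub_libraries for entry in sub]
    merged.sort(key=lambda entry: entry[0])   # ascending distance = descending similarity
    return merged[:t]


# Illustrative input: M = 3 second sub-libraries of N = 2 entries each, T = 3.
second_sub_libraries = [
    [(5, "a"), (9, "b")],
    [(2, "c"), (7, "d")],
    [(4, "e"), (11, "f")],
]
search_result = merge_top_t(second_sub_libraries, 3)
```

Here `search_result` is `[(2, "c"), (4, "e"), (5, "a")]`. A full sort is adequate because only M × N entries remain at this stage.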
In the embodiment of the invention, when face data to be searched is received, a preset face database is randomly and evenly divided into M parts to obtain M first sub-libraries; for each first sub-library, the N pieces of face data with the highest similarity to the face data to be searched are screened out as a second sub-library, generating M second sub-libraries; and the T pieces of face data with the highest similarity to the face data to be searched are then selected from the M second sub-libraries as the search result, wherein M, N, and T are positive integers. By grouping the data set into sub-libraries and then merging the sub-libraries for a final screening, the method can ensure the same accuracy as the traditional search method while reducing resource consumption and improving search efficiency.
Fig. 2 is a schematic structural diagram of a face data search apparatus according to another embodiment of the present application, and as shown in fig. 2, the apparatus of this embodiment may include:
the dividing module 201 is configured to, when face data to be searched is received, randomly and evenly divide a preset face database into M parts to obtain M first sub-libraries;
the first screening module 202 is configured to screen out, for each first sub-library, the N pieces of face data with the highest similarity to the face data to be searched as a second sub-library, thereby generating M second sub-libraries;
the second screening module 203 is configured to select, from the M second sub-libraries, the T pieces of face data with the highest similarity to the face data to be searched as a search result; wherein M and T are positive integers.
In a preferred embodiment of the present invention, the first screening module includes:
the data list generation submodule is used for randomly selecting N pieces of face data from each first sub-library, calculating the N L2 distances between the N pieces of face data and the face data to be searched, and generating a first data list; wherein N is a positive integer;
the maximum value screening submodule is used for screening out a first maximum value of the first data list;
and the storage submodule is used for storing the first maximum value into a preset maximum value variable and storing an address corresponding to the first maximum value into a preset index variable.
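The initialization performed by these three submodules can be sketched as below. This is a sketch under stated assumptions: the "address" of the maximum is interpreted here as its position in the data list, and the names `init_data_list`, `max_value`, and `max_index` are hypothetical labels for the preset maximum value variable and index variable.

```python
import math
import random
from typing import List, Sequence, Tuple


def l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def init_data_list(sub_library: List[List[float]], query: Sequence[float],
                   n: int, seed: int = 0) -> Tuple[List[Tuple[float, int]], float, int]:
    """Build the first data list and record its maximum and that maximum's position."""
    # Randomly pick N faces from the first sub-library.
    chosen = random.Random(seed).sample(range(len(sub_library)), n)
    # First data list: one (l2_distance, face_index) entry per chosen face.
    data_list = [(l2(sub_library[i], query), i) for i in chosen]
    # Maximum value variable and index variable (position inside data_list).
    max_index = max(range(n), key=lambda k: data_list[k][0])
    max_value = data_list[max_index][0]
    return data_list, max_value, max_index


sub_library = [[0.0], [1.0], [2.0], [3.0], [4.0]]
data_list, max_value, max_index = init_data_list(sub_library, query=[0.0], n=3)
```

Keeping the maximum and its position in dedicated variables means each later candidate needs only a single comparison to decide whether it belongs in the list.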
In a preferred embodiment of the present invention, the first screening module further includes:
the first comparison submodule is used for selecting, for each first sub-library, any piece of face data other than the N pieces of face data, calculating the L2 distance between that face data and the face data to be searched, and comparing the calculated L2 distance with the maximum value of the first data list;
the replacing submodule is used for replacing the maximum value with the calculated L2 distance if the calculated L2 distance is smaller than the maximum value, thereby generating a second data list;
the second comparison submodule is used for screening out a second maximum value of the second data list;
the replacing submodule is further configured to replace a first maximum value in the maximum value variable with the second maximum value, and replace an address corresponding to the first maximum value in a preset index variable with an address corresponding to the second maximum value;
the updating submodule is used for taking the second data list as a first data list;
and the first comparison sub-module, the replacement sub-module, the second comparison sub-module and the updating sub-module are repeatedly executed until all the face data in the M first sub-libraries are screened.
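The repeated compare-and-replace loop carried out by these submodules can be sketched in Python as follows. Assumptions of this sketch: distances are precomputed and passed in as a list, the first N entries stand in for the random initial selection, and the list is updated in place rather than copied into a separate "second data list". The step labels in the comments follow the lettering used in claim 2.

```python
from typing import List, Tuple


def screen_sub_library(distances: List[float], n: int) -> List[Tuple[float, int]]:
    """Keep the N smallest L2 distances of one first sub-library."""
    # Initial data list; the first N entries stand in for a random selection.
    data_list = [(distances[i], i) for i in range(n)]
    max_index = max(range(n), key=lambda k: data_list[k][0])
    max_value = data_list[max_index][0]

    for i in range(n, len(distances)):
        d = distances[i]                      # step A: distance of the next face
        if d < max_value:                     # step B: replace the current maximum
            data_list[max_index] = (d, i)
            # step C: find the maximum of the updated list
            max_index = max(range(n), key=lambda k: data_list[k][0])
            # step D: update the maximum value variable (index updated above)
            max_value = data_list[max_index][0]
        # step E: the updated list serves as the data list for the next pass
    return data_list


distances = [0.9, 0.4, 0.7, 0.1, 0.8, 0.3, 0.6]
kept = screen_sub_library(distances, n=3)
```

Each face costs one comparison against the stored maximum, and the N-entry rescan happens only when a replacement occurs, so the list never has to be fully sorted during screening.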
In a preferred embodiment of the present invention, the first screening module further includes:
the calling submodule is used for calling a preset field programmable gate array (FPGA) and evenly dividing the first sub-libraries into K parts; wherein each part corresponds to a set of preset comparators;
the input submodule is used for sequentially inputting each first sub-library in each part into the corresponding comparators by adopting a time-sharing multiplexing mechanism, and screening out the first N pieces of face data with the highest similarity to the face data to be searched as a second sub-library;
the merging submodule is used for merging the second sub-libraries in the K parts to obtain M parts of second sub-libraries; the FPGA is provided with M RAMs, each RAM is provided with a data list with the capacity of N, and M is an integral multiple of K.
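The time-sharing arrangement described above can be modelled in software roughly as follows. This is a simplified model, not FPGA code: the contiguous assignment of sub-libraries to comparator groups, the function names, and the use of `heapq.nsmallest` in place of the hardware comparators are all assumptions of this sketch. On real hardware the K groups would operate concurrently; here they are visited one after another within each time slot.

```python
import heapq
from typing import List, Tuple

Candidate = Tuple[float, int]  # hypothetical layout: (l2_distance, face_id)


def assign_to_comparator_groups(m: int, k: int) -> List[List[int]]:
    """Split M sub-library indices into K equal parts, one per comparator group."""
    if k <= 0 or m % k != 0:
        raise ValueError("M must be a positive integer multiple of K")
    per_group = m // k
    return [list(range(g * per_group, (g + 1) * per_group)) for g in range(k)]


def screen_time_shared(first_subs: List[List[Candidate]], n: int, k: int) -> List[List[Candidate]]:
    m = len(first_subs)
    groups = assign_to_comparator_groups(m, k)
    # One N-entry data list per sub-library, mirroring the M RAMs.
    ram: List[List[Candidate]] = [[] for _ in range(m)]
    for slot in range(m // k):        # time slots shared by each comparator group
        for group in groups:          # concurrent in hardware, sequential here
            sub_id = group[slot]
            ram[sub_id] = heapq.nsmallest(n, first_subs[sub_id])
    return ram                        # merged result: M second sub-libraries


first_subs = [
    [(0.9, 0), (0.2, 1), (0.5, 2)],
    [(0.1, 3), (0.8, 4), (0.3, 5)],
    [(0.7, 6), (0.6, 7), (0.4, 8)],
    [(0.05, 9), (0.95, 10), (0.45, 11)],
]
second_subs = screen_time_shared(first_subs, n=2, k=2)
```

Requiring M to be an integer multiple of K lets every comparator group process the same number of sub-libraries, so no group sits idle in the last time slot.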
In a preferred embodiment of the present invention, the second screening module includes:
the sorting submodule is used for storing the M second sub-libraries into an internal memory and sorting the M × N pieces of face data based on the similarity to obtain a sorted sequence;
and the selecting submodule is used for selecting the T pieces of face data with the highest similarity from the sorted sequence as a search result.
The face data search apparatus of this embodiment can execute the face data search method shown in the first embodiment of this application, and the implementation principles thereof are similar, and are not described herein again.
In another embodiment of the present application, there is provided an electronic device including: a memory and a processor; at least one program is stored in the memory for execution by the processor, and when executed by the processor, the program implements the following: when face data to be searched is received, a preset face database is randomly and evenly divided into M parts to obtain M first sub-libraries; then, for each first sub-library, the N pieces of face data with the highest similarity to the face data to be searched are screened out as a second sub-library, generating M second sub-libraries; and then the T pieces of face data with the highest similarity to the face data to be searched are selected from the M second sub-libraries as the search result, wherein M, N, and T are positive integers. By grouping the data set into sub-libraries and then merging the screened sub-libraries for a final screening, the same accuracy as the traditional search method can be ensured while resource consumption is reduced and search efficiency is improved.
In an alternative embodiment, an electronic device is provided. As shown in fig. 3, the electronic device 3000 comprises a processor 3001 and a memory 3003. The processor 3001 is coupled to the memory 3003, for example via a bus 3002. Optionally, the electronic device 3000 may further comprise a transceiver 3004. It should be noted that, in practical applications, the number of transceivers 3004 is not limited to one, and the structure of the electronic device 3000 does not constitute a limitation on the embodiments of the present application.
The processor 3001 may be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor 3001 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 3002 may include a path that conveys information between the aforementioned components. The bus 3002 may be a PCI bus or an EISA bus, etc. The bus 3002 may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 3, but this does not mean only one bus or one type of bus.
Memory 3003 may be, but is not limited to, a ROM or another type of static storage device that can store static information and instructions, a RAM or another type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact disc, laser disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The memory 3003 is used for storing the application program code for performing the present scheme, and execution of the code is controlled by the processor 3001. The processor 3001 is configured to execute the application program code stored in the memory 3003 to implement any of the method embodiments shown above.
The electronic devices include, but are not limited to: mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and in-vehicle terminals (e.g., in-vehicle navigation terminals), and fixed terminals such as digital TVs and desktop computers.
Yet another embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when run on a computer, the program enables the computer to perform the corresponding content of the aforementioned method embodiments. Compared with the prior art, when face data to be searched is received, a preset face database is randomly and evenly divided into M parts to obtain M first sub-libraries; then, for each first sub-library, the N pieces of face data with the highest similarity to the face data to be searched are screened out as a second sub-library, generating M second sub-libraries; and then the T pieces of face data with the highest similarity to the face data to be searched are selected from the M second sub-libraries as the search result, wherein M, N, and T are positive integers. By grouping the data set into sub-libraries and then merging the screened sub-libraries for a final screening, the same accuracy as the traditional search method can be ensured while resource consumption is reduced and search efficiency is improved.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (8)

1. A method for searching face data is characterized by comprising the following steps:
when receiving face data to be searched, randomly and evenly dividing a preset face database into M parts to obtain M first sub-libraries;
for each first sub-library, screening out the N pieces of face data with the highest similarity to the face data to be searched as a second sub-library, and generating M second sub-libraries;
selecting the T pieces of face data with the highest similarity to the face data to be searched from the M second sub-libraries as a search result; wherein M, N, and T are positive integers;
the step of screening out N pieces of face data with the highest similarity to the face data to be searched as second sub-libraries and generating M pieces of second sub-libraries for each first sub-library comprises the following steps:
randomly selecting N pieces of face data from each first sub-library, calculating the N L2 distances between the N pieces of face data and the face data to be searched, and generating a first data list;
screening out a first maximum value of the first data list;
and storing the first maximum value into a preset maximum value variable, and storing an address corresponding to the first maximum value into a preset index variable.
2. The method for searching face data according to claim 1, wherein the step of screening out, for each first sub-library, the N pieces of face data having the highest similarity with the face data to be searched as a second sub-library to generate M second sub-libraries further comprises:
step A, for each first sub-library, selecting any piece of face data other than the N pieces of face data, calculating the L2 distance between that face data and the face data to be searched, and comparing the calculated L2 distance with the maximum value of the first data list;
step B, if the calculated L2 distance is smaller than the maximum value, replacing the maximum value with the calculated L2 distance to generate a second data list;
step C, screening out a second maximum value of the second data list;
step D, replacing the first maximum value in the maximum value variable with the second maximum value, and replacing the address corresponding to the first maximum value in the preset index variable with the address corresponding to the second maximum value;
step E, taking the second data list as the first data list;
and repeatedly executing step A to step E until all the face data in the M first sub-libraries have been screened.
3. The method for searching face data according to claim 1, wherein the step of screening, for each first sub-library, N pieces of face data having the highest similarity with the face data to be searched as second sub-libraries to generate M pieces of second sub-libraries specifically comprises:
evenly dividing all the first sub-libraries into K parts based on a preset field programmable gate array (FPGA); wherein each part corresponds to a set of preset comparators;
sequentially inputting each first sub-library in each part into a corresponding comparator by adopting a time-sharing multiplexing mechanism, and screening out the first N pieces of face data with the highest similarity with the face data to be searched as a second sub-library;
combining the second sub-libraries in the K parts to obtain M second sub-libraries; the FPGA is provided with M RAMs, each RAM is provided with a data list with the capacity of N, and M is an integral multiple of K.
4. The method according to claim 1, wherein the step of selecting T pieces of face data with the highest similarity to the face data to be searched from M pieces of second sub-libraries as search results comprises:
storing the M second sub-libraries into an internal memory, and sorting the M × N pieces of face data based on the similarity to obtain a sorted sequence;
and selecting the T pieces of face data with the highest similarity from the sorted sequence as search results.
5. An apparatus for searching face data, comprising:
the dividing module is used for randomly and evenly dividing a preset face database into M parts to obtain M first sub-libraries when face data to be searched is received;
the first screening module is used for screening out, for each first sub-library, the N pieces of face data with the highest similarity to the face data to be searched as a second sub-library, generating M second sub-libraries;
the second screening module is used for selecting the T pieces of face data with the highest similarity to the face data to be searched from the M second sub-libraries as a search result; wherein M and T are positive integers;
wherein the first screening module comprises:
the data list generation submodule is used for randomly selecting N pieces of face data from each first sub-library, calculating the N L2 distances between the N pieces of face data and the face data to be searched, and generating a first data list; wherein N is a positive integer;
the maximum value screening submodule is used for screening out a first maximum value of the first data list;
and the storage submodule is used for storing the first maximum value into a preset maximum value variable and storing an address corresponding to the first maximum value into a preset index variable.
6. The apparatus for searching face data according to claim 5, wherein the first screening module further comprises:
the first comparison submodule is used for selecting, for each first sub-library, any piece of face data other than the N pieces of face data, calculating the L2 distance between that face data and the face data to be searched, and comparing the calculated L2 distance with the maximum value of the first data list;
the replacing submodule is used for replacing the maximum value with the calculated L2 distance if the calculated L2 distance is smaller than the maximum value, thereby generating a second data list;
the second comparison submodule is used for screening out a second maximum value of the second data list;
the replacing submodule is further configured to replace a first maximum value in the maximum value variable with the second maximum value, and replace an address corresponding to the first maximum value in a preset index variable with an address corresponding to the second maximum value;
the updating submodule is used for taking the second data list as a first data list;
and the first comparison sub-module, the replacement sub-module, the second comparison sub-module and the updating sub-module are repeatedly executed until all the face data in the M first sub-libraries are screened.
7. An electronic device, comprising:
a processor, a memory, and a bus;
the bus is used for connecting the processor and the memory;
the memory is used for storing operation instructions;
the processor is configured to execute the method for searching the face data according to any one of claims 1 to 4 by calling the operation instruction.
8. A computer-readable storage medium for storing computer instructions which, when executed on a computer, enable the computer to perform the method for searching face data according to any one of claims 1 to 4.
CN201910285920.5A 2019-04-10 2019-04-10 Face data searching method and device, electronic equipment and readable storage medium Active CN109992687B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910285920.5A CN109992687B (en) 2019-04-10 2019-04-10 Face data searching method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910285920.5A CN109992687B (en) 2019-04-10 2019-04-10 Face data searching method and device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN109992687A CN109992687A (en) 2019-07-09
CN109992687B true CN109992687B (en) 2021-05-25

Family

ID=67133057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910285920.5A Active CN109992687B (en) 2019-04-10 2019-04-10 Face data searching method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN109992687B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326714B (en) * 2020-02-28 2024-03-22 杭州海康威视数字技术股份有限公司 Target comparison method, target comparison device, electronic equipment and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105373626A (en) * 2015-12-09 2016-03-02 深圳融合永道科技有限公司 Distributed face recognition track search system and method
CN105608464A (en) * 2015-12-15 2016-05-25 重庆凯泽科技有限公司 Face recognition mobile terminal solving system and solving method
CN106529515A (en) * 2016-12-26 2017-03-22 浙江宇视科技有限公司 Facial feature library management method and system
CN107291706A (en) * 2016-03-30 2017-10-24 杭州海康威视数字技术股份有限公司 A kind of picture retrieval method and device
CN109426781A (en) * 2017-08-29 2019-03-05 阿里巴巴集团控股有限公司 Construction method, face identification method, device and the equipment of face recognition database

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7881505B2 (en) * 2006-09-29 2011-02-01 Pittsburgh Pattern Recognition, Inc. Video retrieval system for human face content

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
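Research on Face Recognition Based on Large-Scale Databases; Zhang Shijia (张诗嘉); China Master's Theses Full-text Database, Information Science and Technology; 2010-02-15 (No. 2); pp. I138-321 *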

Also Published As

Publication number Publication date
CN109992687A (en) 2019-07-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant