CN111008620A - Target user identification method and device, storage medium and electronic equipment - Google Patents

Target user identification method and device, storage medium and electronic equipment

Publication number
CN111008620A
Authority
CN
China
Prior art keywords
candidate
similarity
vector
feature vector
biological
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010147725.9A
Other languages
Chinese (zh)
Inventor
江南
杨文�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010147725.9A
Publication of CN111008620A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/94 Hardware or software architectures specially adapted for image or video understanding
    • G06V10/95 Hardware or software architectures specially adapted for image or video understanding structured as a network, e.g. client-server architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/70 Multimodal biometrics, e.g. combining information from different biometric modalities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The application discloses a target user identification method and apparatus, a storage medium, and an electronic device. The target user identification method comprises: acquiring a to-be-identified biometric feature vector of a target user; in a target index library, based on a nearest neighbor search algorithm, comparing the to-be-identified biometric feature vector for similarity with a located starting node and with that node's neighbor nodes to generate a candidate result set, wherein the target index library is an index obtained by performing an index construction operation on the biometric feature vectors in a biometric feature library using the nearest neighbor search algorithm, and the candidate result set comprises candidate biometric feature vectors each carrying a similarity; and determining an identification result according to the candidate biometric feature vectors carrying the similarities and a preset reliability threshold.

Description

Target user identification method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of object recognition technologies, and in particular, to a method and an apparatus for identifying a target user, a storage medium, and an electronic device.
Background
In many everyday scenarios, a person's identity must be recognized and the corresponding user information obtained. Examples include paying for goods based on a biometric feature (for example, recognizing a face or fingerprint and obtaining the associated user information to complete the payment), logging in to an application (for example, an instant messaging application or a payment application) using user information obtained from a recognized biometric feature, and clocking in for attendance using user information obtained from a recognized biometric feature.
In the prior art, when a user's identity is recognized from biometric features, the features are collected in real time and compared one by one against only a small number of biometric features pre-recorded in a database. For example, when clocking in for attendance, the collected biometric features are compared one by one against a pre-recorded feature vector library of a company's staff to complete identity recognition. As another example, when paying for goods based on biometric features, identity recognition is completed simply by comparing the collected biometric features with those associated with the currently logged-in account. However, when a vending machine sells goods or a security system performs monitoring, identifying a user and obtaining user information from collected biometric features requires comparing them one by one against the billions of biometric features of billions of users in a back-end database. This consumes a large amount of time, and searching and comparing against a back-end database at the scale of billions of biometric features can also greatly reduce the accuracy of user identification.
Disclosure of Invention
One or more embodiments of the present specification provide a target user identification method and apparatus, a storage medium, and an electronic device, to solve the problems of long processing time and low identification accuracy when collected biometric features are searched for and compared against a back-end database at the scale of billions of biometric features.
In a first aspect, one or more embodiments of the present specification provide a target user identification method, including:
acquiring a to-be-identified biometric feature vector of a target user;
in a target index library, based on a nearest neighbor search algorithm, comparing the to-be-identified biometric feature vector for similarity with a located starting node and with that node's neighbor nodes to generate a candidate result set, wherein the target index library is an index obtained by performing an index construction operation on the biometric feature vectors in a biometric feature library using the nearest neighbor search algorithm, and the candidate result set comprises candidate biometric feature vectors each carrying a similarity;
and determining an identification result according to the candidate biometric feature vectors carrying the similarities and a preset reliability threshold.
In a second aspect, one or more embodiments of the present specification further provide a target user identification apparatus, including:
an information acquisition unit, configured to acquire a to-be-identified biometric feature vector of a target user;
a result generation unit, configured to compare, in a target index library, the to-be-identified biometric feature vector for similarity with a located starting node and with that node's neighbor nodes to generate a candidate result set, wherein the target index library is established in advance according to a feature vector library, and the candidate result set comprises a plurality of extracted candidate biometric feature vectors each carrying a similarity;
and an information determining unit, configured to determine an identification result according to the candidate biometric feature vectors carrying the similarities and a preset reliability threshold.
In a third aspect, one or more embodiments of the present specification further provide a storage medium having stored thereon a computer program that, when executed by a processor, implements:
acquiring a to-be-identified biometric feature vector detected from a user;
inputting the to-be-identified biometric feature vector into a target index library established in advance according to a feature vector library to generate a candidate result set, wherein the candidate result set comprises a plurality of extracted candidate biometric feature vectors each carrying a similarity;
and determining, as target biometric feature vectors, the top N candidate biometric feature vectors whose similarity is greater than a preset reliability threshold, wherein the target biometric feature vectors are associated with user information.
In a fourth aspect, one or more embodiments of the present specification further provide an electronic device, including:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement:
acquiring a to-be-identified biometric feature vector of a target user;
in a target index library, based on a nearest neighbor search algorithm, comparing the to-be-identified biometric feature vector for similarity with a located starting node and with that node's neighbor nodes to generate a candidate result set, wherein the target index library is an index obtained by performing an index construction operation on the biometric feature vectors in a biometric feature library using the nearest neighbor search algorithm, and the candidate result set comprises candidate biometric feature vectors each carrying a similarity;
and determining an identification result according to the candidate biometric feature vectors carrying the similarities and a preset reliability threshold.
One or more embodiments of the present specification adopt the following technical solution: in a target index library, based on a nearest neighbor search algorithm, the to-be-identified biometric feature vector is compared for similarity with a located starting node and with that node's neighbor nodes to generate a candidate result set, the candidate result set comprising candidate biometric feature vectors each carrying a similarity; an identification result is then determined according to the candidate biometric feature vectors carrying the similarities and a preset reliability threshold. Because the acquired to-be-identified biometric feature vector does not need to be compared one by one with the candidate biometric feature vectors in the feature vector library, a large amount of time is saved, the accuracy of user identification is improved, and resource consumption during retrieval is reduced.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic diagram of interaction between a server and a biometric acquisition device according to one or more embodiments of the invention;
FIG. 2 is a flow diagram of a target user identification method according to one or more embodiments of the invention;
FIG. 3 is a flow diagram of a target user identification method according to one or more embodiments of the invention;
FIG. 4 is a flow diagram of a target user identification method according to one or more embodiments of the invention;
FIG. 5 is a block diagram of functional units of a target user identification apparatus according to one or more embodiments of the invention;
FIG. 6 is a block diagram of functional units of a target user identification apparatus according to one or more embodiments of the invention;
FIG. 7 is a block diagram of functional units of a target user identification apparatus according to one or more embodiments of the invention;
FIG. 8 is a circuit connection block diagram of an electronic device according to one or more embodiments of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Nearest neighbor search algorithm: an optimization problem of finding the closest point in a metric space. Given a set of points S in a metric space M and a target point q ∈ M, the task is to find the point in S closest to q. In many cases, M is a multidimensional Euclidean space, and distance is measured by the Euclidean distance or the Manhattan distance. A nearest neighbor search algorithm may be implemented by, for example, a tree-based method, a hashing method, a vector quantization method, or the hierarchical navigable small world graph method.
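As a point of reference for the definition above, the following is a minimal sketch of exact (exhaustive) nearest neighbor search under the Euclidean or Manhattan distance. The function names and the sample points are illustrative assumptions and do not come from the application; this one-by-one scan is the baseline that index-based methods are meant to avoid.

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest_neighbor(points, q, distance=euclidean):
    """Return (index, distance) of the point in `points` closest to `q`.

    Every point is compared with q, so the cost grows linearly with len(points).
    """
    best_index = min(range(len(points)), key=lambda i: distance(points[i], q))
    return best_index, distance(points[best_index], q)

# Hypothetical sample data: S is the point set, q the target point.
S = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
q = (0.9, 1.2)
print(nearest_neighbor(S, q))                      # index 2 is closest
print(nearest_neighbor(S, q, distance=manhattan))  # same index under L1
```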
Hierarchical navigable small world graph: the Hierarchical Navigable Small World graph (HNSW) is a nearest neighbor search method that builds a hierarchical index graph using an idea similar to the skip list structure, in which each upper layer is obtained by sampling the data of the layer below it, and a variant of the greedy algorithm is used to perform the nearest neighbor search. HNSW achieves both a high recall rate and a high search speed.
One embodiment of the present disclosure provides a target user identification method, which may be applied to the server 102. As shown in fig. 1, the server 102 may be communicatively connected with the biometric acquisition device 101 for data interaction. The biometric acquisition device 101 may be at least one of a 2D image acquisition module, a 3D point cloud image acquisition module, a voiceprint acquisition module, an iris recognition module, and other biometric acquisition modules. As shown in fig. 2, the method includes:
S22: Acquire the to-be-identified biometric feature vector detected from the user.
When a user passes the biometric acquisition device 101, the device can collect the user's to-be-identified biometric feature vector and upload it to the server 102, so that the server 102 acquires the to-be-identified biometric feature vector detected from the user. The format of the to-be-identified biometric feature vector may be, but is not limited to, a 256-dimensional floating-point form, a 256-dimensional integer form, or a binary feature form.
S24: Input the to-be-identified biometric feature vector into a target index library established in advance according to the feature vector library to generate a candidate result set. The candidate result set includes a plurality of extracted candidate biometric feature vectors each carrying a similarity.
In the embodiments of the present invention, the target index library may be implemented by, but is not limited to, a tree-based method, a hashing method, a vector quantization method, the NSG algorithm, or the hierarchical navigable small world graph method. In the embodiments of the present application, the target index library is described using an HNSW model as an example.
It will be appreciated that the feature vector library includes a large number of candidate biometric feature vectors. The HNSW model may be constructed in, but is not limited to, the following manner. Each candidate biometric feature vector in the feature vector library is taken as a node of the HNSW model, so that the feature vector library serves as a node set P. The node set P is then traversed: the first node in P is inserted into the HNSW index as the initial node, a maximum number of edges ef is set for the current node, and the following steps are performed starting from layer 0 of the HNSW index:
1. Insert the current node into the current layer of the HNSW index, and randomly select a node that already exists in the current layer as the entry node. Compute the distances between the entry node's neighbor nodes and the current node using the Euclidean distance or the cosine distance, use a greedy algorithm to find the ef nearest nodes in the current layer, and connect the current node to each of them with an edge; nodes connected in this way are called neighbor nodes.
2. Check the neighbor nodes of the current node. If none of them exists in the upper layer, promote the current node to the upper layer and return to step 1. If the number of the current node's neighbor nodes that exist in the upper layer, or the number of neighbor nodes of the current node, is less than ef, the insertion of the current node into the HNSW index is complete.
The HNSW model is thereby established; the generated HNSW model is a binary file that can be loaded.
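The edge-creation part of step 1 can be illustrated with a deliberately simplified sketch. It builds a single-layer graph only: the layer promotion of step 2 is omitted, and the ef nearest existing nodes are found by an exhaustive scan rather than by a greedy graph search. The function name `build_flat_graph` and the sample vectors are assumptions made for illustration, not the application's implementation.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_flat_graph(vectors, ef):
    """Insert vectors one by one, linking each new node to its ef nearest
    already-inserted nodes. Returns a dict: node id -> set of neighbor ids.

    Simplifications (assumptions of this sketch): one layer only, exhaustive
    scan instead of greedy search, and no pruning of a node's edge count when
    later insertions add edges back to it.
    """
    neighbors = {}
    for node_id, vec in enumerate(vectors):
        existing = list(neighbors)          # nodes inserted so far
        neighbors[node_id] = set()
        nearest = sorted(existing, key=lambda i: euclidean(vectors[i], vec))[:ef]
        for other in nearest:
            neighbors[node_id].add(other)   # edges are undirected
            neighbors[other].add(node_id)
    return neighbors

# Hypothetical 2-dimensional vectors standing in for biometric feature vectors.
graph = build_flat_graph([(0, 0), (1, 0), (5, 5), (6, 5)], ef=1)
for node, linked in sorted(graph.items()):
    print(node, sorted(linked))
```

With ef set to 1, each inserted node links to exactly one earlier node, so the four sample vectors form a chain; larger values of ef give a denser graph that is easier to navigate at the cost of more memory and more comparisons per step.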
Similarity comparison with the HNSW model works as follows. Any feature point (node) in the HNSW model is selected at random as the starting feature point and compared for similarity with the to-be-identified biometric feature vector; the neighbor feature points of that feature point are then compared in the same way. The neighbor feature point with the highest similarity, provided that it exceeds the similarity of the starting feature point, becomes the new starting feature point. This is iterated until no neighbor feature point with a higher similarity than the latest starting feature point can be found. The feature points compared along the way yield a plurality of candidate biometric feature vectors carrying similarities.
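The iteration described above can be sketched as a greedy walk over a neighbor graph. This is a minimal single-layer illustration under stated assumptions: cosine similarity is used as the similarity measure, the starting feature point is passed in explicitly (instead of being drawn at random) so that the run is reproducible, and the names `greedy_search`, `vectors`, and `neighbors` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def greedy_search(vectors, neighbors, query, start):
    """Walk from `start`, always moving to the most similar neighbor, and stop
    when no neighbor is more similar to `query` than the current node.

    Returns (best node id, list of (node id, similarity) for every node that
    was compared, sorted from most to least similar).
    """
    current = start
    current_sim = cosine_similarity(vectors[current], query)
    visited = {current: current_sim}
    while True:
        for n in neighbors[current]:
            if n not in visited:
                visited[n] = cosine_similarity(vectors[n], query)
        best = max(neighbors[current], key=lambda n: visited[n], default=current)
        if visited[best] <= current_sim:
            break  # local optimum: no neighbor improves on the current node
        current, current_sim = best, visited[best]
    candidates = sorted(visited.items(), key=lambda kv: kv[1], reverse=True)
    return current, candidates

# Hypothetical data: four unit-length vectors linked in a chain 0-1-2-3.
vectors = [(1.0, 0.0), (0.8, 0.6), (0.6, 0.8), (0.0, 1.0)]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
best, candidates = greedy_search(vectors, neighbors, query=(0.5, 0.9), start=0)
print(best)
print([node for node, _ in candidates])
```

Only the nodes along the walk and their neighbors are compared with the query, which is why the search does not need to touch every vector in the library. The walk can stop at a local optimum, so the result is approximate.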
The candidate result set may be expressed in the following form: [{candidate biometric feature vector 1, similarity X1}, {candidate biometric feature vector 2, similarity X2}, ..., {candidate biometric feature vector N, similarity XN}].
S26: Determine, as target biometric feature vectors, the top N candidate biometric feature vectors whose similarity is greater than a preset reliability threshold. The target biometric feature vectors are associated with user information.
The preset reliability threshold is set in advance based on a large number of experimental results and is used to balance the true accept rate (TAR) against the false accept rate (FAR). For example, the preset reliability threshold may be set to, but is not limited to, 80%, 85%, or 90%.
For example, when the preset reliability threshold is set to 80% and N = 2, if the candidate result set is [{candidate biometric feature vector 1, similarity 65%}, {candidate biometric feature vector 2, similarity 70%}, {candidate biometric feature vector 3, similarity 75%}, {candidate biometric feature vector 4, similarity 79%}, {candidate biometric feature vector 5, similarity 81%}, {candidate biometric feature vector 6, similarity 83%}, {candidate biometric feature vector 7, similarity 84%}, {candidate biometric feature vector 8, similarity 85%}], then the target biometric feature vectors are candidate biometric feature vector 7 and candidate biometric feature vector 8.
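The selection in S26 can be sketched as a filter followed by a top-N cut. The dictionary layout and the name `select_targets` are illustrative assumptions; the similarity values reproduce the example above, written as fractions.

```python
def select_targets(candidates, threshold, n):
    """Keep candidates whose similarity is strictly greater than `threshold`,
    then return the n most similar of them (most similar first)."""
    reliable = [c for c in candidates if c["similarity"] > threshold]
    reliable.sort(key=lambda c: c["similarity"], reverse=True)
    return reliable[:n]

# Candidate result set from the example: vectors 1..8 with their similarities.
similarities = [0.65, 0.70, 0.75, 0.79, 0.81, 0.83, 0.84, 0.85]
candidates = [{"id": i, "similarity": s} for i, s in enumerate(similarities, start=1)]

targets = select_targets(candidates, threshold=0.80, n=2)
print([t["id"] for t in targets])  # [8, 7]
```

If no candidate exceeds the threshold, the returned list is empty, which a caller could treat as "user not recognized"; that handling is an assumption of this sketch and is not specified in the text.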
The user information associated with the target biometric feature vector differs between application scenarios. For example, in the field of security monitoring, the associated user information may include the user's identification number, gender, age, occupation, and the like; in the field of unattended electronic payment, it may include the account number, password, and the like of the user's electronic wallet. The user information may be configured in advance according to actual needs.
One embodiment of the present specification provides a target user identification method in which the acquired to-be-identified biometric feature vector is input into a target index library established in advance according to a feature vector library to generate a candidate result set, the candidate result set including a plurality of extracted candidate biometric feature vectors each carrying a similarity, and the top N candidate biometric feature vectors whose similarity is greater than a preset reliability threshold are determined as target biometric feature vectors, the target biometric feature vectors being associated with user information. With this method, it suffices to randomly select a feature point in the target index library as the starting feature point for similarity comparison, compare the neighbor feature points of that feature point, take the neighbor feature point with the highest similarity that exceeds the similarity of the starting feature point as the new starting feature point, and iterate until no neighbor feature point with a higher similarity than the latest starting feature point can be found. The acquired to-be-identified biometric features therefore do not need to be compared one by one with the candidate biometric features in the feature vector library, which saves a large amount of time, improves the accuracy of the obtained user information, and reduces resource consumption during retrieval.
Optionally, the to-be-identified biometric feature vector includes sub-feature vectors of multiple dimensions, the target index library includes sub-indexes of multiple dimensions, and the sub-feature vectors correspond one-to-one to the sub-indexes. For example, the to-be-identified biometric feature vector includes 4 dimensions, which may be, but are not limited to, a 2D image feature dimension, a 3D point cloud image feature dimension, a voiceprint feature dimension, and an iris feature dimension, each of which corresponds to one sub-index. It can be understood that the sub-index for the 2D image feature dimension is established in advance from a 2D image feature vector library, the sub-index for the 3D point cloud image feature dimension from a 3D point cloud image feature vector library, the sub-index for the voiceprint feature dimension from a voiceprint feature vector library, and the sub-index for the iris feature dimension from an iris feature vector library.
Specifically, as shown in fig. 3, S24 includes:
S31: Input the sub-feature vectors of the different dimensions into the sub-indexes of the corresponding dimensions for similarity comparison, so that the sub-index of each dimension generates a first candidate result subset. Each first candidate result subset consists of a plurality of candidate biometric feature vectors, each carrying a similarity, extracted for the same dimension.
Based on the above 4 dimensions, the candidate biometric feature vectors with similarity extracted according to the 4 dimensions may be:
2D image feature dimension: [{candidate 2D image feature vector 1, similarity X1}, {candidate 2D image feature vector 2, similarity X2}, ..., {candidate 2D image feature vector N, similarity XN}].
3D point cloud image feature dimension: [{candidate 3D point cloud image feature vector 1, similarity Y1}, {candidate 3D point cloud image feature vector 2, similarity Y2}, ..., {candidate 3D point cloud image feature vector N, similarity YN}].
Voiceprint feature dimension: [{candidate voiceprint feature vector 1, similarity W1}, {candidate voiceprint feature vector 2, similarity W2}, ..., {candidate voiceprint feature vector N, similarity WN}].
Iris feature dimension: [{candidate iris feature vector 1, similarity Z1}, {candidate iris feature vector 2, similarity Z2}, ..., {candidate iris feature vector N, similarity ZN}].
The candidate 2D image feature vector 1, the candidate 3D point cloud image feature vector 1, the candidate voiceprint feature vector 1 and the candidate iris feature vector 1 belong to sub-feature vectors of the candidate biological feature vector 1; candidate 2D image feature vector 2, candidate 3D point cloud image feature vector 2, candidate voiceprint feature vector 2, candidate iris feature vector 2 belong to the sub-feature vectors of candidate biological feature vector 2, and so on.
S32: Generate the candidate result set according to the plurality of first candidate result subsets and a preset similarity determination rule.
The preset similarity determination rule may be, but is not limited to, a weighted average model, a gradient descent tree model, or the RankNet algorithm. Understandably, a similarity obtained by combining the similarities of sub-feature vectors of different dimensions is more reliable for evaluating how similar the user is to a candidate biometric feature vector in the feature vector library.
Specifically, S32 may include:
across the plurality of first candidate result subsets, applying the similarity determination rule to the similarities of the same candidate biometric feature vector in the different dimensions to determine a similarity for each candidate biometric feature vector; and constructing the candidate result set from the similarities of the candidate biometric feature vectors.
For example, based on the above, the similarity of the candidate biological feature vector 1 is determined according to the respective corresponding similarities of the candidate 2D image feature vector 1, the candidate 3D point cloud image feature vector 1, the candidate voiceprint feature vector 1 and the candidate iris feature vector 1; and determining the similarity of the candidate biological feature vector 2 according to the similarity corresponding to the candidate 2D image feature vector 2, the candidate 3D point cloud image feature vector 2, the candidate voiceprint feature vector 2 and the candidate iris feature vector 2 respectively, and so on.
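Of the similarity determination rules named above, the weighted average is the simplest to illustrate. The following is a minimal sketch, assuming each first candidate result subset is a mapping from candidate identifier to similarity; the modality names, the equal weights, the similarity values, and the choice to average only over the dimensions in which a candidate actually appears are all assumptions of this sketch.

```python
def fuse_by_weighted_average(subsets, weights):
    """Combine per-dimension similarities into one similarity per candidate.

    subsets: dimension name -> {candidate id: similarity}
    weights: dimension name -> non-negative weight
    A candidate missing from some dimension is averaged over the dimensions
    in which it does appear (an assumption of this sketch).
    """
    candidate_ids = set().union(*(scores.keys() for scores in subsets.values()))
    fused = {}
    for cid in candidate_ids:
        weighted_sum = 0.0
        weight_total = 0.0
        for dimension, scores in subsets.items():
            if cid in scores:
                weighted_sum += weights[dimension] * scores[cid]
                weight_total += weights[dimension]
        fused[cid] = weighted_sum / weight_total
    return fused

# Hypothetical first candidate result subsets for candidates 1 and 2.
subsets = {
    "image_2d":       {1: 0.90, 2: 0.70},
    "point_cloud_3d": {1: 0.80, 2: 0.60},
    "voiceprint":     {1: 0.70, 2: 0.90},
    "iris":           {1: 1.00, 2: 0.80},
}
weights = {"image_2d": 0.25, "point_cloud_3d": 0.25, "voiceprint": 0.25, "iris": 0.25}
fused = fuse_by_weighted_average(subsets, weights)
print({cid: round(sim, 2) for cid, sim in sorted(fused.items())})
```

The fused similarities can then be written into the candidate result set in the same {candidate, similarity} form used for the single-index case.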
Optionally, for a feature vector library containing billion-scale data, building a single target index library requires a very large amount of computation, and so does comparing the to-be-identified biometric feature vector for similarity against it. The feature vector library may therefore be divided into a plurality of sub-feature vector libraries (for example, a feature vector library of billion-scale data is divided evenly into 4 sub-feature vector libraries); the target index library then includes a plurality of sub-indexes, and the sub-feature vector libraries correspond one-to-one to the sub-indexes. As shown in fig. 4, S24 includes:
S41: Input the to-be-identified biometric feature vector into each of the different sub-indexes for similarity comparison, so that each sub-index generates a second candidate result subset. Each second candidate result subset includes a plurality of candidate biometric feature vectors, each carrying a similarity, extracted from the same sub-feature vector library.
For example, continuing the above, the sub-indexes corresponding to the 4 sub-feature vector libraries are sub-index 1, sub-index 2, sub-index 3, and sub-index 4. The to-be-identified biometric feature vector is input into sub-index 1 to obtain second candidate result subset 1, into sub-index 2 to obtain second candidate result subset 2, into sub-index 3 to obtain second candidate result subset 3, and into sub-index 4 to obtain second candidate result subset 4.
S42: Merge and sort the candidate biometric feature vectors carrying similarities extracted from the different sub-feature vector libraries, and extract the candidate result set.
For example, when second candidate result subset 1 is [{candidate biometric feature vector 1, similarity 65%}, {candidate biometric feature vector 2, similarity 70%}], second candidate result subset 2 is [{candidate biometric feature vector 3, similarity 75%}, {candidate biometric feature vector 4, similarity 83%}], second candidate result subset 3 is [{candidate biometric feature vector 5, similarity 72%}, {candidate biometric feature vector 6, similarity 85%}], and second candidate result subset 4 is [{candidate biometric feature vector 7, similarity 77%}, {candidate biometric feature vector 8, similarity 84%}], the candidate result set extracted after merging and sorting is [{candidate biometric feature vector 8, similarity 84%}, {candidate biometric feature vector 6, similarity 85%}].
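The merge-and-sort step S42 can be sketched as follows. Each second candidate result subset is represented as a list of (candidate id, similarity) pairs, and the best n entries across all subsets are kept. The name `merge_subsets`, the tuple layout, and the cut-off n = 2 are assumptions of this sketch; the similarity values are modeled on the example above.

```python
import heapq

def merge_subsets(subsets, n):
    """Flatten the per-sub-index result lists and return the n entries with
    the highest similarity, most similar first."""
    merged = [entry for subset in subsets for entry in subset]
    return heapq.nlargest(n, merged, key=lambda entry: entry[1])

# One list per sub-index; each entry is (candidate id, similarity).
subsets = [
    [(1, 0.65), (2, 0.70)],
    [(3, 0.75), (4, 0.83)],
    [(5, 0.72), (6, 0.85)],
    [(7, 0.77), (8, 0.84)],
]
print(merge_subsets(subsets, n=2))  # [(6, 0.85), (8, 0.84)]
```

Because the sub-indexes are independent of one another, the per-sub-index searches in S41 could run in parallel before this merge; the text does not state whether they do, so that is only a possible design choice.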
Referring to fig. 5, one or more embodiments of the invention further provide a target user identification apparatus 500, which can be applied to the server 102. As shown in fig. 1, the server 102 may be communicatively connected with the biometric acquisition device 101 for data interaction. The biometric acquisition device 101 may be at least one of a 2D image acquisition module, a 3D point cloud image acquisition module, a voiceprint acquisition module, an iris recognition module, and the like. It should be noted that the basic principle and technical effects of the target user identification apparatus 500 provided in this embodiment are the same as those of the above embodiments; for brevity, for any part not mentioned in this embodiment, reference may be made to the corresponding content in the above embodiments. The apparatus 500 comprises an information acquisition unit 501, a result generation unit 502, and an information determination unit 503, wherein:
the information acquisition unit 501 acquires a biometric feature vector to be identified that is detected from the user.
The result generating unit 502 inputs the biometric feature vectors to be identified into a target index library established in advance according to a feature vector library to generate a candidate result set, where the candidate result set includes a plurality of extracted candidate biometric feature vectors with similarity.
The information determining unit 503 determines the top N candidate biometric vectors with similarity greater than a preset reliability threshold as target biometric vectors, respectively, where the target biometric vectors are associated with user information.
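The thresholding performed by the information determining unit can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_targets`, the `(vector_id, similarity)` tuples, and the `user_info` lookup table are hypothetical names introduced here.

```python
def select_targets(candidates, threshold, top_n):
    """Return the top_n candidates whose similarity exceeds the threshold."""
    # Keep only candidates strictly above the preset reliability threshold.
    reliable = [c for c in candidates if c[1] > threshold]
    # Order the survivors by similarity, highest first.
    reliable.sort(key=lambda c: c[1], reverse=True)
    return reliable[:top_n]


# Hypothetical candidate result set and user association table.
candidates = [("v3", 0.75), ("v8", 0.84), ("v6", 0.85)]
user_info = {"v6": "user-A", "v8": "user-B", "v3": "user-C"}

targets = select_targets(candidates, threshold=0.80, top_n=1)
matched_users = [user_info[vector_id] for vector_id, _ in targets]
print(targets, matched_users)
```

If no candidate exceeds the threshold, the function returns an empty list, which a caller could treat as "no reliable match".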
One embodiment of the present disclosure provides a target user identification apparatus 500 which, in operation, implements the following functions: inputting the acquired biometric feature vector to be identified into a target index library established in advance according to a feature vector library to generate a candidate result set, where the candidate result set includes a plurality of extracted candidate biometric feature vectors with similarity; and determining the top N candidate biometric feature vectors whose similarity is greater than a preset reliability threshold as target biometric feature vectors, where each target biometric feature vector is associated with user information. With this approach, any feature point in the target index library is randomly selected as the initial feature point for similarity comparison, and the similarity comparison is then performed with the neighbor feature points of that feature point. The neighbor feature point whose similarity is highest, and higher than that of the initial feature point, is taken as the new initial feature point. This is iterated until no neighbor feature point with a higher similarity than the latest initial feature point can be found. The acquired biometric feature to be identified therefore does not need to be compared one by one with the candidate biometric features in the feature vector library, which saves a great deal of time, improves the accuracy of the acquired user information, and reduces resource consumption during retrieval.
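The iterative neighbor comparison described above can be sketched as a greedy walk over a neighbor graph. The following Python example is a simplified illustration under stated assumptions: cosine similarity is used as the similarity measure, the index is a plain dictionary mapping each node to its neighbor list, and the names `cosine`, `greedy_search`, `vectors`, and `neighbors` are hypothetical. The starting node is passed in explicitly so that the example is reproducible; the text describes selecting it at random.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def greedy_search(query, vectors, neighbors, start):
    """Walk the neighbor graph, always moving to the most similar neighbor.

    Stops when no neighbor is more similar to the query than the current node.
    """
    current = start
    current_sim = cosine(query, vectors[current])
    while True:
        best, best_sim = current, current_sim
        for node in neighbors[current]:
            sim = cosine(query, vectors[node])
            if sim > best_sim:
                best, best_sim = node, sim
        if best == current:
            # No neighbor improves on the current node: return it.
            return current, current_sim
        current, current_sim = best, best_sim


# Hypothetical four-node index: feature vectors and their neighbor lists.
vectors = {"a": (1.0, 0.0), "b": (0.7, 0.7), "c": (0.0, 1.0), "d": (-1.0, 0.0)}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

node, similarity = greedy_search((0.1, 1.0), vectors, neighbors, start="a")
print(node, round(similarity, 3))
```

Only the nodes along the walk and their neighbors are compared with the query, rather than every vector in the library. A greedy walk of this kind can stop at a local optimum, so the result is approximate.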
Optionally, the biometric vector to be identified includes sub-feature vectors of multiple dimensions, the target index library includes sub-indexes of multiple dimensions, and the sub-feature vectors correspond to the sub-indexes one to one. As shown in fig. 6, the result generation unit 502 includes:
the first initial result generating module 601 respectively inputs the sub-feature vectors of different dimensions to the sub-indexes of corresponding dimensions for similarity comparison, so that the sub-indexes of multiple dimensions all generate a first candidate result subset.
Each first candidate result subset is a plurality of candidate biological feature vectors with similarity extracted aiming at the same dimensionality.
The candidate result set generating module 602 generates a candidate result set according to the plurality of first candidate result subsets and a preset similarity determination rule.
Specifically, for each candidate biometric feature vector, the candidate result set generating module applies the similarity determination rule to the similarities that the same candidate biometric feature vector obtained in the different dimensions across the plurality of first candidate result subsets, thereby determining a single similarity for that candidate, and then constructs the candidate result set according to the similarity of each candidate biometric feature vector.
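One possible way to combine per-dimension similarities is sketched below. The weighted-average rule, the treatment of a candidate missing from a dimension as contributing zero, and the names `fuse_similarities`, `first_subsets`, and `weights` are assumptions chosen purely for illustration; this passage does not spell out the preset similarity determination rule itself.

```python
def fuse_similarities(first_subsets, weights):
    """Combine per-dimension similarities into one score per candidate.

    first_subsets maps a dimension name to {vector_id: similarity}.
    weights maps a dimension name to its (assumed) weight.
    """
    total_weight = sum(weights.values())
    fused = {}
    for dimension, subset in first_subsets.items():
        for vector_id, similarity in subset.items():
            # A candidate absent from a dimension simply adds nothing for it.
            fused[vector_id] = fused.get(vector_id, 0.0) + weights[dimension] * similarity
    scored = [(vector_id, score / total_weight) for vector_id, score in fused.items()]
    # Highest fused similarity first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical first candidate result subsets for two dimensions.
first_subsets = {
    "face_2d": {"u1": 0.9, "u2": 0.8},
    "voiceprint": {"u1": 0.7, "u3": 0.6},
}
weights = {"face_2d": 0.5, "voiceprint": 0.5}

candidate_result_set = fuse_similarities(first_subsets, weights)
print(candidate_result_set)
```

Under this assumed rule, a candidate that scores well in several dimensions ranks above one that appears in only a single dimension.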
Optionally, the feature vector library is divided into a plurality of sub-feature vector libraries, the target index library includes a plurality of sub-indexes, the sub-feature vector libraries correspond to the sub-indexes one by one, as shown in fig. 7, and the result generating unit 502 includes:
the second initial result generating module 701 respectively inputs the biometric feature vectors to be identified into different sub-indexes for similarity comparison, so that a plurality of sub-indexes all generate second candidate result subsets, wherein each second candidate result subset comprises a plurality of candidate biometric feature vectors with similarity extracted from the same sub-feature vector library.
The candidate result set generating module 702 merges and sorts the plurality of candidate biometric feature vectors with similarity extracted from different sub-feature libraries, and extracts a candidate result set.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application, where the electronic device may be the server 102 according to the above-described embodiment. Referring to fig. 8, at a hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface, and a memory. The memory may include internal memory, such as random-access memory (RAM), and may further include non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in fig. 8, but this does not indicate that there is only one bus or one type of bus.
The memory is used for storing programs. In particular, a program may include program code comprising computer operating instructions. The memory may include both internal memory and non-volatile storage, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, thereby forming the target user identification apparatus 500 at a logical level. The processor is used for executing the program stored in the memory and is specifically used for executing the following operations:
acquiring a biological feature vector to be identified, detected by a user;
inputting the biological feature vectors to be identified into a target index library established in advance according to a feature vector library to generate a candidate result set, wherein the candidate result set comprises a plurality of extracted candidate biological feature vectors with similarity;
and respectively determining the first N candidate biological feature vectors with the similarity greater than a preset reliable threshold as target biological feature vectors, wherein the target biological feature vectors are associated with user information.
The method performed by the target user identification apparatus 500 according to the embodiment shown in fig. 2 of the present application may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps and logic blocks disclosed in one or more embodiments of the present specification. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with one or more embodiments of the present disclosure may be embodied directly in hardware, in a software module executed by a hardware decoding processor, or in a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as random-access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, or registers. The storage medium is located in a memory, and the processor reads information in the memory and completes the steps of the above method in combination with its hardware.
Of course, besides the software implementation, the electronic device of the present application does not exclude other implementations, such as a logic device or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may also be hardware or a logic device.
One or more embodiments of the present specification also propose a computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a portable electronic device comprising a plurality of application programs, are capable of causing the portable electronic device to perform the method of the embodiment shown in fig. 2, and in particular to perform the following:
acquiring a biological feature vector to be identified, detected by a user;
inputting the biological feature vectors to be identified into a target index library established in advance according to a feature vector library to generate a candidate result set, wherein the candidate result set comprises a plurality of extracted candidate biological feature vectors with similarity;
and respectively determining the first N candidate biological feature vectors with the similarity greater than a preset reliable threshold as target biological feature vectors, wherein the target biological feature vectors are associated with user information.
In short, the above description is only a preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The embodiments in the present specification are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and for the relevant points reference may be made to the corresponding description of the method embodiment.

Claims (10)

1. A target user identification method comprises the following steps:
acquiring a biological feature vector to be identified of a target user;
in a target index library, respectively comparing the similarity of a biological characteristic vector to be identified with a found initial node and a neighbor node thereof based on a nearest neighbor search algorithm to generate a candidate result set, wherein the target index library is an index obtained by performing index construction operation on the biological characteristic vector in the biological characteristic library through the nearest neighbor search algorithm, and the candidate result set comprises candidate biological characteristic vectors carrying the similarity;
and determining an identification result according to the candidate biological feature vector carrying the similarity and a preset reliability threshold.
2. The method according to claim 1, wherein the biometric vector to be identified includes sub-feature vectors of multiple dimensions, the target index library includes sub-indices of multiple dimensions, the sub-feature vectors correspond to the sub-indices one to one, and performing similarity comparison on the biometric vector to be identified with the found start node and its neighbor nodes respectively based on a nearest neighbor search algorithm in the target index library to generate the candidate result set includes:
respectively comparing the similarity of the corresponding sub-feature vectors with the searched initial node and the neighbor nodes thereof in the sub-indexes of multiple dimensions, so that the sub-indexes of the multiple dimensions generate a first candidate result subset, wherein the first candidate result subset is a plurality of candidate biological feature vectors which are extracted for the same dimension and carry the similarity;
a set of candidate results is generated from the plurality of first subsets of candidate results.
3. The method of claim 2, wherein generating the candidate result set according to the plurality of first candidate result subsets and a preset similarity determination rule comprises:
in a plurality of first candidate result subsets, respectively determining the similarity of each candidate biological characteristic vector aiming at the similarities of the same candidate biological characteristic vector in different dimensions;
and constructing a candidate result set according to the similarity of each candidate biological feature vector.
4. The method of claim 2, wherein the sub-feature vectors of the plurality of dimensions comprise at least two of 2D image features, 3D point cloud image features, voiceprint features, and iris features.
5. The method according to claim 1, wherein the feature vector library is divided into a plurality of sub-feature vector libraries, the target index library includes a plurality of sub-indexes, the sub-feature vector libraries correspond to the sub-indexes one to one, and the generating the candidate result set includes:
respectively comparing the similarity of the corresponding sub-feature vectors with the searched starting node and the neighbor nodes thereof in the plurality of sub-indexes, so that a plurality of sub-indexes generate second candidate result subsets, wherein each second candidate result subset comprises a plurality of candidate biological feature vectors with similarity extracted from the same sub-feature vector library;
and merging and sorting the plurality of second candidate result subsets, and extracting a candidate result set.
6. The method according to claim 1, wherein the determining the recognition result according to the candidate biometric feature vector with the similarity and a preset reliability threshold comprises:
determining a target biological characteristic vector from the candidate biological characteristic vectors of which the similarity of the candidate result set is greater than a preset reliable threshold;
and determining the user associated with the target biological characteristic vector as the target user.
7. The method as claimed in claim 1, wherein the generating the candidate result set by respectively comparing the biometric feature vector to be identified with the found start node and its neighboring nodes in the target index database based on the nearest neighbor search algorithm comprises:
determining a candidate biological characteristic vector as an initial node in the target index library, and comparing the similarity of the biological characteristic vector to be identified with the initial node and the neighbor nodes thereof;
taking the node with the highest similarity as a new initial node, and circularly comparing the similarity of the biometric feature vector to be identified with the initial node and the neighbor nodes thereof;
when a node with higher similarity to the new starting node is found, performing operation of taking the node with higher similarity as the new starting node;
and when the node with higher similarity to the new initial node is not found, constructing a candidate result set according to the candidate biological characteristic vector serving as the initial node and the similarity of the candidate biological characteristic vector.
8. A target user identification device comprising:
the information acquisition unit is used for acquiring a biological feature vector to be identified of a target user;
the result generation unit is used for respectively comparing the similarity of the biological feature vectors to be identified with the searched starting node and the searched neighbor nodes in a target index library to generate a candidate result set, wherein the target index library is established in advance according to a feature vector library, and the candidate result set comprises a plurality of extracted candidate biological feature vectors with similarity;
and the information determining unit is used for determining an identification result according to the candidate biological feature vector carrying the similarity and a preset reliability threshold.
9. A storage medium having stored thereon a computer program which, when executed by a processor, implements:
acquiring a biological feature vector to be identified of a target user;
in a target index library, respectively comparing the similarity of a biological characteristic vector to be identified with a found initial node and a neighbor node thereof based on a nearest neighbor search algorithm to generate a candidate result set, wherein the target index library is an index obtained by performing index construction operation on the biological characteristic vector in the biological characteristic library through the nearest neighbor search algorithm, and the candidate result set comprises candidate biological characteristic vectors carrying the similarity;
and determining an identification result according to the candidate biological feature vector carrying the similarity and a preset reliability threshold.
10. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement:
acquiring a biological feature vector to be identified of a target user;
in a target index library, respectively comparing the similarity of a biological characteristic vector to be identified with a found initial node and a neighbor node thereof based on a nearest neighbor search algorithm to generate a candidate result set, wherein the target index library is an index obtained by performing index construction operation on the biological characteristic vector in the biological characteristic library through the nearest neighbor search algorithm, and the candidate result set comprises candidate biological characteristic vectors carrying the similarity;
and determining an identification result according to the candidate biological feature vector carrying the similarity and a preset reliability threshold.
CN202010147725.9A 2020-03-05 2020-03-05 Target user identification method and device, storage medium and electronic equipment Pending CN111008620A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010147725.9A CN111008620A (en) 2020-03-05 2020-03-05 Target user identification method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010147725.9A CN111008620A (en) 2020-03-05 2020-03-05 Target user identification method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN111008620A true CN111008620A (en) 2020-04-14

Family

ID=70121004

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010147725.9A Pending CN111008620A (en) 2020-03-05 2020-03-05 Target user identification method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111008620A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111461753A (en) * 2020-04-17 2020-07-28 支付宝(杭州)信息技术有限公司 Method and device for recalling knowledge points in intelligent customer service scene
CN111695419A (en) * 2020-04-30 2020-09-22 华为技术有限公司 Image data processing method and related device
CN112200133A (en) * 2020-10-28 2021-01-08 支付宝(杭州)信息技术有限公司 Privacy-protecting face recognition method and device
CN113780827A (en) * 2021-09-14 2021-12-10 北京沃东天骏信息技术有限公司 Article screening method and device, electronic equipment and computer readable medium
CN115733616A (en) * 2022-10-31 2023-03-03 支付宝(杭州)信息技术有限公司 Biological characteristic authentication method and system
CN115733617A (en) * 2022-10-31 2023-03-03 支付宝(杭州)信息技术有限公司 Biological characteristic authentication method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102646190A (en) * 2012-03-19 2012-08-22 腾讯科技(深圳)有限公司 Authentication method, device and system based on biological characteristics
CN102982165A (en) * 2012-12-10 2013-03-20 南京大学 Large-scale human face image searching method
CN104765768A (en) * 2015-03-09 2015-07-08 深圳云天励飞技术有限公司 Mass face database rapid and accurate retrieval method
CN105160295A (en) * 2015-07-14 2015-12-16 东北大学 Rapid high-efficiency face identification method for large-scale face database
CN110008256A (en) * 2019-04-09 2019-07-12 杭州电子科技大学 Approximate nearest neighbor search method based on hierarchical navigable small world graph

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111461753A (en) * 2020-04-17 2020-07-28 支付宝(杭州)信息技术有限公司 Method and device for recalling knowledge points in intelligent customer service scene
CN111461753B (en) * 2020-04-17 2022-05-17 支付宝(杭州)信息技术有限公司 Method and device for recalling knowledge points in intelligent customer service scene
CN111695419A (en) * 2020-04-30 2020-09-22 华为技术有限公司 Image data processing method and related device
CN111695419B (en) * 2020-04-30 2024-06-28 华为技术有限公司 Image data processing method and related device
CN112200133A (en) * 2020-10-28 2021-01-08 支付宝(杭州)信息技术有限公司 Privacy-protecting face recognition method and device
CN112200133B (en) * 2020-10-28 2022-05-17 支付宝(杭州)信息技术有限公司 Privacy-protecting face recognition method and device
CN113780827A (en) * 2021-09-14 2021-12-10 北京沃东天骏信息技术有限公司 Article screening method and device, electronic equipment and computer readable medium
CN115733616A (en) * 2022-10-31 2023-03-03 支付宝(杭州)信息技术有限公司 Biological characteristic authentication method and system
CN115733617A (en) * 2022-10-31 2023-03-03 支付宝(杭州)信息技术有限公司 Biological characteristic authentication method and system
CN115733617B (en) * 2022-10-31 2024-01-23 支付宝(杭州)信息技术有限公司 Biological feature authentication method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200414