CN111143627A - User identity data determination method, device, equipment and medium - Google Patents

User identity data determination method, device, equipment and medium


Publication number
CN111143627A
Authority
CN
China
Prior art keywords
identity information
node
nodes
virtual
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911383265.3A
Other languages
Chinese (zh)
Other versions
CN111143627B (en)
Inventor
张阳
熊云
杨双全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201911383265.3A
Publication of CN111143627A
Application granted
Publication of CN111143627B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/901: Indexing; Data structures therefor; Storage structures
    • G06F16/9024: Graphs; Linked lists

Abstract

The application discloses a method, a device, equipment and a medium for determining user identity data, and relates to big data technology in the technical field of computers. The method comprises the following steps: acquiring a user identity query request, wherein the user identity query request comprises initial identity information; and determining, based on a relationship graph, target identity information belonging to the same user as the initial identity information. The relationship graph comprises user identity information nodes, virtual nodes and edge relations among different nodes; the virtual nodes are determined according to spatial features. According to the embodiments of the application, flexibility is improved; the identity information in the whole graph does not need to be traversed, which improves calculation efficiency and saves resources; and the association relations between user identity information are generalized by introducing virtual nodes, which improves data recall.

Description

User identity data determination method, device, equipment and medium
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a big data technology, and particularly relates to a user identity data determination method, device, equipment and medium.
Background
Currently, user data collected by a service party contains user identity information of different dimensions, such as an identity card, a mobile phone number, a device code, and the like. However, the data associated with a single type of identity information is often limited, so different types of identity information of the same user need to be associated to provide greater data value for the service party.
In the related art, different types of identity information of the same user are associated in one of the following two ways. In the first way, connection relations among different types of identity information of the same user are custom-built according to the known identity information types, for example, a connection relation among name, identity card and mobile phone number. In the second way, the identity information of the whole graph is traversed by offline computation to obtain the connection relations between different types of identity information of the same user.
However, the first way has low flexibility: if an identity information type is newly added, the connection relations need to be rebuilt. The second way obtains the connection relations by traversing the identity information of the whole graph, so the calculation cost is high and resources are wasted.
Disclosure of Invention
The embodiments of the application disclose a method, a device, equipment and a medium for determining user identity data, which improve flexibility, avoid traversing the identity information in the whole graph and thereby improve calculation efficiency and save resources, and generalize the association relations between user identity information by introducing virtual nodes, thereby improving data recall.
In a first aspect, an embodiment of the present application discloses a method for determining user identity data, including:
acquiring a user identity query request, wherein the user identity query request comprises initial identity information;
determining target identity information belonging to the same user as the initial identity information based on a relation graph;
the relation graph comprises user identity information nodes, virtual nodes and edge relations among different nodes; the virtual nodes are determined according to spatial features.
According to the embodiments of the application, the target identity information belonging to the same user as the initial identity information is determined, according to the initial identity information included in the user identity query request, from a relationship graph comprising user identity information nodes, virtual nodes and edge relations between different nodes, wherein the virtual nodes are determined according to spatial features. Therefore, flexibility is improved, the identity information in the whole graph does not need to be traversed, calculation efficiency is improved, resources are saved, the association relations between user identity information are generalized by introducing the virtual nodes, and data recall is improved.
In addition, the method for determining user identity data according to the above embodiment of the present application may further have the following additional technical features:
optionally, in the relationship graph, a user identity information node is constructed for the user identity information;
a virtual node is constructed according to the spatial features;
an explicit edge relation among different user identity information nodes is constructed according to the co-occurrence relation;
and an implicit edge relation between the user identity information node and the virtual node is constructed according to the spatial features.
One embodiment in the above application has the following advantages or benefits: an explicit edge relation between different user identity information nodes is established according to the co-occurrence relation, and an implicit edge relation between the user identity information nodes and the virtual nodes is established according to the spatial features, so that the user identity information relations are generalized.
Optionally, constructing a virtual node according to the spatial features includes: determining an area every first length distance, and constructing a virtual node for the area;
correspondingly, constructing an implicit edge relation between the user identity information node and the virtual node according to the spatial features includes:
acquiring the user identity information that appears in the area associated with the virtual node, and constructing an implicit edge relation between the virtual node and the node of that user identity information.
Optionally, constructing a virtual node according to the spatial features includes: determining an area every first length distance, determining a time period every first duration, and constructing a virtual node for each time period in the area;
correspondingly, constructing an implicit edge relation between the user identity information node and the virtual node according to the spatial features includes:
acquiring the user identity information that appears in the area associated with the virtual node during the associated time period, and constructing an implicit edge relation between the virtual node and the node of that user identity information.
One embodiment in the above application has the following advantages or benefits: the implicit edge relation between the virtual node and the user identity information node is constructed according to spatial features alone, or according to spatial and temporal features together, which enriches the ability to associate user identity information.
Optionally, determining, based on the relationship graph, target identity information that belongs to the same user as the initial identity information includes:
in the relational graph, traversing by taking an identity information node associated with the initial identity information as an initial identity information node to obtain a target sub-graph;
determining the correlation degree between the starting identity information node and other identity information nodes in the target sub-graph;
and determining target identity information belonging to the same user as the initial identity information according to the correlation.
One embodiment in the above application has the following advantages or benefits: the target sub-graph corresponding to the initial identity information is acquired, and the target identity information belonging to the same user is determined according to the correlation between the starting identity information node and the other identity information nodes in the target sub-graph. Only a local part of the graph is traversed, traversal of the whole graph is avoided, calculation efficiency is improved, and computing resources are saved.
Optionally, in the relationship graph, traversing by using the identity information node associated with the initial identity information as an initial identity information node to obtain a target sub-graph, where the traversing includes:
traversing by taking the identity information node associated with the initial identity information as an initial identity information node in the relational graph, acquiring a target node associated with the initial identity information node, and adding the target node to the target sub-graph;
if the target node is a virtual node, determining that the target node is a target virtual node, and acquiring other virtual nodes associated with the target virtual node from the relational graph according to spatial characteristics;
and adding the other virtual nodes into the target sub-graph, and constructing the generalization edge relation between the other virtual nodes and the target virtual node.
Optionally, obtaining other virtual nodes associated with the target virtual node from the relationship graph according to the spatial features includes:
and taking, as the other virtual nodes associated with the target virtual node, the virtual nodes whose associated areas are within a second length of the target virtual node and whose associated time periods are within a second duration of the target virtual node.
One embodiment in the above application has the following advantages or benefits: when the target node is a virtual node, acquiring other virtual nodes, and constructing a generalized edge relationship between the other virtual nodes and the target virtual node, so as to increase the quantity of the acquired identity information and reduce the limitation of the acquired identity information.
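The traversal with virtual-node expansion described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes virtual nodes are represented as hypothetical `(x, y, t)` tuples (area position in meters, period start in seconds), identity nodes as strings, and a hop limit `max_hops` that the text does not specify. The names `nearby_virtual_nodes` and `build_subgraph` are invented for this example.

```python
from collections import deque


def nearby_virtual_nodes(target, all_virtual, second_length, second_duration):
    """Virtual nodes within the second length and second duration of `target`.

    Each virtual node is a hypothetical (x, y, t) tuple; this closeness rule
    is an assumption made for illustration.
    """
    tx, ty, tt = target
    return [
        v for v in all_virtual
        if v != target
        and abs(v[0] - tx) <= second_length
        and abs(v[1] - ty) <= second_length
        and abs(v[2] - tt) <= second_duration
    ]


def build_subgraph(adj, start, all_virtual, max_hops=3,
                   second_length=50, second_duration=7200):
    """Breadth-first traversal from `start`, returning (nodes, edges)."""
    nodes = {start}
    edges = set()
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:  # assumed stopping rule, not from the patent
            continue
        neighbors = list(adj.get(node, {}).items())
        if isinstance(node, tuple):  # tuples stand for virtual nodes here
            neighbors += [
                (v, "generalized")
                for v in nearby_virtual_nodes(
                    node, all_virtual, second_length, second_duration)
            ]
        for nxt, kind in neighbors:
            edges.add((node, nxt, kind))
            if nxt not in nodes:
                nodes.add(nxt)
                queue.append((nxt, hops + 1))
    return nodes, edges


# Two identities that never co-occur, but appear in neighboring areas.
adj = {
    "phone:A": {(0, 0, 0): "implicit"},
    (0, 0, 0): {"phone:A": "implicit"},
    (30, 0, 0): {"imei:B": "implicit"},
    "imei:B": {(30, 0, 0): "implicit"},
}
all_virtual = [(0, 0, 0), (30, 0, 0), (500, 0, 0)]
nodes, edges = build_subgraph(adj, "phone:A", all_virtual)
```

In this toy graph, `imei:B` is reached only through the generalized edge between the two nearby virtual nodes, while the distant virtual node at `(500, 0, 0)` is never added to the sub-graph.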
Optionally, determining the correlation between the starting identity information node and other identity information nodes in the target sub-graph includes:
and determining the correlation according to the number of edge relations and the types of the edge relations between the starting identity information node and the other identity information nodes in the target sub-graph.
One embodiment in the above application has the following advantages or benefits: the correlation is determined according to the number and the types of the edge relations, so that the target identity information determined to belong to the same user as the initial identity information is accurate and reliable.
Optionally, the method further includes: extracting candidate connected paths of the user identity information nodes according to the edge relations among different nodes in the relationship graph;
correspondingly, determining target identity information belonging to the same user as the initial identity information based on the relationship graph includes:
determining the starting identity information type to which the initial identity information belongs;
selecting a target connected path from the candidate connected paths according to the starting identity information type;
and extracting, based on the target connected path, target identity information belonging to the same user as the initial identity information from the relationship graph.
One embodiment in the above application has the following advantages or benefits: the target connected path is selected according to the starting identity information type, and the target identity information belonging to the same user as the initial identity information is extracted from the relationship graph along that path, which increases the diversity of ways in which target identity information can be obtained.
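As a rough illustration of path-based extraction, the sketch below models each candidate connected path as a sequence of identity information types and selects the paths that begin with the starting type. The path representation, the selection rule, and the names `CANDIDATE_PATHS`, `select_paths` and `follow_path` are all assumptions made for this example; the text does not specify them.

```python
# Hypothetical candidate connected paths, each a sequence of identity types.
CANDIDATE_PATHS = [
    ("phone", "id_card"),
    ("phone", "imei", "mac"),
    ("imei", "mac"),
]


def select_paths(start_type, candidates=CANDIDATE_PATHS):
    """Keep the candidate paths that begin with the starting identity type."""
    return [path for path in candidates if path[0] == start_type]


def follow_path(adj, start, path):
    """Walk `path` type by type from `start`; return the nodes at its end.

    `adj` maps a (type, value) node to the set of its neighboring nodes.
    """
    frontier = {start}
    for node_type in path[1:]:
        frontier = {
            nxt
            for cur in frontier
            for nxt in adj.get(cur, ())
            if nxt[0] == node_type
        }
    return frontier


adj = {
    ("phone", "138"): {("imei", "I1"), ("id_card", "X")},
    ("imei", "I1"): {("mac", "M1")},
}
start = ("phone", "138")
targets = {path: follow_path(adj, start, path) for path in select_paths(start[0])}
```

Here a phone-number query yields an identity card via one path and a MAC address via another, without touching paths that start from other identity types.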
In a second aspect, an embodiment of the present application further discloses a device for determining user identity data, including:
an acquisition module, configured to acquire a user identity query request, wherein the user identity query request comprises initial identity information;
a determining module, configured to determine, based on a relationship graph, target identity information belonging to the same user as the initial identity information;
the relation graph comprises user identity information nodes, virtual nodes and edge relations among different nodes; the virtual nodes are determined according to spatial features.
In a third aspect, an embodiment of the present application further discloses an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of determining user identity data as described in any of the embodiments of the present application.
In a fourth aspect, embodiments of the present application further disclose a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the method for determining user identity data according to any of the embodiments of the present application.
Other effects of the above-described alternative will be described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a schematic flowchart of constructing a relationship graph disclosed in the embodiments of the present application;
FIG. 2 is a schematic flowchart of a method for determining user identity data according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of traversing the relationship graph to obtain a target sub-graph according to an embodiment of the present application;
FIG. 4 is another schematic flowchart of traversing the relationship graph to obtain a target sub-graph according to an embodiment of the present application;
FIGS. 5(a)-5(b) are schematic diagrams of the target sub-graph obtained by traversing the relationship graph disclosed in the embodiments of the present application;
FIG. 6 is a schematic flowchart of another method for determining user identity data disclosed herein;
FIG. 7 is a schematic structural diagram of a user identity data determining apparatus disclosed in an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device disclosed in an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The embodiment of the application provides a method for determining user identity data, aiming at the problems that in the related art, when different types of identity information of the same user are associated, the flexibility is low, the identity information in a whole graph needs to be traversed, the calculation cost is high, and resources are wasted.
According to the embodiments of the application, the target identity information belonging to the same user as the initial identity information is determined, according to the initial identity information included in the user identity query request, from a relationship graph comprising user identity information nodes, virtual nodes and edge relations among different nodes. Therefore, flexibility is improved, the identity information in the whole graph does not need to be traversed, calculation efficiency is improved, resources are saved, the association relations between user identity information are generalized by introducing the virtual nodes, and data recall is improved.
In order to clearly illustrate the process of determining the target identity information of the same user as the initial identity information based on the relationship graph in the user identity data determining method disclosed in the embodiment of the present application, first, a relationship graph constructed in the embodiment of the present application is described below.
Fig. 1 is a schematic flowchart of constructing a relationship graph disclosed in an embodiment of the present application. The construction may be performed by a user identity data determining apparatus, which may be implemented in software and/or hardware and may be integrated in an electronic device. The method comprises the following steps:
S101, constructing a user identity information node for the user identity information in the relationship graph.
In this embodiment, the user identity information includes real user identity information and the identity information of the device used to acquire the user identity information (acquisition device identity information). The acquisition device may be, but is not limited to, a base station, a camera, a voiceprint acquisition device, a fingerprint acquisition device, and the like.
For example, the real user identity information includes an identity card, a mobile phone number, biometric features, and the like, where the biometric features may be face images, irises, fingerprints, voices, and the like. The acquisition device identity information includes the International Mobile Equipment Identity (IMEI), the Media Access Control (MAC) address, and the like.
Before S101 is executed, the obtained user data is first converted into structured form, and the structured data is imported into the identity wide table. Then, different types of user identity information, and the co-occurrence relations between different user identity information, are extracted from the user data in the identity wide table through a relation converter. In the embodiments of the present application, the identity wide table refers to a table including different identity information fields. A co-occurrence relation means that one piece of user data includes at least two pieces of user identity information.
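The co-occurrence extraction described above can be sketched as follows. The field names in `IDENTITY_FIELDS` and the function `cooccurrence_edges` are hypothetical, since the actual columns of the identity wide table are not reproduced in this text; the sketch only shows that one record holding at least two identity values yields pairwise co-occurrence relations.

```python
from itertools import combinations

# Hypothetical identity fields of one row of the identity wide table.
IDENTITY_FIELDS = ("id_card", "phone", "imei", "mac")


def cooccurrence_edges(record):
    """Return the co-occurring pairs of (type, value) nodes in one record."""
    nodes = [(field, record[field]) for field in IDENTITY_FIELDS
             if record.get(field)]
    # Fewer than two identities in a record means no co-occurrence relation.
    return list(combinations(nodes, 2))


row = {"id_card": "ID-001", "phone": "13800000000", "imei": None, "mac": ""}
edges = cooccurrence_edges(row)
```

Empty or missing fields are skipped, so a row with a single identity value produces no pairs.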
Further, user identity information nodes are constructed in the relationship graph according to the extracted different types of user identity information. The user identity information nodes include real user identity information nodes and acquisition device identity information nodes. The relationship graph in this embodiment may be an initial relationship graph.
It should be noted that, in this embodiment, the structure of the identity wide table may be as shown in Table 1 below:
TABLE 1
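[Table 1 is provided as an image in the original publication (Figure BDA0002342801740000071); its contents are not reproduced here.]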
Furthermore, the identity wide table may be defined in any storage database, such as ES/HBASE/MYSQL, etc.
S102, constructing a virtual node according to the spatial features.
A virtual node is a node capable of connecting different types of user identity information.
Illustratively, a virtual node may be constructed in either of the following modes:
In the first mode, an area is determined every first length distance, and a virtual node is constructed for the area.
The first length distance may be set according to actual needs and is not specifically limited herein. For example, the first length distance is set to 10 meters (m), 20 m, or the like.
In this embodiment, after an area is determined every first length distance, it is determined whether a user identity information node (for example, an acquisition device) exists in the area; if so, a virtual node is constructed for the area, which lays a foundation for subsequently constructing the implicit edge relation between the virtual node and the user identity information nodes.
In the second mode, an area is determined every first length distance, a time period is determined every first duration, and a virtual node is constructed for each time period in the area.
The first duration may be set according to actual needs and is not specifically limited herein. For example, the first duration is set to 2 hours (h), one week, or the like.
That is, on the basis of the first mode, a first duration is added and time periods are determined according to it, so that a virtual node is constructed for each time period in the area, making the virtual nodes constructed in the area more complete and comprehensive.
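The two modes above can be illustrated by mapping an observation to the virtual node it falls into. This is a sketch under stated assumptions: positions are planar coordinates in meters, areas are square grid cells whose side is the first length distance, and time periods are fixed-width buckets of the first duration. The function name `virtual_node_key` and its parameters are hypothetical.

```python
import math


def virtual_node_key(x_m, y_m, t_s=None, first_length_m=10.0,
                     first_duration_s=7200):
    """Map an observation to the key of the virtual node covering it.

    First mode (t_s is None): the key is the grid cell only.
    Second mode: the key is the grid cell plus the time-period index.
    """
    cell = (math.floor(x_m / first_length_m), math.floor(y_m / first_length_m))
    if t_s is None:
        return cell
    return cell + (int(t_s // first_duration_s),)


area_only = virtual_node_key(12.5, 3.0)          # first mode
area_and_period = virtual_node_key(12.5, 3.0, 7500)  # second mode
```

Two observations share a virtual node exactly when their keys are equal, which is what allows an implicit edge to be attached from each observed identity to that node.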
S103, constructing an explicit edge relation among different user identity information nodes according to the co-occurrence relation.
S104, constructing an implicit edge relation between the user identity information node and the virtual node according to the spatial features.
Illustratively, because there are two modes of constructing the virtual node in S102, there are two corresponding modes of constructing the implicit edge relation between the user identity information node and the virtual node, as follows:
In the first mode, the user identity information that appears in the area associated with the virtual node is acquired, and an implicit edge relation is constructed between the virtual node and the node of that user identity information.
In the second mode, the user identity information that appears in the area associated with the virtual node during the associated time period is acquired, and an implicit edge relation is constructed between the virtual node and the node of that user identity information.
The associated time period may be any time period determined when constructing the virtual nodes. For example, if four time periods, 0:00-6:00, 6:00-12:00, 12:00-18:00 and 18:00-24:00, are determined when constructing the virtual nodes, any of them may be used as the associated time period according to service requirements, and different virtual nodes may have different associated time periods. The associated area is larger than the area determined when constructing the virtual nodes. For example, if a virtual node is constructed every 10 m, the associated area of a virtual node may be selected every 50 m. Because the associated area is larger than the area of a single virtual node, at least two virtual nodes exist within it, so a generalized edge relation can be established between different virtual nodes belonging to the same associated area.
After the explicit edge relations between different user identity information nodes and the implicit edge relations between the user identity information nodes and the virtual nodes are constructed, the relationship graph is obtained.
For example, suppose the user identity information nodes are Wang Er and Wang Er's mobile phone number, the first length distance is 10 m, and base station 1 appears in area 1 determined according to that distance. A virtual node is then constructed for area 1, an explicit edge relation is constructed between Wang Er and the mobile phone number, and implicit edge relations are constructed between the mobile phone number and the virtual node and between Wang Er and the virtual node, which yields the relationship graph.
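The example above can be written out as a small adjacency structure. This is only a sketch: the class `RelationshipGraph`, the tuple representation of nodes, and the phone number are invented for illustration and are not taken from the patent.

```python
from collections import defaultdict


class RelationshipGraph:
    """Undirected graph whose edges carry a relation type."""

    def __init__(self):
        self.adj = defaultdict(dict)  # node -> {neighbor: edge type}

    def add_edge(self, a, b, kind):
        self.adj[a][b] = kind
        self.adj[b][a] = kind


g = RelationshipGraph()
person = ("name", "Wang Er")
phone = ("phone", "13800000000")  # made-up number
vnode = ("virtual", "area-1")     # virtual node for area 1

g.add_edge(person, phone, "explicit")  # co-occurrence in one record
g.add_edge(phone, vnode, "implicit")   # phone observed in area 1
g.add_edge(person, vnode, "implicit")  # person observed in area 1
```

Storing the edge type on each edge keeps explicit and implicit relations distinguishable later, when the correlation between nodes is evaluated.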
Furthermore, in the process of constructing the relationship graph, the embodiment of the application can also count the number of edges of the user identity information node, and aggregate or split the user identity information in time and space according to the counted number of edges to obtain the aggregated user identity information node or the split user identity information node. In addition, in order to distinguish the aggregated user identity information from the split user identity information, attribute information such as types, edge numbers and the like can be added to the aggregated user identity information node and the split user identity information node. Wherein the types include: primitive, aggregate, and split.
In addition, the relationship graph and the data information in it can be stored in the association engine, which lays a foundation for subsequently querying the target identity information of any user.
It can be understood that, by constructing the explicit edge relations between different user identity information nodes and the implicit edge relations between the user identity information nodes and the virtual nodes, the embodiments of the application not only obtain the association relations between different user identity information, but also obtain, through the virtual nodes, other identity information that has an implicit relation with the user identity information. As much identity information related to the same user as possible is thereby associated, which provides favorable conditions for subsequently querying the user identity information.
According to the user identity data determining method disclosed by the embodiments of the application, user identity information nodes are constructed in the initial relationship graph, virtual nodes are constructed according to the spatial features, explicit edge relations between different user identity information nodes are then constructed according to the co-occurrence relation, and implicit edge relations between the user identity information nodes and the virtual nodes are constructed according to the spatial features to obtain the relationship graph. This provides the conditions for subsequently determining the target identity information of the same user based on the constructed relationship graph.
As introduced above, the relationship graph is obtained by constructing, in the initial relationship graph, the explicit edge relations between different identity information nodes and the implicit edge relations between the user identity information nodes and the virtual nodes, and it lays a foundation for subsequently determining the target identity information. Based on the relationship graph constructed in the above embodiment, the method for determining user identity data disclosed in the embodiments of the present application is described below.
As shown in fig. 2, the method may include:
S201, obtaining a user identity query request, wherein the user identity query request comprises initial identity information.
S202, determining target identity information belonging to the same user as the initial identity information based on the relation graph.
The relation graph comprises user identity information nodes, virtual nodes and edge relations among different nodes; the virtual nodes are determined according to spatial features.
Illustratively, the initial identity information is obtained by analyzing the obtained user identity query request. Then, target identity information belonging to the same user as the starting identity information is determined from the relational graph based on the starting identity information.
It should be noted that, the user identity query request may include target identity information and/or a target time interval of the query, in addition to the initial identity information, which is not limited herein. When the user identity query request is determined to only include the initial identity information, all the queried identity information which belongs to the same user as the initial identity information is taken as the target identity information according to a default mode.
To avoid invalid query operations, after the initial identity information is obtained, this embodiment may also perform a validity check on it to determine whether the stored user identity information data includes the initial identity information. If so, the initial identity information is valid and the query operation is executed; if not, the initial identity information is invalid and a query failure message is returned.
If the user identity query request also includes a query time interval, it is determined whether the time interval is legal. For example, if the association engine stores data from before December 2019 and the query time interval is February 2020, the query time interval is determined to be illegal.
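The two checks above can be sketched as a single validation step. The function `validate_query`, its parameters, and the returned messages are hypothetical; the cutoff date of 30 November 2019 is chosen only to mirror the example of data stored from before December 2019.

```python
from datetime import date


def validate_query(start_identity, stored_identities, stored_until,
                   query_interval=None):
    """Return (is_valid, message) for a user identity query request."""
    if start_identity not in stored_identities:
        return False, "starting identity information not found"
    if query_interval is not None:
        begin, end = query_interval
        # The interval must be well ordered and covered by the stored data.
        if begin > end or end > stored_until:
            return False, "query time interval is not legal"
    return True, "ok"


stored = {"13800000000"}
cutoff = date(2019, 11, 30)  # data stored from before December 2019
feb_2020 = (date(2020, 2, 1), date(2020, 2, 29))

result = validate_query("13800000000", stored, cutoff, feb_2020)
```

Checking the identity first means an unknown identity is rejected without examining the time interval at all.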
Further, when the initial identity information is determined to be legal, determining the target identity information belonging to the same user as the initial identity information based on the relationship graph includes: traversing the relationship graph with the identity information node associated with the initial identity information as the starting identity information node to obtain a target sub-graph; determining the correlation between the starting identity information node and the other identity information nodes in the target sub-graph; and determining the target identity information belonging to the same user as the initial identity information according to the correlation.
It should be noted that, traversing is performed in the relationship graph to obtain the target sub-graph, which will be described in the following embodiments, and redundant description thereof is not repeated here.
Determining the correlation degree between the starting identity information node and the other identity information nodes in the target sub-graph includes: determining the correlation degree according to the number of edge relations and the edge relation type between the starting identity information node and the other identity information nodes in the target sub-graph.
In the embodiment of the present application, the fewer the edge relations between the starting identity information node and another identity information node in the target sub-graph, and the edge relations being explicit edge relations, the higher the correlation degree; otherwise, the lower the correlation degree.
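One way to turn this rule into a number is sketched below. The source only states the direction of the relationship (fewer edges and explicit edges give a higher correlation degree); the multiplicative scoring, the weight values in `EDGE_WEIGHT`, the edge-type labels, and the `threshold` are assumptions made for illustration.

```python
import math

# Hypothetical per-edge weights: explicit edges are trusted most.
EDGE_WEIGHT = {"explicit": 0.9, "implicit": 0.5, "generalized": 0.4}


def path_correlation(edge_types):
    """Score one path from the starting node, given its edge types in order.

    Every extra edge multiplies the score by a weight below 1, so longer
    paths and paths through non-explicit edges score lower.
    """
    return math.prod(EDGE_WEIGHT[t] for t in edge_types)


def rank_candidates(paths_by_node, threshold=0.2):
    """Keep nodes whose best path scores at least `threshold`, best first."""
    scores = {
        node: max(path_correlation(p) for p in paths)
        for node, paths in paths_by_node.items()
    }
    kept = [(node, s) for node, s in scores.items() if s >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

Under these assumed weights, a node reached over one explicit edge outranks a node reached over two explicit edges, which in turn outranks a node reached over an implicit edge followed by a generalized edge.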
According to the method for determining user identity data, the target identity information belonging to the same user as the initial identity information is determined, according to the initial identity information included in the user identity query request, from a relationship graph that includes user identity information nodes, virtual nodes, and edge relations between different nodes, wherein the virtual nodes are determined according to spatial features. Therefore, flexibility is improved, the identity information in the whole graph does not need to be traversed, calculation efficiency is improved, resources are saved, and, by introducing the virtual nodes, the association relation between user identity information is generalized and the data recall rate is improved.
Next, with reference to fig. 3 and fig. 4, a process of traversing, in the relationship graph in the embodiment of the present application, an identity information node associated with the initial identity information as an initial identity information node to obtain a target sub-graph is described.
First, as shown in fig. 3, obtaining the target sub-graph according to the embodiment of the present application includes the following steps:
S301, in the relationship graph, traversing by taking the identity information node associated with the initial identity information as the starting identity information node, acquiring a target node associated with the starting identity information node, and adding the target node to the target sub-graph.
S302, if the target node is a virtual node, determining that the target node is the target virtual node, and acquiring other virtual nodes related to the target virtual node from the relational graph according to the spatial characteristics.
Acquiring other virtual nodes associated with the target virtual node from the relationship graph according to the spatial features includes: taking, as the other virtual nodes associated with the target virtual node, the virtual nodes whose association area is within a second length of the target virtual node and whose association period is within a second duration of the target virtual node.
For example, if the target virtual node is F, and virtual node E is the virtual node whose association area is within 20 m and whose association period is within 2 h of the target virtual node F, the virtual node E is taken as another virtual node.
S303, adding the other virtual nodes into the target sub-graph, and constructing a generalization edge relation between the other virtual nodes and the target virtual node.
In the embodiment of the present invention, the generalization edge relation refers to a connecting edge relation between different virtual nodes, that is, an edge relation between the other virtual nodes and the target virtual node. Through the generalization edge relations between different virtual nodes and the implicit edge relations between user identity information nodes and virtual nodes, user identity information nodes associated with different virtual nodes can be linked to determine whether they belong to the same user, which can improve the recall rate.
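Steps S302 and S303 can be sketched as below. The sketch assumes that each virtual node carries the centre of its area in planar metre coordinates and the start of its period in hours; the `VirtualNode` type, the function names, the inclusive comparisons, and the default limits (20 m and 2 h, taken from the example above) are illustrative choices, not details given in the application.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualNode:
    name: str
    x: float  # centre of the associated area, metres (assumed planar coordinates)
    y: float
    t: float  # start of the associated period, hours


def related_virtual_nodes(target, all_virtual, second_length=20.0, second_duration=2.0):
    """Virtual nodes close to `target` in both space and time (S302)."""
    related = []
    for v in all_virtual:
        if v == target:
            continue
        near = math.hypot(v.x - target.x, v.y - target.y) <= second_length
        close_in_time = abs(v.t - target.t) <= second_duration
        if near and close_in_time:
            related.append(v)
    return related


def add_generalized_edges(subgraph_edges, target, all_virtual, **limits):
    """Add the related virtual nodes to the sub-graph edge set (S303)."""
    for v in related_virtual_nodes(target, all_virtual, **limits):
        subgraph_edges.add((target.name, v.name, "generalized"))
    return subgraph_edges
```

For a target node F, a node E that is 15 m away and one hour later is linked to F, while a node that is near but ten hours later, or simultaneous but 100 m away, is not.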
To prevent the obtained target sub-graph from becoming too large and too complex, the embodiment of the application may also set an expansion degree for the starting identity information node. The expansion degree can be set according to actual needs. It may be carried in the user identity query request; when the user identity query request does not carry an expansion degree, the query is performed according to a default expansion degree, which is not specifically limited herein.
For example, if the expansion degree is 4 and the initial identity information is Wang Er's face information, the relationship graph is traversed with the Wang Er face information node associated with that face information as the starting point. The first-degree expansion node obtained is camera Y1, the second-degree expansion node is virtual node 1, the third-degree expansion node is virtual node 2, and the fourth-degree expansion node is acquisition device W, so the expansion node set is {camera Y1, virtual node 1, virtual node 2, acquisition device W}. The relationship graph constructed from the nodes in the expansion node set and the starting identity information node is the target sub-graph.
Further, in the embodiment of the present application, the target sub-graph may be obtained in another manner, as specifically shown in fig. 4.
S401, in the relation graph, traversing by taking the identity information node associated with the initial identity information as an initial identity information node, and acquiring a target node associated with the initial identity information node.
S402, if the target node is a user identity information node, adding the user identity information node into the target sub-graph.
For example, as shown in fig. 5(a), if the expansion degree is 4 and the initial identity information is A, the relationship graph is traversed with node A, which is associated with A, as the starting point. The first-degree expansion node is B; the second-degree expansion nodes are B1, B2 and C; the third-degree expansion nodes are B11, B12, C1, C2 and D; and the fourth-degree expansion nodes are C11, C12, D1 and D2. The expansion node set is therefore {B, B1, B2, C, B11, B12, C1, C2, D, C11, C12, D1, D2}, and the relationship graph constructed from the nodes in the expansion node set and the starting identity information node is the target sub-graph, as specifically shown in fig. 5(b).
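An expansion limited by degree behaves like a breadth-first search that stops after a fixed number of hops. The sketch below is one possible reading, not the traversal used in the application. Because fig. 5 is not reproduced here, the adjacency in `FIG5_GUESS` is a guess chosen only so that the per-degree node sets match the example above; `expand` is likewise a hypothetical helper name.

```python
from collections import deque


def expand(graph, start, degree):
    """Breadth-first expansion from `start`, at most `degree` hops deep.

    Returns {hop: [nodes first reached at that hop]}.
    """
    levels = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hop = queue.popleft()
        if hop == degree:
            continue  # do not expand beyond the requested degree
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                levels.setdefault(hop + 1, []).append(nxt)
                queue.append((nxt, hop + 1))
    return levels


# Assumed adjacency, reconstructed from the example text rather than from fig. 5.
FIG5_GUESS = {
    "A": ["B"],
    "B": ["B1", "B2", "C"],
    "B1": ["B11", "B12"],
    "C": ["C1", "C2", "D"],
    "C1": ["C11", "C12"],
    "D": ["D1", "D2"],
}
```

Calling `expand(FIG5_GUESS, "A", 4)` groups the thirteen expansion nodes by degree, and lowering the degree to 2 keeps only B, B1, B2 and C, which shows how the expansion degree bounds the size of the target sub-graph.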
Fig. 6 is a flow chart of another method for determining user identity data disclosed in the present application. This embodiment is another implementation of the user identity data determination method and is described below with reference to fig. 6.
As shown in fig. 6, the method may include:
S601, obtaining a user identity query request, wherein the user identity query request comprises initial identity information.
S602, extracting candidate communication paths of the user identity information nodes according to the edge relations among different nodes in the relation graph.
In the embodiment of the present invention, the different nodes include any node type in the relationship graph, that is, the user identity information node and the virtual node. Accordingly, the edge relationships include explicit edge relationships and implicit edge relationships.
For example, candidate communication paths of all user identity information nodes can be extracted from the communication path library.
S603, determining the initial identity information type to which the initial identity information belongs.
S604, according to the type of the initial identity information, selecting a target communication path from the candidate communication paths.
S605, based on the target communication path, extracting target identity information belonging to the same user as the initial identity information from the relation map.
Wherein the starting identity information types include: identity card, cell phone number, biometric features, and the like.
For example, if the initial identity information is that of Wang Er, the communication paths R1 and R2 that have a communication relationship with Wang Er are selected as the target communication paths; when the identity card in communication path R2 belongs to Wang Er, that identity card is determined to be target identity information of Wang Er.
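Steps S602 to S605 can be sketched under one possible interpretation, namely that a communication path is a sequence of node types that begins with the starting identity information type. The source does not define the path format, so this representation, the function names, the type labels, and the toy graph in the usage note are all assumptions.

```python
def select_target_paths(candidate_paths, start_type):
    """S604: keep the candidate paths that begin with the starting identity type."""
    return [p for p in candidate_paths if p[0] == start_type]


def follow_path(graph, node_type, start_node, path):
    """S605: walk the graph along one type path and return the nodes at its end.

    `graph` maps a node to its neighbours; `node_type` maps a node to its type.
    """
    frontier = {start_node}
    for wanted in path[1:]:
        frontier = {
            n
            for cur in frontier
            for n in graph.get(cur, ())
            if node_type[n] == wanted
        }
    return frontier
```

As a usage example, take a toy graph in which a face node connects to a virtual node, the virtual node connects to a phone node, and the phone node connects to an identity card node. Selecting the paths that start with the type `face` and following the path `("face", "virtual", "phone", "id_card")` from the face node reaches the identity card node without visiting unrelated parts of the graph.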
According to the user identity data determination method disclosed in the embodiment of the application, candidate communication paths of the user identity information nodes are extracted, a target communication path is selected from the candidate communication paths according to the determined initial identity information type, and target identity information belonging to the same user as the initial identity information is extracted from the relationship graph based on the target communication path. Flexibility is improved, the identity information in the whole graph does not need to be traversed, calculation efficiency is improved, resources are saved, and, by introducing the virtual nodes, the association relation between user identity information is generalized, so that the data recall rate is improved.
Fig. 7 is a schematic structural diagram of a user identity data determination apparatus according to an embodiment of the present application. The user identity data determination device can be implemented in software and/or hardware and can be integrated on the electronic equipment.
As shown in fig. 7, a user identity data determining apparatus 700 disclosed in this embodiment includes an obtaining module 710 and a determining module 720, where:
an obtaining module 710, configured to obtain a user identity query request, where the user identity query request includes initial identity information;
a determining module 720, configured to determine, based on the relationship graph, target identity information that belongs to the same user as the starting identity information;
the relation graph comprises user identity information nodes, virtual nodes and edge relations among different nodes; the virtual nodes are determined according to spatial features.
As an optional implementation form of the present application, the apparatus further includes: a relational graph construction module;
the relationship graph construction module is configured to: construct, in the relationship graph, user identity information nodes for user identity information; construct virtual nodes according to spatial features; construct explicit edge relations among different user identity information nodes according to co-occurrence relations; and construct, according to the spatial features, implicit edge relations between the user identity information nodes and the virtual nodes.
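The construction of identity nodes and explicit edges from co-occurrence can be sketched as follows. The sketch assumes that a co-occurrence record is simply a set of identity items observed together (for example, a phone number and an identity card number on the same registration record); the record format and the function name are hypothetical.

```python
from itertools import combinations


def build_explicit_edges(records):
    """Create identity nodes and explicit edges from co-occurrence records.

    Each record is a set of identity items seen together; every pair of
    items in a record is linked by one explicit edge.
    """
    nodes, edges = set(), set()
    for record in records:
        nodes.update(record)
        # Sorting makes each undirected edge appear in one canonical order.
        for a, b in combinations(sorted(record), 2):
            edges.add((a, b, "explicit"))
    return nodes, edges
```

Two records that share a phone number yield three nodes and two explicit edges, both of which pass through the shared phone number node.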
As an optional implementation form of the present application, the relationship graph building module is further configured to determine an area every first length distance, and build a virtual node for the area;
the relationship graph building module is further configured to obtain user identity information appearing in the association area of the virtual node, and build an implicit edge relation between the virtual node and the node of the user identity information that appears.
As an optional implementation form of the present application, the relationship graph building module is further configured to determine an area every first length distance, determine a period every first duration, and build a virtual node for each period in the area;
the relationship graph building module is further configured to obtain user identity information appearing in the association area of the virtual node during the association period, and build an implicit edge relation between the virtual node and the node of the user identity information that appears.
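One simple way to realise "an area every first length distance" and "a period every first duration" is a regular grid over space and time, where each grid cell and time bucket pair is a virtual node. The grid interpretation, the default sizes (100 m and 1 h), the observation tuple format, and the edge label "implicit" are assumptions made for this sketch.

```python
import math


def virtual_node_key(x, y, t, first_length=100.0, first_duration=1.0):
    """Map an observation to the (cell_x, cell_y, time_bucket) virtual node it falls in.

    x and y are planar coordinates in metres; t is a time in hours.
    """
    return (
        math.floor(x / first_length),
        math.floor(y / first_length),
        math.floor(t / first_duration),
    )


def build_implicit_edges(observations, **sizes):
    """Link each observed identity to the virtual node of its area and period."""
    edges = set()
    for identity, x, y, t in observations:
        edges.add((identity, virtual_node_key(x, y, t, **sizes), "implicit"))
    return edges
```

Two identities observed in the same cell during the same hour end up attached to the same virtual node, which is what later allows them to be associated even though no explicit edge connects them.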
As an optional implementation form of the present application, the determining module 720 is specifically configured to:
in the relational graph, traversing by taking an identity information node associated with the initial identity information as an initial identity information node to obtain a target sub-graph;
determining the correlation degree between the starting identity information node and other identity information nodes in the target sub-graph;
and determining target identity information belonging to the same user as the initial identity information according to the correlation.
As an optional implementation form of the present application, the determining module 720 further includes: a sub-graph construction unit;
the sub-graph construction unit is configured to traverse, in the relationship graph, with the identity information node associated with the initial identity information as the starting identity information node, acquire a target node associated with the starting identity information node, and add the target node to the target sub-graph;
if the target node is a virtual node, determine that the target node is a target virtual node, and acquire other virtual nodes associated with the target virtual node from the relationship graph according to spatial features;
and add the other virtual nodes to the target sub-graph, and construct the generalization edge relation between the other virtual nodes and the target virtual node.
As an optional implementation form of the present application, the sub-graph construction unit is further configured to:
take, as the other virtual nodes associated with the target virtual node, the virtual nodes whose association area is within a second length of the target virtual node and whose association period is within a second duration of the target virtual node.
As an alternative implementation form of the present application, the determining module 720 is further configured to:
determine the correlation degree according to the number of edge relations and the edge relation type between the starting identity information node and the other identity information nodes in the target sub-graph.
As an optional implementation form of the present application, the apparatus further includes: a path extraction module;
the path extraction module is used for extracting candidate communication paths of the user identity information nodes according to the edge relations among different nodes in the relation graph;
accordingly, the determining module 720 is further configured to:
determining a starting identity information type to which the starting identity information belongs;
selecting a target communication path from the candidate communication paths according to the initial identity information type;
and extracting target identity information belonging to the same user as the initial identity information from the relation map based on the target communication path.
It should be noted that the foregoing explanation of the embodiment of the user identity data determining method is also applicable to the user identity data determining apparatus of the embodiment, and the implementation principle is similar, and is not described herein again.
The user identity data determining apparatus disclosed in this embodiment determines, according to the initial identity information included in the user identity query request, target identity information belonging to the same user as the initial identity information from a relationship graph that includes user identity information nodes, virtual nodes, and edge relations between different nodes, where the virtual nodes are determined according to spatial features. Therefore, flexibility is improved, the identity information in the whole graph does not need to be traversed, calculation efficiency is improved, resources are saved, and, by introducing the virtual nodes, the association relation between user identity information is generalized and the data recall rate is improved.
According to an embodiment of the application, the application also discloses an electronic device and a readable storage medium.
Fig. 8 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 8, the electronic device includes: one or more processors 801, memory 802, and interfaces for connecting the various components, including a high speed interface and a low speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing some of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). Fig. 8 takes one processor 801 as an example.
The memory 802 is the non-transitory computer readable storage medium provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the user identity data determination method disclosed herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the user identity data determination method disclosed herein.
The memory 802, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules (e.g., the obtaining module 710 and the determining module 720 shown in fig. 7) corresponding to the user identity data determination method in the embodiments of the present application. The processor 801 executes various functional applications of the server and data processing by running non-transitory software programs, instructions, and modules stored in the memory 802, that is, implements the user identity data determination method in the above-described method embodiment.
The memory 802 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the electronic device of the user identification data determination method, and the like. Further, the memory 802 may include high speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 802 may optionally include memory located remotely from the processor 801, which may be connected to the electronic device of the user identity data determination method via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the user identity data determination method may further include: an input device 803 and an output device 804. The processor 801, the memory 802, the input device 803, and the output device 804 may be connected by a bus or other means, and are exemplified by a bus in fig. 8.
The input device 803, such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, or a joystick, may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device of the user identity data determination method. The output devices 804 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiment of the application, the target identity information belonging to the same user as the initial identity information is determined, according to the initial identity information included in the user identity query request, from a relationship graph that includes user identity information nodes, virtual nodes, and edge relations between different nodes, wherein the virtual nodes are determined according to spatial features. Therefore, flexibility is improved, the identity information in the whole graph does not need to be traversed, calculation efficiency is improved, resources are saved, and, by introducing the virtual nodes, the association relation between user identity information is generalized and the data recall rate is improved.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (12)

1. A method for determining user identity data, comprising:
acquiring a user identity query request, wherein the user identity query request comprises initial identity information;
determining target identity information belonging to the same user as the initial identity information based on a relation graph;
the relation graph comprises user identity information nodes, virtual nodes and edge relations among different nodes; the virtual nodes are determined according to spatial features.
2. The method of claim 1, wherein the relationship map is constructed by:
in the relation graph, a user identity information node is constructed for user identity information;
constructing a virtual node according to the spatial characteristics;
constructing an explicit edge relation among different user identity information nodes according to the co-occurrence relation;
and according to the spatial features, constructing an implicit edge relation between the user identity information node and the virtual node.
3. The method of claim 2, wherein constructing virtual nodes from spatial features comprises: determining an area every first length distance, and constructing a virtual node for the area;
correspondingly, constructing an implicit edge relation between the user identity information node and the virtual node according to the spatial features comprises:
obtaining user identity information appearing in the association area of the virtual node, and establishing an implicit edge relation between the virtual node and the node of the user identity information that appears.
4. The method of claim 2, wherein constructing virtual nodes from spatial features comprises: determining an area every first length distance, determining a period every first duration, and constructing a virtual node for each period in the area;
correspondingly, constructing an implicit edge relation between the user identity information node and the virtual node according to the spatial features comprises:
obtaining user identity information appearing in the association area of the virtual node during the association period, and establishing an implicit edge relation between the virtual node and the node of the user identity information that appears.
5. The method according to any one of claims 2-4, wherein determining target identity information belonging to the same user as the starting identity information based on a relationship graph comprises:
in the relational graph, traversing by taking an identity information node associated with the initial identity information as an initial identity information node to obtain a target sub-graph;
determining the correlation degree between the starting identity information node and other identity information nodes in the target sub-graph;
and determining target identity information belonging to the same user as the initial identity information according to the correlation.
6. The method of claim 5, wherein traversing the relationship graph with the identity information node associated with the starting identity information as a starting identity information node to obtain a target sub-graph comprises:
traversing by taking the identity information node associated with the initial identity information as an initial identity information node in the relational graph, acquiring a target node associated with the initial identity information node, and adding the target node to the target sub-graph;
if the target node is a virtual node, determining that the target node is a target virtual node, and acquiring other virtual nodes associated with the target virtual node from the relational graph according to spatial characteristics;
and adding the other virtual nodes into the target sub-graph, and constructing the generalization edge relation between the other virtual nodes and the target virtual node.
7. The method of claim 6, wherein obtaining other virtual nodes associated with the target virtual node from the relationship graph according to spatial features comprises:
taking, as the other virtual nodes associated with the target virtual node, the virtual nodes whose association area is within a second length of the target virtual node and whose association period is within a second duration of the target virtual node.
8. The method of claim 5, wherein determining the degree of correlation between the starting identity information node and other identity information nodes in the target sub-graph comprises:
determining the correlation degree according to the number of edge relations and the edge relation type between the starting identity information node and the other identity information nodes in the target sub-graph.
9. The method of claim 1, further comprising: extracting candidate communication paths of the user identity information nodes according to the edge relations among different nodes in the relation graph;
correspondingly, determining, based on the relation graph, target identity information belonging to the same user as the starting identity information comprises:
determining a starting identity information type to which the starting identity information belongs;
selecting a target communication path from the candidate communication paths according to the starting identity information type;
and extracting, from the relation graph based on the target communication path, target identity information belonging to the same user as the starting identity information.
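The path-selection steps of this claim can be sketched as follows. The sketch assumes that a candidate communication path is stored as a tuple of node types (for example phone, virtual, device) and that a path applies when its first type equals the starting identity information type; this representation and all function names are hypothetical.

```python
def select_paths(candidate_paths, start_type):
    """Keep the candidate paths that begin with the starting identity type."""
    return [p for p in candidate_paths if p and p[0] == start_type]


def follow_path(graph, node_types, start, path):
    """Walk the graph along `path`, keeping only nodes of each wanted type."""
    frontier = {start}
    for wanted_type in path[1:]:
        frontier = {
            neighbor
            for node in frontier
            for neighbor in graph.get(node, ())
            if node_types.get(neighbor) == wanted_type
        }
    return frontier


def target_identities(graph, node_types, start, candidate_paths):
    """Collect the nodes reached from `start` over every applicable path."""
    found = set()
    for path in select_paths(candidate_paths, node_types[start]):
        found |= follow_path(graph, node_types, start, path)
    found.discard(start)
    return found
```

Precomputing the candidate paths means a query only has to walk the few paths that match its starting type, rather than searching the whole relation graph.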
10. A user identity data determination apparatus, comprising:
an acquisition module, configured to acquire a user identity query request, wherein the user identity query request comprises starting identity information;
a determining module, configured to determine, based on a relation graph, target identity information belonging to the same user as the starting identity information;
wherein the relation graph comprises user identity information nodes, virtual nodes, and edge relations among different nodes, and the virtual nodes are determined according to spatial features.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the user identity data determination method of any one of claims 1-9.
12. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the user identity data determination method of any one of claims 1-9.
CN201911383265.3A 2019-12-27 2019-12-27 User identity data determination method, device, equipment and medium Active CN111143627B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911383265.3A CN111143627B (en) 2019-12-27 2019-12-27 User identity data determination method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911383265.3A CN111143627B (en) 2019-12-27 2019-12-27 User identity data determination method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN111143627A (en) 2020-05-12
CN111143627B (en) 2023-08-15

Family

ID=70521294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911383265.3A Active CN111143627B (en) 2019-12-27 2019-12-27 User identity data determination method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111143627B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003187176A (en) * 2001-12-20 2003-07-04 Hottolink:Kk Information ranking calculation method
US20110029618A1 (en) * 2009-08-02 2011-02-03 Hanan Lavy Methods and systems for managing virtual identities in the internet
US20180159891A1 (en) * 2015-03-30 2018-06-07 Amazon Technologies, Inc. Threat detection and mitigation through run-time introspection and instrumentation
WO2018081732A1 (en) * 2016-10-31 2018-05-03 Dg Holdings, Inc. Portable and persistent virtual identity systems and methods
CN106534164A (en) * 2016-12-05 2017-03-22 公安部第三研究所 Cyberspace user identity-based effective virtual identity description method in computer
CN108427956A (en) * 2017-02-14 2018-08-21 腾讯科技(深圳)有限公司 A kind of clustering objects method and apparatus
CN109828967A (en) * 2018-12-03 2019-05-31 深圳市北斗智能科技有限公司 A kind of accompanying relationship acquisition methods, system, equipment, storage medium
CN109919316A (en) * 2019-03-04 2019-06-21 腾讯科技(深圳)有限公司 The method, apparatus and equipment and storage medium of acquisition network representation study vector
CN109978016A (en) * 2019-03-06 2019-07-05 重庆邮电大学 A kind of network user identity recognition methods
CN110516076A (en) * 2019-08-11 2019-11-29 西藏宁算科技集团有限公司 A kind of the cloud computing management method and system of knowledge based map
CN110515968A (en) * 2019-08-30 2019-11-29 北京百度网讯科技有限公司 Method and apparatus for output information
CN110543586A (en) * 2019-09-04 2019-12-06 北京百度网讯科技有限公司 Multi-user identity fusion method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
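胡庆平 (Hu Qingping): "面向移动互联网信息服务的用户行为研究" [Research on User Behavior for Mobile Internet Information Services], 《中国优秀硕士论文全文数据库信息科技辑》 [China Master's Theses Full-text Database, Information Science and Technology] *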

Also Published As

Publication number Publication date
CN111143627B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
KR102557613B1 (en) Query method, apparatus, electronic device and storage medium
WO2018149292A1 (en) Object clustering method and apparatus
CN107133309B (en) Method and device for storing and querying process example, storage medium and electronic equipment
EP3198479A1 (en) Enriching events with dynamically typed big data for event processing
EP3077926A1 (en) Pattern matching across multiple input data streams
CN113051446A (en) Topological relation query method, device, electronic equipment and medium
CN105302809A (en) Group user level association method and system
CN112269789A (en) Method and device for storing data and method and device for reading data
EP3828732A2 (en) Method and apparatus for processing identity information, electronic device, and storage medium
CN110619002A (en) Data processing method, device and storage medium
CN111814067B (en) Friend recommendation method, device, equipment and storage medium
CN111625552A (en) Data collection method, device, equipment and readable storage medium
CN110781200B (en) Processing method, device, equipment and medium for block chain abnormal data
CN111259090A (en) Graph generation method and device of relational data, electronic equipment and storage medium
CN106156258B (en) Method, device and system for counting data in distributed storage system
CN112328658A (en) User profile data processing method, device, equipment and storage medium
KR101614890B1 (en) Method of creating multi tenancy history, server performing the same and storage media storing the same
CN111966767A (en) Track thermodynamic diagram generation method and device, electronic equipment and storage medium
CN110995687A (en) Cat pool equipment identification method, device, equipment and storage medium
CN111143627B (en) User identity data determination method, device, equipment and medium
CN105978744A (en) Resource allocation method, device and system
US20140089438A1 (en) Method and device for processing information
CN111753330A (en) Method, device and equipment for determining data leakage subject and readable storage medium
CN104394197A (en) SQL (Structured Query Language) injection detection system and method based on cloud environment
CN111292223A (en) Graph calculation processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant