CN113704566A - Identification number body identification method, storage medium and electronic equipment - Google Patents

Identification number body identification method, storage medium and electronic equipment

Publication number
CN113704566A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111266763.7A
Other languages
Chinese (zh)
Other versions
CN113704566B (en)
Inventor
杨悦
李君阳
马英楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beike Technology Co Ltd
Original Assignee
Beike Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beike Technology Co Ltd
Priority to CN202111266763.7A
Publication of CN113704566A
Application granted
Publication of CN113704566B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present disclosure disclose an identification number body identification method, a storage medium and an electronic device. The method comprises: receiving a first user identification number and a second user identification number whose subject is to be identified; determining at least one first path between the first user identification number and the second user identification number based on an identification relationship graph; determining a path feature corresponding to the at least one first path based on a meta-path set corresponding to the identification relationship graph; and inputting the path feature into a classification network, and determining whether the first user identification number and the second user identification number belong to the same subject based on a target probability output by the classification network. By constructing the path feature, the embodiments capture more of the association information between the two identification numbers, which improves the accuracy of the identification result; the path feature is processed by the classification network, and whether the two identification numbers belong to the same subject is determined from the classification result, achieving fast and accurate subject identification.

Description

Identification number body identification method, storage medium and electronic equipment
Technical Field
The present disclosure relates to the technical field of risk profiling, and in particular to an identification number body identification method, a storage medium and electronic equipment.
Background
ID-Mapping (identifier mapping) uses technical means to recognize IDs from different sources (users, devices, mobile phone numbers and the like) as belonging to the same object or subject, and links a user's fragmented behaviors and data together to eliminate data silos, thereby enabling accurate identification, accurate targeting, targeted delivery, recommendation and the like. In the field of risk control, the goal is to identify the unique subject behind B-end and C-end users, and thereby to identify the various types of risk behavior of these subjects in the system; however, the prior art cannot determine whether different IDs belong to the same subject.
Disclosure of Invention
The present disclosure is proposed to solve the above technical problems. The embodiment of the disclosure provides an identification number body identification method, a storage medium and an electronic device.
According to an aspect of the embodiments of the present disclosure, there is provided an identification number body identification method, including:
receiving a first user identification number and a second user identification number whose subject is to be identified;
determining at least one first path between the first user identification number and the second user identification number based on an identification relationship graph; the identification relation graph comprises a plurality of nodes with connection relations, and each node corresponds to a user identification number;
determining a path characteristic corresponding to the at least one first path based on the meta-path set corresponding to the identification relationship graph; wherein, the meta-path set comprises at least one meta-path, and each meta-path comprises a plurality of nodes connected by at least one edge relation type;
and inputting the path characteristics into a classification network, and determining whether the first user identification number and the second user identification number belong to the same subject based on the target probability output by the classification network.
Optionally, the determining, based on the meta path set corresponding to the identified relationship graph, a path feature corresponding to the at least one first path includes:
determining the number of nodes included in each path in the at least one first path and at least one edge relation type among a plurality of nodes;
determining at least one meta-path from the set of meta-paths based on the number of nodes and the at least one edge relationship type;
determining the path feature based on the determined at least one meta-path.
Optionally, the determining the path feature based on the determined at least one meta-path includes:
determining at least one weight value corresponding to each meta path based on at least one edge relation type corresponding to the determined at least one meta path; wherein each meta path corresponds to at least one weight value;
determining at least one vector code corresponding to at least one second path based on the at least one weight value;
determining the path feature based on the at least one vector encoding.
Optionally, the determining, based on the at least one weight value, at least one vector code corresponding to at least one second path includes:
in response to the number of weight values corresponding to a meta-path being smaller than n, taking 0 as a supplementary weight value so that the number of weight values corresponding to each meta-path is n, to obtain n weight values corresponding to each meta-path; wherein n is the maximum number of weight values included in a meta-path in the meta-path set, and n is an integer greater than 1;
and taking the n weight values corresponding to the meta-path as the vector code of the second path corresponding to the meta-path.
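The zero-padding described above can be illustrated with a minimal Python sketch. The names `pad_weights` and `encode_meta_paths`, the string keys used to label meta-paths, and the choice to append the supplementary zeros at the end of the vector are assumptions made for illustration; the disclosure does not state where the zeros are placed.

```python
from typing import Dict, List, Sequence


def pad_weights(weights: Sequence[float], n: int) -> List[float]:
    """Pad a meta-path's weight values with 0 so the result has length n.

    Assumption: the supplementary zeros are appended at the end.
    """
    if len(weights) > n:
        raise ValueError("a meta-path cannot carry more than n weight values")
    return list(weights) + [0.0] * (n - len(weights))


def encode_meta_paths(meta_path_weights: Dict[str, Sequence[float]]) -> Dict[str, List[float]]:
    """Encode every meta-path as a length-n vector, where n is the largest
    number of weight values carried by any meta-path in the set."""
    n = max(len(w) for w in meta_path_weights.values())
    return {name: pad_weights(w, n) for name, w in meta_path_weights.items()}


# Hypothetical meta-paths with one and two weight values respectively.
codes = encode_meta_paths({
    "phone-device": [0.8],
    "phone-device-account": [0.8, 0.5],
})
```

With equal-length codes, the vectors for several paths can be stacked into one fixed-size input for a classifier.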
Optionally, before determining at least one first path between the first user identification number and the second user identification number based on the identification relationship map, the method further includes:
acquiring a plurality of user identification numbers with known attributes, at least one association among the plurality of user identification numbers, and time information corresponding to the at least one association;
processing the time information corresponding to the at least one association by using a decay function to obtain weight values among the plurality of user identification numbers;
and establishing the identification relation graph by taking the plurality of user identification numbers as nodes and the plurality of weight values as connection attributes.
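As a rough illustration of the graph construction above, the following Python sketch turns timestamped associations into weighted edges. The disclosure does not specify the decay function; an exponential decay with a configurable half-life is assumed here, and `decay_weight`, `build_graph`, the day-based timestamps and the rule of keeping the strongest weight for a repeated pair are all hypothetical choices.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def decay_weight(age_days: float, half_life_days: float = 30.0) -> float:
    """Assumed decay function: 1.0 for a fresh association, halving every half-life."""
    return 0.5 ** (age_days / half_life_days)


def build_graph(
    associations: Iterable[Tuple[str, str, float]],
    now_day: float,
    half_life_days: float = 30.0,
) -> Dict[str, Dict[str, float]]:
    """Build an undirected weighted graph: IDs are nodes, decayed weights are edge attributes.

    Each association is (id_a, id_b, day_observed).
    """
    graph: Dict[str, Dict[str, float]] = defaultdict(dict)
    for id_a, id_b, seen_day in associations:
        weight = decay_weight(now_day - seen_day, half_life_days)
        # Assumption: if a pair is observed several times, keep the strongest weight.
        weight = max(weight, graph[id_a].get(id_b, 0.0))
        graph[id_a][id_b] = weight
        graph[id_b][id_a] = weight
    return dict(graph)


# Hypothetical data: one association seen today, one seen 30 days ago.
graph = build_graph(
    [("phone:1", "device:A", 100.0), ("device:A", "account:X", 70.0)],
    now_day=100.0,
)
```

A decaying weight lets recent associations count for more than stale ones without discarding the older evidence entirely.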
Optionally, before determining the path feature corresponding to the at least one first path based on the meta-path set corresponding to the identified relationship graph, the method further includes:
for the identification relation graph, performing a path search on the identification relation graph with the node corresponding to the global identification number as the initial node and n as the number of edges to search, to obtain at least one n-order path; wherein n is an integer greater than 1;
determining the meta-path set based on the at least one n-th order path.
Optionally, the determining the meta-path set based on the at least one n-th order path includes:
determining an edge relation type corresponding to n edges included in each n-order path in the at least one n-order path;
deduplicating the at least one n-order path based on the n edge relation types corresponding to each n-order path, to obtain at least one deduplicated n-order meta-path;
and constructing the meta-path set based on the at least one n-order meta-path.
Optionally, the determining an edge relationship type corresponding to n edges included in each n-order path of the at least one n-order path includes:
for each of the n edges included in the n-order path, determining the attributes of the user identification numbers corresponding to the two nodes joined by the edge;
and determining the edge relation type corresponding to the edge based on the attributes of the user identification numbers corresponding to the two nodes joined by the edge.
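The construction of the meta-path set described above (n-order path search, edge typing from node attributes, then deduplication) can be sketched in Python as follows. This is an illustration under stated assumptions: the function names, the representation of an edge relation type as the string `"<attribute>-<attribute>"`, the restriction to simple paths, and the caller-supplied start node are hypothetical and not taken from the disclosure.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[str, Set[str]]
MetaPath = Tuple[str, ...]  # one edge relation type per edge


def n_order_paths(graph: Graph, start: str, n: int) -> List[List[str]]:
    """Return all simple paths that begin at `start` and use exactly n edges."""
    paths: List[List[str]] = []

    def walk(path: List[str]) -> None:
        if len(path) == n + 1:
            paths.append(path)
            return
        for neighbour in sorted(graph.get(path[-1], ())):
            if neighbour not in path:  # assumption: no node is visited twice
                walk(path + [neighbour])

    walk([start])
    return paths


def edge_types(path: List[str], attrs: Dict[str, str]) -> MetaPath:
    """Derive each edge's relation type from the attributes of its two end nodes."""
    return tuple(f"{attrs[u]}-{attrs[v]}" for u, v in zip(path, path[1:]))


def meta_path_set(graph: Graph, attrs: Dict[str, str], start: str, n: int) -> Set[MetaPath]:
    """Deduplicate n-order paths by their sequence of edge relation types."""
    return {edge_types(p, attrs) for p in n_order_paths(graph, start, n)}


# Hypothetical graph: one account linked to two phones, each linked to a device.
graph = {
    "u1": {"p1", "p2"},
    "p1": {"u1", "d1"},
    "p2": {"u1", "d2"},
    "d1": {"p1"},
    "d2": {"p2"},
}
attrs = {"u1": "account", "p1": "phone", "p2": "phone", "d1": "device", "d2": "device"}
paths = n_order_paths(graph, "u1", 2)
metas = meta_path_set(graph, attrs, "u1", 2)
```

In this example the two concrete paths share the same sequence of edge relation types, so they collapse into a single meta-path.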
Optionally, before inputting the path feature into a classification network, and determining whether the first user identification number and the second user identification number are the same subject based on a target probability output by the classification network, the method further includes:
training the classification network based on a training data set; wherein the training data set includes at least one pair of training identification numbers known to be of the same subject.
Optionally, the training the classification network based on a training data set includes:
determining at least one third path corresponding to the training identification number pair based on the identification relation graph;
determining a predicted path characteristic corresponding to the at least one third path based on the meta-path set corresponding to the identification relationship graph;
inputting the predicted path characteristics into the classification network, and outputting a prediction result indicating whether the two training identification numbers in the training identification number pair belong to the same subject;
determining a network loss based on the prediction result and the known label indicating whether the training identification number pair corresponds to the same subject;
supervising training of the classification network based on the network loss.
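The supervised training loop above can be sketched in a few lines of Python. The disclosure does not specify the architecture of the classification network or the loss function; a single-layer logistic classifier trained with binary cross-entropy is substituted here purely for illustration, and the feature vectors, labels, learning rate and epoch count are made-up values.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: Sequence[float], bias: float, feature: Sequence[float]) -> float:
    """Probability that the pair behind `feature` belongs to the same subject."""
    return sigmoid(sum(w * x for w, x in zip(weights, feature)) + bias)


def mean_loss(weights, bias, features, labels) -> float:
    """Mean binary cross-entropy between predictions and known labels."""
    total = 0.0
    for x, y in zip(features, labels):
        p = min(max(predict(weights, bias, x), 1e-12), 1.0 - 1e-12)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def train(features: List[List[float]], labels: List[int],
          epochs: int = 500, lr: float = 0.5) -> Tuple[List[float], float]:
    """Stochastic gradient descent on the cross-entropy loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = predict(weights, bias, x) - y  # d(loss)/d(logit)
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical predicted path features and same-subject labels (1 = same subject).
features = [[0.9, 0.8], [0.8, 0.7], [0.1, 0.0], [0.2, 0.1]]
labels = [1, 1, 0, 0]
weights, bias = train(features, labels)
final_loss = mean_loss(weights, bias, features, labels)
```

At inference time, the output of `predict` would play the role of the target probability, to be compared against a chosen threshold.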
According to another aspect of the embodiments of the present disclosure, there is provided an identification number body recognition apparatus including:
the identification number receiving module is used for receiving a first user identification number and a second user identification number whose subject is to be identified;
a path determination module, configured to determine at least one first path between the first user identification number and the second user identification number based on an identification relationship map; the identification relation graph comprises a plurality of nodes with connection relations, and each node corresponds to a user identification number;
a path feature determination module, configured to determine, based on the meta-path set corresponding to the identification relationship graph, a path feature corresponding to the at least one first path; wherein, the meta-path set comprises at least one meta-path, and each meta-path comprises a plurality of nodes connected by at least one edge relation type;
and the subject identification module is used for inputting the path characteristics into a classification network and determining whether the first user identification number and the second user identification number belong to the same subject based on the target probability output by the classification network.
Optionally, the path characteristic determining module includes:
an edge relation determining unit, configured to determine the number of nodes included in each path in the at least one first path and at least one edge relation type between multiple nodes;
a meta path determining unit, configured to determine at least one meta path from the meta path set based on the number of nodes and the at least one edge relationship type;
a feature determination unit configured to determine the path feature based on the determined at least one meta-path.
Optionally, the feature determining unit is specifically configured to determine, based on the at least one edge relationship type corresponding to the determined at least one meta path, at least one weight value corresponding to each meta path; wherein each meta path corresponds to at least one weight value; determining at least one vector code corresponding to at least one second path based on the at least one weight value; determining the path feature based on the at least one vector encoding.
Optionally, when determining at least one vector code corresponding to at least one second path based on the at least one weight value, the feature determining unit is configured to, in response to the number of weight values corresponding to a meta-path being smaller than n, use 0 as a supplementary weight value so that the number of weight values corresponding to each meta-path is n, to obtain n weight values corresponding to each meta-path; wherein n is the maximum number of weight values included in a meta-path in the meta-path set, and n is an integer greater than 1; and take the n weight values corresponding to the meta-path as the vector code of the second path corresponding to the meta-path.
Optionally, the apparatus further comprises:
the graph establishing module is used for acquiring a plurality of user identification numbers with known attributes, at least one association among the plurality of user identification numbers, and time information corresponding to the at least one association; processing the time information corresponding to the at least one association by using a decay function to obtain weight values among the plurality of user identification numbers; and establishing the identification relation graph by taking the plurality of user identification numbers as nodes and the plurality of weight values as connection attributes.
Optionally, the apparatus further comprises:
the meta-path set module is used for performing a path search on the identification relation graph with the node corresponding to the global identification number as the initial node and n as the number of edges to search, to obtain at least one n-order path; wherein n is an integer greater than 1; and determining the meta-path set based on the at least one n-order path.
Optionally, when determining the meta-path set based on the at least one n-order path, the meta-path set module is configured to determine the edge relation types corresponding to the n edges included in each n-order path in the at least one n-order path; deduplicate the at least one n-order path based on the n edge relation types corresponding to each n-order path, to obtain at least one deduplicated n-order meta-path; and construct the meta-path set based on the at least one n-order meta-path.
Optionally, when determining the edge relation types corresponding to the n edges included in each n-order path of the at least one n-order path, the meta-path set module is configured to determine, for each of the n edges included in the n-order path, the attributes of the user identification numbers corresponding to the two nodes joined by the edge; and determine the edge relation type corresponding to the edge based on the attributes of the user identification numbers corresponding to the two nodes joined by the edge.
Optionally, the apparatus further comprises:
a network training module for training the classification network based on a training data set; wherein the training data set includes at least one pair of training identification numbers known to be of the same subject.
Optionally, the network training module is specifically configured to determine, based on the identification relationship graph, at least one third path corresponding to the training identification number pair; determine a predicted path characteristic corresponding to the at least one third path based on the meta-path set corresponding to the identification relationship graph; input the predicted path characteristics into the classification network, and output a prediction result indicating whether the two training identification numbers in the training identification number pair belong to the same subject; determine a network loss based on the prediction result and the known label indicating whether the training identification number pair corresponds to the same subject; and supervise training of the classification network based on the network loss.
According to still another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for executing the identification number body identification method according to any of the embodiments.
According to still another aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instruction from the memory, and execute the instruction to implement the identification number body identification method according to any of the above embodiments.
Based on the identification number body identification method, the storage medium and the electronic equipment provided by the embodiments of the present disclosure, a first user identification number and a second user identification number whose subject is to be identified are received; at least one first path between the first user identification number and the second user identification number is determined based on an identification relationship graph, wherein the identification relationship graph comprises a plurality of nodes with connection relations and each node corresponds to a user identification number; a path feature corresponding to the at least one first path is determined based on the meta-path set corresponding to the identification relationship graph, wherein the meta-path set comprises at least one meta-path and each meta-path comprises a plurality of nodes connected by at least one edge relation type; and the path feature is input into a classification network, and whether the first user identification number and the second user identification number belong to the same subject is determined based on a target probability output by the classification network. In the embodiments, at least one first path between the two identification numbers is obtained through path search and the path feature is determined based on the at least one first path. Constructing the path feature captures more of the association information between the two identification numbers, which improves the accuracy of the identification result; the path feature is processed by the classification network, and whether the two identification numbers belong to the same subject is determined from the classification result, achieving fast and accurate subject identification.
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 is a schematic flowchart of an identification number body identification method according to an exemplary embodiment of the present disclosure;
Fig. 2 is a schematic flowchart of step 106 in the embodiment shown in Fig. 1 of the present disclosure;
Fig. 3 is a schematic flowchart of step 1063 in the embodiment shown in Fig. 2 of the present disclosure;
Fig. 4 is a partial flowchart of an identification number body identification method according to another exemplary embodiment of the present disclosure;
Fig. 5 is a partial flowchart of an identification number body identification method according to yet another exemplary embodiment of the present disclosure;
Fig. 6 is a schematic diagram of determining a path characteristic based on an identification relation graph in an identification number body identification method provided in an exemplary embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of an identification number body identification device according to an exemplary embodiment of the present disclosure;
Fig. 8 is a block diagram of an electronic device provided in an exemplary embodiment of the present disclosure.
Detailed Description
Hereinafter, example embodiments according to the present disclosure will be described in detail with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and are not intended to imply any particular technical meaning or any necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more and "at least one" may refer to one, two or more.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure merely describes an association relationship between associated objects and means that three kinds of relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the former and latter associated objects are in an "or" relationship. The data referred to in this disclosure may include unstructured data, such as text, images, video, etc., as well as structured data.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The disclosed embodiments may be applied to electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with electronic devices, such as terminal devices, computer systems, servers, and the like, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
Exemplary method
Fig. 1 is a schematic flowchart of an identification number body identification method according to an exemplary embodiment of the present disclosure. The embodiment can be applied to an electronic device, as shown in fig. 1, and includes the following steps:
Step 102, receiving a first user identification number and a second user identification number whose subject is to be identified.
Optionally, the attributes (categories) of a user identification number in this embodiment may include, but are not limited to: system-type IDs (B-end and C-end registered users, etc.), mobile-phone-number-type IDs, device-type IDs (the advertising identifier IDFA, Wi-Fi, IP, etc.), biometric-type IDs (face ID, voiceprint ID, identity card, etc.), and the like. The edge relation type between different user identification numbers can be determined from the attributes of those user identification numbers.
Step 104, determining at least one first path between the first user identification number and the second user identification number based on the identification relationship graph.
The identification relation graph comprises a plurality of nodes with connection relation, and each node corresponds to a user identification number.
In one embodiment, before determining the at least one first path, an identification relationship graph is established based on a plurality of user identification numbers with connection relationships, each user identification number is used as a node in the identification relationship graph, and nodes with association relationships are connected through edges. There may not be a direct association between user identification numbers in the system, but the association may be made through other user identification numbers or actions.
Step 106, determining the path characteristics corresponding to the at least one first path based on the meta-path set corresponding to the identification relation graph.
The meta path set comprises at least one meta path, and each meta path comprises a plurality of nodes connected through at least one edge relation type.
In this embodiment, a meta-path describes a path formed by nodes connected through different edge relation types, and the meta-paths differ from one another in the number and/or kinds of edge relation types they contain. The meta-path set therefore covers every combination of edge relation types that a path of the set length can contain, which ensures that a corresponding path characteristic can be obtained for any path found when searching from a node.
Step 108, inputting the path characteristics into the classification network, and determining whether the first user identification number and the second user identification number belong to the same subject based on the target probability output by the classification network.
In this embodiment, the relationship between the first user identification number and the second user identification number is described by the path characteristics, and the path characteristics are processed by the classification network to determine whether the first user identification number and the second user identification number correspond to the same subject, which improves identification efficiency.
In this embodiment, at least one first path between the two identification numbers is obtained through path search, and the path characteristics are determined based on the at least one first path. Constructing the path characteristics captures more of the association information between the two identification numbers, which improves the accuracy of the identification result; the path characteristics are processed by the classification network, and whether the two identification numbers belong to the same subject is determined from the classification result, achieving fast and accurate subject identification. The approach is also highly interpretable, and more IDs can be recalled through multi-degree associations.
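The path search of step 104 can be illustrated with a short Python sketch. The disclosure does not name a search algorithm; a depth-first enumeration of simple paths bounded by a maximum number of edges is assumed here, and `find_paths`, `max_edges` and the ID strings are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]


def find_paths(graph: Graph, source: str, target: str, max_edges: int) -> List[List[str]]:
    """Enumerate simple paths from source to target that use at most max_edges edges."""
    found: List[List[str]] = []

    def walk(path: List[str]) -> None:
        node = path[-1]
        if node == target:
            found.append(path)
            return
        if len(path) - 1 == max_edges:  # edge budget exhausted
            return
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in path:  # keep the path simple
                walk(path + [neighbour])

    walk([source])
    return found


# Hypothetical graph: two accounts that share a phone number and a device.
graph = {
    "acct:B1": {"phone:1", "dev:A"},
    "acct:C1": {"phone:1", "dev:A"},
    "phone:1": {"acct:B1", "acct:C1"},
    "dev:A": {"acct:B1", "acct:C1"},
}
first_paths = find_paths(graph, "acct:B1", "acct:C1", max_edges=2)
```

Here the two accounts have no direct edge, yet two indirect first paths connect them, which matches the indirect associations described for step 104.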
As shown in fig. 2, based on the embodiment shown in fig. 1, step 106 may include the following steps:
Step 1061, determining the number of nodes included in each path in the at least one first path and at least one edge relationship type between the plurality of nodes.
Optionally, the first user identification number and the second user identification number may be associated directly and/or indirectly, and different first paths correspond to different ways in which the two user identification numbers are associated.
Step 1062, determining at least one meta-path from the meta-path set based on the number of nodes and the at least one edge relationship type.
In this embodiment, meta-paths with the same number of nodes and the same type of edge relationship can be found in the meta-path set by using the number of nodes included in each path and the type of edge relationship between the nodes.
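The lookup described above can be sketched as follows. This is a minimal illustration, assuming a meta path is represented as the ordered tuple of its edge relationship types; the function name `find_meta_path` and the relation name `deviceOfPhone` are hypothetical and do not come from the disclosure.

```python
from typing import List, Tuple

# Assumption: a meta path is modelled as the ordered tuple of edge relation
# types along a path, so a meta path with k edges has k + 1 nodes.
MetaPath = Tuple[str, ...]


def find_meta_path(path_edge_types: List[str], meta_path_set: List[MetaPath]) -> int:
    """Return the index of the meta path whose edge-type sequence equals
    that of the given path, or -1 if no meta path matches."""
    key = tuple(path_edge_types)
    for index, meta_path in enumerate(meta_path_set):
        # Equal tuples imply the same edge count, hence the same node count.
        if meta_path == key:
            return index
    return -1


# Hypothetical meta-path set; "deviceOfPhone" is an illustrative name only.
meta_paths: List[MetaPath] = [
    ("phoneOfCust",),
    ("phoneOfCust", "deviceOfPhone"),
]
match = find_meta_path(["phoneOfCust", "deviceOfPhone"], meta_paths)
```

Comparing whole tuples checks the number of nodes and the edge relationship types in a single step, which is one simple way to realize the matching of step 1062.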
Step 1063, determining a path feature based on the determined at least one meta-path.
Optionally, each edge in each meta-path in the meta-path set corresponds to a weight value. In this embodiment, each meta-path is encoded using the at least one weight value corresponding to it, giving a group of codes for each first path. Because the first paths differ in length, this embodiment pads the codes corresponding to the at least one first path so that all groups of codes have equal length. The vector formed by the equal-length groups of codes is used as the path feature, and all association relations between the first user identification number and the second user identification number are reflected in it, so the obtained identification result is more accurate. Whole paths, rather than single edges, are converted into features; and the relations between a root node and the other nodes are determined by searching from the root node instead of exhaustively enumerating all pairs of nodes, so pairs with no connection or only a weak connection are removed in advance and the amount of calculation is greatly reduced.
As shown in fig. 3, based on the embodiment shown in fig. 2, step 1063 may include the following steps:
Step 301, determining at least one weight value corresponding to each meta path based on at least one edge relation type corresponding to at least one determined meta path.
Wherein each meta path corresponds to at least one weight value.
In this embodiment, the identification relationship graph may contain a plurality of path instances corresponding to the same meta-path mode. In this case, the product of the weight values of each path instance may be computed, and the at least one weight value of the path instance with the largest product is taken as the at least one weight value corresponding to the meta path; alternatively, for each edge relation type, the weight values of that type across the plurality of path instances are multiplied, and the product is taken as the weight value of that edge relation type in the meta path.
For example, for a given meta-path mode (rendered as a formula image in the original publication), there may be a case where one mobile phone number is used by the same client on a plurality of different devices. The processing method for this scenario is:

$$\max_{p \in \text{meta-path}} \ \prod_{rel \in p} w_{rel}$$

wherein $w_{rel}$ represents the weight of a relation in the path, rel represents a relation in the path, p represents a path instance, $\prod_{rel \in p} w_{rel}$ represents the product of the weights of the relations included in path instance p, meta-path represents a meta path, p ∈ meta-path indicates that at least one path instance corresponds to the same meta path, and max denotes taking the maximum value of $\prod_{rel \in p} w_{rel}$. That is, if different path instances of the same meta path exist between two nodes, the instance with the largest weight product among them is taken as the characteristic value of the meta path. When a plurality of paths exist in the same mode, the path with the largest relation-weight product is used for feature calculation: on the one hand this facilitates the subsequent feature construction, and on the other hand the feature with the greatest strength stands in for all paths of the same mode, which effectively reduces the data volume. In addition, the same-mode multipath problem can also be handled by other methods, such as summing the relation-weight products, according to the actual application scenario.
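The maximum-product rule, and the summation alternative mentioned at the end of the paragraph, can be sketched as follows. The function names and the sample weights are illustrative assumptions, not values from the disclosure.

```python
from math import prod
from typing import List, Sequence


def strongest_instance(instances: Sequence[Sequence[float]]) -> List[float]:
    """Return the edge weights of the path instance whose weight product is
    largest among all instances of one meta path."""
    if not instances:
        raise ValueError("at least one path instance is required")
    return list(max(instances, key=prod))


def summed_products(instances: Sequence[Sequence[float]]) -> float:
    """Alternative aggregation: sum of the weight products of all instances."""
    return sum(prod(weights) for weights in instances)


# Two hypothetical instances of the same meta path between one node pair.
same_mode_instances = [[0.9, 0.5], [0.8, 0.7]]
best = strongest_instance(same_mode_instances)   # products: 0.45 vs 0.56
total = summed_products(same_mode_instances)
```

Keeping the weights of the winning instance, rather than only its product, leaves the per-edge values available for the later vector encoding.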
Step 302, determining at least one vector code corresponding to at least one second path based on at least one weight value.
Optionally, in response to the number of weight values corresponding to a meta-path being less than n, 0 is used as a supplementary weight value so that the number of weight values corresponding to each meta-path is n, thereby obtaining n weight values corresponding to each meta-path; wherein n is the maximum number of weight values included in any meta-path of the meta-path set, and n is an integer greater than 1;
and taking the n weighted values corresponding to the meta-path as vector codes of a second path corresponding to the meta-path.
Optionally, the meta-path set in this embodiment is obtained by an n-order search in the identification relationship graph, so the maximum number of weight values included in a meta-path is n. To make the groups of code values corresponding to the meta-paths equal in length, any meta-path with fewer than n weight values is padded, for example with zeros, up to n values; a group of codes containing n elements is thus obtained for each meta-path and expressed as a one-dimensional vector. The degree n of the search path serves as a hyperparameter of the feature construction: the larger n is, the longer the paths and the richer the representation of the relationship.
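A minimal sketch of the zero padding is given below; the function name `pad_weights` is a hypothetical choice for illustration.

```python
from typing import List, Sequence


def pad_weights(weights: Sequence[float], n: int) -> List[float]:
    """Pad a meta path's weight values with zeros up to length n."""
    if len(weights) > n:
        raise ValueError("a meta path found by an n-order search has at most n weights")
    return list(weights) + [0.0] * (n - len(weights))


# A two-edge meta path encoded for a 4-order search.
code = pad_weights([0.8, 0.7], 4)
```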
Step 303, determining path features based on at least one vector encoding.
In this embodiment, the at least one one-dimensional vector code is stacked in the vertical direction to obtain a matrix that has n elements in the horizontal direction and as many rows in the vertical direction as there are second paths, and this matrix is used as the path feature in this embodiment.
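As a sketch of this stacking, the following assumes each vector code is already a list of n values and represents the matrix as a list of rows; the name `build_path_feature` is hypothetical.

```python
from typing import List, Sequence


def build_path_feature(vector_codes: Sequence[Sequence[float]]) -> List[List[float]]:
    """Stack equal-length vector codes row by row into a matrix
    (one row per second path, n columns)."""
    if not vector_codes:
        raise ValueError("at least one vector code is required")
    n = len(vector_codes[0])
    if any(len(code) != n for code in vector_codes):
        raise ValueError("all vector codes must have the same length n")
    return [list(code) for code in vector_codes]


feature = build_path_feature([[0.9, 0.0], [0.8, 0.7]])
```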
As shown in fig. 4, on the basis of the embodiment shown in fig. 2, before step 104, the method may further include:
Step 401, obtaining a plurality of user identification numbers with known attributes, at least one association between the plurality of user identification numbers, and time information corresponding to the at least one association.
In this embodiment, when the association relationship between every two user identification numbers is obtained, the time information corresponding to the association is also obtained; associations between user identification numbers vary in strength and may appear or disappear over time.
Step 402, processing the time information corresponding to the at least one association by using a decay function to obtain weight values among the plurality of user identification numbers.
In this embodiment, identification numbers with different attributes may correspond to different decay functions. A decay function expresses that the association strength between two user identification numbers weakens over time, so the corresponding weight value also decreases. If two IDs (user identification numbers) were associated occasionally long ago but have not been associated since, the confidence between them becomes lower and lower over time, in line with Newton's law of cooling. Therefore, according to the frequency and time of the associations between IDs, the confidence of each of the multiple associations is computed separately, and the weight value between two user identification numbers is obtained by summation. The confidence is fitted with a time decay function; optionally, one time decay function is shown in formula (1) below:
[Formula (1): time decay function, rendered as an image in the original publication]
wherein δ is an attenuation parameter whose value can be preset according to the application scenario, and x is the time elapsed from the association to the present, calculated in one example by a further formula (also rendered as an image in the original publication). The symbols of that formula denote, in order: a user identification number; the attribute (type) corresponding to that user identification number; a further user identification number; the attribute (type) corresponding to the further user identification number; and the weight value between the two user identification numbers.
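The summation of per-association confidences can be sketched as follows. The exact form of formula (1) is not reproduced in this text, so the sketch assumes a plain exponential decay exp(-δx), which is one function consistent with Newton's law of cooling; that form, the function name `association_weight`, and the sample values are assumptions rather than the disclosed formula.

```python
import math
from typing import Sequence


def association_weight(elapsed_times: Sequence[float], delta: float) -> float:
    """Sum the decayed confidence of every recorded association.

    elapsed_times: time from each association to now (unit chosen by the caller).
    delta: attenuation parameter; exp(-delta * x) is an ASSUMED decay form.
    """
    if delta < 0:
        raise ValueError("delta must be non-negative")
    return sum(math.exp(-delta * x) for x in elapsed_times)


# Hypothetical example: frequent recent associations versus one old association.
recent = association_weight([1.0, 2.0, 3.0], delta=0.1)
stale = association_weight([365.0], delta=0.1)
```

Under this assumption, both more associations and more recent associations raise the weight, which matches the behaviour the paragraph describes.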
Step 403, establishing an identification relation graph by using the plurality of user identification numbers as nodes and the plurality of weight values as connection attributes.
Existing ID identification methods do not consider changes in the association between IDs and treat a single association as fact. In real scenarios, however, users log in on other people's devices or register with other people's mobile phone numbers, so a mobile phone number and a device may belong to two different people. In the recommendation field such associations can be used to expand recall, whereas in the risk control field the true ownership of each ID must be clarified to avoid penalizing the wrong person. Therefore, this embodiment fits the associations with a time decay function, calculates the confidence of the association between every two IDs (user identification numbers), and constructs an ID graph with the IDs as nodes and the confidences as weights so as to associate the IDs. Unlike treating ID associations as strong facts, the embodiment of the present disclosure defines different relation types for associations between different user identification numbers, and describes the ID relations more thoroughly by applying time decay to the association time and frequency: the larger the weight, the stronger the relation and the closer it is in the time dimension. Different decay functions are adopted for relations between different types of IDs, which better matches the real data distribution and means the method is not limited by the different relation types of different service scenarios.
In an optional example, each obtained ID is used as a node, and the ID type is used as an attribute of the node, denoted V; the association between every two IDs is used as an edge, the weight value between the IDs is used as the weight value of the edge, and the weight value and the edge relation type R of the edge are used as the attributes of the relation, so as to construct a weighted directed graph

$$G = \{(v_i, rel, v_j) \mid v_i, v_j \in V,\ rel \in R\}$$

as the identification relationship graph, wherein G represents the weighted directed graph, $v_i$ and $v_j$ represent two nodes in the graph, rel represents the relation between the two nodes, V represents the node set, and R represents the relation set; the IDs are thereby associated with one another.
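One way to hold such a weighted directed graph in memory is an adjacency list keyed by node ID, sketched below. The data layout, the function name `build_graph`, and the sample IDs and type labels are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[str, str, float]  # (relation type, destination ID, weight)


def build_graph(
    node_types: Dict[str, str],
    edges: Sequence[Tuple[str, str, str, float]],
) -> Dict[str, List[Edge]]:
    """Build an adjacency list from (source ID, relation type, destination ID,
    weight) records; every endpoint must be a known node."""
    graph: Dict[str, List[Edge]] = {node: [] for node in node_types}
    for src, rel, dst, weight in edges:
        if src not in node_types or dst not in node_types:
            raise KeyError(f"unknown node in edge {src} -> {dst}")
        graph[src].append((rel, dst, weight))
    return graph


# Hypothetical IDs and type labels.
nodes = {"gid1": "client", "phone1": "phone"}
id_graph = build_graph(nodes, [("gid1", "phoneOfCust", "phone1", 0.9)])
```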
As shown in fig. 5, on the basis of the embodiment shown in fig. 2, before the step 106 is executed, the method may further include:
Step 501, for the identification relation graph, taking the node corresponding to the global identification number as the starting node and n as the number of search edges, performing a path search on the identification relation graph to obtain at least one n-order path.
Wherein n is an integer greater than 1.
In this embodiment, a strong ID in the identification relationship graph, for example, a biological ID (identification card, etc.), may be used as the global identification number (GID).
Step 502, based on at least one n-order path, a meta-path set is determined.
Optionally, step 502 may include:
determining an edge relation type corresponding to n edges included in each n-order path in at least one n-order path;
performing duplicate removal operation on at least one n-order path based on n edge relation types corresponding to each n-order path to obtain at least one n-order meta path after duplicate removal;
de-duplicating the searched path type set PT to obtain the meta-path set (its symbol is rendered as an image in the original publication), which ensures that each meta path is a distinct combination of path types; each meta path is taken as a feature mode, and the size of the meta-path set is denoted N.
And forming a meta-path set based on at least one n-order meta-path.
In this embodiment, features are constructed by searching the weighted graph along n-order paths with a GID as the starting point, where the value of n is adjusted according to the computing performance and the service scenario. The search is limited to order n for two reasons. First, the time complexity of breadth-first search can be expressed as O(k + e), where k represents the number of nodes and e represents the number of edges, that is, the time complexity is proportional to the number of nodes and the number of edges; when the ID graph has too many nodes and complex relations, computing performance becomes a limit. Second, as the path degree increases, a node lies farther from the starting point and the confidence is lower.
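The bounded search and the de-duplication of path types can be sketched together as follows. The sketch assumes an adjacency-list graph, enumerates simple paths of at most n edges in breadth-first order, and treats the ordered tuple of edge relation types as the path type; the function names, the cycle check, and the sample graph are illustrative assumptions.

```python
from collections import deque
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (relation type, destination ID, weight)
Path = Tuple[Tuple[str, ...], Tuple[str, ...], Tuple[float, ...]]


def search_paths(graph: Dict[str, List[Edge]], start: str, n: int) -> List[Path]:
    """Enumerate, breadth first, all cycle-free paths of 1..n edges from start.
    Each result is (node IDs, edge relation types, edge weights)."""
    results: List[Path] = []
    queue = deque([((start,), (), ())])
    while queue:
        nodes, types, weights = queue.popleft()
        if len(types) == n:
            continue  # search depth limit reached
        for rel, dst, weight in graph.get(nodes[-1], []):
            if dst in nodes:
                continue  # do not revisit a node on the same path
            item = (nodes + (dst,), types + (rel,), weights + (weight,))
            results.append(item)
            queue.append(item)
    return results


def meta_path_set(paths: List[Path]) -> List[Tuple[str, ...]]:
    """De-duplicate path types while keeping first-seen order."""
    return list(dict.fromkeys(types for _, types, _ in paths))


# Hypothetical graph; "deviceOfPhone" is an illustrative relation name.
demo_graph = {
    "gid1": [("phoneOfCust", "phone1", 0.9), ("phoneOfCust", "phone2", 0.4)],
    "phone1": [("deviceOfPhone", "dev1", 0.8)],
}
found = search_paths(demo_graph, "gid1", 2)
modes = meta_path_set(found)
```

Here three path instances are found but only two distinct modes remain after de-duplication, which is the reduction from path instances to the meta-path set of size N.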
Optionally, in an optional example, the flow of the path search and the meta-path feature determination may be as shown in fig. 6, and comprises the following steps:
a, taking a GID as the starting point and any non-GID node as the end point, performing a breadth-first search, and retaining during the search the node IDs and the path type on each path (the path type symbol is rendered as an image in the original publication). The path type represents the combination of the association types of the edges included in the path, and PT represents the path type set, which is used for constructing path modes. The weights of the relations on each path are retained at the same time (the corresponding symbols are rendered as images in the original publication); each weighted path represents one path instance, and PW represents the set of path instances with weights; for example, src-v1-v2-dst in fig. 6 represents one PW.
b, determining at least one weight value corresponding to each meta path, for which reference may be made to the process of determining the weight value in fig. 3.
c, for the meta paths processed in step b, obtaining all paths from the starting point to the end point and their corresponding modes (formula rendered as an image in the original publication); each path mode corresponds to at most one path instance.
Because there are a plurality of paths from the starting point to the end point and the path length is not fixed, but the search is limited to n degrees so that the maximum path length is n, each meta path is padded: the weights of a path shorter than n degrees are padded with 0 up to n degrees. Taking the meta-path set as a lookup table, all paths between (src, dst) are padded and then converted by one-hot encoding according to the lookup table. For example, with meta paths (meta-path) established for the identification relationship graph provided by an optional example, the path code of (src, dst) when n is 4 can be represented as shown in Table 1 below:
[Table 1: path code of (src, dst) for n = 4, rendered as an image in the original publication]
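One possible reading of the lookup-table encoding is sketched below: each meta path owns one fixed row, the row holds the zero-padded weights of the single path instance kept for that mode between (src, dst), and rows of modes with no instance stay all zero. Since Table 1 itself is not reproduced in this text, this layout, the function name `encode_pair`, and the sample values are assumptions.

```python
from typing import Dict, List, Sequence, Tuple

MetaPath = Tuple[str, ...]


def encode_pair(
    meta_paths: Sequence[MetaPath],
    pair_paths: Dict[MetaPath, Sequence[float]],
    n: int,
) -> List[List[float]]:
    """Return an N x n matrix aligned with the meta-path lookup table.

    pair_paths maps a meta path to the weights of the one path instance
    kept for it between the (src, dst) pair."""
    matrix: List[List[float]] = []
    for meta_path in meta_paths:
        weights = list(pair_paths.get(meta_path, ()))
        if len(weights) > n:
            raise ValueError("a path instance cannot have more than n weights")
        matrix.append(weights + [0.0] * (n - len(weights)))
    return matrix


# Hypothetical lookup table and one observed path instance, with n = 4.
table = [("phoneOfCust",), ("phoneOfCust", "deviceOfPhone")]
encoded = encode_pair(table, {("phoneOfCust", "deviceOfPhone"): [0.9, 0.8]}, 4)
```

Because every pair is encoded against the same table, all samples share one shape (N rows, n columns), which is what a classification network with a fixed input size needs.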
Optionally, determining an edge relationship type corresponding to n edges included in each n-order path of the at least one n-order path includes:
determining attributes of user identification numbers corresponding to two nodes corresponding to the edges aiming at each edge in n edges included in the n-order path;
and determining the edge relation type corresponding to the edge based on the attributes of the user identification numbers corresponding to the two nodes corresponding to the edge.
In this embodiment, the edge relationship type may be determined based on the attributes of the user identification numbers corresponding to the two nodes connected by the edge; nodes with different attributes have different connection relationships, for example, phoneOfCust ([client] registers and uses [mobile phone number]), sendimby ([mobile phone number] sent in [user] IM), changephone ofemployee ([mobile phone number] used in history), and the like.
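A minimal sketch of deriving the edge relationship type from the two node attributes is given below. The attribute labels used as dictionary keys and the function name `edge_relation_type` are hypothetical; only the relation names phoneOfCust and sendimby come from the text above.

```python
from typing import Dict, Optional, Tuple

# Assumption: one relation type per ordered pair of node attributes.
# The attribute labels ("client", "user", "phone") are illustrative.
RELATION_BY_ATTRIBUTES: Dict[Tuple[str, str], str] = {
    ("client", "phone"): "phoneOfCust",
    ("user", "phone"): "sendimby",
}


def edge_relation_type(src_attribute: str, dst_attribute: str) -> Optional[str]:
    """Look up the relation type for an edge from its endpoint attributes;
    return None when the pair has no defined relation."""
    return RELATION_BY_ATTRIBUTES.get((src_attribute, dst_attribute))


relation = edge_relation_type("client", "phone")
```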
In some optional embodiments, before performing step 108, the method may further include:
the classification network is trained based on a training data set.
Wherein the training data set comprises at least one pair of training identification numbers for which it is known whether they correspond to the same subject.
In this embodiment, training identification number pairs for which it is known whether they belong to the same subject are used as samples: when a pair belongs to the same subject, the classification probability is recorded as 1, and otherwise as 0. The known classification probabilities are used as supervision to train the classification network. The classification network may be a binary classification model, and its structure may be any classification network structure, selected according to the actual application scenario and complexity. The trained classification network is then used to predict new samples; the prediction result is the probability, between 0 and 1, that the two user identification numbers belong to the same subject, and when the probability is greater than a set threshold, the two user identification numbers are considered to correspond to one subject.
Optionally, training the classification network based on a training data set includes:
determining at least one third path corresponding to the training identification number pair based on the identification relation graph;
determining a predicted path characteristic corresponding to at least one third path based on the meta-path set corresponding to the identification relation graph;
inputting the predicted path characteristics into a classification network, and outputting a prediction result indicating whether two training identification numbers in the training identification number pair are the same body;
determining network loss based on the prediction result and the known mark whether the training identification number pair is corresponding to the same main body;
based on the network loss, training of the classification network is supervised.
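The supervised training steps above can be sketched with the simplest possible binary classifier. The disclosure leaves the network structure open, so this sketch swaps in logistic regression trained by stochastic gradient descent on the cross-entropy loss; the model choice, the function names, the flattened two-element features, and the hyperparameters are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Probability that the identification number pair is the same subject."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train(samples: Sequence[Sequence[float]], labels: Sequence[int],
          epochs: int = 500, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic regression by per-sample gradient descent on cross-entropy."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict_proba(weights, bias, x) - y  # d(loss)/d(logit)
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def is_same_subject(probability: float, threshold: float = 0.5) -> bool:
    return probability > threshold


# Hypothetical flattened path features: label 1 = same subject, 0 = different.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.0], [0.0, 0.2]]
train_y = [1, 1, 0, 0]
model_w, model_b = train(train_x, train_y)
strong_pair = predict_proba(model_w, model_b, [0.85, 0.85])
weak_pair = predict_proba(model_w, model_b, [0.05, 0.1])
```

In practice the path feature matrix would be flattened (or fed to a network that accepts matrices) and the threshold tuned to the service scenario; the 0.5 default here is only a placeholder.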
In this embodiment, a weighted ID graph is constructed from the confidences between IDs calculated with time decay, and all weighted paths between two IDs are searched; the path set undergoes same-mode multi-path processing, path padding, and one-hot encoding to obtain features for samples defined by a starting point and an end point, which serve as the input of a binary classification model, and the classification network is trained through supervised learning.
In this embodiment, a graph mining technique is adopted: all paths between two IDs and the weights of the relations on those paths are used as features, and the classification network predicts the probability that the two IDs belong to the same subject. The method is highly interpretable, and more IDs can be recalled through multi-degree association.
Any identification number body identification method provided by the embodiment of the disclosure can be executed by any appropriate device with data processing capability, including but not limited to: terminal equipment, a server and the like. Alternatively, any of the identification number body identification methods provided by the embodiments of the present disclosure may be executed by a processor, for example, the processor may execute any of the identification number body identification methods mentioned in the embodiments of the present disclosure by calling a corresponding instruction stored in a memory. And will not be described in detail below.
Exemplary devices
Fig. 7 is a schematic structural diagram of an identification number body identification device according to an exemplary embodiment of the present disclosure. The apparatus shown in fig. 7 comprises:
an identification number receiving module 71, configured to receive a first user identification number and a second user identification number of a body to be identified;
a path determination module 72, configured to determine at least one first path between the first user identification number and the second user identification number based on the identification relationship map.
The identification relation graph comprises a plurality of nodes with connection relation, and each node corresponds to a user identification number.
a path feature determining module 73, configured to determine, based on the meta-path set corresponding to the identification relationship graph, a path feature corresponding to at least one first path.
The meta path set comprises at least one meta path, and each meta path comprises a plurality of nodes connected through at least one edge relation type.
a body identification module 74, configured to input the path characteristics into the classification network, and determine whether the first user identification number and the second user identification number are the same body based on the target probability output by the classification network.
In the embodiment, at least one first path between two identification numbers is obtained through path search, path characteristics are determined based on the at least one first path, more related information between the two identification numbers is embodied through constructing the path characteristics, the accuracy of an identification result is improved, the path characteristics are processed through a classification network, whether the two identification numbers are the same main body is determined according to the classification result, and the quick and accurate main body identification is realized; the interpretability is strong, and more IDs can be recalled through multiple degrees of association.
Optionally, the path characteristic determining module 73 includes:
an edge relation determining unit, configured to determine the number of nodes included in each path in at least one first path and at least one edge relation type between multiple nodes;
a meta path determining unit, configured to determine at least one meta path from the meta path set based on the number of nodes and the at least one edge relationship type;
a feature determination unit for determining a path feature based on the determined at least one meta-path.
Optionally, the feature determining unit is specifically configured to determine, based on the at least one edge relationship type corresponding to the determined at least one meta path, at least one weight value corresponding to each meta path; wherein each meta path corresponds to at least one weight value; determining at least one vector code corresponding to the at least one second path based on the at least one weight value; the path features are determined based on at least one vector encoding.
Optionally, when determining at least one vector code corresponding to at least one second path based on at least one weight value, the feature determining unit is configured to, in response to that the number of weight values corresponding to the meta path is less than n, use 0 as a supplementary weight value, make the number of weight values corresponding to each meta path n, and obtain n weight values corresponding to each meta path; wherein n is the maximum number of weighted values included in the meta-path set, and n is an integer greater than 1; and taking the n weighted values corresponding to the meta-path as vector codes of a second path corresponding to the meta-path.
Optionally, the apparatus provided in this embodiment further includes:
the graph establishing module is used for acquiring at least one association between the user identification numbers with the known attributes and the user identification numbers and time information corresponding to the at least one association; processing time information corresponding to at least one association by using a decay function to obtain weight values among a plurality of user identification numbers; and establishing an identification relation graph by taking a plurality of user identification numbers as nodes and a plurality of weight values as connection attributes.
Optionally, the apparatus provided in this embodiment further includes:
the meta-path aggregation module is used for carrying out path search on the identification relation graph by taking the node corresponding to the global identification number as an initial node and n as a search edge number to obtain at least one n-order path; wherein n is an integer greater than 1; based on at least one n-order path, a meta-path set is determined.
Optionally, the meta-path set module is configured to determine an edge relationship type corresponding to n edges included in each n-order path in the at least one n-order path when determining the meta-path set based on the at least one n-order path; performing duplicate removal operation on at least one n-order path based on n edge relation types corresponding to each n-order path to obtain at least one n-order meta path after duplicate removal; and forming a meta-path set based on at least one n-order meta-path.
Optionally, when determining the edge relationship type corresponding to the n edges included in each n-order path in the at least one n-order path, the meta path aggregation module is configured to determine, for each edge in the n edges included in the n-order path, an attribute of the user identification number corresponding to two nodes corresponding to the edge; and determining the edge relation type corresponding to the edge based on the attributes of the user identification numbers corresponding to the two nodes corresponding to the edge.
Optionally, the apparatus provided in this embodiment further includes:
the network training module is used for training the classification network based on the training data set; wherein the training data set comprises at least one pair of training identification numbers known to be the same subject.
Optionally, the network training module is specifically configured to determine, based on the recognition relation graph, at least one third path corresponding to the training identification number pair; determining a predicted path characteristic corresponding to at least one third path based on the meta-path set corresponding to the identification relation graph; inputting the predicted path characteristics into a classification network, and outputting a prediction result indicating whether two training identification numbers in the training identification number pair are the same body; determining network loss based on the prediction result and the known mark whether the training identification number pair is corresponding to the same main body; based on the network loss, training of the classification network is supervised.
Exemplary electronic device
Next, an electronic apparatus according to an embodiment of the present disclosure is described with reference to fig. 8. The electronic device may be either or both of the first device and the second device, or a stand-alone device separate from them, which stand-alone device may communicate with the first device and the second device to receive the acquired input signals therefrom.
FIG. 8 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure.
As shown in fig. 8, the electronic device 80 includes one or more processors 81 and memory 82.
The processor 81 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 80 to perform desired functions.
Memory 82 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by processor 81 to implement the identification number body identification methods of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device 80 may further include: an input device 83 and an output device 84, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
For example, when the electronic device is a first device or a second device, the input device 83 may be the microphone or the microphone array described above for capturing the input signal of the sound source. When the electronic device is a stand-alone device, the input means 83 may be a communication network connector for receiving the acquired input signals from the first device and the second device.
The input device 83 may include, for example, a keyboard, a mouse, and the like.
The output device 84 may output various information including the determined distance information, direction information, and the like to the outside. The output devices 84 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, for simplicity, only some of the components of the electronic device 80 relevant to the present disclosure are shown in fig. 8, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 80 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the identification number body identification method according to various embodiments of the present disclosure described in the above-mentioned "exemplary methods" section of this specification.
The computer program product may include program code for carrying out operations of embodiments of the present disclosure, written in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++ or the like and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps in the identification number body identification method according to various embodiments of the present disclosure described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments. It should be noted, however, that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the specific details disclosed above are provided for the purpose of illustration and description only and are not intended to be limiting; the disclosure is not limited to the specific details so described.
In the present specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the system embodiment basically corresponds to the method embodiment, its description is relatively brief; for the relevant points, reference may be made to the corresponding description of the method embodiment.
The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. As will be appreciated by those skilled in the art, these devices, apparatuses, and systems may be connected, arranged, and configured in any manner. Words such as "including," "comprising," and "having" are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the phrase "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

1. A method for identifying a body of an identification number, comprising:
receiving a first user identification number and a second user identification number of a main body to be identified;
determining at least one first path between the first user identification number and the second user identification number based on an identification relationship graph; wherein the identification relationship graph comprises a plurality of nodes with connection relationships, and each node corresponds to a user identification number;
determining a path feature corresponding to the at least one first path based on a meta-path set corresponding to the identification relationship graph; wherein the meta-path set comprises at least one meta-path, and each meta-path comprises a plurality of nodes connected by at least one edge relation type;
and inputting the path feature into a classification network, and determining, based on a target probability output by the classification network, whether the first user identification number and the second user identification number belong to the same body.
2. The method according to claim 1, wherein the determining the path feature corresponding to the at least one first path based on the meta-path set corresponding to the identification relationship graph comprises:
determining, for each path in the at least one first path, the number of nodes included in the path and at least one edge relation type among those nodes;
determining at least one meta-path from the meta-path set based on the number of nodes and the at least one edge relation type;
determining the path feature based on the determined at least one meta-path.
3. The method of claim 2, wherein determining the path feature based on the determined at least one meta-path comprises:
determining at least one weight value corresponding to each meta-path based on at least one edge relation type corresponding to the determined at least one meta-path; wherein each meta-path corresponds to at least one weight value;
determining at least one vector code corresponding to at least one second path based on the at least one weight value;
determining the path feature based on the at least one vector encoding.
4. The method according to claim 3, wherein determining at least one vector code corresponding to at least one second path based on the at least one weight value comprises:
in response to the number of weight values corresponding to a meta-path being smaller than n, adding 0 as a supplementary weight value so that the number of weight values corresponding to each meta-path is n, thereby obtaining n weight values corresponding to each meta-path; wherein n is the maximum number of weight values included in any meta-path in the meta-path set, and n is an integer greater than 1;
and taking the n weight values corresponding to the meta-path as the vector code of the second path corresponding to the meta-path.
5. The method according to any of claims 1-4, further comprising, prior to determining the at least one first path between the first user identification number and the second user identification number based on an identification relationship graph:
acquiring a plurality of user identification numbers with known attributes, at least one association among the plurality of user identification numbers, and time information corresponding to the at least one association;
processing the time information corresponding to the at least one association by using a decay function to obtain a plurality of weight values among the plurality of user identification numbers;
and establishing the identification relationship graph by taking the plurality of user identification numbers as nodes and the plurality of weight values as connection attributes.
6. The method according to any one of claims 1-4, further comprising, before determining the path feature corresponding to the at least one first path based on the meta-path set corresponding to the identification relationship graph:
performing a path search on the identification relationship graph, with a node corresponding to a global identification number as the initial node and n as the number of edges to search, to obtain at least one n-order path; wherein n is an integer greater than 1;
determining the meta-path set based on the at least one n-order path.
7. The method of claim 6, wherein determining the meta-path set based on the at least one n-order path comprises:
determining the edge relation types corresponding to the n edges included in each n-order path in the at least one n-order path;
performing a deduplication operation on the at least one n-order path based on the n edge relation types corresponding to each n-order path to obtain at least one deduplicated n-order meta-path;
and constructing the meta-path set based on the at least one n-order meta-path.
8. The method according to claim 7, wherein the determining the edge relation types corresponding to the n edges included in each of the at least one n-order path comprises:
for each edge of the n edges included in the n-order path, determining the attributes of the user identification numbers corresponding to the two nodes connected by the edge;
and determining the edge relation type corresponding to the edge based on the attributes of the user identification numbers corresponding to the two nodes connected by the edge.
9. A computer-readable storage medium, characterized in that the storage medium stores a computer program for executing the identification number body identification method according to any one of claims 1 to 8.
10. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the identification number body identification method of any one of claims 1 to 8.
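As a non-limiting illustration only, the steps recited in claims 4, 5, 7 and 8 may be sketched in Python as follows. The sketch is not the patented implementation. Every function name, the attribute labels ("phone", "device", "account"), the sample paths, the exponential half-life form of the decay function, and the assumption of one weight value per edge are hypothetical choices made for the example; the claims do not specify them.

```python
from typing import Dict, List, Sequence, Tuple

MetaPath = Tuple[str, ...]  # a meta-path represented as its sequence of edge relation types


def decay_weight(age_days: float, half_life_days: float = 30.0) -> float:
    """Claim 5 sketch: turn the age of an association into an edge weight.

    Exponential half-life decay is an assumption; the claim only says
    "a decay function".
    """
    return 0.5 ** (age_days / half_life_days)


def edge_relation_type(attr_a: str, attr_b: str) -> str:
    """Claim 8 sketch: derive an edge's type from the attributes of its two nodes.

    The ordered "a-b" label is a hypothetical encoding.
    """
    return f"{attr_a}-{attr_b}"


def build_meta_path_set(
    paths: Sequence[Sequence[str]], attrs: Dict[str, str]
) -> List[MetaPath]:
    """Claim 7 sketch: map each path to its edge-type sequence, then deduplicate."""
    meta_paths: List[MetaPath] = []
    for path in paths:
        types = tuple(
            edge_relation_type(attrs[a], attrs[b]) for a, b in zip(path, path[1:])
        )
        if types not in meta_paths:  # keep the first occurrence only
            meta_paths.append(types)
    return meta_paths


def encode(weights: Sequence[float], n: int) -> List[float]:
    """Claim 4 sketch: pad with 0 so every vector code has exactly n entries."""
    if len(weights) > n:
        raise ValueError("more weight values than n")
    return list(weights) + [0.0] * (n - len(weights))


# Hypothetical sample data: node attributes and three searched paths.
attrs = {"u1": "phone", "u2": "phone", "d1": "device", "d2": "device", "a1": "account"}
paths = [["u1", "d1", "u2"], ["u2", "d2", "u1"], ["u1", "a1"]]

meta_paths = build_meta_path_set(paths, attrs)
n = max(len(mp) for mp in meta_paths)  # assumes one weight value per edge
vector = encode([decay_weight(30.0)], n)  # a one-edge path padded to length n
```

In this example the first two paths share the same sequence of edge relation types and therefore collapse into a single meta-path, and the one-edge path is padded with 0 to the length of the longest meta-path, so that all vector codes fed to the classification network have the same dimension.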
CN202111266763.7A 2021-10-29 2021-10-29 Identification number body identification method, storage medium and electronic equipment Active CN113704566B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111266763.7A CN113704566B (en) 2021-10-29 2021-10-29 Identification number body identification method, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111266763.7A CN113704566B (en) 2021-10-29 2021-10-29 Identification number body identification method, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN113704566A true CN113704566A (en) 2021-11-26
CN113704566B CN113704566B (en) 2022-01-18

Family

ID=78647393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111266763.7A Active CN113704566B (en) 2021-10-29 2021-10-29 Identification number body identification method, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113704566B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117708821A (en) * 2024-02-06 2024-03-15 山东省计算中心(国家超级计算济南中心) Method, system, equipment and medium for detecting ransomware based on heterogeneous graph embedding

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740274A (en) * 2014-12-10 2016-07-06 阿里巴巴集团控股有限公司 Undirected graph-based user account searching method and device
CN109976881A (en) * 2017-12-28 2019-07-05 腾讯科技(深圳)有限公司 Transaction recognition method and apparatus, storage medium and electronic device
CN110555451A (en) * 2018-05-31 2019-12-10 北京京东尚科信息技术有限公司 information identification method and device
CN110929173A (en) * 2019-12-05 2020-03-27 深圳前海微众银行股份有限公司 Method, device, equipment and medium for identifying same person
CN111931485A (en) * 2020-08-12 2020-11-13 北京建筑大学 Multi-mode heterogeneous associated entity identification method based on cross-network representation learning
CN112488140A (en) * 2019-09-12 2021-03-12 北京国双科技有限公司 Data association method and device
CN112579797A (en) * 2021-02-20 2021-03-30 支付宝(杭州)信息技术有限公司 Service processing method and device for knowledge graph
CN112750030A (en) * 2021-01-11 2021-05-04 深圳前海微众银行股份有限公司 Risk pattern recognition method, risk pattern recognition device, risk pattern recognition equipment and computer readable storage medium
CN112989169A (en) * 2021-02-23 2021-06-18 腾讯科技(深圳)有限公司 Target object identification method, information recommendation method, device, equipment and medium
CN113065573A (en) * 2020-01-02 2021-07-02 阿里巴巴集团控股有限公司 User classification method, user classification device and electronic equipment
CN113297462A (en) * 2020-05-12 2021-08-24 阿里巴巴集团控股有限公司 Data processing method, device, equipment and storage medium
CN113312951A (en) * 2020-10-30 2021-08-27 阿里巴巴集团控股有限公司 Dynamic video target tracking system, related method, device and equipment
CN113383362A (en) * 2019-06-24 2021-09-10 深圳市欢太科技有限公司 User identification method and related product
CN113536252A (en) * 2021-07-21 2021-10-22 北京房江湖科技有限公司 Account identification method and computer-readable storage medium

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740274A (en) * 2014-12-10 2016-07-06 阿里巴巴集团控股有限公司 Undirected graph-based user account searching method and device
CN109976881A (en) * 2017-12-28 2019-07-05 腾讯科技(深圳)有限公司 Transaction recognition method and apparatus, storage medium and electronic device
CN110555451A (en) * 2018-05-31 2019-12-10 北京京东尚科信息技术有限公司 information identification method and device
CN113383362A (en) * 2019-06-24 2021-09-10 深圳市欢太科技有限公司 User identification method and related product
CN112488140A (en) * 2019-09-12 2021-03-12 北京国双科技有限公司 Data association method and device
CN110929173A (en) * 2019-12-05 2020-03-27 深圳前海微众银行股份有限公司 Method, device, equipment and medium for identifying same person
CN113065573A (en) * 2020-01-02 2021-07-02 阿里巴巴集团控股有限公司 User classification method, user classification device and electronic equipment
CN113297462A (en) * 2020-05-12 2021-08-24 阿里巴巴集团控股有限公司 Data processing method, device, equipment and storage medium
CN111931485A (en) * 2020-08-12 2020-11-13 北京建筑大学 Multi-mode heterogeneous associated entity identification method based on cross-network representation learning
CN113312951A (en) * 2020-10-30 2021-08-27 阿里巴巴集团控股有限公司 Dynamic video target tracking system, related method, device and equipment
CN112750030A (en) * 2021-01-11 2021-05-04 深圳前海微众银行股份有限公司 Risk pattern recognition method, risk pattern recognition device, risk pattern recognition equipment and computer readable storage medium
CN112579797A (en) * 2021-02-20 2021-03-30 支付宝(杭州)信息技术有限公司 Service processing method and device for knowledge graph
CN112989169A (en) * 2021-02-23 2021-06-18 腾讯科技(深圳)有限公司 Target object identification method, information recommendation method, device, equipment and medium
CN113536252A (en) * 2021-07-21 2021-10-22 北京房江湖科技有限公司 Account identification method and computer-readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117708821A (en) * 2024-02-06 2024-03-15 山东省计算中心(国家超级计算济南中心) Method, system, equipment and medium for detecting ransomware based on heterogeneous graph embedding
CN117708821B (en) * 2024-02-06 2024-04-30 山东省计算中心(国家超级计算济南中心) Method, system, equipment and medium for detecting ransomware based on heterogeneous graph embedding

Also Published As

Publication number Publication date
CN113704566B (en) 2022-01-18

Similar Documents

Publication Publication Date Title
CN107590255B (en) Information pushing method and device
WO2019076191A1 (en) Keyword extraction method and device, and storage medium and electronic device
CN110659657B (en) Method and device for training model
CN107291774B (en) Error sample identification method and device
CN113704566B (en) Identification number body identification method, storage medium and electronic equipment
CN115809887A (en) Method and device for determining main business range of enterprise based on invoice data
CN110059172B (en) Method and device for recommending answers based on natural language understanding
CN111435369B (en) Music recommendation method, device, terminal and storage medium
CN113947701B (en) Training method, object recognition method, device, electronic equipment and storage medium
KR101743169B1 (en) System and Method for Searching Missing Family Using Facial Information and Storage Medium of Executing The Program
US8918406B2 (en) Intelligent analysis queue construction
CN111310743B (en) Face recognition method and device, electronic equipment and readable storage medium
CN116340831B (en) Information classification method and device, electronic equipment and storage medium
CN116431912A (en) User portrait pushing method and device
CN113536252B (en) Account identification method and computer-readable storage medium
CN110377824B (en) Information pushing method and device, computer readable storage medium and electronic equipment
CN111814051B (en) Resource type determining method and device
CN114528908A (en) Network request data classification model training method, classification method and storage medium
CN110516717B (en) Method and apparatus for generating image recognition model
CN112801226A (en) Data screening method and device, computer readable storage medium and electronic equipment
CN112199978A (en) Video object detection method and device, storage medium and electronic equipment
CN112214387B (en) Knowledge graph-based user operation behavior prediction method and device
CN114547455B (en) Method and device for determining hot object, storage medium and electronic equipment
US11907658B2 (en) User-agent anomaly detection using sentence embedding
CN116911304B (en) Text recommendation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant