WO2024109454A1 - Label propagation method and apparatus for associated network, and computer readable storage medium

Info

Publication number
WO2024109454A1
WO2024109454A1 PCT/CN2023/127581 CN2023127581W WO2024109454A1 WO 2024109454 A1 WO2024109454 A1 WO 2024109454A1 CN 2023127581 W CN2023127581 W CN 2023127581W WO 2024109454 A1 WO2024109454 A1 WO 2024109454A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
label
network
nodes
label propagation
Prior art date
Application number
PCT/CN2023/127581
Other languages
French (fr)
Chinese (zh)
Inventor
刘红宝
何朔
高鹏飞
郑建宾
汤韬
邱震尧
Original Assignee
中国银联股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国银联股份有限公司
Publication of WO2024109454A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols

Definitions

  • the present invention belongs to the field of computers, and in particular relates to a label propagation method and device for an associated network, and a computer-readable storage medium.
  • the present invention provides the following solutions.
  • a label propagation method for an association network comprising: constructing a first association network based on first-party data, and constructing a second association network based on second-party data; associating the first association network and the second association network based on a secure intersection protocol to obtain a federated association network; iteratively performing multiple rounds of label propagation on the nodes of the federated association network; wherein each round of label propagation comprises: determining a label propagation probability between adjacent nodes in a federated association graph; for each node, determining a current round label for each node based on the current round label of the neighboring node and the label propagation probability of the neighboring node for the node.
  • associating the first associated network with the second associated network based on a secure intersection protocol to obtain a federated associated network further includes: performing encrypted intersection on the first-party data and the second-party data, determining a common node in the first associated network and the second associated network, and associating the first associated network with the second associated network according to the common node to obtain a federated associated network;
  • determining the label propagation probability between adjacent nodes in a federated association graph further includes: determining the edge weight $w_{ij}$ of the edge $ij$ between node $i$ and its neighbor node $j$ in the federated association network; determining the edge weight sum $\sum_j w_{ij}$ between node $i$ and all of its neighbor nodes $J$; and determining the label propagation probability $P_{ij}$ of neighbor node $j$ for node $i$ based on the ratio of the edge weight $w_{ij}$ to the edge weight sum $\sum_j w_{ij}$.
  • all neighbor nodes J represent all neighbor nodes in the graph where node i is located.
  • all neighbor nodes J represent the set of all neighbor nodes a of node i in the first associated network and all neighbor nodes b of node i in the second associated network.
  • the first party and the second party exchange the sum of the edge weights between node i and all neighbor nodes a in the first associated network and the sum of the edge weights between node i and all neighbor nodes b in the second associated network.
  • iteratively performing multiple rounds of label propagation on the nodes of the federated association network further includes: determining the labeled nodes and unlabeled nodes of the federated association network; updating the labels of the unlabeled nodes round by round until the labels of the unlabeled nodes no longer change and/or the number of update rounds exceeds a threshold; and keeping the labels of the labeled nodes unchanged.
  • the current round label of each node is determined based on the current round label of the neighboring nodes and the label propagation probability of the neighboring nodes to the node, including: for each node, determining the current round label of each neighboring node of the node, and the label propagation probability of each neighboring node to the node; among all the neighboring nodes of the node, calculating the sum of the label propagation probabilities corresponding to each label to obtain the label propagation aggregation probability corresponding to each label; updating the current round label of the node according to the label with the maximum label propagation aggregation probability.
  • the method further includes: the party where the node is located calculates the label propagation aggregation probability corresponding to each neighbor node label of the node.
  • the method also includes: the first party calculates the first-party label propagation aggregation probability corresponding to all neighbor node labels of the node in the first associated network; the second party calculates the second-party label propagation aggregation probability corresponding to all neighbor node labels of the node in the second associated network; the first party and the second party exchange the first-party label propagation aggregation probability and the second-party label propagation aggregation probability; the first party and the second party each perform label propagation probability aggregation again based on the exchanged information to obtain the label propagation aggregation probability corresponding to each label.
  • the method further includes: determining the graph weights of the first association network and the second association network according to the closeness of the node relationships between the first association network and the second association network; and introducing the graph weights during the interaction between the first association network and the second association network.
  • graph weights are introduced in the interaction process between the first association network and the second association network, including: if node $i$ is a common node, the edge weight sum $\sum_j w_{ij}$ is determined using the following formula:
  • $\sum_j w_{ij} = \lambda_a \sum_a w_{ia} + \lambda_b \sum_b w_{ib}$;
  • $\sum_a w_{ia}$ is the sum of the edge weights between node $i$ and all neighbor nodes $a$ of the first associated network
  • $\sum_b w_{ib}$ is the sum of the edge weights between node $i$ and all neighbor nodes $b$ of the second associated network
  • $\lambda_a$ is the graph weight of the first associated network
  • $\lambda_b$ is the graph weight of the second associated network.
  • introducing graph weights during the interaction between the first association network and the second association network also includes: after the first party and the second party exchange the first-party label propagation aggregation probability and the second-party label propagation aggregation probability, the label propagation probability is aggregated again based on the graph weights of the first association network and the second association network to obtain the label propagation aggregation probability corresponding to each label.
  • it further includes: if the first associated network and the second associated network are directed graph networks, only the incoming neighbor nodes of each node are used as neighbor nodes.
  • a label propagation device for an association network comprising: a graph construction module, used to construct a first association network based on first-party data, and to construct a second association network based on second-party data; a federated network module, used to associate the first association network with the second association network based on a secure intersection protocol to obtain a federated association network; a label propagation module, used to iteratively perform multiple rounds of label propagation on the nodes of the federated association network; wherein each round of label propagation includes: determining the label propagation probability between adjacent nodes in the federated association graph; for each node, determining the label of each node in this round according to the labels of the neighbor nodes in this round and the label propagation probability of the neighbor nodes for the node.
  • a label propagation device for an associated network comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute: the method according to the first aspect.
  • a computer-readable storage medium stores a program, and when the program is executed by a multi-core processor, the multi-core processor executes the method of the first aspect.
  • One of the advantages of the above implementation is that it can achieve label propagation across platform networks while protecting private data.
  • FIG1 is a schematic diagram of the structure of a label propagation device for an associated network according to an embodiment of the present invention
  • FIG2 is a schematic diagram of a flow chart of a label propagation method for an association network according to an embodiment of the present invention
  • FIG3 is a schematic diagram of a first association network and a second association network according to an embodiment of the present invention.
  • FIG4 is a schematic diagram of a federated association network according to an embodiment of the present invention.
  • FIG5 is a schematic diagram of determining the label propagation probability of the first association network and the second association network according to an embodiment of the present invention
  • FIG6 is a schematic diagram of determining a label propagation probability of a federated association network according to an embodiment of the present invention.
  • FIG7 is a schematic diagram of label propagation of an association network according to an embodiment of the present invention.
  • FIG8 is a schematic diagram of label propagation of an association network according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a label propagation device for an associated network according to an embodiment of the present invention.
  • A/B can mean A or B.
  • the “and/or” in this article merely describes the association relationship of associated objects, indicating that three relationships can exist.
  • For example, A and/or B can mean: A exists alone, A and B exist at the same time, or B exists alone.
  • The terms “first”, “second”, etc. are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as “first”, “second”, etc. may explicitly or implicitly include one or more of the feature. In the description of the embodiments of the present application, unless otherwise specified, “plurality” means two or more.
  • FIG1 shows a schematic diagram of an example of a computing device 100 according to an embodiment of the present disclosure. It should be noted that FIG1 is a schematic diagram of the structure of the hardware operating environment of the label propagation method of the associated network.
  • the device of the label propagation method based on the associated network in the embodiment of the present invention can be a terminal device such as a PC, a portable computer, etc.
  • the label propagation method device of the associated network may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002.
  • the communication bus 1002 is used to realize the connection and communication between these components.
  • the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the memory 1005 may be a high-speed RAM memory, or a stable memory (non-volatile memory), such as a disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
  • the label propagation device structure of the associated network shown in FIG1 does not constitute a limitation on the label propagation method device of the associated network, and may include more or fewer components than shown in the figure, or a combination of certain components, or a different arrangement of components.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a label propagation method program for an associated network.
  • the operating system is a program for managing and controlling the hardware and software resources of the label propagation device for the associated network, and supports the operation of the label propagation program for the associated network and other software or programs.
  • the user interface 1003 is mainly used to receive requests, data, etc. sent by the first terminal, the second terminal, and the supervision terminal;
  • the network interface 1004 is mainly used to connect to the backend server and exchange data with it;
  • the processor 1001 can be used to call the label propagation program of the associated network stored in the memory 1005, and perform the following operations:
  • a first association network is constructed based on first-party data
  • a second association network is constructed based on second-party data
  • the first association network and the second association network are associated based on a secure intersection protocol to obtain a federated association network
  • multiple rounds of label propagation are iteratively performed on the nodes of the federated association network; wherein each round of label propagation includes: determining the label propagation probability between adjacent nodes in the federated association graph; for each node, determining the current round label of each node according to the current round label of the neighboring node and the label propagation probability of the neighboring node for the node.
  • the two parties only need to exchange non-private data such as label propagation probabilities to carry out joint computation over, and applications of, a cross-institutional data association network, without releasing the original data of either party.
  • FIG. 2 shows a flow chart of a label propagation method 200 for an association network according to an embodiment of the present disclosure.
  • the method may be performed, for example, by the computing device 100 shown in FIG1. It should be understood that the method 200 may also include additional blocks not shown and/or may omit the blocks shown, and the scope of the present disclosure is not limited in this respect.
  • Step 210 constructing a first association network based on the first-party data, and constructing a second association network based on the second-party data;
  • Party A and Party B form nodes and edges in the association network based on their own data.
  • the party A association network, formed from the transfer data between users, is a transfer association network: the user's mobile phone number is the node in the association network, nodes with a transfer relationship are connected, and the transfer amount is the edge weight between nodes.
  • Party B is the operator.
  • the call data between users forms the party B association network, which is the call association network.
  • the user's mobile phone number is the node in the association network.
  • the nodes with call records are connected, and the number of calls is the edge weight value between nodes.
  • the edge weight values of each association network can be normalized.
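  • As a minimal sketch of Step 210, the snippet below builds one party's association network from pairwise records and normalizes its edge weights. The record layout, the function names, the sample phone numbers, and the choice of dividing by the largest weight are all assumptions for illustration; the text only states that edge weights can be normalized.

```python
from collections import defaultdict

def build_association_network(records):
    """Build an undirected weighted graph from (node_u, node_v, value) records.

    Repeated records between the same pair are summed into one edge weight.
    """
    graph = defaultdict(dict)
    for u, v, value in records:
        graph[u][v] = graph[u].get(v, 0.0) + value
        graph[v][u] = graph[v].get(u, 0.0) + value
    return dict(graph)

def normalize_edge_weights(graph):
    """Scale all edge weights into (0, 1] by dividing by the largest weight (assumed scheme)."""
    max_w = max(w for nbrs in graph.values() for w in nbrs.values())
    return {u: {v: w / max_w for v, w in nbrs.items()} for u, nbrs in graph.items()}

# Hypothetical transfer records for party A: (payer phone, payee phone, amount).
transfers = [
    ("138-0001", "138-0002", 500.0),
    ("138-0002", "138-0003", 250.0),
    ("138-0001", "138-0002", 500.0),
]
network_a = normalize_edge_weights(build_association_network(transfers))
```

  • Party B could build its call association network the same way, using the number of calls as the record value.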
  • Step 220 associating the first associated network and the second associated network based on a secure intersection protocol to obtain a federated associated network
  • the above step 220 further includes: encrypting and intersecting the first-party data and the second-party data, determining common nodes in the first association network and the second association network, and associating the first association network and the second association network according to the common nodes to obtain a federated association network.
  • a secure intersection algorithm (such as a privacy intersection algorithm based on RSA+HASH) can be used to securely intersect the node data of the two parties, and the common nodes can be found without exposing the original data, thereby forming a virtual federated association network.
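  • To make the idea of an RSA+HASH style intersection concrete, here is a toy sketch of a blind-signature private set intersection. It is not the protocol of this application: the key is built from two small Mersenne primes purely so the example runs, and all function names and sample identifiers are hypothetical. A real deployment would need a vetted library and proper key sizes.

```python
import hashlib
import secrets
from math import gcd

# Toy RSA key from two Mersenne primes; far too small and structured for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, held by party B only

def hash_to_int(item):
    """Hash an identifier into the range [0, N) (illustrative only)."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % N

def tag(signature):
    """Second hash applied to an RSA signature before it is compared."""
    return hashlib.sha256(str(signature).encode()).hexdigest()

def party_b_publish(ids_b):
    """Party B signs its own hashed IDs and publishes only the resulting tags."""
    return {tag(pow(hash_to_int(x), D, N)) for x in ids_b}

def party_a_blind(ids_a):
    """Party A blinds each hashed ID with a random factor r before sending it."""
    blinded, factors = [], []
    for x in ids_a:
        r = secrets.randbelow(N - 2) + 2
        while gcd(r, N) != 1:
            r = secrets.randbelow(N - 2) + 2
        blinded.append(hash_to_int(x) * pow(r, E, N) % N)
        factors.append(r)
    return blinded, factors

def party_b_sign(blinded):
    """Party B signs blinded values without seeing the underlying IDs."""
    return [pow(m, D, N) for m in blinded]

def party_a_intersect(ids_a, signed, factors, tags_b):
    """Party A removes the blinding and keeps IDs whose tags B also published."""
    common = set()
    for x, s, r in zip(ids_a, signed, factors):
        unblinded = s * pow(r, -1, N) % N
        if tag(unblinded) in tags_b:
            common.add(x)
    return common

ids_a = ["13800000001", "13800000002", "13800000003"]
ids_b = ["13800000002", "13800000003", "13800000004"]
blinded, factors = party_a_blind(ids_a)
common_nodes = party_a_intersect(ids_a, party_b_sign(blinded), factors, party_b_publish(ids_b))
```

  • In this sketch party A learns only which of its own identifiers are shared; the resulting common nodes are what link the two association networks.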
  • the first association network and the second association network in FIG3 can be associated to obtain the federated association network.
  • Va represents the node of party A
  • Vb represents the node of party B
  • Vab represents the common node of both parties
  • Vab1, Vab2, and Vab3 are the common nodes of both parties.
  • Taking Vab1 as an example: from the perspective of party A's association network alone, Vab1 has 2 neighbor nodes, whereas from the perspective of the global data, Vab1 has 4 neighbor nodes.
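  • The difference between the local and the global view of a common node can be illustrated as follows. The adjacency data and neighbor names are hypothetical and only reproduce the neighbor counts quoted for Vab1; in the scheme itself the joint view is virtual and is not materialized by either party.

```python
def federated_neighbors(node, graph_a, graph_b):
    """Return the neighbors of `node` as seen by A alone, by B alone, and jointly."""
    nbrs_a = set(graph_a.get(node, {}))
    nbrs_b = set(graph_b.get(node, {}))
    return nbrs_a, nbrs_b, nbrs_a | nbrs_b

# Hypothetical adjacency maps {node: {neighbor: edge_weight}} for each party.
graph_a = {"Vab1": {"Va1": 1.0, "Va2": 1.0}}
graph_b = {"Vab1": {"Vb1": 1.0, "Vb2": 1.0}}
local_a, local_b, joint = federated_neighbors("Vab1", graph_a, graph_b)
```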
  • Step 230 iteratively performing multiple rounds of label propagation on the nodes of the federated association network
  • the nodes of the federated association network may include labeled nodes and unlabeled nodes.
  • all nodes in the first association network are labeled nodes, and all nodes in the second association network are unlabeled nodes. And so on.
  • the above-mentioned step 230 may further include the following steps: first, determining the labeled nodes and unlabeled nodes of the federated association network; updating the labels of the unlabeled nodes round by round until the labels of the unlabeled nodes no longer change and/or the number of update rounds exceeds a threshold; and keeping the labels of the labeled nodes unchanged. In this way, the labels of the original samples can be kept unchanged, and the accuracy of label propagation can be guaranteed.
  • the labels of the labeled nodes can also be dynamically updated round by round, that is, the labels of all nodes in the federated association network are updated round by round until the labels of the nodes no longer change and/or the number of update rounds exceeds a threshold.
  • the original labels can be corrected and hidden risk labels can be mined.
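  • The round-by-round update with fixed labeled nodes can be sketched on a single in-memory graph as below. The function name, the synchronous update order, the default round threshold, and the three-node example are assumptions; tie-breaking between equally likely labels is not specified in the text and here simply follows dictionary order.

```python
def propagate_labels(graph, labels, fixed, max_rounds=20):
    """Run synchronous label propagation on a weighted graph.

    graph:  {node: {neighbor: edge_weight}}
    labels: {node: label}, initial labels for every node
    fixed:  set of labeled nodes whose labels are kept unchanged
    Returns the final labels and the number of rounds executed.
    """
    labels = dict(labels)
    for round_no in range(1, max_rounds + 1):
        new_labels = dict(labels)
        for node, nbrs in graph.items():
            if node in fixed or not nbrs:
                continue
            total = sum(nbrs.values())
            scores = {}
            for nbr, w in nbrs.items():
                # Each neighbor contributes its propagation probability to its label.
                scores[labels[nbr]] = scores.get(labels[nbr], 0.0) + w / total
            new_labels[node] = max(scores, key=scores.get)
        if new_labels == labels:
            return labels, round_no  # converged: no label changed this round
        labels = new_labels
    return labels, max_rounds

# Hypothetical chain n1 - n2 - n5, where n5 is a known risk node with label "1".
graph = {"n1": {"n2": 1.0}, "n2": {"n1": 1.0, "n5": 8.0}, "n5": {"n2": 8.0}}
final_labels, rounds = propagate_labels(
    graph, {"n1": "0", "n2": "0", "n5": "1"}, fixed={"n5"}
)
```

  • Passing an empty `fixed` set corresponds to the variant in which the labels of all nodes, including the originally labeled ones, are updated each round.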
  • each round of label propagation specifically includes the following steps 231-232:
  • Step 231 determining the label propagation probability between adjacent nodes in the federated association graph
  • the above step 231 may specifically include:
  • all neighbor nodes J represent all neighbor nodes in the graph where node i is located.
  • all neighbor nodes J represent a set of all neighbor nodes a of node i in the first associated network and all neighbor nodes b of node i in the second associated network.
  • the first party and the second party exchange the sum of the edge weights $\sum_a w_{ia}$ between node $i$ and all neighbor nodes $a$ of the first associated network, and the sum of the edge weights $\sum_b w_{ib}$ between node $i$ and all neighbor nodes $b$ of the second associated network.
  • the first party and the second party can each calculate the sum of the edge weights $\sum_j w_{ij}$ between node $i$ and all its neighbor nodes $J$ based on the exchanged edge weight sums of the two parties.
  • both the first party and the second party may determine the sum of the edge weights $\sum_j w_{ij}$ using the following formula:
  • $\sum_j w_{ij} = \sum_a w_{ia} + \sum_b w_{ib}$;
  • $\sum_a w_{ia}$ is the sum of the edge weights between node $i$ and all neighboring nodes $a$ in the first associated network
  • $\sum_b w_{ib}$ is the sum of the edge weights between node $i$ and all neighboring nodes $b$ in the second associated network.
  • the impact of business scenarios on the closeness of node relationships can be further considered.
  • a transfer relationship is a strong relationship
  • a call relationship is a weak relationship. Therefore, when calculating the label propagation probability of an edge, the strength of the edge relationship under different business scenarios of both parties can be considered for weighted aggregation.
  • both the first party and the second party can determine the weighted edge weight sum $\sum_j w_{ij}$ using the following formula:
  • $\sum_j w_{ij} = \lambda_a \sum_a w_{ia} + \lambda_b \sum_b w_{ib}$;
  • $\lambda_a$ is the graph weight of the first association network
  • $\lambda_b$ is the graph weight of the second association network
  • the label propagation probability of each edge of the federated association network is $P_{ij} = w_{ij} / \sum_j w_{ij}$, where:
  • $w_{ij}$ represents the weight value of edge $ij$.
  • $J$ represents the neighbor nodes of node $i$; for shared nodes, $J$ represents all neighbor nodes of node $i$ on both sides A and B.
  • the calculation logic of $\sum_j w_{ij}$ is that Party A calculates the sum of the edge weights between its own node $i$ and its neighboring nodes as $\sum_a w_{ia}$; Party B calculates the sum of the edge weights between its own node $i$ and its neighboring nodes as $\sum_b w_{ib}$.
  • node Vab2 is taken as an example.
  • on side A, the label propagation probability of its neighbor node for node Vab2 is 1/15; on side B, similarly, the label propagation probabilities of its neighboring nodes for node Vab2 are 2/15, 4/15, and 8/15 respectively.
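  • The Vab2 example can be reproduced with exact fractions. In this sketch the edge weights 1, 2, 4 and 8 are assumed values chosen only so that the resulting probabilities match the ones quoted, and the function and node names are hypothetical. Each party sums its own edges, the two partial sums are exchanged, and each party then divides its local edge weights by the joint sum.

```python
from fractions import Fraction

def local_weight_sum(local_edges):
    """Each party sums the weights of node i's edges in its own network."""
    return sum(local_edges.values())

def propagation_probabilities(local_edges, joint_sum):
    """P_ij = w_ij / sum_j w_ij, using the jointly agreed denominator."""
    return {j: Fraction(w, joint_sum) for j, w in local_edges.items()}

# Assumed integer edge weights of Vab2 on each side.
edges_a = {"node1": 1}
edges_b = {"node3": 2, "node4": 4, "node5": 8}

# The parties exchange only their partial sums, not the individual edges.
joint_sum = local_weight_sum(edges_a) + local_weight_sum(edges_b)
probs_a = propagation_probabilities(edges_a, joint_sum)
probs_b = propagation_probabilities(edges_b, joint_sum)
```

  • Because both parties use the same denominator, the probabilities over all neighbors of the shared node sum to 1 even though neither party sees the other's edges.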
  • Step 232 for each node, determine the current round label of each node according to the current round labels of neighboring nodes and the label propagation probability of neighboring nodes for the node.
  • Node 5 is a risk node, displayed as a white node, and its label is set to "1"; the remaining nodes are unknown nodes, displayed as gray nodes, and their labels are set to "0".
  • the label of node 5 is always "1", and the labels of the remaining nodes are updated round by round until the labels of all nodes no longer change or exceed the update round threshold.
  • step 232 further includes the following steps:
  • Step 2321 for node i, determine the current round label of each neighbor node of node i, and the label propagation probability of each neighbor node for node i;
  • Step 2322 among all neighboring nodes of node i, calculate the sum of the label propagation probabilities corresponding to each label to obtain the label propagation aggregation probability corresponding to each label;
  • node i is a non-shared node
  • only the party where node i is located calculates the label propagation aggregation probability corresponding to each neighbor node label of node i.
  • node i is a shared node
  • the following steps are performed: first, the first party calculates the first-party label propagation aggregation probability corresponding to all neighbor node labels of node i in the first associated network, and the second party calculates the second-party label propagation aggregation probability corresponding to all neighbor node labels of node i in the second associated network; secondly, the first party and the second party exchange the first-party label propagation aggregation probability and the second-party label propagation aggregation probability; finally, the first party and the second party each perform label propagation probability aggregation again based on the exchanged information to obtain the label propagation aggregation probability corresponding to each label.
  • Step 2323 Update the current round label of node i according to the label with the maximum label propagation aggregation probability.
  • the node label update rules shown in the above steps 2321 to 2323 may include the following specific steps:
  • the label propagation aggregation probability of the neighbor nodes of party A is calculated as <“0”, 1/15>, where “0” represents the risk-free label and 1/15 represents the label propagation aggregation probability corresponding to label “0”. Party A's Vab2 has only one neighbor node, node 1, whose initial label value is “0”, and the label propagation probability of node 1 to node Vab2 was calculated above as 1/15. Therefore, party A's Vab2 node has only one propagable label, “0”, and the label propagation aggregation probability corresponding to that label is 1/15.
  • the label propagation probability of node 3 to node Vab2 is 2/15
  • the label propagation probability of node 4 to node Vab2 is 4/15
  • the two parties exchange label propagation aggregation probabilities and add up the label propagation aggregation probabilities corresponding to the same label. For label “0”, party A contributes 1/15 and party B contributes 2/15 + 4/15 = 6/15. The label propagation aggregation probabilities of node Vab2 are therefore <“0”, 7/15> and <“1”, 8/15>, that is, the label propagation aggregation probability corresponding to the propagable label “0” is 7/15, and the label propagation aggregation probability corresponding to the propagable label “1” is 8/15.
  • the updated node label distribution diagram of the federated association network is shown in Figure 8, where nodes Vab1 and Vab2 are both updated to label “1”.
  • the next round of label propagation continues until the node label no longer changes or the number of propagation rounds exceeds a certain threshold.
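  • Steps 2321 to 2323 for a shared node can be sketched as follows, using the Vab2 figures from the text. The function names are hypothetical, and assigning label “1” to a third party B neighbor with probability 8/15 (called `node5` here) is an inference from the quoted aggregate, not something the text states directly.

```python
from fractions import Fraction

def aggregate_by_label(neighbor_labels, neighbor_probs):
    """Sum propagation probabilities per label over one party's neighbors."""
    agg = {}
    for nbr, label in neighbor_labels.items():
        agg[label] = agg.get(label, Fraction(0)) + neighbor_probs[nbr]
    return agg

def merge_aggregates(agg_a, agg_b):
    """Add the two parties' per-label aggregates after they are exchanged."""
    merged = dict(agg_a)
    for label, p in agg_b.items():
        merged[label] = merged.get(label, Fraction(0)) + p
    return merged

# Party A: one neighbor with label "0".
agg_a = aggregate_by_label({"node1": "0"}, {"node1": Fraction(1, 15)})
# Party B: two neighbors with label "0" and one (assumed) risk neighbor with label "1".
agg_b = aggregate_by_label(
    {"node3": "0", "node4": "0", "node5": "1"},
    {"node3": Fraction(2, 15), "node4": Fraction(4, 15), "node5": Fraction(8, 15)},
)
merged = merge_aggregates(agg_a, agg_b)
new_label = max(merged, key=merged.get)  # label with the largest aggregate
```

  • Only the per-label aggregates cross the party boundary in this sketch; the individual neighbors and their labels stay local.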
  • the graph weights of the first association network and the second association network are determined according to the closeness of the node relationships between the first association network and the second association network; and the graph weights are introduced during the interaction between the first association network and the second association network.
  • the first association network and the second association network can be determined to be strongly or weakly associated according to the business scenario, and then the graph weights of the first association network and the second association network can be introduced when calculating the label propagation probability of the edge.
  • the graph weights of the first association network and the second association network can also be introduced when calculating the label propagation aggregation probability of each label, and this application does not impose specific restrictions on this.
  • graph weights are introduced into the interaction process between the first association network and the second association network, including at least the following two introduction methods:
  • the edge weight sum $\sum_j w_{ij}$ is determined using the following formula:
  • $\sum_j w_{ij} = \lambda_a \sum_a w_{ia} + \lambda_b \sum_b w_{ib}$;
  • $\sum_a w_{ia}$ is the sum of the edge weights between node $i$ and all neighbor nodes $a$ of the first associated network
  • $\sum_b w_{ib}$ is the sum of the edge weights between node $i$ and all neighbor nodes $b$ of the second associated network
  • $\lambda_a$ is the graph weight of the first associated network
  • $\lambda_b$ is the graph weight of the second associated network.
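  • A small numeric sketch of the graph-weighted sum follows. The graph weights 2 and 1, the edge weights, and the function names are hypothetical. The text gives only the weighted denominator; scaling each edge's numerator by the same graph weight is an assumption made here so that the probabilities of a common node still sum to 1.

```python
from fractions import Fraction

def weighted_joint_sum(sum_a, sum_b, weight_a, weight_b):
    """Graph-weighted edge weight sum of a common node across both networks."""
    return weight_a * sum_a + weight_b * sum_b

def weighted_probabilities(local_edges, graph_weight, joint_sum):
    """Assumed rule: P_ij = graph_weight * w_ij / weighted joint sum."""
    return {j: Fraction(graph_weight * w, joint_sum) for j, w in local_edges.items()}

# Hypothetical edge weights for a common node, and hypothetical graph weights:
# the transfer network (A) is treated as a stronger relationship than the call network (B).
edges_a = {"node1": 1}
edges_b = {"node3": 2, "node4": 4, "node5": 8}
weight_a, weight_b = 2, 1

joint = weighted_joint_sum(sum(edges_a.values()), sum(edges_b.values()), weight_a, weight_b)
probs_a = weighted_probabilities(edges_a, weight_a, joint)
probs_b = weighted_probabilities(edges_b, weight_b, joint)
```

  • With these assumed weights the single party A edge carries probability 1/8 instead of 1/15, which shows how a larger graph weight increases the influence of one party's relationships.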
  • it further includes: if the first associated network and the second associated network are directed graph networks, only the incoming neighbor nodes of each node are used as neighbor nodes. For example, for a directed graph, when calculating the propagation probability of a node, only the incoming neighbor nodes of the target node may be considered. The specific judgment may be made in combination with the business scenario.
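  • For the directed case, restricting propagation to incoming neighbors can be sketched as below. The edge-list layout `(source, target, weight)`, the function names, and the sample edges are assumptions for illustration.

```python
def incoming_neighbors(directed_edges, node):
    """Collect {source: weight} for every edge that points into `node`."""
    incoming = {}
    for src, dst, w in directed_edges:
        if dst == node:
            incoming[src] = incoming.get(src, 0.0) + w
    return incoming

def incoming_probabilities(directed_edges, node):
    """Propagation probabilities computed over incoming neighbors only."""
    incoming = incoming_neighbors(directed_edges, node)
    total = sum(incoming.values())
    return {src: w / total for src, w in incoming.items()} if total else {}

# Hypothetical directed transfers: (payer, payee, amount).
edges = [("u1", "u2", 3.0), ("u3", "u2", 1.0), ("u2", "u4", 5.0)]
probs_u2 = incoming_probabilities(edges, "u2")
```

  • Here `u4` receives from `u2` but does not propagate a label to it, because the edge points away from `u2`.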
  • the embodiment of the present invention further provides a label propagation device for an associated network, which is used to execute the label propagation method for an associated network provided by any of the above embodiments.
  • FIG9 is a schematic diagram of the structure of a label propagation device for an associated network provided by an embodiment of the present invention.
  • the apparatus 900 includes:
  • a graph construction module 910 configured to construct a first association network based on first-party data and a second association network based on second-party data;
  • a federated network module 920 configured to associate the first associated network with the second associated network based on a secure intersection protocol to obtain a federated associated network
  • the label propagation module 930 is used to iteratively perform multiple rounds of label propagation on the nodes of the federated association network; wherein each round of label propagation includes: determining the label propagation probability between adjacent nodes in the federated association graph; for each node, determining the label of each node in this round according to the label of the neighboring node in this round and the label propagation probability of the neighboring node for the node.
  • the device in the implementation mode of the present application can implement each process of the implementation mode of the aforementioned method and achieve the same effects and functions, which will not be repeated here.
  • a non-volatile computer storage medium of a label propagation method for an association network on which computer executable instructions are stored, and the computer executable instructions are configured to execute the method described in the above embodiments when executed by a processor.
  • the apparatus, equipment and computer-readable storage medium provided in the embodiments of the present application correspond one-to-one to the method. Therefore, the apparatus, equipment and computer-readable storage medium also have similar beneficial technical effects as the corresponding methods. Since the beneficial technical effects of the methods have been described in detail above, the beneficial technical effects of the apparatus, equipment and computer-readable storage medium will not be repeated here.
  • the embodiments of the present invention may be provided as methods, devices (equipment or system), or computer-readable storage media. Therefore, the present invention may be implemented in the form of a complete hardware implementation, a complete software implementation, or an implementation combining software and hardware. Moreover, the present invention may be implemented in the form of a computer-readable storage medium implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • each process and/or box in the flowchart and/or block diagram, and the combination of the process and/or box in the flowchart and/or block diagram can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the function specified in one process or multiple processes in the flowchart and/or one box or multiple boxes in the block diagram.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
  • a computing device includes one or more processors (CPU), input/output interfaces, network interfaces, and memory.
  • Memory may include non-permanent storage in a computer-readable medium, random access memory (RAM) and/or non-volatile memory in the form of read-only memory (ROM) or flash RAM.
  • Memory is an example of a computer-readable medium.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media that can be used to store information by any method or technology.
  • Information can be computer-readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Algebra (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a label propagation method and apparatus for an associated network, and a computer-readable storage medium. The method comprises: constructing a first associated network on the basis of first-party data, and constructing a second associated network on the basis of second-party data; associating the first associated network and the second associated network on the basis of a secure intersection protocol to obtain a federated associated network; and iteratively performing multiple rounds of label propagation on nodes of the federated associated network, wherein each round of label propagation comprises: determining the label propagation probabilities between adjacent nodes in a federated associated graph; and, for each node, determining the current-round label of the node according to the current-round labels of its neighboring nodes and the label propagation probabilities of the neighboring nodes for the node. By using the method, label propagation across platform networks can be realized while private data is protected.

Description

A label propagation method and apparatus for an associated network, and a computer-readable storage medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese patent application No. 202211492068.7, filed with the China Patent Office on November 25, 2022 and entitled “A label propagation method and apparatus for an associated network, and a computer-readable storage medium”, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention belongs to the field of computers, and in particular relates to a label propagation method and apparatus for an associated network, and a computer-readable storage medium.
Background
This section is intended to provide a background or context to embodiments of the invention that are recited in the claims. No description herein is admitted to be prior art by inclusion in this section.
As privacy protection laws become stricter, data cooperation between institutions increasingly has to take data privacy into account. Current privacy-preserving computation techniques mainly focus on scenarios such as federated learning, secure intersection and anonymous query, all of which perform joint computation on single-point data. Existing label propagation algorithms for association networks, by contrast, compute only on local data sources. Cross-institution joint use of data under privacy protection therefore cannot be achieved: label updates cannot draw on the association network data of both parties at the same time, and the value of the data is not used efficiently.
Therefore, how to achieve label propagation over a federated network under the premise of privacy protection is an urgent problem to be solved.
Summary of the Invention
In view of the above problems in the prior art, a label propagation method and apparatus for an associated network, and a computer-readable storage medium, are proposed, with which the above problems can be solved.
The present invention provides the following solutions.
In a first aspect, a label propagation method for an association network is provided, comprising: constructing a first association network based on first-party data, and constructing a second association network based on second-party data; associating the first association network and the second association network based on a secure intersection protocol to obtain a federated association network; and iteratively performing multiple rounds of label propagation on the nodes of the federated association network; wherein each round of label propagation comprises: determining the label propagation probabilities between adjacent nodes in the federated association graph; and, for each node, determining the current-round label of the node based on the current-round labels of its neighbor nodes and the label propagation probabilities of the neighbor nodes for the node.
In one embodiment, associating the first associated network with the second associated network based on a secure intersection protocol to obtain a federated associated network further includes: performing encrypted intersection on the first-party data and the second-party data, determining the common nodes of the first associated network and the second associated network, and associating the first associated network with the second associated network according to the common nodes to obtain the federated associated network.
In one embodiment, determining the label propagation probabilities between adjacent nodes in the federated association graph further includes: determining the edge weight $w_{ij}$ of the edge ij between node i and its neighbor node j in the federated association network; determining the edge-weight sum $\sum_j w_{ij}$ between node i and all of its neighbor nodes J; and determining the label propagation probability $P_{ij}$ of neighbor node j for node i based on the ratio of the edge weight $w_{ij}$ to the edge-weight sum $\sum_j w_{ij}$, that is, $P_{ij} = w_{ij} / \sum_j w_{ij}$.
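As a non-limiting illustration, the ratio described above may be sketched in Python as follows. The function name `propagation_probabilities` and the dictionary layout (neighbor identifier mapped to edge weight) are assumptions made for this sketch and do not appear in the original disclosure.

```python
from typing import Dict, Hashable


def propagation_probabilities(edge_weights: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """Return P_ij for every neighbor j of a single node i.

    edge_weights maps neighbor j -> w_ij (hypothetical input layout).
    """
    total = sum(edge_weights.values())  # edge-weight sum over all neighbors J
    if total <= 0:
        raise ValueError("node needs at least one positively weighted edge")
    # P_ij is the share of the total edge weight carried by edge ij.
    return {j: w / total for j, w in edge_weights.items()}


# Node i with three neighbors whose edge weights are 2, 1 and 1.
probs = propagation_probabilities({"j1": 2.0, "j2": 1.0, "j3": 1.0})
```

In this example neighbor j1 carries half of the total edge weight, so its label propagation probability for node i is 0.5.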
In one implementation, if node i is a non-shared node, all neighbor nodes J denote all neighbor nodes of node i in the graph where node i is located.
In one embodiment, if node i is a common node, all neighbor nodes J denote the union of the set of all neighbor nodes a of node i in the first associated network and the set of all neighbor nodes b of node i in the second associated network.
In one embodiment, if node i is a common node, the first party and the second party exchange the edge-weight sum between node i and all of its neighbor nodes a in the first associated network and the edge-weight sum between node i and all of its neighbor nodes b in the second associated network.
In one embodiment, the method further includes: if node i is a common node, determining the edge-weight sum $\sum_j w_{ij}$ using the following formula: $\sum_j w_{ij} = \sum_a w_{ia} + \sum_b w_{ib}$; wherein $\sum_a w_{ia}$ is the sum of the edge weights between node i and all neighbor nodes a in the first associated network, and $\sum_b w_{ib}$ is the sum of the edge weights between node i and all neighbor nodes b in the second associated network.
In one embodiment, iteratively performing multiple rounds of label propagation on the nodes of the federated association network further includes: determining the labeled nodes and unlabeled nodes of the federated association network; updating the labels of the unlabeled nodes round by round until the labels of the unlabeled nodes no longer change and/or an update round threshold is exceeded; and keeping the labels of the labeled nodes unchanged.
In one embodiment, determining the current-round label of each node based on the current-round labels of the neighbor nodes and the label propagation probabilities of the neighbor nodes for the node includes: for each node, determining the current-round label of each neighbor node of the node and the label propagation probability of each neighbor node for the node; among all the neighbor nodes of the node, summing the label propagation probabilities corresponding to each label to obtain the label propagation aggregation probability corresponding to each label; and updating the current-round label of the node to the label having the maximum label propagation aggregation probability.
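The per-node update described above may, for example, be sketched as follows. This is only a minimal sketch: the names `aggregate_by_label` and `choose_label`, the treatment of unlabeled neighbors (they contribute nothing) and the tie-breaking rule (smallest label in sorted order) are assumptions of the sketch, not features stated in the disclosure.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional


def aggregate_by_label(neighbor_labels: Dict[Hashable, Optional[str]],
                       neighbor_probs: Dict[Hashable, float]) -> Dict[str, float]:
    """Sum the propagation probabilities of all neighbors that carry the same label."""
    aggregated = defaultdict(float)
    for j, label in neighbor_labels.items():
        if label is None:
            continue  # assumption: an unlabeled neighbor contributes nothing
        aggregated[label] += neighbor_probs[j]
    return dict(aggregated)


def choose_label(aggregated: Dict[str, float]) -> Optional[str]:
    """Pick the label with the largest aggregated probability."""
    if not aggregated:
        return None  # no labeled neighbor yet, so the node stays unlabeled
    # Iterating over sorted labels makes ties deterministic (assumption).
    return max(sorted(aggregated), key=aggregated.get)


agg = aggregate_by_label(
    {"j1": "risk", "j2": "normal", "j3": "risk"},
    {"j1": 0.4, "j2": 0.35, "j3": 0.25},
)
new_label = choose_label(agg)
```

Here the two "risk" neighbors together outweigh the single "normal" neighbor, so the node takes the label "risk" in this round.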
In one implementation, if the node is a non-shared node, the method further includes: the party where the node is located calculates the label propagation aggregation probability corresponding to each neighbor-node label of the node.
In one embodiment, if the node is a shared node, the method further includes: the first party calculates the first label propagation aggregation probability corresponding to the labels of all neighbor nodes of the node in the first associated network; the second party calculates the second label propagation aggregation probability corresponding to the labels of all neighbor nodes of the node in the second associated network; the first party and the second party exchange the first label propagation aggregation probability and the second label propagation aggregation probability; and the first party and the second party each aggregate the label propagation probabilities again based on the exchanged information, to obtain the label propagation aggregation probability corresponding to each label.
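By way of a hedged illustration, the exchange for a shared node may be sketched as below. The sketch assumes that the parties first exchange their local edge-weight sums, so that each local aggregate is already normalized by the global sum and the second aggregation reduces to a per-label addition; the function names and the sample data are hypothetical.

```python
from typing import Dict, Optional


def local_weight_sum(local_edges: Dict[str, float]) -> float:
    """Edge-weight sum of the shared node inside one party's own network."""
    return sum(local_edges.values())


def local_aggregate(local_edges: Dict[str, float],
                    local_labels: Dict[str, Optional[str]],
                    global_weight_sum: float) -> Dict[str, float]:
    """Per-label aggregate computed by one party from its own neighbors only."""
    agg: Dict[str, float] = {}
    for j, w in local_edges.items():
        label = local_labels.get(j)
        if label is not None:
            agg[label] = agg.get(label, 0.0) + w / global_weight_sum
    return agg


def merge_aggregates(mine: Dict[str, float], theirs: Dict[str, float]) -> Dict[str, float]:
    """Second aggregation, run by each party after the exchange."""
    merged = dict(mine)
    for label, p in theirs.items():
        merged[label] = merged.get(label, 0.0) + p
    return merged


# Party A's and party B's private views of the same shared node.
edges_a, labels_a = {"a1": 3.0, "a2": 1.0}, {"a1": "risk", "a2": "normal"}
edges_b, labels_b = {"b1": 2.0, "b2": 2.0}, {"b1": "normal", "b2": "normal"}

# Exchange 1: only the two local edge-weight sums cross the boundary.
global_sum = local_weight_sum(edges_a) + local_weight_sum(edges_b)
# Exchange 2: only the per-label aggregates cross the boundary.
agg_a = local_aggregate(edges_a, labels_a, global_sum)
agg_b = local_aggregate(edges_b, labels_b, global_sum)
merged = merge_aggregates(agg_a, agg_b)
shared_label = max(sorted(merged), key=merged.get)
```

In this example party A alone would assign "risk" to the shared node, whereas the merged aggregates favor "normal", which illustrates why both parties' neighbors are taken into account. Neither party's edge list or neighbor labels are sent to the other party.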
In one embodiment, the method further includes: determining the graph weights of the first association network and the second association network according to how closely the nodes are related in the first association network and in the second association network; and introducing the graph weights into the interaction between the first association network and the second association network.
In one embodiment, introducing the graph weights into the interaction between the first association network and the second association network includes: if node i is a common node, determining the edge-weight sum $\sum_j w_{ij}$ using the following formula:
$\sum_j w_{ij} = \theta_a \sum_a w_{ia} + \theta_b \sum_b w_{ib}$; wherein $\sum_a w_{ia}$ is the sum of the edge weights between node i and all neighbor nodes a in the first associated network, $\sum_b w_{ib}$ is the sum of the edge weights between node i and all neighbor nodes b in the second associated network, $\theta_a$ is the graph weight of the first associated network, and $\theta_b$ is the graph weight of the second associated network.
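The weighted edge-weight sum above may be illustrated with the following sketch. The formula for the sum is taken from the text; scaling each edge weight in the numerator by the graph weight of its own network, so that the resulting probabilities sum to 1, is an assumption of this sketch, as are the function name and the requirement that neighbor identifiers be distinct across the two parties.

```python
from typing import Dict, Tuple


def weighted_probabilities(edges_a: Dict[str, float], edges_b: Dict[str, float],
                           theta_a: float, theta_b: float) -> Tuple[float, Dict[str, float]]:
    """Graph-weighted edge-weight sum and propagation probabilities of a shared node."""
    # theta_a * sum_a(w_ia) + theta_b * sum_b(w_ib), as in the formula above.
    total = theta_a * sum(edges_a.values()) + theta_b * sum(edges_b.values())
    # Assumption: each numerator is scaled by the graph weight of its own network.
    probs = {j: theta_a * w / total for j, w in edges_a.items()}
    probs.update({j: theta_b * w / total for j, w in edges_b.items()})
    return total, probs


# The first network is trusted twice as much as the second one.
total, probs = weighted_probabilities(
    {"a1": 3.0, "a2": 1.0}, {"b1": 2.0, "b2": 2.0}, theta_a=2.0, theta_b=1.0
)
```

With both graph weights set to 1, the sum reduces to the unweighted formula given earlier for common nodes.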
In one embodiment, introducing the graph weights into the interaction between the first association network and the second association network further includes: after the first party and the second party exchange the first label propagation aggregation probability and the second label propagation aggregation probability, aggregating the label propagation probabilities again based on the graph weights of the first association network and the second association network, to obtain the label propagation aggregation probability corresponding to each label.
In one implementation, the method further includes: if the first associated network and the second associated network are directed graph networks, using only the incoming neighbor nodes of each node as its neighbor nodes.
In a second aspect, a label propagation apparatus for an association network is provided, comprising: a graph construction module, configured to construct a first association network based on first-party data and to construct a second association network based on second-party data; a federated network module, configured to associate the first association network with the second association network based on a secure intersection protocol to obtain a federated association network; and a label propagation module, configured to iteratively perform multiple rounds of label propagation on the nodes of the federated association network; wherein each round of label propagation includes: determining the label propagation probabilities between adjacent nodes in the federated association graph; and, for each node, determining the current-round label of the node according to the current-round labels of its neighbor nodes and the label propagation probabilities of the neighbor nodes for the node.
According to a third aspect, a label propagation apparatus for an associated network is provided, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method according to the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, wherein the computer-readable storage medium stores a program which, when executed by a multi-core processor, causes the multi-core processor to perform the method of the first aspect.
One of the advantages of the above embodiments is that label propagation across platform networks can be achieved while private data is protected.
Other advantages of the present invention will be explained in more detail with reference to the following description and accompanying drawings.
It should be understood that the above description is only an overview of the technical solution of the present invention, so that the technical means of the present invention can be more clearly understood and implemented according to the contents of the specification. In order to make the above and other purposes, features and advantages of the present invention more obvious and easy to understand, specific embodiments of the present invention are described below by way of example.
BRIEF DESCRIPTION OF THE DRAWINGS
The advantages and benefits described herein, and other advantages and benefits, will be apparent to those of ordinary skill in the art upon reading the detailed description of the exemplary embodiments below. The accompanying drawings are only for the purpose of illustrating exemplary embodiments and are not to be considered as limiting the present invention. The same reference numerals are used throughout the drawings to denote the same components. In the drawings:
FIG. 1 is a schematic structural diagram of a label propagation device for an associated network according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a label propagation method for an association network according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a first association network and a second association network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a federated association network according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of determining the label propagation probabilities of the first association network and the second association network according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of determining the label propagation probabilities of a federated association network according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of label propagation of an association network according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of label propagation of an association network according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a label propagation apparatus for an associated network according to an embodiment of the present invention.
In the drawings, the same or corresponding reference numerals represent the same or corresponding parts.
Detailed Description
The exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although the exemplary embodiments of the present disclosure are shown in the accompanying drawings, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments described herein. On the contrary, these embodiments are provided to enable a more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.
In the description of the embodiments of the present application, it should be understood that terms such as "including" or "having" are intended to indicate the presence of the features, numbers, steps, acts, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, acts, components, parts, or combinations thereof are present.
Unless otherwise specified, "/" means "or"; for example, A/B can mean A or B. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: A alone, both A and B, or B alone.
The terms "first", "second", etc. are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly specifying the number of the technical features indicated. Thus, a feature defined as "first", "second", etc. may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present application, unless otherwise specified, "plurality" means two or more.
In order to clearly explain the embodiments of the present application, some concepts that may appear in subsequent embodiments will be introduced first.
The present invention will be described in detail below with reference to the accompanying drawings and in conjunction with embodiments.
FIG. 1 shows a schematic diagram of an example of a computing device 100 according to an embodiment of the present disclosure. It should be noted that FIG. 1 may serve as a schematic structural diagram of the hardware operating environment of the label propagation method for an associated network. The device for the label propagation method for an associated network in the embodiments of the present invention may be a terminal device such as a PC or a portable computer.
As shown in FIG. 1, the device for the label propagation method for an associated network may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory, or a non-volatile memory such as a disk memory. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the structure of the label propagation device for an associated network shown in FIG. 1 does not constitute a limitation on the device, which may include more or fewer components than shown in the figure, a combination of certain components, or a different arrangement of components.
As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a label propagation program for an associated network. The operating system is a program that manages and controls the hardware and software resources of the label propagation device for an associated network, and supports the running of the label propagation program for an associated network and of other software or programs.
In the label propagation device for an associated network shown in FIG. 1, the user interface 1003 is mainly used to receive requests, data, etc. sent by the first terminal, the second terminal, and the supervision terminal; the network interface 1004 is mainly used to connect to a backend server and exchange data with it; and the processor 1001 can be used to call the label propagation program for an associated network stored in the memory 1005 and perform the following operations:
constructing a first association network based on first-party data, and constructing a second association network based on second-party data; associating the first association network and the second association network based on a secure intersection protocol to obtain a federated association network; and iteratively performing multiple rounds of label propagation on the nodes of the federated association network; wherein each round of label propagation includes: determining the label propagation probabilities between adjacent nodes in the federated association graph; and, for each node, determining the current-round label of the node according to the current-round labels of its neighbor nodes and the label propagation probabilities of the neighbor nodes for the node.
Thus, the two parties only need to exchange non-private data such as label propagation probabilities, which enables joint computation and application over a cross-institution data association network without the raw data of either party leaving its own database.
FIG. 2 shows a flow chart of a label propagation method for an association network according to an embodiment of the present disclosure. The method may be performed, for example, by the computing device 100 shown in FIG. 1. It should be understood that the method 200 may also include additional blocks not shown and/or may omit the blocks shown, and the scope of the present disclosure is not limited in this respect.
Step 210, constructing a first association network based on the first-party data, and constructing a second association network based on the second-party data;
For example, referring to FIG. 3, party A and party B each form the nodes and edges of an association network based on their own data. Assume that party A is a bank, and that the association network of party A is a transfer association network formed from the transfer data between users, in which the users' mobile phone numbers are the nodes, nodes having a transfer relationship are connected by an edge, and the transfer amount is used as the edge weight between the nodes. Party B is a telecom operator, and the association network of party B is a call association network formed from the call data between users, in which the users' mobile phone numbers are the nodes, nodes having a call record are connected by an edge, and the number of calls is used as the edge weight between the nodes. Optionally, the edge weights of each association network can be normalized.
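As a non-limiting sketch of step 210, each party may build its own weighted graph from its records as shown below. The record layout `(phone_u, phone_v, value)`, the treatment of edges as undirected, the accumulation of repeated records on the same edge, and max-scaling as the normalization are all assumptions of this sketch; the sample phone numbers are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Graph = Dict[str, Dict[str, float]]


def build_association_network(records: Iterable[Tuple[str, str, float]]) -> Graph:
    """Build a weighted graph whose nodes are phone numbers.

    For party A, value is a transfer amount; for party B, each call counts as 1.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for u, v, value in records:
        graph[u][v] += value  # repeated records accumulate on the same edge
        graph[v][u] += value  # assumption: the network is undirected
    return {u: dict(nbrs) for u, nbrs in graph.items()}


def normalize_edge_weights(graph: Graph) -> Graph:
    """Optionally scale all edge weights into (0, 1] by the largest weight (assumption)."""
    max_w = max(w for nbrs in graph.values() for w in nbrs.values())
    return {u: {v: w / max_w for v, w in nbrs.items()} for u, nbrs in graph.items()}


transfers = [
    ("13800000001", "13800000002", 500.0),
    ("13800000002", "13800000003", 1500.0),
    ("13800000001", "13800000002", 500.0),
]
net_a = build_association_network(transfers)
net_a_norm = normalize_edge_weights(net_a)
```

Normalizing each party's weights puts transfer amounts and call counts on a comparable scale before the two networks are associated.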
Step 220, associating the first associated network and the second associated network based on a secure intersection protocol to obtain a federated associated network;
In one embodiment, the above step 220 further includes: performing encrypted intersection on the first-party data and the second-party data, determining the common nodes of the first association network and the second association network, and associating the first association network and the second association network according to the common nodes to obtain the federated association network.
For example, referring to FIG. 4, a secure intersection algorithm (such as a private set intersection algorithm based on RSA and hashing) can be used to securely intersect the node data of the two parties and find their common nodes without exposing the original data, thereby forming a virtual federated association network. As shown in FIG. 4, the first association network and the second association network in FIG. 3 can be associated to obtain the federated association network, where Va denotes a node of party A, Vb denotes a node of party B, Vab denotes a node common to both parties, and Vab1, Vab2 and Vab3 are the common nodes of the two parties. Taking node Vab1 as an example, Vab1 has 2 neighbor nodes when viewed from party A's association network alone, but 4 neighbor nodes when viewed from the global data.
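To make the idea of an RSA-and-hash based intersection more concrete, a toy blind-signature variant is sketched below. It is not asserted to be the protocol used in the disclosure. The key is built from two small, publicly known Mersenne primes and is insecure; in this variant only party A learns the intersection; all names are hypothetical, and both roles run in one process purely for illustration.

```python
import hashlib
import math
import secrets

# Toy RSA key held by party B (insecure parameters, illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def hash_to_int(item: str) -> int:
    """Hash an identifier (e.g. a phone number) into Z_N."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % N


def tag(value: int) -> str:
    """Second hash, applied to a signed value before it is compared."""
    return hashlib.sha256(str(value).encode()).hexdigest()


def blind(item: str):
    """Party A hides H(x) behind a random factor r before sending it to B."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return (hash_to_int(item) * pow(r, E, N)) % N, r


def sign(blinded: int) -> int:
    """Party B signs the blinded value without learning x."""
    return pow(blinded, D, N)


def unblind(signed: int, r: int) -> int:
    """Party A removes r and obtains H(x)^D mod N."""
    return (signed * pow(r, -1, N)) % N


def private_intersection(a_items, b_items):
    # Party B sends only tags of its own signed hashes.
    b_tags = {tag(pow(hash_to_int(y), D, N)) for y in b_items}
    shared = set()
    for x in a_items:
        blinded, r = blind(x)    # A -> B
        signed = sign(blinded)   # B -> A
        if tag(unblind(signed, r)) in b_tags:
            shared.add(x)
    return shared


common = private_intersection(["u1", "u2", "u3"], ["u2", "u3", "u4"])
```

The identifiers found in `common` would play the role of the common nodes Vab that join the two networks. A production deployment would rely on a vetted private set intersection library and properly sized keys rather than this sketch.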
Step 230: iteratively perform multiple rounds of label propagation on the nodes of the federated association network.
Specifically, the nodes of the federated association network may include labeled nodes and unlabeled nodes. For example, the first association network and the second association network may each contain some unlabeled nodes. As another example, all nodes in the first association network may be labeled while all nodes in the second association network are unlabeled, and so on.
In one embodiment, where the federated association network includes both labeled nodes and unlabeled nodes, step 230 may further include the following steps: first, determining the labeled nodes and the unlabeled nodes of the federated association network; updating the labels of the unlabeled nodes round by round until those labels no longer change and/or an update-round threshold is exceeded; and keeping the labels of the labeled nodes unchanged. In this way, the labels of the original samples are preserved, which safeguards the accuracy of label propagation.
Optionally, the labels of the labeled nodes may also be updated dynamically round by round; that is, the labels of all nodes in the federated association network are updated round by round until the labels no longer change and/or the update-round threshold is exceeded. In this way, the original labels can be corrected and hidden risk labels can be uncovered.
In step 230, each round of label propagation specifically includes the following steps 231 and 232:
Step 231: determine the label propagation probability between adjacent nodes in the federated association graph.
In one embodiment, step 231 may specifically include:
(1) Determine the edge weight $w_{ij}$ of the edge $ij$ between node $i$ and its neighbor node $j$ in the federated association network;
(2) Determine the edge weight sum $\sum_j w_{ij}$ between node $i$ and all of its neighbor nodes $J$;
In one embodiment, if node $i$ is a non-common node, $J$ denotes all neighbor nodes of node $i$ in the graph where node $i$ is located.
In one embodiment, if node $i$ is a common node, $J$ denotes the union of all neighbor nodes $a$ of node $i$ in the first association network and all neighbor nodes $b$ of node $i$ in the second association network.
Further, if node $i$ is a common node, the first party and the second party exchange the edge weight sum $\sum_a w_{ia}$ between node $i$ and all of its neighbor nodes $a$ in the first association network and the edge weight sum $\sum_b w_{ib}$ between node $i$ and all of its neighbor nodes $b$ in the second association network. Each party can then compute, on its own side, the edge weight sum $\sum_j w_{ij}$ between node $i$ and all of its neighbor nodes $J$ from the two exchanged sums.
Further, in one embodiment, if node $i$ is a common node, both the first party and the second party can determine the edge weight sum $\sum_j w_{ij}$ from the two exchanged sums using the following formula:
$\sum_j w_{ij} = \sum_a w_{ia} + \sum_b w_{ib}$;
where $\sum_a w_{ia}$ is the edge weight sum between node $i$ and all of its neighbor nodes $a$ in the first association network, and $\sum_b w_{ib}$ is the edge weight sum between node $i$ and all of its neighbor nodes $b$ in the second association network.
Optionally, in another embodiment, the influence of the business scenario on the closeness of node relationships may additionally be taken into account. For example, in a financial scenario a transfer relationship is a strong relationship and a call relationship is a weak relationship; therefore, when computing the label propagation probability of an edge, the two parties' sums may be aggregated with weights that reflect the strength of the edge relationships in their respective business scenarios.
In this case, both the first party and the second party can determine the weighted edge weight sum $\sum_j w_{ij}$ using the following formula:
$\sum_j w_{ij} = \theta_a \sum_a w_{ia} + \theta_b \sum_b w_{ib}$;
where $\theta_a$ is the graph weight of the first association network and $\theta_b$ is the graph weight of the second association network.
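The two denominator formulas above can be sketched in Python as follows. The sums 0.1 and 1.4 are the values the specification later uses for node Vab2; the graph weights 7/10 and 3/10 are hypothetical values chosen for illustration, and the function name is an assumption. The text does not say whether the numerator $w_{ij}$ is also scaled by the graph weight, so the sketch computes the denominator only.

```python
from fractions import Fraction

def edge_weight_sum(sum_a, sum_b, theta_a=1, theta_b=1):
    """Combine the two exchanged per-party sums into one denominator.

    With theta_a == theta_b == 1 this is the unweighted formula;
    otherwise it is the graph-weighted variant.
    """
    return theta_a * sum_a + theta_b * sum_b

sum_a = Fraction(1, 10)   # Party A's local sum for the common node
sum_b = Fraction(14, 10)  # Party B's local sum for the common node

unweighted = edge_weight_sum(sum_a, sum_b)
# Hypothetical graph weights: transfers count more than calls.
weighted = edge_weight_sum(sum_a, sum_b, Fraction(7, 10), Fraction(3, 10))
```

Exact fractions are used instead of floats so that the results can be compared directly with hand calculations.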
(3) Determine the label propagation probability $P_{ij}$ of neighbor node $j$ to node $i$ as the ratio of the edge weight $w_{ij}$ to the edge weight sum $\sum_j w_{ij}$.
Specifically, the label propagation probability of each edge of the federated association network is $P_{ij} = w_{ij} / \sum_{j \in J} w_{ij}$, where $w_{ij}$ is the weight of edge $ij$. Here, for a non-common node, $J$ denotes the neighbor nodes of node $i$; for a common node, $J$ denotes all neighbor nodes of node $i$ on both Party A's and Party B's sides.
$\sum_j w_{ij}$ is computed as follows: Party A computes the weight sum of node $i$'s neighbor nodes on its own side, $\sum_a w_{ia}$; Party B computes the weight sum of node $i$'s neighbor nodes on its own side, $\sum_b w_{ib}$. The two parties exchange $\sum_a w_{ia}$ and $\sum_b w_{ib}$, and the final denominator is $\sum_j w_{ij} = \sum_a w_{ia} + \sum_b w_{ib}$.
Referring to FIG. 5, take node Vab2 as an example. In Party A's local network, Vab2 has 1 neighbor node, and the label propagation probability computed on Party A's side alone is $P = 0.1/0.1 = 1$. In Party B's local network, Vab2 has 3 neighbor nodes, and the label propagation probabilities computed on Party B's side alone are $P = 0.2/(0.2+0.4+0.8) = 1/7$, $P = 0.4/(0.2+0.4+0.8) = 2/7$, and $P = 0.8/(0.2+0.4+0.8) = 4/7$. The two parties then exchange the weight sums between the target node Vab2 and its neighbor nodes: 0.1 for Party A and $0.2+0.4+0.8 = 1.4$ for Party B. The label propagation probabilities of the target node are then updated with respect to the federated association network.
Referring to FIG. 6, after the above calculation, on Party A's side the label propagation probability of the neighbor node for node $i$ is $0.1/(0.1+1.4) = 1/15$; similarly, on Party B's side the label propagation probabilities of the neighbor nodes for node $i$ are 2/15, 4/15, and 8/15 respectively.
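The Vab2 calculation can be reproduced with a short Python sketch. The function name and the use of exact fractions are illustrative assumptions; the neighbor identifiers (node 1 on Party A's side, nodes 3, 4, and 5 on Party B's side) and the edge weights are taken from the worked example in this description.

```python
from fractions import Fraction

def propagation_probabilities(local_weights, local_sum, peer_sum):
    """Each party divides its local edge weights by the combined sum.

    Only the scalar peer_sum crosses the party boundary; the peer's
    individual edges and neighbors stay private.
    """
    denominator = local_sum + peer_sum
    return {nb: w / denominator for nb, w in local_weights.items()}

# Edge weights between Vab2 and its neighbors on each side.
a_weights = {"1": Fraction(1, 10)}
b_weights = {"3": Fraction(2, 10), "4": Fraction(4, 10), "5": Fraction(8, 10)}

sum_a = sum(a_weights.values())  # 0.1, sent to Party B
sum_b = sum(b_weights.values())  # 1.4, sent to Party A

p_a = propagation_probabilities(a_weights, sum_a, sum_b)
p_b = propagation_probabilities(b_weights, sum_b, sum_a)
```

Across both parties the four probabilities sum to 1, which is what makes them comparable when labels are aggregated later.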
Step 232: for each node, determine the node's label for the current round according to the current-round labels of its neighbor nodes and the label propagation probabilities of those neighbor nodes for the node.
Referring to FIG. 7, the federated association network formed around Vab2 is again taken as an example. Node 5 is a risk node, shown as a white node, with its label set to "1"; the remaining nodes are unknown nodes, shown as gray nodes, with their labels set to "0". During label propagation, the label of node 5 is always "1", while the labels of the remaining nodes are updated round by round until the labels of all nodes no longer change or the update-round threshold is exceeded.
In one embodiment, step 232 further includes the following steps:
Step 2321: for node $i$, determine the current-round label of each neighbor node of node $i$ and the label propagation probability of each neighbor node for node $i$;
Step 2322: over all neighbor nodes of node $i$, compute the sum of the label propagation probabilities corresponding to each label, to obtain the label propagation aggregation probability of each label;
Specifically, if node $i$ is a non-common node, only the party to which node $i$ belongs computes the label propagation aggregation probability for each neighbor label of node $i$.
Specifically, if node $i$ is a common node, the following steps are performed: first, the first party computes the first label propagation aggregation probabilities for the labels of all neighbor nodes of node $i$ in the first association network, and the second party computes the second label propagation aggregation probabilities for the labels of all neighbor nodes of node $i$ in the second association network; next, the first party and the second party exchange the first and second label propagation aggregation probabilities; finally, each party aggregates the label propagation probabilities again based on the exchanged information, to obtain the label propagation aggregation probability of each label.
Step 2323: update the current-round label of node $i$ to the label with the largest label propagation aggregation probability.
The node label update rule shown in steps 2321 to 2323 may include the following specific steps:
First, for the $T$-th round update of node $i$, let its neighbor set be $J = \{(J_1, L_1, P_{i1}), (J_2, L_2, P_{i2}), \ldots, (J_j, L_j, P_{ij}), \ldots, (J_n, L_n, P_{in})\}$, where $J_j$ is the identifier of neighbor node $j$, $L_j$ is the label of neighbor node $j$, and $P_{ij}$ is the propagation probability of edge $\langle i, J_j \rangle$.
Second, compute the aggregated propagation probability of every label in the neighbor set, namely $P(L_j) = \sum P_{ij}$, where the sum runs over the neighbor nodes whose label is $L_j$ and $P_{ij}$ is the label propagation probability of such a neighbor for the target node $i$. If node $i$ is a non-common node, only the propagation probabilities of the neighbor labels on its own side need to be computed. If node $i$ is a common node, Party A computes the probability for each neighbor label on its own side, $P(L_{aj}) = \sum P_{iaj}$, and Party B computes the probability for each neighbor label on its own side, $P(L_{bj}) = \sum P_{ibj}$; Party A and Party B exchange $P(L_{aj})$ and $P(L_{bj})$, and each aggregates again on its own side to obtain the final label propagation aggregation probability that combines both parties' association network information, $P(L_j) = P(L_{aj}) + P(L_{bj})$.
Finally, select the label $L_j$ with the largest $P(L_j)$ as the current-round label of node $i$. Repeat the above steps until the labels of all nodes no longer change.
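For a non-common node, the update rule above reduces to a group-by-label sum followed by an arg-max, which the following Python sketch illustrates. The function names, the neighbor identifiers, and the probabilities are hypothetical, and the tie-breaking rule (smallest label wins) is an assumption because the text does not specify one.

```python
from collections import defaultdict
from fractions import Fraction

def aggregate_by_label(neighbors):
    """Sum propagation probabilities per label.

    neighbors: iterable of (neighbor_id, label, probability) triples,
    mirroring the (J_j, L_j, P_ij) tuples of the neighbor set.
    """
    totals = defaultdict(Fraction)
    for _neighbor_id, label, probability in neighbors:
        totals[label] += probability
    return dict(totals)

def pick_label(totals):
    """Return the label with the largest aggregated probability.

    Ties are broken by taking the smallest label (an assumed convention).
    """
    return max(sorted(totals), key=totals.get)

# Hypothetical neighbor set of a non-common node.
neighbor_set = [
    ("n1", "0", Fraction(1, 5)),
    ("n2", "1", Fraction(1, 2)),
    ("n3", "0", Fraction(1, 5)),
    ("n4", "1", Fraction(1, 10)),
]
totals = aggregate_by_label(neighbor_set)
new_label = pick_label(totals)
```

Here label "1" wins with an aggregated probability of 3/5 against 2/5 for label "0".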
As a specific example, a concrete label update calculation is given with reference to FIG. 7 and FIG. 8.
Referring to FIG. 7, in the first round of propagation, the following calculation is performed for node Vab2:
(1) The label propagation aggregation probability of Party A's neighbor nodes is computed as <"0", 1/15>, where "0" denotes the no-risk label and 1/15 is the label propagation aggregation probability of label "0". On Party A's side Vab2 has only one neighbor node, node 1, whose initial label is "0", and the label propagation probability of node 1 for node Vab2 was computed above as 1/15. Therefore, on Party A's side node Vab2 has only one propagable label, "0", whose label propagation aggregation probability is 1/15.
(2) The label propagation aggregation probabilities of Party B's neighbor nodes are computed as <"0", 6/15> and <"1", 8/15>, where "0" denotes the no-risk label with aggregation probability 6/15 and "1" denotes the risk label with aggregation probability 8/15. On Party B's side Vab2 has three neighbor nodes (3, 4, and 5); the initial labels of nodes 3 and 4 are "0" and the initial label of node 5 is "1". The label propagation probabilities for node Vab2 were computed above as 2/15 for node 3, 4/15 for node 4, and 8/15 for node 5. Therefore, on Party B's side node Vab2 has two propagable labels, "0" and "1": the aggregation probability of label "0" is $2/15 + 4/15 = 6/15$, and the aggregation probability of label "1" is 8/15.
(3) The two parties exchange their label propagation aggregation probabilities and add up the probabilities of identical labels. Each party can thus compute the label propagation aggregation probabilities of node Vab2 as <"0", 7/15> and <"1", 8/15>; that is, the aggregation probability of label "0" is 7/15 and that of label "1" is 8/15.
(4) The label "1", which corresponds to the largest label propagation aggregation probability <"1", 8/15>, is selected as the label of node Vab2 for this round. The other nodes are handled in the same way.
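The four steps of this example can be traced in Python. The sketch below simulates both parties in a single process and uses hypothetical function names; the neighbor labels and probabilities are exactly those of the Vab2 example, and only the per-label totals (not the neighbor lists) are exchanged.

```python
from fractions import Fraction

def local_aggregate(neighbor_labels, probabilities):
    """One party's per-label sum over its own neighbors of the common node."""
    totals = {}
    for neighbor, label in neighbor_labels.items():
        totals[label] = totals.get(label, Fraction(0)) + probabilities[neighbor]
    return totals

def merge(own, peer):
    """Add up the aggregation probabilities of identical labels."""
    merged = dict(own)
    for label, probability in peer.items():
        merged[label] = merged.get(label, Fraction(0)) + probability
    return merged

# Step (1): Party A, one neighbor (node 1) with label "0".
agg_a = local_aggregate({"1": "0"}, {"1": Fraction(1, 15)})
# Step (2): Party B, neighbors 3 and 4 labeled "0", neighbor 5 labeled "1".
agg_b = local_aggregate(
    {"3": "0", "4": "0", "5": "1"},
    {"3": Fraction(2, 15), "4": Fraction(4, 15), "5": Fraction(8, 15)},
)
# Step (3): exchange the totals and merge them on each side.
final = merge(agg_a, agg_b)
# Step (4): choose the label with the largest aggregated probability.
vab2_label = max(final, key=final.get)
```

Both parties obtain the same merged totals, so they agree on the new label of Vab2 without seeing each other's neighbor nodes.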
After the first round of propagation, the updated node label distribution of the federated association network is as shown in FIG. 8, in which nodes Vab1 and Vab2 have both been updated to label "1". Label propagation then continues with the next round, until the node labels no longer change or the number of propagation rounds exceeds a given threshold.
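The round-by-round loop with its two stopping conditions can be sketched as follows. This is a deliberately simplified, single-process illustration: it operates on an already merged graph (which the federated method avoids by exchanging only sums), updates all nodes synchronously, and uses only the star around Vab2 rather than the full network of FIG. 7. The function name, the `max_rounds` default, and the tie-breaking rule are assumptions.

```python
from fractions import Fraction

def propagate(edges, labels, fixed, max_rounds=10):
    """Synchronous label propagation with fixed (labeled) nodes.

    edges: {(u, v): weight} for an undirected graph.
    Returns the final labels and the number of rounds executed.
    """
    neighbors = {}
    for (u, v), w in edges.items():
        neighbors.setdefault(u, {})[v] = w
        neighbors.setdefault(v, {})[u] = w
    labels = dict(labels)
    for round_no in range(1, max_rounds + 1):
        new_labels = dict(labels)
        for node, adjacent in neighbors.items():
            if node in fixed:
                continue  # labeled nodes keep their original label
            total = sum(adjacent.values())
            scores = {}
            for nb, w in adjacent.items():
                scores[labels[nb]] = scores.get(labels[nb], 0) + w / total
            new_labels[node] = max(sorted(scores), key=scores.get)
        if new_labels == labels:
            return labels, round_no  # converged: no label changed
        labels = new_labels
    return labels, max_rounds  # stopped by the round threshold

edges = {
    ("Vab2", "1"): Fraction(1, 10),
    ("Vab2", "3"): Fraction(2, 10),
    ("Vab2", "4"): Fraction(4, 10),
    ("Vab2", "5"): Fraction(8, 10),
}
initial = {"Vab2": "0", "1": "0", "3": "0", "4": "0", "5": "1"}
final_labels, rounds = propagate(edges, initial, fixed={"5"})
```

In this toy star graph Vab2 adopts label "1" in the first round, its leaf neighbors follow in the second, and the third round detects that nothing changed.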
In one embodiment, graph weights of the first association network and the second association network are determined according to the closeness of the node relationships in the first association network and the second association network, and the graph weights are introduced into the interaction between the first association network and the second association network.
For example, the first association network and the second association network may each be classified as a strong or weak association according to the business scenario, and the graph weights of the two networks may then be introduced when computing the label propagation probabilities of the edges. Alternatively, the graph weights may be introduced when computing the label propagation aggregation probability of each label; this application imposes no specific restriction in this regard.
In one embodiment, the graph weights may be introduced into the interaction between the first association network and the second association network in at least the following two ways:
(1) In step 231, if node $i$ is a common node, the edge weight sum $\sum_j w_{ij}$ is determined using the following formula:
$\sum_j w_{ij} = \theta_a \sum_a w_{ia} + \theta_b \sum_b w_{ib}$; where $\sum_a w_{ia}$ is the edge weight sum between node $i$ and all of its neighbor nodes $a$ in the first association network, $\sum_b w_{ib}$ is the edge weight sum between node $i$ and all of its neighbor nodes $b$ in the second association network, $\theta_a$ is the graph weight of the first association network, and $\theta_b$ is the graph weight of the second association network.
(2) In step 232, after the first party and the second party exchange the first and second label propagation aggregation probabilities, the label propagation probabilities are aggregated again using the graph weights of the first association network and the second association network, to obtain the label propagation aggregation probability of each label. For example, if node $i$ is a common node, Party A computes the probability for each neighbor label on its own side, $P(L_{aj}) = \sum P_{iaj}$, and Party B computes the probability for each neighbor label on its own side, $P(L_{bj}) = \sum P_{ibj}$; Party A and Party B exchange $P(L_{aj})$ and $P(L_{bj})$, and each aggregates again on its own side to obtain the final label propagation aggregation probability that combines both parties' association network information, $P(L_j) = \theta_a P(L_{aj}) + \theta_b P(L_{bj})$.
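The second way of introducing graph weights can be sketched in Python by weighting each party's per-label totals before adding them. The per-label totals are those of the Vab2 example; the graph weights 2 and 1/2 and the function name are hypothetical, chosen only to show that weighting can change which label wins.

```python
from fractions import Fraction

def weighted_merge(agg_a, agg_b, theta_a, theta_b):
    """Compute P(L) = theta_a * P_a(L) + theta_b * P_b(L) for every label."""
    labels = set(agg_a) | set(agg_b)
    return {
        label: theta_a * agg_a.get(label, 0) + theta_b * agg_b.get(label, 0)
        for label in labels
    }

agg_a = {"0": Fraction(1, 15)}                        # Party A (transfers)
agg_b = {"0": Fraction(6, 15), "1": Fraction(8, 15)}  # Party B (calls)

# Hypothetical graph weights: strong transfer relation, weak call relation.
weighted = weighted_merge(agg_a, agg_b, Fraction(2), Fraction(1, 2))
weighted_label = max(sorted(weighted), key=weighted.get)
```

With these assumed weights label "0" scores 1/3 against 4/15 for label "1", so the outcome differs from the unweighted case, where label "1" wins with 8/15.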
In one embodiment, the method further includes: if the first association network and the second association network are directed graphs, only the in-neighbors of each node (the nodes with an edge pointing to it) are treated as its neighbor nodes. For example, for a directed graph, only the in-neighbors of the target node may be considered when computing the node's propagation probabilities. The choice can be made according to the business scenario.
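For the directed case, restricting the neighbor set to in-neighbors can be sketched as below. The edge representation `{(source, target): weight}`, the node names, and the weights are assumptions made for illustration.

```python
def incoming_neighbors(directed_edges, node):
    """Return {source: weight} for every edge that points into `node`."""
    return {
        source: weight
        for (source, target), weight in directed_edges.items()
        if target == node
    }

# Hypothetical directed edges, e.g. transfers from source to target.
directed_edges = {
    ("u", "i"): 0.3,  # u -> i, counts as an in-neighbor of i
    ("i", "v"): 0.5,  # i -> v, ignored when updating i
    ("k", "i"): 0.7,  # k -> i, counts as an in-neighbor of i
}
in_neighbors = incoming_neighbors(directed_edges, "i")
```

The resulting dictionary can then take the place of the full neighbor set when the edge weight sum and the propagation probabilities are computed.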
In the description of this specification, reference to the terms "some possible embodiments", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
The method flowcharts of the embodiments of this application describe certain operations as distinct steps performed in a certain order. Such flowcharts are illustrative rather than restrictive. Certain steps described herein may be grouped together and performed in a single operation, certain steps may be split into multiple sub-steps, and certain steps may be performed in an order different from that shown herein. The steps shown in the flowcharts may be implemented in any manner by any circuit structure and/or tangible mechanism (for example, software running on a computer device, hardware such as logic functions implemented by a processor or chip, and/or any combination thereof).
Based on the same technical concept, an embodiment of the present invention further provides a label propagation apparatus for an association network, configured to perform the label propagation method for an association network provided by any of the above embodiments. FIG. 9 is a schematic structural diagram of a label propagation apparatus for an association network according to an embodiment of the present invention.
As shown in FIG. 9, the apparatus 900 includes:
a graph construction module 910, configured to construct a first association network based on first-party data and a second association network based on second-party data;
a federated network module 920, configured to associate the first association network and the second association network based on a secure intersection protocol to obtain a federated association network;
a label propagation module 930, configured to iteratively perform multiple rounds of label propagation on the nodes of the federated association network; wherein each round of label propagation includes: determining the label propagation probability between adjacent nodes in the federated association graph; and, for each node, determining the node's label for the current round according to the current-round labels of its neighbor nodes and the label propagation probabilities of those neighbor nodes for the node.
It should be noted that the apparatus in the embodiments of this application can implement each process of the foregoing method embodiments and achieve the same effects and functions, which are not repeated here.
According to some embodiments of this application, a non-volatile computer storage medium for the label propagation method for an association network is provided, on which computer-executable instructions are stored, the computer-executable instructions being configured to perform, when run by a processor, the method described in the above embodiments.
The embodiments in this application are described in a progressive manner; for parts that are the same or similar across embodiments, reference may be made from one to another, and each embodiment focuses on its differences from the others. In particular, the apparatus, device, and computer-readable storage medium embodiments are described briefly because they are substantially similar to the method embodiments; for related details, refer to the corresponding description of the method embodiments.
The apparatus, device, and computer-readable storage medium provided by the embodiments of this application correspond one-to-one to the method. They therefore have beneficial technical effects similar to those of the corresponding method; since the beneficial technical effects of the method have been described in detail above, those of the apparatus, device, and computer-readable storage medium are not repeated here.
Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, an apparatus (device or system), or a computer-readable storage medium. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer-readable storage medium implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, apparatuses (devices or systems), and computer-readable storage media according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include non-persistent storage in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. In addition, although the operations of the method of the present invention are described in the drawings in a particular order, this does not require or imply that the operations must be performed in that order, or that all of the illustrated operations must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Although the spirit and principles of the present invention have been described with reference to several specific embodiments, it should be understood that the present invention is not limited to the specific embodiments disclosed, and that the division into aspects does not mean that features of those aspects cannot be combined to advantage; the division is made only for convenience of presentation. The present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (18)

  1. A label propagation method for an association network, characterized by comprising:
    constructing a first association network based on first-party data, and constructing a second association network based on second-party data;
    associating the first association network and the second association network based on a secure intersection protocol to obtain a federated association network;
    iteratively performing multiple rounds of label propagation on nodes of the federated association network;
    wherein each round of the label propagation comprises: determining label propagation probabilities between adjacent nodes in the federated association graph; and, for each node, determining the current-round label of the node according to the current-round labels of its neighbor nodes and the label propagation probabilities of the neighbor nodes with respect to the node.
  2. The method according to claim 1, characterized in that associating the first association network and the second association network based on a secure intersection protocol to obtain a federated association network further comprises:
    performing an encrypted intersection on the first-party data and the second-party data to determine the common nodes of the first association network and the second association network, and associating the first association network and the second association network according to the common nodes to obtain the federated association network.
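As an illustration of the encrypted intersection in claim 2, the following toy Python sketch uses Diffie-Hellman style commutative blinding to find common node identifiers without exchanging them in the clear. The claims do not name a specific protocol, so this construction, the modulus `P`, and the function names `hash_to_group`, `random_key`, `blind`, and `common_nodes` are all assumptions made for illustration. The parameters are not secure, and both parties are simulated in one process.

```python
import hashlib
import math
import secrets

# Toy modulus (the Mersenne prime 2^127 - 1). Hypothetical choice for the
# sketch only; a real deployment would use a vetted group and protocol.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map a node identifier to an integer in [1, P-1]."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def random_key() -> int:
    """Pick a secret exponent coprime to P-1 so that blinding is a bijection."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def blind(values, key):
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, key, P) for v in values]


def common_nodes(ids_a, ids_b):
    """Return the identifiers of party A that also appear at party B."""
    key_a, key_b = random_key(), random_key()
    ordered_a = sorted(ids_a)
    # A -> B: A's IDs blinded once; B blinds them again, preserving order.
    a_twice = blind(blind([hash_to_group(x) for x in ordered_a], key_a), key_b)
    # B -> A: B's IDs blinded once; A blinds them again.
    b_twice = set(blind(blind([hash_to_group(y) for y in ids_b], key_b), key_a))
    # Exponentiation commutes, so doubly blinded values match only for equal IDs.
    return {x for x, t in zip(ordered_a, a_twice) if t in b_twice}
```

The identifiers returned by `common_nodes` would then serve as the common nodes through which the two association networks are joined.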
  3. The method according to claim 1, characterized in that determining the label propagation probabilities between adjacent nodes in the federated association graph further comprises:
    determining the edge weight $w_{ij}$ of the edge $ij$ between a node $i$ and its neighbor node $j$ in the federated association network;
    determining the edge weight sum $\sum_j w_{ij}$ between the node $i$ and all of its neighbor nodes $J$;
    determining the label propagation probability $P_{ij}$ of the neighbor node $j$ with respect to the node $i$ according to the ratio of the edge weight $w_{ij}$ to the edge weight sum $\sum_j w_{ij}$.
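The ratio in claim 3 can be sketched in a few lines of Python. The nested-dictionary graph representation and the name `propagation_probabilities` are assumptions made for this example, not part of the claims.

```python
def propagation_probabilities(edge_weights):
    """Compute P_ij = w_ij / sum_j w_ij for every node i.

    edge_weights maps each node i to a dict {neighbor j: weight w_ij}.
    """
    probabilities = {}
    for node, neighbors in edge_weights.items():
        total = sum(neighbors.values())
        if total > 0:
            probabilities[node] = {j: w / total for j, w in neighbors.items()}
        else:
            # A node without weighted edges receives no propagated labels.
            probabilities[node] = {}
    return probabilities
```

For a node with two neighbors of weights 1 and 3, the resulting probabilities are 0.25 and 0.75, and the probabilities of each node sum to 1.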
  4. The method according to claim 3, characterized in that
    if the node $i$ is a non-shared node, all of the neighbor nodes $J$ are all neighbor nodes of the node $i$ in the graph in which the node $i$ is located.
  5. The method according to claim 3, characterized in that
    if the node $i$ is a shared node, all of the neighbor nodes $J$ are the set consisting of all neighbor nodes $a$ of the node $i$ in the first association network and all neighbor nodes $b$ of the node $i$ in the second association network.
  6. The method according to claim 3, characterized in that
    if the node $i$ is a shared node, the first party and the second party exchange the edge weight sum between the node $i$ and all of its neighbor nodes $a$ in the first association network and the edge weight sum between the node $i$ and all of its neighbor nodes $b$ in the second association network.
  7. The method according to claim 3, characterized by further comprising:
    if the node $i$ is a shared node, determining the edge weight sum $\sum_j w_{ij}$ using the following formula:
    $\sum_j w_{ij} = \sum_a w_{ia} + \sum_b w_{ib}$;
    wherein $\sum_a w_{ia}$ is the edge weight sum between the node $i$ and all of its neighbor nodes $a$ in the first association network, and $\sum_b w_{ib}$ is the edge weight sum between the node $i$ and all of its neighbor nodes $b$ in the second association network.
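For a shared node, each party only needs the other party's local edge weight sum to form the common denominator. The Python sketch below illustrates this under the assumption that only the two scalar sums are exchanged; the function names and the dictionary layout are hypothetical.

```python
def local_weight_sum(local_neighbors):
    """Sum of edge weights between the shared node and its local neighbors."""
    return sum(local_neighbors.values())


def shared_node_probabilities(local_neighbors, local_sum, remote_sum):
    """Probabilities for the local neighbors of a shared node.

    The denominator is the combined sum: sum_a w_ia + sum_b w_ib.
    """
    total = local_sum + remote_sum
    return {j: w / total for j, w in local_neighbors.items()}


# Party A and party B each hold their own neighbors of the shared node i.
neighbors_a = {"a1": 2.0, "a2": 2.0}
neighbors_b = {"b1": 4.0}
sum_a = local_weight_sum(neighbors_a)  # sent from A to B
sum_b = local_weight_sum(neighbors_b)  # sent from B to A
probs_a = shared_node_probabilities(neighbors_a, sum_a, sum_b)
probs_b = shared_node_probabilities(neighbors_b, sum_b, sum_a)
```

With these numbers the combined denominator is 8, so party A obtains 0.25 for each of its two neighbors and party B obtains 0.5 for its single neighbor. Neither party learns the other's individual edges.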
  8. The method according to claim 1, characterized in that iteratively performing multiple rounds of label propagation on the nodes of the federated association network further comprises:
    determining labeled nodes and unlabeled nodes of the federated association network;
    updating the labels of the unlabeled nodes round by round until the labels of the unlabeled nodes no longer change and/or an update round threshold is exceeded; and
    keeping the labels of the labeled nodes unchanged.
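The stopping rule and the fixed labels of claim 8 can be sketched as a simple loop on a single graph. This is a minimal illustration, assuming synchronous updates within a round and alphabetical tie-breaking, neither of which is specified in the claims; `propagate`, `seed_labels`, and `max_rounds` are hypothetical names.

```python
from collections import defaultdict


def propagate(edge_weights, seed_labels, max_rounds=20):
    """Propagate labels until nothing changes or max_rounds is reached.

    edge_weights maps node -> {neighbor: weight}; seed_labels holds the
    labeled nodes, whose labels are never overwritten.
    """
    labels = dict(seed_labels)
    for _ in range(max_rounds):
        updated = dict(labels)
        for node, neighbors in edge_weights.items():
            if node in seed_labels:
                continue  # labeled nodes keep their labels
            total = sum(neighbors.values())
            scores = defaultdict(float)
            for neighbor, weight in neighbors.items():
                if neighbor in labels:
                    scores[labels[neighbor]] += weight / total
            if scores:
                # Ties are broken alphabetically (an assumption of this sketch).
                updated[node] = max(sorted(scores), key=scores.get)
        if updated == labels:
            break  # labels of unlabeled nodes no longer change
        labels = updated
    return labels
```

On a chain a-b-c-d with weights 3, 1, 3, where a is seeded "fraud" and d is seeded "normal", node b settles on "fraud" and node c on "normal" after two rounds.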
  9. The method according to claim 1, characterized in that determining the current-round label of each node according to the current-round labels of its neighbor nodes and the label propagation probabilities of the neighbor nodes with respect to the node comprises:
    for each node, determining the current-round label of each neighbor node of the node and the label propagation probability of each neighbor node with respect to the node;
    among all neighbor nodes of the node, calculating the sum of the label propagation probabilities corresponding to each label to obtain the label propagation aggregation probability corresponding to each label;
    updating the current-round label of the node according to the label having the maximum label propagation aggregation probability.
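The aggregation step of claim 9 amounts to grouping the neighbors' propagation probabilities by label and taking the label with the largest total. A minimal Python sketch follows; the function names and the alphabetical tie-break are assumptions of the example.

```python
from collections import defaultdict


def aggregate_label_probabilities(neighbor_labels, probabilities):
    """Sum the propagation probabilities of the neighbors, grouped by label."""
    aggregated = defaultdict(float)
    for neighbor, probability in probabilities.items():
        label = neighbor_labels.get(neighbor)
        if label is not None:  # unlabeled neighbors contribute nothing
            aggregated[label] += probability
    return dict(aggregated)


def pick_label(aggregated):
    """Label with the maximum aggregation probability (ties: alphabetical)."""
    return max(sorted(aggregated), key=aggregated.get)
```

For example, neighbors labeled A, B, A with probabilities 0.375, 0.25, 0.375 give aggregation probabilities of 0.75 for A and 0.25 for B, so the node takes label A.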
  10. The method according to claim 9, characterized in that, if the node is a non-shared node, the method further comprises:
    the party at which the node is located calculates the label propagation aggregation probability corresponding to each neighbor node label of the node.
  11. The method according to claim 9, characterized in that, if the node is a shared node, the method further comprises:
    the first party calculates a first label propagation aggregation probability corresponding to the labels of all neighbor nodes of the node in the first association network;
    the second party calculates a second label propagation aggregation probability corresponding to the labels of all neighbor nodes of the node in the second association network;
    the first party and the second party exchange the first label propagation aggregation probability and the second label propagation aggregation probability;
    the first party and the second party each aggregate the label propagation probabilities again based on the exchanged information to obtain the label propagation aggregation probability corresponding to each label.
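The two-party exchange for a shared node can be sketched as follows. Each party aggregates over its own neighbors using the combined edge weight sum as the denominator, then the per-label partial results are exchanged and added. The function names and data layout are hypothetical, and the combined denominator is assumed to have been obtained beforehand.

```python
def partial_aggregate(local_neighbors, local_labels, total_weight):
    """Per-label aggregation over one party's neighbors of the shared node."""
    aggregated = {}
    for neighbor, weight in local_neighbors.items():
        label = local_labels.get(neighbor)
        if label is not None:
            aggregated[label] = aggregated.get(label, 0.0) + weight / total_weight
    return aggregated


def merge_aggregates(first, second):
    """Second aggregation step: add the two parties' per-label results."""
    merged = dict(first)
    for label, probability in second.items():
        merged[label] = merged.get(label, 0.0) + probability
    return merged


total = 8.0  # combined edge weight sum of the shared node across both parties
first = partial_aggregate({"a1": 2.0, "a2": 2.0}, {"a1": "risky", "a2": "safe"}, total)
second = partial_aggregate({"b1": 4.0}, {"b1": "risky"}, total)
merged = merge_aggregates(first, second)
```

Because addition is symmetric, both parties arrive at the same merged result (here 0.75 for "risky" and 0.25 for "safe") and therefore assign the same label to the shared node.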
  12. The method according to claim 1, characterized by further comprising:
    determining graph weights of the first association network and the second association network according to the closeness of the node relationships in the first association network and the second association network; and
    introducing the graph weights into the interaction between the first association network and the second association network.
  13. The method according to claim 12, characterized in that introducing the graph weights into the interaction between the first association network and the second association network comprises:
    if the node $i$ is a shared node, determining the edge weight sum $\sum_j w_{ij}$ using the following formula:
    $\sum_j w_{ij} = \theta_a \sum_a w_{ia} + \theta_b \sum_b w_{ib}$;
    wherein $\sum_a w_{ia}$ is the edge weight sum between the node $i$ and all of its neighbor nodes $a$ in the first association network, $\sum_b w_{ib}$ is the edge weight sum between the node $i$ and all of its neighbor nodes $b$ in the second association network, $\theta_a$ is the graph weight of the first association network, and $\theta_b$ is the graph weight of the second association network.
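The weighted sum of claim 13 can be illustrated with a short Python sketch. The claim only gives the formula for the denominator; scaling each numerator by the same graph weight, so that the probabilities of a shared node still sum to 1, is an assumption of this example, as are the function names.

```python
def weighted_weight_sum(sum_a, sum_b, theta_a, theta_b):
    """Denominator for a shared node: theta_a * sum_a + theta_b * sum_b."""
    return theta_a * sum_a + theta_b * sum_b


def weighted_probabilities(local_neighbors, theta_local, total):
    """Probabilities for one party's neighbors of a shared node.

    Assumption: each local edge weight is scaled by the local graph weight.
    """
    return {j: theta_local * w / total for j, w in local_neighbors.items()}


theta_a, theta_b = 0.75, 0.25
neighbors_a = {"a1": 2.0, "a2": 2.0}
neighbors_b = {"b1": 4.0}
total = weighted_weight_sum(4.0, 4.0, theta_a, theta_b)
probs_a = weighted_probabilities(neighbors_a, theta_a, total)
probs_b = weighted_probabilities(neighbors_b, theta_b, total)
```

With graph weights 0.75 and 0.25, the first network's neighbors receive 0.375 each and the second network's neighbor receives 0.25, so the more heavily weighted graph has more influence on the shared node.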
  14. The method according to claim 12, characterized in that introducing the graph weights into the interaction between the first association network and the second association network comprises:
    after the first party and the second party exchange the first label propagation aggregation probability and the second label propagation aggregation probability, aggregating the label propagation probabilities again based on the graph weights of the first association network and the second association network to obtain the label propagation aggregation probability corresponding to each label.
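One possible reading of the weighted re-aggregation in claim 14 is a linear combination of the two exchanged per-label results. The claim does not state the exact combination rule, so the formula below and the name `weighted_merge` are assumptions made for illustration.

```python
def weighted_merge(first, second, theta_a, theta_b):
    """Combine two per-label aggregates as theta_a * first + theta_b * second."""
    labels = set(first) | set(second)
    return {
        label: theta_a * first.get(label, 0.0) + theta_b * second.get(label, 0.0)
        for label in labels
    }
```

For instance, combining {"risky": 0.5, "safe": 0.5} and {"risky": 1.0} with graph weights 0.75 and 0.25 gives 0.625 for "risky" and 0.375 for "safe".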
  15. The method according to claim 1, characterized by further comprising:
    if the first association network and the second association network are directed graph networks, using only the incoming neighbor nodes of each node as the neighbor nodes.
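For directed graphs, restricting propagation to incoming neighbors means indexing edges by their target node. A minimal Python sketch, assuming edges are given as hypothetical `(source, target, weight)` tuples:

```python
from collections import defaultdict


def incoming_neighbors(directed_edges):
    """Map each node to {source: weight} over the edges that point into it."""
    incoming = defaultdict(dict)
    for source, target, weight in directed_edges:
        incoming[target][source] = weight
    return dict(incoming)
```

A node with no incoming edges does not appear in the result and therefore receives no propagated labels.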
  16. A label propagation apparatus for an association network, characterized by comprising:
    a graph construction module, configured to construct a first association network based on first-party data and a second association network based on second-party data;
    a federated network module, configured to associate the first association network and the second association network based on a secure intersection protocol to obtain a federated association network;
    a label propagation module, configured to iteratively perform multiple rounds of label propagation on nodes of the federated association network; wherein each round of the label propagation comprises: determining label propagation probabilities between adjacent nodes in the federated association graph; and, for each node, determining the current-round label of the node according to the current-round labels of its neighbor nodes and the label propagation probabilities of the neighbor nodes with respect to the node.
  17. A label propagation apparatus for an association network, characterized by comprising:
    at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1-15.
  18. A computer-readable storage medium storing a program, wherein the program, when executed by a multi-core processor, causes the multi-core processor to perform the method according to any one of claims 1-15.
PCT/CN2023/127581 2022-11-25 2023-10-30 Label propagation method and apparatus for associated network, and computer readable storage medium WO2024109454A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211492068.7A CN115733763A (en) 2022-11-25 2022-11-25 Label propagation method and device for associated network and computer readable storage medium
CN202211492068.7 2022-11-25

Publications (1)

Publication Number Publication Date
WO2024109454A1 (en) 2024-05-30

Family

ID=85298377

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/127581 WO2024109454A1 (en) 2022-11-25 2023-10-30 Label propagation method and apparatus for associated network, and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN115733763A (en)
WO (1) WO2024109454A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115733763A (en) * 2022-11-25 2023-03-03 中国银联股份有限公司 Label propagation method and device for associated network and computer readable storage medium
CN118096417A (en) * 2024-04-28 2024-05-28 江西求是高等研究院 Propagation network mode discovery method, system, computer and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991614A (en) * 2017-03-02 2017-07-28 南京信息工程大学 The parallel overlapping community discovery method propagated under Spark based on label
US20210097339A1 (en) * 2019-09-26 2021-04-01 Microsoft Technology Licensing, Llc Inference via edge label propagation in networks
CN113095946A (en) * 2021-04-28 2021-07-09 福州大学 Insurance customer recommendation method and system based on federal label propagation
CN115733763A (en) * 2022-11-25 2023-03-03 中国银联股份有限公司 Label propagation method and device for associated network and computer readable storage medium

Also Published As

Publication number Publication date
CN115733763A (en) 2023-03-03
