CN114615090A - Data processing method, system, device and medium based on cross-domain label propagation - Google Patents

Data processing method, system, device and medium based on cross-domain label propagation

Info

Publication number
CN114615090A
Authority
CN
China
Prior art keywords
data
label
sample
value
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210499573.8A
Other languages
Chinese (zh)
Other versions
CN114615090B (en)
Inventor
李�根
卞阳
陈立峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fucun Technology Shanghai Co ltd
Original Assignee
Fucun Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fucun Technology Shanghai Co ltd
Priority to CN202210499573.8A
Publication of CN114615090A
Application granted
Publication of CN114615090B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/008 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols involving homomorphic encryption

Abstract

The invention provides a data processing method, system, device and medium based on cross-domain label propagation. The data processing method applied to a participant comprises the following steps: receiving a tag value ciphertext of sample data sent by a demander, and updating an initial tag value of a sample node corresponding to the sample data in a graph database according to the tag value ciphertext; and performing label value propagation processing on the sample nodes according to the relation weights and label values of the nodes in the graph database, so as to obtain the label value of a target node adjacent to the sample nodes. On the premise that the data are compliant and no party's data are leaked, the method, system, device and medium combine homomorphic encryption with private set intersection to realize cross-domain label propagation based on secure graph computation, with privacy protection for both the labels and the graph information. They balance the accuracy and security of big data fusion processing and have wide application value.

Description

Data processing method, system, device and medium based on cross-domain label propagation
Technical Field
The invention relates to the technical field of big data, in particular to a data processing method, a system, equipment and a medium based on cross-domain label propagation.
Background
With the development of big data, the fusion and utilization of mass data in fields such as finance and telecommunications has become an important solution in commerce, public services and even security. However, cross-domain fusion and utilization of mass data is premised on data compliance, i.e. the data of any party must not be leaked.
A graph is a collection of objects that can describe a complex data structure. Unlike other data structures, a graph can effectively describe the relationships between different data nodes. For example, when a social network is encoded as graph data, each node may represent a person, while an edge connected to a node may represent a relationship between two people.
The label propagation algorithm is a graph-based semi-supervised learning method. Its basic idea is that neighboring nodes have a high probability of sharing the same label, so the labels of marked nodes can be used to predict the label information of neighboring unmarked nodes. Although the label propagation algorithm is widely used in the field of internet big data, it is mostly applied within a single data domain, i.e. it is trained directly on a party's own data without combining data from multiple parties, which limits its ability to exploit cross-domain data.
Disclosure of Invention
The invention aims to overcome the defect that the prior art has difficulty satisfying both diverse requirements and data compliance in cross-domain big data fusion and utilization, and provides a data processing method, system, device and medium based on cross-domain label propagation.
The invention solves the technical problems through the following technical scheme:
A first aspect of the invention provides a data processing method based on cross-domain label propagation, which is applied to a participant; the data processing method comprises the following steps:
receiving a tag value ciphertext of sample data sent by a demand side, wherein the tag value ciphertext is obtained by encrypting the tag value of the sample data by a preset encryption rule;
updating the initial tag value of the sample node corresponding to the sample data in the graph database according to the tag value ciphertext;
and carrying out label value propagation processing on the sample nodes according to the relation weight and the label value between the nodes in the graph database so as to obtain the label value of a target node adjacent to the sample nodes.
Preferably, the step of performing label value propagation processing on the sample nodes according to the relationship weights between the nodes in the graph database and the label values includes:
obtaining a relation weight ratio between the target node and each adjacent node according to the relation weight between the nodes in the graph database;
and respectively calculating the product of the label value of each adjacent node of the target node and the corresponding relation weight ratio according to the label value of the node in the graph database, and setting the sum of the products as the label value of the target node.
The second aspect of the present invention provides a data processing method based on cross-domain label propagation, which is applied to a demander, and the data processing method includes:
sending a tag value ciphertext of the sample data to a participant, wherein the tag value ciphertext is used for updating an initial tag value of a sample node corresponding to the sample data in a graph database of the participant; the label value ciphertext is obtained by encrypting the label value of the sample data by a preset encryption rule;
sending data to be queried to the participants to acquire a tag value ciphertext of a node corresponding to the data to be queried;
and decrypting the tag value ciphertext according to the preset encryption rule to obtain a tag value corresponding to the data to be queried.
The third aspect of the invention provides a data processing system based on cross-domain label propagation, which is applied to participants; the data processing system includes:
the receiving module is used for receiving a tag value ciphertext of the sample data sent by a demand party, wherein the tag value ciphertext is obtained by encrypting the tag value of the sample data by a preset encryption rule;
the updating module is used for updating the initial tag value of the sample node corresponding to the sample data in the graph database according to the tag value ciphertext;
and the dyeing module is used for carrying out label value propagation processing on the sample nodes according to the relation weight and the label value between the nodes in the graph database so as to obtain the label value of the target node adjacent to the sample nodes.
Preferably, the dyeing module comprises:
the weight obtaining unit is used for obtaining a relation weight ratio between the target node and each adjacent node according to the relation weight between the nodes in the graph database;
and the label value calculation unit is used for calculating the product of the label value of each adjacent node of the target node and the corresponding relation weight ratio according to the label value of the node in the graph database, and setting the sum of the products as the label value of the target node.
A fourth aspect of the present invention provides a data processing system based on cross-domain label propagation, which is applied to a demander, and includes:
the sending module is used for sending the tag value ciphertext of the sample data to the participant so as to update the initial tag value of the sample node corresponding to the sample data in the graph database of the participant; the label value ciphertext is obtained by encrypting the label value of the sample data by a preset encryption rule;
the query module is used for sending data to be queried to the participants so as to obtain a tag value ciphertext of a node corresponding to the data to be queried;
and the query result acquisition module is used for decrypting the tag value ciphertext according to the preset encryption rule so as to acquire the tag value corresponding to the data to be queried.
Preferably, the graph database includes nodes corresponding to the sample data one to one.
Preferably, the preset encryption rule is a homomorphic encryption algorithm.
The invention also provides a data processing system based on cross-domain label propagation, which comprises the data processing system based on cross-domain label propagation and applied to the demander and the participant.
A fifth aspect of the present invention provides an electronic device, comprising a memory and a processor connected to the memory, wherein the processor implements the above-mentioned data processing method based on cross-domain tag propagation when executing a computer program stored on the memory.
A sixth aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the above-mentioned data processing method based on cross-domain label propagation.
The positive effects of the invention are as follows: on the premise that the data are compliant and no party's data are leaked, the data processing method, system, device and medium based on cross-domain label propagation combine homomorphic encryption with private set intersection to implement a cross-domain label propagation algorithm based on secure graph computation, with privacy protection for both the labels and the graph information. They balance the accuracy and security of big data fusion processing and have wide application value.
Drawings
Fig. 1 is a flowchart of a data processing method based on cross-domain tag propagation according to embodiment 1 of the present invention.
Fig. 2 is a flowchart of a data processing method based on cross-domain tag propagation according to embodiment 2 of the present invention.
Fig. 3 is a block diagram of a data processing system based on cross-domain tag propagation according to embodiment 3 of the present invention.
Fig. 4 is a schematic block diagram of a data processing system based on cross-domain label propagation according to embodiment 4 of the present invention.
Fig. 5 is a block diagram of an electronic device according to embodiment 5 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the invention thereto.
Graph data in the present invention may be stored in a graph database, including but not limited to JanusGraph. JanusGraph is an open-source distributed graph database with good scalability; running on a multi-machine cluster, it can store and query graph data with hundreds of billions of nodes and edges (the connections between nodes), and it supports a large number of users executing complex real-time graph traversals concurrently.
Example 1
Referring to fig. 1, the embodiment specifically provides a data processing method based on cross-domain label propagation, which is applied to a participant; the method comprises the following steps:
S1, receiving a tag value ciphertext of sample data sent by a demand party, wherein the tag value ciphertext is obtained after the tag value of the sample data is encrypted through a preset encryption rule;
S2, updating the initial tag value of the sample node corresponding to the sample data in the graph database according to the tag value ciphertext;
and S3, carrying out label value propagation processing on the sample nodes according to the relation weight and the label value between the nodes in the graph database so as to obtain the label value of the target node adjacent to the sample nodes.
This embodiment is described from the participant's side, where the demander holds the sample data and the participant holds the graph data. A privacy-preserving set intersection protocol allows two parties, each holding its own set, to jointly compute the intersection of the two sets. At the end of the protocol interaction, the two parties obtain the correct intersection and learn nothing about the other party's set beyond the intersection. In this embodiment, the data common to both parties may be obtained by a private set intersection technique, so that the demander's data and the participant's nodes are placed in one-to-one correspondence according to their identifiers, i.e. the data are aligned.
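The text does not say which private set intersection protocol is used. As a minimal sketch of the idea, the following toy uses a Diffie-Hellman style construction, which is an assumption swapped in for illustration. The modulus `P`, the helper `hash_to_group`, and the function names are hypothetical, both parties are simulated in one process, and the parameters are not secure.

```python
import hashlib
import secrets

# Toy modulus (a Mersenne prime). Real deployments would use a vetted
# elliptic-curve group; this value is for illustration only.
P = 2**127 - 1

def hash_to_group(identifier: str) -> int:
    """Map an identifier to an integer in 1..P-1 (hypothetical helper)."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1

def blind(values, secret):
    """Raise every value to a party's secret exponent modulo P."""
    return [pow(v, secret, P) for v in values]

def private_set_intersection(demander_ids, participant_ids):
    """Simulate both parties and return the demander's IDs in the intersection."""
    a = secrets.randbelow(P - 3) + 2   # demander's secret exponent
    b = secrets.randbelow(P - 3) + 2   # participant's secret exponent

    # Demander -> participant: H(x)^a for each of its identifiers.
    d_once = blind([hash_to_group(x) for x in demander_ids], a)
    # Participant -> demander: H(y)^b for its own identifiers, plus the
    # demander's values raised to b (order preserved).
    p_once = blind([hash_to_group(y) for y in participant_ids], b)
    d_twice = blind(d_once, b)
    # Demander raises the participant's values to a and compares:
    # H(x)^(ab) equals H(y)^(ba) when x == y.
    p_twice = set(blind(p_once, a))
    return [x for x, t in zip(demander_ids, d_twice) if t in p_twice]
```

For instance, `private_set_intersection(["u1", "u2", "u3"], ["u3", "u4", "u2"])` yields the shared identifiers in the demander's order, which is the alignment the label propagation steps rely on.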
In step S1, the participant receives a tag value ciphertext of the sample data sent by the demander, where the tag value ciphertext is obtained by encrypting the tag value of the sample data according to a preset encryption rule. Preferably, the preset encryption rule may be a homomorphic encryption algorithm that satisfies both additive and multiplicative homomorphism. Homomorphic encryption is a cryptographic technique based on the computational complexity theory of mathematical problems: if homomorphically encrypted data are processed to obtain an output and that output is decrypted, the result is the same as the output obtained by processing the unencrypted original data in the same way.
[Formula image not reproduced in this text]
The data label can be represented numerically, for example as 0 and 1; the demander encrypts the numerical label, encodes the encryption result and sends it to the participant. When values are assigned to the nodes in the participant's graph database, step S2 updates the corresponding nodes in the graph database, i.e. the sample nodes, with the tag value ciphertext of the sample data, which becomes the initial tag value of the sample nodes. Step S3 then dyes the neighbor nodes of the sample nodes according to the initial label values of the sample nodes. Specifically, label value propagation processing is performed on the sample nodes by combining the label values of the sample nodes with the relation weights between the nodes in the graph database, so as to obtain the label values of the target nodes adjacent to the sample nodes.
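The homomorphic property described in words above can be written compactly. The notation here is an assumption introduced for illustration and is not taken from the original formula image: Enc and Dec denote encryption and decryption, f is the processing applied to plaintexts, and f̃ is the corresponding processing applied to ciphertexts.

```latex
\mathrm{Dec}\bigl(\tilde{f}(\mathrm{Enc}(x_1), \dots, \mathrm{Enc}(x_n))\bigr) = f(x_1, \dots, x_n)
% Additive and multiplicative special cases:
\mathrm{Dec}\bigl(\mathrm{Enc}(a) \oplus \mathrm{Enc}(b)\bigr) = a + b, \qquad
\mathrm{Dec}\bigl(\mathrm{Enc}(a) \otimes \mathrm{Enc}(b)\bigr) = a \cdot b
```

Here ⊕ and ⊗ stand for whatever ciphertext operations the chosen scheme provides for addition and multiplication.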
As a preferred embodiment, step S3 includes:
S31, acquiring a relation weight ratio between a target node and each adjacent node according to the relation weight between the nodes in the graph database;
and S32, respectively calculating the product of the label value of each adjacent node of the target node and the corresponding relation weight ratio according to the label value of the nodes in the graph database, and setting the sum of the products as the label value of the target node.
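Steps S31 and S32 amount to a weighted average of neighbor label values. The following plaintext sketch shows only the arithmetic; the function names and the dictionary layout are assumptions for illustration, and in the scheme described here the label values would be ciphertexts rather than plain numbers.

```python
def weight_ratios(edge_weights):
    """S31: normalise a target node's edge weights so that they sum to 1."""
    total = sum(edge_weights.values())
    if total == 0:
        raise ValueError("target node has no weighted neighbours")
    return {nbr: w / total for nbr, w in edge_weights.items()}

def propagate_label(edge_weights, labels):
    """S32: sum of each neighbour's label value times its weight ratio."""
    ratios = weight_ratios(edge_weights)
    return sum(ratios[nbr] * labels[nbr] for nbr in ratios)

# A target node with neighbour "a" (weight 3, label 1) and "b" (weight 1, label 0).
example = propagate_label({"a": 3, "b": 1}, {"a": 1, "b": 0})
```

In this example the target node receives 3/4 of its value from "a" and 1/4 from "b", so `example` is 0.75.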
When neighbor nodes are dyed, each neighbor node is influenced by every node adjacent to it, so the degree of influence of the different adjacent nodes can be expressed by weight normalization or similar means; for example, a weighted average of the label values of the adjacent nodes is calculated according to the weights. The label propagation algorithm iteratively updates all nodes based on the existing labels. Considering that the data scale is huge, possibly reaching hundreds of millions of data nodes, and that abnormal users are quite sparse in mass data, it is possible not to update all nodes but only the nodes within a set step length, for example by iteratively updating only the adjacent nodes, or the adjacent nodes of the adjacent nodes. This saves computing resources and keeps the computation efficient.
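Restricting updates to nodes within a set step length can be sketched as a breadth-first search that stops at a hop limit. The adjacency-dictionary layout and the name `nodes_within_steps` are assumptions for illustration; a graph database would normally answer this with its own traversal query.

```python
from collections import deque

def nodes_within_steps(adjacency, sample_nodes, max_steps):
    """Return {node: hop distance} for non-sample nodes within max_steps hops."""
    distance = {node: 0 for node in sample_nodes}
    queue = deque(sample_nodes)
    while queue:
        node = queue.popleft()
        if distance[node] == max_steps:
            continue  # do not expand beyond the set step length
        for nbr in adjacency.get(node, ()):
            if nbr not in distance:
                distance[nbr] = distance[node] + 1
                queue.append(nbr)
    # Sample nodes themselves (distance 0) already carry labels.
    return {n: d for n, d in distance.items() if d > 0}

# Chain graph: s - a - b - c, with "s" as the only sample node.
chain = {"s": ["a"], "a": ["s", "b"], "b": ["a", "c"], "c": ["b"]}
to_update = nodes_within_steps(chain, ["s"], 2)
```

With a step length of 2, only "a" and "b" are selected for updating and "c" is left untouched.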
Specifically, let $V$ denote the set of all nodes to be propagated. For each node $v$ in $V$: set $sum = 0$ and obtain all neighbor nodes $N(v)$ of $v$; for each node $u$ in $N(v)$ and each node $x$ in $N(u)$, let $w_{ux}$ be the relation weight between $u$ and $x$, which is available from the connecting edges of the nodes in the graph database, update $sum = sum + w_{ux}$, and let $y_x$ be the encrypted label of $x$. Then, for each node $u$ in $N(v)$: let $y_u$ be a homomorphic encryption of 0, and for each node $x$ in $N(u)$ compute $w'_{ux} = w_{ux} / sum$ and $y_u = y_u + w'_{ux} \cdot y_x$. Once $V$ has been traversed, the sum of the products of the label value of each adjacent node of the target node and the corresponding relation weight ratio has been obtained.
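The traversal above can be sketched on ciphertexts with a toy additively homomorphic scheme. Everything here is an assumption for illustration: the text names no concrete scheme, so Paillier is swapped in (it supports ciphertext addition and multiplication by a plaintext scalar only), the primes are far too small to be secure, weight ratios are rounded to fixed point with a hypothetical `SCALE`, the running sum is taken per node `u` to match steps S31 and S32, and neighbors without a stored ciphertext are skipped.

```python
import math
import secrets

# --- Toy Paillier cryptosystem; parameters are NOT secure ---
P_PRIME, Q_PRIME = 2**31 - 1, 2**61 - 1       # two known Mersenne primes
N = P_PRIME * Q_PRIME
N_SQ = N * N
LAM = (P_PRIME - 1) * (Q_PRIME - 1) // math.gcd(P_PRIME - 1, Q_PRIME - 1)
MU = pow(LAM, -1, N)                          # valid for generator g = N + 1
SCALE = 1000                                  # fixed-point factor for ratios

def encrypt(m: int) -> int:
    """Encrypt an integer with the public key (N, g = N + 1)."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N_SQ * pow(r, N, N_SQ) % N_SQ

def decrypt(c: int) -> int:
    """Decrypt with the private values LAM and MU (held by the demander)."""
    return (pow(c, LAM, N_SQ) - 1) // N * MU % N

def he_add(c1: int, c2: int) -> int:
    """Ciphertext of the sum of two plaintexts."""
    return c1 * c2 % N_SQ

def he_scale(c: int, k: int) -> int:
    """Ciphertext of the plaintext multiplied by the integer k."""
    return pow(c, k, N_SQ)

def propagate_encrypted(weights, enc_labels, targets):
    """Participant side: weighted sum of encrypted neighbour labels per target."""
    updated = {}
    for u in targets:
        total = sum(weights[u].values())        # sum = sum + w_ux
        y_u = encrypt(0)                        # homomorphically encrypted 0
        for x, w_ux in weights[u].items():
            if x not in enc_labels:             # assumption: no label, no term
                continue
            ratio = round(SCALE * w_ux / total)  # w'_ux in fixed point
            y_u = he_add(y_u, he_scale(enc_labels[x], ratio))
        updated[u] = y_u
    return updated
```

As a usage example, with `weights = {"u": {"a": 3, "b": 1}}` and encrypted labels 1 for "a" and 0 for "b", decrypting the result for "u" and dividing by `SCALE` gives 0.75, the same value as the plaintext weighted average, while the participant only ever handles ciphertexts.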
In one specific example, the demander is a bank and the participant is an operator. The bank holds a set of blacklisted members whose label value is 1; all other members have a default label value of 0. The demander encrypts the label values of the blacklisted members with a homomorphic encryption algorithm and sends them to the participant. After node dyeing is performed in the graph database, the dyeing results are stored at the participant, i.e. the operator. Because the results are ciphertexts, they are not visible to the operator. Only when the demander needs to query data associated with certain sample data does it send the identifier ciphertext of the associated data to the participant; the participant finds the corresponding node according to the identifier ciphertext and returns the tag value ciphertext of that node to the demander, which decrypts it and makes a judgment. Specifically, the relationship between the associated data and the sample data may be determined according to a judgment threshold set by the demander. For example, if the decrypted plaintext label value of the associated data is 0.32 and the judgment threshold is 0.5 (a value ≥ 0.5 indicates that the associated data also belongs to the blacklist), the associated data is considered not to be blacklist data.
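The query and threshold judgment in this example can be sketched as follows. The names `query_label` and `node_store` are hypothetical, a plain string key stands in for the identifier ciphertext, and the `decrypt` callable is a placeholder for the demander's real decryption.

```python
from typing import Callable, Mapping, Tuple

def query_label(node_store: Mapping[str, int], identifier: str,
                decrypt: Callable[[int], float],
                threshold: float = 0.5) -> Tuple[float, bool]:
    """Look up a node's ciphertext, decrypt it, and apply the judgment threshold."""
    ciphertext = node_store[identifier]   # participant-side lookup by identifier
    value = decrypt(ciphertext)           # demander-side decryption
    return value, value >= threshold      # True means "treated as blacklist"

# Stand-in "ciphertext": the label value scaled by 100, for illustration only.
store = {"id42": 32}
value, flagged = query_label(store, "id42", decrypt=lambda c: c / 100)
```

Here the decrypted value 0.32 falls below the threshold 0.5, so `flagged` is `False` and the queried data is not treated as blacklist data.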
By combining homomorphic encryption with private set intersection, the data processing method based on cross-domain label propagation in this embodiment implements a cross-domain label propagation algorithm based on secure graph computation while keeping the data compliant and leaking no party's data, with privacy protection for both the labels and the graph information. A party can thereby perform fusion training using another party's graph data together with its own data and enrich its own data features. The method balances the accuracy and security of big data processing and has wide application value.
Example 2
Referring to fig. 2, the present embodiment specifically provides a data processing method based on cross-domain label propagation, which is applied to a demander, and the data processing method includes:
S101, sending a tag value ciphertext of sample data to a participant, wherein the tag value ciphertext is used for updating an initial tag value of a sample node corresponding to the sample data in a graph database of the participant; the tag value ciphertext is obtained by encrypting the tag value of the sample data by a preset encryption rule;
S102, sending data to be queried to a participant to acquire a tag value ciphertext of a node corresponding to the data to be queried;
S103, decrypting the label value ciphertext according to a preset encryption rule to obtain a label value corresponding to the data to be queried.
This embodiment is described from the demander's side, where the demander holds the sample data and the participant holds the graph data. A privacy-preserving set intersection protocol allows two parties, each holding its own set, to jointly compute the intersection of the two sets. At the end of the protocol interaction, the two parties obtain the correct intersection and learn nothing about the other party's set beyond the intersection. In this embodiment, the data common to both parties may be obtained by a private set intersection technique, so that the demander's data and the participant's nodes are placed in one-to-one correspondence according to their identifiers, i.e. the data are aligned.
In step S101, the demander sends a tag value ciphertext of the sample data to the participant. The tag value ciphertext is used for updating the initial tag value of the sample node corresponding to the sample data in the graph database of the participant, and is obtained by encrypting the tag value of the sample data with a preset encryption rule. Preferably, the preset encryption rule may be a homomorphic encryption algorithm that satisfies both additive and multiplicative homomorphism. Homomorphic encryption is a cryptographic technique based on the computational complexity theory of mathematical problems: if homomorphically encrypted data are processed to obtain an output and that output is decrypted, the result is the same as the output obtained by processing the unencrypted original data in the same way.
[Formula image not reproduced in this text]
The data label can be represented numerically, for example as 0 and 1; the demander encrypts the numerical label, encodes the encryption result and sends it to the participant. Step S102 sends the data to be queried to the participant to obtain the tag value ciphertext of the node corresponding to the data to be queried, and step S103 decrypts the tag value ciphertext according to the preset encryption rule to obtain the tag value corresponding to the data to be queried. On the participant's side, values are assigned to the nodes in the graph database as follows: the participant updates the corresponding nodes in the graph database, i.e. the sample nodes, with the tag value ciphertext of the sample data, which becomes the initial tag value of the sample nodes, and then dyes the neighbor nodes of the sample nodes. Specifically, label value propagation processing is performed on the sample nodes by combining the label values of the sample nodes with the relation weights between the nodes in the graph database, so as to obtain the label values of the target nodes adjacent to the sample nodes. When neighbor nodes are dyed, each neighbor node is influenced by every node adjacent to it, so the degree of influence of the different adjacent nodes can be expressed by weight normalization or similar means; for example, a weighted average of the label values of the adjacent nodes is calculated according to the weights.
The label propagation algorithm iteratively updates all nodes based on the existing labels. Considering that the data scale is huge, possibly reaching hundreds of millions of data nodes, and that abnormal users are quite sparse in mass data, it is possible not to update all nodes but only the nodes within a set step length, for example by iteratively updating only the adjacent nodes, or the adjacent nodes of the adjacent nodes. This saves computing resources and keeps the computation efficient.
Specifically, let $V$ denote the set of all nodes to be propagated. For each node $v$ in $V$: set $sum = 0$ and obtain all neighbor nodes $N(v)$ of $v$; for each node $u$ in $N(v)$ and each node $x$ in $N(u)$, let $w_{ux}$ be the relation weight between $u$ and $x$, which is available from the connecting edges of the nodes in the graph database, update $sum = sum + w_{ux}$, and let $y_x$ be the encrypted label of $x$. Then, for each node $u$ in $N(v)$: let $y_u$ be a homomorphic encryption of 0, and for each node $x$ in $N(u)$ compute $w'_{ux} = w_{ux} / sum$ and $y_u = y_u + w'_{ux} \cdot y_x$. Once $V$ has been traversed, the sum of the products of the label value of each adjacent node of the target node and the corresponding relation weight ratio has been obtained.
In one specific example, the demander is a bank and the participant is an operator. The bank holds a set of blacklisted members whose label value is 1; all other members have a default label value of 0. The demander encrypts the label values of the blacklisted members with a homomorphic encryption algorithm and sends them to the participant. After node dyeing is performed in the graph database, the dyeing results are stored at the participant, i.e. the operator. Because the results are ciphertexts, they are not visible to the operator. Only when the demander needs to query data associated with certain sample data does it send the identifier ciphertext of the associated data to the participant; the participant finds the corresponding node according to the identifier ciphertext and returns the tag value ciphertext of that node to the demander, which decrypts it and makes a judgment. Specifically, the relationship between the associated data and the sample data may be determined according to a judgment threshold set by the demander. For example, if the decrypted plaintext label value of the associated data is 0.32 and the judgment threshold is 0.5 (a value ≥ 0.5 indicates that the associated data also belongs to the blacklist), the associated data is considered not to be blacklist data.
By combining homomorphic encryption with private set intersection, the data processing method based on cross-domain label propagation in this embodiment implements a cross-domain label propagation algorithm based on secure graph computation while keeping the data compliant and leaking no party's data, with privacy protection for both the labels and the graph information. A party can thereby perform fusion training using another party's graph data together with its own data and enrich its own data features. The method balances the accuracy and security of big data processing and has wide application value.
Example 3
Referring to fig. 3, this embodiment specifically provides a data processing system based on cross-domain label propagation, which is applied to a participant; the data processing system includes:
the receiving module 51 is configured to receive a tag value ciphertext of the sample data sent by the requesting party, where the tag value ciphertext is obtained by encrypting the tag value of the sample data according to a preset encryption rule;
an updating module 52, configured to update an initial tag value of a sample node corresponding to sample data in the graph database according to the tag value ciphertext;
and the dyeing module 53 is configured to perform label value propagation processing on the sample node according to the relationship weight between the nodes in the graph database and the label value, so as to obtain a label value of a target node adjacent to the sample node.
Preferably, the dyeing module 53 comprises:
the weight obtaining unit 531 is configured to obtain a relationship weight ratio between a target node and each adjacent node according to a relationship weight between nodes in a graph database;
and the label value calculation unit 532 is used for respectively calculating the products of the label values of each adjacent node of the target node and the corresponding relation weight ratios according to the label values of the nodes in the graph database, and setting the sum of the products as the label value of the target node.
By combining homomorphic encryption with private set intersection, the data processing system based on cross-domain label propagation of this embodiment implements cross-domain label propagation based on secure graph computation while keeping the data compliant and leaking no party's data, with privacy protection for both the labels and the graph information. A party can thereby perform fusion training using another party's graph data together with its own data and enrich its own data features. The system balances the accuracy and security of big data processing and has wide application value.
Example 4
Referring to fig. 4, this embodiment specifically provides a data processing system based on cross-domain label propagation, which is applied to a demander and includes:
a sending module 501, configured to send a tag value ciphertext of sample data to a participant so as to update an initial tag value of a sample node corresponding to the sample data in a graph database of the participant, where the tag value ciphertext is obtained by encrypting the tag value of the sample data according to a preset encryption rule;
the query module 502 is configured to send data to be queried to a participant to obtain a tag value ciphertext of a node corresponding to the data to be queried;
the query result obtaining module 503 is configured to decrypt the tag value ciphertext according to a preset encryption rule to obtain a tag value corresponding to the data to be queried.
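The round trip between the three modules above and the participant can be sketched end to end. The example assumes that the preset encryption rule is an additively homomorphic scheme and uses a toy Paillier construction to stand in for it; the source does not name a specific algorithm. The key size, the fixed-point factor `SCALE`, and every function name are illustrative assumptions, and the key is far too small for real use.

```python
import math
import secrets

# Toy Paillier key built from two known Mersenne primes (assumption).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                  # public key
N2 = N * N
PHI = (P - 1) * (Q - 1)    # private key, held by the demander only
MU = pow(PHI, -1, N)       # modular inverse, requires Python 3.8+
SCALE = 1000               # fixed-point factor for weight ratios (assumption)


def encrypt(m: int) -> int:
    """Demander side: encrypt an integer tag value."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Demander side: recover the integer hidden in a ciphertext."""
    return ((pow(c, PHI, N2) - 1) // N * MU) % N


def propagate_encrypted(neighbour_weights, enc_labels):
    """Participant side: weighted sum of encrypted tag values.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext to an
    integer power multiplies its plaintext by that integer.
    """
    total = sum(neighbour_weights.values())
    acc = 1  # the value 1 is a valid ciphertext of 0
    for node, weight in neighbour_weights.items():
        k = round(weight / total * SCALE)  # integer weight ratio
        acc = (acc * pow(enc_labels[node], k, N2)) % N2
    return acc


# Sending module: tag value ciphertexts of two sample nodes.
enc_labels = {"a": encrypt(1), "b": encrypt(0)}
# Participant: propagate to a target node adjacent to "a" and "b".
enc_target = propagate_encrypted({"a": 3, "b": 1}, enc_labels)
# Query result obtaining module: decrypt and undo the fixed-point scaling.
target_label = decrypt(enc_target) / SCALE
```

The participant never holds `PHI`, so in this sketch it works only on ciphertexts, while the demander learns the propagated value but not the graph weights that produced it.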
By combining homomorphic encryption with private set intersection, the data processing system based on cross-domain label propagation of this embodiment keeps the data compliant and leaks no party's data. It realizes cross-domain label propagation based on secure graph computation while protecting the privacy of both the labels and the graph information, so that a party can perform fusion training using another party's graph data together with its own data and thereby enrich its own data features. The system balances the accuracy and security of big data processing and has wide application value.
Example 5
Referring to fig. 5, the present embodiment provides an electronic device 30, which includes a processor 31, a memory 32, and a computer program stored in the memory 32 and executable on the processor 31. When the processor 31 executes the computer program, the data processing method based on cross-domain label propagation in embodiments 1 and 2 is implemented. The electronic device 30 shown in fig. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
The electronic device 30 may be embodied in the form of a general purpose computing device, which may be, for example, a server device. The components of the electronic device 30 may include, but are not limited to: at least one processor 31, at least one memory 32, and a bus 33 connecting the various system components (including the memory 32 and the processor 31).
The bus 33 includes a data bus, an address bus, and a control bus.
The memory 32 may include volatile memory, such as Random Access Memory (RAM) 321 and/or cache memory 322, and may further include Read Only Memory (ROM) 323.
Memory 32 may also include a program/utility 325 having a set (at least one) of program modules 324, such program modules 324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The processor 31 executes various functional applications and data processing, such as the data processing method based on cross-domain label propagation in embodiment 1 and embodiment 2 of the present invention, by running the computer program stored in the memory 32.
The electronic device 30 may also communicate with one or more external devices 34 (e.g., keyboard, pointing device, etc.). Such communication may be through input/output (I/O) interfaces 35. The electronic device 30 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via network adapter 36. Network adapter 36 communicates with the other modules of the electronic device 30 via bus 33. Other hardware and/or software modules may be used in conjunction with the electronic device 30, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, and data backup storage systems, etc.
It should be noted that although several units/modules or sub-units/modules of the electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functions of two or more of the units/modules described above may be embodied in one unit/module according to embodiments of the invention. Conversely, the features and functions of one unit/module described above may be further divided so as to be embodied by a plurality of units/modules.
Example 6
The present embodiment provides a computer-readable storage medium on which a computer program is stored, which when executed by a processor implements the data processing method based on cross-domain label propagation in embodiment 1 and embodiment 2.
More specific examples of the readable storage medium may include, but are not limited to: a portable disk, a hard disk, a random access memory, a read-only memory, an erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible implementation, the present invention can also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to execute the data processing method based on cross-domain label propagation in embodiments 1 and 2.
The program code for carrying out the invention may be written in any combination of one or more programming languages, and may execute entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on the remote device.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (11)

1. A data processing method based on cross-domain label propagation, characterized in that it is applied to a participant; the data processing method comprises the following steps:
receiving a tag value ciphertext of sample data sent by a demander, wherein the tag value ciphertext is obtained by encrypting the tag value of the sample data according to a preset encryption rule;
updating the initial tag value of the sample node corresponding to the sample data in the graph database according to the tag value ciphertext;
and carrying out label value propagation processing on the sample nodes according to the relationship weights between the nodes in the graph database and the label values, so as to obtain the label value of a target node adjacent to the sample nodes.
2. The data processing method based on cross-domain label propagation according to claim 1, wherein the step of performing label value propagation processing on the sample nodes according to the relationship weights between the nodes in the graph database and the label values comprises:
obtaining a relation weight ratio between the target node and each adjacent node according to the relation weight between the nodes in the graph database;
and respectively calculating the product of the label value of each adjacent node of the target node and the corresponding relation weight ratio according to the label value of the node in the graph database, and setting the sum of the products as the label value of the target node.
3. A data processing method based on cross-domain label propagation, applied to a demander, the data processing method comprising the following steps:
sending a tag value ciphertext of the sample data to a participant, wherein the tag value ciphertext is used for updating an initial tag value of a sample node corresponding to the sample data in a graph database of the participant; the tag value ciphertext is obtained by encrypting the tag value of the sample data according to a preset encryption rule;
sending data to be queried to the participant to acquire a tag value ciphertext of a node corresponding to the data to be queried;
and decrypting the tag value ciphertext according to the preset encryption rule to obtain a tag value corresponding to the data to be queried.
4. A data processing system based on cross-domain label propagation, characterized in that it is applied to a participant; the data processing system comprises:
the receiving module is used for receiving a tag value ciphertext of the sample data sent by a demander, wherein the tag value ciphertext is obtained by encrypting the tag value of the sample data according to a preset encryption rule;
the updating module is used for updating the initial tag value of the sample node corresponding to the sample data in the graph database according to the tag value ciphertext;
and the dyeing module is used for carrying out label value propagation processing on the sample nodes according to the relationship weights between the nodes in the graph database and the label values, so as to obtain the label value of the target node adjacent to the sample nodes.
5. The cross-domain label propagation-based data processing system of claim 4, wherein the dyeing module comprises:
the weight obtaining unit is used for obtaining a relation weight ratio between the target node and each adjacent node according to the relation weight between the nodes in the graph database;
and the label value calculation unit is used for calculating the product of the label value of each adjacent node of the target node and the corresponding relation weight ratio according to the label value of the node in the graph database, and setting the sum of the products as the label value of the target node.
6. A data processing system based on cross-domain label propagation, applied to a demander, the data processing system comprising:
the sending module is used for sending the tag value ciphertext of the sample data to the participant so as to update the initial tag value of the sample node corresponding to the sample data in the graph database of the participant; the tag value ciphertext is obtained by encrypting the tag value of the sample data by a preset encryption rule;
the query module is used for sending data to be queried to the participant so as to obtain a tag value ciphertext of a node corresponding to the data to be queried;
and the query result acquisition module is used for decrypting the tag value ciphertext according to the preset encryption rule so as to acquire the tag value corresponding to the data to be queried.
7. The cross-domain label propagation-based data processing system as claimed in any one of claims 4-6, wherein nodes in one-to-one correspondence with the sample data are included in the graph database.
8. The cross-domain label propagation-based data processing system as claimed in any one of claims 4 to 6, wherein the preset encryption rule is a homomorphic encryption algorithm.
9. A cross-domain label propagation-based data processing system comprising the cross-domain label propagation-based data processing system of claim 4 and the cross-domain label propagation-based data processing system of any one of claims 5 or 6.
10. An electronic device comprising a memory and a processor coupled to the memory, the processor implementing the data processing method based on cross-domain label propagation of any one of claims 1-3 when executing a computer program stored on the memory.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the data processing method based on cross-domain label propagation of any one of claims 1 to 3.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210499573.8A CN114615090B (en) 2022-05-10 2022-05-10 Data processing method, system, device and medium based on cross-domain label propagation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210499573.8A CN114615090B (en) 2022-05-10 2022-05-10 Data processing method, system, device and medium based on cross-domain label propagation

Publications (2)

Publication Number Publication Date
CN114615090A true CN114615090A (en) 2022-06-10
CN114615090B CN114615090B (en) 2022-08-23

Family

ID=81869826

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210499573.8A Active CN114615090B (en) 2022-05-10 2022-05-10 Data processing method, system, device and medium based on cross-domain label propagation

Country Status (1)

Country Link
CN (1) CN114615090B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060124737A1 (en) * 2004-11-29 2006-06-15 Kyunghee Oh Method and system for updating RFID tag value of transferred object
US20160283462A1 (en) * 2015-03-24 2016-09-29 Xerox Corporation Language identification on social media
US20170351819A1 (en) * 2016-06-01 2017-12-07 Grand Rounds, Inc. Data driven analysis, modeling, and semi-supervised machine learning for qualitative and quantitative determinations
US20170351681A1 (en) * 2016-06-03 2017-12-07 International Business Machines Corporation Label propagation in graphs
CN108234468A (en) * 2017-12-28 2018-06-29 中国电子科技集团公司第三十研究所 A kind of cross-domain data transmission guard method based on label
CN110136016A (en) * 2019-04-04 2019-08-16 中国科学院信息工程研究所 A kind of multi-tag transmission method and system based on implicit association
CN110956553A (en) * 2019-12-16 2020-04-03 电子科技大学 Community structure division method based on social network node dual-label propagation algorithm
CN112017024A (en) * 2020-07-23 2020-12-01 北京瓴岳信息技术有限公司 Credit risk assessment method, system, computer device and storage medium
CN112052399A (en) * 2020-08-12 2020-12-08 腾讯科技(深圳)有限公司 Data processing method and device and computer readable storage medium
CN112202919A (en) * 2020-10-22 2021-01-08 中国科学院信息工程研究所 Picture ciphertext storage and retrieval method and system under cloud storage environment
CN113095946A (en) * 2021-04-28 2021-07-09 福州大学 Insurance customer recommendation method and system based on federal label propagation
CN113254683A (en) * 2020-02-07 2021-08-13 阿里巴巴集团控股有限公司 Data processing method and device and label identification method and device
CN114039785A (en) * 2021-11-10 2022-02-11 奇安信科技集团股份有限公司 Data encryption, decryption and processing method, device, equipment and storage medium
CN114401154A (en) * 2022-03-24 2022-04-26 华控清交信息科技(北京)有限公司 Data processing method and device, ciphertext calculation engine and device for data processing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
张世奇: ""基于知识图谱的云服务推荐系统研究与实现"", 《东华大学硕士学位论文》 *
张玲玉等: "基于OAN的知识图谱查询研究", 《软件》 *
蔡国永等: "社会语义网社区发现标签传递算法研究", 《计算机科学》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant