CN115291889B - Data blood relationship establishing method and device and electronic equipment - Google Patents

Data blood relationship establishing method and device and electronic equipment

Info

Publication number
CN115291889B
Authority
CN
China
Prior art keywords
attribute
ciphertext
source
variable
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211178969.9A
Other languages
Chinese (zh)
Other versions
CN115291889A (en)
Inventor
刘琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huakong Tsingjiao Information Technology Beijing Co Ltd
Original Assignee
Huakong Tsingjiao Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huakong Tsingjiao Information Technology Beijing Co Ltd
Priority to CN202211178969.9A
Publication of CN115291889A
Application granted
Publication of CN115291889B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis
    • G06F8/425 Lexical analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/43 Checking; Contextual analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861 Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/46 Secure multiparty computation, e.g. millionaire problem

Abstract

The application discloses a method and a device for establishing a data blood relationship, and an electronic device, which relate to the technical fields of secure multi-party computation and data processing. The method comprises the following steps: constructing an abstract syntax tree of the ciphertext computing code in a Python script of a ciphertext computing task; traversing the Assign nodes and Expr nodes contained in the abstract syntax tree; acquiring the source relationships between variables in the ciphertext computing code from the Assign nodes and the Expr nodes; and establishing a blood relationship between each input data and each output data of the ciphertext computing task based on the source relationships among the variables in the ciphertext computing code. With this scheme, a data blood relationship can be established for ciphertext data.

Description

Data blood relationship establishing method and device and electronic equipment
Technical Field
The present application relates to the field of multi-party secure computing technologies and data processing technologies, and in particular, to a method and an apparatus for establishing a data blood relationship, and an electronic device.
Background
The data blood relationship (data lineage), i.e. the origin and derivation of data, mainly covers the source of the data, how the data is processed, the mapping relationships, and so on. A clear data blood relationship is the basis for keeping a data platform stable, and it facilitates impact analysis of data changes and troubleshooting of data problems.
In a data system, analyzing and establishing the data blood relationship is an important part of data governance. An established data blood relationship can be used for locating anomalies, tracing data origins, impact analysis and the like, and greatly helps improve the quality and efficiency of data management.
Existing data blood relationship analysis methods and tools all target relational databases such as MySQL or big data systems such as Hive. The data in these systems is plaintext, and the processing relationships of the data are mostly described in the form of SQL scripts.
Secure multi-party computation (MPC) allows data to be computed on or fused across multiple mutually distrusting databases while each party's data stays confidential. Such computation is usually carried out by executing a ciphertext computing task, i.e. a data computing task implemented with secure multi-party computation technology, in which the whole computation process is performed while the data remains mutually confidential.
In a multi-party secure computing platform, because data computation is performed in a ciphertext mode and a language describing a data processing relationship is a Python language, the existing data blood relationship analysis method and tool aiming at a plaintext database and an SQL script cannot be directly applied to data blood relationship establishment aiming at ciphertext data.
Disclosure of Invention
The embodiment of the application provides a data blood relationship establishing method and device and electronic equipment, and aims to solve the problem that data blood relationship establishing cannot be achieved for ciphertext data in the prior art.
The embodiment of the application provides a method for establishing a data blood relationship, which comprises the following steps:
constructing an abstract syntax tree of ciphertext computing codes in a Python script of the ciphertext computing task;
traversing an Assign node and an Expr node contained in the abstract syntax tree;
acquiring the source relation among variables in the ciphertext calculation code from the Assign node and the Expr node;
and establishing a blood relation between each input data and each output data of the ciphertext computing task based on the source relation among the variables in the ciphertext computing code.
Further, the obtaining of the source relationship between variables in the ciphertext computation code from the Assign node and the Expr node includes:
for each traversed Assign node, obtaining the assigned target variable from the id attribute in the targets attribute of the Assign node;
when the value attribute of the Assign node comprises a func attribute and the attr attribute of the func attribute is pp.ss operation, acquiring a source relation between the target variable and an input variable of the ciphertext calculation code;
when the value attribute of the Assign node does not contain the func attribute, or the attr attribute of the contained func attribute is not pp.ss operation, acquiring the source relationship between the target variable and the source variable in the Assign node;
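and for each traversed Expr node, when the attr attribute of the func attribute of the value attribute of the Expr node is pp.previous operation, acquiring the source relationship between the output variable and the source variable in the Expr node.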
Further, when the value attribute of the Assign node does not include a func attribute, or the attr attribute of the included func attribute is not pp.ss operation, acquiring the source relationship between the target variable and the source variable in the Assign node includes:
when the value attribute of the Assign node does not contain the func attribute, or the attr attribute of the contained func attribute is not pp.ss operation, if the Assign node contains the slice attribute, the column number of the source variable in the Assign node is obtained from the slice attribute;
acquiring a source relation between the target variable and the column number of the source variable in the Assign node;
establishing a blood relationship between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code, including:
and establishing a blood relationship accurate to the column between each input data and each output data of the ciphertext calculation task based on the source relationship among the variables in the ciphertext calculation code.
Further, before the establishing of a blood relationship accurate to the column between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code, the method further includes:
acquiring a field name corresponding to the column number of the input data represented by the source variable with the slice attribute from a sample example;
establishing a blood relationship accurate to the column between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code, including:
and establishing a blood relationship accurate to the field name between each input data and each output data of the ciphertext calculation task based on the source relationship among the variables in the ciphertext calculation code.
Further, before the establishing of a blood relationship between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code, the method further includes:
acquiring an input corresponding relation between an input data address and an input variable and an output corresponding relation between an output data address and an output variable in a task configuration file of the ciphertext computing task;
establishing a blood relationship between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code, including:
and establishing a blood relation between each input data address and each output data address of the ciphertext computing task based on the source relation among the variables in the ciphertext computing code and the input corresponding relation and the output corresponding relation.
The embodiment of the present application further provides a data blood relationship establishing apparatus, including:
the syntax tree construction module is used for constructing an abstract syntax tree of ciphertext computing codes in a Python script of the ciphertext computing task;
the node traversing module is used for traversing the Assign node and the Expr node contained in the abstract syntax tree;
a source relation obtaining module, configured to obtain, from contents in the Assign node and the Expr node, a source relation between variables in the ciphertext computation code;
and the blood relationship establishing module is used for establishing blood relationship between each input data and each output data of the ciphertext calculation task based on the source relationship among the variables in the ciphertext calculation code.
Further, the source relationship obtaining module is specifically configured to, for each traversed Assign node, obtain the assigned target variable from the id attribute in the targets attribute of the Assign node;
when the value attribute of the Assign node comprises a func attribute and the attr attribute of the func attribute is pp.ss operation, acquiring a source relation between the target variable and an input variable of the ciphertext calculation code;
when the value attribute of the Assign node does not contain the func attribute, or the attr attribute of the contained func attribute is not pp.ss operation, acquiring the source relationship between the target variable and the source variable in the Assign node;
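and for each traversed Expr node, when the attr attribute of the func attribute of the value attribute of the Expr node is pp.previous operation, acquiring the source relationship between the output variable and the source variable in the Expr node.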
Further, the source relationship obtaining module is specifically configured to, when the value attribute of the Assign node does not include a func attribute, or an attr attribute of the included func attribute is not pp.ss operation, if the Assign node includes a slice attribute, obtain a column number of a source variable in the Assign node from the slice attribute;
acquiring a source relation between the target variable and the column number of the source variable in the Assign node;
the blood relationship establishing module is specifically configured to establish a blood relationship accurate to the column between each input data and each output data of the ciphertext calculation task based on the source relationship between each variable in the ciphertext calculation code.
Further, the source relationship obtaining module is further configured to obtain, from the sample example, a field name corresponding to the column number of the input data represented by the source variable having the slice attribute;
the blood relationship establishing module is specifically configured to establish a blood relationship accurate to a field name between each input data and each output data of the ciphertext calculation task based on the source relationship between each variable in the ciphertext calculation code.
Further, the source relation obtaining module is further configured to obtain an input correspondence between an input data address and an input variable in a task configuration file of the ciphertext computing task, and an output correspondence between an output data address and an output variable;
the blood relationship establishing module is specifically configured to establish a blood relationship between each input data address and each output data address of the ciphertext calculation task based on the source relationship between each variable in the ciphertext calculation code, and the input corresponding relationship and the output corresponding relationship.
Embodiments of the present application further provide an electronic device, including a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, where the machine-executable instructions cause the processor to implement any of the above data blood relationship establishing methods.
An embodiment of the present application further provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the computer program implements any one of the above data blood relationship establishing methods.
Embodiments of the present application further provide a computer program product containing instructions that, when executed on a computer, cause the computer to perform any of the above data blood relationship establishing methods.
The beneficial effects of the present application include:
In the method provided by the embodiment of the application, the ciphertext calculation code in the Python script of the ciphertext calculation task is processed, the contents of the Assign nodes and Expr nodes contained in the resulting abstract syntax tree are analyzed to obtain the source relationships among variables in the ciphertext calculation code, and the blood relationship between each input data and each output data of the ciphertext calculation task is established based on those source relationships. In this way, a data blood relationship is established for ciphertext data.
Additional features and advantages of the present application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the present application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application and not to limit the application. In the drawings:
FIG. 1 is a flowchart of a data blood relationship establishing method according to an embodiment of the present application;
FIG. 2 is a flowchart of a data blood relationship establishing method according to another embodiment of the present application;
FIG. 3 is a diagram of the ciphertext computing code of a Python script according to an embodiment of the present application;
FIG. 4 is a schematic diagram of building an abstract syntax tree with the ciphertext computing code passed as a character string in an embodiment of the present application;
FIG. 5 is a diagram of an abstract syntax tree created in an embodiment of the present application;
FIG. 6 is a diagram illustrating the contents of an Assign node in an abstract syntax tree according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a blood relationship established in an embodiment of the present application;
FIG. 8 is a schematic diagram of a blood relationship established in an embodiment of the present application;
FIG. 9 is a diagram illustrating the contents of an Assign node including a slice attribute according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a blood relationship established in an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a data blood relationship establishing apparatus according to an embodiment of the present application;
FIG. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To provide an implementation for establishing a data blood relationship for ciphertext data, embodiments of the present application provide a data blood relationship establishing method, an apparatus, and an electronic device. Preferred embodiments of the present application are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described herein only illustrate and explain the present application and do not limit it. The embodiments in the present application, and the features of those embodiments, may be combined with each other where no conflict arises.
An embodiment of the present application provides a method for establishing a data blood relationship, as shown in fig. 1, including:
step 11, constructing an abstract syntax tree of ciphertext calculation codes in a Python script of the ciphertext calculation task;
step 12, traversing an Assign node and an Expr node contained in the abstract syntax tree;
step 13, acquiring the source relation among variables in the ciphertext calculation code from the content in the Assign node and the Expr node;
step 14, establishing a blood relation between each input data and each output data of the ciphertext calculation task based on the source relation among the variables in the ciphertext calculation code.
With the method provided by the embodiment of the application, the ciphertext calculation code in the Python script of the ciphertext calculation task is processed, the contents of the Assign nodes and Expr nodes contained in the resulting abstract syntax tree are analyzed to obtain the source relation among variables in the ciphertext calculation code, and the blood relationship between each input data and each output data of the ciphertext calculation task is established based on that source relation. In this way, a data blood relationship is established for ciphertext data.
The method and apparatus provided herein are described in detail below with reference to the accompanying drawings using specific embodiments.
An embodiment of the present application provides a method for establishing a data blood relationship, as shown in fig. 2, including:
step 201, aiming at a ciphertext calculation task needing to establish a data blood relationship, acquiring an input corresponding relationship between an input data address and an input variable and an output corresponding relationship between an output data address and an output variable in a task configuration file of the ciphertext calculation task.
The multi-party secure computing technology is usually realized by executing a ciphertext computing task, wherein the ciphertext computing task generally comprises a task configuration file, a Python script for ciphertext operation, input data and output data, and the Python script comprises a ciphertext computing code.
The task configuration file generally stores input variables and output variables in the ciphertext calculation codes in the Python script of the ciphertext calculation task, wherein the input variables represent input data, the output variables represent output data, and the task configuration file may also store addresses (which may also be referred to as storage paths) of the input data and the output data.
For example, in a ciphertext computing task, the information in the task configuration file is as follows:
input variables representing input data: "shuidianA A" and "shuidianA B";
corresponding input data addresses: "cirher://ds02/power/suidianA" and "cirher://ds02/power/suidianB";
output variables representing output data: "total_power_all" and "total_power_nonpublic";
output data addresses: all stored on the ds03 host.
FIG. 3 shows the ciphertext computing code of a Python script provided in this embodiment; the code contains the input variables and output variables from the task configuration file.
In this step, the input correspondence between the input data address and the input variable and the output correspondence between the output data address and the output variable are obtained from the task configuration file of the ciphertext calculation task, and are used for subsequently establishing the data consanguinity relationship.
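The correspondences read in step 201 can be held as two plain mappings. The following is a minimal sketch only: the text does not specify the format of the task configuration file, so the JSON layout, the key names, the variable names and the addresses below are all hypothetical.

```python
import json

# Hypothetical task configuration; the real file format is not given in the text.
TASK_CONFIG = """
{
  "inputs":  [{"variable": "in_a", "address": "cipher://ds02/power/in_a"},
              {"variable": "in_b", "address": "cipher://ds02/power/in_b"}],
  "outputs": [{"variable": "total_power_all", "address": "ds03"},
              {"variable": "total_power_nonpublic", "address": "ds03"}]
}
"""

def load_correspondences(config_text):
    """Return (input variable -> address, output variable -> address)."""
    config = json.loads(config_text)
    input_map = {item["variable"]: item["address"] for item in config["inputs"]}
    output_map = {item["variable"]: item["address"] for item in config["outputs"]}
    return input_map, output_map

input_map, output_map = load_correspondences(TASK_CONFIG)
```

Keeping the two mappings separate from the variable-level analysis means the later steps can work purely on variable names and only translate to addresses at the end.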
Step 202, constructing an abstract syntax tree of the ciphertext computing code in the Python script of the ciphertext computing task.
As can be seen from the ciphertext computing code shown in FIG. 3, a Python script of a ciphertext computing task generally includes 3 parts: part 1 is the input data, part 2 is the calculation process, and part 3 is the output data.
In this step, an abstract syntax tree of the ciphertext computing code in the Python script is constructed; specifically, it can be constructed with the ast module of the Python language.
As shown in FIG. 4, the code in FIG. 3 is passed as a character string to the ast.parse interface in Python, which parses it into an abstract syntax tree of that code; the constructed abstract syntax tree is shown in FIG. 5.
As can be seen from FIG. 5, the constructed abstract syntax tree includes three parts: the Import node numbered 0, which corresponds to the import statement in FIG. 3; the Assign nodes numbered 1 to 5, which correspond to the statements in parts 1 and 2 of FIG. 3; and the Expr (expression) nodes numbered 6 and 7, which correspond to the output statements in part 3 of FIG. 3.
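The parsing step can be sketched with the standard `ast` module. The script below is a shortened, hypothetical stand-in for the code in FIG. 3 (the module name `privpy`, the variable names and the argument layout of the output call are assumptions); it is only parsed, never executed, so no MPC library needs to be installed.

```python
import ast

# Hypothetical ciphertext computing code, passed to the parser as a string.
SCRIPT = '''
import privpy as pp
a = pp.ss("in_a")
b = pp.ss("in_b")
total_all = a + b
pp.previous(total_all, "total_power_all")
'''

tree = ast.parse(SCRIPT)

# Top-level statements appear in source order in tree.body.
node_types = [type(node).__name__ for node in tree.body]
for index, name in enumerate(node_types):
    print(index, name)
```

The numbering printed here plays the same role as the node numbers referred to for FIG. 5: one Import node first, then the Assign nodes, then the Expr nodes.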
Step 203, traversing the nodes contained in the abstract syntax tree one by one in order; if the current node is an Assign node, executing step 204; if it is an Expr node, executing step 208; otherwise, executing step 209.
Step 204, acquiring the assigned target variable from the id attribute in the targets attribute of the Assign node.
The contents of each node of the abstract syntax tree contain various attributes for representing some sentence contents in the ciphertext calculation codes.
In this step, for an Assign node, the assigned target variable is first obtained from the id attribute in the targets attribute of the Assign node. For example, FIG. 6 shows the content of the Assign node numbered 1 in the abstract syntax tree shown in FIG. 5; as can be seen from FIG. 6, the assigned target variable "dispatch_a" can be obtained from the id attribute in the targets attribute of the Assign node.
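Reading the target variable from an Assign node might look like the sketch below. The helper name `assigned_target` is hypothetical, and the sketch only handles the simple `name = ...` form described in the text.

```python
import ast

def assigned_target(assign_node):
    """Return the name bound by a simple `name = ...` assignment, else None."""
    target = assign_node.targets[0]
    # Only plain names carry an `id`; tuple or attribute targets are skipped here.
    return target.id if isinstance(target, ast.Name) else None

node = ast.parse('a = pp.ss("in_a")').body[0]
target_name = assigned_target(node)
```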
Step 205, determining whether the value attribute of the Assign node contains a func attribute whose attr attribute is the pp.ss operation; if so, executing step 206; otherwise, executing step 207.
Step 206, when the value attribute of the Assign node contains a func attribute and the attr attribute of the func attribute is the pp.ss operation, acquiring the source relationship between the target variable and the input variable of the ciphertext calculation code.
As can be seen from FIG. 6, the value attribute of the Assign node numbered 1 includes a func attribute whose attr attribute is the pp.ss operation, which indicates that an input variable is assigned to the target variable. For example, the content in the box in FIG. 6 shows that the input variable "souiana" is assigned to the target variable "souiana", so the source relationship between the target variable and the input variable of the ciphertext calculation code can be obtained: the target variable "souiana" is derived from the input variable "souiana".
The source relationship between the variables obtained in this step may be stored in a cache.
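The check in steps 205 and 206 can be sketched as follows. The helper name `input_variable_of`, the module alias `pp` as a default, and the assumption that the input name is the first positional string argument of `pp.ss` are illustrative choices, not details confirmed by the text.

```python
import ast

def input_variable_of(assign_node, module_alias="pp", input_op="ss"):
    """Return the input name for `x = pp.ss("name")`, or None for other assignments."""
    call = assign_node.value
    if not isinstance(call, ast.Call):
        return None  # the value has no func attribute
    func = call.func
    is_input_op = (
        isinstance(func, ast.Attribute)
        and func.attr == input_op
        and isinstance(func.value, ast.Name)
        and func.value.id == module_alias
    )
    if not is_input_op or not call.args:
        return None
    try:
        # Assumption: the first positional argument is a literal naming the input.
        return ast.literal_eval(call.args[0])
    except ValueError:
        return None

# Cache of "script variable -> input variable" relations found so far.
relations = {}
for statement in ast.parse('a = pp.ss("in_a")\nc = a + a').body:
    name = input_variable_of(statement)
    if name is not None:
        relations[statement.targets[0].id] = name
```

Returning `None` for every other shape of assignment lets the caller fall through to the handling of ordinary source variables.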
Step 207, when the value attribute of the Assign node does not contain the func attribute, or the attr attribute of the contained func attribute is not the pp.ss operation, acquiring the source relationship between the target variable and the source variable in the Assign node.
When the Assign node does not contain the pp.ss operation, it can be considered that the node does not contain an input variable but assigns a source variable to the target variable, so the source relationship between the target variable and the source variable in the Assign node can be obtained.
The source relationship between the target variable and the source variable obtained in this step can then be combined with the source relationship obtained earlier, when that source variable was itself a target variable, between it and the input variables, so that the source relationship between the target variable and the input variables can finally be derived.
The source relationship between the variables obtained in this step may be stored in a cache.
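One way to collect the source variables of an ordinary assignment is to walk the right-hand side and gather every name that is read. This is a sketch of that idea, not necessarily how the described method inspects the node; the helper name `source_variables` is hypothetical.

```python
import ast

def source_variables(assign_node):
    """Collect the names read on the right-hand side of an assignment."""
    return {
        n.id
        for n in ast.walk(assign_node.value)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)
    }

node = ast.parse("total_all = a[i][1] + b[i][1]").body[0]
sources = source_variables(node)
# Note: the loop index `i` is collected as well; a fuller tool would keep only
# names that already have a cached source relation.
```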
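Step 208, for the Expr node, when the attr attribute of the func attribute of the value attribute of the Expr node is a pp.previous operation, acquiring the source relationship between the output variable and the source variable in the Expr node.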
When the attr attribute of the func attribute of the value attribute of the Expr node is a pp.previous operation, the pp.previous operation indicates that a source variable is assigned to an output variable, so that a source relation between the output variable and the source variable in the Expr node can be obtained.
If the value attribute of the Expr node does not include a func attribute, or the attr attribute of the included func attribute is not pp.
The source relationship between the variables obtained in this step may be stored in a cache.
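The Expr-node case can be sketched in the same style. The operation name `previous` is used exactly as it appears in this text, and the argument layout (source variable first, output name second) is an assumption made for illustration; both are parameters or comments in the sketch so they can be adjusted.

```python
import ast

def output_relation(expr_node, module_alias="pp", output_op="previous"):
    """Return (output variable, source variable) for an output call, else None."""
    call = expr_node.value
    if not isinstance(call, ast.Call):
        return None
    func = call.func
    if not (isinstance(func, ast.Attribute) and func.attr == output_op
            and isinstance(func.value, ast.Name) and func.value.id == module_alias):
        return None
    # Assumed argument layout: source variable first, output name second.
    if len(call.args) < 2 or not isinstance(call.args[0], ast.Name):
        return None
    try:
        output_name = ast.literal_eval(call.args[1])
    except ValueError:
        return None
    return output_name, call.args[0].id

node = ast.parse('pp.previous(total_all, "total_power_all")').body[0]
relation = output_relation(node)
```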
Step 209, determine whether all nodes in the abstract syntax tree have been traversed, if so, execute step 210, otherwise, return to step 203.
Step 210, establishing a blood relationship between each input data and each output data of the ciphertext calculation task based on the source relationship between each variable in the obtained ciphertext calculation code.
Through steps 203 to 209, the obtained source relationships between the variables in the ciphertext computing code cover both the input variables representing the input data and the output variables representing the output data, so the source relationships between the input variables and the output variables can finally be established and used as the blood relationship between the input data and the output data of the ciphertext computing task.
For example, according to the ciphertext calculation code shown in FIG. 3, the input variables are "pipeline a" and "pipeline b" and the output variables are "total_power_all" and "total_power_nonpublic". The source relationships between the variables in the code show that the output variable "total_power_all" is derived from the input variables "pipeline a" and "pipeline b", and that the output variable "total_power_nonpublic" is derived from the input variable "pipeline b", so the established blood relationship may be as shown in FIG. 7.
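Once the per-statement relations are cached, the blood relationship of each output can be obtained by following the relations back to the inputs. The sketch below assumes three hypothetical caches (`input_of`, `sources_of`, `outputs`) with illustrative names, and it ignores variables that are reassigned.

```python
def trace_to_inputs(variable, sources_of, input_of, seen=None):
    """Follow cached source relations back to the set of input variables."""
    seen = set() if seen is None else seen
    if variable in seen:
        return set()          # already explored on another path
    seen.add(variable)
    if variable in input_of:
        return {input_of[variable]}
    found = set()
    for source in sources_of.get(variable, ()):
        found |= trace_to_inputs(source, sources_of, input_of, seen)
    return found

# Hypothetical caches filled while traversing the Assign and Expr nodes.
input_of = {"a": "in_a", "b": "in_b"}                        # script variable -> input variable
sources_of = {"total_all": {"a", "b"}, "total_b": {"b"}}     # target -> source variables
outputs = {"total_power_all": "total_all",
           "total_power_nonpublic": "total_b"}               # output variable -> source variable

lineage = {out: trace_to_inputs(src, sources_of, input_of)
           for out, src in outputs.items()}
```

With these caches, one output traces back to both inputs and the other to a single input, which mirrors the shape of the example described for FIG. 7.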
In this embodiment, if the step 201 is executed, in this step, a blood relationship between each input data address and each output data address of the ciphertext calculation task may be specifically established based on a source relationship between each variable in the ciphertext calculation code, and the input corresponding relationship and the output corresponding relationship that are obtained from the task configuration file.
For example, according to the ciphertext computing code shown in FIG. 3, the established blood relationship may be as shown in FIG. 8. The blood relationship shown in FIG. 8 indicates the addresses of the input data and the output data, which facilitates subsequent use of the blood relationship.
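Translating a variable-level blood relationship into an address-level one is a lookup through the two correspondences. In this sketch the addresses and the function name `address_edges` are hypothetical, and the result is kept as a list of (input address, output address) edges so that outputs sharing a host do not overwrite each other.

```python
def address_edges(lineage, input_map, output_map):
    """Turn variable-level lineage into sorted (input address, output address) edges."""
    edges = set()
    for output_variable, input_variables in lineage.items():
        for input_variable in input_variables:
            edges.add((input_map[input_variable], output_map[output_variable]))
    return sorted(edges)

# Hypothetical correspondences and lineage, for illustration only.
input_map = {"in_a": "cipher://ds02/power/in_a", "in_b": "cipher://ds02/power/in_b"}
output_map = {"total_power_all": "cipher://ds03/out/all",
              "total_power_nonpublic": "cipher://ds03/out/nonpublic"}
lineage = {"total_power_all": {"in_a", "in_b"}, "total_power_nonpublic": {"in_b"}}

edges = address_edges(lineage, input_map, output_map)
```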
In steps 206, 207 and 208 above, the source relationship obtained from an Assign node or an Expr node may simply be cached as obtained. Alternatively, each time a source relationship is obtained from an Assign node or an Expr node, the source relationship between the target variable of the Assign node and the input variables, or between the output variable of the Expr node and the input variables, may be established immediately on the basis of the cached source relationships and then cached.
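The second caching strategy, resolving each relation against the cache as soon as it is found, can be sketched as below. The function name `record_assignment` and its parameters are hypothetical; the cache maps each variable directly to the set of input variables behind it.

```python
def record_assignment(cache, target, sources=(), input_name=None):
    """Update the cache so that cache[target] is the set of inputs behind target."""
    if input_name is not None:
        cache[target] = {input_name}
        return
    resolved = set()
    for source in sources:
        # Sources were processed earlier, so their inputs are already cached.
        resolved |= cache.get(source, set())
    cache[target] = resolved

cache = {}
record_assignment(cache, "a", input_name="in_a")
record_assignment(cache, "b", input_name="in_b")
record_assignment(cache, "total_all", sources=["a", "b"])
record_assignment(cache, "total_b", sources=["b"])
```

Because statements are handled in source order, no recursive lookup is needed at the end; the trade-off is that the intermediate variable-to-variable relations are not retained.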
With the data blood relationship establishing method shown in FIG. 2 provided by the embodiment of the application, a data blood relationship is established for ciphertext data, and the established blood relationship can indicate the addresses of the input data and the output data, which facilitates its subsequent use.
In the embodiment of the present application, based on the method shown in fig. 2, a more refined blood relationship may be further established, which is specifically described as follows:
in the method shown in fig. 2, for an Assign node, when a value attribute of the Assign node does not include a func attribute, or an attr attribute of the included func attribute is not a pp.ss operation, if the Assign node includes a slice attribute, a column number of a source variable in the Assign node is obtained from the slice attribute;
acquiring a source relation between a target variable in the Assign node and a column number of a source variable in the Assign node, namely the acquired source relation is accurate to a certain column of data;
correspondingly, based on the source relation among variables in the ciphertext computing code, a blood relationship accurate to the column between each input data and each output data of the ciphertext computing task is established.
In each Assign node, for the source variable related to the input variable, the column number of the source variable also represents a certain column of the input variable, so that the source relationship between the output variable and the certain column of the input variable can be finally obtained, and the blood relationship accurate to the column between each input data and each output data of the ciphertext calculation task can be established.
For example, for the ciphertext calculation code shown in fig. 3, consider the Assign node corresponding to the statement "total_power_all = pipeline_a[i][1] + pipeline_b[i][1]" in part 2, the relevant portion of which is shown in fig. 9. The variable "pipeline_a" is first assigned, as a target variable, from the input variable "pipeline a", and then participates in the calculation as a source variable. When it participates in the calculation as a source variable, its 1st column is used, and the 1st column of the variable "pipeline_a" corresponds to the 1st column of the input variable "pipeline a".
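Reading the column number from the slice attribute can be illustrated with the standard `ast` module. The sketch below assumes Python 3.9 or later, where a literal index appears directly as an `ast.Constant` in the `slice` attribute, and assumes source variables are accessed as `name[row][column]`. The function name `column_sources` is hypothetical.

```python
import ast


def column_sources(code):
    """Map each assigned target variable to (source variable, column number) pairs."""
    result = {}
    for node in ast.walk(ast.parse(code)):
        if not isinstance(node, ast.Assign):
            continue
        target = node.targets[0]
        if not isinstance(target, ast.Name):
            continue
        for sub in ast.walk(node.value):
            # Match the outer subscript of name[row][column] with a literal column.
            if (
                isinstance(sub, ast.Subscript)
                and isinstance(sub.value, ast.Subscript)
                and isinstance(sub.value.value, ast.Name)
                and isinstance(sub.slice, ast.Constant)
            ):
                pair = (sub.value.value.id, sub.slice.value)
                result.setdefault(target.id, set()).add(pair)
    return result


columns = column_sources("total_power_all = pipeline_a[i][1] + pipeline_b[i][1]")
```

The column number is reported exactly as written in the code; whether it is counted from 0 or 1 depends on the ciphertext computing platform and is not decided here.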
Further, a field name corresponding to the column number of the input data represented by the source variable with the slice attribute can be obtained from the sample example;
correspondingly, when establishing the blood-related relationship, the blood-related relationship accurate to the field name between each input data and each output data of the ciphertext calculation task can be established based on the source relationship between each variable in the ciphertext calculation code.
For example, for the ciphertext calculation code shown in fig. 3, the field name of the 1st column of each of the input variables "shuidianA a" and "shuidianA b" is "Power Generation"; the output variable "total_power_all" is derived from the 1st column of the input variable "shuidianA a" and the 1st column of the input variable "shuidianB", and the output variable "total_power_nonpublic" is derived from the 1st column of the input variable "shuidianA b". A blood relationship accurate to the field name can therefore be established, as shown in fig. 10.
Based on the same inventive concept, and corresponding to the data blood relationship establishing method provided in the foregoing embodiment of the present application, another embodiment of the present application further provides a data blood relationship establishing device, a schematic structural diagram of which is shown in fig. 11, and which specifically includes:
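Turning a column-level blood relationship into a field-level one amounts to a lookup in the sample. In the sketch below, the `field_names` dictionary is a hypothetical stand-in for the sample example, whose real format is not given in this section; the input names reuse the form of the fig. 7 example, and `to_field_lineage` is an illustrative name.

```python
def to_field_lineage(column_lineage, field_names):
    """Replace column numbers by the field names found in the sample."""
    return {
        out: sorted((var, field_names[var][col]) for var, col in cols)
        for out, cols in column_lineage.items()
    }


# Column-level blood relationship: output -> {(input variable, column number)}.
column_lineage = {
    "total_power_all": {("pipeline a", 1), ("pipeline b", 1)},
    "total_power_nonpublic": {("pipeline b", 1)},
}
# Hypothetical sample header: input variable -> {column number: field name}.
field_names = {
    "pipeline a": {1: "Power Generation"},
    "pipeline b": {1: "Power Generation"},
}
field_lineage = to_field_lineage(column_lineage, field_names)
```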
the syntax tree construction module 111 is configured to construct an abstract syntax tree of ciphertext computation codes in a Python script of the ciphertext computation task;
a node traversing module 112, configured to traverse the Assign node and the Expr node included in the abstract syntax tree;
a source relation obtaining module 113, configured to obtain, from contents in the Assign node and the Expr node, a source relation between variables in the ciphertext computation code;
a blood relationship establishing module 114, configured to establish a blood relationship between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code.
Further, the source relationship obtaining module 113 is specifically configured to, for each traversed Assign node, obtain an assigned target variable from an id attribute in a targets attribute of the Assign node;
when the value attribute of the Assign node comprises a func attribute and the attr attribute of the func attribute is a pp.ss operation, acquiring a source relation between the target variable and an input variable of the ciphertext calculation code;
when the value attribute of the Assign node does not contain the func attribute, or the attr attribute of the contained func attribute is not a pp.ss operation, acquiring the source relationship between the target variable and the source variable in the Assign node;
and for each traversed Expr node, when the attr attribute of the func attribute of the value attribute of the Expr node is pp.
Further, the source relationship obtaining module 113 is specifically configured to, when the value attribute of the Assign node does not include a func attribute, or an attr attribute of the included func attribute is not a pp.ss operation, if the Assign node includes a slice attribute, obtain a column number of a source variable in the Assign node from the slice attribute;
acquiring a source relation between the target variable and the column number of the source variable in the Assign node;
the blood relationship establishing module 114 is specifically configured to establish a blood relationship accurate to a column between each input data and each output data of the ciphertext calculation task based on the source relationship between each variable in the ciphertext calculation code.
Further, the source relationship obtaining module 113 is further configured to obtain, from the sample example, a field name corresponding to the column number of the input data represented by the source variable with the slice attribute;
the blood relationship establishing module 114 is specifically configured to establish a blood relationship that is accurate to a field name between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code.
Further, the source relationship obtaining module 113 is further configured to obtain an input corresponding relationship between an input data address and an input variable in a task configuration file of the ciphertext calculation task, and an output corresponding relationship between an output data address and an output variable;
the blood relationship establishing module 114 is specifically configured to establish a blood relationship between each input data address and each output data address of the ciphertext calculation task based on the source relationship between each variable in the ciphertext calculation code, and the input corresponding relationship and the output corresponding relationship.
The functions of the above modules may correspond to the corresponding processing steps in the flows shown in fig. 1 and fig. 2, and are not described herein again.
The data blood relationship establishing device provided by the embodiment of the application can be realized by a computer program. It should be understood by those skilled in the art that the above division into modules is only one of many possible divisions; a device that is divided into other modules, or not divided into modules at all, is within the scope of the present application as long as the data blood relationship establishing device has the above functions.
The embodiment of the present application further provides an electronic device, as shown in fig. 12, including a processor 121 and a machine-readable storage medium 122, where the machine-readable storage medium 122 stores machine-executable instructions that can be executed by the processor 121, and the processor 121 is caused by the machine-executable instructions to implement any of the above data blood relationship establishing methods.
An embodiment of the present application further provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, any of the above data blood relationship establishing methods is implemented.
Embodiments of the present application further provide a computer program product containing instructions that, when executed on a computer, cause the computer to perform any of the above data blood relationship establishing methods.
The machine-readable storage medium in the electronic device may include a Random Access Memory (RAM) and a Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on differences from other embodiments. In particular, for the apparatus, the electronic device, the computer-readable storage medium, and the computer program product embodiment, since they are substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (8)

1. A method for establishing data blood relationship is characterized by comprising the following steps:
constructing an abstract syntax tree of a ciphertext computing code in a Python script of the ciphertext computing task;
traversing an Assign node and an Expr node contained in the abstract syntax tree;
acquiring the source relation among variables in the ciphertext calculation code from the content in the Assign node and the Expr node;
establishing a blood relation between each input data and each output data of the ciphertext computing task based on the source relation among the variables in the ciphertext computing code;
the obtaining of the source relationship among variables in the ciphertext calculation code from the content in the Assign node and the Expr node includes:
aiming at each traversed Assign node, obtaining an assigned target variable from an id attribute in a targets attribute of the Assign node;
when the value attribute of the Assign node comprises a func attribute and the attr attribute of the func attribute is a pp.ss operation, acquiring a source relation between the target variable and an input variable of the ciphertext calculation code;
when the value attribute of the Assign node does not contain the func attribute, or the attr attribute of the contained func attribute is not a pp.ss operation, acquiring the source relationship between the target variable and the source variable in the Assign node, wherein the pp.ss operation indicates that the input variable is assigned to one target variable;
and for each traversed Expr node, when the attr attribute of the func attribute of the value attribute of the Expr node is a pp.
2. The method of claim 1, wherein the obtaining the source relationship between the target variable and the source variable in the Assign node when the value attribute of the Assign node does not include a func attribute or the attr attribute of the included func attribute is not a pp.ss operation comprises:
when the value attribute of the Assign node does not contain a func attribute, or the attr attribute of the contained func attribute is not pp.ss operation, if the Assign node contains a slice attribute, the column number of a source variable in the Assign node is obtained from the slice attribute;
acquiring a source relation between the target variable and the column number of the source variable in the Assign node;
establishing a blood relationship between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code, including:
and establishing a blood relationship accurate to the column between each input data and each output data of the ciphertext calculation task based on the source relationship among the variables in the ciphertext calculation code.
3. The method of claim 2, further comprising, before the establishing of the blood relationship accurate to the column between each input data and each output data of the ciphertext computation task based on the source relationship between variables in the ciphertext computation code:
acquiring a field name corresponding to the column number of the input data represented by the source variable with the slice attribute from a sample example;
the establishing of the blood relationship accurate to the column between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code includes:
and establishing a blood relationship accurate to the field name between each input data and each output data of the ciphertext calculation task based on the source relationship among the variables in the ciphertext calculation code.
4. The method of claim 1, further comprising, prior to establishing a blood relationship between each input data and each output data of the ciphertext computation task based on the source relationship between variables in the ciphertext computation code:
acquiring an input corresponding relation between an input data address and an input variable and an output corresponding relation between an output data address and an output variable in a task configuration file of the ciphertext computing task;
establishing a blood relationship between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code, including:
and establishing a blood relation between each input data address and each output data address of the ciphertext computing task based on the source relation among the variables in the ciphertext computing code and the input corresponding relation and the output corresponding relation.
5. A data blood relationship establishing apparatus, comprising:
the syntax tree construction module is used for constructing an abstract syntax tree of ciphertext computing codes in a Python script of the ciphertext computing task;
the node traversing module is used for traversing the Assign node and the Expr node contained in the abstract syntax tree;
a source relation obtaining module, configured to obtain, from contents in the Assign node and the Expr node, a source relation between variables in the ciphertext computation code;
a blood relationship establishing module, configured to establish a blood relationship between each input data and each output data of the ciphertext computation task based on the source relationship between each variable in the ciphertext computation code;
the source relation obtaining module is specifically configured to, for each traversed Assign node, obtain an assigned target variable from an id attribute in a targets attribute of the Assign node;
when the value attribute of the Assign node comprises a func attribute and the attr attribute of the func attribute is a pp.ss operation, acquiring a source relation between the target variable and an input variable of the ciphertext calculation code;
when the value attribute of the Assign node does not contain a func attribute, or the attr attribute of the contained func attribute is not a pp.ss operation, acquiring a source relation between the target variable and a source variable in the Assign node, wherein the pp.ss operation indicates that an input variable is assigned to one target variable;
and for each traversed Expr node, when the attr attribute of the func attribute of the value attribute of the Expr node is a pp.
6. The apparatus of claim 5, wherein the source relationship obtaining module is specifically configured to, if the Assign node includes a slice attribute, obtain a column number of a source variable in the Assign node from the slice attribute when the value attribute of the Assign node does not include the func attribute or an attr attribute of the included func attribute is not a pp.ss operation;
acquiring a source relation between the target variable and the column number of the source variable in the Assign node;
the blood relationship establishing module is specifically configured to establish a blood relationship accurate to a column between each input data and each output data of the ciphertext calculation task based on the source relationship between each variable in the ciphertext calculation code.
7. An electronic device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to carry out the method of any one of claims 1 to 4.
8. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method of any one of claims 1 to 4.
CN202211178969.9A 2022-09-27 2022-09-27 Data blood relationship establishing method and device and electronic equipment Active CN115291889B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211178969.9A CN115291889B (en) 2022-09-27 2022-09-27 Data blood relationship establishing method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN115291889A CN115291889A (en) 2022-11-04
CN115291889B CN115291889B (en) 2023-01-13

Family

ID=83833503

Country Status (1)

Country Link
CN (1) CN115291889B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113032362A (en) * 2021-03-18 2021-06-25 广州虎牙科技有限公司 Data blood margin analysis method and device, electronic equipment and storage medium
CN113672628A (en) * 2021-10-22 2021-11-19 中航金网(北京)电子商务有限公司 Data blood margin analysis method, terminal device and medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9111071B2 (en) * 2012-11-05 2015-08-18 Sap Se Expression rewriting for secure computation optimization
CN111538743B (en) * 2020-04-22 2023-08-18 电子科技大学 SQL-based data blood relationship analysis method and system
CN113742368A (en) * 2021-09-16 2021-12-03 北京航空航天大学 Data blood relationship analysis method
CN114357480A (en) * 2021-12-27 2022-04-15 徐工汉云技术股份有限公司 Data security query method, device and equipment based on SQL (structured query language) blood relationship
CN114398394A (en) * 2022-01-14 2022-04-26 建信金融科技有限责任公司 Data blood margin analysis method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant