CN110245270A - Data genetic connection storage method, system, medium and equipment based on graph model - Google Patents

Data genetic connection storage method, system, medium and equipment based on graph model

Info

Publication number
CN110245270A
Authority
CN
China
Prior art keywords
data
graph model
name
relationship
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910385135.7A
Other languages
Chinese (zh)
Inventor
陈政
潘强
蔡灿
张翼飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Tianpeng Network Co Ltd
Original Assignee
Chongqing Tianpeng Network Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Tianpeng Network Co Ltd
Priority to CN201910385135.7A
Publication of CN110245270A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2291 User-Defined Types; Storage management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30 Data warehousing; Computing architectures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioethics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a data blood relationship storage method based on a graph model, comprising: parsing the SQL statements in a data processing script; creating an initial graph model; associating the parsing result with the initial graph model; and repeating the above operations to traverse the SQL statements in all data processing scripts and generate a blood relationship graph model. First, the scheme of the embodiment of the present invention uses the graph model to store data directly in a graph database as the nodes, relationships, and attributes of a graph, so no complex relational table structure needs to be designed in advance, which significantly reduces the design difficulty and complexity of such scenarios. Second, owing to the in-memory computing mechanism and optimized data structure of the graph database Neo4j, statistics on the number of upstream and downstream levels of the data blood relationship and on the number of dependent tables can be completed within a few milliseconds even on large volumes of data, and data fields and table dependencies can be retrieved quickly.

Description

Data blood relationship storage method, system, medium and equipment based on graph model
Technical Field
The invention relates to the field of software technology, and in particular to a data blood relationship (data lineage) storage method, system, medium, and electronic device based on a graph model.
Background
In the prior art, a data warehouse generates a large number of data tables and a large amount of data in order to support different services. When data quality problems are being resolved, redundant data is being cleaned up, or data flow links are being investigated, it is difficult to quickly sort out the blood relationship dependencies among such a large amount of data. Data blood relationships are usually recorded manually or stored in a relational database such as MySQL; however, this approach is complex and error-prone, cannot support complex data blood relationship analysis, and struggles to meet performance requirements on large-scale data.
Therefore, through long-term research and development, the inventors have studied the storage of data blood relationships extensively and propose a data blood relationship storage method based on a graph model to solve at least one of the above technical problems.
Disclosure of Invention
The present invention aims to provide a data blood relationship storage method, system, medium, and electronic device based on a graph model, which can solve at least one of the above-mentioned technical problems. The specific scheme is as follows:
According to a first aspect of the present invention, there is provided a data blood relationship storage method based on a graph model, including:
parsing the SQL statements in a data processing script;
creating an initial graph model;
associating the parsing result with the initial graph model;
and repeating the above operations, traversing the SQL statements in all data processing scripts, and generating a blood relationship graph model.
Wherein, after the SQL statements in the data processing script are parsed, the method includes:
acquiring the data source table name and field names, the data target table name and field names, and the relationships between the data source table fields and the data target table fields.
Wherein creating the initial graph model specifically includes:
creating an initial graph model in the graph database Neo4j.
Wherein associating the parsing result with the initial graph model includes:
taking the data source table field names and the data target table field names respectively as nodes of the initial graph model, and writing the nodes into the graph database Neo4j.
Wherein associating the parsing result with the initial graph model further includes:
taking the data source table name and the data target table name respectively as attributes of the initial graph model nodes, and writing the attributes into the graph database Neo4j.
Wherein associating the parsing result with the initial graph model further includes:
taking the relationships between the data source table field names and the data target table field names as edges of the initial graph model, and writing the edges into the graph database Neo4j.
The method further includes visually displaying the blood relationship graph model.
The blood relationship graph model is a mesh graph that takes one node as its center, with the other nodes associated with the center node; nodes are distinguished by color according to the depth of the blood relationship.
Wherein the node information in the blood relationship graph model includes: table name, number of upstream layers, number of downstream layers, number of upstream tables, number of downstream tables, number of direct upstream tables, number of direct downstream tables, and direct downstream table field list.
Wherein each entry in the direct downstream table field list includes the relationship between a data source table field name and a data target table field name.
According to a second aspect of the present invention, there is provided a data blood relationship storage system based on a graph model, including:
a parsing module, configured to parse the SQL statements in a data processing script;
a creation module, configured to create an initial graph model in the graph database Neo4j;
a node writing module, configured to write the data source table field names and the data target table field names into the graph database Neo4j as nodes of the initial graph model, respectively;
an attribute writing module, configured to write the data source table name and the data target table name into the graph database Neo4j as attributes of the initial graph model nodes;
a relationship writing module, configured to write the relationships between the data source table field names and the data target table field names into the graph database Neo4j as edges of the initial graph model;
and a traversal module, configured to traverse and parse the SQL statements in all data processing scripts to generate a blood relationship graph model.
The parsing module is further configured to obtain the data source table name and field names, the data target table name and field names, and the relationships between the data source table fields and the data target table fields.
Wherein the node information in the blood relationship graph model comprises: table name, number of upstream layers, number of downstream layers, number of upstream tables, number of downstream tables, number of direct upstream tables, number of direct downstream tables, and direct downstream table field list.
According to a third aspect of the present invention, there is provided a computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the graph-model-based data blood relationship storage method described in any one of the above.
According to a fourth aspect of the present invention, there is provided an electronic device including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the graph-model-based data blood relationship storage method described in any one of the above.
According to the scheme of the embodiment of the invention, first, the graph model is used to store data directly in the graph database as the nodes, relationships, and attributes of a graph, so no complex relational table structure needs to be designed in advance, which greatly reduces the design difficulty and complexity of such scenarios; second, owing to the in-memory computing mechanism and optimized data structure of the graph database Neo4j, statistics on the number of upstream and downstream levels of the data blood relationship and on the number of dependent tables can be completed within a few milliseconds even on large volumes of data, and data fields and table dependencies can be retrieved quickly.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:
FIG. 1 is a flow chart of a data blood relationship storage method based on a graph model according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data blood relationship storage method based on a graph model according to another embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a data blood relationship storage system based on a graph model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of the present invention and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and "a plurality" typically includes at least two.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the preceding and following associated objects are in an "or" relationship.
It should be understood that although the terms first, second, third, etc. may be used to describe … … in embodiments of the present invention, these … … should not be limited to these terms. These terms are used only to distinguish … …. For example, the first … … can also be referred to as the second … … and similarly the second … … can also be referred to as the first … … without departing from the scope of embodiments of the present invention.
The words "if", as used herein, may be interpreted as "at … …" or "at … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
It is also noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that an article or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such article or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in the article or device in which the element is included.
In the big data era, data is extremely valuable. The explosive growth of the mobile internet has enabled internet companies to accumulate petabyte-scale user data and business data. Driven by this strong demand, big data technology has steadily matured, and massive, continuously growing data is recorded through storage components such as HDFS, HBase, MongoDB, and Kafka.
From generation, through processing, fusion, and circulation, to its eventual retirement, data naturally forms relationships with other data. By analogy with a similar relationship in human society, this relationship between data is called the blood relationship of data.
Example 1
Referring to FIG. 1, an embodiment of the present invention provides a data blood relationship storage method based on a graph model, including the following steps:
parsing the SQL statements in a data processing script;
creating an initial graph model;
associating the parsing result with the initial graph model;
and repeating the above operations, traversing the SQL statements in all data processing scripts, and generating a blood relationship graph model.
Wherein, after the SQL statements in the data processing script are parsed, the method includes:
acquiring the data source table name and field names, the data target table name and field names, and the relationships between the data source table fields and the data target table fields.
Wherein creating the initial graph model specifically includes:
creating an initial graph model in the graph database Neo4j.
Wherein associating the parsing result with the initial graph model includes:
taking the data source table field names and the data target table field names respectively as nodes of the initial graph model, and writing the nodes into the graph database Neo4j.
Wherein associating the parsing result with the initial graph model further includes:
taking the data source table name and the data target table name respectively as attributes of the initial graph model nodes, and writing the attributes into the graph database Neo4j.
Wherein associating the parsing result with the initial graph model further includes:
taking the relationships between the data source table field names and the data target table field names as edges of the initial graph model, and writing the edges into the graph database Neo4j.
The method further includes visually displaying the blood relationship graph model.
The blood relationship graph model is a mesh graph that takes one node as its center, with the other nodes associated with the center node; nodes are distinguished by color according to the depth of the blood relationship.
Wherein the node information in the blood relationship graph model includes: table name, number of upstream layers, number of downstream layers, number of upstream tables, number of downstream tables, number of direct upstream tables, number of direct downstream tables, and direct downstream table field list.
Wherein each entry in the direct downstream table field list includes the relationship between a data source table field name and a data target table field name.
Example 2
Referring to FIG. 2, an embodiment of the present invention provides a data blood relationship storage method based on a graph model, including the following steps:
S1, parsing the SQL statements in the data processing script to obtain the data source table name and field names, the data target table name and field names, and the relationships between the data source table fields and the data target table fields.
Specifically, the data processing script is parsed in a given manner to obtain the blood relationships between the data tables and data fields in the data warehouse, which serve as the data basis for constructing the graph-model-based data blood relationship graph model. Since the embodiment of the present invention focuses on the storage of blood relationships, the script parsing process is not described here. In this embodiment, the SQL statements in the data processing script are parsed by a parser to obtain the data source table name (S_TABLE), the data source table field name (S_COLUMN), the target table name (T_TABLE), the target table field name (T_COLUMN), and the relationship between the data source table fields and the data target table fields.
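Although the parsing process itself is outside the scope of this embodiment, the shape of its output can be illustrated with a minimal sketch. The code below is not the parser used by the invention: it assumes a single, simple statement form (INSERT INTO ... (columns) SELECT columns FROM one table), and the names FieldMapping and parse_statement are hypothetical. A real data processing script would require a full SQL parser.

```python
import re
from typing import List, NamedTuple


class FieldMapping(NamedTuple):
    """One field-level blood relationship extracted from a SQL statement."""
    s_table: str   # data source table name (S_TABLE)
    s_column: str  # data source table field name (S_COLUMN)
    t_table: str   # data target table name (T_TABLE)
    t_column: str  # data target table field name (T_COLUMN)


# Assumption: only "INSERT INTO <target> (<cols>) SELECT <cols> FROM <source>"
# is recognized; joins, subqueries, and expressions are out of scope here.
_INSERT_SELECT = re.compile(
    r"insert\s+into\s+(\w+)\s*\(([^)]*)\)\s*select\s+(.*?)\s+from\s+(\w+)",
    re.IGNORECASE | re.DOTALL,
)


def parse_statement(sql: str) -> List[FieldMapping]:
    """Return the source-to-target field mappings found in one statement."""
    match = _INSERT_SELECT.search(sql)
    if match is None:
        return []  # statement form not covered by this sketch
    t_table, t_cols, s_cols, s_table = match.groups()
    targets = [col.strip() for col in t_cols.split(",")]
    sources = [col.strip() for col in s_cols.split(",")]
    if len(targets) != len(sources):
        raise ValueError("source and target column counts differ")
    # Pair columns positionally: the i-th selected field feeds the i-th target field.
    return [FieldMapping(s_table, s, t_table, t) for s, t in zip(sources, targets)]
```

Each returned tuple carries exactly the four values named in step S1, with the tuple itself standing for the relationship between one source field and one target field.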
S2, creating an initial graph model in the graph database Neo4j.
Specifically, after the blood relationships between the data have been parsed, an initial graph model is created according to the data model of the Neo4j graph database, and the data is then stored in the initial graph model.
S3, taking the data source table field names and the data target table field names respectively as nodes of the initial graph model, and writing them into the graph database Neo4j.
Specifically, the data source table field name is set as the node Node_A of the initial graph model and written into the graph database Neo4j; the data target table field name is set as the node Node_B of the initial graph model and written into the graph database Neo4j.
S4, taking the data source table name and the data target table name as attributes of the initial graph model nodes, and writing them into the graph database Neo4j.
Specifically, the data source table name is set as an attribute of the initial graph model node Node_A and written into the graph database Neo4j; at the same time, the data target table name is set as an attribute of the initial graph model node Node_B and written into the graph database Neo4j.
S5, taking the relationships between the data source table field names and the data target table field names as edges of the initial graph model, and writing the edges into the graph database Neo4j.
Specifically, the relationship between the nodes Node_A and Node_B of the initial graph model is set and written into the graph database Neo4j. In this embodiment, an application programming interface is used to specify the connection address and account name of a graph database object (Graph) and to establish a connection with the graph database Neo4j. The field names of the data source table and the target table are then taken as vertices of the initial graph model: node objects are created with the create method, and the names of the data source table and the target table are used as the name attributes of the corresponding node objects. Finally, the create method is used to create a relationship object in the graph database Neo4j, with three parameters: the first is the data source field, the second is the description of the relationship direction, 'to', and the third is the data target field.
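Steps S3 to S5 can be sketched as a single parameterized Cypher statement. This is an illustration under stated assumptions rather than the embodiment's actual API calls: the node label Field, the property keys field and table, and the helper name build_write are all hypothetical, and MERGE is used in place of a plain create so that repeated writes do not duplicate nodes. Only the relationship type 'to' comes from the description above. Executing the statement requires a Neo4j client connection, which is left out here.

```python
from typing import Dict, Tuple

# Assumed schema: label "Field", property keys "field" and "table".
# The relationship type "to" follows the direction description in step S5.
WRITE_LINEAGE = (
    "MERGE (a:Field {field: $s_column, table: $s_table})\n"  # S3 + S4 for Node_A
    "MERGE (b:Field {field: $t_column, table: $t_table})\n"  # S3 + S4 for Node_B
    "MERGE (a)-[:to]->(b)"                                   # S5: edge from A to B
)


def build_write(
    s_table: str, s_column: str, t_table: str, t_column: str
) -> Tuple[str, Dict[str, str]]:
    """Return the Cypher text and its parameters for one field mapping."""
    params = {
        "s_table": s_table,
        "s_column": s_column,
        "t_table": t_table,
        "t_column": t_column,
    }
    return WRITE_LINEAGE, params
```

Passing values as parameters rather than formatting them into the query text keeps table and field names from being interpreted as Cypher.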
S6, repeating the above steps, traversing and parsing the SQL statements in all data processing scripts, and generating a blood relationship graph model.
Specifically, steps S1 to S5 are repeated to traverse and parse the SQL statements in all data processing scripts; the fields and table names of all data source tables and data target tables are obtained, used respectively as the nodes (Nodes), relationships (Relationships), and attributes (Properties) of the initial graph model, and written into the graph database Neo4j to form one large, complete graph, namely the blood relationship graph model.
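The accumulation performed by step S6 can be illustrated with an in-memory stand-in for the graph database. This is a sketch only: the class LineageGraph and the function build_graph are hypothetical names, the input is assumed to be already-parsed (S_TABLE, S_COLUMN, T_TABLE, T_COLUMN) tuples grouped by script, and a dictionary and a set replace Neo4j so that the example runs without a server.

```python
from typing import Dict, Iterable, Set, Tuple

Mapping = Tuple[str, str, str, str]  # (S_TABLE, S_COLUMN, T_TABLE, T_COLUMN)
NodeKey = Tuple[str, str]            # (table name, field name) identifies a node


class LineageGraph:
    """In-memory stand-in for the graph database (assumption, not Neo4j)."""

    def __init__(self) -> None:
        self.nodes: Dict[NodeKey, Dict[str, str]] = {}
        self.edges: Set[Tuple[NodeKey, str, NodeKey]] = set()

    def add_mapping(self, mapping: Mapping) -> None:
        s_table, s_column, t_table, t_column = mapping
        src, dst = (s_table, s_column), (t_table, t_column)
        # Nodes are fields; the owning table name is stored as the name attribute.
        self.nodes.setdefault(src, {"name": s_table})
        self.nodes.setdefault(dst, {"name": t_table})
        # Edges are directed from the source field to the target field.
        self.edges.add((src, "to", dst))


def build_graph(scripts: Iterable[Iterable[Mapping]]) -> LineageGraph:
    """Repeat the per-statement write for every mapping of every script."""
    graph = LineageGraph()
    for mappings in scripts:
        for mapping in mappings:
            graph.add_mapping(mapping)
    return graph
```

Because nodes and edges are keyed by their identity, a field that appears in several scripts is stored once, which is what allows the separate per-script results to join into one connected graph.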
The blood relationship graph model is a mesh graph that takes one node as its center, with the other nodes associated with the center node; nodes are distinguished according to the depth of the blood relationship. The node information in the blood relationship graph model includes: table name, number of upstream layers, number of downstream layers, number of upstream tables, number of downstream tables, number of direct upstream tables, number of direct downstream tables, and direct downstream table field list. Each entry in the direct downstream table field list includes the relationship between a data source table field name and a data target table field name.
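The node statistics listed above can be derived with a breadth-first search over table-level edges. The following is a sketch under assumptions, not the embodiment's query logic: the function names are hypothetical, the input is a list of (source table, target table) pairs, and the "number of layers" is taken to be the greatest shortest-path depth reachable from the table, a definition the text does not spell out.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Set, Tuple


def _levels(start: str, adjacency: Dict[str, Set[str]]) -> Dict[str, int]:
    """Breadth-first search: shortest depth of every table reachable from start."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor not in depth:
                depth[neighbor] = depth[current] + 1
                queue.append(neighbor)
    del depth[start]  # the table itself is not its own upstream or downstream
    return depth


def table_stats(table: str, edges: Iterable[Tuple[str, str]]) -> Dict[str, object]:
    """Compute the per-table counts for one table from (source, target) pairs."""
    downstream: Dict[str, Set[str]] = defaultdict(set)
    upstream: Dict[str, Set[str]] = defaultdict(set)
    for src, dst in edges:
        if src != dst:  # ignore self-references
            downstream[src].add(dst)
            upstream[dst].add(src)
    down = _levels(table, downstream)
    up = _levels(table, upstream)
    return {
        "table_name": table,
        "upstream_layers": max(up.values(), default=0),
        "downstream_layers": max(down.values(), default=0),
        "upstream_tables": len(up),
        "downstream_tables": len(down),
        "direct_upstream_tables": len(upstream[table]),
        "direct_downstream_tables": len(downstream[table]),
    }
```

In a graph database the same counts would typically come from variable-length path queries; the traversal here only makes the definitions concrete.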
Further, the data blood relationship storage method based on the graph model includes the step of visually displaying the blood relationship graph model. Specifically, data is retrieved using the Cypher query syntax of Neo4j and combined with the front-end framework vue to complete the visual display.
In the data blood relationship storage method based on the graph model provided by the embodiment of the invention, the initial graph model is used to store data directly in the graph database as the nodes, relationships, and attributes of a graph, so no complex relational table structure needs to be designed in advance, which greatly reduces the design difficulty and complexity of such scenarios. Owing to the in-memory computing mechanism and optimized data structure of the graph database Neo4j, statistics on the number of upstream and downstream levels of the data blood relationship and on the number of dependent tables can be completed within a few milliseconds even on large volumes of data, and data fields and table dependencies can be retrieved quickly. In addition, with a well-designed visual interface, a user can quickly view and search data tables and, by pointing and clicking with a mouse and without writing code, directly see key information such as the flow direction of blood relationships and the blood relationship dependency hierarchy among data.
Example 3
Referring to FIG. 3, an embodiment of the invention provides a data blood relationship storage system 200 based on a graph model, where the system 200 includes: a parsing module 210, a creation module 220, a node writing module 230, an attribute writing module 240, a relationship writing module 250, and a traversal module 260.
The parsing module 210 is configured to parse the SQL statements in the data processing script to obtain the data source table name and field names, the data target table name and field names, and the relationships between the data source table fields and the data target table fields. Specifically, the parsing module 210 parses the data processing script in a given manner to obtain the blood relationships between the data tables and data fields in the data warehouse, which serve as the data basis for constructing the graph-model-based data blood relationship graph model. Since the embodiment of the present invention focuses on the storage of blood relationships, the script parsing process is not described here. In this embodiment, the parsing module 210 parses the SQL statements in the data processing script with a parser to obtain the data source table name (S_TABLE), the data source table field name (S_COLUMN), the target table name (T_TABLE), the target table field name (T_COLUMN), and the relationship between the data source table fields and the data target table fields.
The creation module 220 is configured to create an initial graph model in the graph database Neo4j. Specifically, after the blood relationships between the data have been parsed, the creation module 220 creates an initial graph model according to the data model of the Neo4j graph database, and the data is then stored in the initial graph model.
The node writing module 230 is configured to write the data source table field names and the data target table field names into the graph database Neo4j as nodes of the initial graph model, respectively. Specifically, the node writing module 230 sets the data source table field name as the node Node_A of the initial graph model and writes it into the graph database Neo4j; it sets the data target table field name as the node Node_B of the initial graph model and writes it into the graph database Neo4j.
The attribute writing module 240 is configured to write the data source table name and the data target table name into the graph database Neo4j as attributes of the initial graph model nodes. Specifically, the attribute writing module 240 sets the data source table name as an attribute of the initial graph model node Node_A and writes it into the graph database Neo4j; at the same time, it sets the data target table name as an attribute of the initial graph model node Node_B and writes it into the graph database Neo4j.
The relationship writing module 250 is configured to write the relationships between the data source table field names and the data target table field names into the graph database Neo4j as edges of the initial graph model. Specifically, the relationship writing module 250 sets the relationship between the nodes Node_A and Node_B of the initial graph model and writes it into the graph database Neo4j. In the embodiment of the invention, an application programming interface is used to specify the connection address and account name of a graph database object (Graph) and to establish a connection with the graph database Neo4j. The field names of the data source table and the target table are then taken as vertices of the initial graph model: node objects are created with the create method, and the names of the data source table and the target table are used as the name attributes of the corresponding node objects. Finally, the create method is used to create a relationship object in the graph database Neo4j, with three parameters: the first is the data source field, the second is the description of the relationship direction, 'to', and the third is the data target field.
The traversal module 260 is configured to traverse and parse the SQL statements in all data processing scripts to generate a blood relationship graph model. Specifically, the traversal module 260 traverses and parses the SQL statements in all data processing scripts to obtain the fields and table names of all data source tables and data target tables, uses them respectively as the nodes (Nodes), relationships (Relationships), and attributes (Properties) of the initial graph model, and writes them into the graph database Neo4j to form one large, complete graph, namely the blood relationship graph model.
The blood relationship graph model is a mesh graph that takes one node as its center, with the other nodes associated with the center node; nodes are distinguished according to the depth of the blood relationship. The node information in the blood relationship graph model includes: table name, number of upstream layers, number of downstream layers, number of upstream tables, number of downstream tables, number of direct upstream tables, number of direct downstream tables, and direct downstream table field list. Each entry in the direct downstream table field list includes the relationship between a data source table field name and a data target table field name.
Further, the data blood relationship storage system 200 based on the graph model includes a visualization display module 270 for visually displaying the blood relationship graph model. Specifically, the visualization display module 270 retrieves data using the Cypher query syntax of Neo4j and completes the visual display in combination with the front-end framework vue. In this embodiment, after the graph-model-based data blood relationship has been constructed, it is queried and analyzed through a matching visual interface.
By default, the blood relationship graph model displayed on the visual interface takes a certain data table node as its center, and this node is displayed in red. All data table nodes related to it are displayed as a mesh graph, the data table nodes are connected by gray lines with arrows, and the color of the nodes at each level becomes gradually lighter as the depth of the blood relationship increases. Nodes are displayed as circular icons labeled with the name of the data table they represent; the data tables include data source tables and data target tables. When a node is clicked, a pop-up box displays its related information, including: table name, number of upstream layers, number of downstream layers, number of upstream tables, number of downstream tables, number of direct upstream tables, number of direct downstream tables, and direct downstream table field list. Clicking a table in the direct downstream table field list expands it to display the relationships between data fields, including: the data source table field name, the data target table field name, and a connecting line with a directional arrow.
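The depth-based lightening of node colors can be sketched as a linear blend from the center color toward white. This is an illustrative assumption rather than the interface's actual rendering code: the function name node_color, the 0.8 cap that keeps the deepest level from becoming pure white, and the linear interpolation are all hypothetical choices; only the red center and the progressively lighter levels come from the description.

```python
from typing import Tuple


def node_color(
    depth: int, max_depth: int, base: Tuple[int, int, int] = (255, 0, 0)
) -> str:
    """Return a hex color that gets lighter as the blood relationship depth grows."""
    if depth < 0 or max_depth < 0:
        raise ValueError("depth values must be non-negative")
    # Fraction of the way from the base color to white; depth 0 keeps the base
    # color, and the deepest level stops at 80% so it remains visible.
    ratio = 0.0 if max_depth == 0 else min(depth, max_depth) / max_depth * 0.8
    r, g, b = (round(c + (255 - c) * ratio) for c in base)
    return f"#{r:02x}{g:02x}{b:02x}"
```

With the default base color, the center node (depth 0) is rendered in pure red and each further level moves toward a pale red.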
The visual interface further comprises a search box for text input. After a table name is entered and the search button is clicked, the blood relationship graph is redrawn and the blood relationship graph related to the searched data table is displayed directly. The visualized graph model-based data blood relationship graph supports dragging, zooming in, zooming out and similar operations, so that a user can visually check the flow direction of a given data table within the whole data link.
The graph model-based data blood relationship storage system 200 provided by the embodiment of the invention uses the initial graph model to store data directly in a graph database as the nodes, relationships and attributes of a graph, without designing a complex relational data table structure in advance, thereby greatly reducing the design difficulty and complexity of such scenarios. Based on the in-memory computing mechanism and the optimized data structure of the graph database Neo4j, statistics on the number of upstream and downstream levels of the data blood relationship and on the number of dependent tables can be completed within a few milliseconds even with a large amount of data, and the data fields and dependency relationships of the data tables can be retrieved quickly. Meanwhile, in combination with a well-designed visual interface, a user can quickly view and search data tables by clicking with a mouse, without writing code, and can visually see key information such as the blood relationship flow direction and the blood relationship dependency hierarchy among the data.
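To make the statistics named above concrete, the following Python sketch computes the number of downstream layers, downstream tables and direct downstream tables for one table by breadth-first traversal of an in-memory adjacency map. It is a stand-in for the graph database query, not the Neo4j implementation of the embodiment. The table names and function names are invented for the example, and "layers" is assumed here to mean shortest-path depth. Upstream figures follow by running the same traversal on the reversed edges.

```python
from collections import deque


def downstream_stats(edges, start):
    """Return (layers, tables, direct) for tables downstream of `start`.

    `edges` maps a table name to the tables it feeds directly.
    """
    depth = {start: 0}
    queue = deque([start])
    while queue:
        table = queue.popleft()
        for nxt in edges.get(table, ()):
            if nxt not in depth:  # first visit gives the shortest depth
                depth[nxt] = depth[table] + 1
                queue.append(nxt)
    layers = max(depth.values())
    tables = len(depth) - 1  # exclude the start table itself
    direct = sum(1 for d in depth.values() if d == 1)
    return layers, tables, direct


def reverse(edges):
    """Flip every edge so that a downstream traversal walks upstream."""
    rev = {}
    for src, dsts in edges.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev


# Hypothetical table-level lineage.
edges = {
    "ods_orders": ["dwd_orders"],
    "ods_users": ["dwd_orders"],
    "dwd_orders": ["dws_daily", "ads_report"],
    "dws_daily": ["ads_report"],
}
print(downstream_stats(edges, "ods_orders"))           # (2, 3, 1)
print(downstream_stats(reverse(edges), "ads_report"))  # (2, 4, 2)
```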
Example 4
As shown in fig. 4, the present embodiment provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to cause the at least one processor to perform:
parsing SQL statements in the data processing scripts to obtain the data source table names and field names, the data target table names and field names, and the relations between the data source table fields and the data target table fields;
creating an initial graph model in the graph database Neo4j;
taking the data source table field names and the data target table field names respectively as nodes of the initial graph model, and writing the nodes into the graph database Neo4j;
taking the data source table names and the data target table names respectively as attributes of the nodes of the initial graph model, and writing the attributes into the graph database Neo4j;
taking the relations between the data source table field names and the data target table field names as edges of the initial graph model, and writing the edges into the graph database Neo4j;
and repeating the above steps to traverse and parse the SQL statements in all the data processing scripts, thereby generating a blood relationship graph model.
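The steps above can be sketched as follows, by way of non-limiting example. The Python code parses one simple `INSERT INTO ... SELECT ... FROM ...` statement with a regular expression and emits one parameterized Cypher `MERGE` statement per field mapping, with field names as nodes, table names as node attributes and field relations as edges. The regular expression is a toy that handles only this single statement shape, not a real SQL parser. The label `Field`, the relationship type `FLOWS_TO`, the property names, the sample SQL and the function names are assumptions for illustration and are not taken from the embodiment.

```python
import re

# Toy pattern: INSERT INTO target (cols) SELECT cols FROM source
_PATTERN = re.compile(
    r"INSERT\s+INTO\s+(\w+)\s*\(([^)]*)\)\s*SELECT\s+(.*?)\s+FROM\s+(\w+)",
    re.IGNORECASE | re.DOTALL,
)


def parse_insert_select(sql):
    """Return (source_table, target_table, [(source_field, target_field)])."""
    m = _PATTERN.search(sql)
    if m is None:
        raise ValueError("unsupported statement shape")
    target, target_cols, select_cols, source = m.groups()
    targets = [c.strip() for c in target_cols.split(",")]
    sources = [c.strip() for c in select_cols.split(",")]
    if len(targets) != len(sources):
        raise ValueError("column counts differ")
    return source, target, list(zip(sources, targets))


def to_cypher(source, target, pairs):
    """Build one (query text, parameters) pair per field mapping."""
    text = (
        "MERGE (s:Field {name: $src_field, table: $src_table}) "
        "MERGE (t:Field {name: $dst_field, table: $dst_table}) "
        "MERGE (s)-[:FLOWS_TO]->(t)"
    )
    return [
        (text, {"src_field": s, "src_table": source,
                "dst_field": t, "dst_table": target})
        for s, t in pairs
    ]


# Hypothetical statement taken from a data processing script.
statements = to_cypher(*parse_insert_select(
    "INSERT INTO dwd_orders (order_id, amount) "
    "SELECT id, total FROM ods_orders"
))
```

`MERGE` is used rather than `CREATE` so that repeating the steps over all scripts does not duplicate a field node or an edge that an earlier statement already wrote.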
Example 5
The embodiment of the disclosure provides a non-volatile computer storage medium, wherein the computer storage medium stores computer-executable instructions, and the computer-executable instructions can execute the graph model-based data blood relationship storage method of any of the method embodiments above.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The name of a unit does not, in some cases, constitute a limitation of the unit itself; for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".

Claims (10)

1. A data blood relationship storage method based on a graph model, characterized by comprising the following steps:
parsing SQL statements in a data processing script;
creating an initial graph model;
associating the parsing result with the initial graph model;
and repeating the above operations to traverse the SQL statements in all the data processing scripts and generate a blood relationship graph model.
2. The method of claim 1, wherein after parsing the SQL statements in the data processing script, the method comprises:
acquiring the data source table name and field names, the data target table name and field names, and the relationship between the data source table fields and the data target table fields.
3. The method according to claim 2, wherein creating the initial graph model specifically comprises:
creating the initial graph model in the graph database Neo4j.
4. The method of claim 3, wherein associating the parsing result with the initial graph model comprises:
taking the data source table field names and the data target table field names respectively as nodes of the initial graph model, and writing the nodes into the graph database Neo4j.
5. The method of claim 3, wherein associating the parsing result with the initial graph model further comprises:
taking the data source table name and the data target table name respectively as attributes of the nodes of the initial graph model, and writing the attributes into the graph database Neo4j.
6. The method of claim 3, wherein associating the parsing result with the initial graph model comprises:
taking the relationship between the data source table field names and the data target table field names as an edge of the initial graph model, and writing the edge into the graph database Neo4j.
7. A graph model-based data blood relationship storage system, comprising:
a parsing module, configured to parse SQL statements in a data processing script;
a creation module, configured to create an initial graph model in the graph database Neo4j;
a node writing module, configured to write the data source table field names and the data target table field names into the graph database Neo4j as nodes of the initial graph model, respectively;
an attribute writing module, configured to write the data source table name and the data target table name into the graph database Neo4j as attributes of the nodes of the initial graph model;
a relation writing module, configured to write the relationship between the data source table field names and the data target table field names into the graph database Neo4j as an edge of the initial graph model;
and a traversal module, configured to traverse and parse the SQL statements in all the data processing scripts to generate a blood relationship graph model.
8. The system of claim 7, wherein the parsing module is further configured to obtain the data source table name and field names, the data target table name and field names, and the relationship between the data source table fields and the data target table fields.
9. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1 to 6.
10. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method of any one of claims 1 to 6.
Application CN201910385135.7A, priority date 2019-05-09, filing date 2019-05-09: Data genetic connection storage method, system, medium and equipment based on graph model. Status: Pending. Publication: CN110245270A (en)

Publications (1)

Publication Number Publication Date
CN110245270A true CN110245270A (en) 2019-09-17

Family

ID=67883891

Country Status (1)

Country Link
CN (1) CN110245270A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3327991A1 (en) * 2016-11-29 2018-05-30 Alcatel Lucent Storage of coverage-related information of a telecommunication network
CN108170847A (en) * 2018-01-18 2018-06-15 国网福建省电力有限公司 A kind of big data storage method based on Neo4j chart databases
CN108804701A (en) * 2018-06-19 2018-11-13 苏州大学 Personage's portrait model building method based on social networks big data
CN109446279A (en) * 2018-10-15 2019-03-08 顺丰科技有限公司 Based on neo4j big data genetic connection management method, system, equipment and storage medium

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110781520A (en) * 2019-10-30 2020-02-11 上海观安信息技术股份有限公司 Sensitive table group discovery method and system
CN112749229A (en) * 2019-10-31 2021-05-04 北京国双科技有限公司 Data conversion method, device, storage medium and electronic equipment
CN112783887A (en) * 2019-11-07 2021-05-11 北京沃东天骏信息技术有限公司 Data processing method and device based on data warehouse
CN113127442A (en) * 2020-01-10 2021-07-16 马上消费金融股份有限公司 Visualization method and device of data model and storage medium
CN113127442B (en) * 2020-01-10 2023-12-22 马上消费金融股份有限公司 Method, device and storage medium for visualizing data model
CN111427997A (en) * 2020-03-09 2020-07-17 北京明略软件系统有限公司 Method and device for displaying blood relationship, computer storage medium and terminal
CN111475682A (en) * 2020-04-06 2020-07-31 武汉智领云科技有限公司 Intelligent operation and maintenance platform based on super-large-scale data system
CN111538743A (en) * 2020-04-22 2020-08-14 电子科技大学 SQL-based data blood relationship analysis method and system
CN111538743B (en) * 2020-04-22 2023-08-18 电子科技大学 SQL-based data blood relationship analysis method and system
CN111723253A (en) * 2020-05-25 2020-09-29 贵州华泰智远大数据服务有限公司 Data blood relationship query method and query system based on graph database
CN111782641A (en) * 2020-06-28 2020-10-16 中国工商银行股份有限公司 Data error repairing method and system
CN111782641B (en) * 2020-06-28 2023-07-28 中国工商银行股份有限公司 Data error repairing method and system
CN112749186A (en) * 2021-01-22 2021-05-04 广州虎牙科技有限公司 Data processing method and device, electronic equipment and computer readable storage medium
CN112749186B (en) * 2021-01-22 2024-02-09 广州虎牙科技有限公司 Data processing method, device, electronic equipment and computer readable storage medium
CN113094776A (en) * 2021-04-19 2021-07-09 城云科技(中国)有限公司 Method and system for constructing visual component model data and electronic equipment
CN113486108A (en) * 2021-07-06 2021-10-08 建信金融科技有限责任公司 Data processing method and device, electronic equipment and computer readable medium
CN113672674A (en) * 2021-07-15 2021-11-19 浙江大华技术股份有限公司 Method, electronic device and storage medium for automatically arranging service flow
CN113672774A (en) * 2021-07-29 2021-11-19 国电南瑞科技股份有限公司 Distribution network equipment topology coloring method and device based on distribution cloud master station and graph database
CN113672774B (en) * 2021-07-29 2023-09-29 国电南瑞科技股份有限公司 Distribution network equipment topology coloring method and device based on distribution cloud master station and graph database
CN116010444A (en) * 2023-03-27 2023-04-25 中国人民解放军国防科技大学 Low-code interactive graph query statement construction method
CN117688217A (en) * 2024-02-02 2024-03-12 北方健康医疗大数据科技有限公司 System, method and medium for realizing data blood relationship structure based on directed graph
CN117786023A (en) * 2024-02-28 2024-03-29 北方健康医疗大数据科技有限公司 Medical data blood-edge analysis method, system, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190917)