WO2022057279A1

WO2022057279A1 - Visual graph calculation method and system, and storage medium and electronic device

Info

Publication number: WO2022057279A1
Application number: PCT/CN2021/092928
Authority: WO
Inventors: 李欣刚; 陈泽瀛; 舒艳华; 蔡朝辉; 叶国林
Original assignee: 银联商务股份有限公司
Priority date: 2020-09-18
Filing date: 2021-05-11
Publication date: 2022-03-24
Also published as: CN112256695B; CN112256695A

Abstract

A visual graph calculation method and system, and a storage medium and an electronic device, which solve the technical problem in the prior art of data being insecure during a graph calculation process. According to the visual graph calculation method, on a visual graph calculation platform, a user selects operators of a graph and a connection relationship between the operators, and configures parameters of the operators; a big data cluster can generate a table data file of the graph according to the operators and the parameters; a graph calculation server constructs an instance of the graph and generates structured data of the graph according to the operators, the connection relationship between the operators, and the configuration parameters of the operators; the big data cluster transforms the structured data of the graph into table data of the graph; and the user can only obtain the structured data of the graph on the graph calculation server, and obtains the table data of the graph in the big data cluster. The entire process involves transmission between a back-end graph calculation server and a big data cluster, and a user cannot see any data, such that the data security is improved.

Description

Visualized graph computing method and its system, storage medium and electronic device

Technical field

The present application relates to the field of computer technology, and in particular, to a visual graph computing method and system, storage medium, and electronic device.

Background

With the rapid development of big data technology, major companies, especially networked enterprises, collect, store, process, share, retrieve, analyze, and display data, and mine the business value behind it from various angles. The data generated by interactions between different individuals is represented in the form of graphs, and large-scale graph data has accumulated in fields such as communications, the Internet, e-commerce, social networks, and the Internet of Things.

A graph is composed of nodes and edges, and data with a graph structure is graph data. Graph computing is a technology for processing graph data. Existing solutions such as graph databases and graph computing frameworks, whether distributed or single-node, are built on physical machines and meet user needs through services deployed on those machines, with multiple users sharing the same service.

The graph computing server in the related art often requires the needed graph data to be exported from the database and then manually input into the graph computing server for computation. For data with sensitive fields, exporting the data and manually entering it into the graph computing server increases the probability of data leakage or loss; that is, there is a problem of data insecurity.

Summary of the application

In view of this, the embodiments of the present application provide a visual graph computing method and system, storage medium, and electronic device, which solve the technical problem of data insecurity in the graph computing process in the prior art.

As a first aspect, the embodiments of the present application provide a visual graph computing method applied to a visual graph computing system. The visual graph computing system includes a visual graph computing platform, a big data cluster, and a graph computing server, wherein the big data cluster stores the original table data of the graph, and the graph computing server stores the original structured data of the graph. The visualized graph computing method includes:

Obtain the original table data of the graph to be processed according to the first input of the user;

According to the second input of the user, a task workflow file of the graph to be processed is generated, where the task workflow file includes the original table data and a plurality of task nodes, wherein the plurality of task nodes include at least one first task node and at least one second task node;

Generate at least one first command to execute the first task node according to the at least one first task node, and calculate the original table data to generate a table data file;

Generate a graph instance to be processed according to the at least one first command and the at least one second task node, and calculate the table data file according to the graph instance to be processed to generate structured data of the graph to be processed; and

Convert the structured data of the graph to be processed into table data of the graph to be processed.

In an embodiment of the present application, after obtaining the original table data of the graph to be processed, and before generating the task workflow file of the graph to be processed according to the second input of the user, the visual graph computing method further includes:

preprocessing the original table data to obtain the preprocessed table data of the graph to be processed.

In this case, generating at least one first command to execute the first task node according to the at least one first task node, calculating the original table data, and generating a table data file includes:

generating at least one first command to execute the first task node according to the at least one first task node, calculating the preprocessed table data, and generating a table data file.

In an embodiment of the present application, according to the second input of the user, the task workflow file of the graph to be processed is generated, including:

generating a data import node according to the data import operator input by the user;

generating a to-be-processed graph creation node according to the to-be-processed graph creation operator input by the user;

generating a to-be-processed graph computing node according to a plurality of algorithm operators of the to-be-processed graph input by the user;

generating a data export node according to the data export operator input by the user;

generating a stop node according to the stop operator of the to-be-processed graph input by the user;

generating the relationship between the multiple algorithm operators according to the connection relationship between the multiple algorithm operators input by the user;

performing parameter configuration on the plurality of algorithm operators according to the preset configuration mode input by the user, and generating parameters of each of the algorithm operators; and

generating the task workflow file of the graph to be processed according to the submission instruction input by the user;

wherein the at least one first task node includes: the data import node, the to-be-processed graph computing node, the data export node, and the stop node; and

the at least one second task node includes: the to-be-processed graph creation node.
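The division into first and second task nodes described above can be illustrated with a minimal sketch. The `TaskNode` class, the string identifiers for the node kinds, and the `split_task_nodes` helper are hypothetical names chosen for illustration; the application does not specify how nodes are represented.

```python
from dataclasses import dataclass, field

# Node kinds named in the text; the string identifiers are illustrative only.
FIRST_TASK_KINDS = {"data_import", "graph_compute", "data_export", "stop"}
SECOND_TASK_KINDS = {"graph_create"}


@dataclass
class TaskNode:
    """Hypothetical representation of one task node in the workflow file."""
    kind: str
    params: dict = field(default_factory=dict)


def split_task_nodes(nodes):
    """Partition nodes into (first task nodes, second task nodes)."""
    first, second = [], []
    for node in nodes:
        if node.kind in FIRST_TASK_KINDS:
            first.append(node)
        elif node.kind in SECOND_TASK_KINDS:
            second.append(node)
        else:
            raise ValueError(f"unknown task node kind: {node.kind}")
    return first, second
```

Under these assumptions, the first task nodes are the ones from which commands are later generated, while the graph creation node is kept separate because it is consumed when the graph instance is built.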

In an embodiment of the present application, generating at least one first command to execute the first task node according to the at least one first task node, calculating the preprocessed table data, and generating a table data file includes:

generating a data import command, a graph calculation command, a data export command, and a stop command according to the data import node, the to-be-processed graph computing node, the data export node, and the stop node, respectively; and

calculating the preprocessed table data to generate a table data file.

In an embodiment of the present application, calculating the preprocessed table data to generate a table data file includes:

calculating, according to the data import command, the point data in the preprocessed table data corresponding to the multiple algorithm operators of the to-be-processed graph, and the edge data corresponding to the connection relationship between the multiple algorithm operators of the to-be-processed graph; and

generating the header files of the point data and the edge data according to the parameters of the algorithm operators;

wherein the table data file includes: a plurality of point data, a plurality of edge data, and a header file.
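As a rough illustration of a table data file made up of point data, edge data, and a header, the sketch below serializes rows to CSV text and records the column names separately. The function name, the dictionary layout, and the choice of CSV are assumptions; the application does not state the file format.

```python
import csv
import io


def build_table_data_file(points, edges, point_fields, edge_fields):
    """Return point CSV text, edge CSV text, and a header describing the columns.

    The layout is an assumption made for illustration only.
    """
    def to_csv(rows, fields):
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        for row in rows:
            writer.writerow([row[name] for name in fields])
        return buf.getvalue()

    header = {"point": list(point_fields), "edge": list(edge_fields)}
    return {
        "points": to_csv(points, point_fields),
        "edges": to_csv(edges, edge_fields),
        "header": header,
    }
```

Keeping the header apart from the rows mirrors the text, in which the header files are generated from the operator parameters rather than from the data itself.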

In an embodiment of the present application, generating a graph instance to be processed according to the at least one first command and the at least one second task node, and calculating the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed, includes:

creating an original instance of the to-be-processed graph according to the to-be-processed graph creation node and the data import command;

modifying the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed;

performing calculation, according to the graph calculation command, on the basis of the instance of the graph to be processed, the table data file, the multiple algorithm operators of the to-be-processed graph, the parameters of each of the algorithm operators, and the connection relationship between the multiple algorithm operators, to generate the structured data of the to-be-processed graph; and

exporting the structured data of the graph to be processed according to the data export command.

In an embodiment of the present application, before the structured data of the to-be-processed graph is exported according to the data export command, generating a graph instance to be processed according to the at least one first command and the at least one second task node, and calculating the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed, further includes:

determining whether the parameters of the algorithm operator are consistent with the configuration parameters in the instance; and

modifying the parameters of the algorithm operator when they are inconsistent with the configuration parameters in the instance.
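The consistency check can be sketched as follows. The text says only that the operator parameters are modified when they are inconsistent; overwriting them with the values from the instance configuration is an assumption, and `reconcile_params` is a hypothetical helper name.

```python
def reconcile_params(operator_params, instance_config):
    """Compare operator parameters with the instance configuration.

    Assumption: on a mismatch, the operator parameter takes the instance's
    value. Returns the reconciled parameters and the names that were changed.
    """
    reconciled = dict(operator_params)
    changed = []
    for key, value in instance_config.items():
        if key in reconciled and reconciled[key] != value:
            reconciled[key] = value
            changed.append(key)
    return reconciled, changed
```

Returning the list of changed names makes it possible to log or report which parameters were adjusted before the structured data is exported.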

In an embodiment of the present application, before obtaining the original table data of the graph to be processed, the visualized graph calculation method further includes:

generating first verification information according to the username and password input by the user, where the first verification information represents a request to verify whether the username and password are correct; and

when second verification information is received, generating a first signature, where the first signature is used to prompt the user that the username and password are correct, and obtaining the original table data of the graph to be processed according to the user's input, where the second verification information indicates that the username and the password are correct.

In this case, modifying the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed includes:

acquiring the username of the user, and generating third verification information according to the username, where the third verification information represents a request to acquire the password of the username;

generating fourth verification information according to the username input by the user and the password of the username, where the fourth verification information represents a request to verify whether the username and the password are correct; and

when fifth verification information is received, modifying the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed, where the fifth verification information indicates that the username and the password are correct.
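The second half of this exchange (look up the password for a username, re-verify, and only then modify the configuration file) might be sketched as below. The `VerificationServer` class, its in-memory credential store, and the `owner` field written into the configuration are all hypothetical; the application does not describe the message formats or how credentials are stored.

```python
import hmac


class VerificationServer:
    """Hypothetical in-memory stand-in for the verification server."""

    def __init__(self, credentials):
        self._credentials = dict(credentials)

    def verify(self, username, password):
        """Answer a 'verify username and password' request."""
        stored = self._credentials.get(username)
        if stored is None:
            return False
        # Constant-time comparison of the two values.
        return hmac.compare_digest(stored.encode(), password.encode())

    def lookup_password(self, username):
        """Answer a 'get the password of this username' request."""
        return self._credentials.get(username)


def build_instance_config(server, username, original_config):
    """Modify the original instance configuration only after verification."""
    password = server.lookup_password(username)       # third verification information
    if password is None or not server.verify(username, password):  # fourth and fifth
        raise PermissionError(f"verification failed for {username!r}")
    config = dict(original_config)
    config["owner"] = username                        # illustrative field only
    return config
```

A real deployment would not keep plaintext passwords in memory; the sketch follows the text only in that the password is queried by username and then verified.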

As a second aspect of the present application, an embodiment of the present application provides a visualized graph computing system, including:

A visual graph computing platform, configured to obtain the original table data of the graph to be processed according to the first input of the user, and generate the task workflow file of the graph to be processed according to the second input of the user, and the task workflow file includes the original table data and multiple task nodes, wherein the multiple task nodes include at least one first task node and at least one second task node;

a big data cluster, configured to store the original table data of the graph, generate at least one first command to execute the first task node according to the at least one first task node, and calculate the original table data to generate a table data file; and

a graph computing server, configured to store the original structured data of the graph, generate a graph instance to be processed according to the at least one first command and the at least one second task node, and calculate the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed;

Wherein, the big data cluster is further used to convert the structured data of the graph to be processed into table data of the graph to be processed.

In an embodiment of the present application, the visualized graph computing system further includes:

a verification server, configured to verify, according to the username and password of the user, whether the username and the password are correct, and to query, according to the username, the password corresponding to the username.

As a third aspect of the present application, an embodiment of the present application provides a computer-readable storage medium, including:

a storage medium; the storage medium stores a computer program,

Wherein, the computer program is used to execute the above-mentioned visualized graph computing method.

As a fourth aspect of the present application, an embodiment of the present application provides an electronic device, the electronic device comprising:

a processor; and

a memory for storing the processor-executable instructions;

Wherein, the processor is configured to execute the above-mentioned visualized graph computing method.

A visualized graph computing method provided by an embodiment of the present application is applied to a visualized graph computing system that includes a visualized graph computing platform, a big data cluster, and a graph computing server, wherein the big data cluster stores the original table data of the graph and the graph computing server stores the original structured data of the graph. On the visual graph computing platform, the user selects the operators of the graph and the connection relationship between the operators, and configures the parameters of the operators. The big data cluster generates the table data file of the graph according to the operators and parameters of the graph, while the graph computing server builds an instance of the graph according to the operators of the graph, the connection relationship between the operators, and the configuration parameters of the operators. The graph computing server then performs graph computation on the table data file from the big data cluster to generate the structured data of the graph, and the big data cluster converts the structured data of the graph into the table data of the graph. The user can obtain the structure of the graph and the structured data of the graph only on the graph computing server, and can obtain the table data of the graph only in the big data cluster. During the whole process, the user does not need to download the table data of the graph and then import it into the graph computing server. All types of data are transmitted only between the back-end graph computing server and the big data cluster, and the user cannot see any of this data, which improves data security. In addition, when performing graph computing, the user only needs to focus on the logic of the graph, and the workflow of the graph is assembled operator by operator, which greatly reduces the cost of learning specialized graph computing knowledge and improves the efficiency of graph computing.

Description of drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the following briefly introduces the accompanying drawings used in the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, other drawings can also be obtained from these drawings without any creative effort.

FIG. 1 is a schematic structural diagram of a visualized graph computing system provided by an embodiment of the present application;

FIG. 2 shows a schematic flowchart of a visual graph computing method provided by an embodiment of the present application;

FIG. 3 shows a schematic flowchart of a visual graph computing method provided by another embodiment of the present application;

FIG. 4 is a schematic structural diagram of a visualized graph computing system provided by another embodiment of the present application;

FIG. 5 is a schematic structural diagram of a visualized graph computing system provided by another embodiment of the present application.

Detailed description

In order to better understand the technical solutions of the present application, the embodiments of the present application are described in detail below with reference to the accompanying drawings.

It should be clear that the described embodiments are only a part of the embodiments of the present application, but not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present application.

The terms used in the embodiments of the present application are only for the purpose of describing specific embodiments, and are not intended to limit the present application. As used in the embodiments of this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise.

It should be understood that the term "and/or" used in this document only describes an association relationship between associated objects and indicates that three kinds of relationships may exist. For example, A and/or B may indicate three cases: A exists alone, A and B exist at the same time, and B exists alone. In addition, the character "/" in this document generally indicates that the related objects are in an "or" relationship.

FIG. 1 shows a visualized graph computing system provided by an embodiment of the present application, including:

a visual graph computing platform, which includes a client 4 and a core server 1. The client 4 displays a graph computing operation interface on which the user performs operations, such as logging in to the system web page and dragging and dropping operators according to the graph to be computed. The core server 1 generates a graph computing workflow in response to the user's operations on the system web page on the client 4;

a big data cluster 3, which stores the original table data of multiple graphs. Through the big data cluster 3, users can query the table data of the graphs, the table data of the points of a graph, and the table data of the edges of a graph; and

a graph computing server 2, which stores the original structured data of multiple graphs, that is, instances of multiple graphs. Through the graph computing server 2, users can view the structure of a graph. Data is imported and exported between the graph computing server 2 and the big data cluster 3: when the table data in the big data cluster 3 is imported into the graph computing server 2, users can view the structure of the graph that the table data forms; when the structured data of a graph in the graph computing server 2 is exported to the big data cluster 3, users can query the table data of the graph through the big data cluster 3.

When the user needs to compute a to-be-processed graph, the user can perform visual computation on it using the visualized graph computing system. As shown in FIG. 2, the visual graph computing method includes the following steps:

Step S101: the client 4 displays the graph computing system web page; the user logs in to the system web page and inputs a first request, for example, a request to view the original table data of each element of the graph to be processed, such as the table data of the points and the table data of the edges of the graph to be processed;

Step S102: the core server 1 obtains the first request of the user and sends the first request, "view the original table data of the graph to be processed", to the big data cluster 3;

Step S103: the big data cluster 3 queries the original table data of each element of the to-be-processed graph according to the first request, and sends the original table data of the to-be-processed graph to the core server 1;

At this point, the user can see the original table data of the graph to be processed on the system webpage.

Step S104: According to the to-be-processed graph, the user inputs a preset item on the web page, for example, submits a request or operation related to the calculation of the to-be-processed graph, such as dragging and dropping various operators;

Step S105: in response to the user's input, the core server 1 generates a task workflow file of the graph to be processed and transmits the task workflow file to the big data cluster 3, wherein the task workflow file includes the original table data of the graph to be processed and multiple task nodes, and the multiple task nodes include at least one first task node and at least one second task node;

Step S106: according to the received task workflow file, the big data cluster 3 generates, from the at least one first task node, at least one first command for executing the first task node, and transmits the at least one first command to the graph computing server 2; the big data cluster 3 also calculates the original table data of the graph to be processed according to the at least one first task node, generates the table data file of the graph to be processed, and sends the table data file to the graph computing server 2, wherein the table data file of the graph to be processed includes: the original table data of each element of the graph to be processed and the header file of the original table data of each element;

Step S107: the graph computing server 2 receives the at least one first command and the table data file of the graph to be processed, generates an instance of the graph to be processed according to the second task node and the at least one first command, calculates the table data according to the instance of the graph to be processed to generate the structured data of the graph to be processed, and transmits the structured data of the graph to be processed to the big data cluster 3;

At this time, after the graph computing server 2 has computed the graph to be processed and generated the structured data, the structured data can be sent to the core server 1, and the user can view the structure of the graph to be processed on the client 4.

Step S108: when the big data cluster 3 receives the structured data of the graph to be processed, it converts the structured data into table data of the graph to be processed. At this time, the big data cluster 3 can send the table data of the graph to be processed to the core server 1, and the user can view the table data of the to-be-processed graph on the client 4. In addition, because the big data cluster 3 has a storage function, the table data produced by computing the to-be-processed graph is stored in the big data cluster 3. This provides ready-made table data for later computations on the same graph, which improves graph computing efficiency, and when the user needs the table data of the to-be-processed graph in other application scenarios, it can be viewed or exported directly from the big data cluster 3.
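The back-end part of steps S106 to S108 can be sketched in a few lines to show that the data moves only between the big data cluster and the graph computing server. The classes below are in-memory stand-ins with hypothetical names, and using an adjacency mapping as the "structured data" is an assumption made purely for illustration.

```python
class BigDataCluster:
    """Hypothetical in-memory stand-in for big data cluster 3."""

    def __init__(self, tables):
        # graph name -> {"points": [...], "edges": [(src, dst), ...]}
        self.tables = tables

    def build_table_file(self, graph):
        """S106: derive the table data file from the stored table data."""
        raw = self.tables[graph]
        return {"points": list(raw["points"]), "edges": list(raw["edges"])}

    def to_table(self, graph, structured):
        """S108: convert structured data back into rows and store them."""
        rows = [
            {"point": point, "neighbors": sorted(neighbors)}
            for point, neighbors in sorted(structured.items())
        ]
        self.tables[graph + "_result"] = rows
        return rows


class GraphComputingServer:
    """Hypothetical stand-in for graph computing server 2."""

    def compute(self, table_file):
        """S107: build an undirected adjacency mapping as the structured data."""
        adjacency = {point: set() for point in table_file["points"]}
        for src, dst in table_file["edges"]:
            adjacency[src].add(dst)
            adjacency[dst].add(src)
        return adjacency


def run_workflow(cluster, server, graph):
    """Run S106 to S108 without the data ever leaving the two back ends."""
    table_file = cluster.build_table_file(graph)
    structured = server.compute(table_file)
    return cluster.to_table(graph, structured)
```

In this sketch the caller of `run_workflow` names a graph and receives result rows; it never has to export table data and feed it to the graph computing server by hand, which is the manual step the application sets out to remove.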

A visualized graph computing method provided by an embodiment of the present application is applied to a visualized graph computing system that includes a visualized graph computing platform, a big data cluster 3, and a graph computing server 2, wherein the big data cluster 3 stores the original table data of the graph and the graph computing server 2 stores the original structured data of the graph. On the visual graph computing platform, the user selects the operators of the graph and the connection relationship between the operators, and configures the parameters of the operators. The big data cluster 3 generates the table data file of the graph according to the operators and parameters of the graph, while the graph computing server 2 builds an instance of the graph according to the operators of the graph, the connection relationship between the operators, and the configuration parameters of the operators. The graph computing server 2 then performs graph computation on the table data file from the big data cluster 3 to generate the structured data of the graph, and the big data cluster 3 converts the structured data of the graph into the table data of the graph. The user can obtain the graph structure and the structured data of the graph only on the graph computing server 2, and can obtain the table data of the graph only in the big data cluster 3. During the whole process, the user does not need to download the table data of the graph and import it into the graph computing server 2. Throughout the graph computing process, whatever the type of graph data, it is transmitted only between the back-end graph computing server 2 and the big data cluster 3, and the user cannot see any of this data, which improves data security. In addition, when performing graph computing, the user only needs to focus on the logic of the graph, and the workflow of the graph is assembled operator by operator, which greatly reduces the cost of learning specialized graph computing knowledge and improves the efficiency of graph computing.

In an embodiment of the present application, the original table data that the big data cluster 3 queries for each element of the to-be-processed graph in step S103 according to the first request is not necessarily all usable for the to-be-processed graph. Therefore, between step S103 and step S104, as shown in FIG. 3, the visualized graph calculation method further includes:

Step S1031: the core server 1 preprocesses the original table data to obtain the preprocessed table data of the graph to be processed;

In this case, step S106 is: the big data cluster 3 generates at least one first command according to the received task workflow file, and transmits the at least one first command to the graph computing server 2; the big data cluster 3 also calculates the preprocessed table data of the graph to be processed according to the at least one first task node, generates the table data file of the graph to be processed, and sends the table data file to the graph computing server 2, wherein the table data file of the graph to be processed includes: the original table data of each element of the graph to be processed and the header file of the original table data of each element.

Because the original table data of the graph to be processed is preprocessed, the amount of data computed and transmitted among the core server 1, the graph computing server 2, and the big data cluster 3 is reduced, and the raw data used throughout the computation is all original table data related to the graph to be processed, which improves computational efficiency.
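The application does not say what the preprocessing in step S1031 consists of. As one possible reading, the sketch below drops points without an identifier, removes duplicate points, and discards edges whose endpoints are no longer present; the field names `id`, `src`, and `dst` and these filtering rules are assumptions.

```python
def preprocess(points, edges):
    """Keep unique points that have an id, and edges whose endpoints both survive.

    The filtering rules are illustrative assumptions, not taken from the text.
    """
    seen = set()
    clean_points = []
    for point in points:
        point_id = point.get("id")
        if point_id is None or point_id in seen:
            continue
        seen.add(point_id)
        clean_points.append(point)
    clean_edges = [
        edge for edge in edges
        if edge.get("src") in seen and edge.get("dst") in seen
    ]
    return clean_points, clean_edges
```

Whatever the actual rules are, the effect described in the text is the same: less data is computed and transmitted in the later steps.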

In an embodiment of the present application, as shown in FIG. 3, step S105, in which the core server 1 generates, in response to the user's input, a task workflow file of the graph to be processed and transmits the task workflow file to the big data cluster 3, wherein the task workflow file includes the original table data of the graph to be processed, multiple task nodes, and the execution sequence between the multiple task nodes, may specifically include the following steps:

Step S1051: the core server 1 generates a data import node according to the data import operator input by the user; that is, the user operates on the graph computing interface of the client 4 and drags and drops a data import operator, and the core server 1 then generates a data import node in response to the user's input.

The data import operators include an import edge file operator and an import point file operator.

Step S1052: a to-be-processed graph creation node is generated according to the to-be-processed graph creation operator input by the user; that is, the user operates on the graph computing interface of the client 4 and drags and drops the graph creation operator, and the core server 1 then generates the graph creation node in response to the user's input;

Step S1053: a to-be-processed graph computing node is generated according to the multiple algorithm operators of the to-be-processed graph input by the user; that is, the user operates on the graph computing interface of the client 4 and drags and drops the multiple algorithm operators of the graph to be processed, and the core server 1 then generates a to-be-processed graph computing node in response to the user's input;

The multiple algorithm operators of the graph to be processed are mainly used to splice the algorithm operator parameters that the user inputs at the client into a statement accepted by the graph computing server.
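Splicing operator parameters into a statement might look like the sketch below. The `CALL name(key=value, ...)` syntax is entirely hypothetical and does not correspond to any particular graph computing server; the application does not disclose the statement format.

```python
import json


def splice_statement(operator_name, params):
    """Join an operator name and its parameters into one statement string.

    The statement syntax is a made-up placeholder for illustration.
    """
    if not operator_name.isidentifier():
        raise ValueError(f"invalid operator name: {operator_name!r}")
    parts = []
    for key in sorted(params):
        if not key.isidentifier():
            raise ValueError(f"invalid parameter name: {key!r}")
        # json.dumps quotes strings and leaves numbers bare.
        parts.append(f"{key}={json.dumps(params[key])}")
    return f"CALL {operator_name}({', '.join(parts)})"
```

Validating the names and quoting the values in one place means the user only supplies parameter values on the interface and never writes the statement by hand.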

Step S1054: a data export node is generated according to the data export operator input by the user; that is, the user operates on the graph computing interface of the client 4 and drags and drops the graph data export operator, and the core server 1 then generates a data export node in response to the user's input;

Step S1055: a stop node is generated according to the stop operator of the to-be-processed graph input by the user; that is, the user operates on the graph computing interface of the client 4 and drags and drops the stop operator of the to-be-processed graph, and the core server 1 then generates a stop node in response to the user's input. At this point, the user has dragged and dropped all the algorithm operators of the to-be-processed graph, that is, all the algorithm operators of the to-be-processed graph are ready;

Step S1056: the relationship between the algorithm operators is generated according to the connection relationship between the multiple algorithm operators input by the user; that is, the user operates on the graph computing interface of the client 4 and connects the multiple algorithm operators that were dragged and dropped in the preceding steps. For example, the user may perform this operation so that a connection relationship exists among the first algorithm operator, the second algorithm operator, and the seventh algorithm operator. The core server 1 then, in response to the user's input, generates a computing node of the multiple operators; that is, the graph to be processed can be computed through the connection relationship of the multiple algorithm operators to obtain the corresponding computed data.
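If the connections drawn between operators are treated as directed (an assumption; the text only says that a connection relationship exists), an execution order can be derived with a topological sort. The sketch uses the standard-library `graphlib` module, available from Python 3.9; the operator names are placeholders.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(connections):
    """Order operators so that every upstream operator runs before its downstream.

    connections: iterable of (upstream, downstream) operator names.
    """
    sorter = TopologicalSorter()
    for upstream, downstream in connections:
        sorter.add(downstream, upstream)  # downstream depends on upstream
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"operators form a cycle: {exc.args[1]}") from exc
```

For the example in the text, connecting the first operator to the second and the second to the seventh would be passed as `[("op1", "op2"), ("op2", "op7")]`, where the `op` names are illustrative.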

Step S1057: Perform parameter configuration on the multiple algorithm operators according to the preset configuration mode input by the user, and generate the parameters of each algorithm operator; that is, the user operates on the graph computing interface of the client 4 and configures the parameters of each dragged-and-dropped operator, and the core server 1 then generates the parameters of each algorithm operator in response to the user's operation.

For example, when a user drags and drops an operator, the user only needs to specify a value for each parameter name of the operator. For example, the parameters of the load point file operator are described as follows:

An example of the value filled by the user is as follows:

Step S1058: Generate a task workflow file of the graph to be processed according to the submission instruction input by the user;

That is, when the user has completed all the operations in the above steps S1051 to S1057 on the graph computing interface of the client 4 and clicks "Submit task", the core server 1 responds to the "Submit task" instruction and generates a task workflow file from all the operations performed in steps S1051 to S1057. The task workflow file includes the preprocessing table data and multiple task nodes, wherein the multiple task nodes include at least one first task node and at least one second task node. The at least one first task node includes: a data import node, a to-be-processed graph computing node, a data export node, and a stop node; the at least one second task node includes: the to-be-processed graph creation node.

At this point, the core server 1 has finished generating the task workflow file. Although the task workflow file itself is generated in step S1058, the core server 1 generates the corresponding nodes in the background as the user operates on the graph computing interface of the client 4, that is, in steps S1051 to S1057. Only when the user clicks to submit the task on the interface, as shown in Figure 3, are the task nodes generated in steps S1051 to S1057 unified into the task workflow file. The task workflow file thus includes the preprocessing table data obtained in step S1031 and the plurality of task nodes obtained in steps S1051 to S1057.
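The assembly of the task workflow file described above can be illustrated with a minimal sketch. The node kind strings, the JSON layout, the `TaskNode` and `build_workflow` names, and the `pagerank` parameter are all assumptions made for illustration; the application does not specify the file format.

```python
import json
from dataclasses import dataclass, field, asdict

# Node kinds follow the description; the string values are illustrative only.
FIRST_TASK_KINDS = ("data_import", "graph_compute", "data_export", "stop")
SECOND_TASK_KINDS = ("graph_create",)


@dataclass
class TaskNode:
    kind: str                                   # one of the kinds listed above
    params: dict = field(default_factory=dict)  # operator parameters (step S1057)


def build_workflow(preprocessed_table, nodes, connections):
    """Unify the nodes produced in steps S1051-S1057 into one workflow document."""
    known = FIRST_TASK_KINDS + SECOND_TASK_KINDS
    unknown = [n.kind for n in nodes if n.kind not in known]
    if unknown:
        raise ValueError(f"unknown node kinds: {unknown}")
    first = [asdict(n) for n in nodes if n.kind in FIRST_TASK_KINDS]
    second = [asdict(n) for n in nodes if n.kind in SECOND_TASK_KINDS]
    if not first or not second:
        raise ValueError("need at least one first and one second task node")
    return {
        "preprocessed_table": preprocessed_table,
        "first_task_nodes": first,
        "second_task_nodes": second,
        "connections": [list(c) for c in connections],
    }


nodes = [
    TaskNode("data_import", {"table": "demo_table"}),
    TaskNode("graph_create"),
    TaskNode("graph_compute", {"algorithm": "pagerank"}),  # hypothetical operator
    TaskNode("data_export"),
    TaskNode("stop"),
]
workflow = build_workflow("demo_table", nodes, [(0, 1), (1, 2), (2, 3), (3, 4)])
workflow_file = json.dumps(workflow, indent=2)  # serialized on "Submit task"
```

In this sketch the nodes exist before submission, and `build_workflow` only collects them, which mirrors the distinction drawn above between steps S1051 to S1057 and step S1058.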

When the core server 1 completes the task workflow file, the task workflow file is transmitted to the big data cluster 3 and the graph computing server 2. The big data cluster 3 generates at least one first command according to the at least one first task node in the task workflow file, wherein the at least one first command includes: a data import command, a to-be-processed graph calculation command, a data export command, and a stop command. At the same time, the big data cluster 3 performs calculation according to the preprocessing table data in the task workflow file and generates the table data file of the graph to be processed. That is, the big data cluster 3 executes step S106 after receiving the task workflow file, and step S106 specifically includes the following steps:

Step S1061: According to the data import node, calculate the point data in the preprocessing table data corresponding to the multiple algorithm operators in the graph to be processed, and the edge data corresponding to the connection relationship between the multiple algorithm operators in the graph to be processed;

Step S1062: Generate the header files of the point data and the edge data according to the parameters of the algorithm operator, and send the table data file to the graph computing server 2;

Step S1063 to step S1066: Send the commands that make up the first command, namely the data import command, the graph calculation command, the data export command, and the stop command, to the graph computing server 2 in sequence.
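Steps S1061 and S1062 can be sketched as follows: point data and edge data are derived from the rows of a table, and a separate header is produced for each. The sample rows, the column names (`src`, `dst`, `amount`, `id`, `start_id`, `end_id`), and the CSV layout are hypothetical; the application does not define the file format.

```python
import csv
import io

# Hypothetical preprocessing table: one row per transfer between two accounts.
rows = [
    {"src": "a1", "dst": "a2", "amount": "10"},
    {"src": "a2", "dst": "a3", "amount": "25"},
    {"src": "a1", "dst": "a3", "amount": "5"},
]


def split_points_and_edges(rows, src_col, dst_col, edge_cols):
    """Derive de-duplicated point data and per-row edge data from table rows."""
    points = sorted({r[src_col] for r in rows} | {r[dst_col] for r in rows})
    edges = [[r[src_col], r[dst_col]] + [r[c] for c in edge_cols] for r in rows]
    return points, edges


def to_csv(records):
    """Render a list of records as CSV text with '\n' line endings."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(records)
    return buf.getvalue()


edge_cols = ["amount"]
points, edges = split_points_and_edges(rows, "src", "dst", edge_cols)

# Headers are kept apart from the data, as in step S1062.
point_header = to_csv([["id"]])
point_data = to_csv([[p] for p in points])
edge_header = to_csv([["start_id", "end_id"] + edge_cols])
edge_data = to_csv(edges)
```

Keeping the headers separate from the data lets the header be regenerated from the operator parameters without rewriting the data records.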

When the user operates on the graph computing interface of the client 4, the core server 1 forms the first task node and the second task node according to the operation, and the big data cluster 3 generates the first command according to the first task node. The graph computing server 2, according to the first task node and the second task node, converts the operator corresponding to each task node into the language allowed by the graph computing server 2, requests the first command from the big data cluster 3, and then executes the corresponding operations in sequence according to the first command. That is, step S107 executed by the graph computing server 2 specifically includes the following steps:

Step S1071: Create the original instance of the graph to be processed according to the to-be-processed graph creation node and the data import command; that is, when the user drags the to-be-processed graph creation operator on the graph computing interface of the client 4, the graph computing server 2 converts the graph creation operator into the language allowed by the graph computing server 2 and requests the big data cluster 3 to send the "data import command". On receiving the "data import command" sent by the big data cluster 3, the graph computing server 2 creates an original instance of the graph to be processed, and modifies the configuration file of the original instance to generate an instance of the graph to be processed.

Step S1072: According to the to-be-processed graph calculation command, and based on the instance of the to-be-processed graph, perform calculation on the table data file, the multiple algorithm operators of the to-be-processed graph, the parameters of each algorithm operator, and the connection relationship of the multiple algorithm operators, and generate the structured data of the graph to be processed; that is, when the user drags and drops the multiple algorithm operators of the graph to be processed on the graph computing interface of the client 4, the graph computing server 2 converts the multiple algorithm operators into the language allowed by the graph computing server 2 and requests the big data cluster 3 to send the "graph calculation command". On receiving the "graph calculation command" sent by the big data cluster 3, the graph computing server 2 performs calculation according to the instance of the graph to be processed obtained in step S1071 and generates the structured data of the graph to be processed. The user can then query the structured data of the to-be-processed graph, that is, the structure of the to-be-processed graph, on the graph computing interface of the client 4.

Step S1073: According to the data export command and the parameters of the multiple algorithm operators, export the structured data to the big data cluster 3 for user query or other business use. That is, when the user drags the data export operator on the graph computing interface of the client 4, the graph computing server 2 converts the data export operator into the language allowed by the graph computing server 2 and requests the big data cluster 3 to send the "data export command". On receiving the "data export command" sent by the big data cluster 3, the graph computing server 2 exports the structured data to the big data cluster 3 according to the parameters of the algorithm operator. The big data cluster 3 then converts the structured data into table data, after which the user can view the table data of the graph to be processed on the graph computing interface of the client 4 for user query or other business use.

Optionally, step S1073 specifically includes: judging whether the parameters of the algorithm operator are consistent with the configuration parameters in the instance; when the parameters of the algorithm operator are inconsistent with the configuration parameters in the instance, modifying the parameters of the algorithm operator.

Step S1074: According to the stop command, stop the execution of the algorithm operators of the graph to be processed, and stop the graph to be processed. That is, after all the operators of the to-be-processed graph have been executed successfully, that is, after the structure diagram and table data of the to-be-processed graph have been obtained and the current to-be-processed graph is no longer in use, the to-be-processed graph is stopped and its instance is deleted, which frees up resources.
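Steps S1071 to S1074 run strictly in order, each triggered by one command from the big data cluster 3. A minimal sketch of that ordering, assuming hypothetical command names and a hypothetical `GraphInstanceLifecycle` class, might look like this:

```python
# Command names are illustrative; the order follows steps S1071 to S1074.
EXPECTED_ORDER = ["data_import", "graph_compute", "data_export", "stop"]

ACTIONS = {
    "data_import": "create original instance and load table data file",
    "graph_compute": "run algorithm operators and produce structured data",
    "data_export": "export structured data to the big data cluster",
    "stop": "stop the graph and delete its instance",
}


class GraphInstanceLifecycle:
    """Tracks which command the graph computing server may execute next."""

    def __init__(self):
        self.step = 0   # index into EXPECTED_ORDER
        self.log = []   # actions carried out so far

    def handle(self, command):
        if self.step >= len(EXPECTED_ORDER):
            raise RuntimeError("instance already stopped")
        expected = EXPECTED_ORDER[self.step]
        if command != expected:
            raise ValueError(f"expected {expected!r}, got {command!r}")
        action = ACTIONS[command]
        self.log.append(action)
        self.step += 1
        return action


lifecycle = GraphInstanceLifecycle()
for cmd in EXPECTED_ORDER:
    lifecycle.handle(cmd)
```

Rejecting out-of-order commands is a design choice of this sketch, not something the application states; it simply makes the sequential dependency between the steps explicit.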

In another embodiment of the present application, as shown in FIG. 4, the visual graph computing system further includes a verification server 6, wherein the verification server 6 is used to verify whether the user name and password of the user are correct, and to query the password corresponding to a user name according to the user name. Based on this visualized graph computing system, the visualized graph computing method further includes:

Step S100: Generate first verification information according to the user name and password input by the user, where the first verification information represents a request to verify whether the user name and password are correct;

When second verification information is received, a first signature is generated, where the second verification information indicates that the user name and password are correct. The first signature is used to indicate that the user name and password of the user are correct, and the user successfully logs in to the graph computing interface of the client 4, that is, step S101 is performed.

In step S1071, after the graph computing server 2 creates the original instance of the graph to be processed, the graph computing server 2 obtains the user name of the current user and generates third verification information according to the user name. The third verification information represents a request to obtain the password of the user name; it is sent to the verification server 6, requesting the verification server 6 to return the password of the user name. When the graph computing server 2 receives the password of the user name sent by the verification server 6, the user name and password are input into the graph computing server 2 for verification, and when the verification is passed, the configuration file of the original instance can be modified. In the visualized graph computing method provided by the embodiments of the present application, since the user name and password are written into the configuration file of the instance in the graph computing server 2, when the user starts the graph computing server 2 and accesses the graph computing instance that the user created, the user name and password are consistent with those of the core server 1 of the visual graph computing platform, which realizes unified user authentication across the whole set of application systems.

Optionally, the verification server 6 is an LDAP server.
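The two verification rounds described above (login at step S100, then re-verification before the instance configuration file is modified in step S1071) can be sketched with an in-memory stand-in for the verification server 6. This is not an LDAP client; the class, function, and configuration key names are hypothetical, and the sample account is invented for illustration.

```python
import hmac


class VerificationServer:
    """In-memory stand-in for the verification server 6 (not a real LDAP client)."""

    def __init__(self, accounts):
        self._accounts = dict(accounts)

    def verify(self, username, password):
        """Answer 'are this user name and password correct?'."""
        stored = self._accounts.get(username)
        if stored is None:
            return False
        # Constant-time comparison of the two byte strings.
        return hmac.compare_digest(stored.encode(), password.encode())

    def lookup_password(self, username):
        """Answer the 'password for this user name' request."""
        return self._accounts[username]


def login(server, username, password):
    """Step S100: the login is accepted only if the verification server confirms it."""
    return server.verify(username, password)


def configure_instance(server, username, config):
    """Step S1071: fetch the password, verify again, then update the instance config."""
    password = server.lookup_password(username)
    if not server.verify(username, password):
        raise PermissionError("verification failed")
    updated = dict(config)  # leave the original configuration untouched
    updated.update({"auth_user": username, "auth_password": password})
    return updated


server = VerificationServer({"alice": "s3cret"})
logged_in = login(server, "alice", "s3cret")
instance_config = configure_instance(server, "alice", {"port": 7687})
```

Because both the login and the instance configuration draw on the same account store, the credentials used for the instance match those used at login, which is the unified authentication effect described above.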

As another aspect of the present application, an embodiment of the present application provides a visualized graph computing system, including:

The big data cluster 3 is used to store the original table data of the graph, generate at least one first command to execute the first task node according to the at least one first task node, calculate the original table data to generate a table data file, and convert the structured data of the to-be-processed graph into the table data of the to-be-processed graph;

The graph computing server 2 is configured to store the original structured data of the graph, generate a graph instance to be processed according to the at least one first command and the at least one second task node, and calculate the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed.

A visualized graph computing system provided by an embodiment of the present application includes a visualized graph computing platform, a big data cluster 3, and a graph computing server 2, wherein the big data cluster 3 stores the original table data of the graph, and the graph computing server 2 stores the original structured data of the graph. On the visual graph computing platform, the user selects the operators of the graph and the connection relationship between the operators, and configures the parameters of the operators. The big data cluster 3 can generate the table data file of the graph according to the operators and parameters of the graph. The graph computing server 2 builds a graph instance according to the operators of the graph, the connection relationship between the operators, and the configuration parameters of the operators, and then performs graph computation on the table data file from the big data cluster 3 to generate the structured data of the graph. The big data cluster 3 then converts the structured data of the graph into the table data of the graph. The user can only obtain the structured data of the graph on the graph computing server 2 and the table data of the graph in the big data cluster 3. In the whole process, there is no need for the user to download the table data of the graph and then import it into the graph computing server 2. Throughout the graph computing process, data of every type is transmitted only between the back-end graph computing server 2 and the big data cluster 3, and users cannot see any data, which improves data security. In addition, when performing graph computing, users need to focus only on the logic of the graph, and the workflow of the graph is formed operator by operator, which greatly reduces the cost of learning the specialized knowledge of graph computing and improves the efficiency of graph computing.

In an embodiment of the present application, as shown in FIG. 5, the graph computing server 2 includes a Neo4j server. Neo4j is a high-performance NoSQL graph database. It is an embedded, disk-based Java persistence engine with fully transactional features, but it stores structured data on a network (called a graph from a mathematical point of view) rather than in tables. Neo4j can also be seen as a high-performance graph engine with all the features of a full-fledged database. This application uses Neo4j as the graph database for the following reasons: for different users or different business scenarios, it is only necessary to create a Neo4j instance and assign a different port to realize graph data isolation; the Neo4j server performs a series of graph operations through Cypher statements, whose syntax is flexible and whose parameters are easy to configure, so that a variety of operators can be implemented; and the nodes and edges in the graph can be viewed through the Neo4j Browser, which is convenient for analysis.
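Splicing operator parameters into a Cypher statement, as mentioned above, can be sketched as follows. The operator ("filter edges by amount"), the `splice_statement` function, and the label, relationship type, and property names are hypothetical. The sketch assumes that labels and relationship types are spliced into the statement text after validation, while values are passed as Cypher parameters (`$min_amount`).

```python
import re

# Only plain identifiers may be spliced into the statement text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def splice_statement(label, rel_type, min_amount):
    """Return (statement, parameters) for a hypothetical 'filter edges' operator."""
    for name in (label, rel_type):
        if not _IDENT.match(name):
            raise ValueError(f"invalid identifier: {name!r}")
    statement = (
        f"MATCH (a:{label})-[r:{rel_type}]->(b:{label}) "
        "WHERE r.amount >= $min_amount "
        "RETURN a.id AS src, b.id AS dst, r.amount AS amount"
    )
    return statement, {"min_amount": min_amount}


statement, params = splice_statement("Account", "TRANSFER", 10)
```

Validating the spliced identifiers and keeping user-supplied values in the parameter map avoids building statements from unchecked input, which matters when the statement text is assembled from values typed into a client interface.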

In an embodiment of the present application, as shown in FIG. 5, the big data cluster 3 adopts CDH as the big data framework. CDH is Cloudera's 100% open source platform distribution, including Apache Hadoop, built for enterprise needs. By integrating Hadoop with more than a dozen other key open source projects, Cloudera has created a functionally advanced system that helps users execute end-to-end big data workflows. This application uses CDH as the big data framework because CDH is based on the stable version of Apache Hadoop and includes the latest bug fixes, is easy to install and upgrade, and supports a rich set of components.

Optionally, the CDH of the big data cluster 3 in this embodiment of the present application includes the following components: HDFS, Hive, Oozie, Spark, and the like.

Among them, HDFS is a distributed file storage system that can store a large number of large files. The main difference between it and other distributed file systems is that it is a highly fault-tolerant system suitable for deployment on cheap machines. HDFS can provide high-throughput data access and is very suitable for applications on large-scale data sets. That is, HDFS can receive the structured data sent by the graph computing server 2 to the big data cluster 3 and convert the structured data into table data.

Hive is a Hadoop-based data warehouse tool for data extraction, transformation, and loading. It is a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop. The Hive data warehouse tool can map structured data files into database tables and provide SQL query functions, converting SQL statements into MapReduce tasks for execution. That is, Hive can store the table data converted by HDFS: after HDFS converts the structured data into table data, the table data of the graph to be processed can be stored in Hive.
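The conversion of structured data into table data can be illustrated with a small sketch that flattens nodes and edges into fixed-column rows. The shape of the `structured` dictionary and all field names are hypothetical; the application does not define the export format.

```python
# Hypothetical structured data as the graph computing server might export it.
structured = {
    "nodes": [
        {"id": "a1", "score": 0.5},
        {"id": "a2", "score": 0.3},
    ],
    "edges": [
        {"start": "a1", "end": "a2", "amount": 10},
    ],
}


def to_table(records, columns):
    """Flatten dict records into rows with a fixed column order.

    A field missing from a record becomes None, so every row has the same width.
    """
    return [tuple(rec.get(col) for col in columns) for rec in records]


node_columns = ["id", "score"]
edge_columns = ["start", "end", "amount"]
node_rows = to_table(structured["nodes"], node_columns)
edge_rows = to_table(structured["edges"], edge_columns)
```

Rows of this kind, one table for points and one for edges, are the sort of data that could then be loaded into a warehouse table for SQL queries.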

Oozie is an open source framework based on a workflow engine, contributed to Apache by Cloudera, for running a set of jobs or processes in a specific order within a workflow. In the cluster, it is responsible for scheduling tasks according to the order of the business logic. That is, Oozie can convert the task nodes sent by the core server 1 into commands and send the commands to the graph computing server 2, that is, execute steps S1061-S1065. The specific execution steps are as described above and will not be repeated here.
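The conversion of task nodes into ordered commands can be sketched in a few lines. This is not Oozie's API or workflow format; the mapping table, the node kind strings, and the `commands_from_nodes` function are hypothetical and only illustrate that first task nodes become commands in a fixed dispatch order while the graph creation node (a second task node) does not.

```python
# Dispatch order follows the description: import, compute, export, stop.
COMMAND_FOR_NODE = {
    "data_import": "data import command",
    "graph_compute": "graph calculation command",
    "data_export": "data export command",
    "stop": "stop command",
}
ORDER = list(COMMAND_FOR_NODE)  # dicts preserve insertion order


def commands_from_nodes(node_kinds):
    """Map first task nodes to commands in dispatch order; skip other node kinds."""
    present = [kind for kind in node_kinds if kind in COMMAND_FOR_NODE]
    return [COMMAND_FOR_NODE[kind] for kind in sorted(present, key=ORDER.index)]


commands = commands_from_nodes(
    ["graph_create", "stop", "data_import", "data_export", "graph_compute"]
)
```

Sorting by a fixed order means the commands come out the same regardless of the order in which the user dragged the operators onto the canvas.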

Spark is a fast and general computing engine designed for large-scale data processing. Spark can perform graph computations based on commands.

Exemplary electronic device

As a third aspect of the present application, an embodiment of the present application further provides an electronic device, including one or more processors and a memory.

The processor may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.

The memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory, or the like. The non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor may execute the program instructions to implement the visual graph computing method of the various embodiments of the present application described above and/or other desired functionality. Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.

Exemplary computer program product and computer readable storage medium

In addition to the methods and apparatuses described above, embodiments of the present application may also be computer program products comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps of the visual graph computing method according to the various embodiments of the present application described in the "exemplary method" section above in this specification.

The program code of the computer program product for performing the operations of the embodiments of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.

In addition, embodiments of the present application may also be computer-readable storage media having computer program instructions stored thereon that, when executed by a processor, cause the processor to perform the steps of the visual graph computing method according to the various embodiments of the present application described in the "exemplary method" section above in this specification.

The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example but not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

The basic principles of the present application have been described above in conjunction with specific embodiments. However, it should be pointed out that the advantages, benefits, effects, etc. mentioned in the present application are only examples rather than limitations, and they are not to be considered as required for each embodiment of this application. In addition, the specific details disclosed above are only for the purpose of example and ease of understanding, rather than limitation, and they do not limit the application to being implemented using those specific details.

The block diagrams of devices, apparatuses, equipment, and systems referred to in this application are merely illustrative examples and are not intended to require or imply that the connections, arrangements, or configurations must be made in the manner shown in the block diagrams. As those skilled in the art will appreciate, these devices, apparatuses, equipment, and systems may be connected, arranged, and configured in any manner.

It should also be pointed out that in the apparatus, equipment and method of the present application, each component or each step can be decomposed and/or recombined. These disaggregations and/or recombinations should be considered as equivalents of the present application.

The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use this application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Therefore, this application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The above descriptions are only preferred embodiments of the present application and are not intended to limit the present application. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the present application shall be included within the scope of protection of the present application.

Claims

A visualized graph computing method, the method being applied to a visualized graph computing system, the visualized graph computing system including a visualized graph computing platform, a big data cluster and a graph computing server, wherein the big data cluster stores the original table data of the graph, and the graph computing server stores the original structured data of the graph; the visualized graph computing method includes:

Obtain the original table data of the graph to be processed according to the first input of the user;

According to the second input of the user, a task workflow file of the graph to be processed is generated, where the task workflow file includes the original table data and a plurality of task nodes, wherein the plurality of task nodes include at least one first task node and at least one second task node;

Generate at least one first command to execute the first task node according to the at least one first task node, and calculate the original table data to generate a table data file;

Generate a graph instance to be processed according to the at least one first command and the at least one second task node, and calculate the table data file according to the graph instance to be processed to generate structured data of the to-be-processed graph; and

Convert the structured data of the graph to be processed into table data of the graph to be processed.
The visualized graph computing method according to claim 1, wherein after the acquisition of the original table data of the graph to be processed, and before generating the task workflow file of the graph to be processed according to the second input of the user, the visualized graph computing method further includes:

Preprocessing the original table data to obtain the preprocessing table data of the graph to be processed;

Generate at least one first command to execute the first task node according to the at least one first task node, calculate the original table data, and generate a table data file, including:

Generate at least one first command to execute the first task node according to the at least one first task node, calculate the preprocessing table data, and generate a table data file.
The visualized graph computing method according to claim 2,

According to the second input of the user, the task workflow file of the graph to be processed is generated, including:

Generate a data import node according to the data import operator input by the user;

According to the to-be-processed graph creation operator input by the user, a to-be-processed graph creation node is generated;

generating a to-be-processed graph computing node according to a plurality of algorithm operators of the to-be-processed graph input by the user;

Generate a data export node according to the data export operator input by the user;

generating a stop node according to the stop-to-be-processed-graph operator input by the user;

generating a relationship between the multiple algorithm operators according to the connection relationship between the multiple algorithm operators input by the user;

According to the preset configuration mode input by the user, parameter configuration is performed on the plurality of algorithm operators, and parameters of each of the algorithm operators are generated; and

generating the task workflow file of the graph to be processed according to the submission instruction input by the user;

Wherein, the at least one first task node includes: the data import node, the to-be-processed graph computing node, the data export node, and the stop node;

The at least one second task node includes: the to-be-processed graph creation node.
The visual graph computing method according to claim 3, wherein generating at least one first command to execute the first task node according to the at least one first task node, calculating the preprocessing table data, and generating a table data file includes:

generating a data import command, a graph calculation command, a data export command, and a stop command according to the data import node, the graph computing node to be processed, the data export node, and the stop node;

The preprocessing table data is calculated to generate a table data file.
The visualized graph calculation method according to claim 4, wherein the preprocessing table data is calculated to generate a table data file, comprising:

According to the data import command, calculate the point data in the preprocessing table data corresponding to the multiple algorithm operators in the to-be-processed graph, and calculate the edge data corresponding to the connection relationship between the multiple algorithm operators in the to-be-processed graph;

Generate the header files of the point data and the edge data according to the parameters of the algorithm operator;

Wherein, the table data file includes: a plurality of point data, a plurality of edge data and a header file.
The visualized graph computing method according to claim 5,

Generate a graph instance to be processed according to the at least one first command and the at least one second task node, and calculate the table data file according to the graph instance to be processed to generate structured data of the to-be-processed graph, including:

Create an original instance of the to-be-processed graph according to the to-be-processed graph creation node and the data import command;

Modify the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed;

According to the graph calculation command, and based on the instance of the graph to be processed, perform calculation on the table data file, the multiple algorithm operators of the to-be-processed graph, the parameters of each of the algorithm operators, and the connection relationship of the multiple algorithm operators, and generate the structured data of the to-be-processed graph;

According to the data export command, the structured data of the graph to be processed is exported.
The visual graph computing method according to claim 6, wherein, before exporting the structured data of the graph to be processed according to the data export command, generating a graph instance to be processed according to the at least one first command and the at least one second task node, and calculating the table data file according to the graph instance to be processed to generate structured data of the graph to be processed, further comprises:

Determine whether the parameters of the algorithm operator are consistent with the configuration parameters in the instance;

When the parameters of the algorithm operator are inconsistent with the configuration parameters in the instance, the parameters of the algorithm operator are modified.
The visualized graph computing method according to claim 6,

Before obtaining the original table data of the graph to be processed, the visualized graph calculation method further includes:

generating first verification information according to the username and password input by the user, where the first verification information represents a request to verify whether the username and password are correct;

When second verification information is received, a first signature is generated, the first signature being used to indicate to the user that the username and password are correct, and the original table data of the graph to be processed is obtained according to the user's input, wherein the second verification information indicates that the user name and the password are correct;

and wherein modifying the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed includes:

obtaining the username of the user, and generating third verification information according to the username, where the third verification information represents a request to obtain the password of the username;

generating fourth verification information according to the username input by the user and the password of the username, where the fourth verification information represents a request to verify whether the username and the password are correct; and

when fifth verification information is received, modifying the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed, wherein the fifth verification information indicates that the username and the password are correct.
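For illustration only, and not as part of the claim, the two verification exchanges above could be sketched in Python as follows. The names `VerificationServer`, `login`, and `modify_instance_config` are hypothetical, as is the in-memory credential store. Passwords are kept in plain text purely for brevity, and the configuration is assumed to be a flat dictionary.

```python
import hmac
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class VerificationServer:
    """Hypothetical stand-in for the verification server (in-memory store)."""
    credentials: Dict[str, str] = field(default_factory=dict)

    def verify(self, username: str, password: str) -> bool:
        # Handles the first/fourth verification information (a verify request);
        # a True result plays the role of the second/fifth information.
        stored = self.credentials.get(username)
        if stored is None:
            return False
        return hmac.compare_digest(stored.encode(), password.encode())

    def lookup_password(self, username: str) -> Optional[str]:
        # Handles the third verification information (a password query).
        return self.credentials.get(username)


def login(server: VerificationServer, username: str, password: str) -> bool:
    """Verify the credentials before the original table data is obtained."""
    return server.verify(username, password)


def modify_instance_config(
    server: VerificationServer,
    username: str,
    original_config: Dict[str, Any],
    changes: Dict[str, Any],
) -> Dict[str, Any]:
    """Re-verify the user, then derive the new instance configuration."""
    password = server.lookup_password(username)        # third information
    if password is None or not server.verify(username, password):  # fourth
        raise PermissionError(f"verification failed for user {username!r}")
    new_config = dict(original_config)                  # fifth: confirmed
    new_config.update(changes)
    return new_config
```

In this sketch the configuration is modified only after the second round of verification succeeds, mirroring the order of the steps in the claim; generation of the first signature is omitted.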
A visual graph computing system, including:

A visual graph computing platform, configured to obtain the original table data of the graph to be processed according to the first input of the user, and generate the task workflow file of the graph to be processed according to the second input of the user, and the task workflow file includes the original table data and multiple task nodes, wherein the multiple task nodes include at least one first task node and at least one second task node;

a big data cluster, configured to store the original table data of the graph, generate, according to the at least one first task node, at least one first command for executing the first task node, and calculate the original table data to generate a table data file;

a graph computing server, configured to store the original structured data of the graph, generate a graph instance to be processed according to the at least one first command and the at least one second task node, and calculate the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed;

Wherein, the big data cluster is further used to convert the structured data of the graph to be processed into table data of the graph to be processed.
The visualized graph computing system according to claim 9, further comprising:

a verification server, configured to verify, according to the username and password of the user, whether the username and the password are correct, and to query, according to the username, the password corresponding to the username.
A computer-readable storage medium, comprising:

a storage medium, the storage medium storing a computer program,

wherein the computer program is used to execute the visualized graph computing method according to any one of the above claims 1-8.
An electronic device comprising:

a processor; and

a memory for storing instructions executable by the processor;

wherein the processor is configured to execute the visualized graph computing method according to any one of the preceding claims 1-8.