WO2022057279A1 - Visual graph calculation method and system, and storage medium and electronic device - Google Patents

Visual graph calculation method and system, and storage medium and electronic device

Info

Publication number
WO2022057279A1
Authority
WO
WIPO (PCT)
Prior art keywords
graph
processed
data
user
table data
Prior art date
Application number
PCT/CN2021/092928
Other languages
French (fr)
Chinese (zh)
Inventor
李欣刚
陈泽瀛
舒艳华
蔡朝辉
叶国林
Original Assignee
银联商务股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 银联商务股份有限公司
Publication of WO2022057279A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/116 Details of conversion of file system types or formats
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/313 User authentication using a call-back technique via a telephone network
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present application relates to the field of computer technology, and in particular to a visual graph computing method and system, a storage medium, and an electronic device.
  • A graph is composed of nodes and edges, and data with a graph structure is graph data.
  • Graph computing is a processing technology for graph data. Graph databases and graph computing frameworks, whether distributed or single-node solutions, are built on physical machines and meet user needs through services deployed on those machines, with users sharing the same service.
  • In the related art, the required graph data often needs to be exported from the database and then manually input into the graph computing server for calculation.
  • For data with sensitive fields, exporting the data and manually entering it into the graph computing server increases the probability of data leakage or loss; that is, there is a problem of data insecurity.
  • The embodiments of the present application provide a visual graph computing method and system, a storage medium, and an electronic device, which solve the technical problem of data insecurity in the graph computing process in the prior art.
  • The embodiments of the present application provide a visual graph computing method. The method is applied to a visual graph computing system, and the visual graph computing system includes a visual graph computing platform, a big data cluster, and a graph computing server, wherein the big data cluster stores the original table data of the graph, and the graph computing server stores the original structured data of the graph;
  • the visualized graph computing method includes:
  • a task workflow file of the graph to be processed is generated, where the task workflow file includes the original table data and a plurality of task nodes, wherein the plurality of task nodes include at least one first task node and at least one second task node;
  • the visual graph computing method further includes:
  • the task workflow file of the graph to be processed is generated, including:
  • a to-be-processed graph creation node is generated;
  • according to the connection relationship between the multiple algorithm operators input by the user, the relationship between the multiple algorithm operators is generated;
  • parameter configuration is performed on the plurality of algorithm operators, and parameters of each of the algorithm operators are generated;
  • the at least one first task node includes: the data import node, the to-be-processed graph computing node, the data export node, and the stop node;
  • the at least one second task node includes: the to-be-processed graph creation node.
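The task workflow file described above can be pictured as a small data structure. The following is a minimal sketch, not the applicant's file format: the class names `NodeKind`, `TaskNode`, and `TaskWorkflow` and the sample rows are hypothetical, and only the split into first and second task nodes is taken from the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    DATA_IMPORT = "data_import"
    GRAPH_CREATE = "graph_create"
    GRAPH_COMPUTE = "graph_compute"
    DATA_EXPORT = "data_export"
    STOP = "stop"


# Per the text, only the graph creation node is a "second" task node;
# import, compute, export, and stop are "first" task nodes.
SECOND_TASK_KINDS = frozenset({NodeKind.GRAPH_CREATE})


@dataclass
class TaskNode:
    kind: NodeKind
    params: dict = field(default_factory=dict)


@dataclass
class TaskWorkflow:
    table_data: list  # original (or preprocessed) table rows; format is assumed
    nodes: list       # TaskNode objects in execution order

    def first_task_nodes(self):
        return [n for n in self.nodes if n.kind not in SECOND_TASK_KINDS]

    def second_task_nodes(self):
        return [n for n in self.nodes if n.kind in SECOND_TASK_KINDS]


# Hypothetical sample: two edge rows and one node of each kind.
workflow = TaskWorkflow(
    table_data=[("a", "b"), ("b", "c")],
    nodes=[TaskNode(kind) for kind in NodeKind],
)
```

Keeping the first/second distinction as a derived property, rather than storing two lists, is a design choice of this sketch so that one ordered node list carries the execution sequence.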
  • generating at least one first command to execute the first task node according to the at least one first task node, calculating the preprocessing table data, and generating a table data file includes:
  • the preprocessing table data is calculated to generate a table data file.
  • the preprocessing table data is calculated to generate a table data file, including:
  • the table data file includes: a plurality of point data, a plurality of edge data and a header file.
  • generating a graph instance to be processed according to the at least one first command and the at least one second task node, and calculating the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed, includes:
  • the structured data of the graph to be processed is exported.
  • before obtaining the original table data of the graph to be processed, the visualized graph calculation method further includes:
  • first verification information represents a request to verify whether the username and password are correct
  • a first signature is generated, and the first signature is used to prompt the user that the username and password are correct, and according to the user's input, the original table data of the graph to be processed is obtained, wherein the second verification information indicates that the user name and the password are correct;
  • modifying the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed includes:
  • the third verification information represents a request to acquire the password of the username
  • the configuration file of the original instance of the graph to be processed is modified to generate an instance of the graph to be processed, wherein the fifth verification information indicates that the user name and the password are correct.
  • an embodiment of the present application provides a visualized graph computing system, including:
  • a visual graph computing platform configured to obtain the original table data of the graph to be processed according to the first input of the user, and generate the task workflow file of the graph to be processed according to the second input of the user, and the task workflow file includes the original table data and multiple task nodes, wherein the multiple task nodes include at least one first task node and at least one second task node;
  • a big data cluster configured to store the original table data of the graph, generate at least one first command to execute the first task node according to the at least one first task node, and calculate the original table data to generate a table data file;
  • a graph computing server configured to store the original structured data of the graph, generate a graph instance to be processed according to the at least one first command and the at least one second task node, and calculate the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed.
  • the big data cluster is further used to convert the structured data of the graph to be processed into table data of the graph to be processed.
  • the visualized graph computing system further includes:
  • a verification server configured to verify whether the user name and the password are correct according to the user name and password of the user, and to query the password corresponding to the user name according to the user name.
  • an embodiment of the present application provides a computer-readable storage medium, including:
  • the storage medium stores a computer program
  • the computer program is used to execute the above-mentioned visualized graph computing method.
  • an embodiment of the present application provides an electronic device, the electronic device comprising:
  • a memory for storing the processor-executable instructions
  • the processor is configured to execute the above-mentioned visualized graph computing method.
  • A visualized graph computing method provided by an embodiment of the present application is applied to a visualized graph computing system, where the visualized graph computing system includes a visualized graph computing platform, a big data cluster, and a graph computing server, wherein the big data cluster stores the original table data of the graph and the graph computing server stores the original structured data of the graph. The user selects the operators of the graph, the connection relationship between the operators, and the configuration parameters of the operators on the visual graph computing platform.
  • The big data cluster can generate the table data file of the graph according to the operators and parameters of the graph, while the graph computing server builds the graph according to the operators of the graph, the connection relationship between the operators, and the configuration parameters of the operators.
  • The graph computing server performs graph computation according to the table data files in the big data cluster to generate the structural data of the graph, and then the big data cluster converts the structural data of the graph into the table data of the graph.
  • The user only needs to obtain the structure of the graph and the structural data of the graph from the graph computing server, and the table data of the graph from the big data cluster. During the whole process, there is no need for the user to download the table data of the graph and then import it into the graph computing server.
  • FIG. 1 is a schematic structural diagram of a visualized graph computing system provided by an embodiment of the present application
  • FIG. 2 shows a schematic flowchart of a visual graph computing method provided by an embodiment of the present application
  • FIG. 3 shows a schematic flowchart of a visual graph computing method provided by another embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a visualized graph computing system provided by another embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a visualized graph computing system provided by another embodiment of the present application.
  • FIG. 1 shows a computing system for a visualized graph provided by an embodiment of the present application, including:
  • the visual graph computing platform includes a client 4 and a core server 1.
  • The client 4 can display a graph computing operation interface on which users perform operations, such as logging in to the system web page and, according to their needs, dragging and dropping various operators for the graph to be calculated.
  • the core server 1 generates a workflow of graph computing in response to the user's operation on the system web page on the client 4;
  • Big data cluster 3 stores the original table data of multiple graphs. Users can query the table data of multiple graphs, the table data of the points of a graph, and the table data of the edges of a graph through big data cluster 3;
  • Graph computing server 2 stores the original structured data of multiple graphs, that is, instances of multiple graphs, and users can view the structure of a graph through graph computing server 2. Data is imported and exported between graph computing server 2 and big data cluster 3.
  • When the table data in big data cluster 3 is imported into graph computing server 2, users can view the structure of the graphs that can be formed from the table data;
  • when the structural data of the graph is exported to big data cluster 3, the user can query the table data of the graph through big data cluster 3.
  • the specific visual graph calculation method includes the following steps:
  • Step S101 The client 4 displays the graph computing system web page; the user logs in to the system web page and inputs a first request, for example a request to view the original table data of each element of the graph to be processed, such as the table data of the points of the graph to be processed and the table data of the edges of the graph to be processed;
  • Step S102 The core server 1 obtains the first request of the user and sends the first request of "viewing the original table data of the graph to be processed" to the big data cluster 3;
  • Step S103 the big data cluster 3 queries the original table data of each element of the to-be-processed graph according to the first request, and sends the raw table data of the to-be-processed graph to the core server 1;
  • Step S104 According to the to-be-processed graph, the user inputs a preset item on the web page, for example, submits a request or operation related to the calculation of the to-be-processed graph, such as dragging and dropping various operators;
  • Step S105 In response to the user's input, the core server 1 generates a task workflow file of the graph to be processed and transmits the task workflow file to the big data cluster 3, wherein the task workflow file includes the original table data of the graph to be processed, multiple task nodes, and the execution sequence between the multiple task nodes;
  • Step S106 The big data cluster 3 generates, according to the received task workflow file and the at least one first task node, at least one first command to execute the first task node, and transmits the at least one first command to the graph computing server 2; the big data cluster 3 also calculates the original table data of the graph to be processed according to the plurality of first task nodes, generates the table data file of the graph to be processed, and sends the table data file of the graph to be processed to the graph computing server 2, wherein the original table data file of the graph to be processed includes: the original table data of each element of the graph to be processed and the header file of the original table data of each element;
  • Step S107 The graph computing server 2 receives the at least one first command and the original table data file of the graph to be processed, generates an instance of the graph to be processed according to the second task node and the at least one first command, calculates the original table data according to the instance of the graph to be processed, generates the structured data of the graph to be processed, and transmits the structured data of the graph to be processed to the big data cluster 3;
  • After the graph computing server 2 calculates the graph to be processed and generates the structured data, the structured data can be sent to the core server 1, and the user can view the structure of the graph to be processed on the client 4.
  • Step S108 When the big data cluster 3 receives the structured data of the graph to be processed, it converts the structured data of the graph to be processed into table data of the graph to be processed. At this time, the big data cluster 3 can send the table data of the graph to be processed to the core server 1, and the user can view the table data of the to-be-processed graph on the client 4.
  • Because the big data cluster 3 has a storage function, the table data obtained after the calculation of the to-be-processed graph is stored in the big data cluster 3. This provides table data directly for later calculations on the same graph and improves graph calculation efficiency; when the user needs the table data of the to-be-processed graph in other application scenarios, it can be viewed or exported directly from the big data cluster 3.
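The division of labour in steps S101 to S108 can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not the patented system: the classes `BigDataCluster` and `GraphComputeServer`, the two-column edge table, and the use of a degree count as the "graph calculation" are all hypothetical stand-ins chosen to keep the example short.

```python
from collections import defaultdict


class BigDataCluster:
    """Stands in for big data cluster 3: it holds table data only."""

    def __init__(self, edge_rows):
        self.tables = {"original": list(edge_rows)}

    def table_data_file(self):
        # Header plus rows, handed to the graph server without the user
        # downloading or re-uploading anything.
        return {"header": ["src", "dst"], "edges": self.tables["original"]}

    def store_result(self, structured):
        # Convert structured (adjacency) data back into flat table rows.
        rows = sorted((vertex, len(nbrs)) for vertex, nbrs in structured.items())
        self.tables["result"] = rows
        return rows


class GraphComputeServer:
    """Stands in for graph computing server 2: it holds structured graph data."""

    def compute(self, data_file):
        adjacency = defaultdict(set)
        for src, dst in data_file["edges"]:
            adjacency[src].add(dst)
            adjacency[dst].add(src)
        return dict(adjacency)


cluster = BigDataCluster([("a", "b"), ("b", "c")])
server = GraphComputeServer()
structured = server.compute(cluster.table_data_file())  # roughly S106-S107
result_rows = cluster.store_result(structured)          # roughly S108
```

In this sketch the table rows pass only between the two back-end objects, which mirrors the text's point that the user never handles an exported copy of the data.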
  • A visualized graph computing method provided by an embodiment of the present application is applied to a visualized graph computing system, where the visualized graph computing system includes a visualized graph computing platform, a big data cluster 3, and a graph computing server 2, wherein the big data cluster 3 stores the original table data of the graph and the graph computing server 2 stores the original structured data of the graph. The user selects the operators of the graph, the connection relationship between the operators, and the configuration parameters of the operators on the visual graph computing platform.
  • The big data cluster 3 can generate the table data files of the graph according to the operators and parameters of the graph, while the graph computing server 2 builds the graph according to the operators of the graph, the connection relationship between the operators, and the configuration parameters of the operators.
  • The graph computing server 2 performs graph calculation according to the table data files in the big data cluster 3 to generate the structural data of the graph, and then the big data cluster 3 converts the structural data of the graph into the table data of the graph.
  • The user only needs to obtain the graph structure and graph structure data from the graph computing server 2, and the graph table data from the big data cluster 3. During the whole process, the user does not need to download the graph table data and import it into the graph computing server 2.
  • In step S103, after the big data cluster 3 queries the raw table data of each element of the to-be-processed graph according to the first request, these raw table data are not necessarily all of the to-be-processed data
  • the visualized graph calculation method further includes:
  • Step S1031 the core server 1 preprocesses the original table data, and obtains the preprocessed table data of the graph to be processed;
  • Step S106 then becomes: the big data cluster 3 generates at least one first command according to the received task workflow file and transmits the at least one first command to the graph computing server 2; the big data cluster 3 also calculates the preprocessing table data of the graph to be processed according to the at least one first task node, generates the table data file of the graph to be processed, and sends the table data file of the graph to be processed to the graph computing server 2, wherein the original table data file of the graph to be processed includes: the original table data of each element of the graph to be processed and the header file of the original table data of each element.
  • Because the original table data of the graph to be processed is preprocessed, the amount of data calculated and transmitted by the graph computing server 2 and the big data cluster 3 is reduced, and the data used throughout the calculation process are all original table data related to the graph to be processed, which improves computational efficiency.
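The text does not specify what the preprocessing in step S1031 does beyond keeping data related to the graph to be processed. As a minimal sketch, assuming that "related" means both endpoints of an edge row belong to the graph and that duplicate rows can be dropped, the function `preprocess` below (a hypothetical name) filters the raw rows:

```python
def preprocess(raw_edge_rows, graph_vertex_ids):
    """Keep each (src, dst) row once, and only if both endpoints are in the graph.

    Both filtering rules are assumptions of this sketch, not taken from the source.
    """
    seen = set()
    kept = []
    for src, dst in raw_edge_rows:
        if src in graph_vertex_ids and dst in graph_vertex_ids and (src, dst) not in seen:
            seen.add((src, dst))
            kept.append((src, dst))
    return kept


raw_rows = [("a", "b"), ("a", "b"), ("b", "x"), ("b", "c")]
preprocessed_rows = preprocess(raw_rows, {"a", "b", "c"})
```

Here four raw rows shrink to two, which is the kind of reduction in calculated and transmitted data that the paragraph above refers to.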
  • In step S105, in response to the user's input, the core server 1 generates a task workflow file of the graph to be processed according to the user's input and transmits the task workflow file to the big data cluster 3, wherein the task workflow file includes the original table data of the graph to be processed, multiple task nodes, and the execution sequence between the multiple task nodes. This may specifically include the following steps:
  • Step S1051 According to the data import operator input by the user, the core server 1 generates a data import node; that is, the user operates on the graph computing interface of the client 4 and drags and drops a data import operator, and the core server 1 then generates a data import node in response to the user's input.
  • The data import operators include an edge file import operator and a point file import operator.
  • Step S1052 According to the to-be-processed graph creation operator input by the user, generate a to-be-processed graph creation node; that is, the user operates on the graph computing interface of the client 4 and drags and drops the graph creation operator, and the core server 1 then generates the graph creation node in response to the user's input;
  • Step S1053 According to the multiple algorithm operators of the to-be-processed graph input by the user, generate a to-be-processed graph computing node; that is, the user operates on the graph computing interface of the client 4 and drags and drops the multiple algorithm operators of the graph to be processed, and the core server 1 then generates a to-be-processed graph computing node in response to the user's input;
  • The multiple algorithm operators of the graph to be processed are mainly used for splicing the algorithm operator parameters input by the user at the client into a statement allowed by the graph computing server.
  • Step S1054 According to the data export operator input by the user, generate a data export node; that is, the user operates on the graph computing interface of the client 4 and drags and drops the graph data export operator, and the core server 1 then generates a data export node in response to the user's input;
  • Step S1055 According to the stop operator of the to-be-processed graph input by the user, generate a stop node; that is, the user operates on the graph computing interface of the client 4 and drags and drops the stop operator of the to-be-processed graph, and the core server 1 then generates a stop node in response to the user's input.
  • At this point, the user has dragged and dropped all the algorithm operators of the to-be-processed graph; that is, all the algorithm operators of the to-be-processed graph are ready;
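The splicing of operator parameters into a statement accepted by the graph computing server, mentioned for the algorithm operators above, might look like the following sketch. The target syntax `name(key=value, ...)`, the function `splice_statement`, and the `pagerank` operator with its parameters are all hypothetical; the source does not name the statement language.

```python
def splice_statement(operator_name, params):
    """Join an operator name and its parameters into one statement string.

    The output syntax is invented for illustration and is not the language
    of any particular graph engine.
    """
    # Sorting the keys keeps the generated statement stable between runs.
    args = ", ".join(f"{key}={value!r}" for key, value in sorted(params.items()))
    return f"{operator_name}({args})"


statement = splice_statement("pagerank", {"max_iter": 20, "damping": 0.85})
# statement == "pagerank(damping=0.85, max_iter=20)"
```

Using `repr` for the values quotes strings automatically, which is one simple way to keep user-entered text from being read as part of the statement.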
  • Step S1056 According to the connection relationship between the multiple algorithm operators input by the user, the relationship between the algorithm operators is generated; that is, the user operates on the graph calculation interface of the client 4 and connects the multiple algorithm operators prepared by dragging and dropping in the preceding steps. For example, the user can perform this operation so that there is a connection relationship between the first algorithm operator, the second algorithm operator, and the seventh algorithm operator; the core server 1 then responds to the user's input and generates a computing node of multiple operators, that is, the graph to be processed can be calculated through the connection relationship of multiple algorithm operators, so as to obtain corresponding calculation data.
  • Step S1057 According to the preset configuration mode input by the user, perform parameter configuration on the multiple algorithm operators, and generate parameters for each algorithm operator; that is, the user operates on the graph calculation interface of the client 4 and configures the parameters of the operators, and the core server 1 then generates the parameters of each algorithm operator in response to the user's operation.
  • Step S1058 According to the submission instruction input by the user, generate a task workflow file of the graph to be processed; that is, a task workflow file is generated from all operations performed in steps S1051 to S1057, wherein the task workflow file includes: the preprocessing table data and multiple task nodes, wherein the multiple task nodes include at least one first task node and at least one second task node; the at least one first task node includes: a data import node, a to-be-processed graph computing node, a data export node, and a stop node; the at least one second task node includes: the to-be-processed graph creation node.
  • At this point, the core server 1 has finished generating the task workflow file.
  • Before the core server 1 generates the task workflow file in step S1058, whenever the user operates on the graph computing interface of the client 4, the background of the core server 1 generates the corresponding node, that is, steps S1051 to S1057; when the user clicks to submit the task on the interface, as shown in FIG. 3, the task nodes generated in steps S1051 to S1057 are unified into the task workflow file.
  • The task workflow file includes the preprocessing table data obtained in step S1031 and the plurality of task nodes obtained in steps S1051-S1057.
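One way to turn the connection relationships drawn by the user into an execution sequence for the task nodes is a topological sort. The source does not say how the sequence is derived, so this is a swapped-in standard technique; the operator names in `connections` and the function `execution_order` are hypothetical. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Each pair is (upstream operator, downstream operator) as connected on the canvas.
connections = [
    ("import_points", "create_graph"),
    ("import_edges", "create_graph"),
    ("create_graph", "pagerank"),
    ("pagerank", "export"),
    ("export", "stop"),
]


def execution_order(connections):
    """Return the operators so that every upstream one precedes its downstream ones."""
    sorter = TopologicalSorter()
    for upstream, downstream in connections:
        # add(node, *predecessors): downstream depends on upstream.
        sorter.add(downstream, upstream)
    # static_order() raises graphlib.CycleError if the connections form a loop.
    return list(sorter.static_order())


order = execution_order(connections)
```

The relative order of the two import operators is left unspecified by the connections, so any order that places both before `create_graph` is valid.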
  • The task workflow file is transmitted to the big data cluster 3 and the graph computing server 2, and the big data cluster 3 needs to generate at least one first command according to the at least one first task node in the task workflow file, wherein the at least one first command includes: a data import command, a to-be-processed graph calculation command, a data export command, and a stop command.
  • The big data cluster 3 performs calculation according to the preprocessing table data in the task workflow file and generates the table data file of the graph to be processed; that is, the big data cluster 3 executes step S106 after receiving the task workflow file, and step S106 specifically includes the following steps:
  • Step S1061 According to the data import node, calculate, from the preprocessing table data, the point data corresponding to the multiple algorithm operators in the graph to be processed and the edge data corresponding to the connection relationship between the multiple algorithm operators in the graph to be processed;
  • Step S1062 Generate header files of the point data and the edge data according to the parameters of the algorithm operators, and send the table data file to the graph computing server 2;
  • Step S1063-Step S1066 In response to the first command, send the data import command, the graph calculation command, the data export command, and the stop command to the graph computing server 2 in sequence.
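Steps S1061 to S1066 can be sketched as building a table data file (point data, edge data, header) and then sending four commands in a fixed order. This is a minimal illustration: the dictionary layout, the command strings, and the assumption that the first two columns of each edge row are the endpoints are all hypothetical.

```python
def build_table_data_file(edge_rows, edge_columns):
    """Derive point data from edge rows and attach a header describing both tables.

    Assumes the first two columns of every edge row are the source and
    destination point identifiers.
    """
    point_ids = sorted({row[0] for row in edge_rows} | {row[1] for row in edge_rows})
    return {
        "header": {"points": ["id"], "edges": list(edge_columns)},
        "points": [(pid,) for pid in point_ids],
        "edges": list(edge_rows),
    }


# Order taken from the text: import, compute, export, stop.
COMMAND_ORDER = ["data_import", "graph_compute", "data_export", "stop"]


def dispatch(send):
    """Send each command in sequence through the supplied callable."""
    for command in COMMAND_ORDER:
        send(command)


data_file = build_table_data_file(
    [("a", "b", 1.0), ("b", "c", 2.0)], ("src", "dst", "weight")
)
sent = []
dispatch(sent.append)  # a list stands in for the channel to graph computing server 2
```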
  • step S107 executed by the graph computing server 2 specifically includes the following steps:
  • Step S1071 According to the to-be-processed graph creation command, create the original instance of the graph to be processed; that is, when the user drags the to-be-processed graph creation operator on the graph computing interface of the client 4, the graph computing server 2 converts the graph creation operator into the language allowed by graph computing and requests the big data cluster 3 to send the "data import command"; on receiving the "data import command" sent by the big data cluster 3, the graph computing server 2 creates an original instance of the graph to be processed and modifies the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed.
  • Step S1072 According to the to-be-processed graph calculation command, calculate the table data file, the multiple algorithm operators of the to-be-processed graph, the parameters of each algorithm operator, and the connection relationship of the multiple algorithm operators according to the instance of the to-be-processed graph, and generate the structured data of the graph to be processed; that is, when the user drags and drops multiple algorithm operators of the graph to be processed on the graph calculation interface of the client 4, the graph computing server 2 converts the multiple algorithm operators into the language allowed by graph computing and requests the big data cluster 3 to send the "graph calculation command"; on receiving the "graph calculation command" sent by the big data cluster 3, the graph computing server 2 performs the calculation according to the instance of the graph to be processed obtained in step S1071 and generates the structured data of the graph to be processed. The user can then query the structured data of the to-be-processed graph, that is, the structure of the to-be-processed graph, on the graph computing interface of the client 4.
  • Step S1073 According to the data export command and the parameters of the multiple algorithm operators, export the structured data to the big data cluster 3 for user query or other business use. That is, when the user drags the data export operator on the graph computing interface of the client 4, the graph computing server 2 converts the data export operator into the language allowed by graph computing and requests the big data cluster 3 to send the "data export command"; on receiving the "data export command" sent by the big data cluster 3, the graph computing server 2 exports the structured data to the big data cluster 3 according to the parameters of the algorithm operator. The big data cluster 3 then converts the structured data into table data, after which the user can view the table data of the graph to be processed on the graph calculation interface of the client 4 for user query or other business use.
  • step S1073 specifically includes: judging whether the parameters of the algorithm operator are consistent with the configuration parameters in the instance; when the parameters of the algorithm operator are inconsistent with the configuration parameters in the instance, modifying the parameters of the algorithm operator.
  • Step S1074 According to the stop command, stop dragging the algorithm operators used to calculate the graph to be processed, and stop the calculation of the graph to be processed. That is, after the user has successfully executed all the operators of the to-be-processed graph, that is, after obtaining the structure diagram and table data of the to-be-processed graph, and the current to-be-processed graph is no longer used, the to-be-processed graph is stopped and its instance is deleted, which frees up resources.
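The instance lifecycle in steps S1071 to S1074 (create on the data import command, compute, export, then stop and delete) can be sketched as a small command handler. The class `GraphServer`, the command strings, and the degree count used as the calculation are hypothetical stand-ins, not the behaviour of any real graph engine.

```python
class GraphServer:
    """Toy stand-in for graph computing server 2, keyed by a graph identifier."""

    def __init__(self):
        self.instances = {}

    def handle(self, command, graph_id, payload=None):
        if command == "data_import":
            # S1071: create the instance and record its configuration.
            self.instances[graph_id] = {"config": dict(payload or {}), "result": None}
        elif command == "graph_compute":
            # S1072: compute structured data (here, vertex degrees) from edge rows.
            degree = {}
            for src, dst in payload:
                degree[src] = degree.get(src, 0) + 1
                degree[dst] = degree.get(dst, 0) + 1
            self.instances[graph_id]["result"] = degree
        elif command == "data_export":
            # S1073: hand the structured data back to the caller.
            return dict(self.instances[graph_id]["result"])
        elif command == "stop":
            # S1074: deleting the instance releases its resources.
            del self.instances[graph_id]
        else:
            raise ValueError(f"unknown command: {command}")
        return None


graph_server = GraphServer()
graph_server.handle("data_import", "g1", {"owner": "alice"})
graph_server.handle("graph_compute", "g1", [("a", "b"), ("b", "c")])
exported = graph_server.handle("data_export", "g1")
graph_server.handle("stop", "g1")
```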
  • The visual graph computing system further includes a verification server 6, wherein the verification server 6 is used to verify whether the user name and password are correct according to the user name and password of the user, and to query the password corresponding to a user name according to the user name.
  • the visualized graph computing method further includes:
  • Step S100 Generate first verification information according to the user name and password input by the user, where the first verification information represents a request to verify whether the user name and password are correct;
  • step S101 is performed.
  • In step S1071, after the graph computing server 2 creates the original instance of the graph to be processed according to the graph creation command, the graph computing server 2 obtains the user name of the current user and generates third verification information according to the user name. The third verification information represents a request to obtain the password of the user name. The graph computing server 2 sends the third verification information to the verification server 6, requesting the verification server 6 to return the password of the user name.
  • When the graph computing server 2 receives the password of the user name sent by the verification server 6, the user name and password are input into the graph computing server 2 for verification, and when the verification passes, the user can modify the configuration file of the original instance.
  • The user name and password are written into the configuration file of the instance in the graph computing server 2. When the user starts the graph computing server 2 and accesses a graph computing instance that the user created, the credentials are the same as the user name and password of the core server 1 of the visualized graph computing system, which achieves unified user authentication across the whole set of application systems.
  • The verification server 6 is an LDAP server.
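The credential flow described above can be sketched as follows. This is a minimal illustration only: `VerificationServer` is an in-memory stand-in rather than a real LDAP client, and the class names, the `bind_credentials` helper, and the `auth.username` / `auth.password` configuration keys are all hypothetical names that do not appear in the application.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VerificationServer:
    """In-memory stand-in for verification server 6 (an LDAP server in the text)."""
    directory: dict[str, str]

    def verify(self, username: str, password: str) -> bool:
        # Check whether the user name and password are correct.
        return self.directory.get(username) == password

    def lookup_password(self, username: str) -> Optional[str]:
        # Query the password that corresponds to a user name.
        return self.directory.get(username)


@dataclass
class GraphInstance:
    """Hypothetical graph instance with a key/value configuration file."""
    owner: str
    config: dict[str, str] = field(default_factory=dict)


def bind_credentials(instance: GraphInstance, server: VerificationServer) -> bool:
    """Write the owner's credentials into the instance config once verified."""
    password = server.lookup_password(instance.owner)
    if password is None or not server.verify(instance.owner, password):
        return False
    # Assumed config keys; the application does not name them.
    instance.config["auth.username"] = instance.owner
    instance.config["auth.password"] = password
    return True
```

Because the instance configuration is filled from the same directory that authenticates the platform login, the instance and the platform share one set of credentials, which is the unified authentication the text describes.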
  • an embodiment of the present application provides a visualized graph computing system, including:
  • a visual graph computing platform configured to obtain the original table data of the graph to be processed according to the first input of the user, and generate the task workflow file of the graph to be processed according to the second input of the user, and the task workflow file includes the original table data and multiple task nodes, wherein the multiple task nodes include at least one first task node and at least one second task node;
  • the big data cluster 3 is used to store the original table data of the graph, to generate, according to the at least one first task node, at least one first command for executing the first task node, to calculate the original table data and generate a table data file, and to convert the structured data of the to-be-processed graph into the table data of the to-be-processed graph;
  • the graph computing server 2 is configured to store the original structured data of the graph, to generate a graph instance to be processed according to the at least one first command and the at least one second task node, and to calculate the table data file according to the graph instance to be processed, generating the structured data of the graph to be processed.
  • The visualized graph computing system includes a visualized graph computing platform, a big data cluster 3, and a graph computing server 2, wherein the big data cluster 3 stores the original table data of graphs and the graph computing server 2 stores the original structured data of graphs. The user selects the operators of the graph and the connection relationships between the operators, and configures the parameters of the operators, on the visualized graph computing platform. The big data cluster 3 can then generate the table data file of the graph according to the operators and parameters of the graph, while the graph computing server 2 builds a graph instance according to the operators of the graph, the connection relationships between the operators, and the configuration parameters of the operators. The graph computing server 2 then performs graph computation on the table data files in the big data cluster 3 to generate graph structure data, and the big data cluster 3 converts the graph structure data into graph table data.
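The final conversion step, from graph structure data to table data, can be pictured with a small sketch. The node record layout (`id`, `label`, `properties`) and the function name `nodes_to_table` are assumptions made for illustration; the application does not specify how the conversion is implemented.

```python
def nodes_to_table(nodes: list[dict]) -> tuple[list[str], list[list]]:
    """Flatten node records into (columns, rows) with one uniform schema.

    Assumes property names never collide with the fixed "id"/"label" columns.
    """
    columns = ["id", "label"]
    # Collect every property name, in first-seen order, as a table column.
    for node in nodes:
        for key in node["properties"]:
            if key not in columns:
                columns.append(key)
    rows = []
    for node in nodes:
        row = [node["id"], node["label"]]
        # Nodes that lack a property get None in that column.
        row += [node["properties"].get(col) for col in columns[2:]]
        rows.append(row)
    return columns, rows
```

A node that lacks a property simply receives an empty cell, so results from different algorithm operators can share one table.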
  • The graph computing server 2 includes a Neo4j server. Neo4j is a high-performance NoSQL graph database: an embedded, disk-based Java persistence engine with fully transactional features that stores structured data in a network (called a graph from a mathematical point of view) rather than in tables. Neo4j can also be seen as a high-performance graph engine with all the features of a full-fledged database.
  • The Neo4j server performs a series of graph operations through Cypher statements, whose syntax is flexible and whose parameters are easy to configure, so that a variety of operators can be implemented; the nodes and edges in the graph can be viewed through the Neo4j Browser, which is convenient for analysis.
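As a rough illustration of how an operator's parameters might be turned into a Cypher statement, the sketch below builds a parameterized query string. The function name, the operator parameters (`label`, `rel_type`, `limit`), and the query itself are hypothetical; the application does not disclose the actual translation. Only the Cypher syntax shown (`MATCH`, `RETURN`, `LIMIT`, `$limit` parameters) is standard.

```python
import re

# Conservative pattern for names that are safe to place in a query unquoted.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def neighbors_query(label: str, rel_type: str, limit: int) -> tuple[str, dict]:
    """Build a parameterized Cypher query from hypothetical operator parameters."""
    # Labels and relationship types cannot be passed as Cypher parameters,
    # so they are validated before being interpolated into the statement.
    for name in (label, rel_type):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    query = (
        f"MATCH (a:{label})-[r:{rel_type}]->(b:{label}) "
        "RETURN a, r, b LIMIT $limit"
    )
    # Values travel separately as query parameters instead of being inlined.
    return query, {"limit": limit}
```

Keeping values such as the limit in the parameter dictionary, rather than in the query text, is what makes operator parameters easy to reconfigure without rebuilding the statement.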
  • The big data cluster 3 adopts CDH as the big data framework.
  • CDH is Cloudera's 100% open source platform distribution, including Apache Hadoop, built for enterprise needs. By integrating Hadoop with more than a dozen other key open source projects, Cloudera has created a functionally advanced system that helps users execute end-to-end big data workflows.
  • CDH is based on the stable version of Apache Hadoop and fixes the latest bugs; CDH is easy to install and upgrade; CDH supports rich components.
  • the CDH of the big data cluster 3 in this embodiment of the present application includes the following components: HDFS, Hive, Oozie, Spark, and the like.
  • HDFS is a distributed file storage system that can store a large number of large files. Its main difference from other distributed file systems is that it is highly fault-tolerant and suitable for deployment on inexpensive machines; HDFS provides high-throughput data access and is well suited to applications on large-scale data sets. That is, HDFS can receive the structured data sent by the graph computing server 2 to the big data cluster 3 and convert the structured data into table data.
  • Hive is a Hadoop-based data warehouse tool for data extraction, transformation, and loading. It is a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop.
  • The Hive data warehouse tool can map structured data files to database tables and provides SQL query functions, converting SQL statements into MapReduce tasks for execution. That is, after HDFS converts the structured data into table data, the table data of the graph to be processed can be stored in Hive.
  • Oozie is an open source framework based on a workflow engine, contributed to Apache by Cloudera, for running a set of jobs or processes in a specific order within a workflow. Within the cluster, it is responsible for scheduling tasks in the order required by the business logic. That is, Oozie can convert the task nodes sent by the core server 1 into commands and send the commands to the graph computing server 2, that is, execute steps S1061-S1065. The specific execution steps are as described above and are not repeated here.
  • Spark is a fast and general computing engine designed for large-scale data processing. Spark can perform graph computations based on commands.
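The scheduling role described for Oozie, running task nodes in an order that respects the workflow, can be illustrated with a dependency sort. This is not Oozie's API (Oozie workflows are defined in its own format); it is a sketch using Python's standard `graphlib`, and the task names and their dependencies are assumptions based on the node types named in this application.

```python
from graphlib import TopologicalSorter

# Each task node maps to the set of nodes that must finish before it starts.
# Node names and dependencies are hypothetical.
WORKFLOW = {
    "create_graph": {"import_data"},
    "compute_graph": {"create_graph"},
    "export_data": {"compute_graph"},
    "stop_graph": {"export_data"},
}


def execution_order(workflow: dict[str, set[str]]) -> list[str]:
    """Return task nodes in an order that respects every dependency."""
    return list(TopologicalSorter(workflow).static_order())
```

For a linear chain like this one there is exactly one valid order; with branching workflows, any order returned still guarantees that a node runs only after all of its predecessors.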
  • an embodiment of the present application further provides an electronic device, including one or more processors and a memory.
  • the processor may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.
  • the memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory, or the like.
  • the non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor may execute the program instructions to implement the visualized graph computing method of the various embodiments of the present application described above and/or other desired functionality.
  • Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
  • Embodiments of the present application may also be computer program products comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps of the visualized graph computing method according to the embodiments of the present application described in the "Exemplary Method" section of this specification.
  • The computer program product may have program code for performing the operations of the embodiments of the present application written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • Embodiments of the present application may also be computer-readable storage media having computer program instructions stored thereon that, when executed by a processor, cause the processor to perform the steps of the visualized graph computing method according to the various embodiments of the present application described in the "Exemplary Method" section of this specification.
  • the computer-readable storage medium may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • A readable storage medium may include, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • each component or each step can be decomposed and/or recombined. These disaggregations and/or recombinations should be considered as equivalents of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A visual graph calculation method and system, and a storage medium and an electronic device, which solve the technical problem in the prior art of data being insecure during a graph calculation process. According to the visual graph calculation method, on a visual graph calculation platform, a user selects operators of a graph and a connection relationship between the operators, and configures parameters of the operators; a big data cluster can generate a table data file of the graph according to the operators and the parameters; a graph calculation server constructs an instance of the graph and generates structured data of the graph according to the operators, the connection relationship between the operators, and the configuration parameters of the operators; the big data cluster transforms the structured data of the graph into table data of the graph; and the user can only obtain the structured data of the graph on the graph calculation server, and obtains the table data of the graph in the big data cluster. The entire process involves transmission between a back-end graph calculation server and a big data cluster, and a user cannot see any data, such that the data security is improved.

Description

Visualized graph computing method and system, storage medium, and electronic device
Technical field
The present application relates to the field of computer technology, and in particular to a visualized graph computing method and system, a storage medium, and an electronic device.
Background
With the rapid development of big data technology, major companies, especially networked enterprises, are collecting, storing, processing, sharing, retrieving, analyzing, and displaying data, and mining the business value behind it, from every angle. Data generated by interactions between individuals is represented in the form of graphs, and large-scale graph data has accumulated in fields such as communications, the Internet, e-commerce, social networks, and the Internet of Things.
A graph is composed of nodes and edges, and data with a graph structure is graph data. Graph computing is the technology for processing graph data. Graph databases and graph computing frameworks, for example, whether distributed or single-node, are built on physical machines and meet user needs through services deployed on those machines, with multiple users sharing the same service.
In the related art, the required graph data often has to be exported from a database and then manually input into the graph computing server for computation. For data containing sensitive fields, exporting the data and manually entering it into the graph computing server increases the probability of data leakage or loss; that is, the data is insecure.
Summary of the application
In view of this, the embodiments of the present application provide a visualized graph computing method and system, a storage medium, and an electronic device, which solve the technical problem in the prior art that data is insecure during graph computing.
As a first aspect, an embodiment of the present application provides a visualized graph computing method applied to a visualized graph computing system. The visualized graph computing system includes a visualized graph computing platform, a big data cluster, and a graph computing server, wherein the big data cluster stores the original table data of graphs and the graph computing server stores the original structured data of graphs. The visualized graph computing method includes:
obtaining the original table data of a graph to be processed according to a first input of a user;
generating a task workflow file of the graph to be processed according to a second input of the user, the task workflow file including the original table data and a plurality of task nodes, wherein the plurality of task nodes include at least one first task node and at least one second task node;
generating, according to the at least one first task node, at least one first command for executing the first task node, and calculating the original table data to generate a table data file;
generating a graph instance to be processed according to the at least one first command and the at least one second task node, and calculating the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed; and
converting the structured data of the graph to be processed into table data of the graph to be processed.
In an embodiment of the present application, after the original table data of the graph to be processed is obtained and before the task workflow file of the graph to be processed is generated according to the second input of the user, the visualized graph computing method further includes:
preprocessing the original table data to obtain preprocessed table data of the graph to be processed;
and generating, according to the at least one first task node, at least one first command for executing the first task node, and calculating the original table data to generate a table data file, includes:
generating, according to the at least one first task node, at least one first command for executing the first task node, and calculating the preprocessed table data to generate a table data file.
In an embodiment of the present application, generating the task workflow file of the graph to be processed according to the second input of the user includes:
generating a data import node according to a data import operator input by the user;
generating a to-be-processed graph creation node according to a to-be-processed graph creation operator input by the user;
generating a to-be-processed graph computing node according to a plurality of algorithm operators of the graph to be processed input by the user;
generating a data export node according to a data export operator input by the user;
generating a stop node according to an instruction, input by the user, of an operator for stopping the graph to be processed;
generating the relationships between the plurality of algorithm operators according to the connection relationships between the plurality of algorithm operators input by the user;
configuring parameters for the plurality of algorithm operators according to a preset configuration mode input by the user, generating the parameters of each algorithm operator; and
generating the task workflow file of the graph to be processed according to a submission instruction input by the user;
wherein the at least one first task node includes the data import node, the to-be-processed graph computing node, the data export node, and the stop node; and
the at least one second task node includes the to-be-processed graph creation node.
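One way to picture the resulting task workflow file is as a serialized description of the user's canvas. The sketch below is an assumption-laden illustration: the JSON layout, the field names (`source_table`, `nodes`, `edges`, `group`), the operator `kind` strings, and the function `build_workflow` are all hypothetical, since the application does not disclose the file format. Only the split into first and second task nodes follows the text.

```python
import json

# Per the text above, only the graph-creation node is a "second" task node.
SECOND_KINDS = {"create_graph"}


def build_workflow(table: str, operators: list[dict],
                   edges: list[tuple[str, str]]) -> str:
    """Serialize operators, parameters, and connections into a JSON workflow."""
    nodes = []
    for op in operators:
        nodes.append({
            "id": op["id"],
            "kind": op["kind"],
            "params": op.get("params", {}),
            # Classify each node as a first or second task node.
            "group": "second" if op["kind"] in SECOND_KINDS else "first",
        })
    workflow = {
        "source_table": table,
        "nodes": nodes,
        "edges": [{"from": src, "to": dst} for src, dst in edges],
    }
    return json.dumps(workflow, indent=2, sort_keys=True)
```

Tagging each node with its group at build time lets the two back ends pick out the nodes they are responsible for without re-deriving the classification.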
In an embodiment of the present application, generating, according to the at least one first task node, at least one first command for executing the first task node, and calculating the preprocessed table data to generate a table data file, includes:
generating a data import command, a graph computation command, a data export command, and a stop command according to the data import node, the to-be-processed graph computing node, the data export node, and the stop node; and
calculating the preprocessed table data to generate the table data file.
In an embodiment of the present application, calculating the preprocessed table data to generate the table data file includes:
calculating, according to the data import command, the point data in the preprocessed table data that corresponds to the plurality of algorithm operators of the graph to be processed, and the edge data that corresponds to the connection relationships between the plurality of algorithm operators of the graph to be processed; and
generating header files for the point data and the edge data according to the parameters of the algorithm operators;
wherein the table data file includes a plurality of point data, a plurality of edge data, and the header files.
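A table data file made of point data, edge data, and header files could look like the CSV sketch below. The header fields (`id:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE`, `amount:double`) follow the CSV header convention of Neo4j's bulk import tool; that the application uses this exact format is an assumption, as are the input columns (`src`, `dst`, `amount`) and the function name.

```python
import csv
import io


def build_import_files(rows: list[dict], label: str, rel_type: str) -> dict[str, str]:
    """Split an edge-list table into header, point, and edge CSV text."""
    # Every distinct endpoint of an edge becomes one point (node) record.
    node_ids = sorted({r["src"] for r in rows} | {r["dst"] for r in rows})

    def to_csv(records: list[list]) -> str:
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerows(records)
        return buf.getvalue()

    return {
        "nodes_header": to_csv([["id:ID", ":LABEL"]]),
        "nodes": to_csv([[n, label] for n in node_ids]),
        "edges_header": to_csv([[":START_ID", ":END_ID", ":TYPE", "amount:double"]]),
        "edges": to_csv([[r["src"], r["dst"], rel_type, r["amount"]] for r in rows]),
    }
```

Keeping the headers in separate files from the data means large point and edge files can be produced in parallel parts that all share one header.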
In an embodiment of the present application, generating a graph instance to be processed according to the at least one first command and the at least one second task node, and calculating the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed, includes:
creating an original instance of the graph to be processed according to the to-be-processed graph creation node and the data import command;
modifying the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed;
calculating, according to the graph computation command and the instance of the graph to be processed, over the table data file, the plurality of algorithm operators of the graph to be processed, the parameters of each algorithm operator, and the connection relationships between the plurality of algorithm operators, to generate the structured data of the graph to be processed; and
exporting the structured data of the graph to be processed according to the data export command.
In an embodiment of the present application, before the structured data of the graph to be processed is exported according to the data export command, generating a graph instance to be processed according to the at least one first command and the at least one second task node, and calculating the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed, further includes:
determining whether the parameters of the algorithm operator are consistent with the configuration parameters in the instance; and
modifying the parameters of the algorithm operator when they are inconsistent with the configuration parameters in the instance.
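The consistency check can be sketched as a small reconciliation function. The text only says that inconsistent operator parameters are modified; this sketch assumes they are overwritten with the instance's values for keys both sides share, and the function and parameter names are hypothetical.

```python
def reconcile_params(operator_params: dict,
                     instance_config: dict) -> tuple[dict, list[str]]:
    """Return operator parameters aligned with the instance, plus changed keys."""
    aligned = dict(operator_params)  # copy so the caller's dict is untouched
    changed = []
    for key, expected in instance_config.items():
        # Only keys present on both sides can be inconsistent.
        if key in aligned and aligned[key] != expected:
            aligned[key] = expected
            changed.append(key)
    return aligned, changed
```

Returning the list of changed keys makes it possible to log or surface which parameters were adjusted before the export runs.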
In an embodiment of the present application, before the original table data of the graph to be processed is obtained, the visualized graph computing method further includes:
generating first verification information according to the user name and password input by the user, the first verification information representing a request to verify whether the user name and password are correct; and
when second verification information is received, generating a first signature that indicates to the user that the user name and password are correct, and obtaining the original table data of the graph to be processed according to the input of the user, wherein the second verification information indicates that the user name and the password are correct;
and modifying the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed includes:
obtaining the user name of the user and generating third verification information according to the user name, the third verification information representing a request to obtain the password of the user name;
generating fourth verification information according to the user name input by the user and the password of the user name, the fourth verification information representing a request to verify whether the user name and the password are correct; and
when fifth verification information is received, modifying the configuration file of the original instance of the graph to be processed to generate an instance of the graph to be processed, wherein the fifth verification information indicates that the user name and the password are correct.
As a second aspect of the present application, an embodiment of the present application provides a visualized graph computing system, including:
a visualized graph computing platform, configured to obtain the original table data of a graph to be processed according to a first input of a user, and to generate a task workflow file of the graph to be processed according to a second input of the user, the task workflow file including the original table data and a plurality of task nodes, wherein the plurality of task nodes include at least one first task node and at least one second task node;
a big data cluster, configured to store the original table data of graphs, to generate, according to the at least one first task node, at least one first command for executing the first task node, and to calculate the original table data to generate a table data file; and
a graph computing server, configured to store the original structured data of graphs, to generate a graph instance to be processed according to the at least one first command and the at least one second task node, and to calculate the table data file according to the graph instance to be processed to generate the structured data of the graph to be processed;
wherein the big data cluster is further configured to convert the structured data of the graph to be processed into table data of the graph to be processed.
In an embodiment of the present application, the visualized graph computing system further includes:
a verification server, configured to verify, according to the user name and password of the user, whether the user name and the password are correct, and to query, according to the user name, the password corresponding to the user name.
As a third aspect of the present application, an embodiment of the present application provides a computer-readable storage medium, including:
a storage medium storing a computer program,
wherein the computer program is used to execute the visualized graph computing method described above.
As a fourth aspect of the present application, an embodiment of the present application provides an electronic device, the electronic device including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the visualized graph computing method described above.
A visualized graph computing method provided by an embodiment of the present application is applied to a visualized graph computing system that includes a visualized graph computing platform, a big data cluster, and a graph computing server, wherein the big data cluster stores the original table data of graphs and the graph computing server stores the original structured data of graphs. The user selects the operators of the graph and the connection relationships between the operators, and configures the parameters of the operators, on the visualized graph computing platform. The big data cluster can then generate the table data file of the graph according to the operators and parameters of the graph, while the graph computing server builds a graph instance according to the operators of the graph, the connection relationships between the operators, and the configuration parameters of the operators. The graph computing server then performs graph computation on the table data file in the big data cluster to generate the structure data of the graph, and the big data cluster converts the structure data of the graph into the table data of the graph. The user can only obtain the structure of the graph and its structure data on the graph computing server, and the table data of the graph in the big data cluster. Throughout the process, the user does not need to download the table data of the graph and import it into the graph computing server, and every type of graph data is transmitted only between the back-end graph computing server and the big data cluster, with no data visible to the user, which improves data security. In addition, when performing graph computing, the user only needs to focus on the logic of the graph and compose the workflow of the graph operator by operator, which greatly reduces the cost of learning specialized graph computing knowledge and improves the efficiency of graph computing.
Description of drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of a visualized graph computing system provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a visualized graph computing method provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of a visualized graph computing method provided by another embodiment of the present application;
FIG. 4 is a schematic structural diagram of a visualized graph computing system provided by another embodiment of the present application;
FIG. 5 is a schematic structural diagram of a visualized graph computing system provided by another embodiment of the present application.
Detailed description
For a better understanding of the technical solutions of the present application, the embodiments of the present application are described in detail below with reference to the accompanying drawings.
It should be clear that the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application, without creative effort, fall within the scope of protection of the present application.
The terms used in the embodiments of the present application serve only to describe particular embodiments and are not intended to limit the present application. The singular forms "a", "said", and "the" used in the embodiments of the present application and the appended claims are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" used herein merely describes an association between associated objects and covers three possible relationships. For example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
FIG. 1 shows the visualized graph computing system provided by an embodiment of the present application, which includes:
A visual graph computing platform, which includes a client 4 and a core server 1. The client 4 displays a graph computing operation interface on which the user works, for example logging in to the system web page or dragging and dropping operators according to the graph to be computed. The core server 1 responds to the user's operations on the system web page at the client 4 and generates the graph computing workflow;
A big data cluster 3, which stores the raw table data of multiple graphs. Through the big data cluster 3, the user can query the table data of these graphs, including the table data of the points (vertices) of a graph and the table data of its edges;
A graph computing server 2, which stores the raw structured data of multiple graphs, that is, multiple graph instances. Through the graph computing server 2, the user can view the structure of a graph. Data is imported and exported between the graph computing server 2 and the big data cluster 3: when table data in the big data cluster 3 is imported into the graph computing server 2, the user can view the structure of the graph that the table data forms; when the structural data of a graph in the graph computing server 2 is exported to the big data cluster 3, the user can query the table data of that graph through the big data cluster 3.
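By way of illustration only, the import and export between table data and structured data described above can be sketched as follows. The row layout (`id`, `src`, `dst`) and the function names are assumptions made for this sketch and are not defined by the present application; the adjacency map merely stands in for whatever structured representation the graph computing server 2 actually uses.

```python
def tables_to_structure(vertex_rows, edge_rows):
    """Import direction: build an adjacency map from point and edge tables.

    The column names "id", "src" and "dst" are hypothetical.
    """
    adjacency = {row["id"]: [] for row in vertex_rows}
    for edge in edge_rows:
        adjacency.setdefault(edge["src"], []).append(edge["dst"])
        adjacency.setdefault(edge["dst"], [])
    return adjacency


def structure_to_tables(adjacency):
    """Export direction: flatten the adjacency map back into two tables."""
    vertex_rows = [{"id": v} for v in sorted(adjacency)]
    edge_rows = [
        {"src": src, "dst": dst}
        for src in sorted(adjacency)
        for dst in adjacency[src]
    ]
    return vertex_rows, edge_rows


vertices = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
edges = [{"src": "a", "dst": "b"}, {"src": "b", "dst": "c"}]

graph = tables_to_structure(vertices, edges)   # what a user would view as structure
round_trip = structure_to_tables(graph)        # what a user would query as tables
```

In this sketch the same graph can be viewed either as rows or as a structure, which mirrors the division of roles between the big data cluster 3 and the graph computing server 2.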
When the user needs to compute a graph to be processed, the user can perform the computation visually using the visualized graph computing system. As shown in FIG. 2, the visualized graph computing method includes the following steps:
Step S101: the client 4 displays the graph computing system web page; the user logs in to the system web page and enters a first request, which asks to "view the raw table data of the graph to be processed". The raw table data of the graph to be processed includes the raw table data of each of its elements, for example the table data of its points and the table data of its edges;
Step S102: on obtaining the user's first request, the core server 1 sends the first request to "view the raw table data of the graph to be processed" to the big data cluster 3;
Step S103: according to the first request, the big data cluster 3 queries the raw table data of each element of the graph to be processed and sends the raw table data of the graph to be processed to the core server 1;
At this point, the user can see the raw table data of the graph to be processed on the system web page.
Step S104: based on the graph to be processed, the user enters preset items on the web page, for example submitting requests or operations related to computing the graph to be processed, such as dragging and dropping operators;
Step S105: in response to the user's input, the core server 1 generates the task workflow file of the graph to be processed and transmits it to the big data cluster 3. The task workflow file includes the raw table data of the graph to be processed and multiple task nodes, where the multiple task nodes include at least one first task node and at least one second task node;
Step S106: based on the received task workflow file, the big data cluster 3 generates, from the at least one first task node, at least one first command for executing that first task node and transmits the at least one first command to the graph computing server 2. The big data cluster 3 also performs computation on the raw table data of the graph to be processed according to the first task nodes, generates the table data file of the graph to be processed, and sends that file to the graph computing server 2. The table data file of the graph to be processed includes the raw table data of each element of the graph and the header file for the raw table data of each element;
Step S107: the graph computing server 2 receives the at least one first command and the table data file of the graph to be processed, generates an instance of the graph to be processed according to the second task node and the at least one first command, performs computation on the table data according to that instance to generate the structured data of the graph to be processed, and transmits the structured data of the graph to be processed to the big data cluster 3;
Once the graph computing server 2 has computed the graph to be processed and generated the structured data, it can send the structured data to the core server 1, and the user can then view the structure of the graph to be processed on the client 4.
Step S108: on receiving the structured data of the graph to be processed, the big data cluster 3 converts it into the table data of the graph to be processed. The big data cluster 3 can then send this table data to the core server 1, and the user can view it on the client 4. In addition, because the big data cluster 3 has a storage function, the computed table data of the graph to be processed is stored in the big data cluster 3. This provides ready-made table data for later computations on the same graph, which improves graph computing efficiency, and when the user needs the table data of the graph in other application scenarios, it can be viewed in or exported from the big data cluster 3 directly.
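As an informal aid to reading steps S101 to S108, the exchange between components can be written down as a message trace. This is only a sketch: the component labels, the `Message` type, and the simplification of each step to a single sender and receiver are assumptions of this example, not part of the described method.

```python
from typing import NamedTuple


class Message(NamedTuple):
    step: str
    sender: str
    receiver: str
    payload: str


# One entry per step of FIG. 2; payload strings are informal summaries.
FLOW = [
    Message("S101", "client", "core_server", "first request: view raw table data"),
    Message("S102", "core_server", "big_data_cluster", "first request"),
    Message("S103", "big_data_cluster", "core_server", "raw table data"),
    Message("S104", "client", "core_server", "operators, connections, parameters"),
    Message("S105", "core_server", "big_data_cluster", "task workflow file"),
    Message("S106", "big_data_cluster", "graph_server", "first commands and table data file"),
    Message("S107", "graph_server", "big_data_cluster", "structured data"),
    Message("S108", "big_data_cluster", "core_server", "table data of the computed graph"),
]


def steps_touching(component):
    """Return the steps in which the given component sends or receives."""
    return [m.step for m in FLOW if component in (m.sender, m.receiver)]


graph_server_steps = steps_touching("graph_server")
client_steps = steps_touching("client")
```

Read this way, the graph computing server only ever exchanges data with the big data cluster, which is the basis of the data security argument made for the method.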
The visualized graph computing method provided by the embodiments of the present application is applied to a visualized graph computing system that includes a visual graph computing platform, a big data cluster 3, and a graph computing server 2. The big data cluster 3 stores the raw table data of graphs, and the graph computing server 2 stores the raw structured data of graphs. On the visual graph computing platform, the user selects the operators of a graph, connects them, and configures their parameters. From these operators and parameters the big data cluster 3 generates the table data file of the graph, while the graph computing server 2 builds a graph instance from the operators, the connections between them, and their configured parameters. The graph computing server 2 then performs graph computation on the table data file from the big data cluster 3 to generate the structural data of the graph, and the big data cluster 3 converts that structural data into the table data of the graph. The user obtains the structure and structural data of the graph only on the graph computing server 2, and the table data of the graph only in the big data cluster 3. Throughout the process, the user never has to download the table data of the graph and import it into the graph computing server 2. Every type of graph data moves only between the back-end graph computing server 2 and the big data cluster 3 and is never exposed to the user, which improves data security. In addition, when performing graph computation the user only needs to focus on the logic of the graph and assemble its workflow operator by operator, which greatly lowers the cost of learning specialized graph computing knowledge and improves the efficiency of graph computation.
In an embodiment of the present application, the raw table data of each element that the big data cluster 3 retrieves in step S103 according to the first request is not necessarily all usable by the graph to be processed. Therefore, between step S103 and step S104, as shown in FIG. 3, the visualized graph computing method further includes:
Step S1031: the core server 1 preprocesses the raw table data to obtain the preprocessed table data of the graph to be processed;
Step S106 then becomes: based on the received task workflow file, the big data cluster 3 generates at least one first command and transmits it to the graph computing server 2. The big data cluster 3 also performs computation on the preprocessed table data of the graph to be processed according to the at least one first task node, generates the table data file of the graph to be processed, and sends that file to the graph computing server 2. The table data file of the graph to be processed includes the table data of each element of the graph and the header file for the table data of each element.
Preprocessing the raw table data of the graph to be processed reduces the amount of data computed and transferred among the core server 1, the graph computing server 2, and the big data cluster 3, and ensures that all data used throughout the computation is table data relevant to the graph to be processed, which improves computational efficiency.
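The present application does not specify what the preprocessing of step S1031 consists of. Purely as an assumed example, the sketch below drops point rows without an identifier and edge rows that refer to unknown points, so that only rows usable by the graph to be processed remain. The column names and the filtering rule are hypothetical.

```python
def preprocess(vertex_rows, edge_rows):
    """Keep only rows that can take part in the graph (assumed filtering rule)."""
    clean_vertices = [v for v in vertex_rows if v.get("id") is not None]
    known_ids = {v["id"] for v in clean_vertices}
    clean_edges = [
        e for e in edge_rows
        if e.get("src") in known_ids and e.get("dst") in known_ids
    ]
    return clean_vertices, clean_edges


raw_vertices = [{"id": "a"}, {"id": "b"}, {"id": None}]
raw_edges = [
    {"src": "a", "dst": "b"},   # usable
    {"src": "a", "dst": "z"},   # "z" is not a known point
    {"src": None, "dst": "b"},  # missing source
]

clean_vertices, clean_edges = preprocess(raw_vertices, raw_edges)
```

Whatever rule is actually applied, the effect described above is the same: fewer rows are computed on and transferred in the later steps.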
In an embodiment of the present application, as shown in FIG. 3, step S105 (in response to the user's input, the core server 1 generates the task workflow file of the graph to be processed and transmits it to the big data cluster 3, where the task workflow file includes the raw table data of the graph to be processed, multiple task nodes, and the execution order among the task nodes) may specifically include the following steps:
Step S1051: the core server 1 generates a data import node according to the data import operator entered by the user. That is, the user operates on the graph computing interface of the client 4 and drags and drops a data import operator, and the core server 1 responds to the user's input by generating the data import node.
The data import operators include an import-edge-file operator and an import-point-file operator.
Step S1052: generate a to-be-processed graph creation node according to the graph creation operator entered by the user. That is, the user operates on the graph computing interface of the client 4 and drags and drops the graph creation operator, and the core server 1 responds to the user's input by generating the graph creation node;
Step S1053: generate a to-be-processed graph computation node according to the multiple algorithm operators of the graph to be processed entered by the user. That is, the user operates on the graph computing interface of the client 4 and drags and drops the multiple algorithm operators of the graph to be processed, and the core server 1 responds to the user's input by generating the to-be-processed graph computation node;
The multiple algorithm operators of the graph to be processed mainly serve to splice the algorithm operator parameters entered by the user at the client into statements accepted by the graph computing server.
Step S1054: generate a data export node according to the data export operator entered by the user. That is, the user operates on the graph computing interface of the client 4 and drags and drops the graph data export operator, and the core server 1 responds to the user's input by generating the data export node;
Step S1055: generate a stop node according to the stop-graph operator entered by the user. That is, the user operates on the graph computing interface of the client 4 and drags and drops the stop-graph operator, and the core server 1 responds to the user's input by generating the stop node. At this point the user has finished dragging and dropping all algorithm operators of the graph to be processed, that is, all algorithm operators of the graph to be processed are ready;
Step S1056: generate the relationships among the algorithm operators according to the connections between the multiple algorithm operators entered by the user. That is, the user operates on the graph computing interface of the client 4 and draws the connections between the algorithm operators made ready in step S1055. For example, if the first algorithm operator is connected to the second and the seventh algorithm operators, the user creates those connections on the graph computing interface of the client 4, and the core server 1 responds to the user's input by generating the computation nodes of the multiple operators. In other words, the graph to be processed can be computed through the connections among the algorithm operators, yielding the corresponding computation data.
Step S1057: configure the parameters of the multiple algorithm operators according to the preset configuration mode entered by the user, and generate the parameters of each algorithm operator. That is, the user operates on the graph computing interface of the client 4 and configures the operator parameters, and the core server 1 responds to the user's operation by generating the parameters of each algorithm operator.
For example, when dragging and dropping an operator, the user only needs to assign a value to each of the operator's parameters by parameter name. The parameters of the load-point-file operator are described as follows:
Figure PCTCN2021092928-appb-000001
An example of the values filled in by the user is as follows:
Figure PCTCN2021092928-appb-000002
Step S1058: generate the task workflow file of the graph to be processed according to the submit instruction entered by the user;
That is, after the user has completed all operations of steps S1051 to S1057 on the graph computing interface of the client 4 and clicks "Submit task", the core server 1 responds to the "Submit task" instruction by turning all operations performed in steps S1051 to S1057 into a task workflow file. The task workflow file includes the preprocessed table data and multiple task nodes. The multiple task nodes include at least one first task node and at least one second task node: the at least one first task node includes the data import node, the to-be-processed graph computation node, the data export node, and the stop node; the at least one second task node includes the to-be-processed graph creation node.
At this point, the core server 1 has finished generating the task workflow file. Although the task workflow file is generated in step S1058, the core server 1 has already created the corresponding nodes in the background while the user was operating on the graph computing interface of the client 4 (steps S1051 to S1057). Only when the user clicks to submit the task on the interface, as shown in FIG. 3, are the task nodes generated in steps S1051 to S1057 brought together in the task workflow file, which includes the preprocessed table data obtained in step S1031 and the multiple task nodes obtained in steps S1051 to S1057.
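The assembly of the task workflow file in steps S1051 to S1058 can be sketched as below. This is a minimal illustration under stated assumptions: the present application does not give the file format, so the JSON layout, the field names (`kind`, `group`, `params`, `connections`), and the operator parameters shown are all hypothetical. Only the split into first and second task nodes follows the text.

```python
import json

# Grouping taken from the description: four kinds of first task node,
# and the graph creation node as the second task node.
FIRST_KINDS = {"data_import", "graph_compute", "data_export", "stop"}
SECOND_KINDS = {"graph_create"}


def build_workflow(table_data, operators, connections):
    """Collect the nodes created for each dragged operator into one document."""
    nodes = []
    for op in operators:
        kind = op["kind"]
        if kind in FIRST_KINDS:
            group = "first"
        elif kind in SECOND_KINDS:
            group = "second"
        else:
            raise ValueError(f"unknown operator kind: {kind}")
        nodes.append(
            {"id": op["id"], "kind": kind, "group": group,
             "params": op.get("params", {})}
        )
    return {"table_data": table_data, "nodes": nodes, "connections": connections}


operators = [
    {"id": "n1", "kind": "data_import", "params": {"file": "points"}},
    {"id": "n2", "kind": "graph_create"},
    {"id": "n3", "kind": "graph_compute", "params": {"algorithm": "degree"}},
    {"id": "n4", "kind": "data_export"},
    {"id": "n5", "kind": "stop"},
]
connections = [["n1", "n2"], ["n2", "n3"], ["n3", "n4"], ["n4", "n5"]]
table_data = {"points": [{"id": "a"}], "edges": []}

workflow = build_workflow(table_data, operators, connections)
serialized = json.dumps(workflow)  # what "Submit task" would hand over

first_nodes = [n["id"] for n in workflow["nodes"] if n["group"] == "first"]
second_nodes = [n["id"] for n in workflow["nodes"] if n["group"] == "second"]
```

Keeping the nodes as plain data until submission matches the behaviour described above, where nodes are created in the background per operation and only unified into one file at step S1058.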
When the core server 1 has completed the task workflow file, it transmits the file to the big data cluster 3 and the graph computing server 2. The big data cluster 3 must generate at least one first command from the at least one first task node in the task workflow file, where the at least one first command includes: a data import command, a to-be-processed graph computation command, a data export command, and a stop command. At the same time, the big data cluster 3 performs computation on the preprocessed table data in the task workflow file and generates the table data file of the graph to be processed. That is, after receiving the task workflow file, the big data cluster 3 executes step S106, which specifically includes the following steps:
Step S1061: according to the data import node, compute from the preprocessed table data the point data corresponding to the multiple algorithm operators of the graph to be processed, and the edge data corresponding to the connections between those algorithm operators;
Step S1062: generate the header files of the point data and the edge data according to the parameters of the algorithm operators, and send the table data file to the graph computing server 2;
Steps S1063 to S1066: in response to the commands, send the data import command, the graph computation command, the data export command, and the stop command to the graph computing server 2 in sequence.
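A rough sketch of what the big data cluster does in steps S1061 to S1066 is given below: it derives an ordered list of first commands from the first task nodes and packages table rows together with a separate header. The command names, the node fields, and the comma-separated layout are assumptions of this example; the present application does not define the file or command format.

```python
import csv
import io

# Sending order stated in steps S1063 to S1066.
COMMAND_ORDER = ["data_import", "graph_compute", "data_export", "stop"]


def make_first_commands(nodes):
    """Turn first task nodes into commands, sorted into the sending order."""
    first = [n for n in nodes if n["group"] == "first"]
    first.sort(key=lambda n: COMMAND_ORDER.index(n["kind"]))
    return [f"{n['kind']}_command" for n in first]


def make_table_file(rows, columns):
    """Return the header and the data body of a table data file separately."""
    header = ",".join(columns)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow([row[c] for c in columns])
    return {"header": header, "body": buffer.getvalue()}


nodes = [
    {"kind": "stop", "group": "first"},
    {"kind": "graph_create", "group": "second"},
    {"kind": "data_export", "group": "first"},
    {"kind": "data_import", "group": "first"},
    {"kind": "graph_compute", "group": "first"},
]

commands = make_first_commands(nodes)
edge_file = make_table_file([{"src": "a", "dst": "b"}], ["src", "dst"])
```

Separating the header from the body reflects the statement that the table data file contains both the table data of each element and a header file for it.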
While the user operates on the graph computing interface of the client 4, the core server 1 forms the first task nodes and the second task node from those operations, and the big data cluster 3 generates the first commands from the first task nodes. The graph computing server 2, driven by the first and second task nodes, converts the operator corresponding to each task node into the language accepted by the graph computing server 2, requests the first commands from the big data cluster 3, and then performs the corresponding operations in the order of the first commands. That is, step S107 executed by the graph computing server 2 specifically includes the following steps:
Step S1071:
Create the original instance of the graph to be processed according to the to-be-processed graph creation node and the data import command. That is, when the user drags and drops the graph creation operator on the graph computing interface of the client 4, the graph computing server 2 converts the graph creation operator into the language accepted by the graph computing server 2 and requests the big data cluster 3 to send the "data import command". On receiving the "data import command" from the big data cluster 3, the graph computing server 2 creates the original instance of the graph to be processed and modifies the configuration file of the original instance to generate the instance of the graph to be processed.
Step S1072: according to the to-be-processed graph computation command and the instance of the graph to be processed, perform computation on the table data file, the multiple algorithm operators of the graph to be processed, the parameters of each algorithm operator, and the connections among the algorithm operators, and generate the structured data of the graph to be processed. That is, when the user drags and drops the multiple algorithm operators of the graph to be processed on the graph computing interface of the client 4, the graph computing server 2 converts the algorithm operators into the language accepted by the graph computing server 2 and requests the big data cluster 3 to send the "graph computation command". On receiving the "graph computation command" from the big data cluster 3, the graph computing server 2 performs the computation using the instance of the graph to be processed obtained in step S1071 and generates the structured data of the graph to be processed. The user can then query the structured data of the graph to be processed, that is, its structure, on the graph computing interface of the client 4.
Step S1073: according to the data export command and the parameters of the algorithm operators, export the structured data to the big data cluster 3 for user queries or use by other services. That is, when the user drags and drops the data export operator on the graph computing interface of the client 4, the graph computing server 2 converts the data export operator into the language accepted by the graph computing server 2 and requests the big data cluster 3 to send the "data export command". On receiving the "data export command" from the big data cluster 3, the graph computing server 2 exports the structured data to the big data cluster 3 according to the parameters of the algorithm operators. The big data cluster 3 then converts the structured data into table data, and the user can view the table data of the graph to be processed on the graph computing interface of the client 4, for user queries or use by other services.
Optionally, step S1073 specifically includes: determining whether the parameters of the algorithm operator are consistent with the configuration parameters in the instance; and when they are not consistent, modifying the parameters of the algorithm operator.
Step S1074: according to the stop command, stop the dragging and dropping of algorithm operators for the graph to be processed and stop the computation on the graph to be processed. That is, once all operators of the graph to be processed have executed successfully, meaning that the structure and the table data of the graph have been obtained, and the graph is no longer in use, the graph is stopped and its instance is deleted, which frees up resources.
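The command-driven life cycle of steps S1071 to S1074 can be illustrated with the small sketch below. It is not the graph computing server's actual interface: the `GraphServer` class, its `handle` method, the configuration keys, and the use of out-degree counting as the "graph computation" are all placeholders chosen for this example. The parameter reconciliation in the export branch follows the optional check described for step S1073.

```python
class GraphServer:
    """Toy stand-in that reacts to the four first commands in order."""

    def __init__(self):
        self.instances = {}

    def handle(self, command, graph_id, **kwargs):
        if command == "data_import":
            # S1071: create the instance and record its configuration.
            self.instances[graph_id] = {
                "config": dict(kwargs["config"]),
                "edges": list(kwargs["edges"]),
                "result": None,
            }
        elif command == "graph_compute":
            # S1072: placeholder computation (out-degree per point).
            instance = self.instances[graph_id]
            degree = {}
            for src, dst in instance["edges"]:
                degree[src] = degree.get(src, 0) + 1
                degree.setdefault(dst, 0)
            instance["result"] = degree
        elif command == "data_export":
            # S1073: align operator parameters with the instance configuration.
            instance = self.instances[graph_id]
            params = dict(kwargs["params"])
            for key, value in instance["config"].items():
                if key in params and params[key] != value:
                    params[key] = value
            return {"params": params, "rows": sorted(instance["result"].items())}
        elif command == "stop":
            # S1074: delete the instance to release resources.
            del self.instances[graph_id]
        else:
            raise ValueError(f"unknown command: {command}")
        return None


server = GraphServer()
server.handle("data_import", "g1", config={"delimiter": ","},
              edges=[("a", "b"), ("a", "c")])
server.handle("graph_compute", "g1")
exported = server.handle("data_export", "g1",
                         params={"delimiter": "|", "target": "cluster_table"})
server.handle("stop", "g1")
```

After the stop command no instance remains on the server, which corresponds to the resource release described for step S1074.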
在本申请另一实施例中,如图4所示,可视化的图计算系统还包括验证服务器6,其中验证服务器6用于根据用户的用户名以及密码验证所述用户名和密码是否正确;根据用户名查询与用户名对应的密码。基于该可视化的图计算系统,该可视化的图计算方法还包括:In another embodiment of the present application, as shown in FIG. 4 , the visual graph computing system further includes a verification server 6, wherein the verification server 6 is used to verify whether the user name and password are correct according to the user name and password of the user; Name to query the password corresponding to the user name. Based on the visualized graph computing system, the visualized graph computing method further includes:
步骤S100:根据用户输入的用户名和密码,生成第一验证信息,第一验证信息表示请求验证所述用户名和密码是否正确;Step S100: Generate first verification information according to the user name and password input by the user, where the first verification information represents a request to verify whether the user name and password are correct;
当收到第二验证信息时,生成第一签名,第一签名用于提示所述用户的用户名和密码正确,用户方成功登录客户端4的图计算界面,即执行步骤S101。When the second verification information is received, a first signature is generated, and the first signature is used to prompt that the user name and password of the user are correct, and the user side successfully logs in to the graph computing interface of the client 4, that is, step S101 is performed.
并且在图计算服务器2在根据图创建调令时,步骤S1071中,图计算服务器2根据待处理图创建调令,创建待处理图的原始实例后,图计算服务器2获取当前用户的用户名,当验证服务器6获取用户的用户名后,并根据用户名生成第三验证信息,第三验证信息表示请求获取用户名的密码,并将第三验证信息发送至验证服务器6,请求验证服务器6给与该用户名的密码,当图计算服务器2收到验证服务器6发送的该用户名的密码后,将该用户名以及密码输入图计算服务器2中,进行验证,当验证通过时,用户才可以修改原始实例的配置文件。本申请实施例提供的可视化的图计算方法,由于将用户名和密码写入图计算服务器2中的实例的配置文件中那么当用户启动图计算服务器2时,当用户访问自己创建的图计算实例时,便可以与可视化图计算的核心服务器1的用户名和密码一致,实现了整套应用系统的用户统一认证。And when the graph computing server 2 creates a command according to the graph, in step S1071, the graph computing server 2 creates a command according to the graph to be processed, and after creating the original instance of the graph to be processed, the graph computing server 2 obtains the username of the current user, and when verifying After the server 6 obtains the user name of the user, it generates the third verification information according to the user name. The third verification information represents the password for requesting to obtain the user name, and sends the third verification information to the verification server 6, and requests the verification server 6 to give the password to the user. The password of the user name, when the graph computing server 2 receives the password of the user name sent by the verification server 6, the user name and password are input into the graph computing server 2 for verification, and when the verification is passed, the user can modify the original password. Instance's configuration file. In the visualized graph computing method provided by the embodiments of the present application, since the user name and password are written into the configuration file of the instance in the graph computing server 2, when the user starts the graph computing server 2, when the user accesses the graph computing instance created by himself , it can be consistent with the user name and password of the core server 1 of the visual graph computing, and the unified user authentication of the whole set of application systems is realized.
Optionally, the verification server 6 is an LDAP server.
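The two verification exchanges described above (login, then the check before the instance configuration file may be modified) can be sketched as follows. This is only an illustrative sketch: the class `VerificationServer`, its in-memory dictionary, and the function names are hypothetical stand-ins for the verification server 6 and are not taken from the application; a real deployment would query an LDAP directory instead.

```python
import hmac
from dataclasses import dataclass, field


@dataclass
class VerificationServer:
    """Hypothetical in-memory stand-in for verification server 6."""
    directory: dict = field(default_factory=dict)

    def register(self, username: str, password: str) -> None:
        self.directory[username] = password

    def verify(self, username: str, password: str) -> bool:
        # Answers the first/fourth verification information:
        # "are this user name and password correct?"
        stored = self.directory.get(username)
        if stored is None:
            return False
        return hmac.compare_digest(stored.encode(), password.encode())

    def lookup_password(self, username: str):
        # Answers the third verification information:
        # "return the password recorded for this user name".
        return self.directory.get(username)


def login(server: VerificationServer, username: str, password: str) -> bool:
    """Login check performed before step S101."""
    return server.verify(username, password)


def may_modify_instance_config(server: VerificationServer, username: str) -> bool:
    """Check performed after the original instance is created (step S1071)."""
    password = server.lookup_password(username)
    if password is None:
        return False
    return server.verify(username, password)
```

Under these assumptions, a user unknown to the verification server can neither log in nor modify the configuration file of an instance.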
As another aspect of the present application, an embodiment of the present application provides a visualized graph computing system, including:
a visualized graph computing platform, configured to obtain the original table data of the graph to be processed according to a first input of the user, and to generate a task workflow file of the graph to be processed according to a second input of the user, where the task workflow file includes the original table data and a plurality of task nodes, and the plurality of task nodes include at least one first task node and at least one second task node;
a big data cluster 3, configured to store the original table data of the graph, to generate, according to the at least one first task node, at least one first command for executing the first task node, to calculate the original table data to generate a table data file, and to convert the structured data of the graph to be processed into the table data of the graph to be processed; and
a graph computing server 2, configured to store the original structured data of the graph, to generate an instance of the graph to be processed according to the at least one first command and the at least one second task node, and to calculate the table data file according to the instance of the graph to be processed to generate the structured data of the graph to be processed. The visualized graph computing system provided by this embodiment of the present application includes the visualized graph computing platform, the big data cluster 3, and the graph computing server 2, where the big data cluster 3 stores the original table data of the graph and the graph computing server 2 stores the original structured data of the graph. The user selects the operators of the graph, the connection relationships between the operators, and the parameters of the operators on the visualized graph computing platform. The big data cluster 3 then generates the table data file of the graph according to the operators and parameters, and the graph computing server 2 builds a graph instance according to the operators, the connection relationships between the operators, and the configuration parameters of the operators. The graph computing server 2 next performs graph computation on the table data file in the big data cluster 3 to generate the structured data of the graph, and the big data cluster 3 converts the structured data of the graph back into table data. The user can obtain the graph structure and the structured data of the graph only on the graph computing server 2, and the table data of the graph only in the big data cluster 3. Throughout the process, the user does not need to download the table data of the graph and import it into the graph computing server 2. Whatever its type, the data is transmitted only between the back-end graph computing server 2 and the big data cluster 3 and is never visible to the user, which improves data security. In addition, when performing graph computation, the user only needs to concentrate on the logic of the graph and assemble the workflow of the graph from individual operators, which greatly reduces the cost of learning specialized graph computing knowledge and improves the efficiency of graph computation.
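The final conversion step, from the structured data of the graph back to table data, might look like the sketch below. The dictionary layout of nodes and edges and the column names (`src_id`, `relation`, `dst_id`, and so on) are assumptions chosen for illustration; the application does not specify a schema.

```python
def structured_to_table(nodes, edges):
    """Flatten graph structure data into table rows, one row per edge.

    nodes: list of {"id": ..., "label": ...}   (hypothetical layout)
    edges: list of {"src": ..., "dst": ..., "type": ...}
    """
    labels = {node["id"]: node.get("label", "") for node in nodes}
    rows = []
    for edge in edges:
        rows.append({
            "src_id": edge["src"],
            "src_label": labels[edge["src"]],
            "relation": edge["type"],
            "dst_id": edge["dst"],
            "dst_label": labels[edge["dst"]],
        })
    return rows
```

Each resulting row is a flat record that a table store such as Hive could hold, which is why the user never has to handle the graph form of the data directly.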
In an embodiment of the present application, as shown in FIG. 5, the graph computing server 2 includes a Neo4j server. Neo4j is a high-performance NoSQL graph database: an embedded, disk-based, fully transactional Java persistence engine that stores structured data in a network (called a graph in mathematical terms) rather than in tables. Neo4j can also be regarded as a high-performance graph engine with all the features of a mature database. The present application uses Neo4j as the graph database for the following reasons: for different users or different business scenarios, graph data can be isolated simply by creating a new Neo4j instance and assigning it different ports; the Neo4j server performs graph operations through Cypher statements, whose syntax is flexible and whose parameters are easy to configure, so that a variety of operators can be implemented; and the nodes and edges of the graph can be inspected through the Neo4j Browser, which facilitates analysis.
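Isolating graph data by giving each Neo4j instance its own ports can be sketched as follows. The port stride and the helper names are hypothetical; the two setting names are those used in `neo4j.conf` for Neo4j 3.x and 4.x, and other Neo4j versions may name them differently, so treat them as an assumption to check against the deployed version.

```python
def instance_ports(index, base_http=7474, base_bolt=7687, stride=10):
    """Return (http_port, bolt_port) for the index-th instance.

    The base values are Neo4j's default ports; the stride is an
    arbitrary illustrative choice that keeps the ports distinct.
    """
    return base_http + index * stride, base_bolt + index * stride


def render_conf(index):
    """Render the connector lines of a per-instance neo4j.conf fragment."""
    http_port, bolt_port = instance_ports(index)
    return "\n".join([
        f"dbms.connector.http.listen_address=:{http_port}",
        f"dbms.connector.bolt.listen_address=:{bolt_port}",
    ])
```

With this scheme each user or business scenario gets an instance index, and the generated fragment is the part of the instance configuration file that differs between instances.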
In an embodiment of the present application, as shown in FIG. 5, the big data cluster 3 uses CDH as the big data framework. CDH is Cloudera's 100% open-source platform distribution, which includes Apache Hadoop and is built to meet enterprise needs. By integrating Hadoop with more than a dozen other key open-source projects, Cloudera has created a functionally advanced system that helps users run end-to-end big data workflows. The present application uses CDH as the big data framework because CDH is based on a stable release of Apache Hadoop and includes fixes for recently discovered bugs, is easy to install and upgrade, and supports a rich set of components.
Optionally, the CDH of the big data cluster 3 in this embodiment of the present application includes the following components: HDFS, Hive, Oozie, Spark, and the like.
HDFS is a distributed file storage system that can store a large number of large files. Its main difference from other distributed file systems is that it is highly fault-tolerant and suitable for deployment on inexpensive machines. HDFS also provides high-throughput data access, which makes it well suited to applications on large-scale data sets. In this embodiment, HDFS is used to convert the structured data that the graph computing server 2 sends to the big data cluster 3 into table data.
Hive is a Hadoop-based data warehouse tool used for data extraction, transformation, and loading. It provides a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop. Hive can map structured data files to database tables, provides SQL query functionality, and can convert SQL statements into MapReduce tasks for execution. That is, after HDFS converts the structured data into table data, the table data of the graph to be processed can be stored in Hive.
Oozie is an open-source framework built on a workflow engine, contributed to Apache by Cloudera, that runs a set of jobs or processes in a specific order within a workflow. In the cluster, it is responsible for scheduling tasks on a timed basis in the order dictated by the business logic. That is, Oozie can convert the task nodes sent by the core server 1 into commands and send the commands to the graph computing server 2, that is, perform steps S1061 to S1065. The specific steps are as described above and are not repeated here.
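Running task nodes "in a specific order" amounts to ordering them by their dependencies. The sketch below illustrates that idea with Python's standard `graphlib` module rather than with Oozie itself; the node names and the `run:` command strings are hypothetical and only mirror the node types named in this application.

```python
from graphlib import TopologicalSorter


def ordered_commands(dependencies):
    """Turn task nodes into commands, predecessors first.

    dependencies maps each task node to the set of nodes that
    must finish before it may start.
    """
    order = TopologicalSorter(dependencies).static_order()
    return [f"run:{node}" for node in order]


# A linear workflow: import -> create -> compute -> export -> stop.
WORKFLOW = {
    "data_import": set(),
    "graph_create": {"data_import"},
    "graph_compute": {"graph_create"},
    "data_export": {"graph_compute"},
    "stop": {"data_export"},
}
```

Calling `ordered_commands(WORKFLOW)` yields one command per node, with the data import command first and the stop command last. `graphlib` requires Python 3.9 or later.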
Spark is a fast, general-purpose computing engine designed for large-scale data processing. Spark can perform graph computation according to the commands.
Exemplary electronic device
As a third aspect of the present application, an embodiment of the present application further provides an electronic device, including one or more processors and a memory.
The processor may be a central processing unit (CPU) or another form of processing unit having data processing capability and/or instruction execution capability, and may control other components in the electronic device to perform desired functions.
The memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor may run the program instructions to implement the visualized graph computing method of the embodiments of the present application described above and/or other desired functions. Various contents such as input signals, signal components, and noise components may also be stored in the computer-readable storage medium.
Exemplary computer program product and computer-readable storage medium
In addition to the methods and devices described above, an embodiment of the present application may also be a computer program product including computer program instructions that, when run by a processor, cause the processor to perform the steps of the visualized graph computing method according to the embodiments of the present application described in the "Exemplary method" section of this specification.
The program code of the computer program product for performing the operations of the embodiments of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
In addition, an embodiment of the present application may also be a computer-readable storage medium storing computer program instructions that, when run by a processor, cause the processor to perform the steps of the visualized graph computing method according to the various embodiments of the present application described in the "Exemplary method" section of this specification.
The computer-readable storage medium may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but is not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present application have been described above in conjunction with specific embodiments. However, it should be pointed out that the advantages, benefits, and effects mentioned in the present application are only examples rather than limitations, and they should not be regarded as required of every embodiment of the present application. In addition, the specific details disclosed above are provided only for purposes of example and ease of understanding, not limitation, and do not require that the present application be implemented using those specific details.
The block diagrams of the devices, apparatuses, equipment, and systems referred to in the present application are merely illustrative examples and are not intended to require or imply that the connections, arrangements, or configurations must be made in the manner shown in the block diagrams. As those skilled in the art will appreciate, these devices, apparatuses, equipment, and systems may be connected, arranged, and configured in any manner.
It should also be pointed out that, in the apparatuses, devices, and methods of the present application, each component or each step may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent solutions of the present application.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of the present application. Therefore, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The above descriptions are only preferred embodiments of the present application and are not intended to limit the present application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims (12)

  1. A visualized graph computing method, applied to a visualized graph computing system, the visualized graph computing system including a visualized graph computing platform, a big data cluster, and a graph computing server, wherein the big data cluster stores original table data of a graph and the graph computing server stores original structured data of the graph; the visualized graph computing method includes:
    obtaining original table data of a graph to be processed according to a first input of a user;
    generating a task workflow file of the graph to be processed according to a second input of the user, the task workflow file including the original table data and a plurality of task nodes, wherein the plurality of task nodes include at least one first task node and at least one second task node;
    generating, according to the at least one first task node, at least one first command for executing the first task node, and calculating the original table data to generate a table data file;
    generating an instance of the graph to be processed according to the at least one first command and the at least one second task node, and calculating the table data file according to the instance of the graph to be processed to generate structured data of the graph to be processed; and
    converting the structured data of the graph to be processed into table data of the graph to be processed.
  2. The visualized graph computing method according to claim 1, wherein after the obtaining of the original table data of the graph to be processed, and before the generating of the task workflow file of the graph to be processed according to the second input instruction of the user, the visualized graph computing method further includes:
    preprocessing the original table data to obtain preprocessed table data of the graph to be processed;
    wherein the generating, according to the at least one first task node, of at least one first command for executing the first task node, and the calculating of the original table data to generate a table data file, includes:
    generating, according to the at least one first task node, at least one first command for executing the first task node, and calculating the preprocessed table data to generate the table data file.
  3. The visualized graph computing method according to claim 2,
    wherein the generating of the task workflow file of the graph to be processed according to the second input of the user includes:
    generating a data import node according to a data import operator input by the user;
    generating a to-be-processed graph creation node according to a to-be-processed graph creation operator input by the user;
    generating a to-be-processed graph computing node according to a plurality of algorithm operators of the graph to be processed input by the user;
    generating a data export node according to a data export operator input by the user;
    generating a stop node according to an operator instruction, input by the user, for stopping the graph to be processed;
    generating relationships between the plurality of algorithm operators according to connection relationships between the plurality of algorithm operators input by the user;
    performing parameter configuration on the plurality of algorithm operators according to a preset configuration mode input by the user, to generate parameters of each of the algorithm operators; and
    generating the task workflow file of the graph to be processed according to a submission instruction input by the user;
    wherein the at least one first task node includes: the data import node, the to-be-processed graph computing node, the data export node, and the stop node; and
    the at least one second task node includes: the to-be-processed graph creation node.
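As a rough illustration of the preceding claim, the task workflow file could be assembled as a plain dictionary that separates the first task nodes from the second task nodes. Everything in this sketch is an assumption made for illustration: the kind names, the field names, and the choice to reference the original table by name are not specified by the application.

```python
FIRST_TASK_KINDS = {"data_import", "graph_compute", "data_export", "stop"}
SECOND_TASK_KINDS = {"graph_create"}


def build_workflow(table_name, operators, links, params):
    """Assemble a hypothetical task workflow document.

    operators: list of (operator_id, kind) pairs chosen by the user
    links:     list of (from_id, to_id) connections between operators
    params:    mapping operator_id -> parameter dict
    """
    first, second = [], []
    for op_id, kind in operators:
        node = {"id": op_id, "kind": kind, "params": params.get(op_id, {})}
        if kind in FIRST_TASK_KINDS:
            first.append(node)
        elif kind in SECOND_TASK_KINDS:
            second.append(node)
        else:
            raise ValueError(f"unknown operator kind: {kind}")
    return {
        "source_table": table_name,
        "first_task_nodes": first,
        "second_task_nodes": second,
        "links": [list(link) for link in links],
    }
```

The result can be serialized with `json.dumps` when the user submits the workflow; the split into two lists mirrors the routing of first task nodes to the big data cluster and of the graph creation node to the graph computing server.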
  4. The visualized graph computing method according to claim 3, wherein the generating, according to the at least one first task node, of at least one first command for executing the first task node, and the calculating of the preprocessed table data to generate the table data file, includes:
    generating a data import command, a graph computing command, a data export command, and a stop command according to the data import node, the to-be-processed graph computing node, the data export node, and the stop node; and
    calculating the preprocessed table data to generate the table data file.
  5. The visualized graph computing method according to claim 4, wherein the calculating of the preprocessed table data to generate the table data file includes:
    calculating, according to the data import command, point data in the preprocessed table data corresponding to the plurality of algorithm operators of the graph to be processed, and edge data corresponding to the connection relationships between the plurality of algorithm operators of the graph to be processed; and
    generating header files for the point data and the edge data according to the parameters of the algorithm operators;
    wherein the table data file includes: a plurality of point data, a plurality of edge data, and the header files.
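One way to picture a table data file made of point data, edge data, and header files is the CSV layout accepted by Neo4j's bulk import tool, whose header fields include `:ID`, `:START_ID`, `:END_ID`, and `:TYPE`. The application does not state that this layout is used, so the sketch below is an assumption, and the function and column names are hypothetical.

```python
def split_table(rows, src_col, dst_col, edge_type):
    """Derive point rows, edge rows, and header rows from table rows.

    rows: list of dicts, one per record of the preprocessed table.
    Returns (points, edges, headers).
    """
    point_ids = sorted({row[src_col] for row in rows} | {row[dst_col] for row in rows})
    points = [[point_id] for point_id in point_ids]
    edges = [[row[src_col], row[dst_col], edge_type] for row in rows]
    headers = {
        "points": ["id:ID"],
        "edges": [":START_ID", ":END_ID", ":TYPE"],
    }
    return points, edges, headers
```

Keeping the headers separate from the data rows lets the same point and edge files be reused with different header definitions, for example when an operator parameter changes the edge type.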
  6. The visualized graph computing method according to claim 5,
    wherein the generating of the instance of the graph to be processed according to the at least one first command and the at least one second task node, and the calculating of the table data file according to the instance of the graph to be processed to generate the structured data of the graph to be processed, includes:
    creating an original instance of the graph to be processed according to the to-be-processed graph creation node and the data import command;
    modifying a configuration file of the original instance of the graph to be processed to generate the instance of the graph to be processed;
    according to the graph computing command and the instance of the graph to be processed, performing calculation on the table data file, the plurality of algorithm operators of the graph to be processed, the parameters of each of the algorithm operators, and the connection relationships of the plurality of algorithm operators, to generate the structured data of the graph to be processed; and
    exporting the structured data of the graph to be processed according to the data export command.
  7. The visualized graph computing method according to claim 6, wherein before the exporting of the structured data of the graph to be processed according to the data export command, the generating of the instance of the graph to be processed according to the at least one first command and the at least one second task node, and the calculating of the table data file according to the instance of the graph to be processed to generate the structured data of the graph to be processed, further includes:
    determining whether the parameters of the algorithm operators are consistent with configuration parameters in the instance; and
    when the parameters of the algorithm operators are inconsistent with the configuration parameters in the instance, modifying the parameters of the algorithm operators.
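The consistency check in the preceding claim can be sketched as a comparison of two parameter dictionaries. The claim only says that the operator parameters are modified when they are inconsistent; that they are overwritten with the values from the instance configuration is an assumption of this sketch, as are the function and parameter names.

```python
def reconcile(operator_params, instance_config):
    """Align operator parameters with the instance configuration.

    Returns (updated_params, changed_keys). Only keys present in both
    dictionaries are compared; other operator parameters are kept as-is.
    """
    mismatched = {
        key: instance_config[key]
        for key in operator_params
        if key in instance_config and operator_params[key] != instance_config[key]
    }
    updated = {**operator_params, **mismatched}
    return updated, sorted(mismatched)
```

The list of changed keys makes it possible to report to the user which parameters were adjusted before the structured data is exported.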
  8. The visualized graph computing method according to claim 6,
    wherein before the obtaining of the original table data of the graph to be processed, the visualized graph computing method further includes:
    generating first verification information according to a user name and a password input by the user, the first verification information representing a request to verify whether the user name and the password are correct; and
    when second verification information is received, generating a first signature, the first signature being used to indicate that the user name and the password of the user are correct, and obtaining the original table data of the graph to be processed according to the input of the user, wherein the second verification information indicates that the user name and the password are correct;
    and wherein the modifying of the configuration file of the original instance of the graph to be processed to generate the instance of the graph to be processed includes:
    obtaining the user name of the user, and generating third verification information according to the user name, the third verification information representing a request to obtain the password for the user name;
    generating fourth verification information according to the user name input by the user and the password for the user name, the fourth verification information representing a request to verify whether the user name and the password are correct; and
    when fifth verification information is received, modifying the configuration file of the original instance of the graph to be processed to generate the instance of the graph to be processed, wherein the fifth verification information indicates that the user name and the password are correct.
  9. A visualized graph computing system, including:
    a visualized graph computing platform, configured to obtain original table data of a graph to be processed according to a first input of a user, and to generate a task workflow file of the graph to be processed according to a second input of the user, the task workflow file including the original table data and a plurality of task nodes, wherein the plurality of task nodes include at least one first task node and at least one second task node;
    a big data cluster, configured to store original table data of a graph, to generate, according to the at least one first task node, at least one first command for executing the first task node, and to calculate the original table data to generate a table data file; and
    a graph computing server, configured to store original structured data of the graph, to generate an instance of the graph to be processed according to the at least one first command and the at least one second task node, and to calculate the table data file according to the instance of the graph to be processed to generate structured data of the graph to be processed;
    wherein the big data cluster is further configured to convert the structured data of the graph to be processed into table data of the graph to be processed.
  10. The visualized graph computing system according to claim 9, further including:
    a verification server, configured to verify, according to the user name and the password of the user, whether the user name and the password are correct, and to query, according to the user name, the password corresponding to the user name.
  11. A computer-readable storage medium, including:
    a storage medium, the storage medium storing a computer program,
    wherein the computer program is used to execute the visualized graph computing method according to any one of claims 1 to 8.
  12. An electronic device, the electronic device including:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the visualized graph computing method according to any one of claims 1 to 8.
PCT/CN2021/092928 2020-09-18 2021-05-11 Visual graph calculation method and system, and storage medium and electronic device WO2022057279A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010984915.6A CN112256695B (en) 2020-09-18 2020-09-18 Visualized graph calculation method and system, storage medium and electronic device
CN202010984915.6 2020-09-18

Publications (1)

Publication Number Publication Date
WO2022057279A1 true WO2022057279A1 (en) 2022-03-24

Family

ID=74232875

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/092928 WO2022057279A1 (en) 2020-09-18 2021-05-11 Visual graph calculation method and system, and storage medium and electronic device

Country Status (2)

Country Link
CN (1) CN112256695B (en)
WO (1) WO2022057279A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115794064A (en) * 2022-10-25 2023-03-14 中电金信软件有限公司 Configuration method and device of task processing flow, electronic equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112256695B (en) * 2020-09-18 2023-07-28 银联商务股份有限公司 Visualized graph calculation method and system, storage medium and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105162878A (en) * 2015-09-24 2015-12-16 网宿科技股份有限公司 Distributed storage based file distribution system and method
US20180144251A1 (en) * 2016-11-23 2018-05-24 Institute For Information Industry Server and cloud computing resource optimization method thereof for cloud big data computing architecture
CN110738389A (en) * 2019-09-03 2020-01-31 深圳壹账通智能科技有限公司 Workflow processing method and device, computer equipment and storage medium
CN112256695A (en) * 2020-09-18 2021-01-22 银联商务股份有限公司 Visualized graph calculation method and system, storage medium and electronic device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050060287A1 (en) * 2003-05-16 2005-03-17 Hellman Ziv Z. System and method for automatic clustering, sub-clustering and cluster hierarchization of search results in cross-referenced databases using articulation nodes
CN108090198B (en) * 2017-12-22 2020-12-22 浙江创邻科技有限公司 Graph database creating method, graph database creating device, graph database loading device, and graph database loading medium
CN110083455B (en) * 2019-05-07 2022-07-12 网易(杭州)网络有限公司 Graph calculation processing method, graph calculation processing device, graph calculation processing medium and electronic equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115794064A (en) * 2022-10-25 2023-03-14 中电金信软件有限公司 Configuration method and device of task processing flow, electronic equipment and storage medium
CN115794064B (en) * 2022-10-25 2024-02-06 中电金信软件有限公司 Configuration method and device of task processing flow, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112256695B (en) 2023-07-28
CN112256695A (en) 2021-01-22

Similar Documents

Publication Publication Date Title
US11928596B2 (en) Platform management of integrated access of public and privately-accessible datasets utilizing federated query generation and query schema rewriting optimization
US11386218B2 (en) Platform management of integrated access of public and privately-accessible datasets utilizing federated query generation and query schema rewriting optimization
US9852196B2 (en) ETL tool interface for remote mainframes
JP7322119B2 (en) Queries to data sources on the network
US20200125530A1 (en) Data management platform using metadata repository
WO2018156551A1 (en) Platform management of integrated access datasets utilizing federated query generation and schema rewriting optimization
WO2022057279A1 (en) Visual graph calculation method and system, and storage medium and electronic device
US9992269B1 (en) Distributed complex event processing
US10185607B1 (en) Data statement monitoring and control
US11941140B2 (en) Platform management of integrated access of public and privately-accessible datasets utilizing federated query generation and query schema rewriting optimization
US9489423B1 (en) Query data acquisition and analysis
CN112948467B (en) Data processing method and device, computer equipment and storage medium
WO2018053889A1 (en) Distributed computing framework and distributed computing method
CN108319514A (en) A kind of visual scheduling system based on Slurm job managements
US11314707B1 (en) Configurable domain manager platform
US10776163B1 (en) Non-hierarchical management system for application programming interface resources
EP4280074A1 (en) Network security framework for maintaining data security while allowing remote users to perform user-driven quality analyses of the data
US11934984B1 (en) System and method for scheduling tasks
US20240054027A1 (en) Portable binary files for client side execution of federated application programming interfaces
CN117112697A (en) Data management method and related device
EP3376406A1 (en) Secure data sharing through a network application
CN111723088A (en) Method and device for pushing summary layer table

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21868117

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21868117

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 280923)
