WO2020207457A1 - Directed graph identification method, system, and server - Google Patents

Info

Publication number
WO2020207457A1
WO2020207457A1 PCT/CN2020/084119 CN2020084119W WO2020207457A1 WO 2020207457 A1 WO2020207457 A1 WO 2020207457A1 CN 2020084119 W CN2020084119 W CN 2020084119W WO 2020207457 A1 WO2020207457 A1 WO 2020207457A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
degree
directed graph
nodes
processed
Prior art date
Application number
PCT/CN2020/084119
Other languages
English (en)
French (fr)
Inventor
赵凯
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2020207457A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures

Definitions

  • This application relates to, but is not limited to, directed graph technology, in particular to a directed graph recognition method, system and server.
  • a ring is also called a loop, which is a concept in graph theory.
  • a ring is a sequence of edges such that a walk along the sequence returns to its starting point.
  • the present application provides a method and system for identifying a directed graph, and a server, which can detect whether a directed graph contains a ring.
  • the embodiment of the present invention provides a method for identifying a directed graph, including:
  • the directed graph is a directed cyclic graph.
  • the processing of the in-degree/out-degree of the direct downstream node/direct upstream node of the node whose in-degree/out-degree is 0 in the directed graph one by one until the processing of all nodes is completed includes:
  • storing the nodes whose in-degree is 0 in the directed graph in the queue of nodes to be processed; starting from the root node, deleting the node and subtracting 1 from the in-degree of each direct downstream node of the node; if the in-degree of a direct downstream node is 0 after subtracting 1, storing that downstream node in the queue of nodes to be processed, and repeating this cycle until all nodes are processed; if the in-degree of a direct downstream node is not 0 after subtracting 1, retaining that direct downstream node and its processed in-degree;
  • storing the nodes whose out-degree is 0 in the directed graph in the queue of nodes to be processed; starting from the outermost leaf node, deleting the node and subtracting 1 from the out-degree of each direct upstream node of the node; if the out-degree of a direct upstream node is 0 after subtracting 1, storing that upstream node in the queue of nodes to be processed, and repeating this cycle until all nodes are processed; if the out-degree of a direct upstream node is not 0 after subtracting 1, retaining that direct upstream node and its processed out-degree.
  • the method further includes: dividing the directed graph into multiple sub-directed graphs according to edges included in the directed graph;
  • the determining the in-degree/out-degree of each node in the directed graph includes: determining the in-degree/out-degree of each node in each sub-directed graph in a distributed manner;
  • the processing of the in-degree/out-degree of the node directly downstream/directly upstream of the node whose in-degree/out-degree is 0 in the directed graph one by one includes:
  • for each sub-directed graph, determining the in-degree/out-degree in the directed graph of each node whose in-degree/out-degree is 0 in the sub-directed graph, and, for each node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of the node's direct downstream node/direct upstream node.
  • the dividing the directed graph into multiple sub-directed graphs includes:
  • the nodes in the directed graph are randomly divided into different sub-directed graphs.
  • the dividing the directed graph into multiple sub-directed graphs includes:
  • the node identification range of the nodes included in the sub-directed graph is obtained.
  • the number of nodes included in the sub-directed graph is N = (MAX(ID) - MIN(ID))/C, where C represents the number of processing devices, MAX(ID) is the maximum value of node identification in the directed graph, and MIN(ID) is the minimum value of node identification in the directed graph;
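    The range formula above can be illustrated with a small sketch. It assumes integer node identifiers and rounds N up so that the C ranges cover every identifier; the function names `id_ranges` and `device_for` are hypothetical and do not appear in the application.

```python
import math

def id_ranges(min_id, max_id, device_count):
    """Split [min_id, max_id] into device_count contiguous ID ranges.

    width plays the role of N = (MAX(ID) - MIN(ID)) / C, rounded up
    (the rounding is an assumption made for this sketch).
    """
    width = max(1, math.ceil((max_id - min_id) / device_count))
    ranges = []
    for i in range(device_count):
        lo = min_id + i * width
        # The last device absorbs whatever is left up to max_id.
        hi = max_id if i == device_count - 1 else lo + width - 1
        ranges.append((lo, hi))
    return ranges

def device_for(node_id, min_id, max_id, device_count):
    """Return the index of the device whose range contains node_id."""
    width = max(1, math.ceil((max_id - min_id) / device_count))
    return min((node_id - min_id) // width, device_count - 1)

ranges = id_ranges(0, 99, 4)
```

    With identifiers 0 to 99 and four devices, N is 25 after rounding, so each device owns 25 consecutive identifiers. A range can come out empty when there are more devices than identifiers.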
  • the processing device for processing sub-directed graphs in the distributed system stores the node information of the sub-directed graph it processes in the queue of nodes to be processed and, starting from the root node/outermost leaf node, broadcasts in the distributed system the node information of the nodes whose in-degree/out-degree is 0 in the sub-directed graph it processes;
  • if the in-degree/out-degree of the node corresponding to the node information, obtained by statistics, is 0: deleting the node and subtracting 1 from the in-degree/out-degree of the node's direct downstream node/direct upstream node; if the in-degree/out-degree of the direct downstream node/direct upstream node is 0 after subtracting 1, storing the direct downstream node/direct upstream node in the queue of nodes to be processed and continuing to perform the broadcast step, looping until all nodes are processed; if the in-degree/out-degree of the direct downstream node/direct upstream node is not 0 after subtracting 1, retaining the direct downstream node/direct upstream node and its processed in-degree/out-degree; at the same time, sending a notification message to the other processing devices in the distributed system that process sub-directed graphs, to notify them to subtract 1 from the in-degree/out-degree of the direct downstream node/direct upstream node of the node corresponding to the node information;
  • the node is deleted.
  • the processing device for processing the sub-directed graph in the distributed system receives the broadcast message and determines whether a node corresponding to the node information in the broadcast message exists in its own sub-directed graph; if it exists, the device obtains the in-degree/out-degree of the node and returns it to the processing device that broadcast the node information in the distributed system; if it does not exist, the device returns 0 as the in-degree/out-degree of the node to the processing device that broadcast the node information;
  • the in-degree/out-degree of the direct downstream node/direct upstream node of the node corresponding to the node information contained in the notification message is reduced by 1; if the in-degree/out-degree of the direct downstream node/direct upstream node is then equal to 0, the direct downstream node/direct upstream node is stored in the queue of nodes to be processed and the broadcast step continues, looping until all nodes are processed; if the in-degree/out-degree of the direct downstream node/direct upstream node is not equal to 0, no processing is done.
  • the processing until all nodes are completed includes:
  • multiple threads are used to read node information from the queue of nodes to be processed in parallel, and multiple threads are used to perform the processing using the read node information respectively.
  • the method further includes:
  • alarm information is sent, and the alarm information includes node information of nodes included in the ring.
  • the present application also provides a computer-readable storage medium that stores computer-executable instructions, and the computer-executable instructions are used to execute the directed graph recognition method described in any one of the above.
  • the present application further provides a device for realizing directed graph recognition, including a memory and a processor, wherein the memory stores instructions that can be executed by the processor for executing the steps of any of the directed graph identification methods described above.
  • the application also provides a server, including a first processing module and a second processing module; wherein,
  • the first processing module is used to determine the in-degree/out-degree of each node in the directed graph
  • the second processing module is used to process, one by one, the in-degree/out-degree of the direct downstream node/direct upstream node of each node whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed; if there are nodes whose in-degree/out-degree is not 0 in the directed graph, this indicates that the directed graph is a directed cyclic graph.
  • processing, one by one, the in-degree/out-degree of the direct downstream node/direct upstream node of each node whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed, includes:
  • storing the nodes whose in-degree is 0 in the directed graph in the queue of nodes to be processed; starting from the root node, deleting the node and subtracting 1 from the in-degree of each direct downstream node of the node; if the in-degree of a direct downstream node is 0 after subtracting 1, storing that downstream node in the queue of nodes to be processed, and repeating this cycle until all nodes are processed; if the in-degree of a direct downstream node is not 0 after subtracting 1, retaining that direct downstream node and its processed in-degree;
  • storing the nodes whose out-degree is 0 in the directed graph in the queue of nodes to be processed; starting from the outermost leaf node, deleting the node and subtracting 1 from the out-degree of each direct upstream node of the node; if the out-degree of a direct upstream node is 0 after subtracting 1, storing that upstream node in the queue of nodes to be processed, and repeating this cycle until all nodes are processed; if the out-degree of a direct upstream node is not 0 after subtracting 1, retaining that direct upstream node and its processed out-degree.
  • This application also provides another server, including a preprocessing module, a processing module, and an interaction module;
  • the preprocessing module is used to determine the in-degree/out-degree of each node in the sub-directed graph corresponding to its own server;
  • the interaction module is used to broadcast the node information of nodes in the sub-directed graph corresponding to the server to which the in-degree/out-degree is 0 in the distributed system where the server belongs to, starting from the root node/outermost branch and leaf node; Receiving the in-degree/out-degree of the node corresponding to the broadcasted node information returned from other servers in the distributed system where the server belongs;
  • the processing module is used to store the information of the nodes whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its own server in the queue of nodes to be processed; to determine the in-degree/out-degree in the directed graph of each node whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its own server; and, for each node whose in-degree/out-degree in the directed graph is 0, to process the in-degree/out-degree of the node's direct downstream node/direct upstream node until all nodes are processed; if there is a node whose in-degree/out-degree is not 0 in the sub-directed graph, this indicates that the directed graph to which the sub-directed graph belongs is a directed cyclic graph.
  • the interaction module implements the broadcast through any of the following message communication protocols: Hypertext Transfer Protocol (HTTP), Hypertext Transfer Protocol Secure (HTTPS), and Remote Procedure Call (RPC) protocol.
  • the processing module determining the in-degree/out-degree in the directed graph of each node whose in-degree/out-degree is 0 in the sub-directed graph corresponding to the server to which it belongs, and, for each node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of the node's direct downstream node/direct upstream node until all nodes are processed, includes:
  • if the in-degree/out-degree of the node corresponding to the node information, obtained by statistics, is 0, deleting the node and subtracting 1 from the in-degree/out-degree of the node's direct downstream node/direct upstream node; if the in-degree/out-degree of the direct downstream node/direct upstream node becomes 0 after subtracting 1, storing the direct downstream node/direct upstream node in the queue of nodes to be processed and continuing to repeat the broadcast step; if the in-degree/out-degree of the direct downstream node/direct upstream node is not 0 after subtracting 1, retaining the direct downstream node/direct upstream node and its processed in-degree/out-degree; at the same time, sending notification messages to the other processing devices in the distributed system that process sub-directed graphs, to notify them to subtract 1 from the in-degree/out-degree of the direct downstream node/direct upstream node of the node corresponding to the node information;
  • the node is deleted.
  • the processing module determining the in-degree/out-degree in the directed graph of each node whose in-degree/out-degree is 0 in the sub-directed graph corresponding to the server to which it belongs, and, for each node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of the node's direct downstream node/direct upstream node until all nodes are processed, includes:
  • upon receiving the broadcast message, determining whether a node corresponding to the node information in the broadcast message exists in its own sub-directed graph; if it exists, obtaining the in-degree/out-degree of the node and returning it to the processing device that broadcast the node information in the distributed system; if it does not exist, returning 0 as the in-degree/out-degree of the node to the processing device that broadcast the node information;
  • the in-degree/out-degree of the direct downstream node/direct upstream node of the node corresponding to the node information contained in the notification message is reduced by 1; if the in-degree/out-degree of the direct downstream node/direct upstream node is then equal to 0, the direct downstream node/direct upstream node is stored in the queue of nodes to be processed and the broadcast step continues, looping until all nodes are processed; if the in-degree/out-degree of the direct downstream node/direct upstream node is not equal to 0, no processing is done.
  • the processing of all nodes in the processing module includes:
  • the node information is read from the queue of nodes to be processed every preset period. If the node information is not read continuously for a preset number of times, it means that the processing of all nodes is completed.
  • the processing module uses multiple threads to process nodes in the queue of nodes to be processed in parallel.
  • the processing module is further used for:
  • alarm information is sent, and the alarm information includes node information of nodes included in the ring.
  • This application also provides a directed graph recognition system, including: a distributed system with more than two servers; among them,
  • the sub-directed graph corresponding to each server in the distributed system is preset.
  • the directed graph processing system further includes: a control node, configured to divide the directed graph into a plurality of sub-directed graphs according to the edges included in the directed graph.
  • control node is specifically configured to:
  • the nodes in the directed graph are randomly divided into different sub-directed graphs; the information of the sub-directed graph is sent to each server in the distributed system; or,
  • the node identification range of the nodes included in the sub-directed graph is obtained; the information of the sub-directed graph is sent to each server in the distributed system.
  • control node is an independent entity device, or is set on any server in the distributed system.
  • the server includes the other server described above.
  • it includes: determining the in-degree/out-degree of each node in the directed graph; processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed; if there are nodes in the directed graph whose in-degree/out-degree is not 0, this indicates that the directed graph is a directed cyclic graph.
  • the embodiment of the application realizes the inspection of the directed graph.
  • it includes: dividing the directed graph into multiple sub-directed graphs according to the edges included in the directed graph; determining the in-degree/out-degree of each node in each sub-directed graph in a distributed manner; for each sub-directed graph, determining the in-degree/out-degree in the directed graph of each node whose in-degree/out-degree is 0 in the sub-directed graph, and, for nodes whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of the node's direct downstream node/direct upstream node until all nodes are processed; if there is a node in the sub-directed graph whose in-degree/out-degree is not 0, this indicates that the directed graph is a directed cyclic graph.
  • processing devices in the distributed system of the present application such as a server, use multiple threads to process the in-degree numbers of nodes in the directed graph, which further improves the processing efficiency.
  • FIG. 1 is a schematic flowchart of an embodiment of a method for identifying a directed graph of this application
  • FIG. 2 is a schematic flowchart of another embodiment of the method for identifying a directed graph of this application.
  • Figure 3(a) is a schematic diagram of an embodiment of a directed graph before the application is divided;
  • Figure 3(b) is a schematic diagram of an embodiment of a directed graph after the application is divided;
  • Figure 4 is a schematic diagram of the structure of the application directed graph recognition system
  • FIG. 5 is a schematic diagram of the composition structure of another server of this application.
  • the computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-permanent memory in computer readable media, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer readable media.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage can be realized by any method or technology.
  • the information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, Magnetic cassettes, magnetic tape storage or other magnetic storage devices or any other non-transmission media can be used to store information that can be accessed by computing devices.
  • computer-readable media does not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
  • a graph (or network) consists of some vertices (also called nodes) and the edges that connect them (also called links).
  • the number of all edges (also known as links) connected to a vertex (also called a node) is the degree of that vertex.
  • In-degree and out-degree are important concepts in graph theory algorithms. In-degree usually refers to the number of times a node in a directed graph is the end point of an edge; therefore, the in-degree of a node is the number of edges that enter the node. Out-degree usually refers to the number of times a node is the starting point of an edge in a directed graph; therefore, the out-degree of a node is the number of edges that leave the node.
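    As a minimal illustration of these definitions, the following sketch counts in-degree and out-degree from a list of (source, target) edges. The function name `degrees` and the three-edge sample graph are hypothetical and are not taken from the application.

```python
def degrees(edges):
    """Return (in_degree, out_degree) dicts for a list of (source, target) edges."""
    in_deg, out_deg = {}, {}
    for src, dst in edges:
        # Make sure both end points appear in both tables, even with degree 0.
        for node in (src, dst):
            in_deg.setdefault(node, 0)
            out_deg.setdefault(node, 0)
        out_deg[src] += 1  # src is the starting point of this edge
        in_deg[dst] += 1   # dst is the end point of this edge
    return in_deg, out_deg

# Hypothetical sample graph: A -> B, A -> C, B -> C
in_deg, out_deg = degrees([("A", "B"), ("A", "C"), ("B", "C")])
```

    In this sample, A has in-degree 0 (a root node) and C has out-degree 0 (a leaf node).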
  • the identification of the directed graph in this application includes at least:
  • Step 100 Determine the in-degree/out-degree of each node in the directed graph.
  • Step 101 Process the in-degree/out-degree of the node directly downstream/directly upstream of the node whose in-degree/out-degree is 0 in the directed graph one by one until all nodes are processed.
  • this step may include:
  • Step 102 If there are nodes whose in-degree/out-degree is not 0 in the directed graph, it indicates that the directed graph is a directed cyclic graph.
  • the in-degree/out-degree indicates the in-degree or the out-degree
  • the direct downstream node/direct upstream node indicates the direct downstream node or the directly upstream node.
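    Steps 100 to 102 can be sketched on a single machine as follows. This is a minimal sketch under stated assumptions, not the application's implementation: the graph is given as an in-memory edge list, and the names `find_unprocessed_nodes` and `use_out_degree` are hypothetical. The out-degree variant is obtained here by reversing every edge, which is equivalent to peeling leaf nodes.

```python
from collections import deque

def find_unprocessed_nodes(edges, use_out_degree=False):
    """Peel nodes of degree 0; return the nodes (with degrees) that remain.

    An empty result means no ring was found; a non-empty result means
    the directed graph is cyclic.
    """
    if use_out_degree:
        # Peeling out-degree-0 nodes equals peeling in-degree-0 nodes
        # on the graph with every edge reversed.
        edges = [(dst, src) for src, dst in edges]

    downstream, degree = {}, {}
    for src, dst in edges:                      # Step 100: count degrees
        downstream.setdefault(src, []).append(dst)
        downstream.setdefault(dst, [])
        degree.setdefault(src, 0)
        degree[dst] = degree.get(dst, 0) + 1

    pending = deque(n for n, d in degree.items() if d == 0)
    while pending:                              # Step 101: process one by one
        node = pending.popleft()
        del degree[node]                        # delete the processed node
        for nxt in downstream[node]:
            degree[nxt] -= 1
            if degree[nxt] == 0:
                pending.append(nxt)
    return degree                               # Step 102: non-empty => cyclic

acyclic = [("A", "B"), ("B", "C")]
cyclic = [("A", "B"), ("B", "C"), ("C", "B"), ("C", "D")]
```

    Note that the remaining nodes are not only the nodes on the ring: with in-degree peeling, a node downstream of a ring (D in the cyclic sample) also keeps a non-zero in-degree.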
  • a breadth-first search (BFS) algorithm or a depth-first search (DFS) algorithm can be used on a single server to traverse all paths of the graph, storing the explored nodes in a list; when a newly discovered node is found to exist in the list already, the graph can be judged to be a cyclic graph.
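    For comparison, a single-server traversal check might look like the sketch below. It uses the standard three-state DFS rather than a plain visited list, because in a directed graph a node reached twice along different paths does not by itself form a ring; only a node found again on the current path does. The function name `has_cycle_dfs` and the adjacency-dictionary input are assumptions for illustration.

```python
def has_cycle_dfs(adjacency):
    """Return True if the directed graph {node: [downstream nodes]} has a ring."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {node: WHITE for node in adjacency}

    def visit(node):
        color[node] = GRAY
        for nxt in adjacency.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:          # reached a node already on this path
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(adjacency))

diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
triangle = {"A": ["B"], "B": ["C"], "C": ["A"]}
```

    The recursion depth grows with the longest path, which is one reason this style of traversal becomes awkward on very large graphs.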
  • with the technical solutions provided in the related art, on the one hand, a single traversal of a large graph structure takes a long time, resulting in low efficiency; on the other hand, once the amount of data reaches a certain magnitude, they can no longer be used.
  • this application proposes, in an exemplary example, assigning the several sub-directed graphs obtained by dividing the directed graph to several processing devices, such as servers, in the distributed system.
  • FIG. 2 is a schematic flowchart of an embodiment of the method for identifying a directed graph of this application, as shown in FIG. 2, including:
  • Step 200 Divide the directed graph into multiple sub-directed graphs according to the edges included in the directed graph.
  • this step may further include: triggering the processing of the directed graph.
  • step 200 is entered.
  • dividing the directed graph into multiple sub-directed graphs in step 200 includes:
  • the nodes in the directed graph are randomly divided into different sub-directed graphs. The only requirement is that the edges included in the sub-directed graphs are not repeated.
  • a large-scale directed graph consists of 6 million nodes and 10 million edges.
  • the large-scale directed graph can be randomly divided into 20 to 100 sub-directed graphs, each including 100,000 to 500,000 edges.
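    A random division that never repeats an edge can be sketched as below. Each edge is assigned to exactly one sub-directed graph, so a node may appear in several sub-directed graphs while every edge appears in only one. The function name `split_edges_randomly` and the `seed` parameter are hypothetical.

```python
import random

def split_edges_randomly(edges, part_count, seed=None):
    """Assign every edge to exactly one of part_count sub-directed graphs."""
    rng = random.Random(seed)          # seed only makes the sketch repeatable
    parts = [[] for _ in range(part_count)]
    for edge in edges:
        parts[rng.randrange(part_count)].append(edge)
    return parts

edges = [("A", "B"), ("B", "C"), ("C", "D"), ("C", "E"), ("E", "I")]
parts = split_edges_randomly(edges, part_count=2, seed=0)
```

    For the 10-million-edge example above, choosing between 20 and 100 parts gives sub-directed graphs of roughly 500,000 down to 100,000 edges each.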
  • dividing the directed graph into multiple sub-directed graphs in step 200 includes:
  • the node ID range of the nodes included in the sub-directed graph is obtained.
  • a distributed system includes several processing devices such as servers for processing sub-directed graphs, and each processing device corresponds to the distributed system.
  • the number is the number of sub-directed graphs, MAX(ID) is the maximum value of node ID in the directed graph, MIN(ID) is the minimum value of node ID in the directed graph, then the node of the node included in the sub-directed graph
  • Figure 3(a) is a schematic diagram of an embodiment of the directed graph before the application is divided.
  • the directed graph shown in Figure 3(a) includes node A, node B, node C, node D, node E, node F.
  • the connected edge of each node in the directed graph is shown by the arrow line in Figure 3(a).
  • in step 200, suppose that the directed graph shown in Figure 3(a) is divided into two sub-directed graphs.
  • the sub-directed graph on the left includes 7 nodes and 6 edges: node A, node B, node C, node D, node E, node I, node J, and the edges of each node;
  • the sub-directed graph on the right includes 8 nodes and 6 edges: node C, node E, node F, node G, node H, node I, node M, node N, and the edges of each node.
  • Step 201 Use a distributed system to determine the in-degree/out-degree of each node in each sub-directed graph.
  • this step may include:
  • the processing device used to process the sub-directed graph in a distributed system determines the in-degree/out-degree of each node in the sub-directed graph according to the nodes included in the sub-directed graph and the edges connected to the nodes.
  • node A is the root node and its in-degree is 0, the in-degree of node B is 1, the in-degree of node C is 1, the in-degree of node D is 2, the in-degree of node I is 1, the in-degree of node J is 1, and the in-degree of node E is 0.
  • node M is a root node and its in-degree is 0, node N is a root node and its in-degree is 0, the in-degree of node E is 1, the in-degree of node F is 1, the in-degree of node H is 2, the in-degree of node I is 0,
  • Step 202 For each sub-directed graph, determine the in-degree/out-degree of the node in the sub-directed graph whose in-degree/out-degree is 0 in the directed graph, and for the in-degree/out-degree in the directed graph For the node with 0, the in-degree/out-degree of the direct downstream node/direct upstream node of the node is processed until all nodes are processed.
  • the in-degree/out-degree indicates the in-degree or the out-degree
  • the direct downstream node/direct upstream node indicates the direct downstream node or the directly upstream node.
  • this step may include:
  • the processing device used to process the sub-directed graph in the distributed system stores the node information of the sub-directed graph it processes in the queue of nodes to be processed and, starting from the root node, broadcasts in the distributed system the node information, such as the node ID, of a first node whose in-degree is 0 in the sub-directed graph it processes; accordingly, the other processing devices used to process sub-directed graphs in the distributed system, such as servers, after receiving the broadcast message, determine whether the first node corresponding to the node information exists in their own sub-directed graphs; if it exists, they obtain the in-degree of the first node and return it to the server that broadcast the node information; if it does not exist, they return 0 as the in-degree of the first node;
  • if the in-degree obtained by statistics is 0, then, on the one hand, the first node is deleted from the device's own memory and 1 is subtracted from the in-degree of the direct downstream node of the first node; if the in-degree of the direct downstream node of the first node is 0 after subtracting 1, the downstream node is stored in the queue of nodes to be processed and the above broadcast processing continues, looping until all nodes are processed; if the in-degree of the direct downstream node of the first node is not 0 after subtracting 1, the direct downstream node of the first node and its processed in-degree are retained; on the other hand, at the same time, a notification message is sent to the other processing devices in the distributed system that process sub-directed graphs, such as servers, to notify them to subtract 1 from the in-degree of the direct downstream node of the first node; if the in-degree obtained by statistics is not 0, the first node is deleted;
  • the other processing devices used to process sub-directed graphs in the distributed system, upon receiving the above notification message, subtract 1 from the in-degree of the direct downstream node of the first node, such as a second node; if the in-degree of the second node is then equal to 0, the second node is stored in the queue of nodes to be processed and the broadcast step continues, looping until all nodes are processed; if the in-degree of the second node is not equal to 0, no processing is done.
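    The broadcast and notification exchange described above can be simulated in a single process, as a sketch only. Here each `Server` object stands for one processing device, a direct method call stands in for a broadcast or notification message, and the in-degree of a node in the whole directed graph is taken as the sum of the in-degrees reported by all servers. The class and function names, and the two small sub-directed graphs, are hypothetical and are not the graph of Figure 3.

```python
from collections import deque

class Server:
    """One processing device holding the edges of one sub-directed graph."""
    def __init__(self, edges):
        self.downstream, self.in_deg = {}, {}
        for src, dst in edges:
            self.downstream.setdefault(src, []).append(dst)
            self.in_deg.setdefault(src, 0)
            self.in_deg[dst] = self.in_deg.get(dst, 0) + 1

    def local_in_degree(self, node):
        # Reply to a broadcast: 0 if the node is not in this sub-directed graph.
        return self.in_deg.get(node, 0)

    def remove(self, node):
        """Handle a notification: delete node, decrement its local downstream
        nodes, and return those whose local in-degree dropped to 0."""
        self.in_deg.pop(node, None)
        ready = []
        for nxt in self.downstream.pop(node, []):
            self.in_deg[nxt] -= 1
            if self.in_deg[nxt] == 0:
                ready.append(nxt)
        return ready

def remaining_nodes(servers):
    """Return the nodes left on any server after processing (empty => acyclic)."""
    pending = deque(n for s in servers for n, d in s.in_deg.items() if d == 0)
    while pending:
        node = pending.popleft()
        # "Broadcast": add up the in-degree reported by every server.
        if sum(s.local_in_degree(node) for s in servers) != 0:
            continue  # still has an upstream edge elsewhere; drop from queue
        for s in servers:
            pending.extend(s.remove(node))
    return {n for s in servers for n in s.in_deg}

left = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "C")]   # contains ring C <-> D
right = [("C", "E"), ("E", "F")]
with_ring = remaining_nodes([Server(left), Server(right)])
without_ring = remaining_nodes([Server(left[:3]), Server(right)])
```

    In the sample with the ring, node C is queued by the right-hand server because its local in-degree there is 0, but the summed in-degree is 2, so it is dropped from the queue and later remains in memory together with D, E, and F.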
  • a processing device, such as a server, used to process sub-directed graphs in a distributed system may use multiple threads to read node information from the queue of nodes to be processed in parallel, and use multiple threads to perform subsequent processing with the node information that was read.
  • node information can be read from the queue of nodes to be processed according to a preset cycle.
  • the relationship mapping table may be used to obtain the downstream nodes of the read node.
  • the relationship mapping table may take the information of each node as the main body, and store its upstream and/or downstream relationship and its own degree information.
  • threads are safe.
  • Thread safety here means that multi-threaded concurrency does not cause in-degree calculation errors.
  • the process until all nodes are processed can include:
  • Multiple threads read node information from the queue of nodes to be processed every preset period, such as 1 s. If no node information is read for a preset number of consecutive reads, such as 10 or more, it means that processing of the entire directed graph is complete.
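    The polling-based termination rule can be sketched with Python's standard `queue` and `threading` modules. This is an illustrative sketch under stated assumptions: the function name and parameters are hypothetical, a single lock guards the in-degree table to keep the updates thread-safe, and the defaults use a 0.01 s period and 3 empty reads instead of the 1 s and 10 reads mentioned above so that the example finishes quickly.

```python
import queue
import threading

def process_with_threads(downstream, in_deg, thread_count=4,
                         period=0.01, max_empty_reads=3):
    """Peel in-degree-0 nodes with several worker threads.

    Each worker stops after max_empty_reads consecutive empty reads of the
    queue of nodes to be processed. Returns the nodes that remain.
    """
    pending = queue.Queue()
    lock = threading.Lock()            # protects in_deg against concurrent updates
    for node, d in list(in_deg.items()):
        if d == 0:
            pending.put(node)

    def worker():
        empty_reads = 0
        while empty_reads < max_empty_reads:
            try:
                node = pending.get(timeout=period)   # one read per period
            except queue.Empty:
                empty_reads += 1
                continue
            empty_reads = 0
            with lock:
                del in_deg[node]
                for nxt in downstream.get(node, ()):
                    in_deg[nxt] -= 1
                    if in_deg[nxt] == 0:
                        pending.put(nxt)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return in_deg

dag_result = process_with_threads(
    {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []},
    {"A": 0, "B": 1, "C": 1, "D": 2})
ring_result = process_with_threads(
    {"A": ["B"], "B": ["C"], "C": ["B"]},
    {"A": 0, "B": 2, "C": 1})
```

    A worker that has just read a node resets its empty-read counter, so the worker that enqueues new nodes is always still running to pick them up even if other workers have already stopped.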
  • Step 203 If there is a node whose in-degree/out-degree is not 0 in the sub-directed graph, it indicates that the directed graph is a directed cyclic graph.
  • the in-degree/out-degree of every node in a ring can never become 0. Therefore, if the number of nodes remaining in memory is not 0, that is, nodes still exist, the directed graph is a ring graph; if the number of nodes remaining in memory is 0, that is, no nodes exist, the directed graph is not a ring graph.
  • For server 1, first, the node information of node A is stored in the queue of nodes to be processed and broadcast; server 2, after receiving the broadcast message, determines that its own sub-directed graph does not include node A, so server 2 returns 0 to server 1 as the in-degree of node A; server 1 then counts the in-degree of node A in the entire directed graph.
  • server 1 deletes node A, subtracts 1 from the in-degree of node B, the direct downstream node of node A, and at the same time notifies server 2 to subtract 1 from the in-degree of the direct downstream node of node A; then, on server 1, since the in-degree of node B is 0 after processing, node B is placed in the queue of nodes to be processed and the broadcast process is repeated; on server 2, there is no direct downstream node of node A, so no processing is done. The processing of node B is similar to that of node A and is not repeated here.
  • When server 1 subtracts 1 from the in-degree of node C so that it becomes 0, it broadcasts the node information of node C, as shown by the thick rightward arrow in Figure 3(b); server 2 returns 0 to server 1 as the in-degree of node C; server 1 then subtracts 1 from the in-degree of node D, a direct downstream node of node C, so that the in-degree of node D changes from 2 to 1; accordingly, server 2 subtracts 1 from the in-degree of node E, the other direct downstream node of node C; since the in-degree of node E is 0 after processing, server 2 puts node E in the queue of nodes to be processed, and the above broadcast process is repeated.
  • When server 2 subtracts 1 from the in-degree of node E so that it becomes 0, it broadcasts the node information of node E, as shown by the thick leftward arrow in Figure 3(b); server 1 returns 0 to server 2 as the in-degree of node E; then, in server 2, node E has no direct downstream node, so no processing is done; accordingly, server 1 subtracts 1 from the in-degree of node I, the direct downstream node of node E; since the in-degree of node I is 0 after processing, node I is put into the queue of nodes to be processed, and the above broadcast process is repeated.
  • server 1 will find that no node information has been read after the preset number of consecutive reads of the queue of nodes to be processed; the in-degree of node D in server 1 has remained at 1, that is, there are nodes whose in-degree is not 0 in the directed graph, so the directed graph is a directed cyclic graph. It can also be seen from Fig. 3(b) that there are indeed rings in the directed graph.
  • processing will also start from the root node such as node M. Nodes with an in-degree of 0 will be put into the queue of nodes to be processed, and the server can process the queue of nodes to be processed according to first-in first-out.
  • The method of the present application may further include:
  • Sending alarm information, which may include the node information of the nodes contained in the cycle; optionally, the alarm information may be sent by SMS or similar means.
  • The present application also provides a computer-readable storage medium that stores computer-executable instructions for executing any one of the above directed graph recognition methods.
  • The present application further provides a device for directed graph recognition, including a memory and a processor, where the memory stores instructions executable by the processor for performing the steps of any one of the above directed graph recognition methods.
  • Figure 4 is a schematic diagram of the architecture of the directed graph recognition system of this application.
  • In the embodiment shown in Figure 4, the distributed system includes three servers (for example, a first server, a second server, and a third server) only as an example, which does not limit the protection scope of this application.
  • As shown in Figure 4, the system includes a distributed system of two or more servers, where each server is used for:
  • Processing the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of each node whose in-degree/out-degree in the directed graph is 0, until all nodes are processed; if a node whose in-degree/out-degree is not 0 exists in the sub-directed graph, the directed graph to which the sub-directed graph belongs is a directed cyclic graph.
  • The sub-directed graph corresponding to each server in the distributed system is preset.
  • The directed graph recognition system further includes a control node (not shown in Figure 3(a) and Figure 3(b)) for dividing the directed graph into multiple sub-directed graphs according to the edges included in the directed graph.
  • The control node may be specifically used for:
  • Randomly dividing the nodes of the directed graph into different sub-directed graphs according to a preset number of edges per sub-directed graph, and sending the information of the sub-directed graphs to each server; or
  • Obtaining the node ID range of the nodes included in each sub-directed graph, and sending the information of the sub-directed graphs to each server.
  • The control node is also used to trigger directed graph processing.
  • Optionally, the division into sub-directed graphs may be triggered when a preset period elapses or by a manually configured trigger.
  • The control node may also be deployed on any server in the distributed system.
  • Each server in the directed graph recognition system shown in Figure 4 may include, but is not limited to, a preprocessing module, a processing module, and an interaction module.
  • The preprocessing module is used to determine the in-degree/out-degree of each node in the sub-directed graph corresponding to its server.
  • The interaction module is used to broadcast, within the distributed system to which its server belongs and starting from the root node/outermost leaf node, the node information of the nodes whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its server; and to receive, from the other servers in that distributed system, the in-degree/out-degree of the node corresponding to the broadcast node information.
  • The processing module is used to store the node information of the nodes whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its server in the to-be-processed node queue; to determine the in-degree/out-degree in the directed graph of each such node; and, for a node whose in-degree/out-degree in the directed graph is 0, to process the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed. If a node whose in-degree/out-degree is not 0 exists in the sub-directed graph, the directed graph to which the sub-directed graph belongs is a directed cyclic graph.
  • The interaction module is used for message communication between servers. The message communication protocol may include, but is not limited to, Hypertext Transfer Protocol (HTTP), Hypertext Transfer Protocol Secure (HTTPS), and Remote Procedure Call (RPC) protocols.
  • The processing module determining the in-degree/out-degree in the directed graph of a node whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its server, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed, may include:
  • If the totaled in-degree/out-degree of the node corresponding to the node information is 0, deleting the node and subtracting 1 from the in-degree/out-degree of the node's direct downstream nodes/direct upstream nodes.
  • If the in-degree/out-degree of a direct downstream node/direct upstream node becomes 0 after the subtraction, storing that node in the to-be-processed node queue and continuing with the broadcasting step.
  • If the in-degree/out-degree of a direct downstream node/direct upstream node is not 0 after the subtraction, retaining that node together with its updated in-degree/out-degree; at the same time, sending a notification message to the other processing devices in the distributed system that process sub-directed graphs, to notify the other servers to subtract 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information.
  • If the totaled in-degree/out-degree of the node corresponding to the node information is not 0, deleting the node.
  • The processing module determining the in-degree/out-degree in the directed graph of a node whose in-degree is 0 in the sub-directed graph corresponding to its server, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed, may include:
  • On receiving a notification message, subtracting 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information carried in the notification message. If the in-degree/out-degree of a direct downstream node/direct upstream node is then equal to 0, that node is stored in the to-be-processed node queue and the broadcasting step continues, repeating until all nodes are processed; if it is not equal to 0, nothing is done.
  • The processing module may read node information from the to-be-processed node queue at a preset interval.
  • The processing module uses multiple threads to process the nodes in the to-be-processed node queue in parallel.
  • In the processing module, processing until all nodes are processed may include: multiple threads read node information from the to-be-processed node queue at a preset interval, such as 1 s; if no node information is read for a preset number of consecutive reads, such as 10 or more, the entire graph has been processed.
  • The processing module is further used for:
  • Sending alarm information, which may include the node information of the nodes contained in the cycle; optionally, the alarm information may be sent by SMS or similar means.
  • Figure 5 is a schematic diagram of the composition of another server of this application. As shown in Figure 5, the server includes a first processing module and a second processing module, where:
  • The first processing module is used to determine the in-degree/out-degree of each node in the directed graph.
  • The second processing module is used to process, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed; if a node whose in-degree/out-degree is not 0 exists in the directed graph, the directed graph is a directed cyclic graph.
  • In the second processing module, processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed, includes:
  • Storing the nodes whose in-degree is 0 in the directed graph in the to-be-processed node queue; starting from a root node, deleting the node and subtracting 1 from the in-degree of each of its direct downstream nodes. If the in-degree of a direct downstream node becomes 0 after the subtraction, that downstream node is stored in the to-be-processed node queue, and the cycle repeats until all nodes are processed. If the in-degree is not 0 after the subtraction, the direct downstream node is retained together with its updated in-degree; or
  • Storing the nodes whose out-degree is 0 in the directed graph in the to-be-processed node queue; starting from an outermost leaf node, deleting the node and subtracting 1 from the out-degree of each of its direct upstream nodes. If the out-degree of a direct upstream node becomes 0 after the subtraction, that upstream node is stored in the to-be-processed node queue, and the cycle repeats until all nodes are processed. If the out-degree is not 0 after the subtraction, the direct upstream node is retained together with its updated out-degree.
  • The second processing module is also used for:
  • Sending alarm information that includes the node information of the nodes contained in the cycle.
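The out-degree variant described in the bullets above can be sketched on a single machine as follows. This is a minimal illustration, not the application's implementation: the function name `has_cycle_by_out_degree` and the edge-list input format are assumptions made for the example.

```python
from collections import deque


def has_cycle_by_out_degree(edges):
    """Peel nodes with out-degree 0 (leaves) and report what is left.

    `edges` is a list of (source, target) pairs; hypothetical input format.
    Returns (is_cyclic, leftover_nodes).
    """
    nodes = {n for edge in edges for n in edge}
    out_degree = {n: 0 for n in nodes}
    upstream = {n: [] for n in nodes}
    for src, dst in edges:
        out_degree[src] += 1
        upstream[dst].append(src)

    # Queue of nodes to be processed: every node with out-degree 0.
    pending = deque(n for n in nodes if out_degree[n] == 0)
    remaining = set(nodes)
    while pending:
        node = pending.popleft()
        remaining.discard(node)  # "delete the node"
        for parent in upstream[node]:
            out_degree[parent] -= 1
            if out_degree[parent] == 0:
                pending.append(parent)
    return bool(remaining), remaining
```

Note that the leftover set contains the nodes on the cycle and also any node upstream of the cycle, because such a node can never lose its last outgoing edge.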

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

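This application discloses a directed graph recognition method, system, and server. The method includes: determining the in-degree/out-degree of each node in a directed graph; processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed; and, if a node whose in-degree/out-degree is not 0 exists in the directed graph, determining that the directed graph is a directed cyclic graph. This application thereby implements checking for cyclic graphs.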

Description

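Directed graph recognition method, system, and server
This application claims priority to Chinese patent application No. 201910293860.1, filed on April 12, 2019 and entitled "Directed graph recognition method, system, and server", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to, but is not limited to, directed graph technology, and in particular to a directed graph recognition method, system, and server.
Background
In a workflow scheduling engine operating on large volumes of data, a workflow whose graph contains a ring cannot be scheduled normally, because the loop repeats without end, and the tasks downstream of the ring are ultimately never scheduled to run. Here, a ring, also called a loop or cycle, is a graph-theory concept: a ring is a sequence of edges such that walking along the sequence once returns to the starting point.
Summary
This application provides a directed graph recognition method, system, and server that can check for cyclic graphs.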
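An embodiment of the invention provides a directed graph recognition method, including:
determining the in-degree/out-degree of each node in a directed graph;
processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed;
if a node whose in-degree/out-degree is not 0 exists in the directed graph, the directed graph is a directed cyclic graph.
In an illustrative example, processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed, includes:
storing the nodes whose in-degree is 0 in the directed graph in a to-be-processed node queue; starting from a root node, deleting the node and subtracting 1 from the in-degree of each direct downstream node of the node; if the in-degree of a direct downstream node becomes 0 after the subtraction, storing that downstream node in the to-be-processed node queue, and repeating this cycle until all nodes are processed; if the in-degree of a direct downstream node is not 0 after the subtraction, retaining that direct downstream node together with its updated in-degree;
or,
storing the nodes whose out-degree is 0 in the directed graph in a to-be-processed node queue; starting from an outermost leaf node, deleting the node and subtracting 1 from the out-degree of each direct upstream node of the node; if the out-degree of a direct upstream node becomes 0 after the subtraction, storing that upstream node in the to-be-processed node queue, and repeating this cycle until all nodes are processed; if the out-degree of a direct upstream node is not 0 after the subtraction, retaining that direct upstream node together with its updated out-degree.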
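In an illustrative example, the method is preceded by: dividing the directed graph into multiple sub-directed graphs according to the edges included in the directed graph;
determining the in-degree/out-degree of each node in the directed graph includes: determining, in a distributed manner, the in-degree/out-degree of each node in each sub-directed graph;
processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph includes:
for each sub-directed graph, determining the in-degree/out-degree in the directed graph of each node whose in-degree/out-degree is 0 in the sub-directed graph, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of that node.
In an illustrative example, dividing the directed graph into multiple sub-directed graphs includes:
randomly dividing the nodes of the directed graph into different sub-directed graphs according to a preset number of edges per sub-directed graph.
In an illustrative example, dividing the directed graph into multiple sub-directed graphs includes:
calculating the number of nodes included in a sub-directed graph from the node identifiers of the nodes in the directed graph;
obtaining the node identifier range of the nodes included in a sub-directed graph from the calculated number of nodes and the number of the processing device corresponding to the sub-directed graph.
In an illustrative example, the number of nodes included in a sub-directed graph is N=(MAX(ID)-MIN(ID))/C, where C is the number of processing devices, MAX(ID) is the maximum node identifier in the directed graph, and MIN(ID) is the minimum node identifier in the directed graph;
the node identifier range of the nodes included in a sub-directed graph is IDS=I*N~I*(N+1), where I is the number of the corresponding processing device.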
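In an illustrative example, determining the in-degree/out-degree in the directed graph of a node whose in-degree is 0 in a sub-directed graph, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of that node until all nodes are processed, includes:
a processing device in the distributed system that processes a sub-directed graph stores the node information of the nodes whose in-degree is 0 in its own sub-directed graph in a to-be-processed node queue and, starting from a root node/outermost leaf node, broadcasts in the distributed system the node information of the nodes whose in-degree/out-degree is 0 in its own sub-directed graph;
receiving the in-degree/out-degree of the node corresponding to the node information, as returned by the other processing devices in the distributed system that process sub-directed graphs, and totaling the in-degree/out-degree in the directed graph of the node whose information was broadcast;
if the totaled in-degree/out-degree of the node corresponding to the node information is 0, deleting the node and subtracting 1 from the in-degree of the direct downstream nodes/direct upstream nodes of the node; if the in-degree of a direct downstream node/direct upstream node becomes 0 after the subtraction, storing that node in the to-be-processed node queue and continuing with the broadcasting step, repeating this cycle until all nodes are processed; if the in-degree/out-degree of a direct downstream node/direct upstream node is not 0 after the subtraction, retaining that node together with its updated in-degree/out-degree; at the same time, sending a notification message to the other processing devices in the distributed system that process sub-directed graphs, to notify the other servers to subtract 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information;
if the totaled in-degree/out-degree of the node corresponding to the node information is not 0, deleting the node.
In an illustrative example, determining the in-degree/out-degree in the directed graph of a node whose in-degree is 0 in a sub-directed graph, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of that node until all nodes are processed, includes:
a processing device in the distributed system that processes a sub-directed graph receives a broadcast message and determines whether the node corresponding to the node information in the broadcast message exists in its own sub-directed graph; if it exists, the device obtains the in-degree/out-degree of the node and returns it to the processing device that broadcast the node information; if it does not exist, the device returns an in-degree/out-degree of 0 for the node to the processing device that broadcast the node information;
on receiving a notification message, subtracting 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information carried in the notification message; if the in-degree/out-degree of a direct downstream node/direct upstream node is then equal to 0, storing that node in the to-be-processed node queue and continuing with the broadcasting step, repeating this cycle until all nodes are processed; if the in-degree/out-degree of the direct downstream node/direct upstream node is not equal to 0, doing nothing.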
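In an illustrative example, processing until all nodes are processed includes:
reading node information from the to-be-processed node queue at a preset interval; if no node information is read for a preset number of consecutive reads, all nodes have been processed.
In an illustrative example, multiple threads read node information from the to-be-processed node queue in parallel, and each thread performs the processing with the node information it has read.
In an illustrative example, the method further includes:
storing the node information of the nodes contained in the cycle in a database;
or sending alarm information that includes the node information of the nodes contained in the cycle.
This application also provides a computer-readable storage medium storing computer-executable instructions for executing the directed graph recognition method of any of the above.
This application further provides an apparatus for directed graph recognition, including a memory and a processor, where the memory stores instructions executable by the processor for performing the steps of the directed graph recognition method of any of the above.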
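This application also provides a server, including a first processing module and a second processing module, where:
the first processing module is used to determine the in-degree/out-degree of each node in a directed graph;
the second processing module is used to process, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed; if a node whose in-degree/out-degree is not 0 exists in the directed graph, the directed graph is a directed cyclic graph.
In an illustrative example, in the second processing module, processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed, includes:
storing the nodes whose in-degree is 0 in the directed graph in a to-be-processed node queue; starting from a root node, deleting the node and subtracting 1 from the in-degree of each direct downstream node of the node; if the in-degree of a direct downstream node becomes 0 after the subtraction, storing that downstream node in the to-be-processed node queue, and repeating this cycle until all nodes are processed; if the in-degree of a direct downstream node is not 0 after the subtraction, retaining that direct downstream node together with its updated in-degree;
or,
storing the nodes whose out-degree is 0 in the directed graph in a to-be-processed node queue; starting from an outermost leaf node, deleting the node and subtracting 1 from the out-degree of each direct upstream node of the node; if the out-degree of a direct upstream node becomes 0 after the subtraction, storing that upstream node in the to-be-processed node queue, and repeating this cycle until all nodes are processed; if the out-degree of a direct upstream node is not 0 after the subtraction, retaining that direct upstream node together with its updated out-degree.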
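This application also provides another server, including a preprocessing module, a processing module, and an interaction module;
the preprocessing module is used to determine the in-degree/out-degree of each node in the sub-directed graph corresponding to its server;
the interaction module is used to broadcast, within the distributed system to which its server belongs and starting from a root node/outermost leaf node, the node information of the nodes whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its server; and to receive, from the other servers in that distributed system, the in-degree/out-degree of the node corresponding to the broadcast node information;
the processing module is used to store the node information of the nodes whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its server in a to-be-processed node queue; to determine the in-degree/out-degree in the directed graph of each such node; and, for a node whose in-degree/out-degree in the directed graph is 0, to process the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed; if a node whose in-degree/out-degree is not 0 exists in the sub-directed graph, the directed graph to which the sub-directed graph belongs is a directed cyclic graph.
In an illustrative example, the interaction module implements the broadcast through any of the following message communication protocols: Hypertext Transfer Protocol (HTTP), Hypertext Transfer Protocol Secure (HTTPS), or a Remote Procedure Call (RPC) protocol.
In an illustrative example, in the processing module, determining the in-degree/out-degree in the directed graph of a node whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its server, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed, includes:
totaling the in-degree/out-degree in the directed graph of the node whose information was broadcast, from the in-degree/out-degree values returned for that node by the other servers in the distributed system to which its server belongs;
if the totaled in-degree/out-degree of the node corresponding to the node information is 0, deleting the node and subtracting 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node; if the in-degree/out-degree of a direct downstream node/direct upstream node becomes 0 after the subtraction, storing that node in the to-be-processed node queue and continuing with the broadcasting step, repeating this cycle until all nodes are processed; if the in-degree/out-degree of a direct downstream node/direct upstream node is not 0 after the subtraction, retaining that node together with its updated in-degree/out-degree; at the same time, sending a notification message to the other processing devices in the distributed system that process sub-directed graphs, to notify the other servers to subtract 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information;
if the totaled in-degree/out-degree of the node corresponding to the node information is not 0, deleting the node.
In an illustrative example, in the processing module, determining the in-degree/out-degree in the directed graph of a node whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its server, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed, includes:
on receiving a broadcast message, determining whether the node corresponding to the node information in the broadcast message exists in its own sub-directed graph; if it exists, obtaining the in-degree/out-degree of the node and returning it to the processing device that broadcast the node information; if it does not exist, returning an in-degree/out-degree of 0 for the node to the processing device that broadcast the node information;
on receiving a notification message, subtracting 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information carried in the notification message; if the in-degree/out-degree of a direct downstream node/direct upstream node is then equal to 0, storing that node in the to-be-processed node queue and continuing with the broadcasting step, repeating this cycle until all nodes are processed; if the in-degree/out-degree of the direct downstream node/direct upstream node is not equal to 0, doing nothing.
In an illustrative example, in the processing module, processing until all nodes are processed includes:
reading node information from the to-be-processed node queue at a preset interval; if no node information is read for a preset number of consecutive reads, all nodes have been processed.
In an illustrative example, the processing module uses multiple threads to process the nodes in the to-be-processed node queue in parallel.
In an illustrative example, the processing module is further used for:
storing the node information of the nodes contained in the cycle in a database;
or sending alarm information that includes the node information of the nodes contained in the cycle.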
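This application also provides a directed graph recognition system, including a distributed system of two or more servers, where
each server is used for:
determining the in-degree/out-degree of each node in its own sub-directed graph; starting from a root node/outermost leaf node, broadcasting the node information of the nodes whose in-degree/out-degree is 0 in its own sub-directed graph; receiving, from the other servers in the distributed system, the in-degree/out-degree of the node corresponding to the broadcast node information; storing the node information of the nodes whose in-degree/out-degree is 0 in its own sub-directed graph in a to-be-processed node queue; determining the in-degree/out-degree in the directed graph of each such node, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed; if a node whose in-degree/out-degree is not 0 exists in the sub-directed graph, the directed graph to which the sub-directed graph belongs is a directed cyclic graph.
In an illustrative example, the sub-directed graph corresponding to each server in the distributed system is preset.
In an illustrative example, the directed graph processing system further includes a control node for dividing the directed graph into multiple sub-directed graphs according to the edges included in the directed graph.
In an illustrative example, the control node is specifically used for:
randomly dividing the nodes of the directed graph into different sub-directed graphs according to a preset number of edges per sub-directed graph, and sending the information of the sub-directed graphs to each server in the distributed system; or,
calculating the number of nodes included in a sub-directed graph from the node identifiers of the nodes in the directed graph;
obtaining the node identifier range of the nodes included in a sub-directed graph from the calculated number of nodes and the number of the server corresponding to the sub-directed graph, and sending the information of the sub-directed graphs to each server in the distributed system.
In an illustrative example, the control node is an independent physical device, or is deployed on any server in the distributed system.
In an illustrative example, the server includes the other server described above.
In an illustrative example, the method includes: determining the in-degree/out-degree of each node in a directed graph; processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed; and, if a node whose in-degree/out-degree is not 0 exists in the directed graph, determining that the directed graph is a directed cyclic graph. The embodiments of this application thereby implement checking of directed graphs.
In an illustrative example, the method includes: dividing the directed graph into multiple sub-directed graphs according to the edges included in the directed graph; determining, in a distributed manner, the in-degree/out-degree of each node in each sub-directed graph; for each sub-directed graph, determining the in-degree/out-degree in the directed graph of each node whose in-degree/out-degree is 0 in the sub-directed graph, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed; and, if a node whose in-degree/out-degree is not 0 exists in a sub-directed graph, determining that the directed graph is a directed cyclic graph. The embodiments of this application assign the sub-directed graphs obtained by dividing the directed graph to several processing devices, such as servers, in a distributed system. Because a distributed system can be scaled horizontally, this application can check for cyclic graphs without being limited by the size of the graph.
In an illustrative example, the processing devices, such as servers, in the distributed system of this application use multiple threads to process the in-degrees of the nodes in the directed graph, which further improves processing efficiency.
Other features and advantages of the invention will be set forth in the description that follows and will in part be apparent from the description or be learned by practicing the invention. The objectives and other advantages of the invention may be realized and obtained by the structures particularly pointed out in the description, the claims, and the drawings.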
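Brief Description of the Drawings
The drawings provide a further understanding of the technical solutions of this application and form part of the description. Together with the embodiments of this application, they serve to explain the technical solutions of this application and do not limit them.
Figure 1 is a schematic flowchart of one embodiment of the directed graph recognition method of this application;
Figure 2 is a schematic flowchart of another embodiment of the directed graph recognition method of this application;
Figure 3(a) is a schematic diagram of an embodiment of a directed graph of this application before division;
Figure 3(b) is a schematic diagram of an embodiment of a directed graph of this application after division;
Figure 4 is a schematic diagram of the architecture of the directed graph recognition system of this application;
Figure 5 is a schematic diagram of the composition of another server of this application.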
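Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the embodiments of this application are described in detail below with reference to the drawings. Note that, where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with one another in any way.
In a typical configuration of this application, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include non-permanent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
The steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
In graph theory and network theory, a graph (or network) consists of vertices (also called nodes) and the edges (also called links) that connect them. The number of edges incident to a vertex is the degree of that vertex. In-degree and out-degree are important concepts in graph algorithms. The in-degree of a node in a directed graph is the number of times the node is the end point of an edge, that is, the number of edges entering the node; the out-degree is the number of times the node is the start point of an edge, that is, the number of edges leaving the node.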
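As a small illustration of the in-degree and out-degree definitions above, the following sketch counts both for every node of a directed graph given as an edge list. The function name `degrees` and the `(source, target)` pair format are assumptions made for the example.

```python
from collections import Counter


def degrees(edges):
    """Map each node to (in_degree, out_degree) for a list of (source, target) edges."""
    in_degree = Counter(dst for _, dst in edges)   # edges entering each node
    out_degree = Counter(src for src, _ in edges)  # edges leaving each node
    nodes = {n for edge in edges for n in edge}
    # Counter returns 0 for nodes that never appear as a target or a source.
    return {n: (in_degree[n], out_degree[n]) for n in nodes}
```

For the edges A→B, A→C, and B→C, node A is a root (in-degree 0) and node C is a leaf (out-degree 0).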
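To check directed graphs and thereby avoid workflows whose graphs contain cycles, as shown in Figure 1, the recognition of a directed graph in this application includes at least:
Step 100: Determine the in-degree/out-degree of each node in the directed graph.
Step 101: Process, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed.
In an illustrative example, this step may include:
storing the nodes whose in-degree is 0 in the directed graph in a to-be-processed node queue; starting from a root node, deleting the node and subtracting 1 from the in-degree of each direct downstream node of the node; if the in-degree of a direct downstream node becomes 0 after the subtraction, storing that downstream node in the to-be-processed node queue, and repeating this cycle until all nodes are processed; if the in-degree of a direct downstream node is not 0 after the subtraction, retaining that direct downstream node together with its updated in-degree. Or,
storing the nodes whose out-degree is 0 in the directed graph in a to-be-processed node queue; starting from an outermost leaf node, deleting the node and subtracting 1 from the out-degree of each direct upstream node of the node; if the out-degree of a direct upstream node becomes 0 after the subtraction, storing that upstream node in the to-be-processed node queue, and repeating this cycle until all nodes are processed; if the out-degree of a direct upstream node is not 0 after the subtraction, retaining that direct upstream node together with its updated out-degree.
Step 102: If a node whose in-degree/out-degree is not 0 exists in the directed graph, the directed graph is a directed cyclic graph.
Note that "in-degree/out-degree" means in-degree or out-degree, and "direct downstream node/direct upstream node" means direct downstream node or direct upstream node. When the directed graph is recognized by in-degree, the in-degrees of the direct downstream nodes of the nodes whose in-degree is 0 are processed one by one; when the directed graph is recognized by out-degree, the out-degrees of the direct upstream nodes of the nodes whose out-degree is 0 are processed one by one.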
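Steps 100 to 102 in their in-degree form can be sketched on a single machine as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the function name `find_unresolved_nodes`, the edge-list input, and the example graph are all hypothetical.

```python
from collections import deque


def find_unresolved_nodes(edges):
    """Return {node: in_degree} for the nodes that can never reach in-degree 0.

    An empty result means the graph is acyclic; a non-empty result means
    the graph contains a cycle (steps 100-102, in-degree form).
    """
    # Step 100: determine the in-degree of every node.
    nodes = {n for edge in edges for n in edge}
    in_degree = {n: 0 for n in nodes}
    downstream = {n: [] for n in nodes}
    for src, dst in edges:
        in_degree[dst] += 1
        downstream[src].append(dst)

    # Step 101: process nodes with in-degree 0, starting from the roots.
    pending = deque(n for n in nodes if in_degree[n] == 0)
    while pending:
        node = pending.popleft()
        del in_degree[node]  # "delete the node"
        for child in downstream[node]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                pending.append(child)

    # Step 102: whatever remains has a non-zero in-degree.
    return in_degree
```

With the hypothetical edges A→B, B→C, C→D, D→J, and J→D, nodes A, B, and C are removed in turn, the in-degree of D drops from 2 to 1, and D and J remain, so the graph is reported as cyclic.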
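To identify a cyclic graph, one can usually run a breadth-first search (BFS) or depth-first search (DFS) algorithm on a single server to traverse all paths of the graph and store the explored nodes in a list; when a newly explored node is found to be in the list already, the graph can be judged to be cyclic. The technical solutions in the related art have two drawbacks: on the one hand, when applied to a large graph structure, a single traversal takes a long time, which makes them inefficient; on the other hand, they become unusable once the data volume reaches a certain scale.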
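For comparison, the conventional single-machine approach can be sketched with a depth-first search. The sketch below uses the standard three-color DFS, which refines the list-based check described above by tracking only the nodes on the current path; it is not the method proposed in this application, and the function name `dfs_has_cycle` and the adjacency-dictionary input are assumptions.

```python
def dfs_has_cycle(adjacency):
    """Iterative three-color DFS over {node: [direct downstream nodes]}."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    for start in adjacency:
        if color.get(start, WHITE) != WHITE:
            continue
        color[start] = GRAY
        stack = [(start, iter(adjacency.get(start, ())))]
        while stack:
            node, children = stack[-1]
            for child in children:
                state = color.get(child, WHITE)
                if state == GRAY:
                    return True  # reached a node already on the current path
                if state == WHITE:
                    color[child] = GRAY
                    stack.append((child, iter(adjacency.get(child, ()))))
                    break
            else:
                # All children explored: the node leaves the current path.
                color[node] = BLACK
                stack.pop()
    return False
```

The whole graph must fit in the memory of one machine for this to work, which is the limitation the distributed scheme below is meant to remove.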
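To check for cyclic graphs without being limited by the size of the graph, this application proposes in an illustrative example: assigning the sub-directed graphs obtained by dividing the directed graph to several processing devices, such as servers, in a distributed system.
Figure 2 is a schematic flowchart of an embodiment of the directed graph recognition method of this application. As shown in Figure 2, the method includes:
Step 200: Divide the directed graph into multiple sub-directed graphs according to the edges included in the directed graph.
In an illustrative example, this step may be preceded by: triggering the processing of the directed graph.
For example, when a preset period elapses or a manually configured trigger fires, processing of the directed graph is triggered, that is, step 200 is entered.
In an illustrative example, dividing the directed graph into multiple sub-directed graphs in step 200 includes:
randomly dividing the nodes of the directed graph into different sub-directed graphs according to a preset number of edges per sub-directed graph. Here, the only requirement is that no edge appears in more than one sub-directed graph.
For example, suppose a large directed graph consists of 6 million nodes and 10 million edges; it can be randomly divided into 100 to 20 sub-directed graphs of 100,000 to 500,000 edges each.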
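The edge-count-based division can be sketched as below. This is only one plausible reading of the random division described above: the function name `partition_edges`, the shuffle-then-slice strategy, and the `seed` parameter are assumptions, chosen so that every edge lands in exactly one sub-directed graph.

```python
import random


def partition_edges(edges, edges_per_subgraph, seed=None):
    """Randomly split an edge list into sub-graphs of at most `edges_per_subgraph` edges."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    # Consecutive slices of the shuffled list never share an edge.
    return [
        shuffled[i:i + edges_per_subgraph]
        for i in range(0, len(shuffled), edges_per_subgraph)
    ]
```

The arithmetic of the example above follows directly: 10,000,000 edges at 100,000 edges per sub-directed graph gives 100 sub-directed graphs, and at 500,000 edges per sub-directed graph gives 20.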
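In an illustrative example, dividing the directed graph into multiple sub-directed graphs in step 200 includes:
calculating the number of nodes included in a sub-directed graph from the node identifiers (IDs) of the nodes in the directed graph;
obtaining the node ID range of the nodes included in a sub-directed graph from the calculated number of nodes and the number of the processing device corresponding to the sub-directed graph.
In an illustrative example, suppose a distributed system includes several processing devices, such as servers, for processing sub-directed graphs, and each processing device has a unique number I in the distributed system. In one embodiment of this application, the number of nodes included in a sub-directed graph, that is, the number of nodes handled by each processing device, is N=(MAX(ID)-MIN(ID))/C, where C is the number of processing devices, that is, the number of sub-directed graphs, MAX(ID) is the maximum node ID in the directed graph, and MIN(ID) is the minimum node ID in the directed graph. The node ID range of the nodes included in a sub-directed graph, that is, the node ID range handled by that processing device, is then IDS=I*N~I*(N+1).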
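The ID-range division can be sketched as follows. The sketch is an interpretation rather than a transcription of the formulas above: it assumes device numbers start at 0, that device I is meant to cover N consecutive IDs starting at MIN(ID) + I*N, and that N is rounded up so the last device reaches MAX(ID). The function name `id_ranges` is hypothetical.

```python
import math


def id_ranges(min_id, max_id, device_count):
    """Return one inclusive (low, high) node-ID range per processing device."""
    span = max_id - min_id + 1
    per_device = math.ceil(span / device_count)  # assumed rounding of N
    ranges = []
    for device in range(device_count):
        low = min_id + device * per_device
        high = min(min_id + (device + 1) * per_device - 1, max_id)
        ranges.append((low, high))
    return ranges
```

For node IDs 1 to 100 on four devices, this yields the ranges 1-25, 26-50, 51-75, and 76-100. If there are more devices than IDs, the trailing ranges come out empty (low greater than high).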
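Figure 3(a) is a schematic diagram of an embodiment of a directed graph of this application before division. The directed graph shown in Figure 3(a) includes node A, node B, node C, node D, node E, node F, node G, node H, node I, node J, node M, and node N, where node A, node M, and node N are the root nodes of the directed graph; the root nodes of a directed graph can be identified in advance. The outgoing edges of each node are shown by the arrows in Figure 3(a).
Following step 200, suppose the directed graph shown in Figure 3(a) is divided into two sub-directed graphs, as shown in Figure 3(b). The sub-directed graph on the left includes 7 nodes and 6 edges: node A, node B, node C, node D, node E, node I, and node J, together with their edges. The sub-directed graph on the right includes 8 nodes and 6 edges: node C, node E, node F, node G, node H, node I, node M, and node N, together with their edges.
Step 201: Use the distributed system to determine the in-degree/out-degree of each node in each sub-directed graph.
In an illustrative example, this step may include:
a processing device, such as a server, in the distributed system that processes a sub-directed graph determines the in-degree/out-degree of each node in the sub-directed graph assigned to it, based on the nodes of that sub-directed graph and their outgoing edges.
Note that the in-degrees/out-degrees of a directed graph can be determined with techniques from the related art, and the specific implementation does not limit the protection scope of this application. What this application emphasizes is that the sub-directed graphs obtained by dividing the directed graph are assigned to several processing devices, such as servers, in a distributed system. Because a distributed system can be scaled horizontally, the directed graph processing method of this application avoids the limit that single-machine memory places on the data volume of a directed graph; in other words, this application is well suited to large graph structures.
Taking the left sub-directed graph in the embodiment of Figure 3(b) as an example, node A is a root node with in-degree 0, node B has in-degree 1, node C has in-degree 1, node D has in-degree 2, node I has in-degree 1, node J has in-degree 1, and node E has in-degree 0. Taking the right sub-directed graph in the embodiment of Figure 3(b) as an example, node M is a root node with in-degree 0, node N is a root node with in-degree 0, node E has in-degree 1, node F has in-degree 1, node H has in-degree 2, node I has in-degree 0, node G has in-degree 1, and node C has in-degree 0.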
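Step 202: For each sub-directed graph, determine the in-degree/out-degree in the directed graph of each node whose in-degree/out-degree is 0 in the sub-directed graph, and, for a node whose in-degree/out-degree in the directed graph is 0, process the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed.
Note that "in-degree/out-degree" means in-degree or out-degree, and "direct downstream node/direct upstream node" means direct downstream node or direct upstream node. When the directed graph is recognized by in-degree, processing starts from the root nodes, and the in-degrees of the direct downstream nodes of the nodes whose in-degree is 0 are processed one by one; when the directed graph is recognized by out-degree, processing starts from the outermost leaf nodes, and the out-degrees of the direct upstream nodes of the nodes whose out-degree is 0 are processed one by one.
In an illustrative example, taking recognition by in-degree as an example, this step may include:
a processing device, such as a server, in the distributed system that processes a sub-directed graph stores the node information of the nodes whose in-degree is 0 in its own sub-directed graph in a to-be-processed node queue and, starting from a root node, broadcasts in the distributed system the node information, such as the node ID, of a node whose in-degree is 0 in its own sub-directed graph, referred to here as the first node; correspondingly, each other processing device, such as a server, in the distributed system that processes a sub-directed graph determines, on receiving the broadcast message, whether the first node corresponding to the node information exists in its own sub-directed graph; if it exists, the device obtains the in-degree of the first node and returns it to the server that broadcast the node information; if it does not exist, the device returns an in-degree of 0 for the first node;
receiving the in-degrees of the first node in the other sub-directed graphs, as returned by the other processing devices, such as servers, in the distributed system that process sub-directed graphs, and totaling the in-degree of the first node in the overall directed graph (that is, adding up the returned in-degrees to obtain the in-degree of the first node in the entire directed graph);
if the totaled in-degree is 0, then, on the one hand, deleting the first node from local memory and subtracting 1 from the in-degree of each direct downstream node of the first node; if the in-degree of a direct downstream node becomes 0 after the subtraction, storing that downstream node in the to-be-processed node queue and continuing with the broadcast processing above, repeating this cycle until all nodes are processed; if the in-degree of a direct downstream node is not 0 after the subtraction, retaining that direct downstream node together with its updated in-degree; on the other hand, at the same time sending a notification message to the other processing devices, such as servers, in the distributed system that process sub-directed graphs, to notify the other servers to subtract 1 from the in-degree of the downstream nodes of the first node; if the totaled in-degree is not 0, deleting the first node;
correspondingly, each other processing device, such as a server, in the distributed system that processes a sub-directed graph, on receiving the notification message, subtracts 1 from the in-degree of each direct downstream node of the first node, referred to here as a second node; if the in-degree of the second node is then equal to 0, the second node is stored in the to-be-processed node queue and the broadcasting step continues, repeating this cycle until all nodes are processed; if the in-degree of the second node is not equal to 0, nothing is done.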
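The broadcast, reply, and notification exchange of step 202 can be simulated in a single process as below. This is a sketch under loud assumptions: the `Server` class, its method names, and `detect_cycle` are hypothetical, network messages are replaced by direct method calls, and the two edge lists in the usage note are an invented example, not a reconstruction of Figure 3(b).

```python
from collections import deque


class Server:
    """Holds one sub-directed graph; message passing is simulated by method calls."""

    def __init__(self, edges):
        self.in_degree = {}
        self.downstream = {}
        for src, dst in edges:
            self.in_degree.setdefault(src, 0)
            self.in_degree[dst] = self.in_degree.get(dst, 0) + 1
            self.downstream.setdefault(src, []).append(dst)
            self.downstream.setdefault(dst, [])
        # To-be-processed node queue: nodes with local in-degree 0.
        self.pending = deque(n for n, d in self.in_degree.items() if d == 0)

    def local_in_degree(self, node):
        # Reply to a broadcast: 0 when the node is not in this sub-graph.
        return self.in_degree.get(node, 0)

    def release(self, node):
        # Delete the node locally and subtract 1 from its local downstream nodes.
        if node not in self.in_degree:
            return
        del self.in_degree[node]
        for child in self.downstream.pop(node, []):
            self.in_degree[child] -= 1
            if self.in_degree[child] == 0:
                self.pending.append(child)


def detect_cycle(servers):
    """Return True when some node keeps a non-zero in-degree on any server."""
    progress = True
    while progress:
        progress = False
        for server in servers:
            while server.pending:
                node = server.pending.popleft()
                progress = True
                # Broadcast the node and total the replies from every server.
                total = sum(s.local_in_degree(node) for s in servers)
                if total == 0:
                    # Local deletion plus the notification to the other servers.
                    for s in servers:
                        s.release(node)
    return any(s.in_degree for s in servers)
```

For example, with the hypothetical left edges A→B, B→C, C→D, D→J, J→D, E→I and right edges C→E, M→F, N→G, F→H, G→H, I→H, nodes D and J are never released on the left server, so `detect_cycle` reports a cycle; removing the edge J→D makes it report none. A node whose total is non-zero is simply dropped from the queue here and is queued again by whichever server later reduces its last remaining in-degree to 0.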
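In an illustrative example, to further improve processing efficiency, a processing device, such as a server, in the distributed system that processes a sub-directed graph may use multiple threads to read node information from the to-be-processed node queue in parallel, with each thread performing the subsequent processing on the node information it has read.
In an illustrative example, node information may be read from the to-be-processed node queue at a preset interval.
In an illustrative example, a relation mapping table may be used to obtain the downstream nodes of a node that has been read.
In an illustrative example, the relation mapping table may be keyed by node information and store, for each node, its upstream and/or downstream relations and its own in-degree.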
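One way to picture such a relation mapping table is a dictionary keyed by node ID whose values hold the node's upstream and downstream relations and its in-degree. The record layout below, including the names `NodeRecord` and `build_relation_table`, is an assumption for illustration; the text does not specify a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class NodeRecord:
    """One row of the hypothetical relation mapping table."""
    node_id: str
    in_degree: int = 0
    upstream: list = field(default_factory=list)
    downstream: list = field(default_factory=list)


def build_relation_table(edges):
    """Build {node_id: NodeRecord} from a list of (source, target) edges."""
    table = {}
    for src, dst in edges:
        source = table.setdefault(src, NodeRecord(src))
        target = table.setdefault(dst, NodeRecord(dst))
        source.downstream.append(dst)
        target.upstream.append(src)
        target.in_degree += 1
    return table
```

Looking up the downstream nodes of a node read from the queue is then a single dictionary access, for example `table["A"].downstream`.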
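In an illustrative example, the threads are safe. Thread safety here means preventing in-degree miscalculation caused by concurrent threads; in other words, with thread safety, in-degree calculation errors do not occur.
In an illustrative example, processing until all nodes are processed may include:
multiple threads read node information from the to-be-processed node queue at a preset interval, such as 1 s; if no node information is read for a preset number of consecutive reads, such as 10 or more, the entire directed graph has been processed.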
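The polling and termination rule above can be sketched with Python's standard `queue` and `threading` modules. The sketch covers a single machine only, and the function name `run_workers`, its parameters, and the use of one lock around in-degree updates are assumptions; the short `period` and `max_idle` defaults stand in for the 1 s and 10 reads mentioned in the text.

```python
import queue
import threading


def run_workers(pending, in_degree, downstream, workers=4, period=0.01, max_idle=3):
    """Drain `pending` with several threads; stop after `max_idle` empty reads in a row.

    `in_degree` and `downstream` are shared dictionaries; the lock keeps the
    in-degree updates thread-safe. Returns the nodes that were never released.
    """
    lock = threading.Lock()

    def worker():
        idle = 0
        while idle < max_idle:
            try:
                node = pending.get(timeout=period)  # wait up to one period
            except queue.Empty:
                idle += 1  # one more consecutive empty read
                continue
            idle = 0
            with lock:
                in_degree.pop(node, None)  # "delete the node"
                for child in downstream.get(node, ()):
                    in_degree[child] -= 1
                    if in_degree[child] == 0:
                        pending.put(child)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return in_degree
```

The idle-read counter is a heuristic: in a distributed deployment a notification could still arrive after a server has stopped polling, so the interval and the read count would need to be chosen with message latency in mind.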
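Step 203: If a node whose in-degree/out-degree is not 0 exists in a sub-directed graph, the directed graph is a directed cyclic graph.
If the graph is cyclic, the in-degree/out-degree of every node on the cycle can never be 0. Therefore, if the number of nodes remaining in memory is not 0, that is, nodes still exist, the directed graph is cyclic; if the number of nodes remaining in memory is 0, that is, no nodes exist, the directed graph is not cyclic.
Here, taking recognition by in-degree as an example and using the sub-directed graphs in Figure 3(b), suppose the server processing the left sub-directed graph is server 1 and the server processing the right sub-directed graph is server 2.
For server 1, the node information of node A is first stored in the to-be-processed node queue and then broadcast. On receiving the broadcast message, server 2 determines that its own sub-directed graph does not contain node A, so it returns an in-degree of 0 for node A to server 1. Server 1 then totals the in-degree of node A in the entire directed graph, which is clearly 0, so server 1 deletes node A, subtracts 1 from the in-degree of node B (the direct downstream node of node A), and at the same time notifies server 2 to subtract 1 from the in-degree of node A's direct downstream nodes. On server 1, the in-degree of node B is now 0, so node B is placed in the to-be-processed node queue and the process is repeated from the broadcast onward. On server 2, node A has no direct downstream node, so nothing is done. Node B is processed in the same way as node A, and the details are not repeated here.
When server 1 subtracts 1 from the in-degree of node C so that it becomes 0, server 1 broadcasts the node information of node C, as shown by the thick rightward arrow in Figure 3(b), and server 2 returns an in-degree of 0 for node C. Server 1 then subtracts 1 from the in-degree of node D, a direct downstream node of node C, which changes from 2 to 1. Correspondingly, server 2 subtracts 1 from the in-degree of node E, the other direct downstream node of node C. The in-degree of node E is now 0, so node E is placed in the to-be-processed node queue and the process is repeated from the broadcast onward.
When server 2 subtracts 1 from the in-degree of node E so that it becomes 0, server 2 broadcasts the node information of node E, as shown by the thick leftward arrow in Figure 3(b), and server 1 returns an in-degree of 0 for node E. On server 2, node E has no direct downstream node, so nothing is done. Correspondingly, server 1 subtracts 1 from the in-degree of node I, the direct downstream node of node E. The in-degree of node I is now 0, so node I is placed in the to-be-processed node queue and the process is repeated from the broadcast onward.
As processing continues in this way, server 1 eventually finds that no node information has been read from the to-be-processed node queue for the preset number of consecutive reads. However, the in-degree of node D on server 1 remains 1; that is, a node whose in-degree is not 0 exists in the directed graph, so the directed graph is a directed cyclic graph. Figure 3(b) also shows that a cycle does exist in the directed graph.
Note that server 2 likewise starts processing from a root node, such as node M. Every node with an in-degree of 0 is placed in the to-be-processed node queue, and a server can process the queue in first-in, first-out order.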
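In an illustrative example, the method of this application may further include:
storing the node information of the nodes contained in the cycle in a database;
or sending alarm information, which may include the node information of the nodes contained in the cycle; optionally, the alarm information may be sent by SMS or similar means.
This application also provides a computer-readable storage medium storing computer-executable instructions for executing the directed graph recognition method of any of the above.
This application further provides an apparatus for directed graph recognition, including a memory and a processor, where the memory stores instructions executable by the processor for performing the steps of any of the above directed graph recognition methods.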
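Figure 4 is a schematic diagram of the architecture of the directed graph recognition system of this application. In the embodiment shown in Figure 4, the distributed system includes three servers (for example, a first server, a second server, and a third server) only as an example, which does not limit the protection scope of this application. As shown in Figure 4, the system includes a distributed system of two or more servers, where each server is used for:
determining the in-degree/out-degree of each node in its own sub-directed graph;
starting from a root node/outermost leaf node, broadcasting the node information of the nodes whose in-degree/out-degree is 0 in its own sub-directed graph;
receiving, from the other servers in the distributed system, the in-degree/out-degree of the node corresponding to the broadcast node information;
storing the node information of the nodes whose in-degree/out-degree is 0 in its own sub-directed graph in a to-be-processed node queue; determining the in-degree/out-degree in the directed graph of each such node, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed; if a node whose in-degree/out-degree is not 0 exists in the sub-directed graph, the directed graph to which the sub-directed graph belongs is a directed cyclic graph.
In an illustrative example, the sub-directed graph corresponding to each server in the distributed system is preset.
In an illustrative example, the directed graph recognition system further includes a control node (not shown in Figure 3(a) and Figure 3(b)) for dividing the directed graph into multiple sub-directed graphs according to the edges included in the directed graph.
In an illustrative example, the control node may be specifically used for:
randomly dividing the nodes of the directed graph into different sub-directed graphs according to a preset number of edges per sub-directed graph, and sending the information of the sub-directed graphs to each server;
or,
calculating the number of nodes included in a sub-directed graph from the node IDs of the nodes in the directed graph;
obtaining the node ID range of the nodes included in a sub-directed graph from the calculated number of nodes and the number of the processing device corresponding to the sub-directed graph, and sending the information of the sub-directed graphs to each server.
In an illustrative example, the control node is also used to trigger directed graph processing. Optionally, the division into sub-directed graphs may be triggered when a preset period elapses or by a manually configured trigger.
In an illustrative example, the control node may also be deployed on any server in the distributed system.
Each server in the directed graph recognition system shown in Figure 4 may include, but is not limited to, a preprocessing module, a processing module, and an interaction module;
the preprocessing module is used to determine the in-degree/out-degree of each node in the sub-directed graph corresponding to its server;
the interaction module is used to broadcast, within the distributed system to which its server belongs and starting from a root node/outermost leaf node, the node information of the nodes whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its server; and to receive, from the other servers in that distributed system, the in-degree/out-degree of the node corresponding to the broadcast node information;
the processing module is used to store the node information of the nodes whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its server in a to-be-processed node queue; to determine the in-degree/out-degree in the directed graph of each such node; and, for a node whose in-degree/out-degree in the directed graph is 0, to process the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed; if a node whose in-degree/out-degree is not 0 exists in the sub-directed graph, the directed graph to which the sub-directed graph belongs is a directed cyclic graph.
In an illustrative example, the interaction module is used for message communication between servers. The message communication protocol may include, but is not limited to, Hypertext Transfer Protocol (HTTP), Hypertext Transfer Protocol Secure (HTTPS), and Remote Procedure Call (RPC) protocols.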
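In an illustrative example, in the processing module, determining the in-degree/out-degree in the directed graph of a node whose in-degree/out-degree is 0 in the sub-directed graph corresponding to its server, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed, may include:
totaling the in-degree/out-degree in the directed graph of the node whose information was broadcast, from the in-degree/out-degree values returned for that node by the other servers in the distributed system to which its server belongs;
if the totaled in-degree/out-degree of the node corresponding to the node information is 0, deleting the node and subtracting 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node; if the in-degree/out-degree of a direct downstream node/direct upstream node becomes 0 after the subtraction, storing that node in the to-be-processed node queue and continuing with the broadcasting step, repeating this cycle until all nodes are processed; if the in-degree/out-degree of a direct downstream node/direct upstream node is not 0 after the subtraction, retaining that node together with its updated in-degree/out-degree; at the same time, sending a notification message to the other processing devices in the distributed system that process sub-directed graphs, to notify the other servers to subtract 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information;
if the totaled in-degree/out-degree of the node corresponding to the node information is not 0, deleting the node.
In an illustrative example, in the processing module, determining the in-degree/out-degree in the directed graph of a node whose in-degree is 0 in the sub-directed graph corresponding to its server, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of its direct downstream nodes/direct upstream nodes until all nodes are processed, may include:
on receiving a broadcast message, determining whether the node corresponding to the node information in the broadcast message exists in its own sub-directed graph; if it exists, obtaining the in-degree of the node and returning it to the processing device that broadcast the node information; if it does not exist, returning an in-degree/out-degree of 0 for the node to the processing device that broadcast the node information;
on receiving a notification message, subtracting 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information carried in the notification message; if the in-degree/out-degree of a direct downstream node/direct upstream node is then equal to 0, storing that node in the to-be-processed node queue and continuing with the broadcasting step, repeating this cycle until all nodes are processed; if the in-degree/out-degree of the direct downstream node/direct upstream node is not equal to 0, doing nothing.
In an illustrative example, the processing module may read node information from the to-be-processed node queue at a preset interval.
In an illustrative example, the processing module uses multiple threads to process the nodes in the to-be-processed node queue in parallel.
In an illustrative example, in the processing module, processing until all nodes are processed may include: multiple threads read node information from the to-be-processed node queue at a preset interval, such as 1 s; if no node information is read for a preset number of consecutive reads, such as 10 or more, the entire graph has been processed.
In an illustrative example, the processing module is further used for:
storing the node information of the nodes contained in the cycle in a database;
or sending alarm information, which may include the node information of the nodes contained in the cycle; optionally, the alarm information may be sent by SMS or similar means.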
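Figure 5 is a schematic diagram of the composition of another server of this application. As shown in Figure 5, the server includes a first processing module and a second processing module, where:
the first processing module is used to determine the in-degree/out-degree of each node in a directed graph;
the second processing module is used to process, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed; if a node whose in-degree/out-degree is not 0 exists in the directed graph, the directed graph is a directed cyclic graph.
In an illustrative example, in the second processing module, processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed, includes:
storing the nodes whose in-degree is 0 in the directed graph in a to-be-processed node queue; starting from a root node, deleting the node and subtracting 1 from the in-degree of each direct downstream node of the node; if the in-degree of a direct downstream node becomes 0 after the subtraction, storing that downstream node in the to-be-processed node queue, and repeating this cycle until all nodes are processed; if the in-degree of a direct downstream node is not 0 after the subtraction, retaining that direct downstream node together with its updated in-degree;
or,
storing the nodes whose out-degree is 0 in the directed graph in a to-be-processed node queue; starting from an outermost leaf node, deleting the node and subtracting 1 from the out-degree of each direct upstream node of the node; if the out-degree of a direct upstream node becomes 0 after the subtraction, storing that upstream node in the to-be-processed node queue, and repeating this cycle until all nodes are processed; if the out-degree of a direct upstream node is not 0 after the subtraction, retaining that direct upstream node together with its updated out-degree.
In an illustrative example, the second processing module is further used for:
storing the node information of the nodes contained in the cycle in a database;
or sending alarm information that includes the node information of the nodes contained in the cycle.
Although the embodiments disclosed in this application are as described above, they are adopted only to facilitate understanding of this application and are not intended to limit it. Any person skilled in the art to which this application belongs may make modifications and changes to the form and details of implementation without departing from the spirit and scope disclosed in this application, but the scope of patent protection of this application remains subject to the scope defined by the appended claims.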

Claims (28)

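  1. A directed graph recognition method, including:
    determining the in-degree/out-degree of each node in a directed graph;
    processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed;
    if a node whose in-degree/out-degree is not 0 exists in the directed graph, the directed graph is a directed cyclic graph.
  2. The directed graph recognition method according to claim 1, wherein processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph, until all nodes are processed, includes:
    storing the nodes whose in-degree is 0 in the directed graph in a to-be-processed node queue; starting from a root node, deleting the node and subtracting 1 from the in-degree of each direct downstream node of the node; if the in-degree of a direct downstream node becomes 0 after the subtraction, storing that downstream node in the to-be-processed node queue, and repeating this cycle until all nodes are processed; if the in-degree of a direct downstream node is not 0 after the subtraction, retaining that direct downstream node together with its updated in-degree;
    or,
    storing the nodes whose out-degree is 0 in the directed graph in a to-be-processed node queue; starting from an outermost leaf node, deleting the node and subtracting 1 from the in-degree of each direct upstream node of the node; if the in-degree of a direct upstream node becomes 0 after the subtraction, storing that upstream node in the to-be-processed node queue, and repeating this cycle until all nodes are processed; if the in-degree of a direct upstream node is not 0 after the subtraction, retaining that direct upstream node together with its updated in-degree.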
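  3. The directed graph recognition method according to claim 1, wherein the method is preceded by: dividing the directed graph into multiple sub-directed graphs according to the edges included in the directed graph;
    determining the in-degree/out-degree of each node in the directed graph includes: determining, in a distributed manner, the in-degree/out-degree of each node in each sub-directed graph;
    processing, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree is 0 in the directed graph includes:
    for each sub-directed graph, determining the in-degree/out-degree in the directed graph of each node whose in-degree/out-degree is 0 in the sub-directed graph, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of that node.
  4. The directed graph recognition method according to claim 3, wherein dividing the directed graph into multiple sub-directed graphs includes:
    randomly dividing the nodes of the directed graph into different sub-directed graphs according to a preset number of edges per sub-directed graph.
  5. The directed graph recognition method according to claim 3, wherein dividing the directed graph into multiple sub-directed graphs includes:
    calculating the number of nodes included in a sub-directed graph from the node identifiers of the nodes in the directed graph;
    obtaining the node identifier range of the nodes included in a sub-directed graph from the calculated number of nodes and the number of the processing device corresponding to the sub-directed graph.
  6. The directed graph recognition method according to claim 5, wherein
    the number of nodes included in a sub-directed graph is N=(MAX(ID)-MIN(ID))/C, where C is the number of processing devices, MAX(ID) is the maximum node identifier in the directed graph, and MIN(ID) is the minimum node identifier in the directed graph;
    the node identifier range of the nodes included in a sub-directed graph is IDS=I*N~I*(N+1), where I is the number of the corresponding processing device.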
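  7. The directed graph recognition method according to claim 3, wherein determining the in-degree/out-degree in the directed graph of a node whose in-degree is 0 in a sub-directed graph, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of that node until all nodes are processed, includes:
    a processing device in the distributed system that processes a sub-directed graph stores the node information of the nodes whose in-degree is 0 in its own sub-directed graph in a to-be-processed node queue and, starting from a root node/outermost leaf node, broadcasts in the distributed system the node information of the nodes whose in-degree/out-degree is 0 in its own sub-directed graph;
    receiving the in-degree/out-degree of the node corresponding to the node information, as returned by the other processing devices in the distributed system that process sub-directed graphs, and totaling the in-degree/out-degree in the directed graph of the node whose information was broadcast;
    if the totaled in-degree/out-degree of the node corresponding to the node information is 0, deleting the node and subtracting 1 from the in-degree of the direct downstream nodes/direct upstream nodes of the node; if the in-degree of a direct downstream node/direct upstream node becomes 0 after the subtraction, storing that node in the to-be-processed node queue and continuing with the broadcasting step, repeating this cycle until all nodes are processed; if the in-degree/out-degree of a direct downstream node/direct upstream node is not 0 after the subtraction, retaining that node together with its updated in-degree/out-degree; at the same time, sending a notification message to the other processing devices in the distributed system that process sub-directed graphs, to notify the other servers to subtract 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information;
    if the totaled in-degree/out-degree of the node corresponding to the node information is not 0, deleting the node.
  8. The directed graph recognition method according to claim 3, wherein determining the in-degree/out-degree in the directed graph of a node whose in-degree is 0 in a sub-directed graph, and, for a node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of that node until all nodes are processed, includes:
    a processing device in the distributed system that processes a sub-directed graph receives a broadcast message and determines whether the node corresponding to the node information in the broadcast message exists in its own sub-directed graph; if it exists, the device obtains the in-degree/out-degree of the node and returns it to the processing device that broadcast the node information; if it does not exist, the device returns an in-degree/out-degree of 0 for the node to the processing device that broadcast the node information;
    on receiving a notification message, subtracting 1 from the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information carried in the notification message; if the in-degree/out-degree of a direct downstream node/direct upstream node is then equal to 0, storing that node in the to-be-processed node queue and continuing with the broadcasting step, repeating this cycle until all nodes are processed; if the in-degree/out-degree of the direct downstream node/direct upstream node is not equal to 0, doing nothing.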
  9. 根据权利要求7或8所述的有向图识别方法,其中,所述直至所有节点处理完成,包括:
    每隔预先设置的周期从所述待处理节点队列中读取节点信息,如果连续预先设置的次数未读取到节点信息,则所有节点处理完成。
  10. 根据权利要求7或8所述的有向图识别方法,其中,采用多个线程从所述待处理节点队列中并行读取节点信息,采用多线程分别利用读取的节点信息进行所述处理。
  11. 根据权利要求1所述的有向图识别方法,所述方法还包括:
    将所述有向有环图所包含的节点的节点信息存入数据库;
    或者,发出告警信息,告警信息中包括所述有向有环图所包含的节点的节点信息。
  12. 一种计算机可读存储介质,存储有计算机可执行指令,所述计算机可执行指令用于执行权利要求1~权利要求11任一项所述的有向图识别方法。
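  13. An apparatus for implementing directed graph recognition, comprising a memory and a processor, wherein the memory stores instructions executable by the processor for performing the steps of the directed graph recognition method according to any one of claims 1 to 11.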
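  14. A server, comprising a first processing module and a second processing module, wherein
    the first processing module is configured to determine the in-degree/out-degree of each node in a directed graph;
    the second processing module is configured to process, one by one, the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree in the directed graph is 0, until all nodes have been processed; if a node whose in-degree/out-degree is not 0 exists in the directed graph, the directed graph is a directed cyclic graph.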
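  15. The server according to claim 14, wherein, in the second processing module, processing one by one the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the nodes whose in-degree/out-degree in the directed graph is 0, until all nodes have been processed, comprises:
    storing the nodes whose in-degree in the directed graph is 0 in a pending-node queue and, starting from the root node, deleting the node and decrementing by 1 the in-degree of its direct downstream nodes; if the in-degree of a direct downstream node is 0 after the decrement, storing that downstream node in the pending-node queue, repeating in this way until all nodes have been processed; if the in-degree of a direct downstream node is not 0 after the decrement, retaining that direct downstream node together with its updated in-degree;
    or
    storing the nodes whose out-degree in the directed graph is 0 in a pending-node queue and, starting from the outermost leaf node, deleting the node and decrementing by 1 the out-degree of its direct upstream nodes; if the out-degree of a direct upstream node is 0 after the decrement, storing that upstream node in the pending-node queue, repeating in this way until all nodes have been processed; if the out-degree of a direct upstream node is not 0 after the decrement, retaining that direct upstream node together with its updated out-degree.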
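The single-server, in-degree variant of claim 15 matches the well-known queue-based topological peeling often called Kahn's algorithm. The Python sketch below is illustrative only: it assumes the graph is given as a list of (upstream, downstream) edge pairs, and `find_remaining_nodes` is a hypothetical name.

```python
from collections import defaultdict, deque


def find_remaining_nodes(edges):
    """Return the nodes that cannot be peeled off; empty if the graph is acyclic."""
    downstream = defaultdict(list)
    indegree = defaultdict(int)
    for u, v in edges:
        downstream[u].append(v)
        indegree[v] += 1
        indegree[u] += 0  # register u even if nothing points to it
    # Pending-node queue: start from the root nodes (in-degree 0).
    pending = deque(n for n, d in indegree.items() if d == 0)
    while pending:
        node = pending.popleft()
        del indegree[node]                # delete the processed node
        for v in downstream[node]:
            indegree[v] -= 1              # decrement each direct downstream node
            if indegree[v] == 0:
                pending.append(v)         # it has become a root; queue it
    # Nodes still present have a non-zero in-degree: they lie on a cycle
    # or downstream of one.
    return set(indegree)
```

A non-empty result means the directed graph is cyclic. The out-degree variant is symmetric: reverse every edge and run the same routine.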
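  16. A server, comprising a preprocessing module, a processing module, and an interaction module, wherein
    the preprocessing module is configured to determine the in-degree/out-degree of each node in the sub-directed graph corresponding to the server to which the module belongs;
    the interaction module is configured to broadcast, in the distributed system in which its server is located and starting from the root node/outermost leaf node, the node information of the nodes whose in-degree/out-degree in the sub-directed graph corresponding to its server is 0, and to receive the in-degree/out-degree of the node corresponding to the broadcast node information, as returned by the other servers in that distributed system;
    the processing module is configured to store, in a pending-node queue, the node information of the nodes whose in-degree/out-degree in the sub-directed graph corresponding to its server is 0; to determine the in-degree/out-degree, in the directed graph, of each node whose in-degree/out-degree in that sub-directed graph is 0; and, for each node whose in-degree/out-degree in the directed graph is 0, to process the in-degree/out-degree of that node's direct downstream nodes/direct upstream nodes until all nodes have been processed; if a node whose in-degree/out-degree is not 0 exists in the sub-directed graph, the directed graph to which the sub-directed graph belongs is a directed cyclic graph.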
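  17. The server according to claim 16, wherein the interaction module implements the broadcast through any of the following message communication protocols: Hypertext Transfer Protocol (HTTP), Hypertext Transfer Protocol Secure (HTTPS), or a Remote Procedure Call (RPC) protocol.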
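  18. The server according to claim 16, wherein, in the processing module, determining the in-degree/out-degree, in the directed graph, of each node whose in-degree/out-degree in the sub-directed graph corresponding to its server is 0, and, for each node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of that node's direct downstream nodes/direct upstream nodes until all nodes have been processed, comprises:
    computing the in-degree/out-degree in the directed graph of the node whose information was broadcast, from the in-degree/out-degree of the node corresponding to the broadcast node information as received from the other servers in the distributed system in which its server is located;
    if the computed in-degree/out-degree of the node corresponding to the node information is 0, deleting the node and decrementing by 1 the in-degree/out-degree of the node's direct downstream nodes/direct upstream nodes; if the in-degree/out-degree of a direct downstream node/direct upstream node is 0 after the decrement, storing that direct downstream node/direct upstream node in the pending-node queue and continuing with the broadcasting step, repeating in this way until all nodes have been processed; if the in-degree/out-degree of a direct downstream node/direct upstream node is not 0 after the decrement, retaining that direct downstream node/direct upstream node together with its updated in-degree/out-degree; and, at the same time, sending a notification message to the other processing devices in the distributed system that process sub-directed graphs, so as to notify the other servers to decrement by 1 the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information;
    if the computed in-degree/out-degree of the node corresponding to the node information is not 0, deleting the node.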
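  19. The server according to claim 16, wherein, in the processing module, determining the in-degree/out-degree, in the directed graph, of each node whose in-degree/out-degree in the sub-directed graph corresponding to its server is 0, and, for each node whose in-degree/out-degree in the directed graph is 0, processing the in-degree/out-degree of that node's direct downstream nodes/direct upstream nodes until all nodes have been processed, comprises:
    upon receiving a broadcast message, determining whether its own sub-directed graph contains the node corresponding to the node information in the broadcast message; if so, obtaining the in-degree/out-degree of the node and returning it to the processing device in the distributed system that broadcast the node information; if not, returning an in-degree/out-degree of 0 for the node to the processing device in the distributed system that broadcast the node information;
    upon receiving a notification message, decrementing by 1 the in-degree/out-degree of the direct downstream nodes/direct upstream nodes of the node corresponding to the node information carried in the notification message; if the in-degree/out-degree of a direct downstream node/direct upstream node then equals 0, storing that direct downstream node/direct upstream node in the pending-node queue and continuing with the broadcasting step, repeating in this way until all nodes have been processed; if the in-degree/out-degree of the direct downstream node/direct upstream node does not equal 0, taking no action.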
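  20. The server according to claim 16, 18, or 19, wherein, in the processing module, "until all nodes have been processed" comprises:
    reading node information from the pending-node queue at a preset period, and, if no node information is read for a preset number of consecutive reads, determining that all nodes have been processed.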
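  21. The server according to claim 16, 18, or 19, wherein the processing module uses a plurality of threads to process the nodes in the pending-node queue in parallel.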
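  22. The server according to claim 16, wherein the processing module is further configured to:
    store the node information of the nodes contained in the directed cyclic graph in a database;
    or issue alarm information that includes the node information of the nodes contained in the directed cyclic graph.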
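  23. A directed graph recognition system, comprising a distributed system of two or more servers, wherein
    each server is configured to:
    determine the in-degree/out-degree of each node in its own sub-directed graph; starting from the root node/outermost leaf node, broadcast the node information of the nodes whose in-degree/out-degree in its own sub-directed graph is 0; receive the in-degree/out-degree of the node corresponding to the broadcast node information, as returned by the other servers in the distributed system; store, in a pending-node queue, the node information of the nodes whose in-degree/out-degree in its own sub-directed graph is 0; determine the in-degree/out-degree, in the directed graph, of each node whose in-degree/out-degree in its own sub-directed graph is 0 and, for each node whose in-degree/out-degree in the directed graph is 0, process the in-degree/out-degree of that node's direct downstream nodes/direct upstream nodes until all nodes have been processed; if a node whose in-degree/out-degree is not 0 exists in the sub-directed graph, the directed graph to which the sub-directed graph belongs is a directed cyclic graph.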
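  24. The directed graph recognition system according to claim 23, wherein the sub-directed graph corresponding to each server in the distributed system is preset.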
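  25. The directed graph recognition system according to claim 23, further comprising a control node configured to divide the directed graph into a plurality of the sub-directed graphs according to the edges included in the directed graph.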
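  26. The directed graph recognition system according to claim 25, wherein the control node is specifically configured to:
    randomly assign the nodes of the directed graph to different sub-directed graphs according to a preset number of edges per sub-directed graph, and send the sub-directed graph information to each server in the distributed system; or
    calculate, from the node identifiers of the nodes in the directed graph, the number of nodes included in a sub-directed graph;
    obtain the node identifier range of the nodes included in a sub-directed graph from the calculated number of nodes and the number of the server corresponding to that sub-directed graph, and send the sub-directed graph information to each server in the distributed system.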
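  27. The directed graph recognition system according to claim 25, wherein the control node is an independent physical device or is provided on any server in the distributed system.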
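  28. The directed graph recognition system according to any one of claims 23 to 27, wherein the server comprises the server according to any one of claims 16 to 22.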
PCT/CN2020/084119 2019-04-12 2020-04-10 Directed graph recognition method, system, and server WO2020207457A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910293860.1A CN111814002B (zh) 2019-04-12 2019-04-12 Directed graph recognition method, system, and server
CN201910293860.1 2019-04-12

Publications (1)

Publication Number Publication Date
WO2020207457A1 (zh) 2020-10-15

Family

ID=72750920

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/084119 WO2020207457A1 (zh) 2019-04-12 2020-04-10 Directed graph recognition method, system, and server

Country Status (2)

Country Link
CN (1) CN111814002B (zh)
WO (1) WO2020207457A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
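CN112416761B (zh) * 2020-11-11 2023-07-07 北京京航计算通讯研究所 Method and device for generating test cases based on breadth-first search
CN113672780A (zh) * 2021-08-12 2021-11-19 北京金山云网络技术有限公司 Method, apparatus, device, and storage medium for detecting closed loops in a directed graph
CN115994244B (zh) * 2021-10-18 2024-03-19 广州南天电脑系统有限公司 Big-data-based directed graph data processing method, apparatus, and computer device
CN114510338B (zh) * 2022-04-19 2022-09-06 浙江大华技术股份有限公司 Task scheduling method, task scheduling device, and computer-readable storage medium
CN118152206A (zh) * 2022-12-07 2024-06-07 奇安信网神信息技术(北京)股份有限公司 Workflow detection method and apparatus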

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
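CN108733832A (zh) * 2018-05-28 2018-11-02 北京阿可科技有限公司 Distributed storage method for directed acyclic graphs
CN109561148A (zh) * 2018-11-30 2019-04-02 湘潭大学 Distributed task scheduling method based on directed acyclic graphs in an edge computing network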

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9367809B2 (en) * 2013-10-11 2016-06-14 Accenture Global Services Limited Contextual graph matching based anomaly detection
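CN108762201B (zh) * 2018-04-18 2021-02-09 南京工业大学 Graph-theoretic decomposition method for large systems based on Pearson correlation
CN109241355A (zh) * 2018-06-20 2019-01-18 中南大学 Reachability query method and system for directed acyclic graphs, and readable storage medium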

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220101194A1 (en) * 2020-09-30 2022-03-31 EMC IP Holding Company LLC Method, electronic device, and computer program product for processing machine learning model
CN112241474A (zh) * 2020-11-24 2021-01-19 深圳前海微众银行股份有限公司 Information processing method, apparatus, and storage medium
CN112241474B (zh) * 2020-11-24 2023-08-15 深圳前海微众银行股份有限公司 Information processing method, apparatus, and storage medium

Also Published As

Publication number Publication date
CN111814002A (zh) 2020-10-23
CN111814002B (zh) 2024-06-04


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20788320

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20788320

Country of ref document: EP

Kind code of ref document: A1