WO2022262007A1 - Graph algorithm autoincrement method and apparatus, device, and storage medium - Google Patents

Graph algorithm autoincrement method and apparatus, device, and storage medium

Info

Publication number
WO2022262007A1
Authority
WO
WIPO (PCT)
Prior art keywords
algorithm
graph
batch
incremental
processing
Prior art date
Application number
PCT/CN2021/102218
Other languages
French (fr)
Chinese (zh)
Inventor
樊文飞
田超
许瑞琦
尹强
于文渊
周靖人
Original Assignee
深圳计算科学研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳计算科学研究院
Publication of WO2022262007A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms

Definitions

  • The present application relates to the technical field of data processing, and in particular to an automatic incrementalization method, apparatus, device and storage medium for a graph algorithm.
  • Existing incremental graph query algorithms are each designed for a specific problem. This approach lacks generality and requires considerable expertise, which greatly raises the threshold for using incremental algorithms.
  • The main advantage of existing incremental algorithms is that, for a unit update, they have lower complexity than batch recomputation. For a batch update, however, these incremental algorithms are often less efficient than directly recomputing the query result on the updated graph with the batch algorithm, and how to make incremental computation more efficient than batch recomputation in this case is still an open problem.
  • In view of the above problems, the present application provides an automatic incrementalization method, apparatus, device and storage medium for a graph algorithm that overcome the above problems or at least partially solve them, including:
  • An automatic incrementalization method for a graph algorithm, comprising:
  • obtaining a batch graph algorithm;
  • when the batch graph algorithm is a fixed point computation, obtaining, from the data structure and logic of the batch graph algorithm, the step function of the batch graph algorithm as a fixed point computation; generating an initial range function based on the graph update region according to the data structure and logic of the batch graph algorithm; and combining the initial range function and the step function to obtain an incremental processing algorithm corresponding to the batch graph algorithm;
  • or, when the batch graph algorithm is a fixed point computation, judging whether the batch graph algorithm has a bounded incremental property; and when it does, transforming the batch graph algorithm into an incremental processing algorithm that has the bounded incremental property.
  • The method further includes:
  • judging whether the batch graph algorithm is an iteration of a step function over the graph, the state, the query result and the range; wherein the graph is an undirected graph; the state is the state of the data structure of the batch graph algorithm at the beginning of a preset round; the query result is obtained by querying the graph with the batch graph algorithm and is composed of state variables; each state variable is associated with a point or an edge of the graph and has a corresponding update function that updates its value, the input of the update function being a set of state variables; each state variable also corresponds to a logic judgment statement that is true after every update of the state variable by its update function, and all logic judgment statements are true in the final fixed point state; the range consists of all state variables whose logic judgment statement is false at the beginning of the preset round;
  • if the batch graph algorithm is an iteration of a step function over the state, the query result, the graph and the range, determining that the batch graph algorithm is a fixed point computation.
  • The step of generating the initial range function based on the graph update region according to the data structure and logic of the batch graph algorithm includes:
  • generating a directed acyclic graph that contains the topological partial order of the points of the graph in the fixed point state, the topological partial order being used to guide change propagation of the batch graph algorithm;
  • generating the initial range function based on the graph update region from the directed acyclic graph.
  • After the step of combining the initial range function and the step function to obtain the incremental processing algorithm corresponding to the batch graph algorithm, the method further includes:
  • using the step function to iterate the initial state and the initial range to a new fixed point, so as to obtain the query result on the updated graph.
  • The method further includes:
  • modifying the step function and the state of the batch graph algorithm so that the batch graph algorithm has the bounded incremental property; and transforming the batch graph algorithm into an incremental processing algorithm that has the bounded incremental property.
  • The step of converting the batch graph algorithm into an incremental processing algorithm includes:
  • The batch graph algorithm includes any one of a weakly connected component algorithm, a single-source shortest path algorithm, a depth-first search algorithm, a local clustering coefficient algorithm and a graph simulation algorithm.
  • An automatic incrementalization apparatus for a graph algorithm, comprising:
  • a first processing module, configured to: when the batch graph algorithm is a fixed point computation, obtain, from the data structure and logic of the batch graph algorithm, the step function of the batch graph algorithm as a fixed point computation; generate an initial range function based on the graph update region according to the data structure and logic of the batch graph algorithm; and combine the initial range function and the step function to obtain an incremental processing algorithm corresponding to the batch graph algorithm;
  • a second processing module, configured to: when the batch graph algorithm is a fixed point computation, judge whether the batch graph algorithm has a bounded incremental property; and when it does, transform the batch graph algorithm into an incremental processing algorithm that has the bounded incremental property.
  • A device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the automatic incrementalization method for a graph algorithm described above.
  • A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the automatic incrementalization method for a graph algorithm described above.
  • In the embodiments of the present application, a batch graph algorithm is obtained; when the batch graph algorithm is a fixed point computation, the step function of the batch graph algorithm as a fixed point computation is obtained from its data structure and logic; an initial range function based on the graph update region is generated according to the data structure and logic of the batch graph algorithm; and the initial range function and the step function are combined to obtain the incremental processing algorithm corresponding to the batch graph algorithm; or, when the batch graph algorithm is a fixed point computation, it is judged whether the batch graph algorithm has a bounded incremental property, and when it does, the batch graph algorithm is converted into an incremental processing algorithm that has the bounded incremental property. In this way, incremental computation can be performed more efficiently than batch recomputation.
  • Fig. 1 is a flow chart of the steps of an automatic incrementalization method for a graph algorithm provided by an embodiment of the present application;
  • Fig. 2 is a schematic structural diagram of a graph growing model that generates a two-way edge segmentation in an automatic incrementalization method for a graph algorithm provided by an embodiment of the present application;
  • Fig. 3 is a structural block diagram of an automatic incrementalization apparatus for a graph algorithm provided by an embodiment of the present application;
  • Fig. 4 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • Reference numerals: 12. computer device; 14. external device; 16. processing unit; 18. bus; 20. network adapter; 22. I/O interface; 24. display; 28. memory; 30. random access memory; 32. cache memory; 34. storage system; 40. program/utility; 42. program module.
  • FIG. 1 shows an automatic incrementalization method for a graph algorithm provided by an embodiment of the present application, including:
  • In this embodiment, a batch graph algorithm is obtained; when the batch graph algorithm is a fixed point computation, the step function of the batch graph algorithm as a fixed point computation is obtained from its data structure and logic; an initial range function based on the graph update region is generated according to the data structure and logic of the batch graph algorithm; and the initial range function and the step function are combined to obtain the incremental processing algorithm corresponding to the batch graph algorithm; or, when the batch graph algorithm is a fixed point computation, it is judged whether the batch graph algorithm has a bounded incremental property, and when it does, the batch graph algorithm is converted into an incremental processing algorithm that has the bounded incremental property. Incremental computation can thus be performed more efficiently than batch recomputation.
  • The batch graph algorithm A (batch algorithm) is defined as follows:
  • batch graph algorithm A answers query Q on graph data G and outputs A(Q, G) as the query result Q(G);
  • The incremental processing algorithm A_Δ (incremental algorithm) corresponding to batch graph algorithm A is defined as follows:
  • As described in step S120, when batch graph algorithm A is a fixed point computation, the step function f_A of batch graph algorithm A as a fixed point computation is obtained from the data structure and logic of A; an initial range function h based on the graph update region ΔG is generated according to the data structure and logic of A; and the initial range function h and the step function f_A are combined to obtain the incremental processing algorithm A_Δ corresponding to batch graph algorithm A.
  • When run on a graph G, a batch graph algorithm A builds an auxiliary structure D_A that extends graph G with the state variables x_i associated with the nodes and edges of G and with the part sizes of the segmentation result Q(G). The auxiliary structure D_A records the computation and evolves continuously throughout it. The segmentation process of batch graph algorithm A can therefore be described as a sequence of updates to the auxiliary structure D_A.
  • The updates to the auxiliary structure D_A are performed in time order. Let D_A^t denote the state of the auxiliary structure D_A after t iterations; D_A^{t+1} is obtained from D_A^t and the change values that the update function f computes on a specific update region H_t. Specifically:
  • the update function f is obtained from the original logic of batch graph algorithm A; the update region H_t of iteration t+1 is generated by the initial range function h of batch graph algorithm A, taking the state D_A^t as input; the computation continues until a fixed point is reached, at which point the final state is the segmentation result of batch graph algorithm A.
  • The initial range function h determines the update region, that is, the state variables of the auxiliary structure D_A to be updated, from the information of the previous iteration; the update function f derives the actual changes to these state variables, for example the part assignment of nodes and edges that have not yet been assigned. In other words, the computation of batch graph algorithm A is guided by the values that change at run time (change propagation).
  • Example 1: in a graph growing model that generates a two-way edge segmentation, a starting node is selected from the graph, and one segmented region is grown around it by breadth-first search (BFS) until it contains half of the nodes; the remaining nodes are placed into the other segmented region.
  • For example, w'_0 can be selected as the starting node; v_1, ..., v_19, u'_1, ..., u'_19, w'_0 and w_0 are then put into part V_1 by graph growing, while the remaining nodes of G are put into part V_2.
  • The above graph growing model is an iterative batch graph algorithm. Its state variables include the part ID of each node and the size of each part. In each iteration, its update function f takes as input a set of unassigned vertices together with the part ID to be assigned and the corresponding part size; it assigns the part ID to these vertices and updates the part size accordingly. The initial range function h then obtains the unassigned neighbor nodes of the newly assigned nodes.
  • the application of the update function f and the initial range function h can be done in parallel, e.g. to assign different vertices or edges simultaneously in one parallel iteration.
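  • As an illustration only, the graph growing procedure of Example 1 can be sketched in Python as an iteration of an update function and a range function. The names grow_bipartition, f and h, the adjacency-dictionary input, and the rule of stopping at exactly half of the nodes are assumptions made for this sketch; the application does not prescribe an implementation.

```python
def grow_bipartition(adj, start):
    """Two-way graph growing: BFS from `start` until part 1 holds half the nodes.

    `adj` maps each node to a list of neighbours (undirected graph).
    """
    target = len(adj) // 2           # part 1 stops growing at half of the nodes
    part = {v: None for v in adj}    # state variable: part ID of each node
    size = {1: 0, 2: 0}              # state variable: size of each part

    def f(region, pid):
        # Update function: assign `pid` to the unassigned nodes of the region
        # and update the part size; part 1 never grows beyond `target`.
        newly = []
        for v in region:
            if part[v] is None and (pid != 1 or size[1] < target):
                part[v] = pid
                size[pid] += 1
                newly.append(v)
        return newly

    def h(newly):
        # Range function: unassigned neighbours of the newly assigned nodes.
        seen, region = set(), []
        for v in newly:
            for u in adj[v]:
                if part[u] is None and u not in seen:
                    seen.add(u)
                    region.append(u)
        return region

    region = [start]
    while region and size[1] < target:
        region = h(f(region, 1))
    # Every node that is still unassigned goes to the second part.
    f([v for v in adj if part[v] is None], 2)
    return part, size
```

  • Each pass of the while loop is one iteration: f changes the state on the current update region, and h derives the next update region from the nodes that just changed.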
  • The iterative segmentation algorithms that are fixed point computations include: (1) graph growing, greedy growing, k-way graph greedy growing partitioning (KGGGP) and bubble methods; (2) neighbor expansion (NE) and distributed neighbor expansion (DNE), which are applied to vertex segmentation; and (3) the streaming segmentation algorithms Fennel, High-Degree (are) Replicated First (HDRF), Ginger and Greedy. Although streaming segmentation algorithms are developed only for insertions, their incrementalized versions can still handle general updates.
  • The METIS algorithm applied to edge segmentation and the Sheep algorithm applied to vertex segmentation are iterative segmentation algorithms that are not fixed point computations, because they change the topology of the graph during segmentation, which exceeds the expressive power of the iterative computation model.
  • When batch graph algorithm A is a fixed point computation, it is converted into an incremental processing algorithm A_Δ that has two characteristics: (1) it is relatively bounded, and (2) it preserves the segmentation quality of batch graph algorithm A. The details are as follows:
  • The initial range function h of batch graph algorithm A determines another update region H_{T+1} for the next new iteration T+2, and the basic changes related to H_{T+1} are determined in the manner described above; the update function f is then applied to H_{T+1}.
  • These two steps are iterated in the incremental processing algorithm A_Δ, following the same logic as batch graph algorithm A, until the update function f and the initial range function h can no longer be applied, which yields the new final state of this incremental run.
  • The input graph update ΔG may make some parts of the old edge-segmentation (or vertex-segmentation) result Q(G) overweight.
  • The incremental processing algorithm A_Δ then removes a set U_i of nodes (or edges) from each overweight part V_i (or E_i) so that the part satisfies the balance constraint, and reassigns U_i using the same strategy as in (a) above.
  • The incremental processing algorithm A_Δ thus reassigns only those vertices or edges that are either covered by the input update or picked out for rebalancing. This helps to refine the old segmentation, because A_Δ can use all the information of the old segmentation. In some cases, the segmentation quality produced this way is even better than that of re-segmentation with batch graph algorithm A.
  • Example 2: continuing Example 1, consider a graph update ΔG that deletes four edges (w_0, v_1), (w_0, v_2), (w_1, v'_18) and (w_1, v'_19).
  • The incremental processing algorithm A_Δ directly uses the modified range function h' to obtain an initial new update region, which contains the set of nodes covered by the graph update ΔG, namely {w_0, v_1, v_2, w_1, v'_18, v'_19}; it unassigns these nodes, updates the sizes of the corresponding parts, and derives the candidate part IDs from the assignments of their neighbors.
  • The incremental processing algorithm A_Δ then iteratively reassigns these nodes using the original update function f and initial range function h of graph growing (Example 1), placing w_0 (or w_1) into part V_2 (or V_1) and keeping the assignments of all other nodes unchanged. That is, A_Δ only swaps the original assignments of these two nodes.
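  • The flow of Example 2 (unassign the nodes covered by the update, adjust the part sizes, derive candidate parts from the neighbors, reassign) can be sketched as below. This is a simplified illustration, not the method of the application: the function name incremental_reassign, the capacity parameter standing in for the balance constraint, and the neighbor-majority tie-breaking rule are all assumptions.

```python
from collections import Counter


def incremental_reassign(adj, part, size, deleted_edges, capacity):
    """Reassign only the nodes covered by a batch of edge deletions.

    adj: node -> list of neighbours (undirected), modified in place
    part: node -> part ID, modified in place
    size: part ID -> number of nodes, modified in place
    capacity: assumed upper bound on the size of a part
    """
    # Apply the graph update.
    for u, v in deleted_edges:
        adj[u].remove(v)
        adj[v].remove(u)

    # Initial update region: the nodes covered by the update.
    region = sorted({x for edge in deleted_edges for x in edge})

    # Unassign these nodes and update the sizes of their parts.
    for v in region:
        size[part[v]] -= 1
        part[v] = None

    # Reassign each node to the part holding most of its assigned
    # neighbours, among the parts that still have room.
    for v in region:
        votes = Counter(part[u] for u in adj[v] if part[u] is not None)
        ranked = sorted(size, key=lambda p: (-votes[p], size[p], p))
        pid = next((p for p in ranked if size[p] < capacity), ranked[0])
        part[v] = pid
        size[pid] += 1
    return region
```

  • Nodes outside the returned region are never touched, so the work done depends on the size of the update rather than on the size of the graph.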
  • The incremental processing algorithm A_Δ can ensure incremental boundedness when (1) the update function f and the (modified) range functions h and h' can be computed incrementally in time polynomial in the update size, i.e., f(d ⊕ Δd) (or h(d ⊕ Δd), h'(d ⊕ Δd)) can be computed from f(d) (or h(d), h'(d)) and Δd without visiting the whole of d; and (2) the overhead of identifying each set U_i from the overweight part is
  • Some aggregate functions, such as sum and avg, can be computed incrementally, but the functions that compute eigenvectors in spectral segmentation methods cannot.
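  • A minimal sketch of why sum and avg are incrementally computable: the new value follows from the old value and the change alone, without revisiting the whole input. The class name RunningMean and its methods are hypothetical and chosen only for this illustration.

```python
class RunningMean:
    """Maintains sum and average under insertions and deletions."""

    def __init__(self, values=()):
        self.total = 0
        self.count = 0
        for v in values:
            self.add(v)

    def add(self, v):
        # New sum = old sum + inserted value; no pass over the old data.
        self.total += v
        self.count += 1

    def remove(self, v):
        # New sum = old sum - deleted value.
        self.total -= v
        self.count -= 1

    @property
    def mean(self):
        return self.total / self.count
```

  • Each update costs constant time. An eigenvector, by contrast, generally cannot be refreshed from its old value and a small change in this way.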
  • the incremental processing algorithm A_Δ only re-evaluates the update function f for regions involving changes; when these functions can be computed incrementally, the magnitude of the change is given by the polynomial of
  • constraint If
  • the rebalancing is performed when the incremental processing algorithm A_Δ starts. Therefore, when the above two conditions hold, the overall overhead and size
  • As described in step S130, when batch graph algorithm A is a fixed point computation, it is judged whether batch graph algorithm A has a bounded incremental property; when it does, batch graph algorithm A is transformed into an incremental processing algorithm A_Δ that has the bounded incremental property.
  • Batch graph algorithm A is judged to have the bounded incremental property when the cost of incremental computation can be expressed as a polynomial function of the size of the graph update, the segmentation size, and the size of the area affected by the update.
  • This embodiment thus provides a bounded incremental condition: when batch graph algorithm A satisfies it, the incremental processing algorithm A_Δ obtained by the above automatic incrementalization method is incrementally bounded, i.e., its computational cost depends only on the size of the region affected by the graph update and not on the size of the whole graph. This ensures that A_Δ does not perform redundant computation.
  • Here, graph G is an undirected graph; the state is the state of the data structure of batch graph algorithm A at the beginning of a preset round (round t); the query result Q(G) is obtained by querying graph G with batch graph algorithm A and is composed of the state variables x_i; each state variable x_i is associated with a point or an edge of graph G and has a corresponding update function that updates its value, the input of the update function being a set of state variables; each state variable x_i also corresponds to a logic judgment statement that is true after every update of x_i by its update function, and all logic judgment statements are true at the final converged fixed point; the range is the set of all state variables x_i whose logic judgment statement is false at the beginning of the preset round (round t).
  • In each round of iteration, the step function f_A executes the update function of every state variable x_i contained in the range, on the input corresponding to that state variable; the iteration stops when all logic judgment statements are true, that is, when the range is empty.
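  • To make the roles of state variable, update function, logic judgment statement and range concrete, the sketch below casts weakly connected components on an undirected graph in this form, using minimum-label propagation. The choice of algorithm, the names violated, step and wcc, and the predicate "no neighbor holds a smaller label" are assumptions for illustration only.

```python
def violated(adj, x, v):
    # Negated logic judgment statement of x[v]:
    # some neighbour holds a smaller component label.
    return any(x[u] < x[v] for u in adj[v])


def step(adj, x, scope):
    """One round of the step function: update every state variable in the
    range (scope) and return the range of the next round."""
    for v in scope:
        # Update function of x[v]: minimum label among v and its neighbours.
        x[v] = min([x[v]] + [x[u] for u in adj[v]])
    nxt = set()
    for v in scope:
        # Only neighbours of updated variables can have become violated.
        nxt.update(u for u in adj[v] if violated(adj, x, u))
    return nxt


def wcc(adj):
    """Batch run: iterate the step function until the range is empty."""
    x = {v: v for v in adj}                      # initial state
    scope = {v for v in adj if violated(adj, x, v)}
    while scope:
        scope = step(adj, x, scope)
    return x                                     # fixed point state
```

  • At the fixed point every predicate holds, so all nodes of a component carry the smallest node identifier in that component.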
  • The step of generating the initial range function h based on the graph update region ΔG according to the data structure and logic of batch graph algorithm A includes:
  • obtaining, for each state variable x_i, the subset of its input that determines x_i in the fixed point state, as the anchor set of x_i;
  • generating a directed acyclic graph according to the anchor sets, where the directed acyclic graph contains the topological partial order relationship ⁇ C of each point in graph G in the fixed point state, and the topological partial order relationship ⁇ C is used to guide change propagation of batch graph algorithm A;
  • generating the initial range function h based on the graph update region ΔG according to the directed acyclic graph.
  • Specifically, the initial range function h can be generated in the following manner:
  • the directed acyclic graph is used to represent the topological partial order relationship ⁇ C of each point in graph G in the fixed point state; changes propagate along the direction of the dependencies in the directed acyclic graph;
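  • One possible reading of anchor sets and change propagation, shown for single-source shortest paths on an undirected weighted graph: the anchor of a distance variable is the neighbor that determined it, the anchors form a tree (a special case of a directed acyclic graph), and an edge deletion affects exactly the variables whose anchor chain used a deleted edge. The single-anchor simplification and the names dijkstra, initial_scope and delete_edges are assumptions of this sketch; edge insertions are not handled.

```python
import heapq
from math import inf


def dijkstra(adj, src):
    """Batch run; adj maps node -> {neighbour: weight}, symmetric."""
    dist = {v: inf for v in adj}
    anchor = {v: None for v in adj}   # neighbour that determined dist[v]
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u].items():
            if d + w < dist[v]:
                dist[v], anchor[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return dist, anchor


def initial_scope(anchor, deleted):
    """Range function: variables whose anchor chain uses a deleted edge."""
    children = {}
    for v, u in anchor.items():
        children.setdefault(u, []).append(v)
    stack = []
    for u, v in deleted:
        if anchor[v] == u:
            stack.append(v)
        if anchor[u] == v:
            stack.append(u)
    scope = set()
    while stack:                      # follow the dependency direction
        v = stack.pop()
        if v not in scope:
            scope.add(v)
            stack.extend(children.get(v, []))
    return scope


def delete_edges(adj, dist, anchor, deleted):
    """Incremental run for a batch of edge deletions (in place)."""
    scope = initial_scope(anchor, deleted)
    for u, v in deleted:
        del adj[u][v]
        del adj[v][u]
    for v in scope:                   # invalidate affected variables
        dist[v], anchor[v] = inf, None
    heap = []
    for v in scope:                   # re-seed from unaffected neighbours
        for u, w in adj[v].items():
            if u not in scope and dist[u] + w < dist[v]:
                dist[v], anchor[v] = dist[u] + w, u
        if dist[v] < inf:
            heapq.heappush(heap, (dist[v], v))
    while heap:                       # same relaxation logic as the batch run
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u].items():
            if d + w < dist[v]:
                dist[v], anchor[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return scope
```

  • Variables outside the returned scope keep their old values, since their shortest paths do not use any deleted edge.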
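  • After the step of combining the initial range function h and the step function f_A to obtain the incremental processing algorithm A_Δ corresponding to batch graph algorithm A, the method further includes:
  • obtaining the query result Q(G) of graph G and the graph update region ΔG; using the initial range function h to modify the fixed point state of the query result Q(G) into a feasible initial state for the updated graph, together with an initial range on that initial state; and using the step function f_A to iterate the initial state and the initial range to a new fixed point, so as to obtain the query result on the updated graph.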
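  • A small sketch of this flow for edge insertions, again using minimum-label connected components as an assumed example: the old fixed point state is already a feasible initial state, the initial range contains only the endpoints of inserted edges whose predicate became false, and the unchanged step function iterates to the new fixed point. The names violated, step and insert_edges are hypothetical, and deletions would additionally require resetting affected labels, which is not shown.

```python
def violated(adj, x, v):
    # Negated logic judgment statement: a neighbour holds a smaller label.
    return any(x[u] < x[v] for u in adj[v])


def step(adj, x, scope):
    # Same step function as in a batch run of minimum-label propagation.
    for v in scope:
        x[v] = min([x[v]] + [x[u] for u in adj[v]])
    nxt = set()
    for v in scope:
        nxt.update(u for u in adj[v] if violated(adj, x, u))
    return nxt


def insert_edges(adj, x, inserted):
    """Incremental run: adj and the old fixed point x are updated in place."""
    for u, v in inserted:
        adj[u].append(v)
        adj[v].append(u)
    # Initial range: only endpoints of inserted edges can be violated,
    # because the predicates of all other variables see unchanged inputs.
    scope = {n for edge in inserted for n in edge if violated(adj, x, n)}
    rounds = 0
    while scope:
        scope = step(adj, x, scope)
        rounds += 1
    return rounds
```

  • The iteration touches only variables reached by change propagation from the inserted edges; variables in untouched components are never visited.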
  • When batch graph algorithm A does not have the bounded incremental property, it is judged whether A has a weakly incremental property; when it does, the step function f_A and the state of batch graph algorithm A are modified so that A has the bounded incremental property; batch graph algorithm A is then transformed into an incremental processing algorithm A_Δ that has the bounded incremental property.
  • This embodiment thus provides a way of modifying the state and the step function f_A of batch graph algorithm A. After this modification is applied to the class of weakly incremental algorithms that do not have the bounded incremental property, they all acquire the bounded incremental property and can therefore be converted into the corresponding incremental processing algorithm A_Δ.
  • The step of converting batch graph algorithm A into an incremental processing algorithm A_Δ includes:
  • combining the initial range function h and the step function f_A to obtain the incremental processing algorithm A_Δ corresponding to batch graph algorithm A.
  • The batch graph algorithm A includes any one of a single-source shortest path algorithm (Single Source Shortest Path, SSSP), a weakly connected component algorithm (Weakly Connected Component, WCC), a depth-first search algorithm (Depth First Search, DFS), a local clustering coefficient algorithm (Local Clustering Coefficient, LCC) and a graph simulation algorithm (Graph Simulation).
  • As the apparatus embodiment is basically similar to the method embodiment, its description is relatively brief; for related parts, please refer to the description of the method embodiment.
  • FIG. 3 shows an automatic incrementalization apparatus for a graph algorithm provided by an embodiment of the present application, including:
  • an acquisition module 210, configured to acquire a batch graph algorithm A;
  • a first processing module 220, configured to: when batch graph algorithm A is a fixed point computation, obtain, from the data structure and logic of batch graph algorithm A, the step function f_A of A as a fixed point computation; generate an initial range function h based on the graph update region ΔG according to the data structure and logic of A; and combine the initial range function h and the step function f_A to obtain the incremental processing algorithm A_Δ corresponding to batch graph algorithm A;
  • a second processing module 230, configured to: when batch graph algorithm A is a fixed point computation, judge whether A has a bounded incremental property; and when it does, transform batch graph algorithm A into an incremental processing algorithm A_Δ that has the bounded incremental property.
  • The first processing module 220 includes:
  • an anchor set acquisition submodule, configured to acquire the subset of the input that determines the state variable x_i in the fixed point state, as the anchor set of the state variable x_i;
  • a directed acyclic graph generation submodule, configured to generate a directed acyclic graph according to the anchor set, where the directed acyclic graph contains the topological partial order relationship ⁇ C of each point in graph G in the fixed point state, and the topological partial order relationship ⁇ C is used to guide change propagation of batch graph algorithm A;
  • an initial range function generation submodule, configured to generate the initial range function h based on the graph update region ΔG according to the directed acyclic graph.
  • The second processing module 230 includes:
  • a second processing submodule, configured to judge whether batch graph algorithm A has a weakly incremental property when A does not have the bounded incremental property;
  • and, when it does, to modify the step function f_A and the state of batch graph algorithm A so that A has the bounded incremental property, and to transform batch graph algorithm A into an incremental processing algorithm A_Δ that has the bounded incremental property.
  • A third processing module, configured to: obtain the query result Q(G) of graph G and the graph update region ΔG; use the initial range function h to modify the fixed point state of the query result Q(G) into a feasible initial state for the updated graph, together with an initial range on that initial state; and use the step function f_A to iterate the initial state and the initial range to a new fixed point, so as to obtain the query result on the updated graph.
  • FIG. 4 shows a computer device for automatic incrementation of a graph algorithm of the present application, which may specifically include the following:
  • the above-mentioned computer device 12 is expressed in the form of a general-purpose computing device.
  • The components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a memory 28, and a bus 18 connecting the different system components (including the memory 28 and the processing unit 16).
  • The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus structures.
  • By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
  • Computer device 12 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by computer device 12 and include both volatile and nonvolatile media, removable and non-removable media.
  • Memory 28 may include computer system readable media in the form of volatile memory, such as random access memory 30 and/or cache memory 32 .
  • the computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media.
  • By way of example only, the storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (commonly referred to as a "hard drive"). A disk drive for reading from and writing to removable non-volatile magnetic disks (such as "floppy disks") and an optical disk drive for reading from and writing to removable non-volatile optical disks (such as CD-ROM, DVD-ROM or other optical media) may also be provided. In these cases, each drive may be connected to the bus 18 via one or more data media interfaces.
  • The memory may include at least one program product having a set (e.g., at least one) of program modules 42 configured to perform the functions of the various embodiments of the present application.
  • A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in the memory; such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules 42 and program data, and each or some combination of these examples may include an implementation of a network environment.
  • the program modules 42 generally perform the functions and/or methods of the embodiments described herein.
  • the computer device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, pointing device, display 24, camera, etc.), with one or more devices that enable an operator to interact with the computer device 12, and/or with any device (e.g., a network card, modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. Such communication may occur through the I/O interface 22.
  • the computer device 12 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via the network adapter 20. As shown in FIG. 4, the network adapter 20 communicates with other modules of the computer device 12 via the bus 18.
  • LAN local area network
  • WAN wide area network
  • Internet public networks
  • other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units 16, external disk drive arrays
  • the processing unit 16 executes various functional applications and data processing by running the programs stored in the memory 28 , for example, implementing an automatic increment method of a graph algorithm provided by the embodiment of the present application.
  • when the above-mentioned processing unit 16 executes the above-mentioned program, it implements: obtaining a batch graph algorithm A; when the batch graph algorithm A is a fixed-point computation, obtaining, from the data structures and logic of A, the step function f_A of A as a fixed-point computation; generating, according to the data structures and logic of A, an initial range function h based on the graph update region ΔG; and combining the initial range function h and the step function f_A to obtain the incremental algorithm A_Δ corresponding to the batch algorithm; or, when the batch graph algorithm A is a fixed-point computation, determining whether A has the bounded incremental property and, when it does, converting the batch algorithm A into an incremental algorithm A_Δ that has the bounded incremental property.
  • a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the automatic incrementalization method for a graph algorithm provided in all embodiments of the present application.
  • when executed by the processor, the program implements: obtaining a batch graph algorithm A; when the batch graph algorithm A is a fixed-point computation, obtaining, from the data structures and logic of A, the step function f_A of A as a fixed-point computation; generating, according to the data structures and logic of A, an initial range function h based on the graph update region ΔG; and combining the initial range function h and the step function f_A to obtain the incremental algorithm A_Δ corresponding to the batch algorithm; or, when the batch graph algorithm A is a fixed-point computation, determining whether A has the bounded incremental property and, when it does, converting the batch algorithm A into an incremental algorithm A_Δ that has the bounded incremental property.
  • the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of computer readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer readable signal medium may include a data signal carrying computer readable program code in baseband or as part of a carrier wave. Such propagated data signals may take many forms, including - but not limited to - electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • computer program code for performing the operations of the present application may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may execute entirely on the operator computer, partly on the operator computer, as a stand-alone software package, partly on the operator computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer can be connected to the operator computer via any kind of network, including a local area network (LAN) or wide area network (WAN), or it can be connected to an external computer (e.g. using an Internet service provider to connect via the Internet).

Abstract

A graph algorithm autoincrement method and apparatus, a device, and a storage medium, the method comprising: acquiring a batch processing graph algorithm (S110); when the batch processing graph algorithm is a fixed point calculation, acquiring a step function of the batch processing graph algorithm from the data structure and logic of the batch processing graph algorithm for the fixed point calculation; according to the data structure and logic of the batch processing graph algorithm, generating an initial range function based on a graph update region; and combining the initial range function and the step function to obtain an incremental processing algorithm corresponding to the batch processing algorithm (S120); or, when the batch processing graph algorithm is a fixed point calculation, determining whether the batch processing graph algorithm has bounded incremental properties; and, when the batch processing graph algorithm has bounded incremental properties, converting the batch processing algorithm into an incremental processing algorithm, the incremental processing algorithm having bounded incremental properties (S130). In the present method, incremental calculation can be performed in a manner more efficient than batch recomputing.

Description

Automatic incrementalization method, apparatus, device, and storage medium for a graph algorithm
Technical Field
The present application relates to the technical field of data processing, and in particular to an automatic incrementalization method, apparatus, device, and storage medium for a graph algorithm.
Background
Querying a large graph consumes substantial resources and time. To complicate matters, graph data changes frequently, for example through changes to vertex values or insertions and deletions of edges, so some processes must automatically and repeatedly re-query the changing graph data to obtain up-to-date query results. Such processes consume a large amount of computing resources, especially on large graphs.
In this situation, after the graph data is updated, the complexity of a batch query still depends on the size of the graph data itself rather than on the size of the update. This approach becomes very inefficient when the graph is large and updates are frequent.
Existing incremental graph query algorithms are each designed for a specific problem. This approach lacks generality and requires considerable expertise, which greatly raises the barrier to using incremental algorithms. Moreover, the main advantage of existing incremental algorithms is that, for a unit update, they have lower complexity than batch processing; for a batch update, however, they are often no more efficient than simply rerunning the batch algorithm on the updated graph. How to perform incremental computation more efficiently than batch recomputation in this case remains an open problem.
Summary of the Invention
In view of the above problems, the present application is proposed to provide an automatic incrementalization method, apparatus, device, and storage medium for a graph algorithm that overcome, or at least partially solve, those problems, including:
An automatic incrementalization method for a graph algorithm, comprising:
obtaining a batch graph algorithm;
when the batch graph algorithm is a fixed-point computation, obtaining, from the data structures and logic of the batch graph algorithm, the step function of the batch graph algorithm as a fixed-point computation; generating, according to the data structures and logic of the batch graph algorithm, an initial range function based on the graph update region; and combining the initial range function and the step function to obtain an incremental algorithm corresponding to the batch algorithm;
or
when the batch graph algorithm is a fixed-point computation, determining whether the batch graph algorithm has the bounded incremental property; and, when it does, converting the batch algorithm into an incremental algorithm that has the bounded incremental property.
Preferably, the method further comprises:
determining whether the batch graph algorithm is an iteration of a step function over a graph, a state, a query result, and a range, wherein: the graph is an undirected graph; the state is the state of the data structures of the batch graph algorithm at the start of a given round; the query result is the result obtained by querying the graph with the batch graph algorithm and consists of state variables; each state variable is associated with a node or an edge of the graph and has a corresponding update function that updates its value, the input of the update function being a set of state variables; each state variable corresponds to a logical predicate such that the predicate is true after each update of the state variable by its update function, and all predicates are true in the final fixed-point state; and the range comprises all state variables whose predicates are false at the start of the given round;
if the batch graph algorithm is an iteration of a step function over the state, the query result, the graph, and the range, determining that the batch graph algorithm is a fixed-point computation.
Preferably, the step of generating, according to the data structures and logic of the batch graph algorithm, the initial range function based on the graph update region comprises:
obtaining a subset of the input of the update function in the fixed-point state as the anchor set of the state variable;
generating a directed acyclic graph from the anchor sets, the directed acyclic graph encoding the topological partial order of the nodes of the graph in the fixed-point state, the partial order being used to guide change propagation in the batch graph algorithm; and
generating, from the directed acyclic graph, the initial range function based on the graph update region.
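To make the anchor-set idea concrete, the following Python sketch works through it for single-source shortest paths. It assumes positive edge weights and an undirected graph stored as nested dictionaries. The function names `anchor_sets` and `initial_range`, the choice of "tight" neighbours as the anchor set, and the conservative closure over descendants in the anchor DAG are all illustrative assumptions, not the procedure of the application itself.

```python
import math


def anchor_sets(adj, dist):
    """Assumed anchor set of dist[v]: neighbours u with dist[u] + w(u, v) == dist[v].

    adj maps each node to a dict {neighbour: positive_weight}; dist is the
    fixed-point state (final shortest distances). With positive weights the
    edges u -> v for u in anchors[v] form a directed acyclic graph.
    """
    anchors = {v: set() for v in adj}
    for u in adj:
        if dist[u] == math.inf:
            continue  # unreachable nodes anchor nothing
        for v, w in adj[u].items():
            if dist[u] + w == dist[v]:
                anchors[v].add(u)
    return anchors


def initial_range(anchors, deleted_edges):
    """Hypothetical initial range function h for edge deletions.

    Returns the nodes whose anchor set loses a member, together with all of
    their descendants in the anchor DAG (a conservative over-approximation).
    """
    children = {v: set() for v in anchors}
    for v, us in anchors.items():
        for u in us:
            children[u].add(v)
    seeds = set()
    for u, v in deleted_edges:
        if u in anchors[v]:
            seeds.add(v)
        if v in anchors[u]:
            seeds.add(u)
    affected, stack = set(), list(seeds)
    while stack:
        x = stack.pop()
        if x not in affected:
            affected.add(x)
            stack.extend(children[x])
    return affected
```

In this sketch, deleting an edge that is not in any anchor set yields an empty range, so no recomputation would be needed for that update. Integer weights are assumed so that the equality test is exact.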
Preferably, after the step of combining the initial range function and the step function to obtain the incremental algorithm corresponding to the batch algorithm, the method further comprises:
obtaining the query result and the update region of the graph;
using the initial range function to turn the fixed-point state of the query result into an initial state that is feasible for the updated graph, together with an initial range on that initial state; and
using the step function to iterate the initial state and the initial range to a new fixed point, which gives the query result of the updated graph.
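The steps above can be sketched for weakly connected components computed by minimum-label propagation. This is a minimal illustration under stated assumptions: only edge insertions are handled (for which the old labels remain a feasible starting state), node identifiers are comparable integers, and all function names are hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Undirected adjacency sets from an iterable of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def step_to_fixed_point(adj, label, scope):
    """Step function f_A, iterated: push smaller labels along edges until nothing changes."""
    queue = deque(scope)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if label[u] < label[v]:
                label[v] = label[u]
                queue.append(v)  # v changed, so it re-enters the scope
    return label


def batch_wcc(nodes, edges):
    """Batch algorithm A: every node starts in scope, labelled with its own id."""
    label = {n: n for n in nodes}
    return step_to_fixed_point(build_adjacency(edges), label, list(nodes))


def initial_scope(label, inserted_edges):
    """Initial range function h: for each inserted edge joining two different
    labels, the endpoint holding the smaller label may push it across."""
    scope = []
    for u, v in inserted_edges:
        if label[u] != label[v]:
            scope.append(u if label[u] < label[v] else v)
    return scope


def incremental_wcc(edges, old_label, inserted_edges):
    """Incremental algorithm A_delta: reuse the old fixed point, then iterate f_A."""
    adj = build_adjacency(list(edges) + list(inserted_edges))
    label = dict(old_label)  # feasible initial state for insertions only
    return step_to_fixed_point(adj, label, initial_scope(label, inserted_edges))
```

Starting from the old fixed point means the work done is driven by the nodes whose labels actually change, rather than by the size of the whole graph. Handling deletions would additionally require resetting the affected labels, which this sketch omits.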
Preferably, the method further comprises:
when the batch graph algorithm does not have the bounded incremental property, determining whether the batch graph algorithm has the weakly incrementalizable property;
when the batch graph algorithm has the weakly incrementalizable property, modifying the step function and the state of the batch graph algorithm so that the batch graph algorithm has the bounded incremental property; and converting the batch algorithm into an incremental algorithm that has the bounded incremental property.
Preferably, the step of converting the batch algorithm into an incremental algorithm comprises:
obtaining, from the data structures and logic of the batch graph algorithm, the step function of the batch graph algorithm as a fixed-point computation;
generating, according to the data structures and logic of the batch graph algorithm, an initial range function based on the graph update region; and
combining the initial range function and the step function to obtain the incremental algorithm corresponding to the batch algorithm.
Preferably, the batch graph algorithm is any one of a weakly connected components algorithm, a single-source shortest path algorithm, a depth-first search algorithm, a local clustering coefficient algorithm, and a graph simulation algorithm.
An automatic incrementalization apparatus for a graph algorithm, comprising:
an obtaining module, configured to obtain a batch graph algorithm;
a first processing module, configured to, when the batch graph algorithm is a fixed-point computation, obtain, from the data structures and logic of the batch graph algorithm, the step function of the batch graph algorithm as a fixed-point computation; generate, according to the data structures and logic of the batch graph algorithm, an initial range function based on the graph update region; and combine the initial range function and the step function to obtain an incremental algorithm corresponding to the batch algorithm;
or
a second processing module, configured to, when the batch graph algorithm is a fixed-point computation, determine whether the batch graph algorithm has the bounded incremental property; and, when it does, convert the batch algorithm into an incremental algorithm that has the bounded incremental property.
A device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the automatic incrementalization method for a graph algorithm described above.
A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the automatic incrementalization method for a graph algorithm described above.
The present application has the following advantages:
In the embodiments of the present application, a batch graph algorithm is obtained; when the batch graph algorithm is a fixed-point computation, the step function of the batch graph algorithm as a fixed-point computation is obtained from its data structures and logic, an initial range function based on the graph update region is generated according to those data structures and logic, and the initial range function and the step function are combined to obtain the incremental algorithm corresponding to the batch algorithm; or, when the batch graph algorithm is a fixed-point computation, it is determined whether the batch graph algorithm has the bounded incremental property, and when it does, the batch algorithm is converted into an incremental algorithm that has the bounded incremental property. In this way, incremental computation can be performed more efficiently than batch recomputation.
Brief Description of the Drawings
To illustrate the technical solutions of the present application more clearly, the drawings used in the description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the steps of an automatic incrementalization method for a graph algorithm provided by an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a graph growing model that produces a two-way edge-cut partition, used in an automatic incrementalization method for a graph algorithm provided by an embodiment of the present application;
Fig. 3 is a structural block diagram of an automatic incrementalization apparatus for a graph algorithm provided by an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
12. Computer device; 14. External device; 16. Processing unit; 18. Bus; 20. Network adapter; 22. I/O interface; 24. Display; 28. Memory; 30. Random access memory; 32. Cache memory; 34. Storage system; 40. Program/utility; 42. Program module.
Detailed Description
To make the above objects, features, and advantages of the present application more apparent and understandable, the application is described in further detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the scope of protection of this application.
Referring to Fig. 1, an automatic incrementalization method for a graph algorithm provided by an embodiment of the present application is shown, comprising:
S110. Obtaining a batch graph algorithm A;
S120. When the batch graph algorithm A is a fixed-point computation, obtaining, from the data structures and logic of A, the step function f_A of A as a fixed-point computation; generating, according to the data structures and logic of A, an initial range function h based on the graph update region ΔG; and combining the initial range function h and the step function f_A to obtain the incremental algorithm A_Δ corresponding to the batch algorithm;
or
S130. When the batch graph algorithm A is a fixed-point computation, determining whether A has the bounded incremental property; and, when it does, converting the batch algorithm A into an incremental algorithm A_Δ that has the bounded incremental property.
In the embodiments of the present application, a batch graph algorithm is obtained; when the batch graph algorithm is a fixed-point computation, the step function of the batch graph algorithm as a fixed-point computation is obtained from its data structures and logic, an initial range function based on the graph update region is generated according to those data structures and logic, and the initial range function and the step function are combined to obtain the incremental algorithm corresponding to the batch algorithm; or, when the batch graph algorithm is a fixed-point computation, it is determined whether the batch graph algorithm has the bounded incremental property, and when it does, the batch algorithm is converted into an incremental algorithm that has the bounded incremental property. In this way, incremental computation can be performed more efficiently than batch recomputation.
The automatic incrementalization method for a graph algorithm in this exemplary embodiment is described further below.
As described in step S110, the batch graph algorithm A is obtained.
The batch graph algorithm A (batch algorithm) is defined as follows:
Given graph data G and a query Q, the batch graph algorithm A answers the query Q on G and outputs A(Q, G) as the query result Q(G).
The incremental algorithm A_Δ corresponding to the batch graph algorithm A is defined as follows:
Given graph data G, a query Q, the query result Q(G), and a graph data update ΔG, the incremental algorithm A_Δ processes the update ΔG on G and outputs A_Δ(Q, G, Q(G), ΔG) as the change of
Figure PCTCN2021102218-appb-000001
relative to Q(G), that is:
Figure PCTCN2021102218-appb-000002
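Read together, the two definitions say that the incremental algorithm only has to produce the change to the old answer, not the new answer from scratch. One plausible way to write this relation is shown below. The operator ⊕ (applying an update to a graph or to a query result) is an assumed notation that does not appear in the surviving text, so the formula should be taken as a reading aid rather than as the exact equation of the application.

```latex
% Assumption: G \oplus \Delta G denotes the graph G after applying the update \Delta G,
% and Q(G) \oplus \Delta O denotes the old result with the change \Delta O applied.
\[
  Q(G \oplus \Delta G) \;=\; Q(G) \,\oplus\, A_{\Delta}\bigl(Q,\; G,\; Q(G),\; \Delta G\bigr)
\]
```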
As described in step S120, when the batch graph algorithm A is a fixed-point computation, the step function f_A of A as a fixed-point computation is obtained from the data structures and logic of A; an initial range function h based on the graph update region ΔG is generated according to the data structures and logic of A; and the initial range function h and the step function f_A are combined to obtain the incremental algorithm A_Δ corresponding to the batch algorithm.
When run on a graph G, the batch graph algorithm A builds an auxiliary structure D_A. The auxiliary structure D_A extends the graph G with the state variables x_i associated with the nodes and edges of G and with the part sizes of the partition result Q(G). D_A records the computation and evolves throughout it, so the partitioning process of the batch graph algorithm A can be described as a sequence of updates to D_A.
The auxiliary structure D_A is usually updated round by round. Let
Figure PCTCN2021102218-appb-000003
denote the state of the auxiliary structure D_A after t rounds of iteration.
Figure PCTCN2021102218-appb-000005
is obtained from the input
Figure PCTCN2021102218-appb-000004
and the changes computed by the update function f on a specific update region H_t, as follows:
Figure PCTCN2021102218-appb-000006
Figure PCTCN2021102218-appb-000007
The update function f is obtained from the original logic of the batch graph algorithm A. The update region H_t of round t+1 takes the state
Figure PCTCN2021102218-appb-000008
as input and is generated by the initial range function h of the batch graph algorithm A. The computation continues until
Figure PCTCN2021102218-appb-000009
holds, at which point
Figure PCTCN2021102218-appb-000010
is returned as the partition result of the batch graph algorithm A.
If a batch graph algorithm A fits the above iterative computation model, it is determined to be a fixed-point computation.
Note that the initial range function h determines the update region from the information of the previous iteration, namely the state variables of the auxiliary structure D_A that are to be updated, while the update function f derives the actual changes to these state variables, for example by deciding the part assignment of unassigned nodes and edges. In other words, the computation of the batch graph algorithm A is guided by the changes that occur at run time (that is, by change propagation).
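The iterative model can be sketched as a small driver loop in Python. The example below instantiates it with single-source shortest paths rather than partitioning, purely for brevity. It is a sketch under assumptions: the scope is simplified to "nodes whose value changed in the previous round", the loop stops when that scope is empty, and the names `fixed_point`, `make_sssp_step`, and `batch_sssp` are hypothetical.

```python
import math


def fixed_point(state, scope, step):
    """Apply the step function round by round until the scope is empty."""
    rounds = 0
    while scope:
        state, scope = step(state, scope)
        rounds += 1
    return state, rounds


def make_sssp_step(adj):
    """Build a step function for shortest paths.

    adj maps each node to a dict {neighbour: non_negative_weight}.
    The returned function plays the role of f (new state) and h (next scope).
    """
    def step(dist, scope):
        new_dist = dict(dist)
        next_scope = set()
        for u in scope:
            for v, w in adj[u].items():
                if dist[u] + w < new_dist[v]:
                    new_dist[v] = dist[u] + w  # change computed on the update region
                    next_scope.add(v)          # v's neighbours must be re-checked
        return new_dist, next_scope
    return step


def batch_sssp(adj, source):
    """Batch run: start from the source alone and iterate to the fixed point."""
    dist = {v: math.inf for v in adj}
    dist[source] = 0
    return fixed_point(dist, {source}, make_sssp_step(adj))
```

Because each round touches only the nodes in the current scope, the same driver can later be started from a previously computed state and a small initial scope instead of from scratch.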
例1,对于一个产生两路边缘分割的图增长模型,从图中选择一个起始节点,通过宽度优先搜索(breadth-first search,BFS)在其周围一个边缘分割的分割区域进行图增长,直到一半的节点被包含在内,将其余节点放入另一个边缘分割的分割区域。如图2中的图G所示(只考虑实心的边),可以选择w' 0作为起始节点,然后把v 1,……,v 19,u' 1,……,u' 19,w' 0和w 0放入V 1部分进行图增长,而将G中的其余节点放入V 2部分。 Example 1, for a graph growth model that generates two-way edge segmentation, a starting node is selected from the graph, and the graph is grown around an edge-segmented segmented area by breadth-first search (BFS) until Half of the nodes are included, and the rest of the nodes are placed into another edge-segmented split region. As shown in graph G in Figure 2 (only solid edges are considered), w' 0 can be selected as the starting node, and then v 1 ,...,v 19 , u' 1 ,...,u' 19 ,w ' 0 and w 0 are put into the V1 part for graph growth, while the rest of the nodes in G are put into the V2 part.
上述图增长模型是一个迭代的批处理图算法,它的状态变量包括每个节 点的部分ID和部分大小,在每次迭代中,它的更新函数f将一组未分配的顶点、一个待更新的部分ID和对应的部分大小作为输入,它将部分ID分配给这些顶点并相应地更新它们的大小,并通过初始范围函数h获得新分配的节点未分配的邻居节点。The above graph growth model is an iterative batch graph algorithm. Its state variables include the part ID and part size of each node. In each iteration, its update function f takes a set of unallocated vertices, a node to be updated As input, the part ID and corresponding part size of , it assigns part IDs to these vertices and updates their sizes accordingly, and obtains the unassigned neighbor nodes of newly assigned nodes through the initial range function h.
更新函数f和初始范围函数h的应用可以并列进行,例如,在一次并行迭代中同时分配不同的顶点或边。The application of the update function f and the initial range function h can be done in parallel, e.g. to assign different vertices or edges simultaneously in one parallel iteration.
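As an illustration of Example 1, the following minimal Python sketch grows one part by BFS and marks where the update function f and the range function h act. It is a sketch under assumptions, not the implementation of this application: the name `graph_growing_bipartition`, the adjacency-dict representation and the sequential FIFO order are illustrative choices.

```python
from collections import deque


def graph_growing_bipartition(adj, start):
    """Two-way edge-cut partition grown by BFS from `start` (illustrative).

    adj: dict mapping each node to an iterable of its neighbours (undirected).
    Returns (part, size): part[v] is 1 or 2, size[p] is the node count of part p.
    """
    target = len(adj) // 2      # grow part 1 until it holds half of the nodes
    part = {}                   # state variable: part ID of each node
    size = {1: 0, 2: 0}         # state variable: size of each part
    scope = deque([start])      # update region of the next iteration
    queued = {start}
    while scope and size[1] < target:
        v = scope.popleft()
        # update function f: assign the part ID and update the part size
        part[v] = 1
        size[1] += 1
        # range function h: unassigned neighbours of the newly assigned node
        for u in adj[v]:
            if u not in queued:
                queued.add(u)
                scope.append(u)
    for v in adj:               # every remaining node goes to part 2
        if v not in part:
            part[v] = 2
            size[2] += 1
    return part, size
```

On a path graph with six nodes, starting from one end, the sketch places the first three nodes in part 1 and the last three in part 2.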
Iterative partitioning algorithms that are fixed point computations include: (1) for edge-cut partitioning, graph growing, greedy growing, K-Way Graph Greedy Growing Partitioning (KGGGP) and bubble methods; (2) for vertex-cut partitioning, neighborhood embedding (NE) and discriminant neighborhood embedding (DNE); and (3) for the streaming setting, Fennel, High-Degree (are) Replicated First (HDRF), Ginger and Greedy. Although the streaming partitioning algorithms were developed for insertions only, they can still be incrementalized to handle general updates.
METIS, for edge-cut partitioning, and Sheep, for vertex-cut partitioning, are iterative partitioning algorithms that are not fixed point computations, because they change the topology of the graph during partitioning, which is beyond the expressive power of the iterative computation model.
If the batch graph algorithm A is a fixed point computation, the batch graph algorithm A is converted into an incremental processing algorithm A_Δ that has two properties: (1) it is relatively incrementally bounded, and (2) it preserves the partition quality of the batch graph algorithm A. Specifically:
(1) Relative incremental boundedness. Given a graph G, a graph update region ΔG, a balance factor ψ, the query result Q(G) produced by running the batch graph algorithm A on G, and the possible auxiliary structure D_A of the batch graph algorithm A, the incremental processing algorithm A_Δ computes from the graph update region ΔG a change ΔO such that
Figure PCTCN2021102218-appb-000011
is a query result. Its cost can be expressed as |CHANGED| = |ΔG| + |ΔO|; in addition, the size |ΔO| can be expressed as a polynomial in |ΔG|.
(2) Preservation of the original partition quality. The new query result
Figure PCTCN2021102218-appb-000012
is required to keep the same balance factor as Q(G); in addition, the incremental processing algorithm A_Δ is required to keep the same constraint on the partition size as the batch graph algorithm A. Therefore, if the batch graph algorithm A is widely used, the quality of the incremental processing algorithm A_Δ is also acceptable to users of the batch graph algorithm A.
To derive an effective incremental processing algorithm A_Δ from the batch graph algorithm A, the basic changes related to the graph update region ΔG are identified first, and these changes are then processed with the update function f and the initial range function h of the batch graph algorithm A, as follows:
(a) Resuming the iteration. Given the graph update region ΔG and the final state
Figure PCTCN2021102218-appb-000013
of the batch run of the batch graph algorithm A, the incremental processing algorithm A_Δ first finds the (small) region of changes induced by the graph update region ΔG, and then applies these changes to the state
Figure PCTCN2021102218-appb-000014
to obtain a new state
Figure PCTCN2021102218-appb-000015
that is, it resumes the partitioning process in a new iteration. These changes are called the basic changes related to the update. More specifically, the basic changes f(H_T) in the first new iteration t+1 are computed by the update function f of the batch graph algorithm A, where
Figure PCTCN2021102218-appb-000016
is an update region determined by a range function h' revised from the initial range function h of the batch graph algorithm A. As in the batch graph algorithm A, the basic changes f(H_T) are applied to the state variables in H_T.
Based on the new state
Figure PCTCN2021102218-appb-000017
the initial range function h of the batch graph algorithm A determines another update region H_{T+1} for the next new iteration t+2, and the basic changes related to H_{T+1} are determined in the same way; the update function f is then applied to H_{T+1}. The incremental processing algorithm A_Δ iterates these two steps, following the same logic as the batch graph algorithm A, until neither the update function f nor the initial range function h can be applied any further, which yields the new final state of this incremental run
Figure PCTCN2021102218-appb-000018
(b) Rebalancing. The input graph update region ΔG may make some parts of the old edge-cut (or vertex-cut) partition Q(G) overweight. When this happens, the incremental processing algorithm A_Δ removes a set U_i of nodes (or edges) from each overweight part V_i (or E_i) so that the part satisfies the balance constraint, and then reassigns U_i with the same strategy as in (a) above.
In effect, the incremental processing algorithm A_Δ reassigns only those vertices or edges that are either covered by the input update or picked for rebalancing. This helps refine the old partition, because the incremental processing algorithm A_Δ can use all the information of the old partition. In some cases the resulting partition quality is even better than that obtained by repartitioning with the batch graph algorithm A.
Example 2. Continuing Example 1, consider a graph update region ΔG that deletes four edges from graph G: (w_0, v_1), (w_0, v_2), (w_1, v'_18) and (w_1, v'_19).
Here, the incremental processing algorithm A_Δ directly uses the revised range function h' to obtain the initial new update region, which contains the set of nodes covered by the graph update region ΔG, namely {w'_0, v_1, v_2, w_1, v'_18, v'_19}. It unassigns these nodes, updates the sizes of the corresponding parts, and infers the candidate part IDs from the assignments of their neighbors. Next, the incremental processing algorithm A_Δ iteratively reassigns these nodes using the original update function and initial range function h of graph growing (Example 1), placing w_0 (or w_1) into part V_2 (or V_1) and leaving the assignments of all other nodes unchanged. That is, the incremental processing algorithm A_Δ only swaps the original assignments of two nodes.
This is because the incremental processing algorithm A_Δ keeps updating D_A with the original logic of the batch graph algorithm A, and confines the changes to the state variables related to the graph update region ΔG and to load balancing.
The incremental processing algorithm A_Δ is guaranteed to be incrementally bounded when (1) the update function f and the (revised) range functions h and h' can be computed incrementally in time polynomial in the size of the update, that is,
Figure PCTCN2021102218-appb-000019
(or
Figure PCTCN2021102218-appb-000020
) can be computed from f(d) (or h(d), h'(d)) and Δd without accessing the entire d; and (2) the cost of identifying each set U_i from an overweight part is polynomial in |U_i|.
Specifically, some aggregate functions, such as sum and avg, can be computed incrementally, whereas the function that computes eigenvectors in spectral partitioning methods cannot.
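To make the notion of an incrementally computable aggregate concrete, the following minimal Python sketch maintains sum and avg under insertions and deletions from the previous result and the change alone, without revisiting the whole input. The class name `RunningAverage` and its methods are hypothetical and only serve as an illustration.

```python
class RunningAverage:
    """Maintains sum and avg of a multiset under single-value updates."""

    def __init__(self, values=()):
        self.total = 0.0
        self.count = 0
        for v in values:
            self.insert(v)

    def insert(self, v):
        # New sum and count follow from the old ones and the inserted value.
        self.total += v
        self.count += 1

    def delete(self, v):
        # Assumes v was previously inserted; no access to other values needed.
        self.total -= v
        self.count -= 1

    @property
    def avg(self):
        return self.total / self.count if self.count else 0.0
```

Each update costs constant time regardless of how many values have been aggregated, which is the behavior condition (1) asks of f, h and h'. An eigenvector, by contrast, generally has to be recomputed from the whole matrix after a change.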
In fact, the incremental processing algorithm A_Δ re-evaluates the update function f only on the region involved in the changes. When these functions can be computed incrementally, the size of the changes is bounded by a polynomial in |ΔG| and ∑_{i∈[1,k]} |U_i|. If |U_i| ≥ |ΔG| for i ∈ [1,k], the balance constraint can be met again after rebalancing, and ∑_{i∈[1,k]} |U_i| can be bounded by O(|ΔG|). Note that rebalancing is carried out as soon as the incremental processing algorithm A_Δ starts. Therefore, when the two conditions above hold, both the overall cost of the incremental processing algorithm A_Δ and the size |ΔO| are determined by |ΔG| alone, that is, the incremental processing algorithm A_Δ is incrementally bounded.
As described in step S130, when the batch graph algorithm A is a fixed point computation, it is determined whether the batch graph algorithm A has the bounded incrementalization property; when the batch graph algorithm A has the bounded incrementalization property, the batch graph algorithm A is converted into an incremental processing algorithm A_Δ, and the incremental processing algorithm A_Δ has the bounded incrementalization property.
The batch graph algorithm A is judged to have the bounded incrementalization property if the cost of the incremental computation can be expressed as a polynomial function of three quantities: the graph update, the partition size, and the size of the region of the partition affected by the update. It should be noted that this embodiment provides a bounded incrementalization condition: when the batch graph algorithm A has this property, the incremental processing algorithm A_Δ obtained by the automatic incrementalization method above is guaranteed to be incrementally bounded, that is, its computational cost depends only on the size of the region affected by the graph update and is independent of the size of the whole graph. This ensures that the incremental processing algorithm A_Δ performs no redundant computation.
In this embodiment, the method further includes:
determining whether the batch graph algorithm A is, with respect to the state
Figure PCTCN2021102218-appb-000021
the query result Q(G), the graph G and the range
Figure PCTCN2021102218-appb-000022
an iteration of a step function
Figure PCTCN2021102218-appb-000023
where the graph G is an undirected graph; the state
Figure PCTCN2021102218-appb-000024
is the state of the data structure of the batch graph algorithm A at the beginning of a preset round (the t-th round); the query result Q(G) is the result obtained by querying the graph G with the batch graph algorithm A and consists of the state variables x_i; each state variable x_i is associated with a node or an edge of the graph G and has a corresponding update function for updating its value
Figure PCTCN2021102218-appb-000025
the update function
Figure PCTCN2021102218-appb-000026
takes as input
Figure PCTCN2021102218-appb-000027
which is a set of the state variables x_i; each state variable x_i corresponds to a logical predicate
Figure PCTCN2021102218-appb-000028
such that, each time the state variable x_i is updated with the update function
Figure PCTCN2021102218-appb-000029
the logical predicate
Figure PCTCN2021102218-appb-000030
is true, and at every fixed point finally converged to, all the logical predicates
Figure PCTCN2021102218-appb-000031
are true; the range
Figure PCTCN2021102218-appb-000032
consists of all the state variables x_i for which, at the beginning of the preset round (the t-th round), the logical predicate
Figure PCTCN2021102218-appb-000033
is false;
the step function f_A, in each round of iteration, executes on all the state variables x_i contained in the range
Figure PCTCN2021102218-appb-000034
the input corresponding to each state variable x_i
Figure PCTCN2021102218-appb-000035
and the iteration proceeds until all the logical predicates
Figure PCTCN2021102218-appb-000036
are true, that is, until the range
Figure PCTCN2021102218-appb-000037
Figure PCTCN2021102218-appb-000038
at which point the iteration stops.
If the batch graph algorithm A is, with respect to the state
Figure PCTCN2021102218-appb-000039
the query result Q(G), the graph G and the range
Figure PCTCN2021102218-appb-000040
an iteration of a step function
Figure PCTCN2021102218-appb-000041
the batch graph algorithm A is determined to be a fixed point computation.
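The iteration described above can be sketched as a generic worklist loop: a range holds the state variables whose predicate may be false, an update function re-evaluates one variable, and the loop stops when the range is empty. The Python sketch below is a minimal illustration under assumptions, not the step function of this application; the names `fixpoint`, `update`, `dependents` and the WCC instance are all hypothetical.

```python
def fixpoint(state, scope, update, dependents):
    """Apply `update` to the variables in `scope` until the scope is empty.

    state:      dict variable -> current value
    scope:      set of variables whose predicate may currently be false
    update:     function (variable, state) -> new value for that variable
    dependents: function variable -> variables to re-check when it changes
    """
    while scope:
        x = scope.pop()
        new_value = update(x, state)
        if new_value != state[x]:
            state[x] = new_value
            # A changed variable may falsify the predicates of its dependents.
            scope.update(dependents(x))
    return state


def wcc(adj):
    """Weakly connected components by minimum-label propagation (illustrative)."""
    state = {v: v for v in adj}          # each node starts with its own label

    def update(v, st):
        # Predicate at the fixed point: label(v) <= label(u) for all neighbours.
        return min([st[v]] + [st[u] for u in adj[v]])

    return fixpoint(state, set(adj), update, lambda v: adj[v])
```

Termination in the WCC instance follows from labels only ever decreasing; other algorithms need their own argument for convergence.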
In this embodiment, the step of generating the initial range function h based on the graph update region ΔG according to the data structure and logic of the batch graph algorithm A includes:
obtaining, from the input
Figure PCTCN2021102218-appb-000042
a subset that determines the state variable x_i in the fixed point state, as the anchor set of the state variable x_i
Figure PCTCN2021102218-appb-000043
generating a directed acyclic graph according to the anchor set
Figure PCTCN2021102218-appb-000044
where the directed acyclic graph contains the topological partial order <_C of the nodes of the graph G in the fixed point state, and the topological partial order <_C is used to guide the change propagation of the batch graph algorithm A;
generating the initial range function h based on the graph update region ΔG according to the directed acyclic graph.
Specifically, the initial range function h can be generated as follows:
(1) Change propagation. Each state variable x_i is updated with its corresponding update function
Figure PCTCN2021102218-appb-000045
so the value of the state variable x_i depends on what the update function
Figure PCTCN2021102218-appb-000046
receives as input
Figure PCTCN2021102218-appb-000047
Moreover, the state variable x_i usually ends up depending only on a subset of the input
Figure PCTCN2021102218-appb-000048
rather than on all of it. Therefore, the subset of the input
Figure PCTCN2021102218-appb-000049
that determines the state variable x_i in the fixed point state is recorded as the anchor set of the state variable x_i
Figure PCTCN2021102218-appb-000050
A directed acyclic graph is constructed according to the anchor set
Figure PCTCN2021102218-appb-000051
The directed acyclic graph represents the topological partial order <_C of the nodes of the graph G in the fixed point state; changes are propagated along the dependencies of the directed acyclic graph;
(2) Fixed point adjustment. An implementation of the initial range function h is given below:
Input: the fixed point state
Figure PCTCN2021102218-appb-000052
and the graph update region ΔG;
Output: the initial range
Figure PCTCN2021102218-appb-000053
and, for the updated graph
Figure PCTCN2021102218-appb-000054
a feasible initial state
Figure PCTCN2021102218-appb-000055
Collect all the state variables x_i covered by the graph update region ΔG into the initial range
Figure PCTCN2021102218-appb-000056
and initialize the initial state
Figure PCTCN2021102218-appb-000057
to
Figure PCTCN2021102218-appb-000058
Initialize a priority queue que that contains all the state variables x_i in the initial range
Figure PCTCN2021102218-appb-000059
with priorities given by the topological partial order <_C;
Figure PCTCN2021102218-appb-000060
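The roles of the anchor set and of the initial range function can be illustrated for single-source shortest paths with one deleted edge. The Python sketch below is a simplification under stated assumptions and does not reproduce the algorithm shown in the figure above: it assumes positive edge weights, records a single anchor per node (the in-neighbor that determined its distance), and walks the anchor dependencies with a plain stack instead of a priority queue ordered by <_C. The names `anchors` and `initial_scope_after_deletion` are hypothetical.

```python
import math


def anchors(graph, dist, source):
    """anchor[v] = one in-neighbour u with dist[u] + w(u, v) == dist[v].

    graph: dict node -> dict neighbour -> positive weight
    dist:  fixed point state of the batch run (shortest distances)
    With positive weights the anchor edges form a forest rooted at `source`.
    """
    anchor = {}
    for u, edges in graph.items():
        for v, w in edges.items():
            if (v != source and v not in anchor
                    and dist[u] != math.inf and dist[u] + w == dist[v]):
                anchor[v] = u
    return anchor


def initial_scope_after_deletion(dist, anchor, u, v):
    """Initial range and feasible initial state after deleting edge (u, v).

    Returns (scope, state). If (u, v) did not anchor v, no value depends on
    the deleted edge and the old fixed point is kept unchanged.
    """
    state = dict(dist)
    if anchor.get(v) != u:
        return set(), state
    children = {}
    for child, parent in anchor.items():
        children.setdefault(parent, []).append(child)
    scope, stack = set(), [v]
    while stack:                      # follow the anchor dependencies from v
        x = stack.pop()
        if x not in scope:
            scope.add(x)
            state[x] = math.inf       # invalidate; an upper bound is feasible
            stack.extend(children.get(x, []))
    return scope, state
```

The returned range contains exactly the variables whose value transitively depended on the deleted edge; a step function would then recompute them from their remaining in-neighbors.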
In this embodiment, after the step of combining the initial range function h and the step function f_A to obtain the incremental processing algorithm A_Δ corresponding to the batch graph algorithm A, the method further includes:
obtaining the query result Q(G) and the update region ΔG of the graph G;
using the initial range function h to modify the fixed point state
Figure PCTCN2021102218-appb-000061
of the query result Q(G) so that, for the updated graph
Figure PCTCN2021102218-appb-000062
it becomes a feasible initial state
Figure PCTCN2021102218-appb-000063
together with, on the initial state
Figure PCTCN2021102218-appb-000064
the initial range
Figure PCTCN2021102218-appb-000065
using the step function f_A to iterate the initial state
Figure PCTCN2021102218-appb-000066
and the initial range
Figure PCTCN2021102218-appb-000067
to a new fixed point, obtaining for the updated graph
Figure PCTCN2021102218-appb-000068
the query result
Figure PCTCN2021102218-appb-000069
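The steps above, which start from the old fixed point, derive an initial range from the update, and reuse the batch step function until a new fixed point is reached, can be sketched for single-source shortest paths under edge insertion. This is a minimal Python illustration under assumptions (positive weights, a genuinely new edge, nodes that are mutually comparable so heap ties can be broken); the names `relax_to_fixpoint`, `batch_sssp` and `insert_edge` are hypothetical and not taken from this application.

```python
import heapq
import math


def relax_to_fixpoint(graph, dist, scope):
    """Step function: propagate from the nodes in `scope` until nothing changes."""
    heap = [(dist[v], v) for v in scope]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale queue entry
        for v, w in graph.get(u, {}).items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist


def batch_sssp(graph, source):
    """Batch run: every node must appear as a key of `graph`."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    return relax_to_fixpoint(graph, dist, [source])


def insert_edge(graph, dist, u, v, w):
    """Incremental run for one inserted edge (u, v) with weight w."""
    graph[u][v] = w
    if dist[u] + w < dist[v]:
        # Initial range: only v is affected; the old state with the improved
        # dist[v] is a feasible initial state for the updated graph.
        dist[v] = dist[u] + w
        relax_to_fixpoint(graph, dist, [v])
    return dist
```

The incremental run touches only the nodes whose distance actually improves, while the batch and incremental paths share the same `relax_to_fixpoint` logic. Deletions need the invalidation step that anchor sets provide and are not covered by this sketch.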
In this embodiment, the method further includes:
when the batch graph algorithm A does not have the bounded incrementalization property, determining whether the batch graph algorithm A has the weakly incrementalizable property;
when the batch graph algorithm A has the weakly incrementalizable property, modifying the step function f_A and the state
Figure PCTCN2021102218-appb-000070
of the batch graph algorithm A so that the batch graph algorithm A has the bounded incrementalization property; and converting the batch graph algorithm A into an incremental processing algorithm A_Δ, the incremental processing algorithm A_Δ having the bounded incrementalization property.
It should be noted that this embodiment provides a method for modifying the state of the batch graph algorithm A
Figure PCTCN2021102218-appb-000071
and the step function f_A. After this modification method is applied, a class of weakly incrementalizable algorithms that originally lack the bounded incrementalization property acquire it, and can therefore be converted into the corresponding incremental processing algorithm A_Δ.
In this embodiment, the step of converting the batch graph algorithm A into the incremental processing algorithm A_Δ includes:
obtaining, from the data structure and logic of the batch graph algorithm A, the step function f_A of the batch graph algorithm A as a fixed point computation;
generating an initial range function h based on the graph update region ΔG according to the data structure and logic of the batch graph algorithm A;
combining the initial range function h and the step function f_A to obtain the incremental processing algorithm A_Δ corresponding to the batch graph algorithm.
In this embodiment, the batch graph algorithm A is any one of Single Source Shortest Path (SSSP), Weakly Connected Component (WCC), Depth-First Search (DFS), Local Clustering Coefficient (LCC) and Graph Simulation.
Since the apparatus embodiment is substantially similar to the method embodiment, it is described only briefly; for related details, refer to the corresponding description of the method embodiment.
Referring to FIG. 3, an apparatus for automatic incrementalization of a graph algorithm provided by an embodiment of the present application is shown, including:
an acquisition module 210, configured to acquire a batch graph algorithm A;
a first processing module 220, configured to, when the batch graph algorithm A is a fixed point computation, obtain from the data structure and logic of the batch graph algorithm A the step function f_A of the batch graph algorithm A as a fixed point computation; generate an initial range function h based on the graph update region ΔG according to the data structure and logic of the batch graph algorithm A; and combine the initial range function h and the step function f_A to obtain the incremental processing algorithm A_Δ corresponding to the batch graph algorithm;
or
a second processing module 230, configured to, when the batch graph algorithm A is a fixed point computation, determine whether the batch graph algorithm A has the bounded incrementalization property; and, when the batch graph algorithm A has the bounded incrementalization property, convert the batch graph algorithm A into an incremental processing algorithm A_Δ, the incremental processing algorithm A_Δ having the bounded incrementalization property.
In an embodiment of the present application, the first processing module 220 includes:
an anchor set acquisition submodule, configured to obtain, from the input
Figure PCTCN2021102218-appb-000072
a subset that determines the state variable x_i in the fixed point state, as the anchor set of the state variable x_i
Figure PCTCN2021102218-appb-000073
a directed acyclic graph generation submodule, configured to generate a directed acyclic graph according to the anchor set
Figure PCTCN2021102218-appb-000074
where the directed acyclic graph contains the topological partial order <_C of the nodes of the graph G in the fixed point state, and the topological partial order <_C is used to guide the change propagation of the batch graph algorithm A;
an initial range function generation submodule, configured to generate the initial range function h based on the graph update region ΔG according to the directed acyclic graph.
In this embodiment, the second processing module 230 includes:
a second processing submodule, configured to, when the batch graph algorithm A does not have the bounded incrementalization property, determine whether the batch graph algorithm A has the weakly incrementalizable property;
and, when the batch graph algorithm A has the weakly incrementalizable property, modify the step function f_A and the state
Figure PCTCN2021102218-appb-000075
of the batch graph algorithm A so that the batch graph algorithm A has the bounded incrementalization property; and convert the batch graph algorithm A into an incremental processing algorithm A_Δ, the incremental processing algorithm A_Δ having the bounded incrementalization property.
In this embodiment, the apparatus further includes:
a third processing module, configured to obtain the query result Q(G) and the update region ΔG of the graph G; use the initial range function h to modify the fixed point state
Figure PCTCN2021102218-appb-000076
of the query result Q(G) so that, for the updated graph
Figure PCTCN2021102218-appb-000077
it becomes a feasible initial state
Figure PCTCN2021102218-appb-000078
together with, on the initial state
Figure PCTCN2021102218-appb-000079
the initial range
Figure PCTCN2021102218-appb-000080
and use the step function f_A to iterate the initial state
Figure PCTCN2021102218-appb-000081
and the initial range
Figure PCTCN2021102218-appb-000082
to a new fixed point, obtaining for the updated graph
Figure PCTCN2021102218-appb-000083
the query result
Figure PCTCN2021102218-appb-000084
Referring to FIG. 4, a computer device for automatic incrementalization of a graph algorithm according to the present application is shown, which may specifically include the following:
The computer device 12 takes the form of a general-purpose computing device. Its components may include, but are not limited to: one or more processors or processing units 16, a memory 28, and a bus 18 that connects the different system components (including the memory 28 and the processing units 16).
The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 12 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer device 12, including volatile and non-volatile media, and removable and non-removable media.
The memory 28 may include computer-system-readable media in the form of volatile memory, such as a random access memory 30 and/or a cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 34 may be used to read from and write to a non-removable, non-volatile magnetic medium (commonly called a "hard disk drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a CD-ROM, DVD-ROM or other optical medium) may be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory may include at least one program product having a set of (for example, at least one) program modules 42 configured to perform the functions of the embodiments of the present application.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules 42, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described in the present application.
Computer device 12 may also communicate with one or more external devices 14 (for example a keyboard, a pointing device, display 24, or a camera), with one or more devices that enable an operator to interact with computer device 12, and/or with any device (for example a network card or a modem) that enables computer device 12 to communicate with one or more other computing devices. Such communication may take place through I/O interface 22. Computer device 12 may also communicate with one or more networks, such as a local area network (LAN), a wide area network (WAN), and/or a public network (for example the Internet), through network adapter 20. As shown in FIG. 4, network adapter 20 communicates with the other modules of computer device 12 through bus 18. It should be understood that, although not shown in FIG. 4, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units 16, external disk drive arrays, RAID systems, tape drives, and data backup storage systems 34.
Processing unit 16 executes various functional applications and data processing by running the programs stored in memory 28, for example implementing the automatic incrementalization method for a graph algorithm provided by the embodiments of the present application.
That is, when processing unit 16 executes the above program, it implements the following: obtaining a batch graph algorithm A; when the batch graph algorithm A is a fixed-point computation, obtaining, from the data structures and logic of the batch graph algorithm A, the step function f_A of the batch graph algorithm A as a fixed-point computation; generating, according to the data structures and logic of the batch graph algorithm A, an initial range function h based on the graph update region ΔG; and combining the initial range function h with the step function f_A to obtain the incremental algorithm A_Δ corresponding to the batch graph algorithm A. Alternatively: when the batch graph algorithm A is a fixed-point computation, determining whether the batch graph algorithm A has the bounded incrementalization property; and, when it does, converting the batch graph algorithm A into an incremental algorithm A_Δ that has the bounded incrementalization property.
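As a rough illustration of combining an initial range function h with a step function f_A, the following Python sketch uses single-source shortest paths, one of the batch algorithms named in the claims. It is a minimal sketch under stated assumptions, not the method of the application: the function names (`step`, `batch_sssp`, `initial_range`, `incremental_sssp`) are hypothetical, the graph is an undirected adjacency dictionary with positive weights, and the update region ΔG contains only edge insertions (new edges or lower weights).

```python
import math
from collections import deque

def step(graph, dist, scope):
    """Step function f_A: relax edges out of the vertices in `scope`
    until no distance changes, i.e. until a fixed point is reached."""
    work = deque(scope)
    while work:
        u = work.popleft()
        for v, w in graph.get(u, {}).items():
            if dist[u] + w < dist.get(v, math.inf):
                dist[v] = dist[u] + w
                work.append(v)  # v changed, so its neighbours must be rechecked
    return dist

def batch_sssp(graph, source):
    """Batch algorithm A: trivial initial state, initial scope {source}."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    return step(graph, dist, [source])

def initial_range(graph, dist, inserted_edges):
    """Initial range function h (assumption: insertions only).
    Applies the update region to the graph and returns the endpoints
    of the inserted edges, the only places where a violation can start."""
    scope = []
    for u, v, w in inserted_edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
        dist.setdefault(u, math.inf)
        dist.setdefault(v, math.inf)
        scope.extend([u, v])
    return scope

def incremental_sssp(graph, dist, inserted_edges):
    """Incremental algorithm A_delta: h followed by the unchanged step function."""
    return step(graph, dist, initial_range(graph, dist, inserted_edges))
```

The incremental run reuses the old fixed point as its starting state and visits only vertices reachable from the inserted edges whose distances actually change, which is the intuition behind bounding the cost by the affected region rather than by the whole graph.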
In an embodiment of the present application, a computer-readable storage medium is also provided, on which a computer program is stored; when executed by a processor, the program implements the automatic incrementalization method for a graph algorithm provided by any of the embodiments of the present application.
That is, when the program is executed by the processor, it implements the following: obtaining a batch graph algorithm A; when the batch graph algorithm A is a fixed-point computation, obtaining, from the data structures and logic of the batch graph algorithm A, the step function f_A of the batch graph algorithm A as a fixed-point computation; generating, according to the data structures and logic of the batch graph algorithm A, an initial range function h based on the graph update region ΔG; and combining the initial range function h with the step function f_A to obtain the incremental algorithm A_Δ corresponding to the batch graph algorithm A. Alternatively: when the batch graph algorithm A is a fixed-point computation, determining whether the batch graph algorithm A has the bounded incrementalization property; and, when it does, converting the batch graph algorithm A into an incremental algorithm A_Δ that has the bounded incrementalization property.
Any combination of one or more computer-readable media may be used. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device.
Computer program code for carrying out the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the operator's computer, partly on the operator's computer, as a stand-alone software package, partly on the operator's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, it may be connected to the operator's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the others, and for parts that are the same or similar the embodiments may be referred to one another.
Although preferred embodiments of the present application have been described, those skilled in the art may make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
Finally, it should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that comprises the element.
The automatic incrementalization method, apparatus, device, and storage medium for a graph algorithm provided by the present application have been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

  1. An automatic incrementalization method for a graph algorithm, characterized by comprising:
    obtaining a batch graph algorithm;
    when the batch graph algorithm is a fixed-point computation, obtaining, from the data structures and logic of the batch graph algorithm, a step function of the batch graph algorithm as a fixed-point computation; generating, according to the data structures and logic of the batch graph algorithm, an initial range function based on a graph update region; and combining the initial range function with the step function to obtain an incremental algorithm corresponding to the batch graph algorithm;
    or;
    when the batch graph algorithm is a fixed-point computation, determining whether the batch graph algorithm has a bounded incrementalization property; and, when the batch graph algorithm has the bounded incrementalization property, converting the batch graph algorithm into an incremental algorithm, the incremental algorithm having the bounded incrementalization property.
  2. The automatic incrementalization method according to claim 1, characterized by further comprising:
    determining whether the batch graph algorithm is an iteration of a step function over a graph, a state, a query result, and a range; wherein the graph is an undirected graph; the state is the state of the data structures of the batch graph algorithm at the start of a preset round; the query result is the result obtained by querying the graph with the batch graph algorithm and is composed of state variables; each state variable is associated with a vertex or an edge of the graph and has a corresponding update function for updating the value of that state variable, the input of the update function being a set of state variables; each state variable corresponds to a logical predicate such that the predicate is true after each update of the state variable by the update function, and all of the predicates are true in the final fixed-point state; and the range comprises all state variables whose predicates are false at the start of the preset round;
    if the batch graph algorithm is an iteration of a step function over the state, the query result, the graph, and the range, determining that the batch graph algorithm is a fixed-point computation.
  3. The automatic incrementalization method according to claim 2, characterized in that the step of generating, according to the data structures and logic of the batch graph algorithm, an initial range function based on a graph update region comprises:
    obtaining a subset of the input of the update function in the fixed-point state as an anchor set of the state variable;
    generating a directed acyclic graph from the anchor sets, the directed acyclic graph containing a topological partial order over the vertices of the graph in the fixed-point state, the topological partial order being used to guide change propagation in the batch graph algorithm;
    generating, from the directed acyclic graph, the initial range function based on the graph update region.
  4. The automatic incrementalization method according to claim 3, characterized in that, after the step of combining the initial range function with the step function to obtain the incremental algorithm corresponding to the batch graph algorithm, the method further comprises:
    obtaining the query result and the update region of the graph;
    using the initial range function to modify the fixed-point state of the query result into an initial state that is feasible for the updated graph, together with an initial range on that initial state;
    using the step function to iterate the initial state and the initial range to a new fixed point, thereby obtaining the query result for the updated graph.
  5. The automatic incrementalization method according to claim 2, characterized by further comprising:
    when the batch graph algorithm does not have the bounded incrementalization property, determining whether the batch graph algorithm has a weak incrementalizability property;
    when the batch graph algorithm has the weak incrementalizability property, modifying the step function and the state of the batch graph algorithm so that the batch graph algorithm has the bounded incrementalization property; and converting the batch graph algorithm into an incremental algorithm, the incremental algorithm having the bounded incrementalization property.
  6. The automatic incrementalization method according to claim 1, characterized in that the step of converting the batch graph algorithm into an incremental algorithm comprises:
    obtaining, from the data structures and logic of the batch graph algorithm, the step function of the batch graph algorithm as a fixed-point computation;
    generating, according to the data structures and logic of the batch graph algorithm, an initial range function based on the graph update region;
    combining the initial range function with the step function to obtain the incremental algorithm corresponding to the batch graph algorithm.
  7. The automatic incrementalization method according to claim 1, characterized in that the batch graph algorithm comprises any one of a weakly connected components algorithm, a single-source shortest path algorithm, a depth-first search algorithm, a local clustering coefficient algorithm, and a graph simulation algorithm.
  8. An automatic incrementalization apparatus for a graph algorithm, characterized by comprising:
    an obtaining module, configured to obtain a batch graph algorithm;
    a first processing module, configured to, when the batch graph algorithm is a fixed-point computation, obtain, from the data structures and logic of the batch graph algorithm, a step function of the batch graph algorithm as a fixed-point computation; generate, according to the data structures and logic of the batch graph algorithm, an initial range function based on a graph update region; and combine the initial range function with the step function to obtain an incremental algorithm corresponding to the batch graph algorithm;
    or;
    a second processing module, configured to, when the batch graph algorithm is a fixed-point computation, determine whether the batch graph algorithm has a bounded incrementalization property; and, when the batch graph algorithm has the bounded incrementalization property, convert the batch graph algorithm into an incremental algorithm, the incremental algorithm having the bounded incrementalization property.
  9. A device, characterized by comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the method according to any one of claims 1 to 7.
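
The anchor-set and change-propagation steps recited in claims 3 and 4 can be pictured with a small Python sketch for single-source shortest paths under edge deletions. This is only an illustrative simplification under stated assumptions, not the claimed construction: every name below is hypothetical, each vertex keeps a single anchor (the neighbour that justified its current distance) instead of a general anchor set, the anchors form a tree rather than a general directed acyclic graph, and edge weights are assumed positive.

```python
import math
from collections import deque

def step(graph, dist, anchor, scope):
    """Step function: relax edges out of scoped vertices until a fixed point."""
    work = deque(scope)
    while work:
        u = work.popleft()
        for v, w in graph.get(u, {}).items():
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                anchor[v] = u  # u is now the input that justifies v's value
                work.append(v)
    return dist

def batch_sssp(graph, source):
    """Batch run that also records one anchor per vertex."""
    dist = {v: math.inf for v in graph}
    anchor = {v: None for v in graph}
    dist[source] = 0
    step(graph, dist, anchor, [source])
    return dist, anchor

def initial_range(graph, dist, anchor, deleted_edges):
    """Hypothetical initial range function for deletions: reset every vertex
    whose anchor chain used a deleted edge, then return the untouched
    neighbours of the reset vertices as the initial range."""
    children = {}
    for v, u in anchor.items():
        if u is not None:
            children.setdefault(u, []).append(v)
    stale = deque()
    for u, v in deleted_edges:
        graph[u].pop(v, None)
        graph[v].pop(u, None)
        if anchor.get(v) == u:
            stale.append(v)
        if anchor.get(u) == v:
            stale.append(u)
    reset = set()
    while stale:  # follow the anchor order downwards from the deleted edges
        x = stale.popleft()
        if x in reset:
            continue
        reset.add(x)
        dist[x] = math.inf
        anchor[x] = None
        stale.extend(children.get(x, []))
    return {n for x in reset for n in graph.get(x, {}) if n not in reset}

def incremental_delete(graph, dist, anchor, deleted_edges):
    """Feasible initial state plus initial range, then iterate to the new fixed point."""
    scope = initial_range(graph, dist, anchor, deleted_edges)
    return step(graph, dist, anchor, scope)
```

Vertices that are not reset keep a distance whose justification avoids every deleted edge, so their values remain valid for the updated graph; only the reset vertices are recomputed by the unchanged step function.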
PCT/CN2021/102218 2021-06-18 2021-06-24 Graph algorithm autoincrement method and apparatus, device, and storage medium WO2022262007A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110681585.8 2021-06-18
CN202110681585.8A CN113343040A (en) 2021-06-18 2021-06-18 Automatic incremental method, device, equipment and storage medium for graph algorithm

Publications (1)

Publication Number Publication Date
WO2022262007A1 true WO2022262007A1 (en) 2022-12-22

Family

ID=77477433

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/102218 WO2022262007A1 (en) 2021-06-18 2021-06-24 Graph algorithm autoincrement method and apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN113343040A (en)
WO (1) WO2022262007A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140320497A1 (en) * 2013-04-29 2014-10-30 Microsoft Corporation Graph partitioning for massive scale graphs
CN107193896A (en) * 2017-05-09 2017-09-22 华中科技大学 A kind of diagram data division methods based on cluster
CN110622156A (en) * 2017-05-12 2019-12-27 华为技术有限公司 Incremental graph computation for querying large graphs
CN111538867A (en) * 2020-04-15 2020-08-14 深圳计算科学研究院 Method and system for dividing bounded incremental graph

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SONG JIE, GUO CHAO-PENG;ZHANG YI-CHUAN;ZHANG YAN-FENG;YU GE: "Research and Implementation Incremental Iterative Model", CHINESE JOURNAL OF COMPUTERS, vol. 39, no. 1, 31 January 2016 (2016-01-31), pages 109 - 125, XP093014712, DOI: 10.11897/SP.J.1016.2016.00109 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115934702A (en) * 2023-03-14 2023-04-07 青岛安工数联信息科技有限公司 Data processing method and device in process industry, storage medium and processor
CN115934702B (en) * 2023-03-14 2023-05-23 青岛安工数联信息科技有限公司 Data processing method, device, storage medium and processor in process industry

Also Published As

Publication number Publication date
CN113343040A (en) 2021-09-03

Legal Events

Date Code Title Description
(none) 121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21945590; Country of ref document: EP; Kind code of ref document: A1)
(none) NENP Non-entry into the national phase (Ref country code: DE)