WO2022057303A1 - A graph processing method, system, and apparatus - Google Patents

A graph processing method, system, and apparatus

Info

Publication number
WO2022057303A1
WO2022057303A1 (application PCT/CN2021/096023)
Authority
WO
WIPO (PCT)
Prior art keywords
node
edges
graph
subgraphs
ports
Prior art date
Application number
PCT/CN2021/096023
Other languages
English (en)
French (fr)
Inventor
王智勇
潘如晟
魏雅婷
陈为
高寒
Original Assignee
Huawei Technologies Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2022057303A1
Priority to US18/186,267 (published as US20230229704A1)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • the embodiments of the present application relate to the field of data visualization, and in particular, to a method, apparatus, and device for graph processing.
  • Deep learning technology is currently widely used in feature extraction, reasoning and prediction of complex data, including data types such as graph, text, and speech.
  • users generally use the deep learning framework to write code for data preprocessing, model building, model training, model evaluation, and deployment.
  • FIG. 1 is a schematic diagram of an embodiment of a computation graph according to an embodiment of the present application.
  • the deep learning framework may instruct a machine to perform model training and inference in the form of a computation graph.
  • the complexity of the deep learning framework and the model itself raises a high barrier to use; in particular, it is difficult for users to locate and debug problems conveniently and quickly.
  • graphs usually include hierarchical aggregations and a large number of nodes and edges, so the layout of nodes and edges in the graph is cluttered, which reduces the efficiency of graph processing and reduces the clarity and completeness of the data flow in the graph displayed by the terminal device.
  • Embodiments of the present application provide a graph processing method, system, and apparatus, which are used to improve the clarity and completeness of the data flow in the graph without affecting the structure of the graph or the calculation logic it expresses.
  • a first aspect of the embodiments of the present application provides a method for graph processing. The first graph generated by a complex deep learning model usually contains frequent subgraphs, that is, subgraph structures that appear repeatedly within the same graph.
  • the graph processing apparatus determines, from the first graph, a node whose degree is greater than the second threshold as a starting node, then determines the corresponding ending node through the path traversed by the data flow of the starting node, and obtains at least two subgraphs of the first graph through the starting node and the ending node.
  • each subgraph includes edges and nodes, and edges are used to represent the flow of data between different nodes.
  • the second threshold is used to indicate the out-degree of the data flow of a node; that is, exceeding the second threshold means that the out-degree of the node has exceeded the preset out-degree, resulting in more data flows at the node and therefore more edges shown in the graph.
  • the specific value of the second threshold may be 2, 5, or 8, etc.
  • the specific value of the second threshold needs to be predetermined according to the actual situation of the first graph, which is not limited here.
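  • As an illustrative sketch of the start-node selection described above (the adjacency-list representation, node names, and threshold value below are hypothetical; the application does not prescribe a data structure):

```python
# Hypothetical adjacency-list representation of the first graph: each key
# is a node, each value lists the nodes its data flows to (its out-edges).
first_graph = {
    "A": ["B", "C", "D"],  # out-degree 3
    "B": ["E"],
    "C": ["E"],
    "D": ["E"],
    "E": [],
}

SECOND_THRESHOLD = 2  # example value; the application mentions 2, 5, or 8


def find_start_nodes(graph, threshold):
    """Return nodes whose out-degree is greater than the threshold."""
    return [n for n, targets in graph.items() if len(targets) > threshold]


print(find_start_nodes(first_graph, SECOND_THRESHOLD))  # -> ['A']
```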
  • the graph processing apparatus calculates the respective identifiers of the at least two subgraphs based on the nodes and edges included in the at least two subgraphs, where an identifier is used to indicate the features of a subgraph, or the identifier is a hash value of the subgraph, which is not specifically limited here. Further, since subgraphs with the same identifier have similar features, the graph processing apparatus merges subgraphs with the same identifier among the at least two subgraphs to generate a second graph, and outputs the second graph.
  • when the method in this embodiment of the present application is applied to a terminal device, the terminal device can read the first graph saved by the deep learning framework from the memory of the terminal device, or receive the first graph sent from a server, and, after generating the second graph, display it directly.
  • when the method of the embodiment of the present application is applied to a server, the server can read the stored first graph from the server memory and, after generating the second graph in the aforementioned manner, send the second graph to the terminal device; the terminal device then displays the received second graph.
  • a subgraph with a corresponding identifier is determined according to the acquired first graph, and at least two subgraphs with the same identifier are merged to generate a second graph, thus reducing the number of nodes and edges included in the second graph and improving the efficiency of graph processing. Since subgraphs with the same identifier are similar, the merging does not affect the structure of the graph or the calculation logic it expresses, so it can also improve the clarity and completeness of the data flow in the graph.
  • the identifier is a hash value.
  • the calculation graph corresponding to the deep learning framework is used as an example.
  • the hash value corresponding to a node can indicate the characteristics of the node
  • the hash value corresponding to the edge can indicate the characteristics of the edge.
  • the hash value can distinguish nodes and edges with different characteristics. Therefore, for each of the at least two subgraphs, the hash value corresponding to the subgraph can be calculated based on the hash values corresponding to the nodes and edges in that subgraph, and the obtained subgraph hash value can likewise distinguish subgraphs with different characteristics.
  • the calculated hash value corresponding to the subgraph can accurately reflect the characteristics of the subgraph, which ensures the accuracy of subsequently merging subgraphs with the same hash value and further improves the accuracy of graph processing.
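  • A minimal sketch of one way such a subgraph hash could be computed from the node and edge hashes (the SHA-256 construction, the sorting scheme, and the node attribute strings are assumptions for illustration; the application does not fix a particular hash function):

```python
import hashlib


def _h(s):
    """Stable hash of a string (SHA-256 hex digest)."""
    return hashlib.sha256(s.encode("utf-8")).hexdigest()


def subgraph_hash(node_attrs, edges):
    """Combine per-node and per-edge hashes into one subgraph identifier.

    node_attrs: {node: attribute string}, edges: [(src, dst), ...].
    Sorting makes the result independent of iteration order, so two
    subgraphs with the same node attributes and edge structure obtain
    the same identifier even if their node names differ.
    """
    node_hashes = sorted(_h(attr) for attr in node_attrs.values())
    edge_hashes = sorted(
        _h(f"[{node_attrs[s]}]->[{node_attrs[d]}]") for s, d in edges
    )
    return _h("|".join(node_hashes + edge_hashes))


# Two structurally identical subgraphs produce the same identifier:
g1 = subgraph_hash({"a": "Conv", "b": "ReLU"}, [("a", "b")])
g2 = subgraph_hash({"x": "Conv", "y": "ReLU"}, [("x", "y")])
assert g1 == g2
```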
  • the identifier is a hash value; in each subgraph, the hash value corresponding to a node is related to the attributes of the node, that is, the hash value corresponding to a node is calculated from the attributes of that node. Specifically, the nodes in the subgraph correspond to a variety of node attributes.
  • the node attributes corresponding to the nodes include but are not limited to variable types, parameter types, aggregations, etc.
  • the node attributes can reflect the characteristics of the node. Therefore, the hash value corresponding to the node obtained by calculating the attributes of the node can indicate the characteristics of the node.
  • in each subgraph, the hash value corresponding to an edge is related to the connection relationship indicated by that edge; that is, the hash value is calculated from the connection relationship between the nodes that the edge connects, and this connection relationship is directional.
  • an edge represents the data flow from one node to another node. For example, for an edge i, the data output node is node A, and the data input node is node B.
  • the edge can be encoded as the string "[source type]->[target type]", where [source type] indicates the type of node A of the edge and [target type] indicates the type of node B of the edge; this string indicates that the data flow of edge i is from node A to node B, preserving the order of the data. The hash value of this string is then calculated to indicate the characteristics of the edge.
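  • Following the string encoding just described, the edge hash could be sketched as follows (SHA-256 is an assumed choice of hash function, and the node types are hypothetical):

```python
import hashlib


def edge_hash(source_type, target_type):
    """Encode a directed edge as "[source type]->[target type]" and hash it.

    The direction is preserved in the string, so an edge from A to B
    hashes differently from an edge from B to A.
    """
    s = f"[{source_type}]->[{target_type}]"
    return hashlib.sha256(s.encode("utf-8")).hexdigest()


# Data flows from node A (type "Conv2D") to node B (type "ReLU"):
forward = edge_hash("Conv2D", "ReLU")
reverse = edge_hash("ReLU", "Conv2D")
assert forward != reverse  # direction matters
```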
  • the hash value corresponding to the node is obtained by calculating the attribute of the node, which can improve the accuracy of the obtained hash value corresponding to the node.
  • the connection relationship indicated by the edge is directional, which can make the calculated hash value more accurate, so it can improve the accuracy of the hash value corresponding to the subgraph, thereby further improving the accuracy of graph processing.
  • the identifier is a hash value.
  • the computation graph corresponding to the deep learning framework is used as an example. Based on the characteristics of the deep learning computation graph, merging at least two subgraphs with the same hash value can reduce the number of nodes and edges included in the computation graph. A node in the first graph whose out-degree or in-degree is greater than the first threshold is determined as a first node, and a first port and a second port are then assigned to each first node.
  • the first port is the port through which the edges inputting data into each first node pass, and the second port is the port through which the edges outputting data from each first node pass; then, among all the edges included in the merged graph, multiple edges passing through the same first port are merged and multiple edges passing through the same second port are merged, to generate the second graph.
  • the first threshold may indicate the out-degree of data flowing out of a node, and may also indicate the in-degree of data flowing into a node; that is, exceeding the first threshold means that the out-degree of the node has exceeded the preset out-degree, or the in-degree has exceeded the preset in-degree, which leads to more data flows at the node and therefore more edges displayed in the graph.
  • the specific value of the first threshold can be 3, 5 or 4, etc.
  • the specific value of the first threshold needs to be predetermined according to the actual situation of the computation graph, which is not limited here.
  • since the first node is a node whose out-degree or in-degree is greater than the first threshold, many data flows enter or leave the first node; allocating a first port and a second port to the first node, and merging the edges through which data is input at the first port and the edges through which data is output at the second port to generate the second graph, further improves the clarity of the data flow in the graph displayed by the terminal device.
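  • A simplified sketch of merging multiple edges that pass through the same port (the port naming and the tuple representation of edges are assumptions for illustration):

```python
def merge_port_edges(edges):
    """Collapse multiple edges that pass through the same pair of ports.

    edges: list of (source_port, target_port) tuples. Edges sharing both
    ports are drawn only once, reducing visual clutter without changing
    which ports exchange data.
    """
    seen, merged = set(), []
    for e in edges:
        if e not in seen:
            seen.add(e)
            merged.append(e)
    return merged


# Two data flows leave the same second (output) port of node n1 toward
# the same first (input) port of node n2; after merging, one edge remains.
edges = [("n1.out", "n2.in"), ("n1.out", "n2.in"), ("n1.out", "n3.in")]
print(merge_port_edges(edges))  # -> [('n1.out', 'n2.in'), ('n1.out', 'n3.in')]
```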
  • the identifier is a hash value
  • the second graph is laid out in a manner of orthogonal edge routing.
  • specifically, a port-constraint optimization layout algorithm is first used to optimize the positions and ordering of the first ports and the second ports; aiming at an overall orthogonal edge routing layout, the position coordinates of the first ports and the second ports are obtained by calculation, thereby completing the orthogonal edge routing layout to generate the second graph.
  • the orthogonal layout specifically means that the included angle of the connection lines around the node is 90 degrees, and the edge routing is the arrangement and direction of the specific connection lines in the figure.
  • the layout using orthogonal edge routing can constrain and optimize the positions and ordering of the first ports and the second ports, so that the overall data flow from left to right is complete and clear, which further improves the clarity of the graph displayed by the terminal device.
  • the identifier is a hash value.
  • the computation graph corresponding to the deep learning framework is used as an example. Based on the characteristics of the deep learning computation graph, merging at least two subgraphs with the same hash value can reduce the number of nodes and edges included in the computation graph.
  • an aggregation indicates a computing function through a combination of a set of nodes and edges.
  • an edge indicating the input of an aggregation is determined as a first edge.
  • an edge indicating the output of an aggregation is determined as a second edge.
  • ports are added to the first edges and the second edges. Among the first edges, the ports on edges that correspond to the same aggregation and indicate input from the same node are determined as third ports, and multiple such third ports are merged.
  • among the second edges, the ports on edges that correspond to the same aggregation and indicate output to the same node are determined as fourth ports, and multiple such fourth ports are merged. Based on the merged ports, the second graph is generated using an orthogonal edge routing layout.
  • the specific orthogonal edge routing manner is similar to that described in the foregoing application embodiment, and details are not described herein again.
  • At least two subgraphs with the same hash value are first merged to reduce the number of nodes and edges included in the second graph, thereby improving the efficiency of graph processing. Since subgraphs with the same hash value are similar, the merging does not affect the data flow in the graph, so it can also improve the clarity and completeness of the data flow in the graph displayed by the terminal device.
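  • A sketch of the third-port grouping described above (the pair representation of boundary-crossing edges and the node/aggregation names are assumptions for illustration; the same grouping applies symmetrically to fourth ports on output edges):

```python
from collections import defaultdict


def merge_aggregation_ports(cross_edges):
    """Group edges entering an aggregation by (aggregation, source node).

    cross_edges: list of (source_node, aggregation) pairs for edges whose
    input crosses an aggregation boundary. Edges coming from the same
    external node into the same aggregation share one merged "third port".
    """
    ports = defaultdict(list)
    for src, agg in cross_edges:
        ports[(agg, src)].append((src, agg))
    return ports


# Two edges from node "x" into aggregation "block1" share one merged port;
# the edge from node "y" gets its own port, so two ports remain in total.
cross = [("x", "block1"), ("x", "block1"), ("y", "block1")]
ports = merge_aggregation_ports(cross)
print(len(ports))  # -> 2
```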
  • the identifier is a hash value.
  • the computation graph corresponding to the deep learning framework is used as an example. Based on the characteristics of the deep learning computation graph, merging at least two subgraphs with the same hash value can reduce the number of nodes and edges included in the computation graph. A node in the first graph whose out-degree or in-degree is greater than the first threshold is determined as a first node, and a first port and a second port are then assigned to each first node.
  • the first port is the port through which the edges inputting data into each first node pass, and the second port is the port through which the edges outputting data from each first node pass; then, among all the edges included in the merged graph, multiple edges passing through the same first port are merged and multiple edges passing through the same second port are merged, to generate the second graph.
  • the first threshold may indicate the out-degree of data flowing out of a node, and may also indicate the in-degree of data flowing into a node; that is, exceeding the first threshold means that the out-degree of the node has exceeded the preset out-degree, or the in-degree has exceeded the preset in-degree, which leads to more data flows at the node and therefore more edges displayed in the graph.
  • the specific value of the first threshold can be 3, 5 or 4, etc.
  • the specific value of the first threshold needs to be predetermined according to the actual situation of the computation graph, which is not limited here.
  • an edge indicating the input of an aggregation can be determined as a first edge, and an edge indicating the output of an aggregation can be determined as a second edge; ports are then added to the multiple first edges and the multiple second edges.
  • among the multiple first edges, the ports on edges that correspond to the same aggregation and indicate input from the same node are determined as third ports, and multiple third ports are merged.
  • among the multiple second edges, the ports on edges that correspond to the same aggregation and indicate output to the same node are determined as fourth ports, and multiple fourth ports are merged.
  • based on the merged ports, the second graph is generated using an orthogonal edge routing layout. The specific orthogonal edge routing manner is similar to that described in the foregoing embodiment, and details are not described herein again.
  • At least two subgraphs with the same hash value are first merged to reduce the number of nodes and edges included in the second graph, thereby improving the efficiency of graph processing. Since subgraphs with the same hash value are similar, the merging does not affect the structure of the graph or the calculation logic it expresses, and it can also improve the clarity and completeness of the data flow in the graph.
  • since the first node is a node whose out-degree or in-degree is greater than the first threshold, there are many data flows from or to the first node; the first node is therefore allocated a first port and a second port, and the edges through which data is input at the first port and the edges through which data is output at the second port are merged, which further improves the clarity of the data flow in the graph displayed by the terminal device.
  • the merged ports can separate the exterior of an aggregation from its interior. Since intertwined and complex edges exist between aggregations, laying out the exterior and interior of the aggregations through the merged ports can further improve the clarity of the data flow in the displayed graph.
  • a second aspect of the embodiments of the present application provides a graph processing apparatus, including: an acquisition module configured to acquire at least two subgraphs of a first graph, where each subgraph includes multiple nodes in the first graph and the edges between those nodes; a calculation module configured to calculate the respective identifiers of the at least two subgraphs based on the nodes and edges included in each subgraph; a merging module configured to merge subgraphs with the same identifier among the at least two subgraphs; and an output module configured to output the second graph generated after merging.
  • the identifier is a hash value
  • the data of each subgraph indicates nodes and edges in each subgraph
  • the computing module is specifically configured to, for each of the at least two subgraphs, calculate the hash value corresponding to the subgraph based on the hash values corresponding to the multiple nodes in the subgraph and the hash values corresponding to the multiple edges in the subgraph.
  • in each subgraph, a hash value corresponding to a node is related to an attribute of the node, and a hash value corresponding to an edge is related to the connection relationship indicated by that edge.
  • the identifier is a hash value
  • the merging module is specifically configured to: merge subgraphs with the same hash value among the at least two subgraphs; add a first port and a second port to each of multiple first nodes, where each first node is a node in the first graph whose out-degree or in-degree is greater than the first threshold, the edges inputting data into each first node pass through the first port of that node, and the edges outputting data from each first node pass through the second port of that node; and perform the following operations on the multiple first nodes to generate the second graph: merging multiple edges passing through the same first port, and merging multiple edges passing through the same second port.
  • the second graph is laid out in a manner of orthogonal edge routing.
  • the identifier is a hash value; the merging module is specifically configured to: merge subgraphs with the same hash value among the at least two subgraphs; add ports to multiple first edges and multiple second edges, where each first edge indicates the input of an aggregation, each second edge indicates the output of an aggregation, and an aggregation indicates a computing function through a combination of a set of nodes and edges; merge multiple third ports among the added ports, where a third port is a port on a first edge that corresponds to the same aggregation and indicates input from the same node; merge multiple fourth ports among the added ports, where a fourth port is a port on a second edge that corresponds to the same aggregation and indicates output to the same node; and, based on the merged ports, lay out the second graph in an orthogonal edge routing manner.
  • the identifier is a hash value
  • the merging module is specifically configured to merge subgraphs with the same hash value in at least two subgraphs
  • the merging module is further configured to: add a first port and a second port to each of multiple first nodes, where each first node is a node in the first graph whose out-degree or in-degree is greater than the first threshold, the edges inputting data into each first node pass through the first port of that node, and the edges outputting data from each first node pass through the second port of that node; merge multiple edges passing through the same first port and multiple edges passing through the same second port; add ports to multiple first edges and multiple second edges, where each first edge indicates the input of an aggregation, each second edge indicates the output of an aggregation, and an aggregation indicates a computing function through a combination of a set of nodes and edges; merge multiple third ports among the added ports, where a third port is a port on a first edge that corresponds to the same aggregation and indicates input from the same node; and merge multiple fourth ports among the added ports, where a fourth port is a port on a second edge that corresponds to the same aggregation and indicates output to the same node.
  • the second graph is laid out in a way of orthogonal edge routing.
  • a third aspect of the embodiments of the present application provides a terminal device; the terminal device may be the graph processing apparatus in the above method design, or a chip provided in the graph processing apparatus.
  • the terminal device includes: a processor, which is coupled to the memory, and can be used to execute instructions in the memory, so as to implement the method executed by the graph processing apparatus in the first aspect and any possible implementation manner thereof.
  • the terminal device further includes a memory.
  • the terminal device further includes a communication interface, and the processor is coupled to the communication interface.
  • the communication interface may be a transceiver, or an input/output interface.
  • the communication interface may be an input/output interface.
  • the transceiver may be a transceiver circuit.
  • the input/output interface may be an input/output circuit.
  • a fourth aspect of the embodiments of the present application provides a server; the server may be the graph processing apparatus in the above method design, or a chip provided in the graph processing apparatus.
  • the server includes: a processor, which is coupled to the memory and can be configured to execute instructions in the memory, so as to implement the method executed by the graph processing apparatus in the first aspect and any possible implementation manner thereof.
  • the server further includes memory.
  • the server further includes a communication interface to which the processor is coupled.
  • the communication interface may be a transceiver, or an input/output interface.
  • the communication interface may be an input/output interface.
  • the transceiver may be a transceiver circuit.
  • the input/output interface may be an input/output circuit.
  • a fifth aspect of the embodiments of the present application provides a program, which, when executed by a processor, is used to execute any method in the first aspect and its possible implementation manners.
  • a sixth aspect of the embodiments of the present application provides a computer program product (or computer program) that stores one or more computer instructions; when the instructions are executed by a processor, the processor executes the method in the first aspect or any possible implementation manner of the first aspect.
  • a seventh aspect of the embodiments of the present application provides a chip, where the chip includes at least one processor, and is configured to support a terminal device to implement the functions involved in the first aspect or any possible implementation manner of the first aspect.
  • the chip system may further include a memory; the at least one processor is communicatively connected with the at least one memory, and the memory stores the program instructions and data necessary for the terminal device or the server.
  • the chip system further includes an interface circuit, and the interface circuit provides program instructions and/or data for the at least one processor.
  • a computer-readable storage medium stores a program, and the program enables a terminal device to execute any method of the foregoing first aspect and its possible implementation manners.
  • At least two subgraphs of the first graph can be obtained; based on the nodes and edges included in the at least two subgraphs, the respective hash values of the at least two subgraphs are calculated; subgraphs with the same hash value among the at least two subgraphs are merged to generate the second graph; and the second graph is output, thereby reducing the number of nodes and edges included in the second graph and improving the efficiency of graph processing.
  • since the merging does not affect the structure of the graph or the calculation logic it expresses, it can also improve the clarity and completeness of the data flow in the graph.
  • FIG. 1 is a schematic diagram of an embodiment of a computation graph according to an embodiment of the present application;
  • FIG. 2 is a schematic diagram of a system architecture of an embodiment of the present application.
  • FIG. 3 is a schematic diagram of the architecture of product implementation in an embodiment of the application.
  • FIG. 4 is a schematic diagram of an embodiment of a node in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an embodiment of an edge in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an embodiment of a frequent subgraph structure in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an embodiment of a simple graph and a composite graph in an embodiment of the present application;
  • FIG. 8 is a schematic diagram of an embodiment of a cross-aggregation edge in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of an embodiment of a method for processing a graph in an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an embodiment of a computation graph in an embodiment of the present application.
  • FIG. 11 is a schematic diagram of another embodiment of a method for processing a graph in an embodiment of the present application.
  • FIG. 12 is a schematic diagram of another embodiment of a computation graph in an embodiment of the present application.
  • FIG. 13 is a schematic diagram of an embodiment of a computation graph of a BERT network in an embodiment of the present application.
  • FIG. 14 is a schematic diagram of another embodiment of a method for processing a graph in an embodiment of the present application.
  • FIG. 15 is a schematic diagram of an embodiment of a port in an embodiment of the present application.
  • FIG. 16 is a schematic diagram of an embodiment of a merge port in an embodiment of the present application.
  • FIG. 17 is a schematic diagram of another embodiment of a method for processing a graph in an embodiment of the present application.
  • FIG. 18 is a schematic diagram of an embodiment of a graph processing apparatus in an embodiment of the present application.
  • FIG. 19 is a schematic structural diagram of an embodiment of a graph processing apparatus in an embodiment of the present application.
  • Embodiments of the present application provide a graph processing method, system, and device, which can acquire at least two subgraphs of a first graph, each subgraph including multiple nodes in the first graph and the edges between those nodes; calculate the respective identifiers of the at least two subgraphs based on the nodes and edges included in each subgraph; merge the subgraphs with the same identifier among the at least two subgraphs to generate a second graph; and output the second graph generated after merging, thereby reducing the number of nodes and edges included in the second graph and improving the efficiency of graph processing.
  • since the merging does not affect the structure of the graph or the calculation logic it expresses, it can also improve the clarity and completeness of the data flow in the graph.
  • FIG. 2 is a schematic diagram of the system architecture of the embodiment of the present application.
  • in the process of deep learning practice, the user writes code to construct a model, for example.
  • existing problems can be found through the visualization of the model computation graph. The graph processing method provided in the embodiments of the present application can display the model structure accurately and clearly, making it more convenient for the user to debug and tune the training.
  • FIG. 3 is a schematic diagram of the architecture of the product implementation in the embodiment of the present application.
  • the graph processing module first sends a request, reads the initial computation graph data in a specific format from the server or host directory based on the request, and then, in the web service of the browser, processes, renders, and displays the initial computation graph data using the graph processing method provided by the embodiments of the present application; based on the displayed graph data, the user can continue to interact and adjust the display form. It should be understood that, in practical applications, a unified computation graph data storage and parsing format needs to be configured in advance before the computation graph is processed using the method provided by the embodiments of the present application.
• By visualizing the computation graph generated by the deep learning framework, users can check whether the code they have written conforms to the intended model structure and locate problems during model training.
• The computation graph usually involves hierarchical aggregation and a huge data volume, so the layout of nodes and edges in the computation graph is cluttered, which reduces the efficiency of computation graph processing as well as the clarity and completeness of the data flow in the computation graph displayed by the terminal device.
  • a computational graph is a directed graph used to represent data flow and computational operations.
  • the computational graph includes nodes and edges.
• Each node in the computation graph corresponds to an operation (Operation) or a variable (Variable). Variables can deliver their own values to operations, and an operation usually denotes a piece of computational logic, such as assignment, addition, rounding, AND, or OR; therefore, some nodes in the computation graph define a function of the variables in the computation graph.
• The values input to and output from nodes can take various data forms, such as tensors. A tensor denotes a multi-dimensional array, so tensors include, but are not limited to, scalars, vectors, matrices, and higher-order tensors.
  • FIG. 4 is a schematic diagram of an embodiment of a node in an embodiment of the present application. As shown in the figure, B1 to B11 are used to indicate nodes.
• Edges are used to indicate the flow of data between nodes in a computation graph. Each end of an edge is connected to a node, and data flows from the node at one end of the edge to the node at the other end; therefore, in a directed graph, edges are directional. For example, if the two ends of an edge are connected to node A and node B respectively, and the direction of the edge is from node A to node B, then data flows from node A to node B. For a node, an edge pointing toward the node indicates data input to (or "inflow" into) the node, and an edge pointing from the node to another node indicates data output from (or "outflow" out of) the node.
• A port can be understood as a concrete representation of the input or output of data, so a port can be marked anywhere on an edge, for example at one end of the edge, that is, at the intersection of the edge and the node, or at a position on the edge close to its end.
  • FIG. 5 is a schematic diagram of an embodiment of an edge in an embodiment of the present application.
  • C1 to C5 are used to indicate nodes
• C6 to C9 are used to indicate edges, wherein edge C6 is the data flow between node C1 and node C2.
  • Edge C7 is the data flow between node C2 and node C3.
  • Edge C8 is the data flow between node C2 and node C4.
  • Edge C9 is the data flow between node C2 and node C5. It can be seen that the edge has a direction.
• Frequent subgraphs are subgraph structures that appear repeatedly in the computation graph; a frequent subgraph consists of multiple parallel data flow paths between a start node (startHubNode) and an end node (endHubNode).
  • FIG. 6 is a schematic diagram of an embodiment of a frequent subgraph structure in an embodiment of the present application.
  • D1 is used to indicate the start node
  • D2 is used to indicate the end node
• There are multiple parallel data flow paths between the start node and the end node. The nodes and edges along these data flow paths form frequent subgraphs, and the subgraphs described in this embodiment are the frequent subgraphs introduced above.
  • FIG. 7 is a schematic diagram of an embodiment of a simple diagram and a composite diagram in an embodiment of the present application.
  • diagram (A) in FIG. 7 is used to indicate a simple diagram
  • E1 to E4 are used to indicate aggregations
  • E5 and E6 are used to indicate sub-aggregations.
• The nodes in diagram (A) in FIG. 7 do not form aggregations; that is, they do not constitute the aggregation relationship of hierarchical nodes.
• An aggregation may contain multiple sub-aggregations or child nodes. For example, aggregation E1 includes 2 child nodes, aggregation E2 includes 3 child nodes, aggregation E3 includes 2 child nodes, and aggregation E4 includes one child node plus sub-aggregation E5 and sub-aggregation E6, where sub-aggregation E5 includes 3 child nodes and sub-aggregation E6 includes 4 child nodes. The aggregation relationship of hierarchical nodes can be formed through the data flow between multiple aggregations.
• A cross-aggregation edge is the flow of data between a node inside an aggregation and a node outside the aggregation in a composite graph.
  • FIG. 8 is a schematic diagram of an embodiment of cross-aggregation edges in this embodiment of the present application.
  • F1 to F3 are used to indicate nodes
  • F4 and F5 are used to indicate edges.
• The figure includes aggregation 1 and aggregation 2; aggregation 1 includes node F1 and node F2, and aggregation 2 includes node F3.
• The edge F4 corresponding to the data flow between node F1 and node F3 is a cross-aggregation edge. Node F1 and node F2 belong to the same aggregation; therefore, the edge F5 corresponding to the data flow between node F1 and node F2 is not a cross-aggregation edge.
  • an embodiment of the present application provides a graph processing method, which is used to improve the clarity and completeness of the data flow in the graph displayed by the terminal device.
• The embodiments of the present application are described by taking a computation graph corresponding to a deep learning framework as an example. It should be understood that, in practical applications, the graph processing method provided by the embodiments of the present application can be applied to various graphs that include nodes and edges, which is not specifically limited here. The following describes the graph processing method in this embodiment of the present application in detail. Please refer to FIG. 9, which is a schematic diagram of an embodiment of the graph processing method in this embodiment of the present application. As shown in the figure, the graph processing method includes the following steps.
• S101: The graph processing apparatus acquires the first graph.
  • the server can read the first graph (computation graph data) saved by the deep learning framework from the server memory.
• The terminal device can read the first graph saved by the deep learning framework from the memory of the terminal device, or receive the first graph sent from the server.
  • the graph processing apparatus may be a server or a terminal device, and the specific graph processing apparatus and the specific manner of acquiring the first graph are not limited in this embodiment.
  • the first graph includes edges and nodes, and in this embodiment, the edges are used to represent the flow of data between different nodes.
• First, the processing efficiency of the terminal device for the first graph will be reduced; second, it is also inconvenient for users to quickly find key sub-regions.
• The graph processing apparatus can determine, from the first graph, a node whose out-degree is greater than the second threshold, and this node is defined as the start node in this embodiment.
• The second threshold is used to constrain the out-degree of a node's data flow; an out-degree greater than the second threshold indicates that the node's out-degree has exceeded the preset out-degree, resulting in more data flows from the node and therefore more edges shown in the figure.
  • the specific value of the second threshold can be 2, 5, or 8, etc.
• The specific value of the second threshold needs to be predetermined according to the actual situation of the first graph, which is not limited here.
• The corresponding end node is determined through the paths that the data flow from the start node passes through, and all subgraphs between the two nodes are determined through the start node and the end node. Because the out-degree of the start node is greater than the second threshold, the number of subgraphs between the start node and the end node is at least two, and each subgraph includes edges and nodes.
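The start-node determination above can be sketched as follows. This is a minimal illustration under assumed data structures (the graph as a list of `(source, target)` edge pairs; the function name and default threshold are illustrative, not from the patent):

```python
from collections import defaultdict

def find_start_nodes(edges, out_degree_threshold=2):
    # A start node is any node whose out-degree exceeds the threshold,
    # i.e. whose outgoing data flows fan out into parallel paths.
    out_degree = defaultdict(int)
    for source, _ in edges:
        out_degree[source] += 1
    return [n for n, d in out_degree.items() if d > out_degree_threshold]

# Example shaped like FIG. 6: D1 fans out into three parallel paths to D2.
edges = [("D1", "a"), ("D1", "b"), ("D1", "c"),
         ("a", "D2"), ("b", "D2"), ("c", "D2")]
print(find_start_nodes(edges))  # -> ['D1']
```

The subgraphs between a start node found this way and its end node are then enumerated along the parallel paths, as described above.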
• The identifier is used to indicate the features of the subgraph; for example, the identifier may be the hash value of the subgraph, which is not specifically limited here. Since this embodiment takes a computation graph applied to a deep learning framework as an example, and takes the identifier being the hash value of a subgraph as an example, the graph processing apparatus can, based on the characteristics of the deep learning computation graph, calculate for each subgraph the hash values corresponding to its nodes and edges.
• Because the hash value corresponding to a node can indicate the features of the node, and the hash value corresponding to an edge can indicate the features of the edge, nodes and edges with different features can be distinguished by their hash values.
  • the hash value is calculated for all nodes in the subgraph.
• A node corresponds to a variety of node attributes; the node attributes include, but are not limited to, the variable type, the parameter type, the aggregation to which the node belongs, and so on. Therefore, it is first necessary to obtain the node attributes of node n.
• For example, the node attributes include the node type corresponding to node n, the number of hidden input nodes of node n, and the types of the hidden input nodes of node n.
• The Time33 hash algorithm is applied to each node attribute of node n to obtain the hash value corresponding to each node attribute; the hash values corresponding to the node attributes are then added to obtain a summed node attribute hash value.
• The summed node attribute hash value is taken modulo the large prime BIG_PRIMITIVE to obtain the hash value node_hash[n] corresponding to node n.
• As an example, the large prime number in this embodiment is 10000019. It should be understood that, in practical applications, the specific large prime number should be flexibly determined according to the actual situation, which is not specifically limited here.
• The hash value corresponding to each node is determined in a manner similar to the foregoing embodiment, and the hash values of all nodes are added to obtain a summed node hash value. If the number of nodes in the subgraph is large, hash value overflow may occur; to prevent overflow, the summed node hash value is taken modulo the large prime to obtain the hash value corresponding to the nodes in the subgraph.
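The node-hashing steps above can be sketched as follows. The initial value 5381 and the 32-bit masking are conventional choices for the Time33 (DJBX33A) family and are assumptions here; the patent only specifies "Time33", the summation, and the modulo by the large prime 10000019:

```python
BIG_PRIMITIVE = 10000019  # the large prime used in this embodiment

def time33(s):
    # Time33 (DJBX33A) string hash: repeatedly h = h * 33 + character code.
    h = 5381
    for ch in s:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF  # keep the value in 32 bits
    return h

def node_hash(node_attrs):
    # Hash each node attribute with Time33, sum the per-attribute hashes,
    # then take the sum modulo the large prime to prevent overflow.
    return sum(time33(str(v)) for v in node_attrs.values()) % BIG_PRIMITIVE

def nodes_hash(nodes):
    # Hash of all nodes in a subgraph: sum of per-node hashes, mod the prime.
    return sum(node_hash(n) for n in nodes) % BIG_PRIMITIVE
```

With these definitions, nodes whose attributes differ in any field (type, hidden-input count, owning aggregation, and so on) almost always receive different hash values.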
• The hash value corresponding to an edge is related to the connection relationship indicated by the edge in each subgraph; that is, the hash value corresponding to an edge in each subgraph is calculated from the connection relationship between the nodes indicated by the edge, and this connection relationship is directional.
• For example, edge i represents the data flow from node A to node B; that is, for edge i, the data output node is node A and the data input node is node B.
• Node A and node B can be encoded into a string "[source type]->[target type]", where [source type] is used to indicate the type of node A of edge i, [target type] is used to indicate the type of node B of edge i, and "[source type]->[target type]" indicates that the data flow of edge i is from node A to node B, which preserves data order.
• The Time33 hash algorithm is then applied to "[source type]->[target type]" to obtain the hash value edge_hash[i] corresponding to edge i, where the Time33 hash algorithm is specifically used to map strings to numbers.
• The hash value corresponding to each edge is determined in a manner similar to the foregoing embodiment, and the hash values of all edges are added to obtain a summed edge hash value. If the number of edges in the subgraph is large, hash value overflow may occur; to prevent overflow, the summed edge hash value is taken modulo the large prime to obtain the hash value corresponding to the edges in the subgraph.
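The edge-encoding step can be sketched as below. As before, the Time33 initial value and masking are assumed details; the direction-preserving string format "[source type]->[target type]" is the one described above:

```python
def time33(s):
    # Time33 string hash, assumed DJBX33A variant (h = h * 33 + char code).
    h = 5381
    for ch in s:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h

def edge_hash(source_type, target_type):
    # Encode the directed connection as "[source type]->[target type]" so
    # that reversing an edge yields a different string, then map it to a
    # number with Time33.
    return time33(f"{source_type}->{target_type}")
```

Because the string encodes direction, an edge from a node of type A to a node of type B hashes differently from an edge from B to A, which is what "to ensure data order" refers to.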
• The identifier being a hash value is used as an example for introduction, but in practical applications, the identifier can also be obtained by comparing the nodes and edges of the subgraph against a preset subgraph library so as to match the subgraph with an identifier; the structures of similar subgraphs correspond to the same identifier, so the identifier can indicate the features of the subgraph, and subgraphs with different features can likewise be distinguished by their identifiers.
• Alternatively, a value may be obtained in another manner and used as the identifier, as long as the identifier can indicate the features of the subgraph; the specific form of the identifier is not limited here.
• When the identifier is a hash value, this application does not specifically limit the hash algorithm used to calculate the hash value, as long as subgraphs of different structures can be distinguished and similar (including identical) subgraphs have the same hash value.
  • S104 Calculate the identifier corresponding to each subgraph based on the identifier corresponding to the node in each subgraph and the identifier corresponding to the edge in each subgraph.
• Specifically, the graph processing apparatus can obtain, through step S103, the hash value corresponding to the nodes in each subgraph and the hash value corresponding to the edges in each subgraph; the obtained node hash value and edge hash value are then added, and the sum is taken modulo the large prime to obtain the hash value corresponding to each subgraph.
• For example, for subgraph A, subgraph B, and subgraph C, the hash value corresponding to the nodes and the hash value corresponding to the edges in each subgraph can be obtained through step S103; the node hash value and edge hash value are then added and taken modulo the large prime, so that the hash value corresponding to subgraph A is H(A), the hash value corresponding to subgraph B is H(B), and the hash value corresponding to subgraph C is H(C).
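Step S104 reduces to one line once the node and edge hashes are available. A minimal sketch, taking the already-computed per-node and per-edge hash values as plain integer lists:

```python
BIG_PRIMITIVE = 10000019  # the large prime used in this embodiment

def subgraph_hash(node_hashes, edge_hashes):
    # Identifier of a subgraph: the node hashes and the edge hashes are
    # added together and the sum is reduced modulo the large prime.
    return (sum(node_hashes) + sum(edge_hashes)) % BIG_PRIMITIVE
```

Two subgraphs built from the same node types and the same directed connections produce identical node and edge hashes, and therefore the same subgraph hash, which is the property the merging step relies on.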
• The description is given by taking the identifier being a hash value as an example. Since subgraphs with the same hash value are similar, the graph processing apparatus may merge the subgraphs with the same hash value among the at least two subgraphs to generate the second graph, and the second graph is used for display on the terminal device, so that the number of nodes and edges in the second graph displayed on the terminal device can be reduced.
• It should be noted that the merging described in this embodiment does not completely merge the subgraphs but stacks them, so that the number of subgraphs shown in the generated second graph is reduced while the data included in the subgraphs is not merged away or reduced.
• For example, suppose the hash value H(A) corresponding to subgraph A, the hash value H(B) corresponding to subgraph B, and the hash value H(C) corresponding to subgraph C are obtained through step S104, and H(A) equals H(B). Then subgraph A and subgraph B can be merged to generate the second graph, so that only the structures corresponding to subgraph A and subgraph C are shown in the displayed second graph, which reduces the number of nodes and edges displayed in the computation graph.
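The stacking step can be sketched as grouping by hash value and keeping one representative per group together with a count, so that the stacked subgraphs remain recoverable (function name and the `(hash, subgraph)` pairing are illustrative):

```python
from collections import defaultdict

def stack_subgraphs(subgraphs):
    """subgraphs: list of (hash_value, subgraph) pairs. Subgraphs sharing a
    hash value are stacked: one representative is kept per hash together
    with the count of stacked copies, so the underlying data is not lost."""
    groups = defaultdict(list)
    for h, sg in subgraphs:
        groups[h].append(sg)
    return [(h, members[0], len(members)) for h, members in groups.items()]

# Example: H(A) == H(B) == 7, H(C) == 9, as in the text above.
print(stack_subgraphs([(7, "A"), (7, "B"), (9, "C")]))
# -> [(7, 'A', 2), (9, 'C', 1)]
```

A viewer can render the representative once and annotate it with the count, which matches the "stacking rather than deleting" behavior described above.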
• The server may generate the second graph in the manner of the foregoing embodiment and send the generated second graph to the terminal device so that the terminal device displays it, or the server may display the second graph directly.
• Alternatively, the terminal device can directly generate the second graph in the manner of the foregoing embodiment and display it, or receive the second graph sent by the server and display the received second graph.
• The specific display manner of the second graph is likewise not limited.
• The solution provided by the embodiment of the present application can accurately and completely find all frequent subgraphs of the same level within the same aggregation level for stacking, can stack recursively layer by layer, and accurately and completely identifies the structure of frequent subgraphs; on the premise of ensuring the accuracy of the connection relationships, it reduces the number of nodes and edges displayed in the computation graph.
• The BERT pre-training (pretrain) network computation graph and the mobilenetV2 network computation graph generated based on the open-source computing framework MindSpore are used as examples for description. Please refer to Table 1, which gives the detailed information corresponding to the BERT_pretrain network computation graph and the mobilenetV2 network computation graph.
• Table 1:

  Network        Node                 Original node count    Node count after stacking    Effect
  mobilenetV2    Optimizer_Momentum   650                    33                           Displays the graph within 2 seconds
  BERT_pretrain  Optimizer_Lamb       16723 (crash)          99                           Displays the graph within 5 seconds
• For mobilenetV2, the number of nodes is reduced from 650 to 33, and the terminal device can display the graph within 2 seconds. It can therefore be seen that the graph processing method provided by the embodiment of the present application can reduce the number of nodes in the computation graph and can also improve the efficiency with which the terminal device displays the computation graph. It should be understood that the examples in Table 1 are only used to understand this solution, and the specifics need to be flexibly determined according to the actual situation.
• The second graph is obtained by reducing the number of displayed nodes and edges as in the foregoing embodiment, and an orthogonal layout is used as the basic layout style; in an orthogonal layout, the edges do not cross each other, all nodes of the same depth lie on the same horizontal line, and there is a certain gap between nodes at the same level.
• However, a large-scale computation graph usually includes a large number of nodes and edges, so even when the orthogonal layout is used as the basic layout style, the displayed computation graph may still not be clear enough, making it impossible for users to perform training analysis and debugging based on the graph.
• FIG. 10 is a schematic diagram of an embodiment of the computation graph in the embodiment of the present application.
• G1 is used to indicate the multiplication (mul) operator, and there are 10 edges connected to the mul operator G1. If all 10 edges are displayed in the computation graph, then because there are too many edges connected to the mul operator G1, the computation graph cannot clearly display the relationship between the edges connected to the mul operator G1 and the data flow of the mul operator G1, which is not conducive to analysis of the computation graph.
  • the embodiment of the present application provides another method for processing a graph.
• In this embodiment, the identifier being a hash value is again used as an example for introduction.
  • FIG. 11 is a schematic diagram of another embodiment of the graph processing method in the embodiment of the present application. As shown in the figure, the graph processing method includes the following steps.
• The manner in which the graph processing apparatus acquires the first graph is similar to step S101, and details are not described herein again.
• The manner in which the graph processing apparatus acquires at least two subgraphs of the first graph is similar to step S102, and details are not described herein again.
  • the graph processing apparatus calculates a hash value corresponding to a node in each subgraph and a hash value corresponding to an edge in each subgraph, which is similar to step S103. This will not be repeated here.
• The graph processing apparatus calculates the hash value corresponding to each subgraph based on the hash value corresponding to the nodes in each subgraph and the hash value corresponding to the edges in each subgraph, which is similar to step S104 and will not be repeated here.
  • the manner in which the graph processing apparatus merges subgraphs with the same hash value in at least two subgraphs is similar to the manner introduced in step S105, and details are not described herein again.
• The graph processing apparatus can traverse all the nodes in the graph generated after the merging and determine the nodes whose out-degree or in-degree is greater than the first threshold.
• Such a node is determined as a first node, and a first port and a second port are then allocated to each first node, wherein the first port is the port of the first node through which each data-input edge of the first node passes, and the second port is the port of the first node through which each data-output edge of the first node passes.
• The first threshold may constrain the out-degree of a node's data flow and may also constrain the in-degree of the node's data inflow; that is, exceeding the first threshold indicates that the node's out-degree has exceeded the preset out-degree, or that its in-degree has exceeded the preset in-degree, which leads to more data flows at the node and therefore more edges displayed in the figure.
• The specific value of the first threshold can be 3, 4, or 5, etc.; the specific value needs to be predetermined according to the actual situation of the computation graph, which is not limited here.
• The graph processing apparatus merges the multiple edges passing through the same first port and the multiple edges passing through the same second port according to all the edges included in the graph generated after merging, so as to generate the second graph.
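The port-bundling step above can be sketched as follows. The edge-list representation, the function name, and the default threshold are assumptions for illustration; only the rule (edges through the same input or output port of a high-degree node are merged into one displayed edge) comes from the text:

```python
from collections import defaultdict

def bundle_port_edges(edges, first_threshold=3):
    """edges: list of (source, target) pairs. For every node whose in-degree
    or out-degree exceeds the threshold, the edges through its input port
    (resp. output port) are merged into a single displayed bundle."""
    in_edges, out_edges = defaultdict(list), defaultdict(list)
    for source, target in edges:
        out_edges[source].append((source, target))
        in_edges[target].append((source, target))
    bundles = {}
    for node, es in in_edges.items():
        if len(es) > first_threshold:
            bundles[(node, "input_port")] = es   # one merged input edge
    for node, es in out_edges.items():
        if len(es) > first_threshold:
            bundles[(node, "output_port")] = es  # one merged output edge
    return bundles

# Example shaped like FIG. 10: four inputs into a "mul" node collapse
# into a single bundle through its input port.
edges = [("a", "mul"), ("b", "mul"), ("c", "mul"), ("d", "mul")]
print(bundle_port_edges(edges))
```

A renderer can then draw one edge per bundle into the port while keeping the full member list available for inspection, which preserves the data flow of the local focus area.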
  • the second graph is laid out in an orthogonal edge routing manner.
• Specifically, the port-constrained optimization layout algorithm is first used to optimize the position and ordering of the first ports and second ports, with an overall orthogonal edge-routing layout as the goal.
• The orthogonal layout specifically means that the angles between the connecting lines around a node are 90 degrees, and edge routing refers to the arrangement and direction of the specific connecting lines in the figure.
• FIG. 12 is a schematic diagram of another embodiment of the computation graph in the embodiment of the present application.
• H1 is used to indicate the mul operator.
• The edges whose data input passes through the first port and the edges whose data output passes through the second port are merged and laid out by orthogonal edge routing to generate FIG. 12, which reduces the number of connected edges in the graph displayed by the terminal device, so that large-scale computation graphs with many nodes and edges can be displayed clearly; this further improves the clarity of the computation graph and makes it easier for users to perform training analysis and debugging based on the graph.
• The computation graph of the Bidirectional Encoder Representations from Transformers (BERT) network is processed using the method corresponding to the foregoing embodiment, and the visualized graph is obtained through the orthogonal edge-routing layout; please refer to FIG. 13.
• The overall data flow of the obtained graph is complete and clear from left to right, so after the terminal device displays the figure, the user can interactively click on the operators in it to view more detailed substructures.
  • FIG. 12 and FIG. 13 are only used for understanding this solution, and the specifics need to be flexibly determined according to the actual situation.
• The manner in which the graph processing apparatus outputs the second graph generated after the merging is similar to step S106, and details are not described herein again.
• An aggregation is a set containing some child nodes; that is, the computation graph may include aggregation information. Please refer again to FIG. 7. It can be seen from FIG. 7 that when there are aggregations in the computation graph, since each aggregation includes at least one node, there will be interlaced, complex edges between the aggregations, so that the computation graph of a composite graph still has the problem that the displayed computation graph is not clear enough.
• Therefore, in order to solve the display problem of the composite graph shown in FIG. 7, the embodiment of the present application provides another method for graph processing.
• FIG. 14 is a schematic diagram of another embodiment of the graph processing method in this embodiment of the present application.
• As shown in the figure, the first graph includes aggregations, and the graph processing method includes the following steps.
• The manner in which the graph processing apparatus acquires the first graph is similar to step S201, and details are not described herein again.
• The manner in which the graph processing apparatus acquires at least two subgraphs of the first graph is similar to step S202, and details are not described herein again.
• The graph processing apparatus calculates the hash value corresponding to the nodes in each subgraph and the hash value corresponding to the edges in each subgraph in a manner similar to step S203, which will not be repeated here.
• The manner of calculating the hash value corresponding to each subgraph based on the hash value corresponding to the nodes in each subgraph and the hash value corresponding to the edges in each subgraph is similar to step S204 and will not be repeated here.
• The manner in which the graph processing apparatus merges subgraphs with the same hash value among the at least two subgraphs is similar to step S205, and details are not described herein again.
• The manner in which the graph processing apparatus adds the first port and the second port to each of the multiple first nodes is similar to step S206, and details are not described herein again.
• The manner in which the graph processing apparatus merges multiple edges passing through the same first port and multiple edges passing through the same second port is similar to step S207, and is not repeated here.
• Since there are aggregations in the computation graph, after the graph processing apparatus merges the multiple edges passing through the same first port and the multiple edges passing through the same second port, the resulting graph includes aggregations. Therefore, the graph processing apparatus can first determine the first edges and the second edges according to the data flows between the nodes included in an aggregation and the nodes outside the aggregation.
• The first edges and the second edges are both the cross-aggregation edges introduced in FIG. 8. Specifically, a first edge indicates a data input of the aggregation, while a second edge indicates a data output of the aggregation, and an aggregation can indicate a computing function through the combination of a set of nodes and edges.
• Next, a port may be added for each first edge and second edge.
• The number of ports is the same as the total number of first edges and second edges corresponding to the aggregation. For example, if an aggregation has 3 data inputs, then 3 first edges can be determined; and if the aggregation has 2 data outputs, then 2 second edges can be determined; in that case, the number of ports added by the graph processing apparatus is 5.
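The port count in the example above is simply the sum of cross-aggregation inputs and outputs; a one-line sketch (function name illustrative):

```python
def aggregation_port_count(num_data_inputs, num_data_outputs):
    # One port is added per cross-aggregation edge: each data input of the
    # aggregation yields a first edge, and each data output a second edge.
    return num_data_inputs + num_data_outputs

print(aggregation_port_count(3, 2))  # -> 5, as in the example above
```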
  • FIG. 15 is a schematic diagram of an embodiment of a port in an embodiment of the present application.
• I1 and I2 are used to indicate the first edges.
• I3 and I4 are used to indicate the second edges.
• I5 to I8 are used to indicate the ports.
  • the figure includes an aggregation and 6 nodes.
  • node 1, node 2, and node 6 do not belong to the aggregation
  • the aggregation includes node 3, node 4, and node 5.
• The edge I1 corresponding to the data flow between node 1 and node 4 indicates a data input of the aggregation, so edge I1 is a first edge.
• The edge I2 corresponding to the data flow between node 2 and node 3 likewise indicates a data input of the aggregation, so edge I2 is a first edge.
• Edge I3 indicates a data output of the aggregation, so edge I3 is a second edge.
• The edge I4 corresponding to the data flow between node 5 and node 6 indicates a data output of the aggregation, so edge I4 is a second edge. Thus 2 first edges and 2 second edges can be determined. Based on each first edge and second edge, a port is added at the intersection with the aggregation; that is, the intersection I5 of the first edge I1 and the aggregation is a port. Similarly, the intersection I6 of the first edge I2 and the aggregation is a port.
• The intersection I7 of the second edge I3 and the aggregation is a port, and the intersection I8 of the second edge I4 and the aggregation is a port; that is, four ports corresponding to the two first edges and the two second edges can be added. It should be understood that the example in FIG. 15 is only used for understanding this solution.
• The specific first edges and second edges need to be determined flexibly according to the actual data flow relationships between the nodes in the figure, and the specific ports to be added need to be determined flexibly according to the actual situation of the first edges and second edges.
• The graph processing apparatus traverses all the nodes in the aggregation; among the multiple first edges, the ports on edges that correspond to the same aggregation and whose input comes from the same node are determined as a third port, and these ports are merged into the third port. Next, among the multiple second edges, the ports on edges that correspond to the same aggregation and whose output goes to the same node are determined as a fourth port, and these ports are merged into the fourth port.
• FIG. 16 is a schematic diagram of an embodiment of merged ports in an embodiment of the present application. As shown in the figure, J1 to J4 are used to indicate ports, and J5 is used to indicate the fourth port.
• FIG. 16(A) shows an example diagram including ports J1 to J4, in which both port J3 and port J4 are connected to node 6; that is, the data flows of node 4 and node 5 in the aggregation both point to node 6. In other words, port J3 and port J4 are ports on two second edges that correspond to the same aggregation and indicate output to the same node, so port J3 and port J4 can be merged to obtain the fourth port, yielding the example diagram in FIG. 16(B), which includes port J1, port J2, and the fourth port J5. It should be understood that the example in FIG. 16 is only used for understanding this solution, and the specific third port and fourth port need to be determined flexibly according to the actual data flows between the nodes inside the aggregation and the nodes outside the aggregation.
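The third/fourth-port merging rule can be sketched as grouping cross-aggregation edges by (aggregation, external endpoint, direction); the triple representation and names are assumptions for illustration:

```python
from collections import defaultdict

def merge_aggregation_ports(cross_edges):
    """cross_edges: list of (aggregation, external_node, direction) triples,
    direction being "in" for first edges or "out" for second edges. Ports of
    edges that share the same aggregation, the same external endpoint, and
    the same direction are merged into one (third or fourth) port."""
    merged = defaultdict(int)
    for agg, external_node, direction in cross_edges:
        merged[(agg, external_node, direction)] += 1
    return merged  # one port per key; the count is how many edges share it

# Example shaped like FIG. 16: two outputs of the aggregation both go to
# node 6, so their ports (J3, J4) merge into one fourth port.
edges = [("agg1", "node1", "in"), ("agg1", "node2", "in"),
         ("agg1", "node6", "out"), ("agg1", "node6", "out")]
print(dict(merge_aggregation_ports(edges)))
```

Each surviving key corresponds to one port on the aggregation boundary, matching the reduction from four ports in FIG. 16(A) to three in FIG. 16(B).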
  • the second graph is laid out in a manner of orthogonal edge routing.
• The graph processing apparatus lays out the second graph by orthogonal edge routing based on the merged ports. Specifically, each first edge is first split into two sections with the third port as the boundary, and each second edge is split into two sections with the fourth port as the boundary. If there is no third port or fourth port, the first edge and the second edge may also be split into two sections with the ordinary port as the boundary, which is not specifically limited here. For ease of understanding, a further example is given based on the fourth port in FIG. 16; please refer to FIG. 17.
  • FIG. 17 is a schematic diagram of another embodiment of the graph processing method in this embodiment of the present application.
  • K1 and K2 are used to indicate ports, K3 is used to indicate the fourth port, K4, K5, and K6 are used to indicate the edges outside the aggregation, and K7 to K9 are used to indicate the edges inside the aggregation.
  • according to port K1, the first edge corresponding to the aggregated data input from node 1 to node 4 is split, thereby obtaining edge K4 and edge K7.
  • similarly, edge K5 and edge K8 can be obtained according to port K2.
  • the second edge corresponding to the aggregated data output from node 4 and node 5 to node 6 can be split, obtaining edge K6 and edge K9. Then, using the port-constraint optimization layout algorithm, with the ports and the fourth port on the aggregation as boundaries, layout calculation of nodes and edges is performed inside the aggregation and between the inside and outside of the aggregation respectively, and the number, position, and order of nodes and ports are adjusted under constraints, thereby generating the second graph.
  • the specific layout of the orthogonal edge routing has been introduced in the foregoing embodiments, and will not be repeated here.
  • the manner in which the graph processing apparatus outputs the second graph generated after merging is similar to step S208, and details are not described herein again.
  • the embodiments of the present application adopt port design and rule-based edge bundling on cross-aggregation data edges, which can not only reduce the number of edges but also preserve the complete data flow of a local focus area.
  • the port-constraint optimization layout algorithm is used to adaptively adjust and limit the positions and numbers of nodes and ports on the aggregation boundary, so that the method can more generally adapt to different computational graph structures and, while simplifying the graph layout, keep the original local data connection relationships as much as possible.
  • steps S306 and S307 may be implemented first, and then step S305 may be implemented. Likewise, steps S308 to S310 may be implemented first, and then steps S306 and S307 may be implemented. Therefore, the order of the steps in this embodiment can be adjusted according to the actual situation and is not limited here.
  • the graph processing apparatus includes corresponding hardware structures and/or software modules for executing each function.
  • the present application can be implemented in hardware or in the form of a combination of hardware and computer software. Whether a function is performed by hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.
  • This embodiment of the present application may divide the graph processing apparatus into functional modules based on the foregoing method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules. It should be noted that, the division of modules in the embodiments of the present application is schematic, and is only a logical function division, and there may be other division manners in actual implementation.
  • FIG. 18 is a schematic diagram of an embodiment of the graph processing apparatus in the embodiment of the application. As shown in the figure, the graph processing apparatus 1800 includes:
  • an obtaining module 1801 configured to obtain at least two subgraphs of the first graph, wherein each subgraph includes a plurality of nodes in the first graph and edges between nodes;
  • a calculation module 1802 configured to calculate the respective identities of the at least two subgraphs based on the at least two subgraphs, the nodes and edges included in each subgraph;
  • a merging module 1803 configured to merge subgraphs with the same identifier among the at least two subgraphs;
  • an output module 1804 configured to output the second graph generated after merging.
  • the identifier is a hash value
  • the data of each subgraph indicates nodes and edges in each subgraph
  • the calculation module 1802 is specifically configured to, for each of the at least two subgraphs, calculate the hash value corresponding to the subgraph based on the hash values corresponding to the multiple nodes in the subgraph and the hash values corresponding to the multiple edges in the subgraph.
  • a hash value corresponding to a node is related to an attribute of the node
  • the hash value corresponding to an edge is related to the connection relationship indicated by the edge in each subgraph.
  • the identification is a hash value
  • each first node is a node in the first graph whose out-degree or in-degree is greater than the first threshold, the edges inputting data to each first node pass through the first port of each first node, and the edges outputting data from each first node pass through the second port of each first node;
  • the second graph is laid out in a manner of orthogonal edge routing.
  • the identification is a hash value
  • each first edge indicates the input of an aggregation
  • each second edge indicates the output of an aggregation
  • an aggregation indicates a computing function through a combination of a set of nodes and edges
  • the third ports are, among the multiple first edges, the ports on edges that correspond to the same aggregation and indicate inputs from the same node; and the multiple fourth ports among the added ports are merged, where a fourth port is, among the multiple second edges, a port on an edge that corresponds to the same aggregation and indicates an output to the same node;
  • the second graph is laid out in an orthogonal edge routing manner.
  • the identification is a hash value
  • a first port and a second port of each first node in the plurality of first nodes are added, where each first node is a node in the first graph whose out-degree or in-degree is greater than the first threshold, the edges inputting data to each first node pass through the first port of each first node, and the edges outputting data from each first node pass through the second port of each first node;
  • each first edge indicates the input of an aggregation
  • each second edge indicates the output of an aggregation
  • an aggregation indicates a computing function through a combination of a set of nodes and edges
  • the third ports are, among the multiple first edges, the ports on edges that correspond to the same aggregation and indicate inputs from the same node; and the multiple fourth ports among the added ports are merged, where a fourth port is, among the multiple second edges, a port on an edge that corresponds to the same aggregation and indicates an output to the same node;
  • the second graph is laid out in an orthogonal edge routing manner.
  • the graph processing apparatus in the embodiments of the present application may be deployed in a terminal device or a server, and may also be a chip applied to the terminal device or the server, or other combined devices and components that can implement the functions of the above-mentioned terminal device.
  • the computing module and the merging module may be implemented by a processor executing codes, for example, the processor may be an application chip of a certain type.
  • the calculation module and the merging module can be implemented by the processor executing codes.
  • the computing module and the merging module may be processors of the system-on-chip.
  • FIG. 19 is a schematic structural diagram of an embodiment of a graph processing apparatus in an embodiment of the present application.
  • the graph processing apparatus 1900 includes a processor 1910, a memory 1920 coupled to the processor 1910, and input/output ports 1930. In some implementations, they can be coupled together via a bus.
  • the graph processing apparatus 1900 may be a server or a terminal device.
  • the processor 1910 may be a central processing unit (CPU), a network processor (NP), or a combination of CPU and NP.
  • the processor may also be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • the above-mentioned PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a general-purpose array logic (generic array logic, GAL) or any combination thereof.
  • the processor 1910 may refer to one processor, or may include multiple processors.
  • the memory 1920 may include volatile memory (volatile memory), such as random access memory (random access memory, RAM), and the processor 1910 may execute codes to implement the functions of the computing module 1802 and the merging module 1803.
  • the memory 1920 may also include non-volatile memory (non-volatile memory), such as read-only memory (ROM), flash memory (flash memory), hard disk drive (HDD) or solid state drive ( solid-state drive, SSD); memory 1920 may also include a combination of the types of memory described above.
  • the memory 1920 stores computer-readable instructions for performing any of the methods in the possible implementations described above. After executing the computer-readable instructions, the processor 1910 can perform corresponding operations according to their instructions. In addition, after the processor 1910 executes the computer-readable instructions in the memory 1920, it can perform, according to those instructions, all the operations that the server or the terminal device performs in the corresponding embodiments.
  • Input/output ports 1930 include ports for outputting data, and in some cases, ports for inputting data.
  • the processor 1910 can call the input/output ports 1930 by executing code to output the second graph. In some cases, the processor 1910 can also call the input/output ports 1930 by executing code to obtain the at least two subgraphs of the first graph from other devices.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be implemented through some interfaces; the indirect coupling or communication connection between devices or units may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • the technical solutions of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A graph processing method, system, and apparatus, for use in the field of data visualization. In the method, at least two subgraphs of a first graph are first obtained (S102), each subgraph including a plurality of nodes in the first graph and edges between the nodes; respective identifiers of the at least two subgraphs are then calculated based on the nodes and edges included in each subgraph; subgraphs with the same identifier among the at least two subgraphs are merged to generate a second graph; and the second graph generated after merging is output (S106). The graph processing method improves the clarity and completeness of the data flow shown in a graph without affecting the structure of the graph or the computational logic it expresses.

Description

A graph processing method, system, and apparatus
This application claims priority to Chinese Patent Application No. 202010998184.0, entitled "Graph processing method, system, and apparatus", filed with the China National Intellectual Property Administration on September 21, 2020, which is incorporated herein by reference in its entirety.
Technical Field
The embodiments of this application relate to the field of data visualization, and in particular to a graph processing method, apparatus, and device.
Background
Deep learning technology is currently widely used for feature extraction, inference, and prediction on complex data, including graph, text, and speech data types. At the implementation level, users generally write code with a deep learning framework to perform data preprocessing, model construction, model training, model evaluation, and deployment. Further, referring to FIG. 1, a schematic diagram of an embodiment of a computational graph in an embodiment of this application, a deep learning framework may use a computational graph to instruct a machine to train and run inference on a model. However, the complexity of deep learning frameworks and of the models themselves raises the bar for users, making it particularly difficult to locate and debug problems quickly and conveniently.
At present, visualizing the computational graph generated by a deep learning framework helps users check whether the code they have written matches the intended model structure and locate problems that arise during model training.
However, a graph usually includes hierarchical aggregations and a huge number of nodes and edges, so the layout of nodes and edges in the graph is cluttered. This reduces the efficiency of graph processing and thus the clarity and completeness of the data flow that a terminal device can display.
Summary
The embodiments of this application provide a graph processing method, system, and apparatus for improving the clarity and completeness of the data flow shown in a graph without affecting the structure of the graph or the computational logic it expresses.
A first aspect of the embodiments of this application provides a graph processing method, including the following. The first graph generated by a complex deep learning model usually contains frequent subgraphs, that is, subgraph structures that appear repeatedly within the same graph structure. Displaying a first graph containing frequent subgraphs on a terminal device reduces the device's processing efficiency for the first graph and also makes it harder for users to quickly find key subregions. Therefore, the graph processing apparatus determines a node in the first graph whose out-degree is greater than a second threshold as a start node, determines the corresponding end node through the paths traversed by the data flow from the start node, and determines all subgraphs between the two nodes through the start node and the end node, where each subgraph includes edges and nodes, and an edge represents the data flow between different nodes. Here, the second threshold indicates the out-degree of a node's data flow: exceeding the second threshold means the node's out-degree exceeds a preset out-degree, so the node has many outgoing data flows and therefore many edges shown in the graph. The specific value of the second threshold may be, for example, 2, 5, or 8; it needs to be determined in advance according to the actual situation of the first graph and is not limited here. The graph processing apparatus then calculates respective identifiers of the at least two subgraphs based on the nodes and edges included in each subgraph, where an identifier indicates the features of a subgraph or is a hash value of the subgraph, which is not specifically limited here. Further, because subgraphs with the same identifier have similar features, the graph processing apparatus merges the subgraphs with the same identifier among the at least two subgraphs to generate a second graph, and outputs the second graph. For example, when the method of the embodiments of this application is applied to a terminal device, the terminal device may read the first graph saved by the deep learning framework from its memory, or receive the first graph sent by a server, and then directly display the second graph generated in the foregoing manner. When the method is applied to a server, the server may read the stored first graph from its memory, generate the second graph in the foregoing manner, and send the second graph to a terminal device, which then displays the received second graph.
In this implementation, subgraphs with corresponding identifiers are determined from the obtained first graph, and at least two subgraphs with the same identifier are merged to generate the second graph, thereby reducing the numbers of nodes and edges included in the second graph and improving the efficiency of graph processing. Because subgraphs with the same identifier are similar, merging does not affect the structure of the graph or the computational logic it expresses, and the clarity and completeness of the data flow in the graph can also be improved.
In an implementation of the embodiments of this application, the identifier is a hash value. This embodiment takes application to a computational graph of a deep learning framework as an example. Based on the characteristics of deep learning computational graphs, the hash value corresponding to a node can indicate the features of the node, and the hash value corresponding to an edge can indicate the features of the edge, so hash values can distinguish nodes and edges with different features. Therefore, for each of the at least two subgraphs, the hash value corresponding to the subgraph can be calculated based on the hash values corresponding to its nodes and the hash values corresponding to its edges, and the resulting subgraph hash values can likewise distinguish subgraphs with different features.
In this implementation, the subgraph hash value calculated from the hash values of the nodes and edges in each subgraph can accurately reflect the features of the subgraph, ensuring the accuracy of the subsequent merging of subgraphs with the same hash value, thereby further improving the accuracy of graph processing.
In an implementation of the embodiments of this application, the identifier is a hash value. In each subgraph, the hash value corresponding to a node is related to the attributes of the node; that is, it is calculated from the attributes of the node in the subgraph. Specifically, a node in a subgraph has multiple node attributes, including but not limited to variable type, parameter type, and the aggregation to which it belongs. Because node attributes reflect the features of a node, a hash value calculated from the node's attributes can indicate those features. Further, in each subgraph, the hash value corresponding to an edge is related to the connection relationship indicated by the edge; that is, it is calculated from the node-to-node connection relationship indicated by the edge, and this connection relationship is directed. Specifically, an edge represents the data flow from one node to another. For example, for an edge i whose data-output node is node A and whose data-input node is node B, node A and node B can be encoded as the string "[source type]->[target type]", where [source type] indicates the type of node A of the edge and [target type] indicates the type of node B of the edge. The string indicates that the data flow of edge i is from node A to node B, ensuring the ordering of the data, so computing a hash value of this string can indicate the features of the edge.
In this implementation, because node attributes reflect the features of a node, calculating the hash value of a node from its attributes improves the accuracy of the resulting node hash value. In addition, because the connection relationship indicated by an edge is directed, the calculated hash value is more accurate, which improves the accuracy of the subgraph hash value and thus further improves the accuracy of graph processing.
In an implementation of the embodiments of this application, the identifier is a hash value. This embodiment takes application to a computational graph of a deep learning framework as an example. Based on the characteristics of deep learning computational graphs, subgraphs with the same hash value among the at least two subgraphs can be merged to reduce the numbers of nodes and edges included in the computational graph. In addition, a node in the first graph whose out-degree or in-degree is greater than a first threshold is determined as a first node, and a first port and a second port are allocated to the first node, where the first port is the port of the first node through which the edges inputting data to the first node pass, and the second port is the port through which the edges outputting data from the first node pass. Based on all the edges included in the graph generated after merging, the multiple edges passing through the same first port are merged, and the multiple edges passing through the same second port are merged, to generate the second graph. Here, the first threshold may indicate the out-degree of a node's outgoing data flow or the in-degree of its incoming data flow: exceeding the first threshold means the node's out-degree exceeds a preset out-degree or its in-degree exceeds a preset in-degree, either of which leads to many data flows at the node and therefore many edges shown in the graph. The specific value of the first threshold may be, for example, 3, 4, or 5; it needs to be determined in advance according to the actual situation of the computational graph and is not limited here.
In this implementation, because a first node is a node whose out-degree or in-degree is greater than the first threshold, many data flows enter or leave the first node. Allocating a first port and a second port to the first node and merging the edges whose data input passes through the first port and the edges whose data output passes through the second port to generate the second graph further improves the clarity of the data flow displayed by the terminal device.
In an implementation of the embodiments of this application, the identifier is a hash value, and the second graph is laid out by orthogonal edge routing. Specifically, a port-constraint optimization layout algorithm is first applied, based on the first ports and second ports, with an overall orthogonal edge-routing layout as the goal, to optimize the positions and ordering of the first ports and second ports under constraints and calculate their position coordinates, thereby completing the orthogonal edge-routing layout and generating the second graph. In an orthogonal layout, the lines around a node meet at 90-degree angles, and edge routing refers to the arrangement and direction of the specific lines in the graph.
In this implementation, laying out the graph by orthogonal edge routing allows the positions and ordering of the first ports and second ports to be optimized under constraints, so that the overall data flow from left to right is complete and clear, further improving the clarity of the data flow displayed by the terminal device.
In an implementation of the embodiments of this application, the identifier is a hash value. This embodiment takes application to a computational graph of a deep learning framework as an example. Based on the characteristics of deep learning computational graphs, subgraphs with the same hash value among the at least two subgraphs can be merged to reduce the numbers of nodes and edges included in the computational graph. When the resulting graph includes aggregations, because an aggregation indicates a computing function through a combination of a set of nodes and edges, an edge indicating the input of an aggregation can be determined as a first edge and an edge indicating the output of an aggregation as a second edge. Ports are then added on the multiple first edges and multiple second edges. Among the multiple first edges, the ports on edges that correspond to the same aggregation and indicate inputs from the same node are determined as third ports, and the multiple third ports are merged. Likewise, among the multiple second edges, the ports on edges that correspond to the same aggregation and indicate outputs to the same node are determined as fourth ports, and the multiple fourth ports are merged. Based on the merged ports, the second graph is generated with an orthogonal edge-routing layout. The specific orthogonal edge-routing manner is similar to that described in the foregoing embodiments and is not repeated here.
In this implementation, at least two subgraphs with the same hash value are first merged, reducing the numbers of nodes and edges included in the second graph and improving graph processing efficiency. Because subgraphs with the same hash value are similar, merging does not affect the data flow in the graph, so the clarity and completeness of the data flow displayed by the terminal device can also be improved. Furthermore, by merging the ports on edges of the same aggregation that indicate inputs from the same node, and the ports on edges of the same aggregation that indicate outputs to the same node, the merged ports can separate the outside of an aggregation from its inside. Because there are complex interleaved edges between aggregations, laying out the outside and the inside of each aggregation separately through the merged ports can further improve the clarity of the displayed data flow.
In an implementation of the embodiments of this application, the identifier is a hash value. This embodiment takes application to a computational graph of a deep learning framework as an example. Based on the characteristics of deep learning computational graphs, subgraphs with the same hash value among the at least two subgraphs can be merged to reduce the numbers of nodes and edges included in the computational graph. A node in the first graph whose out-degree or in-degree is greater than a first threshold is determined as a first node, and a first port and a second port are allocated to the first node, where the first port is the port of the first node through which the edges inputting data to the first node pass, and the second port is the port through which the edges outputting data from the first node pass. Based on all the edges included in the graph generated after merging, the multiple edges passing through the same first port are merged, and the multiple edges passing through the same second port are merged, to generate the second graph. The first threshold may indicate the out-degree of a node's outgoing data flow or the in-degree of its incoming data flow; its specific value may be, for example, 3, 4, or 5, determined in advance according to the actual situation of the computational graph, and is not limited here. Further, when the resulting graph includes aggregations, because an aggregation indicates a computing function through a combination of a set of nodes and edges, an edge indicating the input of an aggregation can be determined as a first edge and an edge indicating the output of an aggregation as a second edge; ports are then added on the multiple first edges and second edges. Among the multiple first edges, the ports on edges that correspond to the same aggregation and indicate inputs from the same node are determined as third ports and merged; among the multiple second edges, the ports on edges that correspond to the same aggregation and indicate outputs to the same node are determined as fourth ports and merged. Based on the merged ports, the second graph is generated with an orthogonal edge-routing layout, in a manner similar to that described in the foregoing embodiments, which is not repeated here.
In this implementation, at least two subgraphs with the same hash value are first merged, reducing the numbers of nodes and edges included in the second graph and improving graph processing efficiency; because subgraphs with the same hash value are similar, merging affects neither the structure of the graph and the computational logic it expresses, nor the clarity and completeness of the data flow in the graph. Second, because a first node is a node whose out-degree or in-degree is greater than the first threshold, many data flows enter or leave it; allocating a first port and a second port to the first node and merging the edges whose data input passes through the first port and the edges whose data output passes through the second port further improves the clarity of the data flow displayed by the terminal device. Third, by merging the ports on edges of the same aggregation that indicate inputs from the same node, and the ports on edges of the same aggregation that indicate outputs to the same node, the merged ports can separate the outside of an aggregation from its inside; because there are complex interleaved edges between aggregations, laying out the outside and the inside of each aggregation separately through the merged ports can further improve the clarity of the displayed data flow.
A second aspect of the embodiments of this application provides a graph processing apparatus, including: an obtaining module, configured to obtain at least two subgraphs of a first graph, where each subgraph includes a plurality of nodes in the first graph and edges between the nodes; a calculation module, configured to calculate respective identifiers of the at least two subgraphs based on the nodes and edges included in each subgraph; a merging module, configured to merge subgraphs with the same identifier among the at least two subgraphs; and an output module, configured to output a second graph generated after merging.
In an implementation of the embodiments of this application, the identifier is a hash value, and the data of each subgraph indicates the nodes and edges in the subgraph. The calculation module is specifically configured to, for each of the at least two subgraphs, calculate the hash value corresponding to the subgraph based on the hash values corresponding to the multiple nodes in the subgraph and the hash values corresponding to the multiple edges in the subgraph.
In an implementation of the embodiments of this application, in each subgraph, the hash value corresponding to a node is related to the attributes of the node, and the hash value corresponding to an edge is related to the connection relationship indicated by the edge in the subgraph.
In an implementation of the embodiments of this application, the identifier is a hash value. The merging module is specifically configured to merge subgraphs with the same hash value among the at least two subgraphs; add a first port and a second port for each of a plurality of first nodes, where each first node is a node in the first graph whose out-degree or in-degree is greater than a first threshold, the edges inputting data to each first node pass through the first port of that node, and the edges outputting data from each first node pass through the second port of that node; and perform the following operations on the plurality of first nodes to generate the second graph: merging the multiple edges passing through the same first port, and merging the multiple edges passing through the same second port.
In an implementation of the embodiments of this application, the second graph is laid out by orthogonal edge routing.
In an implementation of the embodiments of this application, the identifier is a hash value. The merging module is specifically configured to merge subgraphs with the same hash value among the at least two subgraphs; add ports on multiple first edges and multiple second edges, where each first edge indicates the input of an aggregation, each second edge indicates the output of an aggregation, and an aggregation indicates a computing function through a combination of a set of nodes and edges; merge the multiple third ports among the added ports, where a third port is a port on an edge among the multiple first edges that corresponds to the same aggregation and indicates an input from the same node, and merge the multiple fourth ports among the added ports, where a fourth port is a port on an edge among the multiple second edges that corresponds to the same aggregation and indicates an output to the same node; and lay out the second graph by orthogonal edge routing based on the merged ports.
In an implementation of the embodiments of this application, the identifier is a hash value. The merging module is specifically configured to merge subgraphs with the same hash value among the at least two subgraphs; add a first port and a second port for each of a plurality of first nodes, where each first node is a node in the first graph whose out-degree or in-degree is greater than a first threshold, the edges inputting data to each first node pass through the first port of that node, and the edges outputting data from each first node pass through the second port of that node; merge the multiple edges passing through the same first port and the multiple edges passing through the same second port; add ports on multiple first edges and multiple second edges, where each first edge indicates the input of an aggregation, each second edge indicates the output of an aggregation, and an aggregation indicates a computing function through a combination of a set of nodes and edges; merge the multiple third ports among the added ports, where a third port is a port on an edge among the multiple first edges that corresponds to the same aggregation and indicates an input from the same node, and merge the multiple fourth ports among the added ports, where a fourth port is a port on an edge among the multiple second edges that corresponds to the same aggregation and indicates an output to the same node; and lay out the second graph by orthogonal edge routing based on the results of one or more of the foregoing merging steps.
A third aspect of the embodiments of this application provides a terminal device, which may be the graph processing apparatus in the foregoing method design, or a chip disposed in the graph processing apparatus. The terminal device includes a processor coupled to a memory, and the processor may execute instructions in the memory to implement the method performed by the graph processing apparatus in the first aspect and any of its possible implementations. Optionally, the terminal device further includes the memory. Optionally, the terminal device further includes a communication interface, and the processor is coupled to the communication interface.
When the terminal device is the graph processing apparatus, the communication interface may be a transceiver or an input/output interface.
When the terminal device is a chip disposed in the graph processing apparatus, the communication interface may be an input/output interface.
Optionally, the transceiver may be a transceiver circuit. Optionally, the input/output interface may be an input/output circuit.
A fourth aspect of the embodiments of this application provides a server, which may be the graph processing apparatus in the foregoing method design, or a chip disposed in the graph processing apparatus. The server includes a processor coupled to a memory, and the processor may execute instructions in the memory to implement the method performed by the graph processing apparatus in the first aspect and any of its possible implementations. Optionally, the server further includes the memory. Optionally, the server further includes a communication interface, and the processor is coupled to the communication interface.
When the server is the graph processing apparatus, the communication interface may be a transceiver or an input/output interface.
When the server is a chip disposed in the graph processing apparatus, the communication interface may be an input/output interface.
Optionally, the transceiver may be a transceiver circuit. Optionally, the input/output interface may be an input/output circuit.
A fifth aspect of the embodiments of this application provides a program that, when executed by a processor, performs any method in the first aspect and its possible implementations.
A sixth aspect of the embodiments of this application provides a computer program product (or computer program) storing one or more computer instructions; when the computer program product is executed by the processor, the processor performs the method in the first aspect or any of its possible implementations.
A seventh aspect of the embodiments of this application provides a chip including at least one processor, configured to support a terminal device in implementing the functions involved in the first aspect or any of its possible implementations. In a possible design, the chip system may further include a memory; the at least one processor is communicatively connected to the at least one memory, and the at least one memory stores instructions for saving the program instructions and data necessary for the terminal device and the server. Optionally, the chip system further includes an interface circuit that provides program instructions and/or data to the at least one processor.
An eighth aspect of the embodiments of this application provides a computer-readable storage medium storing a program that causes a terminal device to perform any method in the first aspect and its possible implementations.
It should be noted that the beneficial effects of the implementations of the second to eighth aspects of this application, and the descriptions of the implementations of these aspects, can be understood with reference to the implementations of the first aspect, and are therefore not repeated.
Through the technical solutions provided in this application, at least two subgraphs of a first graph can be obtained; respective hash values of the at least two subgraphs are calculated based on the nodes and edges included in each subgraph; subgraphs with the same hash value among the at least two subgraphs are merged to generate a second graph, and the second graph is output. This reduces the numbers of nodes and edges included in the second graph, improving the efficiency of graph processing. In addition, because merging does not affect the structure of the graph or the computational logic it expresses, the clarity and completeness of the data flow in the graph can also be improved.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an embodiment of a computational graph in an embodiment of this application;
FIG. 2 is a schematic diagram of the system architecture of an embodiment of this application;
FIG. 3 is a schematic diagram of the architecture of a product implementation in an embodiment of this application;
FIG. 4 is a schematic diagram of an embodiment of nodes in an embodiment of this application;
FIG. 5 is a schematic diagram of an embodiment of edges in an embodiment of this application;
FIG. 6 is a schematic diagram of an embodiment of a frequent subgraph structure in an embodiment of this application;
FIG. 7 is a schematic diagram of an embodiment of a simple graph and a compound graph in an embodiment of this application;
FIG. 8 is a schematic diagram of an embodiment of cross-aggregation edges in an embodiment of this application;
FIG. 9 is a schematic diagram of an embodiment of the graph processing method in an embodiment of this application;
FIG. 10 is a schematic diagram of an embodiment of a computational graph in an embodiment of this application;
FIG. 11 is a schematic diagram of another embodiment of the graph processing method in an embodiment of this application;
FIG. 12 is a schematic diagram of another embodiment of a computational graph in an embodiment of this application;
FIG. 13 is a schematic diagram of an embodiment of a BERT network computational graph in an embodiment of this application;
FIG. 14 is a schematic diagram of another embodiment of the graph processing method in an embodiment of this application;
FIG. 15 is a schematic diagram of an embodiment of ports in an embodiment of this application;
FIG. 16 is a schematic diagram of an embodiment of merged ports in an embodiment of this application;
FIG. 17 is a schematic diagram of another embodiment of the graph processing method in an embodiment of this application;
FIG. 18 is a schematic diagram of an embodiment of the graph processing apparatus in an embodiment of this application;
FIG. 19 is a schematic structural diagram of an embodiment of the graph processing apparatus in an embodiment of this application.
Detailed Description
The embodiments of this application provide a graph processing method, system, and apparatus. At least two subgraphs of a first graph can be obtained, each subgraph including a plurality of nodes in the first graph and edges between the nodes; respective identifiers of the at least two subgraphs are then calculated based on the nodes and edges included in each subgraph; subgraphs with the same identifier among the at least two subgraphs are merged to generate a second graph; and the second graph generated after merging is output. This reduces the numbers of nodes and edges included in the second graph, improving graph processing efficiency. In addition, because merging does not affect the structure of the graph or the computational logic it expresses, the clarity and completeness of the data flow in the graph can also be improved.
The terms "first", "second", "third", "fourth", and so on (if any) in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device including a series of steps or units is not necessarily limited to those steps or units expressly listed, but may include other steps or units not expressly listed or inherent to the process, method, product, or device.
It should be understood that the names of all nodes, graphs, and edges in this application are merely names set for convenience of description; the names in actual applications may differ. It should not be understood that this application limits the names of the various nodes, graphs, and edges. On the contrary, any name having the same or similar function as the nodes, graphs, and edges used in this application is regarded as the method of this application or an equivalent replacement, and falls within the protection scope of this application; this is not repeated below.
To better understand the graph processing method and related apparatus disclosed in the embodiments of this application, the system architecture used by the embodiments of the present invention is first described below. The embodiments of this application may be applied to a module corresponding to model structure visualization. Referring to FIG. 2, a schematic diagram of the system architecture of an embodiment of this application, during deep learning practice, for example while writing code to build a model and while training a model, users can discover existing problems through visualization of the model's computational graph. The graph processing method provided by the embodiments of this application can therefore display the model structure accurately and clearly, making it more convenient for users to debug and tune training.
Further, the embodiments of this application may also serve as the implementation code of a graph processing module in a web service in open-source software. Referring to FIG. 3, a schematic diagram of the architecture of a product implementation in an embodiment of this application, A1 indicates the graph processing module. The graph processing module first issues a request, reads initial computational-graph data in a specific format from a server or host directory based on the request, and, in the browser's web service, uses the graph processing method provided by the embodiments of this application to compute, render, and display the initial computational-graph data. Based on the displayed graph data, users can continuously interact and adjust the display form. It should be understood that in actual applications, a unified storage and parsing format for computational-graph data needs to be configured in advance, after which the method provided by the embodiments of this application is used to process the computational graph.
Visualizing the computational graph generated by a deep learning framework helps users check whether the code they have written matches the intended model structure and locate problems that arise during model training. However, a computational graph usually includes hierarchical aggregations and a huge number of nodes and edges, so the layout of the nodes and edges in the computational graph is cluttered. This reduces the efficiency of computational-graph processing and thus the clarity and completeness of the data flow that a terminal device can display.
Some terms and concepts involved in the embodiments of this application are explained below to facilitate understanding by those skilled in the art.
1. Computational graph.
A computational graph is a directed graph used to represent data flow and computing operations; it includes nodes and edges.
2. Nodes.
Each node in a computational graph corresponds to an operation or a variable. A variable can feed its value to an operation, and an operation usually indicates a kind of computational logic, such as assignment, addition, rounding, AND, or OR. Therefore, some nodes in a computational graph define a function of the graph's variables. Values input to and output from a node take many data forms, for example tensors; a tensor indicates a multidimensional array, so tensors include but are not limited to scalars, vectors, matrices, and higher-order tensors. For ease of understanding, refer to FIG. 4, a schematic diagram of an embodiment of nodes in an embodiment of this application; B1 to B11 all indicate nodes.
3. Edges.
An edge indicates the data flow between nodes in a computational graph. The two ends of an edge each connect to a node, and data flows from the node at one end to the node at the other end. In a directed graph, edges therefore have direction. For example, if the two ends of an edge connect node A and node B, and the direction of the edge points from node A to node B, then data flows from node A to node B. For a node, an edge whose direction points to the node indicates that data is input to (or "flows into") the node, while an edge whose direction points from the node to other nodes indicates that data is output from (or "flows out of") the node.
In the solution of this application, the process of generating the second graph involves adding ports on edges. A port can be understood as a concrete representation of a data input or output, so a port may be marked at any position on an edge, for example at one end of the edge (that is, the intersection of the edge and a node), or at a position on the edge close to one of its ends.
For ease of understanding, refer to FIG. 5, a schematic diagram of an embodiment of edges in an embodiment of this application. C1 to C5 indicate nodes, and C6 to C9 indicate edges, where edge C6 is the data flow from node C1 to node C2, edge C7 is the data flow from node C2 to node C3, edge C8 is the data flow from node C2 to node C4, and edge C9 is the data flow from node C2 to node C5. It can be seen that edges have direction.
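The node-and-edge structure above can be captured in a small data model. The following is a minimal Python sketch, not code from the patent; the class and method names are illustrative assumptions. It mirrors FIG. 5, where node C2 has one incoming edge and three outgoing edges.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    node_type: str  # e.g. "add", "mul", "variable"

@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # name -> Node
    edges: list = field(default_factory=list)  # (source name, target name)

    def add_node(self, name, node_type):
        self.nodes[name] = Node(name, node_type)

    def add_edge(self, source, target):
        # Edges are directed: data flows from `source` to `target`.
        self.edges.append((source, target))

    def out_degree(self, name):
        return sum(1 for s, _ in self.edges if s == name)

    def in_degree(self, name):
        return sum(1 for _, t in self.edges if t == name)

g = Graph()
for n, t in [("C1", "variable"), ("C2", "add"), ("C3", "mul"),
             ("C4", "mul"), ("C5", "mul")]:
    g.add_node(n, t)
for s, t in [("C1", "C2"), ("C2", "C3"), ("C2", "C4"), ("C2", "C5")]:
    g.add_edge(s, t)

print(g.out_degree("C2"))  # 3: data flows from C2 to C3, C4, and C5
print(g.in_degree("C2"))   # 1: data flows into C2 from C1 only
```

The node types here are invented for illustration; the degree helpers are what the thresholds in the later steps would be checked against.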
4. Frequent subgraphs.
A frequent subgraph is a subgraph structure that appears repeatedly in a computational graph, namely the multiple parallel data-flow paths between a start node (startHubNode) and an end node (endHubNode). For ease of understanding, refer to FIG. 6, a schematic diagram of an embodiment of a frequent subgraph structure in an embodiment of this application. D1 indicates the start node and D2 indicates the end node; between the start node and the end node there are multiple parallel data-flow paths, and the nodes and edges along these paths form the frequent subgraphs. The subgraphs described in this embodiment are the frequent subgraphs introduced here.
5. Simple graph and compound graph.
In the visualization interface of a computational graph, if connected nodes belong to aggregations (scopes) at different levels, the computational graph is called a compound graph; an aggregation is a set containing child nodes. If there is no such hierarchical aggregation, the computational graph is called a simple graph. For ease of understanding, refer to FIG. 7, a schematic diagram of an embodiment of a simple graph and a compound graph in an embodiment of this application. FIG. 7(A) shows a simple graph and FIG. 7(B) shows a compound graph; E1 to E4 indicate aggregations, and E5 and E6 indicate sub-aggregations. Specifically, the nodes in FIG. 7(A) form no aggregations, that is, no aggregation relationship of hierarchical nodes is formed, whereas in FIG. 7(B) an aggregation may contain multiple sub-aggregations or child nodes. For example, aggregation E1 includes 2 child nodes, aggregation E2 includes 3 child nodes, aggregation E3 includes 2 child nodes, and aggregation E4 includes one child node and the sub-aggregations E5 and E6, where sub-aggregation E5 includes 3 child nodes and sub-aggregation E6 includes 4 child nodes. The data flow between the multiple aggregations can form an aggregation relationship of hierarchical nodes.
6. Cross-aggregation edges.
A cross-aggregation edge is, in a compound graph, the data flow between a node inside an aggregation and a node outside the aggregation. For ease of understanding, refer to FIG. 8, a schematic diagram of an embodiment of cross-aggregation edges in an embodiment of this application. F1 to F3 indicate nodes, and F4 and F5 indicate edges. The figure includes aggregation 1 and aggregation 2; aggregation 1 includes node F1 and node F2, and aggregation 2 includes node F3. Because aggregation 1 and aggregation 2 are different sets, edge F4, which corresponds to the data flow between node F1 and node F3, is a cross-aggregation edge, whereas node F1 and node F2 belong to the same aggregation, so edge F5, which corresponds to the data flow between node F1 and node F2, is not a cross-aggregation edge.
Based on this, to solve the foregoing problems, an embodiment of this application provides a graph processing method for improving the clarity and completeness of the data flow displayed by a terminal device. For ease of understanding, the embodiments of this application are described taking application to a computational graph of a deep learning framework as an example. It should be understood that in actual applications, the graph processing method provided by the embodiments of this application can be applied to various graphs that include nodes and edges, which is not specifically limited here. The graph processing method used in the embodiments of this application is described in detail below. Refer to FIG. 9, a schematic diagram of an embodiment of the graph processing method in an embodiment of this application; the method includes the following steps.
S101. Obtain a first graph.
In this embodiment, the graph processing apparatus obtains the first graph. For example, if applied to a server, that is, if the graph processing apparatus is a server, the server may read the first graph (computational-graph data) saved by the deep learning framework from the server's memory. If applied to a terminal device, that is, if the graph processing apparatus is a terminal device, the terminal device may read the first graph saved by the deep learning framework from the terminal device's memory, or receive the first graph sent by a server. It should be understood that in actual applications the graph processing apparatus may be either a server or a terminal device; this embodiment does not limit the specific graph processing apparatus or the specific manner of obtaining the first graph. Specifically, the first graph includes edges and nodes, and in this embodiment an edge represents the data flow between different nodes.
S102. Obtain at least two subgraphs of the first graph.
In this embodiment, because the first graph generated by a complex deep learning model usually contains frequent subgraphs, displaying a first graph that includes frequent subgraphs on a terminal device reduces the device's processing efficiency for the first graph and also makes it harder for users to quickly find key subregions.
Therefore, based on the characteristics of deep learning computational graphs, the graph processing apparatus can determine from the first graph a node whose out-degree is greater than a second threshold; this node is defined as the start node in this embodiment. Specifically, the second threshold indicates the out-degree of a node's data flow: exceeding the second threshold means the node's out-degree exceeds a preset out-degree, so the node has many outgoing data flows and therefore many edges shown in the graph. The specific value of the second threshold may be, for example, 2, 5, or 8; it needs to be determined in advance according to the actual situation of the first graph and is not limited here. Further, the corresponding end node is determined through the paths traversed by the data flow from the start node, and all subgraphs between the two nodes are determined through the start node and the end node. Because the out-degree of the start node is greater than the second threshold, there are at least two subgraphs between the start node and the end node, and each subgraph includes edges and nodes.
S103. For each of the at least two subgraphs, calculate the identifiers corresponding to the nodes in the subgraph and the identifiers corresponding to the edges in the subgraph.
In this embodiment, an identifier indicates the features of a subgraph, or the identifier is a hash value of the subgraph, which is not specifically limited here. Because this embodiment takes application to a computational graph of a deep learning framework as an example, the identifier being the subgraph's hash value is used as an example for the introduction. Based on the characteristics of deep learning computational graphs, the graph processing apparatus can calculate hash values corresponding to the nodes and edges in a subgraph of the computational graph. Because the hash value corresponding to a node can indicate the node's features and the hash value corresponding to an edge can indicate the edge's features, hash values can distinguish nodes and edges with different features.
By way of example, calculating the node hash values and the edge hash values for one subgraph is used for illustration; the two calculations are introduced separately below.
1. Node hash values.
First, hash values are calculated for all nodes in the subgraph. Specifically, take a node n in the subgraph as an example. Based on the characteristics of deep learning computational graphs, a node has multiple node attributes, including but not limited to variable type, parameter type, and the aggregation to which it belongs. Therefore, the node attributes of node n must first be obtained. In this embodiment, the node attributes include the node type of node n, the number of hidden input nodes of node n, the types of the hidden input nodes of node n, the number of hidden output nodes of node n, the types of the hidden output nodes of node n, and the number and types of the attached nodes of node n, where attached nodes include constant (const) nodes and variable (parameter) nodes.
Then, the Time33 hash algorithm is applied to each node attribute of node n to obtain a hash value for each attribute, and the per-attribute hash values are summed. Because a node has multiple attributes, the hash value may overflow; to prevent overflow, the sum is reduced modulo a large prime (BIG_PRIMITIVE), yielding the hash value node_hash[n] of node n. Specifically, the large prime in this embodiment is 10000019. It should be understood that in actual applications the specific value of the large prime should be flexibly determined according to the actual situation, and is not limited here.
Further, because a subgraph includes at least two nodes, the hash value of each node is determined in a manner similar to the foregoing, and the per-node hash values are summed. If the subgraph contains many nodes, the hash value may overflow; to prevent overflow, the summed node hash value is again reduced modulo the large prime, yielding the hash value corresponding to the nodes in the subgraph.
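The node-hash computation just described can be sketched in a few lines. This is a hedged illustration, not the patent's code: the function names, the string encoding of attributes, and the Time33 seed value 5381 (a common choice for Time33-style hashes) are assumptions; only the sum-then-modulo structure and the prime 10000019 come from the text above.

```python
BIG_PRIMITIVE = 10000019  # the large prime used for modular reduction above

def time33(s: str) -> int:
    # Time33 hash: multiply the running value by 33 and add each character code.
    h = 5381  # assumed seed; the text does not specify an initial value
    for ch in s:
        h = h * 33 + ord(ch)
    return h

def node_hash(attributes) -> int:
    # Sum the Time33 hash of every node attribute (node type, hidden input/output
    # counts and types, attached nodes, ...), then reduce modulo the large prime
    # to prevent overflow.
    return sum(time33(str(a)) for a in attributes) % BIG_PRIMITIVE

def nodes_hash(per_node_attributes) -> int:
    # Combine the per-node hashes of a subgraph the same way.
    return sum(node_hash(attrs) for attrs in per_node_attributes) % BIG_PRIMITIVE
```

Because the per-attribute hashes are summed, the result does not depend on attribute order, which keeps structurally equivalent nodes comparable.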
2. Edge hash values.
First, hash values are calculated for all edges in the subgraph. The hash value of an edge is related to the connection relationship indicated by the edge in the subgraph; that is, the edge hash value is calculated from the node-to-node connection relationship indicated by the edge, and this connection relationship is directed. Specifically, take an edge i in the subgraph as an example, where edge i represents the data flow from node A to node B; that is, for edge i, the data-output node is node A and the data-input node is node B. Node A and node B can then be encoded as the string "[source type]->[target type]", where [source type] indicates the type of node A of edge i and [target type] indicates the type of node B of edge i, and "[source type]->[target type]" indicates that the data flow of edge i is from node A to node B, ensuring the ordering of the data. Then, the Time33 hash algorithm, which maps a string to a number, is applied to the string "[source type]->[target type]" to obtain the hash value edge_hash[i] of edge i.
Further, because a subgraph includes at least two edges, the hash value of each edge is determined in a manner similar to the foregoing, and the per-edge hash values are summed. If the subgraph contains many edges, the hash value may overflow; to prevent overflow, the summed edge hash value is reduced modulo the large prime, yielding the hash value corresponding to the edges in the subgraph.
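The edge-hash step can be sketched the same way; again this is an assumption-laden illustration rather than the patent's code (the Time33 seed and function names are assumptions). The directed connection is encoded as "[source type]->[target type]" before hashing, so reversing an edge changes its hash.

```python
BIG_PRIMITIVE = 10000019

def time33(s: str) -> int:
    # Time33 maps a string to a number by repeated multiply-by-33-and-add.
    h = 5381  # assumed seed value
    for ch in s:
        h = h * 33 + ord(ch)
    return h

def edge_hash(source_type: str, target_type: str) -> int:
    # Encode the directed connection so that direction is part of the hash.
    return time33(f"[{source_type}]->[{target_type}]")

def edges_hash(typed_edges) -> int:
    # Sum the per-edge hashes and reduce modulo the large prime to avoid overflow.
    return sum(edge_hash(s, t) for s, t in typed_edges) % BIG_PRIMITIVE

print(edge_hash("add", "mul") != edge_hash("mul", "add"))  # True: direction matters
```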
It can be understood that this embodiment uses a hash value as an example of the identifier, but in actual applications the identifier may also be obtained by comparing the nodes and edges of a subgraph against a preset subgraph library to match the subgraph to an identifier; similar subgraph structures correspond to one identifier, so the identifier can indicate the features of a subgraph and can likewise distinguish subgraphs with different features. Alternatively, based on statistics of node attributes and the number and connection relationships of edges, a numeric value may be obtained through rounding, integer division, or other computational logic and used as the identifier; such an identifier can also indicate a subgraph's features and distinguish subgraphs with different features. The identifier is not limited here. In addition, when the identifier is a hash value, this application places no particular restriction on the hash algorithm used, as long as it can distinguish subgraphs of different structures and similar (including identical) subgraphs have the same hash value.
S104. Calculate the identifier corresponding to each subgraph based on the identifiers corresponding to the nodes and edges in the subgraph.
In this embodiment, a hash value is used as an example of the identifier. Through step S103 the graph processing apparatus obtains the hash value corresponding to the nodes and the hash value corresponding to the edges in each subgraph; the obtained node hash value and edge hash value are then summed and reduced modulo the large prime, yielding the hash value corresponding to each subgraph.
By way of example, suppose the first graph includes subgraph A, subgraph B, and subgraph C. Through step S103, the node hash value and edge hash value of each subgraph are obtained; they are then summed and reduced modulo the large prime, yielding hash value H(A) for subgraph A, hash value H(B) for subgraph B, and hash value H(C) for subgraph C.
S105. Merge subgraphs with the same identifier among the at least two subgraphs to generate a second graph.
In this embodiment, a hash value is used as an example of the identifier. Because subgraphs with the same hash value are similar, the graph processing apparatus can merge at least two subgraphs with the same hash value to generate the second graph, which is displayed on the terminal device, thereby reducing the numbers of nodes and edges displayed in the second graph. The merging described in this embodiment does not fully merge the subgraphs but stacks them, so the number of subgraphs in the generated second graph decreases while the data included in the subgraphs is not reduced by the merging. By way of example, suppose the first graph includes subgraph A, subgraph B, and subgraph C, and step S104 yields hash value H(A) for subgraph A, hash value H(B) for subgraph B, and hash value H(C) for subgraph C. If H(A) and H(B) are equal, subgraph A and subgraph B can be merged to generate the second graph; the displayed second graph then shows only the structures corresponding to subgraph A and subgraph C, reducing the numbers of nodes and edges displayed in the computational graph.
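Steps S104 and S105 can be sketched end to end: combine the node and edge hashes of each subgraph into a subgraph hash, then stack subgraphs whose hashes match. The following hedged Python sketch uses hypothetical hash inputs (the numeric values and function names are assumptions) and mirrors the H(A) == H(B) example above.

```python
from collections import defaultdict

BIG_PRIMITIVE = 10000019

def subgraph_hash(node_part: int, edge_part: int) -> int:
    # S104: add the node hash and edge hash, then reduce modulo the large prime.
    return (node_part + edge_part) % BIG_PRIMITIVE

def stack_by_hash(hashes: dict) -> list:
    # S105: group subgraph names by hash value; each group is stacked into one
    # representative, so only one structure per group is displayed.
    groups = defaultdict(list)
    for name, h in hashes.items():
        groups[h].append(name)
    return [sorted(members) for members in groups.values()]

# Hypothetical values: subgraphs A and B hash equally, C differs.
h_a = subgraph_hash(1111, 2222)
h_b = subgraph_hash(2222, 1111)  # same sum, so the same hash as A
h_c = subgraph_hash(3333, 4444)
merged = stack_by_hash({"A": h_a, "B": h_b, "C": h_c})
print(merged)  # [['A', 'B'], ['C']]
```

Stacking groups rather than deleting members matches the description above: the subgraph count shown decreases, but no subgraph's data is discarded.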
S106. Output the second graph generated after merging.
In this embodiment, if applied to a server, the server can generate the second graph in the manner of the foregoing embodiments and send the generated second graph to a terminal device so that the terminal device displays it, or the server can display the second graph directly. If applied to a terminal device, the terminal device can directly generate the second graph in the manner of the foregoing embodiments and display it, or receive the second graph sent by a server and display the received second graph. The specific display manner of the second graph is likewise not limited.
The solution provided by the embodiments of this application can search for all frequent subgraphs at the same level within the same aggregation level for stacking, and can stack recursively level by level, identifying frequent subgraph structures accurately and completely, and reducing the numbers of nodes and edges displayed in the computational graph while ensuring the accuracy of the connection relationships. By way of example, a BERT pretrain network computational graph and a mobilenetV2 network computational graph generated with the open-source computing framework MindSpore are used for illustration. Refer to Table 1, which shows the detailed information of the BERT_pretrain network computational graph and the mobilenetV2 network computational graph.
Table 1

| Network | Node | Original node count | Node count after stacking | Effect |
| --- | --- | --- | --- | --- |
| mobilenetV2 | Optimizer_Momentum | 650 | 33 | Graph displayed within 2 seconds |
| BERT_pretrain | Optimizer_Lamb | 16723 (crash) | 99 | Graph displayed within 5 seconds |
As Table 1 shows, before graph processing the original computational graphs had large node counts. For the Optimizer_Lamb node of the BERT_pretrain network computational graph, the original node count of 16723 was so large that the terminal device crashed when displaying the node. After frequent-subgraph stacking according to the embodiments of this application, the node count for the Optimizer_Lamb node of the BERT_pretrain network computational graph dropped from the crash-inducing 16723 to 99, and the terminal device could display the graph within 5 seconds. For the Optimizer_Momentum node of the mobilenetV2 network computational graph, the node count dropped from 650 to 33, and the terminal device could display the graph within 2 seconds. The graph processing method provided by the embodiments of this application therefore reduces the node count of a computational graph and also improves the efficiency with which a terminal device displays it. It should be understood that the example in Table 1 is only for understanding this solution; specifics need to be flexibly determined according to the actual situation.
Further, the second graph obtained by reducing the displayed numbers of nodes and edges through the foregoing embodiments adopts an orthogonal layout as the basic layout style, in which edges do not cross each other, all nodes of the same depth lie on the same horizontal line, and nodes at the same level have some spacing. However, a large-scale computational graph usually includes a large number of nodes and edges; even after the displayed numbers of nodes and edges are reduced through the foregoing embodiments, there are still many edges and nodes, so with the orthogonal layout as the basic style the displayed computational graph is still not clear enough, preventing users from performing training analysis and debugging based on the graph. For ease of understanding, refer to FIG. 10, a schematic diagram of an embodiment of a computational graph in an embodiment of this application. G1 indicates a multiplication (mul) operator, and 10 edges connect to the mul operator G1. If all 10 edges are shown in the computational graph, because too many edges connect to the mul operator G1, the graph cannot clearly show the relationships of the edges connected to G1 or the data flow of G1, which hinders analysis of the computational graph.
Therefore, to solve the problem presented by FIG. 10, an embodiment of this application provides another graph processing method; in this embodiment a hash value is used as an example of the identifier. Refer to FIG. 11, a schematic diagram of another embodiment of the graph processing method in an embodiment of this application; the method includes the following steps.
S201. Obtain a first graph.
In this embodiment, the manner in which the graph processing apparatus obtains the first graph is similar to step S101 and is not repeated here.
S202. Obtain at least two subgraphs of the first graph.
In this embodiment, the manner in which the graph processing apparatus obtains the at least two subgraphs of the first graph is similar to step S102 and is not repeated here.
S203. For each of the at least two subgraphs, calculate the hash values corresponding to the nodes and the hash values corresponding to the edges in the subgraph.
In this embodiment, the manner in which the graph processing apparatus calculates, for each of the at least two subgraphs, the hash values corresponding to the nodes and edges in the subgraph is similar to step S103 and is not repeated here.
S204. Calculate the hash value corresponding to each subgraph based on the hash values corresponding to the nodes and edges in the subgraph.
In this embodiment, the manner in which the graph processing apparatus calculates the hash value corresponding to each subgraph based on the hash values corresponding to the nodes and edges in the subgraph is similar to step S104 and is not repeated here.
S205. Merge subgraphs with the same hash value among the at least two subgraphs.
In this embodiment, the manner in which the graph processing apparatus merges subgraphs with the same hash value among the at least two subgraphs is similar to that introduced in step S105 and is not repeated here.
S206. Add a first port and a second port for each of a plurality of first nodes, where each first node is a node in the first graph whose out-degree or in-degree is greater than a first threshold, the edges inputting data to each first node pass through the first port of that first node, and the edges outputting data from each first node pass through the second port of that first node.
In this embodiment, after merging the subgraphs with the same hash value among the at least two subgraphs, the graph processing apparatus can traverse all nodes in the graph generated after merging and determine the nodes whose out-degree or in-degree is greater than the first threshold; such a node is determined as a first node, and a first port and a second port are then allocated to it, where the first port is the port of the first node through which the edges inputting data to the first node pass, and the second port is the port through which the edges outputting data from the first node pass. Specifically, the first threshold may indicate the out-degree of a node's outgoing data flow or the in-degree of its incoming data flow: exceeding the first threshold means the node's out-degree exceeds a preset out-degree or its in-degree exceeds a preset in-degree, either of which leads to many data flows at the node and therefore many edges shown in the graph. The specific value of the first threshold may be, for example, 3, 4, or 5; it needs to be determined in advance according to the actual situation of the computational graph and is not limited here.
S207. Merge the multiple edges passing through the same first port, and merge the multiple edges passing through the same second port, to generate a second graph.
In this embodiment, based on all the edges included in the graph generated after merging, the graph processing apparatus merges the multiple edges passing through the same first port and the multiple edges passing through the same second port, to generate the second graph.
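A minimal sketch of steps S206 and S207 for one first node follows. It is an illustration under assumptions, not the patent's implementation: the port naming and the threshold handling are hypothetical. When a node's in-degree or out-degree exceeds the first threshold, all incoming edges are routed through one first port and all outgoing edges through one second port, and each bundle collapses into a single displayed edge.

```python
def bundle_edges(edges, node, threshold=3):
    # edges: list of (source, target) pairs. If `node`'s in-degree or out-degree
    # exceeds `threshold`, its incoming edges are merged through one first port
    # and its outgoing edges through one second port; other edges are unchanged.
    incoming = [e for e in edges if e[1] == node]
    outgoing = [e for e in edges if e[0] == node]
    bundled = [e for e in edges if node not in e]
    if len(incoming) > threshold:
        bundled.append((f"{node}.first_port", node))   # one merged input edge
    else:
        bundled.extend(incoming)
    if len(outgoing) > threshold:
        bundled.append((node, f"{node}.second_port"))  # one merged output edge
    else:
        bundled.extend(outgoing)
    return bundled

# Mirrors FIG. 10: ten edges feed the mul operator, so they collapse into one.
edges = [(f"n{i}", "mul") for i in range(10)]
bundled = bundle_edges(edges, "mul")
print(len(bundled))  # 1
```

In the patent's pipeline the bundled edges still carry their original endpoints for layout and interaction; only the displayed edge count is reduced.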
Optionally, the second graph is laid out by orthogonal edge routing. Specifically, a port-constraint optimization layout algorithm is first applied, based on the first ports and second ports, with an overall orthogonal edge-routing layout as the goal, to optimize the positions and ordering of the first ports and second ports under constraints and calculate their position coordinates, thereby completing the orthogonal edge-routing layout and generating the second graph. In an orthogonal layout, the lines around a node meet at 90-degree angles, and edge routing refers to the arrangement and direction of the specific lines in the graph.
For ease of understanding, refer to FIG. 12, a schematic diagram of another embodiment of a computational graph in an embodiment of this application. H1 indicates the mul operator. By merging, in the graph shown in FIG. 10, the edges whose data input passes through the first port and the edges whose data output passes through the second port, and laying out the result by orthogonal edge routing to generate FIG. 12, the number of connecting edges displayed by the terminal device is reduced. This allows large-scale computational graphs with many nodes and edges to be displayed clearly, further improving the clarity of the computational graph so that users can perform training analysis and debugging based on it.
By way of example, applying the method of the foregoing embodiments to the computational graph of a Bidirectional Encoder Representation from Transformers (BERT) network and performing an orthogonal edge-routing layout yields a visualized graph. Refer to FIG. 13, a schematic diagram of an embodiment of a BERT network computational graph in an embodiment of this application. For a large-scale computational graph such as BERT, after graph processing by the foregoing method, the overall data flow of the resulting graph from left to right is complete and clear. After the terminal device displays the graph, users can interactively click the operators in it to view more and finer substructures. It should be understood that the examples corresponding to FIG. 12 and FIG. 13 are only for understanding this solution; specifics need to be flexibly determined according to the actual situation.
S208. Output the second graph generated after merging.
In this embodiment, the manner in which the graph processing apparatus outputs the second graph generated after merging is similar to step S106 and is not repeated here.
Still further, after the second graph is obtained by improving the clarity of the computational graph through the foregoing embodiments, connected nodes may belong to aggregations at different levels, where an aggregation is a set containing some child nodes; that is, the computational graph may include aggregation information. Refer again to FIG. 7. It can be seen from FIG. 7 that, when aggregations exist in the computational graph, because each aggregation includes at least one node, there are intricate interleaving edges between aggregations, so the computational graph, being a compound graph, still has the problem of insufficient display clarity. Therefore, to resolve the problem of displaying the compound graph shown in FIG. 7, an embodiment of this application provides another graph processing method. In the embodiments of this application, a hash value is used as an example of the identifier. Refer to FIG. 14, which is a schematic diagram of another embodiment of the graph processing method in an embodiment of this application. As shown in the figure, the fourth graph includes an aggregation, and the graph processing method includes the following steps.
S301. Obtain a first graph.
In this embodiment, the manner in which the graph processing apparatus obtains the first graph is similar to step S201, and details are not described herein again.
S302. Obtain at least two subgraphs of the first graph.
In this embodiment, the manner in which the graph processing apparatus obtains the at least two subgraphs of the first graph is similar to step S202, and details are not described herein again.
S303. For each of the at least two subgraphs, compute hash values corresponding to the nodes in the subgraph and hash values corresponding to the edges in the subgraph.
In this embodiment, the manner in which the graph processing apparatus computes, for each of the at least two subgraphs, the hash values corresponding to the nodes in the subgraph and the hash values corresponding to the edges in the subgraph is similar to step S203, and details are not described herein again.
S304. Compute a hash value corresponding to each subgraph based on the hash values corresponding to the nodes in the subgraph and the hash values corresponding to the edges in the subgraph.
In this embodiment, the manner of computing the hash value corresponding to each subgraph based on the hash values corresponding to the nodes in the subgraph and the hash values corresponding to the edges in the subgraph is similar to step S204, and details are not described herein again.
S305. Merge subgraphs with the same hash value among the at least two subgraphs.
In this embodiment, the manner in which the graph processing apparatus merges the subgraphs with the same hash value among the at least two subgraphs is similar to step S205, and details are not described herein again.
S306. Add a first port and a second port for each of a plurality of first nodes.
In this embodiment, the manner in which the graph processing apparatus adds the first port and the second port for each of the plurality of first nodes is similar to step S206, and details are not described herein again.
S307. Merge multiple edges passing through the same first port, and merge multiple edges passing through the same second port.
In this embodiment, the manner in which the graph processing apparatus merges the multiple edges passing through the same first port and merges the multiple edges passing through the same second port is similar to step S207, and details are not described herein again.
S308. Add ports on multiple first edges and multiple second edges, where each first edge indicates an input of an aggregation, each second edge indicates an output of an aggregation, and an aggregation indicates a computing function through a combination of a group of nodes and edges.
In this embodiment, because aggregations exist in the computational graph, the graph obtained after the graph processing apparatus merges the multiple edges passing through the same first port and the multiple edges passing through the same second port includes aggregations. Therefore, the graph processing apparatus may first determine the first edges and the second edges according to the data flow between the nodes included in an aggregation and the nodes outside the aggregation; both the first edges and the second edges are the cross-aggregation edges described in FIG. 8. Specifically, a first edge indicates a data input of an aggregation, a second edge indicates a data output of an aggregation, and an aggregation may indicate a computing function through a combination of a group of nodes and edges.
Further, after determining the first edges and the second edges, the graph processing apparatus may also add ports on the first edges and the second edges. It should be understood that, for one aggregation, the number of ports equals the total number of first edges and second edges corresponding to that aggregation. For example, if an aggregation has 3 data inputs, 3 first edges can be determined, and if the aggregation has 2 data outputs, 2 second edges can be determined, so the graph processing apparatus adds 5 ports. For ease of understanding, refer to FIG. 15, which is a schematic diagram of an embodiment of ports in an embodiment of this application. As shown in the figure, I1 and I2 indicate first edges, I3 and I4 indicate second edges, and I5 to I8 indicate ports. The figure includes an aggregation and 6 nodes. For this aggregation, node 1, node 2, and node 6 do not belong to the aggregation, while the aggregation includes node 3, node 4, and node 5. From the foregoing embodiments, the data flow between node 1 and node 4 corresponds to edge I1, and edge I1 indicates a data input of the aggregation, so edge I1 is a first edge; similarly, the edge I2 corresponding to the data flow between node 2 and node 3 may indicate a data input of the aggregation, so edge I2 is a first edge. Next, the data flow between node 4 and node 6 corresponds to edge I3, and edge I3 indicates a data output of the aggregation, so edge I3 is a second edge; similarly, the edge I4 corresponding to the data flow between node 5 and node 6 may indicate a data output of the aggregation, so edge I4 is a second edge. That is, 2 first edges and 2 second edges can be determined. Based on each first edge and each second edge, a port is added at its intersection with the aggregation; that is, the intersection I5 of the first edge I1 with the aggregation is a port, and similarly, the intersection I6 of the first edge I2 with the aggregation, the intersection I7 of the second edge I3 with the aggregation, and the intersection I8 of the second edge I4 with the aggregation are ports; that is, 4 ports corresponding to the 2 first edges and 2 second edges can be added. It should be understood that the example in FIG. 15 is merely used for understanding this solution; the specific first edges and second edges need to be flexibly determined according to the actual data flow relationships between the nodes in the graph, and the specific ports to be added need to be flexibly determined according to the actual situation of the first edges and the second edges.
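The identification of cross-aggregation edges and the port addition of step S308 can be sketched as follows. The node names and the tuple representation of an added port are illustrative assumptions; only the boundary-crossing rule comes from the embodiment.

```python
def add_boundary_ports(edges, aggregation):
    """Sketch of S308: an edge that crosses the aggregation boundary
    receives a port at the crossing point. `aggregation` is the set of
    node names inside the aggregation; each returned entry pairs the
    edge kind with the crossing edge."""
    ports = []
    for src, dst in edges:
        if src not in aggregation and dst in aggregation:
            ports.append(("first_edge", (src, dst)))   # data input
        elif src in aggregation and dst not in aggregation:
            ports.append(("second_edge", (src, dst)))  # data output
        # edges entirely inside or outside the aggregation get no port
    return ports

# FIG. 15 restated: nodes 3-5 inside; inputs from nodes 1 and 2,
# outputs to node 6, plus one purely internal edge.
agg = {"n3", "n4", "n5"}
edges = [("n1", "n4"), ("n2", "n3"), ("n4", "n6"), ("n5", "n6"), ("n3", "n4")]
ports = add_boundary_ports(edges, agg)
assert len(ports) == 4  # I5-I8: two input ports, two output ports
```

The internal edge between node 3 and node 4 crosses no boundary and therefore contributes no port, matching the count of four ports in the FIG. 15 example.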
S309. Merge multiple third ports among the added ports, where a third port is a port on edges among the multiple first edges that correspond to the same aggregation and indicate inputs from the same node, and merge multiple fourth ports among the added ports, where a fourth port is a port on edges among the multiple second edges that correspond to the same aggregation and indicate outputs to the same node.
In this embodiment, the graph processing apparatus traverses all nodes in an aggregation, determines as third ports the ports on the edges among the multiple first edges that correspond to the same aggregation and indicate inputs from the same node, and merges the multiple third ports. Next, it determines as fourth ports the ports on the edges among the multiple second edges that correspond to the same aggregation and indicate outputs to the same node, and merges the multiple fourth ports.
For ease of understanding, a further example is given based on the ports in FIG. 15. Refer to FIG. 16, which is a schematic diagram of an embodiment of merging ports in an embodiment of this application. As shown in the figure, J1 to J4 indicate ports, and J5 indicates a fourth port. Diagram (A) in FIG. 16 shows an example graph including ports J1 to J4, where both port J3 and port J4 are connected to node 6; that is, the data flows of node 4 and node 5 in the aggregation both point to node 6. In other words, ports J3 and J4 are ports on the 2 second edges that correspond to the same aggregation and indicate outputs to the same node, so ports J3 and J4 can be merged to obtain the fourth port, yielding the example graph shown in diagram (B) of FIG. 16, which includes port J1, port J2, and the fourth port J5. It should be understood that the example in FIG. 16 is merely used for understanding this solution; the specific third ports and fourth ports need to be flexibly determined according to the actual data flow between the nodes inside and outside the aggregation.
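The port merging of step S309 amounts to grouping boundary ports of one aggregation by the external node they connect to. In the sketch below, `boundary_edges` is an assumed intermediate representation, a list of (kind, (src, dst)) pairs with one entry per cross-aggregation edge of a single aggregation; the key format is likewise an assumption.

```python
from collections import defaultdict

def merge_boundary_ports(boundary_edges):
    """Sketch of S309 for one aggregation: ports of first edges whose
    input comes from the same external node merge into one third port,
    and ports of second edges toward the same external node merge into
    one fourth port."""
    groups = defaultdict(list)
    for kind, (src, dst) in boundary_edges:
        # Group inputs by their external source node and outputs by
        # their external destination node.
        external = src if kind == "first_edge" else dst
        groups[(kind, external)].append((src, dst))
    # Each group is displayed as a single merged port.
    return dict(groups)

# FIG. 16 restated: the two outputs toward node 6 (ports J3 and J4)
# merge into the single fourth port J5.
boundary = [("first_edge", ("n1", "n4")), ("first_edge", ("n2", "n3")),
            ("second_edge", ("n4", "n6")), ("second_edge", ("n5", "n6"))]
merged = merge_boundary_ports(boundary)
assert len(merged) == 3
assert len(merged[("second_edge", "n6")]) == 2
```

The four original ports are reduced to three: the two input ports remain separate because they come from different external nodes, while the two output ports share one fourth port.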
S310. Lay out the second graph in an orthogonal edge routing manner based on the merged ports.
In this embodiment, the graph processing apparatus lays out the second graph in an orthogonal edge routing manner based on the merged ports. Specifically, each first edge is first split into two segments at the third port, and each second edge is split into two segments at the fourth port. If there is no third port or fourth port, the first edges and the second edges may also be split into two segments at their ports, which is not limited herein. For ease of understanding, a further example is given based on the fourth port in FIG. 16. Refer to FIG. 17, which is a schematic diagram of another embodiment of the graph processing method in an embodiment of this application. As shown in the figure, K1 and K2 indicate ports, K3 indicates a fourth port, K4, K5, and K6 indicate edges outside the aggregation, and K7 to K9 indicate edges inside the aggregation. The first edge corresponding to the aggregation data input from node 1 to node 4 is split at port K1 to obtain edge K4 and edge K7; similarly, edge K5 and edge K8 can be obtained at port K2. Next, the second edge corresponding to the aggregation data output from node 4 and node 5 to node 6 can be split at the fourth port K3 to obtain edge K6 and edge K9. Then, using the port-constrained layout optimization algorithm, with the ports on the aggregation and the fourth port as boundaries, the node and edge layout is computed separately inside and outside the aggregation, while the number, positions, and ordering of nodes and ports are adjusted under constraints, thereby generating the second graph. The specific orthogonal edge routing layout has been described in the foregoing embodiments, and details are not described herein again.
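The first part of step S310, cutting each cross-aggregation edge at its boundary port so that layout can then be computed separately on each side, can be sketched as follows. Here `merged_ports` is an assumed intermediate representation that maps a merged boundary port, keyed by (edge kind, external node), to its member cross-aggregation edges; treating the key itself as the port object is also an assumption.

```python
def split_at_ports(merged_ports):
    """Sketch of edge splitting in S310: every cross-aggregation edge is
    cut at its (merged) boundary port into one segment outside the
    aggregation and one or more segments inside it."""
    outside, inside = [], []
    for (kind, external), members in merged_ports.items():
        port = (kind, external)  # one shared port per merged group
        if kind == "first_edge":
            # One outside segment from the external source to the port,
            # one inside segment from the port to each internal target.
            outside.append((external, port))
            inside += [(port, dst) for _, dst in members]
        else:
            # One outside segment from the port to the external target,
            # one inside segment from each internal source to the port.
            outside.append((port, external))
            inside += [(src, port) for src, _ in members]
    return outside, inside

merged = {("first_edge", "n1"): [("n1", "n4")],
          ("first_edge", "n2"): [("n2", "n3")],
          ("second_edge", "n6"): [("n4", "n6"), ("n5", "n6")]}
outside, inside = split_at_ports(merged)
assert len(outside) == 3 and len(inside) == 4
```

After the split, the port-constrained layout algorithm can place the three outside segments and the four inside segments independently, with only the shared ports coupling the two computations.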
S311. Output the second graph generated after the merging.
In this embodiment, the manner in which the graph processing apparatus outputs the second graph generated after the merging is similar to step S208, and details are not described herein again.
It can be learned from the foregoing embodiments that, by adopting the port design and rule-based edge bundling for data edges crossing node aggregations, the embodiments of this application can not only reduce the number of edges but also preserve the complete data flow of the locally focused region. In addition, for handling the connecting lines entering and leaving aggregations, the port-constrained layout optimization algorithm adaptively adjusts and restricts the positions and number of nodes and ports on the aggregation boundary, so that different computational graph structures can be accommodated more generally, and the original local data connection relationships can be preserved as much as possible while the graph layout is simplified.
It should be understood that there is no chronological restriction between the steps in the foregoing embodiments. For example, between step S305 and steps S306 and S307, steps S306 and S307 may be performed first and then step S305. Likewise, between steps S306 and S307 and steps S308 to S310, steps S308 to S310 may be performed first and then steps S306 and S307. Therefore, in the examples of this embodiment, the order of the steps may be adjusted according to the actual situation, which is not specifically limited herein.
The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of the method. It can be understood that, to implement the foregoing functions, the graph processing apparatus includes corresponding hardware structures and/or software modules for performing the functions. A person skilled in the art should easily be aware that, in combination with the modules and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented by hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such an implementation shall not be considered to go beyond the scope of this application.
In the embodiments of this application, the graph processing apparatus may be divided into functional modules based on the foregoing method examples. For example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is illustrative and is merely a logical function division; there may be other division manners in actual implementation.
Accordingly, the graph processing apparatus in this application is described in detail below. Refer to FIG. 18, which is a schematic diagram of an embodiment of the graph processing apparatus in an embodiment of this application. As shown in the figure, the graph processing apparatus 1800 includes:
an obtaining module 1801, configured to obtain at least two subgraphs of a first graph, where each subgraph includes a plurality of nodes in the first graph and edges between the nodes;
a computing module 1802, configured to compute respective identifiers of the at least two subgraphs based on the nodes and edges included in each of the at least two subgraphs;
a merging module 1803, configured to merge subgraphs with the same identifier among the at least two subgraphs; and
an output module 1804, configured to output a second graph generated after the merging.
In some optional embodiments of this application, the identifier is a hash value, and data of each subgraph indicates the nodes and edges in the subgraph; and
the computing module 1802 is specifically configured to: for each of the at least two subgraphs, compute a hash value corresponding to the subgraph based on hash values respectively corresponding to the plurality of nodes in the subgraph and hash values corresponding to the plurality of edges in the subgraph.
In some optional embodiments of this application, in each subgraph, a hash value corresponding to a node is related to an attribute of the node; and
in each subgraph, a hash value corresponding to an edge is related to a connection relationship indicated by the edge in the subgraph.
In some optional embodiments of this application, the identifier is a hash value; and
the merging module 1803 is specifically configured to: merge subgraphs with the same hash value among the at least two subgraphs;
add a first port and a second port for each of a plurality of first nodes, where each first node is a node whose out-degree or in-degree in the first graph is greater than a first threshold, an edge indicating data input into each first node passes through the first port of the first node, and an edge indicating data output from each first node passes through the second port of the first node; and
perform the following operation on the plurality of first nodes to generate the second graph:
merging multiple edges passing through the same first port, and merging multiple edges passing through the same second port.
In some optional embodiments of this application, the second graph is laid out in an orthogonal edge routing manner.
In some optional embodiments of this application, the identifier is a hash value; and
the merging module 1803 is specifically configured to: merge subgraphs with the same hash value among the at least two subgraphs;
add ports on multiple first edges and multiple second edges, where each first edge indicates an input of an aggregation, each second edge indicates an output of an aggregation, and an aggregation indicates a computing function through a combination of a group of nodes and edges;
merge multiple third ports among the added ports, where a third port is a port on edges among the multiple first edges that correspond to the same aggregation and indicate inputs from the same node, and merge multiple fourth ports among the added ports, where a fourth port is a port on edges among the multiple second edges that correspond to the same aggregation and indicate outputs to the same node; and
lay out the second graph in an orthogonal edge routing manner based on the merged ports.
In some optional embodiments of this application, the identifier is a hash value; and
the merging module 1803 is specifically configured to: merge subgraphs with the same hash value among the at least two subgraphs;
add a first port and a second port for each of a plurality of first nodes, where each first node is a node whose out-degree or in-degree in the first graph is greater than a first threshold, an edge indicating data input into each first node passes through the first port of the first node, and an edge indicating data output from each first node passes through the second port of the first node;
merge multiple edges passing through the same first port, and merge multiple edges passing through the same second port;
add ports on multiple first edges and multiple second edges, where each first edge indicates an input of an aggregation, each second edge indicates an output of an aggregation, and an aggregation indicates a computing function through a combination of a group of nodes and edges;
merge multiple third ports among the added ports, where a third port is a port on edges among the multiple first edges that correspond to the same aggregation and indicate inputs from the same node, and merge multiple fourth ports among the added ports, where a fourth port is a port on edges among the multiple second edges that correspond to the same aggregation and indicate outputs to the same node; and
lay out the second graph in an orthogonal edge routing manner based on the results of the one or more foregoing merging steps.
The graph processing apparatus in the embodiments of this application may be deployed on a terminal device or on a server, or may be a chip applied in a terminal device or a server, or another combined device or component capable of implementing the functions of the foregoing terminal device. When the graph processing apparatus is a terminal device, the computing module and the merging module may be implemented by a processor executing code; for example, the processor may be an application chip of a certain model. When the graph processing apparatus is a component having the functions of the foregoing terminal device, the computing module and the merging module may be implemented by a processor executing code. When the graph processing apparatus is a chip system, the computing module and the merging module may be the processor of the chip system.
Specifically, refer to FIG. 19, which is a schematic structural diagram of an embodiment of the graph processing apparatus in an embodiment of this application. As shown in FIG. 19, the graph processing apparatus 1900 includes a processor 1910, a memory 1920 coupled to the processor 1910, and an input/output port 1930. In some implementations, they may be coupled together through a bus. The graph processing apparatus 1900 may be a server or a terminal device. The processor 1910 may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP. The processor may further be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The processor 1910 may refer to one processor, or may include a plurality of processors. The memory 1920 may include a volatile memory, for example, a random access memory (RAM); the processor 1910 may execute code to implement the functions of the computing module 1802 and the merging module 1803. The memory 1920 may also include a non-volatile memory, for example, a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 1920 may further include a combination of the foregoing types of memory.
The memory 1920 stores computer-readable instructions, and the computer-readable instructions perform any one of the methods in the possible implementations described above. After executing the computer-readable instructions, the processor 1910 may perform corresponding operations according to the indications of the computer-readable instructions. In addition, after executing the computer-readable instructions in the memory 1920, the processor 1910 may, according to the indications of the computer-readable instructions, perform all operations that the server or the terminal device can perform, for example, the operations performed by the server in the embodiments corresponding to FIG. 9, FIG. 11, and FIG. 14.
The input/output port 1930 includes a port for outputting data, and in some cases also includes a port for inputting data. The processor 1910 may invoke the input/output port 1930 by executing code to output the second graph; in some cases, the processor 1910 may also invoke the input/output port 1930 by executing code to obtain the two subgraphs of the first graph from another device.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, the descriptions of those working processes, and their technical effects, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described herein again.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. For example, the division of the units is merely a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of this application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Claims (17)

  1. A graph processing method, comprising:
    obtaining at least two subgraphs of a first graph, wherein each of the subgraphs comprises a plurality of nodes in the first graph and edges between the nodes;
    computing respective identifiers of the at least two subgraphs based on the nodes and the edges comprised in each of the subgraphs;
    merging subgraphs with a same identifier among the at least two subgraphs; and
    outputting a second graph generated after the merging.
  2. The method according to claim 1, wherein the identifier is a hash value, and data of each of the subgraphs indicates the nodes and the edges in each of the subgraphs; and
    the computing respective identifiers of the at least two subgraphs based on the nodes and the edges comprised in each of the subgraphs comprises:
    for each of the at least two subgraphs, computing a hash value corresponding to the subgraph based on hash values respectively corresponding to the plurality of nodes in the subgraph and hash values corresponding to a plurality of edges in the subgraph.
  3. The method according to claim 2, wherein in each subgraph, a hash value corresponding to a node is related to an attribute of the node; and
    in each subgraph, a hash value corresponding to an edge is related to a connection relationship indicated by the edge in the subgraph.
  4. The method according to any one of claims 1 to 3, wherein the identifier is a hash value; and
    the merging subgraphs with a same identifier among the at least two subgraphs comprises:
    merging subgraphs with a same hash value among the at least two subgraphs;
    adding a first port and a second port for each of a plurality of first nodes, wherein each first node is a node whose out-degree or in-degree in the first graph is greater than a first threshold, an edge indicating data input into each first node passes through the first port of the first node, and an edge indicating data output from each first node passes through the second port of the first node; and
    performing the following operation on the plurality of first nodes to generate the second graph:
    merging a plurality of edges passing through a same first port, and merging a plurality of edges passing through a same second port.
  5. The method according to claim 4, wherein the second graph is laid out in an orthogonal edge routing manner.
  6. The method according to any one of claims 1 to 3, wherein the identifier is a hash value; and
    the merging subgraphs with a same identifier among the at least two subgraphs comprises:
    merging subgraphs with a same hash value among the at least two subgraphs;
    adding ports on a plurality of first edges and a plurality of second edges, wherein each first edge indicates an input of an aggregation, each second edge indicates an output of an aggregation, and an aggregation indicates a computing function through a combination of a group of nodes and edges;
    merging a plurality of third ports among the added ports, wherein a third port is a port on edges that are among the plurality of first edges, correspond to a same aggregation, and indicate inputs from a same node, and merging a plurality of fourth ports among the added ports, wherein a fourth port is a port on edges that are among the plurality of second edges, correspond to a same aggregation, and indicate outputs to a same node; and
    laying out the second graph in an orthogonal edge routing manner based on the merged ports.
  7. The method according to any one of claims 1 to 3, wherein the identifier is a hash value; and
    the merging subgraphs with a same identifier among the at least two subgraphs comprises:
    merging subgraphs with a same hash value among the at least two subgraphs;
    adding a first port and a second port for each of a plurality of first nodes, wherein each first node is a node whose out-degree or in-degree in the first graph is greater than a first threshold, an edge indicating data input into each first node passes through the first port of the first node, and an edge indicating data output from each first node passes through the second port of the first node;
    merging a plurality of edges passing through a same first port, and merging a plurality of edges passing through a same second port;
    adding ports on a plurality of first edges and a plurality of second edges, wherein each first edge indicates an input of an aggregation, each second edge indicates an output of an aggregation, and an aggregation indicates a computing function through a combination of a group of nodes and edges;
    merging a plurality of third ports among the added ports, wherein a third port is a port on edges that are among the plurality of first edges, correspond to a same aggregation, and indicate inputs from a same node, and merging a plurality of fourth ports among the added ports, wherein a fourth port is a port on edges that are among the plurality of second edges, correspond to a same aggregation, and indicate outputs to a same node; and
    laying out the second graph in an orthogonal edge routing manner based on results of the one or more foregoing merging steps.
  8. A graph processing apparatus, comprising:
    an obtaining module, configured to obtain at least two subgraphs of a first graph, wherein each of the subgraphs comprises a plurality of nodes in the first graph and edges between the nodes;
    a computing module, configured to compute respective identifiers of the at least two subgraphs based on the nodes and the edges comprised in each of the subgraphs;
    a merging module, configured to merge subgraphs with a same identifier among the at least two subgraphs; and
    an output module, configured to output a second graph generated after the merging.
  9. The graph processing apparatus according to claim 8, wherein the identifier is a hash value, and data of each of the subgraphs indicates the nodes and the edges in each of the subgraphs; and
    the computing module is specifically configured to: for each of the at least two subgraphs, compute a hash value corresponding to the subgraph based on hash values respectively corresponding to the plurality of nodes in the subgraph and hash values corresponding to a plurality of edges in the subgraph.
  10. The graph processing apparatus according to claim 9, wherein in each subgraph, a hash value corresponding to a node is related to an attribute of the node; and
    in each subgraph, a hash value corresponding to an edge is related to a connection relationship indicated by the edge in the subgraph.
  11. The graph processing apparatus according to any one of claims 8 to 10, wherein the identifier is a hash value; and
    the merging module is specifically configured to: merge subgraphs with a same hash value among the at least two subgraphs;
    add a first port and a second port for each of a plurality of first nodes, wherein each first node is a node whose out-degree or in-degree in the first graph is greater than a first threshold, an edge indicating data input into each first node passes through the first port of the first node, and an edge indicating data output from each first node passes through the second port of the first node; and
    perform the following operation on the plurality of first nodes to generate the second graph:
    merging a plurality of edges passing through a same first port, and merging a plurality of edges passing through a same second port.
  12. The graph processing apparatus according to claim 11, wherein the second graph is laid out in an orthogonal edge routing manner.
  13. The graph processing apparatus according to any one of claims 8 to 10, wherein the identifier is a hash value; and
    the merging module is specifically configured to: merge subgraphs with a same hash value among the at least two subgraphs;
    add ports on a plurality of first edges and a plurality of second edges, wherein each first edge indicates an input of an aggregation, each second edge indicates an output of an aggregation, and an aggregation indicates a computing function through a combination of a group of nodes and edges;
    merge a plurality of third ports among the added ports, wherein a third port is a port on edges that are among the plurality of first edges, correspond to a same aggregation, and indicate inputs from a same node, and merge a plurality of fourth ports among the added ports, wherein a fourth port is a port on edges that are among the plurality of second edges, correspond to a same aggregation, and indicate outputs to a same node; and
    lay out the second graph in an orthogonal edge routing manner based on the merged ports.
  14. The graph processing apparatus according to any one of claims 8 to 10, wherein the identifier is a hash value; and
    the merging module is specifically configured to: merge subgraphs with a same hash value among the at least two subgraphs;
    add a first port and a second port for each of a plurality of first nodes, wherein each first node is a node whose out-degree or in-degree in the first graph is greater than a first threshold, an edge indicating data input into each first node passes through the first port of the first node, and an edge indicating data output from each first node passes through the second port of the first node;
    merge a plurality of edges passing through a same first port, and merge a plurality of edges passing through a same second port;
    add ports on a plurality of first edges and a plurality of second edges, wherein each first edge indicates an input of an aggregation, each second edge indicates an output of an aggregation, and an aggregation indicates a computing function through a combination of a group of nodes and edges;
    merge a plurality of third ports among the added ports, wherein a third port is a port on edges that are among the plurality of first edges, correspond to a same aggregation, and indicate inputs from a same node, and merge a plurality of fourth ports among the added ports, wherein a fourth port is a port on edges that are among the plurality of second edges, correspond to a same aggregation, and indicate outputs to a same node; and
    lay out the second graph in an orthogonal edge routing manner based on results of the one or more foregoing merging steps.
  15. A server, comprising:
    a processor, a memory, and an input/output (I/O) interface;
    wherein the processor is coupled to the memory and the input/output interface; and
    the processor performs the method according to any one of claims 1 to 7 by running code in the memory.
  16. A terminal device, comprising:
    a processor, a memory, and an input/output (I/O) interface;
    wherein the processor is coupled to the memory and the input/output interface; and
    the processor performs the method according to any one of claims 1 to 7 by running code in the memory.
  17. A computer-readable storage medium comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 7.
PCT/CN2021/096023 2020-09-21 2021-05-26 Graph processing method, system and apparatus WO2022057303A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/186,267 US20230229704A1 (en) 2020-09-21 2023-03-20 Graph processing method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010998184.0A Graph processing method, system and apparatus
CN202010998184.0 2020-09-21

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/186,267 Continuation US20230229704A1 (en) 2020-09-21 2023-03-20 Graph processing method and apparatus

Publications (1)

Publication Number Publication Date
WO2022057303A1 true WO2022057303A1 (zh) 2022-03-24

Family

ID=80777610

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/096023 WO2022057303A1 (zh) 2020-09-21 2021-05-26 一种图处理的方法,系统以及装置

Country Status (3)

Country Link
US (1) US20230229704A1 (zh)
CN (1) CN114283099A (zh)
WO (1) WO2022057303A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210232969A1 (en) * 2018-12-24 2021-07-29 Intel Corporation Methods and apparatus to process a machine learning model in a multi-process web browser environment
US20230115149A1 (en) * 2021-09-24 2023-04-13 Insitro, Inc. System, devices and/or processes for updating call graphs
US11914578B2 (en) * 2021-12-21 2024-02-27 Michael Roberts Xbundle: a hierarchical hypergraph database designed for distributed processing
CN116959731A (zh) Medical information processing method and apparatus, device, and storage medium
CN115793914A (zh) Method for generating a multi-round scenario interaction flowchart, electronic device, and storage medium thereof
CN117576125B (zh) Method, apparatus, device, and storage medium for partitioning a neural network computational graph

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130246480A1 (en) * 2012-03-19 2013-09-19 Sap Ag Computing Canonical Hierarchical Schemas
CN105468371A (zh) Business process diagram merging method based on topic clustering
CN107038215A (zh) Database search method for obtaining maximal complete subgraphs from an m-partite graph
CN109359172A (zh) Entity alignment optimization method based on graph partitioning
CN111338635A (zh) Graph compilation method, apparatus, device, and storage medium for a computational graph


Also Published As

Publication number Publication date
CN114283099A (zh) 2022-04-05
US20230229704A1 (en) 2023-07-20

Similar Documents

Publication Publication Date Title
WO2022057303A1 (zh) Graph processing method, system and apparatus
Felsner et al. Straight-line drawings on restricted integer grids in two and three dimensions
US10380186B2 (en) Virtual topological queries
US8065658B1 (en) Tool for visualizing software architecture
Bannister et al. Track layouts, layered path decompositions, and leveled planarity
US10650559B2 (en) Methods and systems for simplified graphical depictions of bipartite graphs
WO2016078592A1 (zh) Batch data query method and apparatus
WO2002046980A2 (en) A method of configuring a product using a directed acyclic graph
JP2010176347A (ja) Shortest path search method and apparatus
US9824494B2 (en) Hybrid surfaces for mesh repair
Bertolazzi et al. Quasi-upward planarity
US20200320367A1 (en) Graph Conversion Method
CN110853120B (zh) Network layout method, system, and medium based on a partitioned drawing method
WO2024060999A1 (zh) Polygon processing method, apparatus, device, computer-readable storage medium, and computer program product
KR20190021397A (ko) Method and apparatus for executing distributed computing tasks
CN112015405B (zh) Interface layout file generation method, interface generation method, apparatus, and device
Demaine et al. Fine-grained I/O complexity via reductions: New lower bounds, faster algorithms, and a time hierarchy
JP6781819B2 (ja) Task processing method and distributed computing framework system
Cortese et al. On embedding a cycle in a plane graph
Laface et al. Picard-graded Betti numbers and the defining ideals of Cox rings
KR102428849B1 (ko) Method, apparatus, electronic device, storage medium, and program for processing an image
Sharp Intrinsic Triangulations in Geometry Processing.
Woodbury An introduction to shape schema grammars
JP6185732B2 (ja) Image processing apparatus
Kaufmann et al. On upward point set embeddability

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21868141

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21868141

Country of ref document: EP

Kind code of ref document: A1