CN113626652B - Data processing network system, data processing network deployment system and method thereof - Google Patents

Data processing network system, data processing network deployment system and method thereof

Info

Publication number
CN113626652B
CN113626652B (application CN202111183990A)
Authority
CN
China
Prior art keywords
data processing
network
backward
computation
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111183990.3A
Other languages
Chinese (zh)
Other versions
CN113626652A (en)
Inventor
成诚
张建浩
李新奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Oneflow Technology Co Ltd
Original Assignee
Beijing Oneflow Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Oneflow Technology Co Ltd filed Critical Beijing Oneflow Technology Co Ltd
Priority claimed from application CN202111183990.3A
Publication of application CN113626652A
Application granted
Publication of grant CN113626652B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists

Abstract

The invention discloses a data processing network system, a data processing network deployment system, and methods thereof. The data processing network system includes: a forward data processing network comprising a plurality of forward data processing sub-networks; a backward data processing network comprising a plurality of backward data processing sub-networks; and one or more forward data processing transition networks. For each pair of corresponding forward and backward data processing sub-networks, the output of the data processing source node of the forward sub-network and the input of the data processing source node of the backward sub-network each serve as one of the inputs of the data processing source node of the forward data processing transition network, and the output of the data processing sink node of the transition network serves as one of the inputs of the data processing source node of the backward sub-network.

Description

Data processing network system, data processing network deployment system and method thereof
Technical Field
The present disclosure relates to a data processing technology. More particularly, the present disclosure relates to a data processing network system, a data processing network deployment system, and methods thereof.
Background
Now that deep learning is widespread, ever larger models and datasets make training on a single computing device impractical, and distributed computing has been adopted in response. In distributed computing, a large job or a large tensor is divided and its parts are deployed to the computing devices of a distributed data processing system, with intermediate parameters exchanged during each part of the computation. The processing of a particular job therefore typically includes both forward and backward data processing. Backward data processing usually consumes intermediate parameters generated during forward data processing, and because it runs after the forward processing, those intermediate parameters must be held in a designated storage space until the corresponding backward processing uses them. Since deep learning places heavy demands on storage, this long-term occupation wastes a great deal of storage space, and for a data processing system with a fixed memory budget it greatly reduces data processing capacity.
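The storage pressure described above can be illustrated with a minimal sketch. All class and variable names here are invented for illustration and do not appear in the patent; the point is only that every forward node keeps an intermediate buffered until the backward pass consumes it, so a chain of six nodes holds six buffers simultaneously:

```python
# Minimal illustrative sketch (invented names): each forward node stores an
# intermediate that stays buffered until the backward pass consumes it.
class ForwardNode:
    def __init__(self, name):
        self.name = name
        self.saved = None  # intermediate parameter kept for the backward pass

    def forward(self, x):
        y = x * 2.0          # stand-in for the real computation
        self.saved = y       # buffer occupied from now until backward runs
        return y

    def backward(self, grad):
        assert self.saved is not None  # backward needs the stored value
        self.saved = None              # buffer released only here
        return grad * 2.0

nodes = [ForwardNode(f"Fn{i}") for i in range(1, 7)]
x = 1.0
for n in nodes:
    x = n.forward(x)
held_before = sum(n.saved is not None for n in nodes)  # six buffers occupied

g = 1.0
for n in reversed(nodes):
    g = n.backward(g)
held_after = sum(n.saved is not None for n in nodes)   # all released
```

In the scheme of the present disclosure, most of these buffers would not need to persist across the gap between forward and backward processing, which is exactly the waste the background section identifies.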
Therefore, there is a need for a method and system for improving the processing capacity of a data processing network system in situations where the storage space of the data processing network system is limited.
Disclosure of Invention
An object of the present invention is to solve at least the above problems. In particular, the present disclosure provides a data processing network system including: a forward data processing network for performing forward data processing, comprising a plurality of forward data processing sub-networks, each sub-network formed of a plurality of data processing nodes, each data processing node performing data processing using data generated by its upstream data processing nodes and outputting data to its downstream data processing nodes; a backward data processing network for performing backward data processing, comprising a plurality of backward data processing sub-networks; and one or more forward data processing transition networks. Each forward data processing transition network is located between, and separates the connection between, a pair consisting of a forward data processing sub-network and a backward data processing sub-network, and has the same network structure as the forward data processing sub-network of that pair. The output of the data processing source node of the forward data processing sub-network and the input of the data processing source node of the backward data processing sub-network each serve as one of the inputs of the data processing source node of the forward data processing transition network, and the output of the data processing sink node of the forward data processing transition network serves as one of the inputs of the data processing source node of the backward data processing sub-network, such that, immediately before the backward data processing sub-network of the pair performs backward data processing, the forward data processing transition network performs transitional forward data processing, triggered by a message that the backward data processing sub-network is about to perform backward data processing, to prepare the input data that the backward data processing sub-network needs for its backward data processing.
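The arrangement claimed above amounts to just-in-time recomputation. A hedged sketch of the triggering order follows; the class names, the method `on_backward_ready`, and the numeric stand-in computations are all invented for illustration, not taken from the patent:

```python
# Hedged sketch (invented names): a transition copy of a forward sub-network
# re-runs the forward computation only when the backward sub-network signals
# that it is about to run.
def make_forward_fn(scale):
    def fn(x):
        return x * scale   # stand-in for the sub-network's real computation
    return fn

class TransitionNetwork:
    """Structural copy of a forward sub-network; recomputes on demand."""
    def __init__(self, forward_fn, source_input):
        self.forward_fn = forward_fn
        self.source_input = source_input  # shared with the forward source node
        self.params = None                # allocated only when triggered

    def on_backward_ready(self):
        # corresponds to the "about to perform backward processing" message
        self.params = self.forward_fn(self.source_input)
        return self.params

class BackwardSubNetwork:
    def __init__(self, transition):
        self.transition = transition

    def run(self, grad):
        params = self.transition.on_backward_ready()  # control edge fires first
        return grad * params

tfcp1 = TransitionNetwork(make_forward_fn(3.0), source_input=2.0)
bcp1 = BackwardSubNetwork(tfcp1)
out = bcp1.run(grad=1.0)   # recomputation happens just in time
```

Note that `tfcp1.params` is populated only inside `run`, mirroring the claim that the transition network's buffers come alive immediately before the backward processing rather than persisting from the original forward pass.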
According to the data processing network system of the present disclosure, the data processing source node of the forward data processing transition network and the data processing source node of the forward data processing sub-network share the same memory unit.
According to the data processing network system of the present disclosure, the message that the backward data processing sub-network is about to perform backward data processing is a backward-data-processing-finished message issued from the outputs of one or more upstream backward data processing nodes connected to the data processing source node of the backward data processing sub-network.
According to another aspect of the present disclosure, there is provided a data processing method including: performing forward data processing through a forward data processing network comprising a plurality of forward data processing sub-networks, each sub-network formed of a plurality of data processing nodes, each data processing node performing data processing using data generated by its upstream data processing nodes and outputting data to its downstream data processing nodes; performing backward data processing through a backward data processing network comprising a plurality of backward data processing sub-networks; and performing transitional forward data processing through one or more forward data processing transition networks. Each forward data processing transition network is located between, and separates, a pair consisting of a forward data processing sub-network and a backward data processing sub-network, and has the same network structure as the forward data processing sub-network of that pair. The output of the data processing source node of the forward data processing sub-network and the input of the data processing source node of the backward data processing sub-network each serve as one of the inputs of the data processing source node of the forward data processing transition network, and the output of the data processing sink node of the forward data processing transition network serves as one of the inputs of the data processing source node of the backward data processing sub-network, such that, before the backward data processing sub-network of the pair performs backward data processing, the forward data processing transition network performs transitional forward data processing, triggered by a message that the backward data processing sub-network is about to perform backward data processing, to prepare the input data that the backward data processing sub-network needs for its backward data processing.
According to the data processing method of the present disclosure, the data processing source node of the forward data processing transition network and the data processing source node of the forward data processing sub-network share the same memory unit.
According to the data processing method of the present disclosure, the message that the backward data processing sub-network is about to perform backward data processing is a backward-data-processing-finished message sent from the outputs of one or more upstream backward data processing nodes connected to the data processing source node of the backward data processing sub-network.
According to another aspect of the present disclosure, there is provided a deployment system for a data processing network, comprising: an initial data processing computation graph generation component that receives task configuration data input by a user and generates an initial data processing computation graph for the data processing system, the initial data processing computation graph comprising a forward data processing computation graph for performing forward data processing and a backward data processing computation graph for performing backward data processing; a checkpoint acquisition component that queries all data computation logic nodes in the forward data processing computation graph and collects the data computation logic nodes bearing checkpoint marks together with their scopes, so as to determine, for each checkpoint-marked data computation logic node, a forward data processing computation subgraph; a forward data processing transition computation subgraph generation component that generates a corresponding forward data processing transition computation subgraph by replicating each forward data processing computation subgraph; a final data processing computation graph generation component that disconnects the connection edges between the data computation logic nodes in each forward data processing computation subgraph and the backward data processing computation graph, takes the output of the data computation logic source node of the forward data processing computation subgraph and the input of the data computation logic source node of the corresponding backward data processing computation subgraph each as one of the inputs of the data computation logic source node of the forward data processing transition computation subgraph, and takes the output of the data computation logic sink node of the forward data processing transition computation subgraph as one of the inputs of the data computation logic source node of the corresponding backward data processing computation subgraph, thereby obtaining a final data processing computation graph; and a data processing network deployment component that deploys each data computation logic node to a data processing node based on the final data processing computation graph, thereby forming a data processing network corresponding to the final data processing computation graph, the data processing network including a forward data processing network, a backward data processing network, and a forward data processing transition network corresponding to the forward data processing transition computation subgraph.
According to yet another aspect of the present disclosure, there is provided a method of deploying a data processing network, comprising: generating, by an initial data processing computation graph generation component, an initial data processing computation graph for the data processing system based on received user-input task configuration data, the initial data processing computation graph comprising a forward data processing computation graph for performing forward data processing and a backward data processing computation graph for performing backward data processing; querying, by a checkpoint acquisition component, all data computation logic nodes in the forward data processing computation graph and collecting the data computation logic nodes bearing checkpoint marks together with their scopes, thereby determining, for each checkpoint-marked data computation logic node, a forward data processing computation subgraph; generating, by a forward data processing transition computation subgraph generation component, a corresponding forward data processing transition computation subgraph by replicating each forward data processing computation subgraph; disconnecting the connection edges between the data computation logic nodes in each forward data processing computation subgraph and the backward data processing computation graph, taking the output of the data computation logic source node of the forward data processing computation subgraph and the input of the data computation logic source node of the corresponding backward data processing computation subgraph each as one of the inputs of the data computation logic source node of the forward data processing transition computation subgraph, and taking the output of the data computation logic sink node of the forward data processing transition computation subgraph as one of the inputs of the data computation logic source node of the corresponding backward data processing computation subgraph, thereby obtaining a final data processing computation graph; and deploying, by a data processing network deployment component, each data computation logic node to a data processing node based on the final data processing computation graph, thereby forming a data processing network corresponding to the final data processing computation graph, the data processing network including a forward data processing network, a backward data processing network, and a forward data processing transition network corresponding to the forward data processing transition computation subgraph.
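The deployment pipeline above (collect checkpointed nodes, copy that subgraph, cut the forward-to-backward parameter edges, rewire them through the copy) can be sketched as follows. The graph representation, the function `rewrite_graph`, and the `"T"` naming prefix are invented for illustration; only the node names echo the figure described later in the text:

```python
# Illustrative sketch (hypothetical graph representation) of the deployment
# steps: collect checkpoint-marked forward nodes, copy them as a transition
# subgraph, and rewire the forward->backward parameter edges through the copy.
def rewrite_graph(forward_nodes, checkpoints, fwd_to_bwd_edges):
    """forward_nodes: node names; checkpoints: set of checkpoint-marked nodes;
    fwd_to_bwd_edges: {forward node: backward node} parameter edges."""
    # 1. checkpoint acquisition: forward subgraph under the checkpoint marks
    subgraph = [n for n in forward_nodes if n in checkpoints]
    # 2. transition subgraph generation: replicate the checkpointed subgraph
    transition = {n: "T" + n for n in subgraph}
    # 3. final graph: disconnect the original forward->backward edges and
    #    reconnect each backward node to the transition copy instead
    new_edges = {}
    for fwd, bwd in fwd_to_bwd_edges.items():
        new_edges[transition.get(fwd, fwd)] = bwd
    return transition, new_edges

transition, edges = rewrite_graph(
    forward_nodes=["Fn1", "Fn3", "Fn5", "Fn6", "Fn7"],
    checkpoints={"Fn1", "Fn3", "Fn5", "Fn6"},
    fwd_to_bwd_edges={"Fn1": "Bn1", "Fn3": "Bn2", "Fn5": "Bn4", "Fn6": "Bn6"},
)
# backward nodes now consume parameters from the transition copies TFn*
```

A real deployment component would additionally add the control edges described below (upstream-backward-done into the transition source, transition sink into the backward source); this sketch shows only the data-edge rewiring.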
With the data processing network system, data processing network deployment system, and methods of the present disclosure, the forward data processing transition network acts as a backup of the forward data processing network: after the forward data processing network finishes its forward processing, it can release the memory it would otherwise hold during the period before the corresponding backward data processing runs, and the transition network repeats that forward processing just before the backward processing begins, regenerating the result data the backward processing requires. The occupation time of the memory holding that result data is thereby shortened, from the interval between the end of the forward data processing and the start of the corresponding backward data processing, to the much smaller interval between the transition network generating the result data and the corresponding backward data processing finishing, which opens a large window for time-sharing these memories. Trading time for memory space in this way offers an effective alternative for users with limited budgets, improving as far as possible the computing capacity of fixed computing resources facing the same computing task.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Drawings
FIG. 1 is a schematic diagram illustrating one example of a data processing network system according to the present disclosure.
FIG. 2 is a schematic diagram illustrating a deployment system for a data processing network according to the present disclosure.
FIG. 3 is a schematic diagram illustrating a deployment method for a data processing network according to the present disclosure.
Detailed Description
The present invention will be described in further detail with reference to the following examples and the accompanying drawings so that those skilled in the art can practice the invention with reference to the description.
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, one of two possible objects may be referred to hereinafter as a first data processing node and may also be referred to as a second data processing node without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "upon", "when", or "in response to a determination", depending on the context.
For a better understanding of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
FIG. 1 is a schematic diagram illustrating one example of a data processing network system according to the present disclosure. As shown in fig. 1, the data processing network system includes a forward data processing network for performing forward data processing and a backward data processing network for performing backward data processing; fig. 1 shows only part of each. The backward data processing network performs backward data processing using the parameters generated by the forward data processing network. The forward data processing network of fig. 1 comprises a plurality of forward data processing sub-networks FCP1, FCP2, FCP3, ..., each comprising a plurality of data processing nodes. The exemplary forward data processing sub-network FCP1 includes forward data processing nodes Fn1 through Fn6. The number of data processing nodes in a forward data processing sub-network and the connections between them vary with the actual data processing task and need not match the sub-network structure shown in fig. 1. Each data processing node performs data processing using data generated by its upstream data processing nodes and outputs data to its downstream data processing nodes. For example, the data processing node Fn3 performs data processing using data generated by its upstream node Fn1 and outputs the result to its downstream node Fn5.
The backward data processing network of fig. 1 comprises a plurality of backward data processing sub-networks BCP1, BCP2, BCP3, ..., each formed of a plurality of data processing nodes. The exemplary backward data processing sub-network BCP1 includes backward data processing nodes Bn1 through Bn6. As with the forward sub-networks, the number of nodes in a backward data processing sub-network and the connections between them vary with the actual data processing task and need not match the structure shown in fig. 1. Conventionally, each backward data processing node performs backward data processing using data generated by its upstream backward data processing nodes together with parameters generated and stored by the corresponding forward data processing node, and outputs the result to its downstream backward data processing nodes. For example, the backward data processing node Bn2 would conventionally use data generated by its upstream node Bn4 and parameters generated by the corresponding forward node Fn3 and stored in a buffer, and output the result to its downstream node Bn1. In the present disclosure, however, Bn2 instead uses the parameters generated by, and stored in the buffer of, the corresponding forward data processing transition node TFn3 in the forward data processing transition network TFCP1. As shown in fig. 1, TFCP1 is one of a plurality of forward data processing transition networks TFCP1, TFCP2, ..., which are not associated with each other; each lies between, and separates, a pair consisting of a forward data processing sub-network FCP and a backward data processing sub-network BCP. The forward data processing transition network TFCP1, for instance, is located between the forward data processing sub-network FCP1 and the backward data processing sub-network BCP1. Conventionally, as shown in fig. 1, the parameters generated by the forward data processing nodes Fn1, Fn3, Fn5, and Fn6 of FCP1 would be used by the backward data processing nodes Bn1, Bn2, Bn4, and Bn6 of BCP1, respectively. In the present disclosure, because TFCP1 separates FCP1 from BCP1 and takes over the role of FCP1 when BCP1 performs backward data processing, the parameters generated by the forward data processing transition nodes TFn1, TFn3, TFn5, and TFn6 of TFCP1 are used by Bn1, Bn2, Bn4, and Bn6 instead, so the parameters generated by Fn1, Fn3, Fn5, and Fn6 need not all be saved for the backward data processing nodes to perform backward data processing.
Therefore, of the parameters generated by the forward data processing sub-network FCP1, only those of the forward data processing source node Fn1 are stored, in the forward data processing transition source node TFn1; the parameters generated by the other forward data processing nodes Fn2 through Fn6 are not kept in buffers, i.e. the buffers they occupied are released and can serve as buffers for other data processing nodes.
As shown in fig. 1, a forward data processing transition network such as TFCP1 includes forward data processing transition nodes TFn1 through TFn6 and has the same network structure as the forward data processing sub-network FCP1 of the pair of FCP1 and BCP1 in which it is located. The input of the forward data processing transition source node TFn1 of TFCP1 needs the parameters generated by the corresponding forward data processing source node Fn1, so TFn1 can share a buffer, i.e. memory, with Fn1. To let TFCP1 re-run, in place of FCP1, the forward data processing that FCP1 performed, and thereby regenerate the parameters needed for backward data processing just before the backward data processing sub-network BCP1 runs, the output of the backward data processing node Bn7 upstream of the backward data processing source node Bn6 of BCP1 is connected to one of the inputs of the forward data processing transition source node TFn1, forming a control edge between them.
Thus, only after one of the inputs of the forward data processing transition source node TFn1 receives the message from the backward data processing node Bn7 that its backward data processing is complete does TFn1 start the substitute forward data processing of the forward data processing source node Fn1 and generate the corresponding parameters, thereby also starting the substitute forward data processing of the transition nodes TFn2 through TFn6. Although only one backward data processing node Bn7 is shown upstream of, and connected to, the backward data processing source node Bn6, there may be several such upstream nodes; correspondingly, although fig. 1 shows only one message-connection control edge between the input of TFn1 and the output of Bn7, there may be one such control edge per upstream backward data processing node connected to Bn6. That is, the output of the data processing source node of the forward data processing sub-network and the input of the data processing source node of the backward data processing sub-network each serve as one of the inputs of the data processing source node of the forward data processing transition network. Further, as shown in fig. 1, the output of the forward data processing transition sink node TFn6 of TFCP1 is connected to one of the inputs of the backward data processing source node Bn6 of BCP1, forming a control connection edge between TFn6 and Bn6. In this way, Bn6 can start backward data processing only after it receives the message that TFn6 has completed the substitute forward data processing, so that BCP1 does not start backward processing and then stall waiting while TFCP1 has not yet prepared the parameters. In other words, the output of the data processing sink node of the forward data processing transition network serves as one of the inputs of the data processing source node of the backward data processing sub-network, such that the forward data processing transition network performs forward data processing to prepare the input data, or control messages, that the backward data processing sub-network needs, immediately before the backward data processing is performed.
The control connection edge therefore governs the timing with which the forward data processing transition network re-executes forward data processing: it ensures that this data processing occurs as late as possible, shortens the lifetime of the memory occupied by the forward data processing transition network, and improves memory reuse efficiency.
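The late-as-possible ordering that the control connection edge enforces can be illustrated with a minimal scheduler sketch (the Node/run_schedule names are illustrative assumptions, not part of the disclosed system; only the Bn7/TFn1 labels come from FIG. 1):

```python
class Node:
    """A data processing node whose control inputs must finish before it runs."""
    def __init__(self, name):
        self.name = name
        self.control_inputs = []   # nodes that must finish first

    def ready(self, finished):
        return all(dep.name in finished for dep in self.control_inputs)

def run_schedule(nodes):
    """Fire each node as soon as all of its control inputs have finished."""
    finished, order = set(), []
    pending = list(nodes)
    while pending:
        for node in pending:
            if node.ready(finished):
                finished.add(node.name)
                order.append(node.name)
                pending.remove(node)
                break
        else:
            raise RuntimeError("cycle in control edges")
    return order

bn7 = Node("Bn7")                  # upstream backward node
tfn1 = Node("TFn1")                # transition (recomputation) source node
tfn1.control_inputs.append(bn7)    # control edge: TFn1 waits for Bn7
print(run_schedule([tfn1, bn7]))   # → ['Bn7', 'TFn1']
```

Because TFn1 only becomes ready after Bn7 finishes, the recomputed tensors come into existence as late as possible, which is exactly what keeps their memory lifetime short.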
Similarly, the connection relationship edges and control relationship edges between the forward data processing transition network TFCP2 and the pair of forward data processing sub-network FCP2 and backward data processing sub-network BCP2 in which it is located, as shown in FIG. 1, are the same as those between the forward data processing transition network TFCP1 and the pair of sub-networks FCP1 and BCP1 in which it is located, and are therefore not described again. Although the two groups are shown with the same network structure, their structures may differ depending on the data processing task. Furthermore, although a single forward data processing node Fn7 is shown between the forward data processing sub-networks FCP1 and FCP2 in FIG. 1, this is merely exemplary; in practice many more forward data processing nodes Fn, forming a complex network, may be connected between them, depending on the application scenario, provided their intermediate parameters need not be used by the backward data processing network. It should also be noted that although FIG. 1 shows the forward data processing node Fn7 as having a corresponding backward data processing node Bn7, this too is merely exemplary; in practical applications, not every forward data processing node has a corresponding backward data processing node.
It should be noted that, in a data processing system, the scope of a forward data processing sub-network FCP may be set manually by a person skilled in the art during construction of the data processing system, or may be formed according to rules during system deployment; the forward data processing transition network TFCP is then deployed correspondingly.
FIG. 2 is a schematic diagram of a deployment system for a data processing network according to the present disclosure. As shown in FIG. 2, a deployment system 200 of a data processing network performs network deployment based on job tasks, forming the data processing network described above. The deployment system 200 comprises: an initial data processing computation graph generation component 210, a checkpoint acquisition component 220, a forward data processing transition computation subgraph generation component 230, a final data processing computation graph generation component 240, and a data processing network deployment component 250. As shown in FIG. 2, the initial data processing computation graph generation component 210 receives the task configuration data contained in a job task input by a user and generates an initial data processing computation graph for the data processing system, the initial data processing computation graph comprising a forward data processing computation graph for executing forward data processing and a backward data processing computation graph for executing backward data processing. The computation graph is generated in a conventional manner, and the details are therefore omitted from this disclosure.
The checkpoint acquisition component 220 then queries all of the data computation logic nodes in the forward data processing computation graph and collects the data computation logic nodes carrying checkpoint markers, together with their scopes (e.g., by determining whether a data computation logic node directly participates in forward computation, i.e., belongs to a marked scope, and whether that scope has its checkpoint marker enabled), so as to determine the forward data processing computation subgraphs of the checkpoint-marked data computation logic nodes, stored for example in a hash table. Specifically, all of the forward data computation logic nodes wrapped by a checkpoint scope are collected, yielding the computation subgraph formed by those forward data computation logic nodes; the forward data processing computation graph is thus composed of a plurality of forward data computation subgraphs. During the query, the checkpoint acquisition component 220 traverses all of the forward data computation logic nodes in each forward data computation subgraph and determines whether any of their output edges is connected to the backward data processing network, i.e., whether the data they generate is consumed by backward data computation logic nodes. A BFS (breadth-first search) is used to find the next subgraph node from the current node of the forward data computation subgraph, with the following search logic: starting from the current node, traverse every next subgraph node (which may be a forward or a backward data computation subgraph node) that has a consumption relationship on the current node's output edges, mark each traversed node as visited, and place it into the visited-node set.
If none of the forward data computation logic nodes in a forward data computation subgraph is connected to a backward data computation logic node, the subgraph participates only in forward data processing and is unrelated to backward data processing, so the query skips it: it is not a checkpoint target. Conversely, if such a connection exists, the subgraph is marked as a checkpoint. Data computation logic nodes with checkpoint markers are typically marked by code wrapping, and the data computation logic nodes within the scope of a checkpoint marker constitute a computation subgraph containing the checkpoint. These computation subgraphs are the smallest units from which the data processing transition computation subgraphs of the present disclosure are built. The checkpoint acquisition component 220 thus acquires the checkpoints in the forward data processing computation graph, from which the forward data processing transition computation subgraph generation component 230 generates the forward data processing transition computation subgraphs.
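The consumption test described above can be sketched as a breadth-first search over a toy graph represented as a consumer adjacency map (the function name feeds_backward and the name-prefix convention for backward nodes are assumptions for illustration):

```python
from collections import deque

def feeds_backward(subgraph_nodes, consumers, is_backward):
    """BFS from the nodes of a checkpointed forward subgraph; return True as
    soon as a traversed consumer is a backward node, i.e. the subgraph's
    results are (transitively) consumed by the backward computation graph."""
    visited = set(subgraph_nodes)
    queue = deque(subgraph_nodes)
    while queue:
        node = queue.popleft()
        for nxt in consumers.get(node, ()):
            if nxt in visited:
                continue
            visited.add(nxt)             # mark as accessed
            if is_backward(nxt):
                return True              # an output reaches the backward graph
            queue.append(nxt)
    return False                         # forward-only subgraph: skip it

# Toy graph echoing the description: Fn1 -> Fn2 -> Bn2 (a backward consumer)
consumers = {"Fn1": ["Fn2"], "Fn2": ["Bn2"]}
is_bwd = lambda n: n.startswith("B")     # assumed naming convention
print(feeds_backward(["Fn1"], consumers, is_bwd))   # → True
print(feeds_backward(["Fn9"], consumers, is_bwd))   # → False
```

A subgraph for which this test returns False is exactly the kind the query skips: its data never reaches the backward graph, so no transition copy is needed.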
For the data computation logic nodes covered by a checkpoint marker, static memory reuse is applied: once the lifecycle of a forward tensor ends, the remaining tensors can reuse its memory, achieving memory reuse and memory savings. After the checkpoints are marked by code wrapping, over the entire forward data processing pass of the computation subgraph network covered by a checkpoint, only the memory of the input tensor is kept and the generated parameters are stored; the memory of all intermediate feature tensors generated by the other data computation logic nodes of the subgraph is not locked but released.
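A minimal sketch of this memory behavior, assuming plain Python callables stand in for the wrapped data computation logic nodes: only the subgraph input is kept during the forward pass, and the intermediate activations are recomputed from it on demand:

```python
def forward_with_checkpoint(x, layers):
    """Run the forward pass keeping only the subgraph's input; intermediate
    activations are simply not stored (their buffers would be released),
    which is the memory saving the checkpoint marker buys."""
    for f in layers:
        x = f(x)
    return x                      # result for downstream forward nodes

def recompute_activations(saved_input, layers):
    """Replay the forward pass from the saved input right before the
    backward pass needs the intermediate activations."""
    acts, x = [], saved_input
    for f in layers:
        x = f(x)
        acts.append(x)
    return acts

layers = [lambda v: v + 1, lambda v: v * 2]   # stand-ins for wrapped nodes
out = forward_with_checkpoint(3, layers)       # → 8; only the input 3 is kept
acts = recompute_activations(3, layers)        # → [4, 8]
```

This is the same trade as activation checkpointing in deep learning frameworks: memory for the intermediates is exchanged for one extra forward pass over the checkpointed region.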
Subsequently, the forward data processing transition computation subgraph generation component 230 generates forward data processing transition computation subgraphs based on the forward computation subgraphs covered by these checkpoints. The forward computation subgraphs covered by checkpoints are those associated with backward data processing, i.e., those in which the result data of at least one forward data computation node is to be consumed by at least one backward data computation node. The forward data processing transition computation subgraph is generated by copying the network structure of the forward computation subgraph, together with the connecting edges between the forward computation subgraph and the backward computation subgraph.
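Copying the subgraph structure might look like the following sketch, where a graph is a node list plus an edge list and the `_t` suffix (an assumed naming convention) marks the transition twins:

```python
def copy_subgraph(nodes, edges, suffix="_t"):
    """Clone the checkpointed forward subgraph into a transition subgraph:
    every node gets a transition twin and the internal edge structure is
    preserved; edges leaving the subgraph are handled during rewiring."""
    twin = {n: n + suffix for n in nodes}
    new_nodes = [twin[n] for n in nodes]
    new_edges = [(twin[a], twin[b]) for a, b in edges
                 if a in twin and b in twin]
    return new_nodes, new_edges

nodes, edges = copy_subgraph(["Fn1", "Fn2"], [("Fn1", "Fn2")])
# nodes → ['Fn1_t', 'Fn2_t'], edges → [('Fn1_t', 'Fn2_t')]
```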
Next, the final data processing computation graph generation component 240 generates a final data processing computation graph by combining the initial data processing computation graph generated by the initial data processing computation graph generation component 210 with the forward data processing transition computation subgraphs generated by the forward data processing transition computation subgraph generation component 230. Specifically, the connecting edges between the forward data computation logic nodes in the initial data processing computation subgraph and the backward data processing computation graph are disconnected; the output of the data computation logic source node of each forward data processing computation subgraph and the input of the data computation logic source node of the corresponding backward data processing sub-network computation subgraph each become one of the inputs of the data computation logic source node of the forward data processing transition computation subgraph; and the output of the data computation logic sink node of the forward data processing transition computation subgraph becomes one of the inputs of the data computation logic source node of the corresponding backward data processing sub-network computation subgraph, thereby yielding the final data processing computation graph.
Since the forward data computation logic nodes in the initial data processing computation subgraph are disconnected from their connecting edges to the backward data processing computation graph, none of the data computation logic nodes in each forward data computation subgraph, except its source node, participates in the backward data processing of the data computation logic nodes in the backward data processing computation graph. In other words, the initial data processing computation graph is modified by inserting, between each pair of forward and backward data processing computation subgraphs, a forward data processing transition computation subgraph identical to the forward data processing computation subgraph; the connecting edges between the initial forward data processing computation subgraph and the backward data processing computation subgraph are broken and replaced by connecting edges between the inserted forward data processing transition computation subgraph and the backward data processing computation subgraph.
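The edge replacement can be sketched as follows (the `B` name prefix as a backward-node test and all node labels other than those from FIG. 1 are illustrative assumptions): every forward-to-backward edge is redirected to the transition twin, and a control edge from the transition sink to the backward source is appended:

```python
def rewire(edges, copy_of, transition_sink, backward_source):
    """Break every edge from a checkpointed forward node into the backward
    graph, making the backward consumer read the transition twin instead,
    then add the control edge sink -> backward source so the backward
    source fires only after the recomputation has finished."""
    new_edges = []
    for a, b in edges:
        if a in copy_of and b.startswith("B"):
            new_edges.append((copy_of[a], b))   # consume recomputed tensor
        else:
            new_edges.append((a, b))            # edge kept untouched
    new_edges.append((transition_sink, backward_source))  # control edge
    return new_edges

edges = [("Fn1", "Fn2"), ("Fn2", "Bn2")]
print(rewire(edges, {"Fn2": "TFn2"}, "TFn6", "Bn6"))
# → [('Fn1', 'Fn2'), ('TFn2', 'Bn2'), ('TFn6', 'Bn6')]
```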
Finally, data processing network deployment component 250 deploys each data computation logic node to a data processing node based on the final data processing computational graph, thereby forming a data processing network corresponding to the final data processing computational graph, the data processing network including a forward data processing network, a backward data processing network, and a forward data processing transition network corresponding to the forward data processing transition computational subgraph.
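As a purely illustrative sketch of the final deployment step, the snippet below maps data computation logic nodes onto data processing nodes with a round-robin policy; the disclosure does not prescribe a particular placement policy, so this policy and the device names are assumptions:

```python
def deploy_nodes(logic_nodes, processing_nodes):
    """Assign each logic node of the final computation graph to a data
    processing node; round-robin is only a placeholder placement policy."""
    return {n: processing_nodes[i % len(processing_nodes)]
            for i, n in enumerate(logic_nodes)}

placement = deploy_nodes(["Fn1", "TFn1", "Bn6"], ["dev0", "dev1"])
# → {'Fn1': 'dev0', 'TFn1': 'dev1', 'Bn6': 'dev0'}
```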
FIG. 3 is a schematic diagram of a deployment method for a data processing network according to the present disclosure. As shown in FIG. 3, at step 300 the task configuration data input by a user is received and computing resources are obtained. Then, at step 310, an initial data processing computation graph for the data processing system is generated by the initial data processing computation graph generation component 210 based on the received task configuration data, the initial data processing computation graph comprising a forward data processing computation graph for performing forward data processing and a backward data processing computation graph for performing backward data processing. Next, at step 320, the checkpoint acquisition component 220 queries all of the data computation logic nodes in the forward data processing computation graph and collects the checkpoint-marked data computation logic nodes and their scopes, thereby determining the forward data processing computation subgraphs of the checkpoint-marked nodes. By determining whether a data computation logic node directly participates in forward computation, i.e., belongs to a marked scope, and whether that scope has its checkpoint marker enabled, the forward data processing computation subgraph of the checkpoint-marked data computation logic nodes is determined and stored, for example, in a hash table. Specifically, all of the forward data computation logic nodes wrapped by a checkpoint scope are collected, yielding each computation subgraph formed by those nodes; the forward data processing computation graph is thus composed of a plurality of forward data computation subgraphs.
During the query, the checkpoint acquisition component 220 traverses all of the forward data computation logic nodes in each forward data computation subgraph and determines whether any of their output edges is connected to the backward data processing network, i.e., whether the data they generate is consumed by backward data computation logic nodes. A BFS is used to find the next subgraph node from the current node of the forward data computation subgraph, with the following search logic: starting from the current node, traverse every next subgraph node (which may be a forward or a backward data computation subgraph node) that has a consumption relationship on the current node's output edges, mark each traversed node as visited, and place it into the visited-node set. If none of the forward data computation logic nodes in a forward data computation subgraph is connected to a backward data computation logic node, the subgraph participates only in forward data processing and is unrelated to backward data processing, so the query skips it: it is not a checkpoint target. Conversely, if such a connection exists, the subgraph is marked as a checkpoint. Data computation logic nodes with checkpoint markers are typically marked by code wrapping, and the data computation logic nodes within the scope of a checkpoint marker constitute a computation subgraph containing the checkpoint. These computation subgraphs are the smallest units from which the data processing transition computation subgraphs of the present disclosure are built.
Next, at step 330, a corresponding forward data processing transition computation subgraph is generated by the forward data processing transition computation subgraph generation component 230 by copying the forward data processing computation subgraph. Subsequently, at step 340, the final data processing computation graph generation component 240 disconnects the connecting edges between the data computation logic nodes in the data processing computation subgraph and the backward data processing computation graph, takes the output of the data computation logic source node of each forward data processing computation subgraph and the input of the data computation logic source node of the corresponding backward data processing sub-network computation subgraph each as one of the inputs of the data computation logic source node of the forward data processing transition computation subgraph, and takes the output of the data computation logic sink node of the forward data processing transition computation subgraph as one of the inputs of the data computation logic source node of the corresponding backward data processing sub-network computation subgraph, thereby obtaining the final data processing computation graph. Finally, at step 350, each data computation logic node is deployed to a data processing node by the data processing network deployment component based on the final data processing computation graph, forming a data processing network corresponding to the final data processing computation graph, the data processing network comprising a forward data processing network, a backward data processing network, and a forward data processing transition network corresponding to the forward data processing transition computation subgraph.
The basic principles of the present disclosure have been described above in connection with specific embodiments. It should be noted, however, that those skilled in the art will understand that all or any of the steps or components of the method and apparatus of the present disclosure may be implemented in hardware, firmware, software, or a combination thereof, in any computing device (including processors, storage media, etc.) or network of computing devices, using basic programming skills after reading this description.
Thus, the objects of the present disclosure may also be achieved by running a program or a set of programs on any computing device, which may be a well-known general-purpose device. The objects of the present disclosure may likewise be achieved merely by providing a program product containing program code for implementing the method or apparatus. That is, such a program product constitutes the present disclosure, as does a storage medium storing it, which may be any known storage medium or any storage medium developed in the future.
It is also noted that in the apparatus and methods of the present disclosure, it is apparent that individual components or steps may be disassembled and/or re-assembled. These decompositions and/or recombinations are to be considered equivalents of the present disclosure. Also, the steps of executing the series of processes described above may naturally be executed chronologically in the order described, but need not necessarily be executed chronologically. Some steps may be performed in parallel or independently of each other.
The above detailed description should not be construed as limiting the scope of the disclosure. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (8)

1. A data processing network system, comprising:
a forward data processing network for performing forward data processing, including a plurality of forward data processing sub-networks, each sub-network being formed of a plurality of data processing nodes, each data processing node performing data processing using data generated by an upstream data processing node thereof and outputting data to a downstream data processing node thereof;
a backward data processing network for performing backward data processing, comprising a plurality of backward data processing sub-networks; and
one or more forward data processing transition networks, each of which is located between and separates a pair of a forward data processing sub-network and a backward data processing sub-network, and has the same network structure as a forward data processing sub-network of a pair of the forward data processing sub-network and the backward data processing sub-network on which it is located,
wherein the output of the data processing source node of the forward data processing sub-network and the input of the data processing source node of the backward data processing sub-network in the pair of forward data processing sub-networks and backward data processing sub-networks are respectively one of the inputs of the data processing source node of the forward data processing transition network, and the output of the data processing sink node of the forward data processing transition network is one of the inputs of the data processing source node of the backward data processing sub-network in the pair of forward data processing sub-networks and backward data processing sub-networks, so that the buffers used by the parameters generated by other forward data processing nodes of the forward data processing sub-networks except the parameters generated by the forward data processing source node of the forward data processing sub-network are released, and before the backward data processing sub-networks of the pair of forward and backward data processing sub-networks are about to perform backward data processing, the forward data processing transition network performs, based on a message that the backward data processing sub-networks are about to perform backward data processing, transitional forward data processing in place of the forward data processing sub-networks performing the forward data processing procedure performed by the forward data processing sub-networks once again, so as to prepare input data for performing backward data processing for the backward data processing sub-networks of the pair of forward and backward data processing sub-networks.
2. The data processing network system of claim 1, wherein the data processing source nodes of the forward data processing transition network share a same memory location as the data processing source nodes of the forward data processing sub-network.
3. The data processing network system of claim 1, wherein the message that the backward data processing sub-network is about to perform backward data processing is a backward data processing end message issued from an output of one or more upstream backward data processing nodes connected to the data processing source node of the backward data processing sub-network.
4. A method of data processing, comprising:
performing forward data processing through a forward data processing network comprising a plurality of forward data processing sub-networks, each sub-network being formed by a plurality of data processing nodes, each data processing node performing data processing using data generated by an upstream data processing node thereof and outputting data to a downstream data processing node thereof;
performing backward data processing through a backward data processing network comprising a plurality of backward data processing sub-networks; and
performing transitional forward data processing through one or more forward data processing transition networks, wherein each forward data processing transition network is located between and separates a pair of a forward data processing sub-network and a backward data processing sub-network and has the same network structure as the forward data processing sub-network of the pair in which it is located, the output of the data processing source node of the forward data processing sub-network and the input of the data processing source node of the backward data processing sub-network in the pair of forward and backward data processing sub-networks being respectively one of the inputs of the data processing source node of the forward data processing transition network, and the output of the data processing sink node of the forward data processing transition network being one of the inputs of the data processing source node of the backward data processing sub-network in the pair of forward and backward data processing sub-networks, whereby the buffers used by the parameters generated by the forward data processing nodes of the forward data processing sub-network, except for the parameters generated by the forward data processing source node which are to be saved by the forward data processing transition source node, are released, and, immediately before the backward data processing sub-network of the pair of forward and backward data processing sub-networks performs backward data processing, the forward data processing transition network performs, based on a message that the backward data processing sub-network is about to perform backward data processing, transitional forward data processing in place of the forward data processing sub-network, executing once again the forward data processing procedure performed by the forward data processing sub-network, so as to prepare input data for the backward data processing sub-network of the pair of forward and backward data processing sub-networks to perform backward data processing.
5. The data processing method of claim 4, wherein the data processing source nodes of the forward data processing transition network share the same memory location as the data processing source nodes of the forward data processing sub-network.
6. A data processing method according to claim 4, wherein the message that the backward data processing sub-network is about to perform backward data processing is a message that backward data processing is finished issued at the output of one or more upstream backward data processing nodes connected to the data processing source node of the backward data processing sub-network.
7. A deployment system for a data processing network, comprising:
the initial data processing calculation graph generation component receives task configuration data input by a user and generates an initial data processing calculation graph for the data processing system, wherein the initial data processing calculation graph comprises a forward data processing calculation graph and a backward data processing calculation graph, the forward data processing calculation graph is used for executing forward data processing, and the backward data processing calculation graph is used for executing backward data processing;
the check point acquisition component inquires all the data calculation logic nodes in the forward data processing calculation graph and collects the data calculation logic nodes with check point marks and the action range thereof so as to determine a forward data processing calculation subgraph of the data calculation logic nodes with the check point marks;
the forward data processing transition computation subgraph generation component is used for generating a corresponding forward data processing transition computation subgraph based on the forward data processing computation subgraph copying;
a final data processing computation graph generation component which disconnects the data computation logic nodes themselves in the data processing computation subgraph from the connection edges of the backward data processing computation graph, and takes the output end of the data computation logic source node of the corresponding forward data processing computation subgraph and the input end of the data computation logic source node of the backward data processing sub-network computation subgraph corresponding to the corresponding forward data processing computation subgraph as one of the input ends of the data computation logic source nodes of the forward data processing transition computation subgraph respectively, taking the output end of the data computation logic sink node of the forward data processing transition computation subgraph as one of the input ends of the data computation logic source nodes of the backward data processing sub-network computation subgraph corresponding to the corresponding forward data processing computation subgraph, thereby obtaining a final data processing computation graph; and
and the data processing network deployment component is used for deploying each data computing logic node to the data processing nodes based on the final data processing computational graph so as to form a data processing network corresponding to the final data processing computational graph, wherein the data processing network comprises a forward data processing network, a backward data processing network and a forward data processing transition network corresponding to the forward data processing transition computational subgraph.
8. A method of deploying a data processing network, comprising:
generating, by an initial data processing computation graph generation component, an initial data processing computation graph for the data processing system based on the received user-input task configuration data, the initial data processing computation graph comprising a forward data processing computation graph and a backward data processing computation graph, the forward data processing computation graph for performing forward data processing, the backward data processing computation graph for performing backward data processing;
querying all data calculation logical nodes in the forward data processing calculation graph through a check point acquisition component, and collecting the data calculation logical nodes with check point marks and action ranges thereof, thereby determining a forward data processing calculation subgraph of the data calculation logical nodes with the check point marks;
generating a corresponding forward data processing transition computation subgraph by a forward data processing transition computation subgraph generation component based on forward data processing computation subgraph replication;
disconnecting the connection edge between the data computation logic node and a backward data processing computation graph in the data processing computation subgraph, taking the output end of the data computation logic source node of the corresponding forward data processing computation subgraph and the input end of the data computation logic source node of the backward data processing sub-network computation subgraph corresponding to the corresponding forward data processing computation subgraph as one of the input ends of the data computation logic source nodes of the forward data processing transition computation subgraph respectively, and taking the output end of the data computation logic sink node of the forward data processing transition computation subgraph as one of the input ends of the data computation logic source node of the backward data processing sub-network computation subgraph corresponding to the corresponding forward data processing computation subgraph, thereby obtaining a final data processing computation graph; and
deploying, by a data processing network deployment component, each data computation logic node to a data processing node based on the final data processing computation graph, thereby forming a data processing network corresponding to the final data processing computation graph, the data processing network including a forward data processing network, a backward data processing network, and a forward data processing transition network corresponding to the forward data processing transition computation subgraph.
CN202111183990.3A 2021-10-11 2021-10-11 Data processing network system, data processing network deployment system and method thereof Active CN113626652B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111183990.3A CN113626652B (en) 2021-10-11 2021-10-11 Data processing network system, data processing network deployment system and method thereof

Publications (2)

Publication Number Publication Date
CN113626652A CN113626652A (en) 2021-11-09
CN113626652B true CN113626652B (en) 2021-12-17

Family

ID=78390913

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111183990.3A Active CN113626652B (en) 2021-10-11 2021-10-11 Data processing network system, data processing network deployment system and method thereof

Country Status (1)

Country Link
CN (1) CN113626652B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112764940A (en) * 2021-04-12 2021-05-07 北京一流科技有限公司 Multi-stage distributed data processing and deploying system and method thereof
CN113342525A (en) * 2020-07-24 2021-09-03 北京一流科技有限公司 Distributed data processing system and method thereof

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
CN105550749A (en) * 2015-12-09 2016-05-04 四川长虹电器股份有限公司 Method for constructing convolution neural network in novel network topological structure
US10432652B1 (en) * 2016-09-20 2019-10-01 F5 Networks, Inc. Methods for detecting and mitigating malicious network behavior and devices thereof
US11630994B2 (en) * 2018-02-17 2023-04-18 Advanced Micro Devices, Inc. Optimized asynchronous training of neural networks using a distributed parameter server with eager updates
CN110602096B (en) * 2019-09-12 2021-07-13 腾讯科技(深圳)有限公司 Data processing method, device, storage medium and equipment in block chain network
CN111105017B (en) * 2019-12-24 2023-05-16 北京旷视科技有限公司 Neural network quantization method and device and electronic equipment
CN111338635B (en) * 2020-02-20 2023-09-12 腾讯科技(深圳)有限公司 Graph compiling method, device, equipment and storage medium for calculation graph
CN111723933B (en) * 2020-06-03 2024-04-16 上海商汤智能科技有限公司 Training method of neural network model and related products

Also Published As

Publication number Publication date
CN113626652A (en) 2021-11-09

Similar Documents

Publication Publication Date Title
CN110727731B (en) Method for adding node in block chain network and block chain system
CN109242500B (en) Block chain transaction validity verification method and device and storage medium
CN111131399B (en) Method and device for dynamically increasing consensus nodes in block chain
CN102932415A (en) Method and device for storing mirror image document
WO2009033248A1 (en) A method for efficient thread usage for hierarchically structured tasks
CN110704438B (en) Method and device for generating bloom filter in blockchain
CN112667860A (en) Sub-graph matching method, device, equipment and storage medium
CN113626652B (en) Data processing network system, data processing network deployment system and method thereof
CN108897858A (en) The appraisal procedure and device, electronic equipment of distributed type assemblies index fragment
CN107977310B (en) Traversal test command generation method and device
CN112069259B (en) Multi-cloud environment data storage system and method based on blockchain
CN112463340A (en) Tensorflow-based multi-task flexible scheduling method and system
CN115665174B (en) Gradient data synchronization method, system, equipment and storage medium
CN114510338B (en) Task scheduling method, task scheduling device and computer readable storage medium
CN108494589B (en) Management method and system of distributed Nginx server
CN114756385B (en) Elastic distributed training method under deep learning scene
CN109389271B (en) Application performance management method and system
CN105447141A (en) Data processing method and node
CN116980281A (en) Node selection method, node selection device, first node, storage medium and program product
CN114710350A (en) Allocation method and device for callable resources
EP3793171B1 (en) Message processing method, apparatus, and system
CN116012485A (en) Time sequence path processing method and device and storage medium
CN113297164A (en) Database system, data query method and device
CN111724260B (en) Multi-scene configuration data storage method and system based on configuration block
CN116701410B (en) Method and system for storing memory state data for data language of digital networking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant