WO2023202005A1 - Methods and systems for performing data processing tasks - Google Patents
- Publication number
- WO2023202005A1 (PCT/CN2022/125155)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- processing
- processing node
- data
- node
- current
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU], to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F16/9024—Graphs; Linked lists
Definitions
- the current processing node may receive multiple types of input data, such as forward input data, loopback input data, etc.
- the current processing node may perform the processing flow corresponding to the current processing node according to the data processing strategy, so that the current processing node may correctly and flexibly process the different types of input data.
- a determination result may be obtained by determining whether the current processing node includes a loopback input path based on a relative relationship between the hierarchy value of the current processing node and the hierarchy value of the at least one associative processing node connected with the current processing node.
- the first data processing strategy may also be referred to as an asynchronous mode of data processing.
- a current scheduling strategy may be obtained by selecting a scheduling strategy (i.e., a data processing strategy) from preset scheduling strategies based on the hierarchy value of the current processing node and the hierarchy values of remaining processing nodes.
- the processor may preset at least two scheduling strategies to obtain a preset scheduling strategy set. A scheduling strategy may be a manner of scheduling the input data of the current processing node by using a task scheduling device. Which scheduling strategy is used to transmit the input data to the current processing node may be determined based on a relationship between the hierarchy value of the current processing node and the hierarchy values of the processing nodes (i.e., the remaining processing nodes) other than the current processing node among all processing nodes.
- S71-S73 may be the same as S41-S43, the descriptions of which are not repeated here.
- the loopback input data may also be referred to as loopback data.
- the operations of identifying the loopback data may be performed. Specifically, identification information may be obtained by identifying the output data of the first input node, the first input node being an input node whose hierarchy value is greater than the hierarchy value of the current processing node. For example, as shown in FIG. 6, the processing node 2 may receive two input data: one from the processing node 1 and another from the processing node 5, and it needs to be identified which input data is the loopback data. The input data may include the hierarchy value of its starting point processing node. In response to determining that the hierarchy value of the input data is greater than the hierarchy value of the current processing node, the input data may be identified as the loopback data, i.e., the data output by the processing node 5 may be identified as the loopback data.
- the input data of the current processing node may be processed synchronously. As shown in FIG. 6, the processing node 2 may receive two input data: one from the processing node 1 and another from the processing node 5. Therefore, a synchronization mechanism needs to be added to ensure that the data is correctly sent to the processing node 2. The embodiments provide two synchronization mechanisms, specifically as follows:
- the current processing node may process first output data by inputting the first output data into the current processing node, the first output data being output data of the first input node and the first mode selection instruction being an instruction generated by mode selection of the user; and/or the current processing node may process second output data by inputting the second output data into the current processing node, the second output data being output data of the second input node.
- the processing module 1220 may determine that the data processing strategy corresponding to the current processing node is related to a type of the processing flow corresponding to the current processing node.
- the processing module 1220 may identify loopback input data corresponding to the loopback input path in response to the determination result indicating that the current processing node includes the loopback input path.
- the processing module 1220 may determine that the data processing strategy corresponding to the current processing node includes a second data processing strategy.
- the second data processing strategy may include: in response to determining that the input data received by the current processing node does not include loopback input data, determining whether the input data is first input data of the current processing node; and performing the processing flow corresponding to the current processing node based on a second determination result, including: in response to determining that the input data is the first input data of the current processing node, performing the processing flow corresponding to the current processing node by processing the input data; or in response to determining that the input data is not the first input data of the current processing node, caching the input data, receiving next input data, and performing the processing flow corresponding to the current processing node based on the next input data.
- the following 2.1) -2.6) take an image processing task as an example to illustrate the process of determining the optimization direction of data processing algorithms; other technical fields may refer to the following examples to determine the optimization direction of data processing algorithms in their own data processing tasks.
- in the data processing task, the initial input data may be an image to be processed, and a target task (for example, detecting and identifying a target in the image) may be realized by processing the image in a plurality of data processing flows using various image processing algorithms.
- a plurality of different images may be obtained, and a corresponding sequence of data processing information may be obtained by performing the data processing task for each image.
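The synchronization mechanism mentioned in the excerpts above, in which a node such as the processing node 2 fires only after receiving data from both the processing node 1 and the processing node 5, can be sketched as follows. The class name, node names, and placeholder processing flow are illustrative assumptions, not taken from the disclosure:

```python
class SyncNode:
    """Illustrative sketch: buffer inputs until all required input nodes have sent data."""

    def __init__(self, required_inputs):
        self.required = set(required_inputs)  # names of the expected input nodes
        self.buffer = {}                      # input node name -> cached data

    def receive(self, source, data):
        self.buffer[source] = data            # cache the arrival until the set is complete
        if set(self.buffer) == self.required:
            ready, self.buffer = self.buffer, {}
            return self.process(ready)        # all inputs present: fire the processing flow
        return None                           # still waiting for the other input(s)

    def process(self, inputs):
        # Placeholder processing flow: combine both inputs (illustrative only).
        return sum(inputs.values())

# The processing node 2 waits for data from both node 1 and node 5.
node2 = SyncNode(["node1", "node5"])
node2.receive("node1", 10)   # returns None: node5 has not sent data yet
print(node2.receive("node5", 32))  # 42
```

This mirrors the cache-and-wait behavior of the second data processing strategy: a non-complete set of inputs is buffered, and the processing flow is triggered only once the remaining input arrives.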
Abstract
A method and system for performing a data processing task. The method may include obtaining a computational graph corresponding to a data processing task, the data processing task including a plurality of processing flows, the computational graph including a plurality of processing nodes representing the plurality of processing flows and directed edges each of which is configured to connect processing nodes among the plurality of processing nodes, a direction of a directed edge denoting a direction of data transfer between processing nodes connected by the directed edge; and performing the plurality of processing flows based on the computational graph by: determining a data processing strategy corresponding to a current processing node based on a hierarchy value of the current processing node and a hierarchy value of at least one associative processing node connected with the current processing node, a hierarchy value of a processing node being related to a count of edges between the processing node and an initial processing node; and in response to determining that the current processing node receives input data, performing a processing flow corresponding to the current processing node according to the data processing strategy based on the input data.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority of Chinese Patent Application No. 202210412395.0 filed on April 19, 2022, the contents of which are hereby incorporated by reference.
The present disclosure relates to a technical field of intelligent processing, in particular, relates to methods and systems for performing a data processing task.
A computational graph is a way to represent mathematical functions in a graph theoretical language, which is used to describe a computational structure. The introduction of the computational graph may facilitate a visual representation of networks. Elements of the computational graph may include nodes and edges, and the nodes may be connected by the edges.
The computational graph is defined as a directed graph, and a connection direction of the edges in the computational graph may represent a direction of data transfer. Generally, each processing flow is performed in sequence, and the direction of data transfer is unidirectional, which cannot support a loopback data stream and restricts the application scenarios. For example, results of non-adjacent processing flows are difficult to transfer to each other and the loopback data may not be processed correctly, which may cause problems such as confusion in the data processing and errors in the final processing result.
Based on the above problems, the present disclosure provides methods and systems for improving the directed acyclic graph.
SUMMARY
One aspect of the present disclosure may provide a system for performing a data processing task. The system may include: at least one storage device including a set of instructions; and at least one processor configured to communicate with the at least one storage device, wherein when executing the set of instructions, the at least one processor is configured to direct the system to perform operations including: obtaining a computational graph corresponding to a data processing task, the data processing task including a plurality of processing flows, the computational graph including a plurality of processing nodes representing the plurality of processing flows and directed edges each of which is configured to connect processing nodes among the plurality of processing nodes, a direction of a directed edge denoting a direction of data transfer between processing nodes connected by the directed edge; and performing the plurality of processing flows based on the computational graph by: determining a data processing strategy corresponding to a current processing node based on a hierarchy value of the current processing node and a hierarchy value of at least one associative processing node connected with the current processing node, a hierarchy value of a processing node being related to a count of edges between the processing node and an initial processing node; and in response to determining that the current processing node receives input data, performing a processing flow corresponding to the current processing node according to the data processing strategy based on the input data.
Another aspect of the present disclosure may provide a method for performing a data processing task. The method may include: obtaining a computational graph corresponding to a data processing task, the data processing task including a plurality of processing flows, the computational graph including a plurality of processing nodes representing the plurality of processing flows and directed edges each of which is configured to connect processing nodes among the plurality of processing nodes, a direction of a directed edge denoting a direction of data transfer between processing nodes connected by the directed edge; and performing the plurality of processing flows based on the computational graph by: determining a data processing strategy corresponding to a current processing node based on a hierarchy value of the current processing node and a hierarchy value of at least one associative processing node connected with the current processing node, a hierarchy value of a processing node being related to a count of edges between the processing node and an initial processing node; and in response to determining that the current processing node receives input data, performing a processing flow corresponding to the current processing node according to the data processing strategy based on the input data.
Another aspect of the present disclosure may provide a system for performing a data processing task. The system may include: an obtaining module configured to obtain a computational graph corresponding to a data processing task, the data processing task including a plurality of processing flows, the computational graph including a plurality of processing nodes representing the plurality of processing flows and directed edges each of which is configured to connect processing nodes among the plurality of processing nodes, a direction of a directed edge denoting a direction of data transfer between processing nodes connected by the directed edge; and a processing module configured to perform the plurality of processing flows based on the computational graph by: determining a data processing strategy corresponding to a current processing node based on a hierarchy value of the current processing node and a hierarchy value of at least one associative processing node connected with the current processing node, a hierarchy value of a processing node being related to a count of edges between the processing node and an initial processing node; and in response to determining that the current processing node receives input data, performing a processing flow corresponding to the current processing node according to the data processing strategy based on the input data.
The present disclosure is further illustrated in terms of exemplary embodiments, and these exemplary embodiments are described in detail with reference to the drawings. These embodiments are not restrictive. In these embodiments, the same number indicates the same structure, wherein:
FIG. 1 is a flowchart illustrating an exemplary process for performing a data processing task according to some embodiments of the present disclosure;
FIG. 2 is a flowchart illustrating an exemplary process for determining a data processing strategy corresponding to a current processing node according to some embodiments of the present disclosure;
FIG. 3 is a schematic diagram illustrating an exemplary processing of a directed acyclic graph according to some embodiments of the present disclosure;
FIG. 4 is a flowchart illustrating a process for task scheduling according to some embodiments of the present disclosure;
FIG. 5 is a schematic diagram illustrating a processing of a directed cyclic graph according to some embodiments of the present disclosure;
FIG. 6 is a schematic diagram illustrating a hierarchy value of each of a plurality of processing nodes in a directed cyclic graph according to some embodiments of the present disclosure;
FIG. 7 is a flowchart illustrating another exemplary process for task scheduling according to some embodiments of the present disclosure;
FIG. 8 is a schematic diagram illustrating a corresponding interaction of an asynchronous mode according to some embodiments of the present disclosure;
FIG. 9 is a schematic diagram illustrating a corresponding interaction of a synchronous mode according to some embodiments of the present disclosure;
FIG. 10 is a schematic diagram illustrating a task scheduling device according to some embodiments of the present disclosure;
FIG. 11 is a schematic diagram illustrating a computer readable storage medium according to some embodiments of the present disclosure;
FIG. 12 is a block diagram illustrating an exemplary system for performing a data processing task according to some embodiments of the present disclosure.
In order to illustrate the technical solutions related to the embodiments of the present disclosure, brief introduction of the drawings referred to in the description of the embodiments is provided below. Obviously, drawings described below are only some examples or embodiments of the present disclosure. Those having ordinary skills in the art, without further creative efforts, may apply the present disclosure to other similar scenarios according to these drawings. Unless stated otherwise or obvious from the context, the same reference numeral in the drawings refers to the same structure and operation.
It will be understood that the terms “system, ” “device, ” “unit, ” and/or “module” used herein are one method to distinguish different components, elements, parts, sections, or assemblies of different levels in ascending order. However, the terms may be replaced by other expressions if they achieve the same purpose.
As shown in the present disclosure and claims, unless the context clearly indicates exceptions, the words “a, ” “an, ” “one, ” and/or “the” do not specifically refer to the singular, but may also include the plural. The terms "including" and "comprising" only suggest that the steps and elements that have been clearly identified are included, and these steps and elements do not constitute an exclusive list, and the method or device may also include other steps or elements.
A flowchart is used in the present disclosure to illustrate the operations performed by the system according to the embodiments of the present disclosure. It should be understood that the preceding or subsequent operations are not necessarily performed accurately in sequence. Instead, the steps may be processed in reverse order or simultaneously. At the same time, other operations may be added to these procedures, or one or more operations may be removed from these procedures.
FIG. 1 is a flowchart illustrating an exemplary process for performing a data processing task according to some embodiments of the present disclosure. As shown in FIG. 1, a process 100 may include the following operations. In some embodiments, the process 100 may be performed by a processor (e.g., a processor 1020) .
In 110, a computational graph corresponding to a data processing task may be obtained.
The data processing task may refer to a task for processing data. For example, the data processing task may include storing the data, obtaining an image detection result by retrieving input data, encoding collected data, processing image data, etc.
In some embodiments, the data processing task may include a plurality of processing flows. For example, a data processing task may include processing flows such as collecting data, storing data, retrieving data, etc. A processing flow may include one or more processing operations.
In some embodiments, a processor (e.g., the processor 1020 in FIG. 10) may obtain a data processing result by performing the plurality of processing flows in sequence.
In some embodiments, the data processing task may be presented by a computational graph. The computational graph may be a directed graph configured to describe the data processing task. The computational graph may include one or more nodes and one or more directed edges each of which is configured to connect nodes among the plurality of nodes. In some embodiments, a node in the computational graph may refer to a processing node configured to represent the processing flow. As used herein, a processing node may also be referred to as a processing flow node. A processing flow may also be referred to as a node processing flow. More descriptions of the computational graph may be found elsewhere in the present disclosure, e.g., FIG. 3 to FIG. 9 and the descriptions thereof.
In some embodiments, the computational graph may be a directed acyclic graph (i.e., a directed graph without loops) . More descriptions of the directed acyclic graph may be found in FIG. 3 and the related descriptions. In some embodiments, the computational graph may be a directed cyclic graph. More descriptions of the directed cyclic graph may be found in FIG. 5 and the related descriptions.
In some embodiments, the computational graph may include a plurality of processing nodes and directed edges. The plurality of processing nodes may represent the plurality of processing flows. Each of the directed edges may be configured to connect processing nodes among the plurality of processing nodes. The directed edges may represent a relationship between the processing nodes (e.g., a data transmission relationship) . For example, a starting point of a directed edge may be or connect a processing node A, an ending point of the directed edge may be or connect a processing node B, and the directed edge may represent data transmission between the processing node A and the processing node B.
In some embodiments, a processing node corresponding to or connected with the starting point of the directed edge may refer to an input processing node, and a processing node corresponding to or connected with the ending point of the directed edge may refer to a receiving processing node.
In some embodiments, the processing node may be or include a program or a module that implements at least one algorithm. More descriptions of the processing node may be found in FIG. 4 and the related descriptions.
In some embodiments, the processor (e.g., a processor 1020) may obtain the computational graph in various possible manners, for example, receiving the computational graph sent by other devices or building the computational graph according to application needs or specific application scenarios. More descriptions of obtaining the computational graph may be found in FIG. 4 and the related descriptions.
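As an illustrative sketch (the class structure and node names are assumptions for exposition, not prescribed by the disclosure), a computational graph of processing nodes and directed edges, where each edge's direction denotes the direction of data transfer, might be represented as:

```python
from dataclasses import dataclass, field

@dataclass
class ProcessingNode:
    name: str                                    # identifies the processing flow this node represents
    inputs: list = field(default_factory=list)   # names of input (upstream) nodes
    outputs: list = field(default_factory=list)  # names of receiving (downstream) nodes

class ComputationalGraph:
    def __init__(self):
        self.nodes = {}

    def add_node(self, name):
        self.nodes[name] = ProcessingNode(name)

    def add_edge(self, src, dst):
        # A directed edge src -> dst: data is transferred from src to dst.
        self.nodes[src].outputs.append(dst)
        self.nodes[dst].inputs.append(src)

# A small graph with data flowing node1 -> node2 -> node3.
g = ComputationalGraph()
for n in ("node1", "node2", "node3"):
    g.add_node(n)
g.add_edge("node1", "node2")
g.add_edge("node2", "node3")
```

Each node records its input and receiving nodes, so both forward traversal (following outputs) and the later hierarchy-value comparisons (inspecting inputs) are straightforward.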
In 120, the plurality of processing flows may be performed based on the computational graph.
The processor may perform the plurality of processing flows according to the data transmission relationship represented by the computational graph. For example, the computational graph may include processing nodes 1, 2, and 3, representing processing flows A, B, and C, respectively, and the processing flow A may be an initial processing flow. The processing node 1 may be directed to the processing node 2, and the processing node 2 may be directed to the processing node 3. The direction herein may represent a direction of data flow. The processor may perform the above processing flows according to the sequence of the processing flow A, the processing flow B, and the processing flow C. During the execution, input data of the processing flow A may be preset data, a calculation result of the processing flow A may be input data of the processing flow B, and a calculation result of the processing flow B may be input data of the processing flow C.
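The chained execution just described, where each flow's calculation result becomes the next flow's input, can be sketched as follows. The flow bodies are placeholders chosen for illustration, not operations from the disclosure:

```python
# Placeholder processing flows; in the disclosure these would be operations
# such as collecting, storing, or retrieving data.
flows = {
    "A": lambda x: x + 1,
    "B": lambda x: x * 2,
    "C": lambda x: x - 3,
}
order = ["A", "B", "C"]  # execution sequence implied by the directed edges 1 -> 2 -> 3

def run(initial_input):
    data = initial_input          # preset input data of the initial flow A
    for name in order:
        data = flows[name](data)  # each calculation result feeds the next flow
    return data

print(run(5))  # ((5 + 1) * 2) - 3 = 9
```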
The plurality of processing flows may be performed according to following operations 121 and 122.
In 121, a data processing strategy corresponding to a current processing node may be determined based on a hierarchy value of the current processing node and a hierarchy value of at least one associative processing node connected with the current processing node. A hierarchy value of a processing node may be related to a count of edges between the processing node and an initial processing node.
The hierarchy value of a processing node may reflect an execution sequence of the processing node. The hierarchy value may be represented by any feasible data such as numeric values, characters, etc. For example, the hierarchy value of a processing node may be represented as a certain value between 0-100, and the larger the value is, the later the execution sequence of the processing node may be. As another example, the hierarchy value of a processing node may be represented as a certain letter between A-Z, and the later the letter is in the alphabet, the later the execution sequence of the processing node may be.
In some embodiments, the processor may assign a hierarchy value to each processing node in the computational graph. More descriptions of assigning the hierarchy value may be found in FIG. 4 and the related descriptions.
The current processing node may correspond to a current processing flow to be processed. For example, when the processing flow A needs to be processed currently, a node corresponding to the processing flow A may be a node 1, and the node 1 may be the current processing node.
In some embodiments, the processor may determine the current processing node in the plurality of processing nodes in sequence by performing each of the plurality of processing flows of the data processing task. More descriptions of the current processing node may be found in FIG. 4 and the related descriptions.
In some embodiments, the hierarchy value of the processing node may be related to a count of edges between the processing node and an initial processing node. A count of edges between two processing nodes may be equal to a difference between the hierarchy values of the processing nodes. When the hierarchy value of the initial processing node is known, a hierarchy value of the current processing node may be determined according to the count of edges between the current processing node and the initial processing node. Since the processing node may be used to represent the processing flow, the hierarchy value of the current processing node corresponding to the current processing flow may also be determined according to the count of edges between the current processing node corresponding to the current processing flow and the initial processing node corresponding to the initial processing flow. For example, FIG. 5 shows two data flow paths. One of the two data flow paths may be in the following sequence: a processing flow 1, a processing flow 2, a processing flow 3-1, a processing flow 4-1, and a processing flow 5. The processing node at the starting point of the data flow path may correspond to the processing flow 1, and the hierarchy value of the processing flow 1 may be determined as 0; the count of edges from the processing flow 3-1 to the processing flow 1 may be 2, and the hierarchy value of the processing flow 3-1 may be determined as 2; the count of edges from the processing flow 4-1 to the processing flow 1 may be 3, and the hierarchy value of the processing flow 4-1 may be determined as 3; the count of edges from the processing flow 5 to the processing flow 1 may be 4, and the hierarchy value of the processing flow 5 may be determined as 4.
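The edge-count assignment above can be sketched as a breadth-first traversal from the initial processing node. This is an illustrative sketch, not the disclosure's exact algorithm, and the choice of 0 as the initial node's hierarchy value is a convention; the figures in the disclosure may number nodes from a different starting value:

```python
from collections import deque

def hierarchy_values(edges, initial):
    """edges: iterable of (src, dst) forward edges; initial: the initial processing node."""
    succ = {}
    for src, dst in edges:
        succ.setdefault(src, []).append(dst)
    values = {initial: 0}    # the initial node is assigned hierarchy value 0 here
    queue = deque([initial])
    while queue:
        node = queue.popleft()
        for nxt in succ.get(node, []):
            if nxt not in values:               # first forward path to reach the node wins
                values[nxt] = values[node] + 1  # one more edge away from the initial node
                queue.append(nxt)
    return values

# The FIG. 5 path: flow 1 -> flow 2 -> flow 3-1 -> flow 4-1 -> flow 5.
edges = [("1", "2"), ("2", "3-1"), ("3-1", "4-1"), ("4-1", "5")]
print(hierarchy_values(edges, "1"))  # {'1': 0, '2': 1, '3-1': 2, '4-1': 3, '5': 4}
```

Only forward edges are walked, so a loopback edge (e.g., from flow 5 back to flow 2) would not disturb the values already assigned.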
In some embodiments, a processing node with a lower hierarchy value than the current processing node may be referred to as an upstream processing node, and a processing node with a higher hierarchy value than the current processing node may be referred to as a downstream processing node.
Input data from the upstream processing node received by the current processing node may be referred to as forward input data. For example, as shown in FIG. 5, output data of the processing node 2 whose hierarchy value may be 2 may be used as input data of the processing node 3 whose hierarchy value may be 3, and the output data of the processing node 2 may be referred to as the forward input data.
Input data from the downstream processing node received by the current processing node may be referred to as loopback input data. As shown in FIG. 5, output data of the processing node 5 whose hierarchy value may be 5 may be used as input data of the processing node 2 whose hierarchy value may be 2, and the output data of the processing node 5 may be referred to as the loopback input data.
The data processing strategy may refer to a processing manner of input data of the processing node. For example, the data processing strategy may be performing the processing flow corresponding to the current processing node by processing the input data. As another example, the data processing strategy may be caching the input data, receiving next input data, and performing the processing flow corresponding to the current processing node based on the next input data.
In some embodiments, the data processing strategy may be a scheduling strategy. More descriptions of the scheduling strategy and the process of determining the scheduling strategy may be found in FIG. 4, FIG. 7, and the related descriptions.
In some embodiments, the processor may obtain a determination result by determining whether the current processing node includes a loopback input path based on a relative relationship between the hierarchy value of the current processing node and the hierarchy value of the at least one associative processing node connected with the current processing node; and determine the data processing strategy corresponding to the current processing node based on the determination result. In some embodiments, the at least one associative processing node connected with the current processing node may include an input processing node connected with the current processing node. More descriptions of the embodiments above may be found in FIG. 2, FIG. 5, and the related descriptions.
In 122, in response to determining that the current processing node receives input data, a processing flow corresponding to the current processing node may be performed according to the data processing strategy based on the input data.
According to the descriptions of the computational graph, each of the plurality of processing nodes may receive output data of the starting point of the edge connected with the processing node as its own input data, and the processor may further perform the processing flow corresponding to the processing node based on the input data.
In response to determining that the current processing node receives input data, the processor may perform the processing flow corresponding to the current processing node according to the data processing strategy based on the input data.
In some embodiments, the current processing node may receive multiple types of input data, such as forward input data, loopback input data, etc. The current processing node may perform the processing flow corresponding to the current processing node according to the data processing strategy, so that the current processing node may correctly and flexibly process the different types of input data.
More descriptions of the embodiments above may be found in FIG. 4 and the related descriptions.
In some embodiments, since the hierarchy value for each processing node may be configured in the computational graph, the data processing strategy corresponding to the current processing node may be determined according to a relative relationship between the nodes, and the processing flow corresponding to the current processing node may be performed according to the data processing strategy, the current processing node may correctly process input data from upstream and downstream processing nodes. The manner may be applied to a data processing task scenario with the loopback input data, so as to ensure correct processing of the data and increase the flexibility of the data processing.
FIG. 2 is a flowchart illustrating an exemplary process for determining a data processing strategy corresponding to a current processing node according to some embodiments of the present disclosure. As shown in FIG. 2, process 200 may include the following operation. In some embodiments, the process 200 may be performed by the processor (e.g., the processor 1020) .
In 210, a determination result may be obtained by determining whether the current processing node includes a loopback input path based on a relative relationship between the hierarchy value of the current processing node and the hierarchy value of the at least one associative processing node connected with the current processing node.
The associative processing node connected with the current processing node may refer to a processing node that is directly associated with the current processing node (i.e., a previous processing node of the current processing node or a next processing node of the current processing node) . For example, the associative processing node connected with the current processing node may include an upstream processing node and/or a downstream processing node connected with the current processing node by an edge. There may be one or more associative processing nodes.
In some embodiments, the at least one associative processing node connected with the current processing node may include an input processing node connected with the current processing node. In some embodiments, the input processing node connected with the current processing node may include an upstream processing node connected with the current processing node and a downstream processing node connected with the current processing node.
In some embodiments, the processor may obtain the determination result by determining whether the current processing node includes a loopback input path based on a relative relationship between the hierarchy value of the current processing node and the hierarchy value of the at least one associative processing node connected with the current processing node.
The relative relationship may be a hierarchical relationship, a sequence relationship, or another relationship determined by a preset hierarchy rule between the hierarchy value of the current processing node and the hierarchy value of the at least one associative processing node connected with the current processing node. For example, if the hierarchy value of the current processing node is less than or equal to the hierarchy values of one or more input processing nodes, or a sequence of the current processing node (i.e., a sequence corresponding to the processing flow) is located before sequences of one or more input processing nodes connected with the current processing node, it may be determined that the current processing node includes the loopback input path.
In some embodiments, the relative relationship may be a hierarchical relationship between the processing nodes. In response to determining that the hierarchy value of the current processing node is less than or equal to the hierarchy values of the one or more input processing nodes, the determination result may include that the current processing node includes the loopback input path. For example, as shown in FIG. 6, if a hierarchy value of the current processing node 2 is 2 and the input processing node connected with the current processing node 2 includes a processing node 5 with a hierarchy value 5, the current processing node 2 may include the loopback input path.
The input data that is transmitted from a node with a relatively high hierarchy value to a node with a relatively low hierarchy value through the loopback input path may be referred to as input data corresponding to the loopback input path. The input data corresponding to the loopback input path may also be referred to as loopback input data. In some embodiments, the loopback input data corresponding to the loopback input path may be identified, and the identification of the loopback input data may be referred to as loopback data identification. The processor may identify the loopback input data received by the processing node using the loopback data identification.
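The comparison above can be sketched in code: an input is tagged as loopback when the hierarchy value of its source node is greater than or equal to that of the destination node, following the relative relationship described for the determination result. This is an illustrative sketch, not code from the disclosure; the function names and the dict layout are assumptions.

```python
def is_loopback(source_hierarchy, dest_hierarchy):
    """Data traveling from a node with a hierarchy value greater than or
    equal to the destination's moves backward through a loopback input path."""
    return source_hierarchy >= dest_hierarchy

def tag_input(data, source_hierarchy, dest_hierarchy):
    """Attach a loopback data identification so the receiving processing
    node can distinguish loopback input data from forward input data."""
    return {"payload": data,
            "loopback": is_loopback(source_hierarchy, dest_hierarchy)}
```

For instance, with the hierarchy values of the FIG. 6 example, data sent from the processing node 5 to the processing node 2 would be tagged as loopback, while data sent from the processing node 1 to the processing node 2 would not.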
In 220, the data processing strategy corresponding to the current processing node may be determined based on the determination result.
In some embodiments, in response to determining that the current processing node includes the loopback input path, it may be determined that the data processing strategy corresponding to the current processing node is related to a type of the processing flow corresponding to the current processing node. In some embodiments, different types of the processing flow may correspond to different data processing strategies.
The type of the processing flow may be determined based on a preset type rule. For example, the type of the processing flow may include a type of business scenarios for various processing processes, a type of processing target data, etc.
It should be noted that when the current processing node includes the loopback input path, the input data of the current processing node may include at least two pieces of input data, such as at least one piece of forward input data and at least one piece of loopback input data. In some types of processing flows, the processing flow of the current processing node may process each piece of input data independently, i.e., the pieces of input data do not affect each other. In some types of processing flows, there may be a certain relationship between the processing of the pieces of input data, for example, a plurality of pieces of input data need to be combined for processing, there is a sequential relationship between the processing of the plurality of pieces of input data, etc.
In some embodiments, the data processing strategy may include a first data processing strategy.
In some embodiments, the first data processing strategy may also be referred to as an asynchronous mode of data processing.
The first data processing strategy may include: in response to receiving the input data from the current processing node, performing the processing flow corresponding to the current processing node by processing the input data.
The type of processing flow (e.g., a type of business scenario) corresponding to the first data processing strategy may include: a data release scenario. For example, in an application of the data release scenario, when image information is processed, the image data in the processing may be stored in a storage device. If the processing flow node 2 is a target processing flow node, the processing flow 2 may receive image data A transmitted by the processing flow 1, transmitting the image data A to the next processing flow node may be triggered, and the storage device may store the image data A. The processing flow node 5 may generate image data B by processing the image data A. After the image data A is used up, the image data A may still be stored in the storage device, and not cleaning it up in time may take up storage space. At this time, the processing flow node 5 may transmit the loopback input data to the processing flow 2, and the loopback input data may trigger the processing flow 2 to release the image data A stored in the storage device. In the above example, the processing manner of the processing flow 2 may include: transmitting data output by the previous processing flow node to the next processing flow node. The processing manner of the processing flow 2 may further include: releasing the data that has been used up. The above processing manners do not interfere with each other; for example, when releasing the image data A, the processing flow 2 may transmit image data C to the next processing flow node.
More descriptions of the asynchronous mode may be found in FIG. 8.
In some embodiments, the data processing strategy may include a second data processing strategy.
The second data processing strategy may include: in response to determining that the input data does not include loopback input data, determining whether the input data is first input data of the current processing node; in response to determining that the input data is the first input data of the current processing node, performing the processing flow corresponding to the current processing node by processing the input data; in response to determining that the input data is not the first input data of the current processing node, caching the input data, receiving next input data, and performing the processing flow corresponding to the current processing node based on the next input data. The first input data may represent first data obtained by the current processing node (e.g., first frame input data under a data matching scenario) .
A type of processing flow (e.g., a type of business scenario) corresponding to the second data processing strategy may include: a switch scenario, a data match scenario, etc. In an application of the switch scenario, when the processing manner of a current frame is determined based on data information of a previous frame, a conditional synchronization mode may be used. The current frame represents a current process of the processing flow, and the previous frame represents a previous process of the processing flow. For example, the conditional synchronization mode may be used when the flow branch to be performed next for the current frame is determined according to the data processing status of the previous frame. In an application of the data match scenario, when the current frame is matched with the previous frame, it may be necessary to wait until the loopback input data obtained by the data processing of the previous frame and the current frame data of the previous processing flow node reach the target processing flow node synchronously; the target processing flow node may then be triggered to perform the matching processing on the two pieces of data. Since each piece of data has a frame number, the two pieces of data on which the synchronous data processing is performed may be determined, which can avoid disorder of the sequence after caching a plurality of pieces of data.
The second data processing strategy may also be referred to as the conditional synchronization mode, and more descriptions of the conditional synchronization mode may be found in FIG. 7.
In some embodiments, the second data processing strategy may further include: the processor may determine whether the next input data includes the loopback input data; in response to determining that the next input data includes the loopback input data, the processing flow corresponding to the current processing node may be performed by processing the cached input data and the next input data.
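The caching behavior of the second data processing strategy described above can be sketched as a small stateful class: the first forward input is processed immediately, later forward inputs are cached, and a cached input is processed together with the loopback input that follows it. This is an illustrative sketch, not code from the disclosure; the class name, the `process` callback, and the single-slot cache are assumptions made for brevity.

```python
class ConditionalSyncNode:
    """Sketch of the second data processing strategy
    (conditional synchronization mode)."""

    def __init__(self, process):
        self.process = process   # user-supplied processing flow callback
        self.cache = None        # cached forward input awaiting loopback data
        self.seen_first = False

    def receive(self, data, loopback=False):
        if loopback:
            if self.cache is not None:
                # Combine the cached forward input with the loopback input.
                result = self.process(self.cache, data)
                self.cache = None
                return result
            return None
        if not self.seen_first:
            # The first input data is processed at once.
            self.seen_first = True
            return self.process(data, None)
        # Later forward inputs are cached until loopback data arrives.
        self.cache = data
        return None
```

A usage sketch: with `node = ConditionalSyncNode(lambda fwd, loop: (fwd, loop))`, the first forward input returns immediately, the second is cached, and the following loopback input releases the cached pair.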
More descriptions may be found in FIG. 7.
According to the embodiments of the present disclosure, the input data of the current processing node may be processed distinctively by determining various data processing strategies, which can realize flexible control of the data processing flow under various scenarios.
In some embodiments, the processor may determine the data processing strategy corresponding to the current processing node according to process 400 illustrated in FIG. 4.
FIG. 4 is a flowchart illustrating a process for task scheduling according to some embodiments of the present disclosure.
In S41, a directed cyclic graph may be obtained.
In a certain embodiment, the directed cyclic graph as shown in FIG. 5 can be constructed. The directed cyclic graph may include a processing node 1 to a processing node 5, the processing node 3 may include a processing node 3-1 and a processing node 3-2, and the processing node 4 may include a processing node 4-1 and a processing node 4-2. The processing node 2, the processing node 3, the processing node 4, and the processing node 5 may form a directed ring. Data to be processed may be obtained first, and the data to be processed may be image data, video data, text data, etc.; the data to be processed may be input into the processing node 1, and a processing result 1 may be obtained by processing the data to be processed using the processing node 1; the processing result 1 may be transmitted to the processing node 2, and a processing result 2 may be obtained by processing the processing result 1 using the processing node 2; the processing result 2 may be transmitted to the processing node 3-1 and the processing node 3-2 respectively, and a processing result 3 and a processing result 4 may be obtained; the processing result 3 and the processing result 4 may be transmitted to the processing node 4-1 and the processing node 4-2 respectively, and a processing result 5 and a processing result 6 may be obtained; the processing result 5 and the processing result 6 may be transmitted to the processing node 2, and a corresponding result may be obtained in a similar processing manner.
In S42, a corresponding hierarchy value may be configured for each processing node.
After the directed cyclic graph is obtained, a hierarchy value may be assigned to each processing node in the directed cyclic graph, and a processing node may correspond to a hierarchy value. Specifically, the hierarchy value may increase along a direction of data transfer of the processing nodes. For example, as shown in FIG. 6, a hierarchy value of the processing node 1 may be 0, a hierarchy value of the processing node 2 may be 1, the hierarchy values of the processing node 3-1 and the processing node 3-2 may be 2, the hierarchy values of the processing node 4-1 and the processing node 4-2 may be 3, and the hierarchy value of the processing node 5 may be 4.
In S43, the current processing node may be selected in sequence from the plurality of processing nodes.
After configuring the hierarchy values of all the processing nodes, the performing of the data processing task may be started, and the current processing node may be selected in sequence from the plurality of processing nodes. Specifically, the processing node with the lowest hierarchy value may be selected as the current processing node first; after the relevant operations of the first processing node are performed, the current processing node may be updated to a second processing node, and so on, until all the processing nodes are traversed.
In S44, a current scheduling strategy may be obtained by selecting a scheduling strategy (i.e., a data processing strategy) from preset scheduling strategies based on the hierarchy value of the current processing node and the hierarchy values of remaining processing nodes.
The processor may preset at least two scheduling strategies to obtain a preset scheduling strategy set, and a scheduling strategy may be a manner for scheduling the input data of the current processing node using a task scheduling device. By using a relationship between the hierarchy value of the current processing node and the hierarchy values of the processing nodes (i.e., the remaining processing nodes) other than the current processing node among all the processing nodes, the scheduling strategy to be used for transmitting the input data to the current processing node may be determined.
In S45, the current processing node may process the input data by inputting the input data of the current processing node into the current processing node using the current scheduling strategy.
After the current scheduling strategy is determined, the input data of the current processing node may be determined using the current scheduling strategy, and the input data may be transmitted to the current processing node such that the current processing node processes the input data. Further, after the input data is processed, the process may return to the operation of selecting the current processing node in sequence from the plurality of processing nodes, i.e., return to perform the operation S43, until a preset ending condition is satisfied. The preset ending condition may be that a count of performing times reaches a preset count, an ending instruction is received, or other reasonable conditions. For example, as shown in FIG. 6, the processing node 1 may be regarded as the current processing node; after the processing data in the processing node 1 is processed, the current processing node may be updated to the processing node 2, and so on, and an output result may be obtained finally.
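The loop formed by the operations S43-S45 can be sketched as follows. This is an illustrative sketch, not code from the disclosure; the `select_strategy` callback stands in for the strategy selection of S44, and a simple round count stands in for the preset ending condition.

```python
def run_task(nodes, hierarchy, select_strategy, rounds):
    """Traverse the processing nodes in order of hierarchy value,
    choose a scheduling strategy for each, and repeat until the
    preset ending condition (here, a round count) is satisfied."""
    # S43: select the current processing node in sequence,
    # starting with the lowest hierarchy value.
    order = sorted(nodes, key=lambda n: hierarchy[n])
    trace = []
    for _ in range(rounds):               # preset ending condition
        for current in order:
            strategy = select_strategy(current)   # S44: choose a strategy
            trace.append((current, strategy))     # S45: process the input data
    return trace
```

A usage sketch: with `hierarchy = {"n1": 0, "n2": 1}` and a selector that picks the first scheduling strategy for a node with a loopback input path, the trace visits `n1` before `n2` in every round.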
The embodiments provide a loopback task scheduling method based on the directed cyclic graph. By configuring the hierarchy value for each processing node in the directed cyclic graph and using a relationship between the hierarchy values of the processing nodes in the direction of data transfer, a scheduling strategy suitable for the current processing node may be selected, and the input data may be input into the current processing node using the scheduling strategy, which not only can make the current processing node process the data output by the upstream processing node, but also the data output by the downstream processing node. The method can be used in scenarios that need to transmit data from the downstream processing node to the upstream processing node, supports the scheduling of loopback data tasks, increases the flexibility of task scheduling, and strengthens the adaptability of task scheduling.
Please refer to FIG. 7. FIG. 7 is a flowchart illustrating another exemplary process for task scheduling according to some embodiments of the present disclosure. The method may include:
In S71, a directed cyclic graph may be obtained.
In S72, a corresponding hierarchy value may be configured for each processing node.
In S73, the current processing node may be selected from the plurality of processing nodes.
S71-S73 may be the same as S41-S43, which may not be repeated here.
In S74, whether the current processing node satisfies a first preset condition may be determined.
Whether the current processing node is a first processing node in the directed cyclic graph and whether the input data is first data received by the current processing node may be determined; in response to determining that the current processing node is the first processing node and the input data is the first data received by the current processing node, it may be determined that the current processing node satisfies the first preset condition.
In S75, in response to that the current processing node satisfies the first preset condition, the current processing node may process the data to be processed by inputting the data to be processed into the current processing node.
In response to detecting that the current processing node satisfies the first preset condition, it may indicate that the current processing node is the first processing node among all the processing nodes and receives data for the first time. At this time, the obtained data to be processed may be determined as the input data, and the current processing node may process the data to be processed after the data to be processed is transmitted to the current processing node.
In S76, in response to determining that the current processing node does not satisfy the first preset condition, it may be determined whether the current processing node satisfies a second preset condition.
In response to determining that the current processing node does not satisfy the first preset condition, it may indicate that the current processing node is not the first processing node or that it is not the first time that the current processing node receives the input data. At this time, it may be further determined whether the current processing node is the first processing node; in response to determining that the current processing node is not the first processing node, it may be determined that the current processing node satisfies the second preset condition.
In S77, in response to determining that the current processing node satisfies the second preset condition, it may be determined whether the hierarchy value of the current processing node is less than the hierarchy value of at least one input node of the current processing node.
In response to detecting that the current processing node satisfies the second preset condition, it may indicate that the current processing node is not the first processing node, i.e., there may be an input node (i.e., an upstream processing node) of the current processing node. It may be further determined whether the hierarchy value of the current processing node is less than the hierarchy value of at least one input node of the current processing node, which can determine whether the current processing node can receive the data output by a downstream processing node.
In S78, in response to determining that the hierarchy value of the current processing node is less than the hierarchy value of at least one input node of the current processing node, the current processing node may process the input data after the input data is input into the current processing node using the first scheduling strategy.
In different parts of the present disclosure, the loopback input data may also be referred to as loopback data. After it is determined that the current processing node is not the first processing node, the operations of identifying the loopback data may be performed. Specifically, identification information may be obtained by identifying the output data of a first input node, and the first input node may be an input node whose hierarchy value is greater than the hierarchy value of the current processing node. For example, as shown in FIG. 6, the processing node 2 may include two pieces of input data: one of the two pieces of input data may be from the processing node 1, and the other may be from the processing node 5, and it needs to be identified which piece of input data is the loopback data. A piece of input data may carry the hierarchy value of its starting point processing node (i.e., the input node) ; in response to determining that the hierarchy value carried by the input data is greater than the hierarchy value of the current processing node, the input data may be identified as the loopback data, i.e., the data output by the processing node 5 may be identified as the loopback data.
Further, identifying the loopback data may mainly facilitate distinguishing processing. Since the data carried by the loopback path is used to be combined with subsequent data rather than to output a corresponding result directly, the loopback data needs to be distinguished when the task scheduling is performed.
In response to detecting that the hierarchy value of the current processing node is less than the hierarchy value of at least one input node of the current processing node, it may indicate that at least one piece of input data of the current processing node is the loopback data, and the first scheduling strategy may be determined as a current scheduling strategy, i.e., the input data of the current processing node may be scheduled using the first scheduling strategy. Specifically, the first scheduling strategy may include: determining whether the output data of an input node includes the identification information; in response to determining that the output data of the input node includes the identification information, determining that the input node is the first input node; in response to determining that the output data of the input node does not include the identification information, determining that the input node is a second input node; and inputting the output data of the first input node and/or the output data of the second input node into the current processing node.
Further, the input data of the current processing node may need to be synchronized. As shown in FIG. 6, the processing node 2 may include two pieces of input data: one of the two pieces of input data may be from the processing node 1, and the other may be from the processing node 5. Therefore, a synchronization mechanism needs to be added to ensure that the data is correctly sent to the processing node 2. The embodiments provide two synchronization mechanisms, specifically as follows:
1) an asynchronous mode
After a first mode selection instruction corresponding to the asynchronous mode is received, the current processing node may process first output data after the first output data is input into the current processing node, the first output data may be output data of the first input node, and the first mode selection instruction may be an instruction generated by the mode selection of a user; and/or, the current processing node may process second output data after the second output data is input into the current processing node, and the second output data may be output data of the second input node.
For example, as shown in FIG. 8, when any one of the second output data output by the processing node 1 and the first output data of the processing node 5 reaches the processing node 2, the processing operations of the processing node 2 may be triggered.
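The triggering behavior of the asynchronous mode can be sketched as follows: whichever input reaches the node, forward or loopback, immediately triggers the node's processing, with no caching or waiting. This is an illustrative sketch, not code from the disclosure; the class name and the `process` callback are assumptions.

```python
class AsyncModeNode:
    """Sketch of the asynchronous mode (first data processing strategy):
    each arriving input triggers the processing operations on its own."""

    def __init__(self, process):
        self.process = process  # user-supplied processing flow callback

    def receive(self, data, loopback=False):
        # No synchronization mechanism: each arrival is processed independently,
        # so releasing old data and forwarding new data do not interfere.
        return self.process(data, loopback)
```

A usage sketch: `AsyncModeNode(lambda d, lb: ...).receive("A")` triggers processing immediately, and a later `receive("B", loopback=True)` triggers it again without waiting for any companion input.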
2) a conditional synchronization mode
After a second mode selection instruction corresponding to the conditional synchronization mode is received, second output data may be obtained by obtaining the output data of the second input node, and the second mode selection instruction may be an instruction generated by the mode selection of the user. Whether the second output data satisfies a third preset condition may be determined; in response to determining that the second output data satisfies the third preset condition, third output data may be obtained by inputting the second output data into the current processing node; in response to determining that the second output data does not satisfy the third preset condition, the first output data may be obtained by obtaining the output data of the first input node, and fourth output data may be generated based on the first output data. Specifically, the second output data may be cached; the current processing node may process the second output data and the first output data, and further obtain the fourth output data after the second output data and the first output data are input into the current processing node.
Further, whether the second output data is the first data received by the current processing node may be determined; in response to that the second output data is the first data received by the current processing node, it may be determined that the second output data satisfies the third preset condition.
In an embodiment, there exist usage scenarios that require data synchronization one by one. For example, as shown in FIG. 6, when the second output data output by the processing node 1 reaches the processing node 2, the first output data output by the processing node 5 may be sent simultaneously. Further, the data first output by the processing node 1 needs to be distinguished rather than synchronized with the data output from the processing node 5; at this time, the conditional synchronization mode may be selected. As shown in FIG. 9, whether the second output data output by the processing node 1 is the first data received by the processing node 2 may be determined; in response to determining that the second output data output by the processing node 1 is the first data received by the processing node 2, the second output data may be transmitted to the processing node 2; in response to determining that the second output data output by the processing node 1 is not the first data received by the processing node 2, the second output data may be cached, and the second output data may be transmitted to the processing node 2 simultaneously with the first output data once the first output data is generated.
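Because each piece of data carries a frame number, the one-by-one synchronization described above can pair each cached piece of forward data with the loopback data of the same frame, which avoids disorder of the sequence after several pieces of data are cached. A minimal sketch of such matching; the `(frame, payload)` tuple layout is an assumption, not taken from the disclosure.

```python
def match_by_frame(cached_forward, loopback_items):
    """Pair cached forward inputs with loopback inputs carrying the
    same frame number, preserving the order of the cached sequence.
    Both arguments are iterables of (frame_number, payload) tuples."""
    loop_by_frame = {frame: payload for frame, payload in loopback_items}
    return [(frame, fwd, loop_by_frame[frame])
            for frame, fwd in cached_forward
            if frame in loop_by_frame]       # unmatched frames keep waiting
```

A usage sketch: even if the loopback inputs arrive out of order, the pairs come out in the cached forward order, keyed by frame number.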
In S79, in response to determining that the hierarchy value of the current processing node is greater than the hierarchy values of all the input nodes, the current processing node may process the input data after the input data is input into the current processing node using the second scheduling strategy.
In different parts of the present disclosure, the second scheduling strategy may also be referred to as a second data processing strategy. In response to determining that the hierarchy value of the current processing node is greater than the hierarchy values of all the input nodes, it may indicate that the current processing node is a downstream node of all the input nodes, and the second scheduling strategy may be determined as the current scheduling strategy. The second scheduling strategy may include: determining the output data of the input node as the input data; and processing the output data of the input node using the current processing node by inputting the output data of the input node into the current processing node.
This embodiment adds the processing of loopback data on the basis of the directed acyclic graph. The loopback data may be identified by comparing the hierarchy values of the processing nodes, and the embodiments further provide a plurality of manners for synchronizing the loopback data with the ordinary data, so that a downstream processing node may transmit data to an upstream processing node for use, and the application range is relatively wide.
Please refer to FIG. 10, which is a schematic diagram illustrating a task scheduling device according to some embodiments of the present disclosure. A task scheduling device 1000 may include a storage device 1010 and a processor 1020 connected with each other. The storage device 1010 may be used to store a computer program, and the processor 1020 may implement the task scheduling manner in the above embodiments by executing the computer program. The task scheduling device 1000 may be a device such as a camera, a computer, or a server.
Please refer to FIG. 11, which is a schematic diagram illustrating a computer readable storage medium according to some embodiments of the present disclosure. A computer readable storage medium 1100 may be used to store a computer program 1110, and a processor may implement the task scheduling manner in the above embodiments by executing the computer program 1110.
The computer readable storage medium 1100 may be any medium that can store program code, such as a server, a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
FIG. 12 is a block diagram illustrating an exemplary system for performing a data processing task according to some embodiments of the present disclosure.
As shown in FIG. 12, a system 1200 for performing the data processing task may include an obtaining module 1210 and a processing module 1220.
The obtaining module 1210 may be used to obtain a computational graph corresponding to a data processing task, the data processing task may include a plurality of processing flows, the computational graph may include a plurality of processing nodes representing the plurality of processing flows and directed edges each of which is configured to connect processing nodes among the plurality of processing nodes, a direction of a directed edge may denote a direction of data transfer between processing nodes connected by the directed edge.
The processing module 1220 may be used to perform the plurality of processing flows based on the computational graph by: determining a data processing strategy corresponding to a current processing node based on a hierarchy value of the current processing node and a hierarchy value of at least one associative processing node connected with the current processing node, a hierarchy value of a processing node being related to a count of edges between the processing node and an initial processing node; and in response to determining that the current processing node receives input data, performing a processing flow corresponding to the current processing node according to the data processing strategy based on the input data.
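One plausible concrete reading of a hierarchy value (the count of edges on a shortest directed path from the initial processing node) can be sketched with a breadth-first traversal of the computational graph. The names below are hypothetical; the disclosure only states that the hierarchy value is related to the count of edges between a processing node and an initial processing node.

```python
from collections import deque

def hierarchy_values(edges, initial):
    """Sketch: assign each processing node a hierarchy value equal to
    the count of edges on a shortest directed path from the initial
    processing node, computed by breadth-first traversal. Edges toward
    already-visited nodes (e.g. loopback edges) are simply skipped."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    levels = {initial: 0}
    queue = deque([initial])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in levels:  # first visit = fewest edges from initial
                levels[nxt] = levels[node] + 1
                queue.append(nxt)
    return levels

# A three-node chain with a loopback edge from node 3 back to node 2:
levels = hierarchy_values([(1, 2), (2, 3), (3, 2)], initial=1)
# Node 3 (hierarchy 2) feeding node 2 (hierarchy 1) is a transfer from
# a downstream node to an upstream node, i.e. a loopback input path
# detectable by comparing hierarchy values.
```

The design choice here is that cycles in the graph terminate the traversal naturally, because a node is assigned a hierarchy value only on its first visit.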
In some embodiments, the processing module 1220 may obtain a determination result by determining whether the current processing node includes a loopback input path based on a relative relationship between the hierarchy value of the current processing node and the hierarchy value of the at least one associative processing node connected with the current processing node; and the processing module 1220 may determine the data processing strategy corresponding to the current processing node based on the determination result.
In some embodiments, in response to the determination result indicating that the current processing node includes the loopback input path, the processing module 1220 may determine that the data processing strategy corresponding to the current processing node is related to a type of the processing flow corresponding to the current processing node.
In some embodiments, in response to determining that the current processing node includes the loopback input path, the processing module 1220 may determine that the data processing strategy corresponding to the current processing node includes a first data processing strategy. The first data processing strategy may include: in response to receiving the input data from the current processing node, performing the processing flow corresponding to the current processing node by processing the input data.
In some embodiments, the processing module 1220 may identify loopback input data corresponding to the loopback input path in response to the determination result indicating that the current processing path includes the loopback input path.
In some embodiments, the processing module 1220 may determine that the data processing strategy corresponding to the current processing node includes a second data processing strategy. The second data processing strategy may include: in response to determining that the input data received by the current processing node does not include loopback input data, determining whether the input data is first input data of the current processing node; and performing the process flow corresponding to the current processing node based on a second determination result, including: in response to determining that the input data is the first input data of the current processing node, performing the process flow corresponding to the current processing node by processing the input data; or in response to determining that the input data is not the first input data of the current processing node, caching the input data, receiving next input data, and performing the process flow corresponding to the current processing node based on the next input data.
In some embodiments, the processing module 1220 may determine whether the next input data includes the loopback input data. In response to determining that the next input data includes the loopback input data, the processing flow corresponding to the current processing node may be performed by processing the cached input data and the next input data.
More description of the method for performing the data processing task and determining the data processing strategy of the current processing node may be found in FIG. 1, FIG. 2, and the related descriptions.
The current processing node may include a plurality of loopback data streams. Since the processing manners corresponding to different loopback data streams are different, the different loopback data streams need to be distinguished so that each loopback data stream can be dealt with accordingly. The hierarchy values in different loopback data streams may be the same, so the hierarchy value of the starting point of a loopback data stream may identify the loopback data stream but cannot distinguish the different loopback data streams. The different loopback data streams of the nodes may be distinguished in the following manner:
1.1) For a node related to a plurality of edges, the hierarchy values of the plurality of nodes connected by the directed edges may be distinguished. For example, the node 2 (FIG. 3) may include three edges connected with two nodes, and the hierarchy values of the two nodes may be 3.1 and 3.2 (i.e., nodes of the same hierarchy are distinguished by adding a decimal suffix).
1.2) Further, a data stream may include the hierarchy value of each of the plurality of nodes passed through starting from the initial input data, i.e., the data stream may include a sequence of hierarchy values corresponding to the sequence of nodes passed through. For example, a data processing path may be: the node 1, the node 2, and the node 3.1; the data stream of the processing node 3 may then include the sequence of hierarchy values (1, 2, 3.1).
1.3) Based on the sequence of hierarchy values in a loopback data stream, the data processing path may be known from the sequence of nodes passed through, and different loopback data streams that include a certain node may thus be distinguished.
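Manners 1.1) through 1.3) can be illustrated briefly as follows. The function name is hypothetical; the decimal suffixes 3.1 and 3.2 follow the example above.

```python
def extend_stream(tag, node_level):
    """Append the hierarchy value of a newly passed node to the data
    stream's tag, i.e. its sequence of hierarchy values (manner 1.2)."""
    return tag + (node_level,)

# Nodes of the same hierarchy reached by different edges are told
# apart by a decimal suffix (manner 1.1), e.g. 3.1 vs 3.2:
stream_a = extend_stream((1, 2), 3.1)
stream_b = extend_stream((1, 2), 3.2)

# Manner 1.3): the full sequence distinguishes two loopback data
# streams that share nodes 1 and 2 but took different branches.
assert stream_a == (1, 2, 3.1)
assert stream_a != stream_b
```

The point of the design is that a single hierarchy value is ambiguous across streams, while the full sequence uniquely records the path taken.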
In some embodiments, under the method for performing the data processing task in the embodiments of the present disclosure, an optimization direction of the data processing algorithms (e.g., image processing algorithms) in the processing flows of the data processing task (e.g., an image processing task) may further be determined.
For the convenience of description, the following 2.1)-2.6) take the image processing task as an example to illustrate the process of determining the optimization direction of data processing algorithms; other technical fields may refer to the following examples to determine the optimization direction of the data processing algorithms in their respective data processing tasks.
In the application scenario of image processing, the data processing task may be as follows: the initial input data may be an image to be processed, and a target task may be realized by processing the image in the plurality of data processing flows using various image processing algorithms, for example, detecting and identifying a target in the image.
2.1) According to the above embodiments, the data stream may include data processing information of each node passed through, such as the data processing time and the memory occupied by data processing, which may be represented by an information sequence similar to the sequence of hierarchy values. For example, if the data processing path is: the node 1, the node 2, and the node 3.1, the data stream input into the node 3 may include a sequence of data processing information (data processing information 1, data processing information 2, data processing information 3.1), where the node corresponding to each element in the information sequence is the same as in the sequence of hierarchy values.
2.2) Based on 2.1), when a final result is output, the data of the result may include the sequence of data processing information of the whole data processing task. When the whole data processing task includes a plurality of data processing path branches, each branch may correspond to a sequence of data processing information. Based on the sequences of data processing information of the whole data processing task, the processing time and occupied memory of each image processing algorithm may be obtained.
2.3) A plurality of different images (including multiple diverse types of images, e.g., images of various shapes, various degrees of clarity, and various scenes) may be obtained, and the corresponding sequence of data processing information may be obtained by performing the data processing task on each image.
2.4) Target features may be extracted from the plurality of images. The target features may be determined as needed; for example, if the goal is to evaluate the effect of the image processing algorithm on shape detection, features related to shape may be extracted, and if the goal is to evaluate the effect of the image processing algorithm on processing different scenes, features related to the scenes may be extracted. A plurality of classes may be obtained by clustering the plurality of images according to the target features, and a corresponding type may be determined according to the class center of each class (e.g., when clustering by shape features, the image shape corresponding to the class center of a certain class may be selected as the corresponding type).
2.5) For each class, an average value of the data processing information corresponding to the type may be calculated; that is, an average value sequence of the data processing information may be obtained by summing each item of the sequences of data processing information corresponding to all the images included in the type and then averaging.
2.6) The average value of the data processing information corresponding to each type may be determined as the data processing information of the images of that category, and the processing time and occupied memory for processing the images of each category using the image processing algorithms may be obtained. Further, the categories of images for which the processing performance of the image processing algorithm is relatively poor (e.g., a relatively long processing time or a relatively large occupied memory) may be determined, and those image types may be used as the optimization direction of the image processing algorithm. The image processing algorithm may then be optimized according to the optimization direction (e.g., by obtaining more samples based on the optimization direction and training and improving the image processing algorithm, such as the structure of the network).
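Steps 2.1)-2.6) can be sketched as follows, assuming the images have already been clustered into types and each image carries a per-node processing-time sequence. All names and numbers below are illustrative, not from the disclosure.

```python
from statistics import mean

def per_type_profile(samples):
    """Average the per-node processing-time sequences of all images in
    each clustered type (step 2.5), then flag the type with the worst
    total processing time as a candidate optimization direction (step
    2.6). `samples` maps an image type to a list of processing-time
    sequences, one sequence per image, one entry per processing node."""
    profiles = {
        img_type: [mean(times) for times in zip(*sequences)]
        for img_type, sequences in samples.items()
    }
    # The type whose total average processing time is largest is the
    # one the algorithm handles worst -> the optimization direction.
    worst = max(profiles, key=lambda t: sum(profiles[t]))
    return profiles, worst

profiles, worst = per_type_profile({
    "round":  [[1, 2], [3, 2]],   # times at node 1, node 2, per image
    "square": [[5, 6], [7, 8]],
})
```

Here the hypothetical "square" type averages the longest total processing time, so it would be selected as the optimization direction, e.g. by collecting more square-image samples for training.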
The above sequences (e.g., 2.1)-2.6), etc.) are merely used for the convenience of description and do not limit the sequence of the operations or the operation contents of the processing flow.
It should be noted that the above description is provided for illustrative purposes only and is not intended to limit the scope of the present disclosure. For those skilled in the art, many changes and modifications can be made under the guidance of the content of the present disclosure. The features, structures, methods, and other features of the exemplary embodiments described herein may be combined in various ways to obtain additional and/or alternative exemplary embodiments. However, these changes and modifications may not deviate from the scope of the present disclosure.
Having thus described the basic concepts, it may be rather apparent to those skilled in the art after reading this detailed disclosure that the foregoing detailed disclosure is intended to be presented by way of example only and is not limiting. Various alterations, improvements, and modifications may occur to those skilled in the art, though not expressly stated herein. These alterations, improvements, and modifications are intended to be suggested by this disclosure and are within the spirit and scope of the exemplary embodiments of this disclosure.
Moreover, certain terminology has been used to describe embodiments of the present disclosure. For example, the terms "one embodiment," "an embodiment," and/or "some embodiments" mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various portions of the specification are not necessarily all referring to the same embodiment. In addition, some features, structures, or characteristics in one or more embodiments of the present disclosure may be appropriately combined.
Moreover, unless otherwise specified in the claims, the sequence of processing elements and sequences described in the present application, the use of numbers and letters, or the use of other names is not intended to limit the order of the flows and methods of the application. Although the above disclosure discusses through various examples what is currently considered to be a variety of useful embodiments of the disclosure, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover modifications and equivalent arrangements that are within the spirit and scope of the disclosed embodiments. For example, although the implementation of various assemblies described above may be embodied in a hardware device, it may also be implemented as a software-only solution, e.g., an installation on an existing server or mobile device.
Similarly, it should be appreciated that in the foregoing description of embodiments of the present disclosure, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various embodiments. However, this does not mean that the claimed subject matter requires more features than are mentioned in the claims. In fact, the claimed subject matter may lie in less than all of the features of a single embodiment disclosed above.
In some embodiments, the numbers expressing quantities, properties, and so forth, used to describe and claim certain embodiments of the application are to be understood as being modified in some instances by the term "about," "approximate," or "substantially." Unless otherwise stated, "about," "approximate," or "substantially" may indicate a ±20% variation of the value it describes. Accordingly, in some embodiments, the numerical parameters set forth in the description and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by a particular embodiment. In some embodiments, the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Although the numerical ranges and parameters used in the present application are approximations, in specific embodiments such values are set as precisely as is feasible.
Each patent, patent application, patent application publication, and other materials cited herein, such as articles, books, instructions, publications, documents, etc., are hereby incorporated by reference in their entirety. Application history documents that are inconsistent with or conflict with the contents of the present disclosure are excluded, as are documents (currently or later attached to this application) that may limit the broadest scope of the claims of the present disclosure. It should be noted that if the descriptions, definitions, and/or terms used in the appended materials of the present application are inconsistent or conflict with the content described in the present disclosure, the descriptions, definitions, and/or terms of the present disclosure shall prevail.
Finally, it should be understood that the embodiments described in the disclosure are used only to illustrate the principles of the embodiments of the present application. Other modifications may be within the scope of the present disclosure. Thus, by way of example, but not of limitation, alternative configurations of the embodiments of the present disclosure may be utilized in accordance with the teachings herein. Accordingly, embodiments of the present disclosure are not limited to those precisely as shown and described.
Claims (16)
- A system for performing a data processing task, comprising:at least one storage device including a set of instructions; andat least one processor configured to communicate with the at least one storage device, wherein when executing the set of instructions, the at least one processor is configured to direct the system to perform operations including:obtaining a computational graph corresponding to a data processing task, the data processing task including a plurality of processing flows, the computational graph including a plurality of processing nodes representing the plurality of processing flows and directed edges each of which is configured to connect processing nodes among the plurality of processing nodes, a direction of a directed edge denoting a direction of data transfer between processing nodes connected by the directed edge; andperforming the plurality of processing flows based on the computational graph by:determining a data processing strategy corresponding to a current processing node based on a hierarchy value of the current processing node and a hierarchy value of at least one associative processing node connected with the current processing node, a hierarchy value of a processing node being related to a count of edges between the processing node and an initial processing node; andin response to determining that the current processing node receives input data, performing a processing flow corresponding to the current processing node according to the data processing strategy based on the input data.
- The system of claim 1, wherein the determining a data processing strategy corresponding to a current processing node includes:obtaining a determination result by determining whether the current processing node includes a loopback input path based on a relative relationship between the hierarchy value of the current processing node and the hierarchy value of the at least one associative processing node connected with the current processing node; anddetermining the data processing strategy corresponding to the current processing node based on the determination result.
- The system of claim 2, wherein the determination result indicates that the current processing node includes the loopback input path, andthe determining the data processing strategy corresponding to the current processing node based on the determination result includes:determining that the data processing strategy corresponding to the current processing node is related to a type of the processing flow corresponding to the current processing node.
- The system of claim 2, wherein in response to determining that the current processing node includes the loopback input path, the data processing strategy corresponding to the current processing node includes a first data processing strategy;wherein the first data processing strategy includes: in response to receiving the input data from the current processing node, performing the processing flow corresponding to the current processing node by processing the input data.
- The system of claim 2, wherein the operations further include:identifying loopback input data corresponding to the loopback input path in response to the determination result indicating that the current processing path includes the loopback input path.
- The system of claim 2, whereinthe data processing strategy corresponding to the current processing node includes a second data processing strategy,wherein the second data processing strategy includes:in response to determining that the input data received by the current processing node does not include loopback input data, determining whether the input data is first input data of the current processing node; andperforming the process flow corresponding to the current processing node based on a second determination result including:in response to determining that the input data is the first input data of the current processing node, performing the process flow corresponding to the current processing node by processing the input data; orin response to determining that the input data is not the first input data of the current processing node, caching the input data, receiving next input data, and performing the process flow corresponding to the current processing node based on the next input data.
- The system of claim 6, wherein the performing the process flow corresponding to the current processing node based on the next input data includes:determining whether the next input data includes the loopback input data;in response to determining that the next input data includes the loopback input data, performing the processing flow corresponding to the current processing node by processing the cached input data and the next input data.
- A method for performing a data processing task, comprising:obtaining a computational graph corresponding to a data processing task, the data processing task including a plurality of processing flows, the computational graph including a plurality of processing nodes representing the plurality of processing flows and directed edges each of which is configured to connect processing nodes among the plurality of processing nodes, a direction of a directed edge denoting a direction of data transfer between processing nodes connected by the directed edge; andperforming the plurality of processing flows based on the computational graph by:determining a data processing strategy corresponding to a current processing node based on a hierarchy value of the current processing node and a hierarchy value of at least one associative processing node connected with the current processing node, a hierarchy value of a processing node being related to a count of edges between the processing node and an initial processing node; andin response to determining that the current processing node receives input data, performing a processing flow corresponding to the current processing node according to the data processing strategy based on the input data.
- The method of claim 8, wherein the determining a data processing strategy corresponding to a current processing node includes:obtaining a determination result by determining whether the current processing node includes a loopback input path based on a relative relationship between the hierarchy value of the current processing node and the hierarchy value of the at least one associative processing node connected with the current processing node; anddetermining the data processing strategy corresponding to the current processing node based on the determination result.
- The method of claim 9, wherein the determination result indicates that the current processing node includes the loopback input path, andthe determining the data processing strategy corresponding to the current processing node based on the determination result includes:determining that the data processing strategy corresponding to the current processing node is related to a type of the processing flow corresponding to the current processing node.
- The method of claim 9, wherein in response to determining that the current processing node includes the loopback input path, the data processing strategy corresponding to the current processing node includes a first data processing strategy;wherein the first data processing strategy includes: in response to receiving the input data from the current processing node, performing the processing flow corresponding to the current processing node by processing the input data.
- The method of claim 9, wherein the operations further include:identifying loopback input data corresponding to the loopback input path in response to the determination result indicating that the current processing path includes the loopback input path.
- The method of claim 12, whereinthe data processing strategy corresponding to the current processing node includes a second data processing strategy,wherein the second data processing strategy includes:in response to determining that the input data received by the current processing node does not include loopback input data, determining whether the input data is first input data of the current processing node; andperforming the process flow corresponding to the current processing node based on a second determination result including:in response to determining that the input data is the first input data of the current processing node, performing the process flow corresponding to the current processing node by processing the input data; orin response to determining that the input data is not the first input data of the current processing node, caching the input data, receiving next input data, and performing the process flow corresponding to the current processing node based on the next input data.
- The method of claim 13, wherein the performing the process flow corresponding to the current processing node based on the next input data includes:determining whether the next input data includes the loopback input data;in response to determining that the next input data includes the loopback input data, performing the processing flow corresponding to the current processing node by processing the cached input data and the next input data.
- A system for performing a data processing task, comprising:an obtaining module configured to obtain a computational graph corresponding to a data processing task, the data processing task including a plurality of processing flows, the computational graph including a plurality of processing nodes representing the plurality of processing flows and directed edges each of which is configured to connect processing nodes among the plurality of processing nodes, a direction of a directed edge denoting a direction of data transfer between processing nodes connected by the directed edge; anda processing module configured to perform the plurality of processing flows based on the computational graph by:determining a data processing strategy corresponding to a current processing node based on a hierarchy value of the current processing node and a hierarchy value of at least one associative processing node connected with the current processing node, a hierarchy value of a processing node being related to a count of edges between the processing node and an initial processing node; andin response to determining that the current processing node receives input data, performing a processing flow corresponding to the current processing node according to the data processing strategy based on the input data.
- A non-transitory computer readable medium, comprising at least one set of instructions, wherein when executed by one or more processors of a computing device, the at least one set of instructions causes the computing device to perform a method, the method comprising:obtaining a computational graph corresponding to a data processing task, the data processing task including a plurality of processing flows, the computational graph including a plurality of processing nodes representing the plurality of processing flows and directed edges each of which is configured to connect processing nodes among the plurality of processing nodes, a direction of a directed edge denoting a direction of data transfer between processing nodes connected by the directed edge; andperforming the plurality of processing flows based on the computational graph by:determining a data processing strategy corresponding to a current processing node based on a hierarchy value of the current processing node and a hierarchy value of at least one associative processing node connected with the current processing node, a hierarchy value of a processing node being related to a count of edges between the processing node and an initial processing node; andin response to determining that the current processing node receives input data, performing a processing flow corresponding to the current processing node according to the data processing strategy based on the input data.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210412395.0A CN114510338B (en) | 2022-04-19 | 2022-04-19 | Task scheduling method, task scheduling device and computer readable storage medium |
CN202210412395.0 | 2022-04-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023202005A1 true WO2023202005A1 (en) | 2023-10-26 |
Family
ID=81555399
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/125155 WO2023202005A1 (en) | 2022-04-19 | 2022-10-13 | Methods and systems for performing data processing tasks |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114510338B (en) |
WO (1) | WO2023202005A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114510338B (en) * | 2022-04-19 | 2022-09-06 | 浙江大华技术股份有限公司 | Task scheduling method, task scheduling device and computer readable storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107315834A (en) * | 2017-07-12 | 2017-11-03 | 广东奡风科技股份有限公司 | A kind of ETL work flow analysis methods based on breadth-first search |
CN111814002A (en) * | 2019-04-12 | 2020-10-23 | 阿里巴巴集团控股有限公司 | Directed graph identification method and system and server |
US20210019184A1 (en) * | 2019-07-17 | 2021-01-21 | Google Llc | Scheduling operations on a computation graph |
CN112766907A (en) * | 2021-01-20 | 2021-05-07 | 中国工商银行股份有限公司 | Service data processing method and device and server |
CN113176975A (en) * | 2021-03-30 | 2021-07-27 | 东软集团股份有限公司 | Monitoring data processing method and device, storage medium and electronic equipment |
CN113986503A (en) * | 2021-10-29 | 2022-01-28 | 中国平安人寿保险股份有限公司 | Task scheduling method, task scheduling device, task scheduling apparatus, and storage medium |
US20220043688A1 (en) * | 2018-09-11 | 2022-02-10 | Huawei Technologies Co., Ltd. | Heterogeneous Scheduling for Sequential Compute Dag |
CN114510338A (en) * | 2022-04-19 | 2022-05-17 | 浙江大华技术股份有限公司 | Task scheduling method, task scheduling device and computer readable storage medium |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9553796B2 (en) * | 2013-03-15 | 2017-01-24 | Cisco Technology, Inc. | Cycle-free multi-topology routing |
CN112214289A (en) * | 2019-07-11 | 2021-01-12 | 腾讯科技(深圳)有限公司 | Task scheduling method and device, server and storage medium |
CN111314138B (en) * | 2020-02-19 | 2021-08-31 | 腾讯科技(深圳)有限公司 | Detection method of directed network, computer readable storage medium and related equipment |
CN113434265A (en) * | 2020-03-23 | 2021-09-24 | 阿里巴巴集团控股有限公司 | Workflow scheduling method, server and medium |
US11425000B2 (en) * | 2020-07-01 | 2022-08-23 | Paypal, Inc. | On-the-fly reorganization of directed acyclic graph nodes of a computing service for high integration flexibility |
CN112114960B (en) * | 2020-08-06 | 2022-11-01 | 河南大学 | Scheduling strategy for remote sensing image parallel cluster processing adapting to internet scene |
CN112506636A (en) * | 2020-12-16 | 2021-03-16 | 北京中天孔明科技股份有限公司 | Distributed task scheduling method and device based on directed acyclic graph and storage medium |
CN113282586A (en) * | 2021-06-01 | 2021-08-20 | 马上消费金融股份有限公司 | Information processing method, device, equipment and readable storage medium |
CN113672369A (en) * | 2021-08-20 | 2021-11-19 | 北京明略软件系统有限公司 | Method and device for verifying ring of directed acyclic graph, electronic equipment and storage medium |
CN113535367B (en) * | 2021-09-07 | 2022-01-25 | 北京达佳互联信息技术有限公司 | Task scheduling method and related device |
CN114168287A (en) * | 2021-12-07 | 2022-03-11 | 上海软素科技有限公司 | Task scheduling method and device, readable storage medium and electronic equipment |
CN114327692A (en) * | 2021-12-23 | 2022-04-12 | 杭州安恒信息技术股份有限公司 | Task flow direction identification method and system, electronic equipment and storage medium |
CN114301785A (en) * | 2021-12-30 | 2022-04-08 | 山石网科通信技术股份有限公司 | Method and device for determining service relationship of computer and storage medium |
2022
- 2022-04-19 CN CN202210412395.0A patent/CN114510338B/en active Active
- 2022-10-13 WO PCT/CN2022/125155 patent/WO2023202005A1/en unknown
Also Published As
Publication number | Publication date |
---|---|
CN114510338A (en) | 2022-05-17 |
CN114510338B (en) | 2022-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230024840A1 (en) | Calculation method and related product | |
CN111985229B (en) | Sequence labeling method and device and computer equipment | |
CN108664999B (en) | Training method and device of classification model and computer server | |
US9882844B2 (en) | Opaque message parsing | |
CN102197394B (en) | Digital image retrieval by aggregating search results based on visual annotations | |
CN111783767B (en) | Character recognition method, character recognition device, electronic equipment and storage medium | |
CN109684087B (en) | Operation method, device and related product | |
WO2018102062A1 (en) | Distributed assignment of video analytics tasks in cloud computing environments to reduce bandwidth utilization | |
CN108875955A (en) | Gradient based on parameter server promotes the implementation method and relevant device of decision tree | |
CN111742333A (en) | Method and apparatus for performing deep neural network learning | |
WO2023202005A1 (en) | Methods and systems for performing data processing tasks | |
KR20210102039A (en) | Electronic device and control method thereof | |
CN115244587A (en) | Efficient ground truth annotation | |
Heo et al. | Graph neural network based service function chaining for automatic network control | |
CN113989549A (en) | Semi-supervised learning image classification optimization method and system based on pseudo labels | |
Yao et al. | Network cooperation with progressive disambiguation for partial label learning | |
Yu et al. | Lifelong event detection with knowledge transfer | |
CN113961267B (en) | Service processing method, device and equipment | |
US20170091244A1 (en) | Searching a Data Structure | |
WO2023202006A1 (en) | Systems and methods for task execution | |
Trajdos et al. | An extension of multi-label binary relevance models based on randomized reference classifier and local fuzzy confusion matrix | |
CN115186738B (en) | Model training method, device and storage medium | |
US20190279080A1 (en) | Neural network systems and methods for application navigation | |
CN114299517A (en) | Image processing method, apparatus, device, storage medium, and computer program product | |
JP6993250B2 (en) | Content feature extractor, method, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22938228; Country of ref document: EP; Kind code of ref document: A1 |