CN112269648A - Parallel task allocation method and device for multi-stage program analysis - Google Patents

Parallel task allocation method and device for multi-stage program analysis

Info

Publication number
CN112269648A
CN112269648A (application CN202011272405.2A)
Authority
CN
China
Prior art keywords
task
tasks
analysis
stage
graph
Prior art date
Legal status
Granted
Application number
CN202011272405.2A
Other languages
Chinese (zh)
Other versions
CN112269648B
Inventor
陈睿
江云松
肖志恒
王峥
贾春鹏
高栋栋
于婷婷
丁戈
朱玉钊
Current Assignee
Beijing Sunwise Information Technology Ltd
Original Assignee
Beijing Sunwise Information Technology Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sunwise Information Technology Ltd
Priority to CN202011272405.2A (granted as CN112269648B)
Publication of CN112269648A
Application granted
Publication of CN112269648B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a parallel task allocation method and device for multi-stage program analysis. The method comprises the following steps: constructing a task relationship graph corresponding to the code to be analyzed according to the dependencies among all tasks in the code; obtaining the analysis tasks to be run on the code; dividing the analysis tasks into stages according to the task relationship graph to obtain stage task sets, each of which comprises at least one task that can be executed in parallel; and running the tasks in each stage task set according to the number of concurrently running tasks and obtaining the task running results. The invention makes fuller use of hardware performance, shortens the overall analysis time, and avoids the problem of all checkers' results accumulating in a single result file that becomes too large to read conveniently when there are many findings.

Description

Parallel task allocation method and device for multi-stage program analysis
Technical Field
The invention relates to the technical field of code analysis, and in particular to a parallel task allocation method and device for multi-stage program analysis.
Background
Code analysis is an important means of ensuring the correctness of software. When high-precision static analysis is applied to modern large-scale, complex software systems (such as the Linux operating system, with tens of millions of lines of code), higher analysis precision usually means longer analysis time. The mutual constraints among precision, efficiency and scalability are a major obstacle to the industrial adoption of static analysis.
Much optimization work has been done to improve analysis efficiency, including single-machine CPU parallelism, distributed execution and GPU implementations. Since the system targeted by the invention runs in a single-machine environment, the invention focuses on single-machine CPU parallelism. Existing parallel analysis algorithms include the parallel points-to analysis based on constraint-graph rewriting proposed by Mendez-Lojo et al. and the actor-model-based parallel data-flow analysis proposed by Rodriguez et al.; both are tailored to a specific class of static analysis problems and are not general. Albarghouthi et al. propose a general framework that parallelizes top-down analysis with a map-reduce strategy, but it depends heavily on memory and becomes limited in what it can compute when memory is insufficient.
When several types of checkers are used to analyze a project at the same time, some checkers have prerequisite tasks. For example, a checker that analyzes global variables first requires program construction and then the computation of the function call relations, so its analysis has prerequisite dependencies and must proceed in multiple stages. If every checker simply waits for its prerequisites to finish and all checkers run one after another, the analysis of the whole project is inefficient.
Disclosure of Invention
The technical problem solved by the invention is to overcome the defects of the prior art by providing a parallel task allocation method and device for multi-stage program analysis.
In order to solve the above technical problem, an embodiment of the present invention provides a parallel task allocation method for multi-stage program analysis, including:
constructing a task relationship graph corresponding to the code to be analyzed according to the dependencies among all tasks in the code to be analyzed;
obtaining the analysis tasks to be run on the code to be analyzed;
dividing the analysis tasks into stages according to the task relationship graph and the analysis tasks to obtain stage task sets, wherein each stage task set comprises at least one task that can be executed in parallel;
and running the tasks in each stage task set according to the number of concurrently running tasks, and obtaining the task running results.
Optionally, constructing the task relationship graph corresponding to the code to be analyzed according to the dependencies among all tasks in the code to be analyzed includes:
taking all the tasks as graph nodes;
and connecting the graph nodes that have dependencies, according to those dependencies, to generate the task relationship graph.
Optionally, dividing the analysis tasks into stages according to the task relationship graph and the analysis tasks to obtain stage task sets includes:
obtaining target graph nodes of the analysis tasks in the task relationship graph;
obtaining, among the target graph nodes, first graph nodes that act as parent nodes, and assigning the first graph nodes to a stage task set;
deleting the first graph nodes from the target graph nodes, and cyclically performing, on the remaining target graph nodes, the steps of obtaining the first graph nodes that act as parent nodes and assigning them to a stage task set, until all analysis tasks have been assigned to stage task sets.
Optionally, running the tasks in each stage task set according to the number of concurrently running tasks and obtaining the task running results includes:
determining the number of concurrently running tasks according to the performance of the running device and the average memory occupied by an analysis task;
and running the analysis tasks in the stage task sets stage by stage, according to the running order of the stage task sets and the number of concurrently running tasks, to obtain the task running results.
Optionally, running the analysis tasks in the stage task sets stage by stage according to the running order of the stage task sets and the number of concurrently running tasks to obtain the task running results includes:
sending the tasks run in the first stage, through a task scheduler, to a task executor for execution;
after all first-stage tasks have been executed, computing, according to the process exit codes returned by the executor, the first tasks in the second stage that do not need to be executed, removing the first tasks, and sending the remaining tasks to the executor for execution;
and after the run is finished, integrating the results of the analysis tasks selected by the user through a result data integration part to obtain the task running results.
In order to solve the above technical problem, an embodiment of the present invention further provides a parallel task allocation apparatus for multi-stage program analysis, including:
a task relationship graph building module, configured to construct a task relationship graph corresponding to the code to be analyzed according to the dependencies among all tasks in the code to be analyzed;
an analysis task obtaining module, configured to obtain the analysis tasks to be run on the code to be analyzed;
a task set obtaining module, configured to divide the analysis tasks into stages according to the task relationship graph and the analysis tasks to obtain stage task sets, wherein each stage task set comprises at least one task that can be executed in parallel;
and a running result obtaining module, configured to run the tasks in each stage task set according to the number of concurrently running tasks and obtain the task running results.
Optionally, the task relationship graph building module includes:
a graph node obtaining unit, configured to take all the tasks as graph nodes;
and a task relationship graph generating unit, configured to connect the graph nodes that have dependencies, according to those dependencies, to generate the task relationship graph.
Optionally, the task set obtaining module includes:
a target graph node obtaining unit, configured to obtain target graph nodes of the analysis tasks in the task relationship graph;
a first graph node obtaining unit, configured to obtain, among the target graph nodes, first graph nodes that act as parent nodes, and assign the first graph nodes to a stage task set;
and a task set obtaining unit, configured to delete the first graph nodes from the target graph nodes and cyclically run the first graph node obtaining unit on the remaining target graph nodes, until all analysis tasks have been assigned to stage task sets.
Optionally, the running result obtaining module includes:
a concurrent task number determining unit, configured to determine the number of concurrently running tasks according to the performance of the running device and the average memory occupied by an analysis task;
and a task running result obtaining unit, configured to run the analysis tasks in the stage task sets stage by stage, according to the running order of the stage task sets and the number of concurrently running tasks, to obtain the task running results.
Optionally, the task running result obtaining unit includes:
a first-stage execution subunit, configured to send the tasks run in the first stage, through the task scheduler, to the task executor for execution;
a remaining-task execution subunit, configured to, after all first-stage tasks have been executed, compute, according to the process exit codes returned by the executor, the first tasks in the second stage that do not need to be executed, remove the first tasks, and send the remaining tasks to the executor for execution;
and a running result obtaining subunit, configured to integrate, after the run is finished, the results of the analysis tasks selected by the user through the result data integration part to obtain the task running results.
Compared with the prior art, the invention has the advantages that:
according to the method, the dependency among tasks is calculated, so that the overall planning is effectively carried out on a plurality of analysis processes, and the analysis tasks are executed in parallel according to the priority stages in sequence, so that the problem that the speed of analyzing the tasks by linearly operating the multi-stage multi-checker is too low is solved; according to the method, the number of processes which can run simultaneously is calculated according to the number of tasks, the size of the memory of the PC and the number of the cores of the PC, and the analysis tasks which run in a single stage are executed concurrently according to the calculated number of the parallel tasks, so that the performance of hardware can be exerted to a greater extent, and the overall analysis time is shortened; the invention adopts a redirection output mode to independently form the analysis result of each checker of a project into a file, thereby effectively solving the problems that the results of all the checkers are accumulated in the same result file, and the result file is too large and inconvenient to read when the checking results are more.
Drawings
FIG. 1 is a flowchart illustrating steps of a method for parallel task allocation for multi-stage program analysis according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a parallel task allocation apparatus for multi-stage program analysis according to an embodiment of the present invention.
Detailed Description
To address the long running time and low efficiency of analyzing large-scale projects with multiple multi-stage checkers, the embodiments of the invention aim to schedule the analysis tasks sensibly, exploit hardware performance as fully as possible, and reduce the total analysis time. The system comprises four parts: analysis task setting, analysis task calculation, analysis task execution and result data integration. Analysis task setting consists of checker setting and parameter setting; analysis task calculation consists of analysis task generation, analysis task dependency calculation and concurrency count calculation; analysis task execution is divided into an analysis task scheduler and an analysis task executor. The invention models the dependencies of the tasks to be executed in each stage of program analysis as a directed graph data structure. Each node of the graph represents a task to be analyzed; the directed edges are the dependencies between tasks, i.e. the parent nodes of each task are the task nodes it depends on. All edges must be unidirectional and must not form a ring: when a task node is traversed upward through the tasks it depends on, the traversal must never reach the task node itself. A node may have multiple parents or multiple children, i.e. a task may depend on several tasks, and several tasks may depend on one task.
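For illustration only, the directed graph described above can be captured by a small data structure. The following Python sketch (the names TaskNode and add_dependency are illustrative and do not come from the patent) keeps, for each task, the tasks it depends on and the tasks that depend on it, and rejects any edge that would close a ring by walking upward through the dependencies:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(eq=False)  # identity-based equality and hashing, so nodes can be kept in sets
class TaskNode:
    """One analysis task in the task relationship graph."""
    name: str
    parents: List["TaskNode"] = field(default_factory=list)   # tasks this task depends on
    children: List["TaskNode"] = field(default_factory=list)  # tasks that depend on this task

def add_dependency(task: TaskNode, depends_on: TaskNode) -> None:
    """Record that `task` depends on `depends_on`, refusing edges that would form a ring."""
    # Walk upward from `depends_on`; if the walk reaches `task`, the new edge would close a cycle.
    stack, seen = [depends_on], set()
    while stack:
        node = stack.pop()
        if node is task:
            raise ValueError(f"dependency cycle: {task.name} -> {depends_on.name}")
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.parents)
    task.parents.append(depends_on)
    depends_on.children.append(task)
```

The later sketches in this description reuse this TaskNode structure.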
Example one
Referring to FIG. 1, which shows a flowchart of the steps of a parallel task allocation method for multi-stage program analysis according to an embodiment of the present invention, the method may specifically include the following steps:
Step 101: construct a task relationship graph corresponding to the code to be analyzed according to the dependencies among all tasks in the code to be analyzed.
In the embodiment of the invention, program analysis is carried out with the help of the task relationship graph corresponding to the code to be analyzed.
The code to be analyzed is the code on which the analysis is to be run.
After the code to be analyzed is obtained, the task relationship graph corresponding to it can be constructed according to the dependencies among all its tasks, as described in detail in the following implementation.
In a specific implementation manner of the present invention, step 101 may include:
Substep A1: take all the tasks as graph nodes;
Substep A2: connect the graph nodes that have dependencies, according to those dependencies, to generate the task relationship graph.
In the embodiment of the present invention, all tasks in the code to be analyzed are used as graph nodes, and graph nodes that have a dependency are connected according to the dependencies among the tasks, producing the task relationship graph. Concretely, a dependency graph of all tasks is generated from the dependencies among the tasks; its nodes represent the analysis tasks the user has selected to run and the tasks on which those tasks depend, so the nodes of the graph necessarily include every task that the user-selected checkers depend on.
Step 102: obtain the analysis tasks to be run on the code to be analyzed.
An analysis task is a task that needs to be executed on the code to be analyzed.
A single analysis may run only some of the checkers' analysis tasks; the set of these tasks is denoted S1, i.e. the set formed by the analysis tasks. Checker setting means configuring the checkers to be applied in the static analysis, normally specified by the user as input. Parameter setting means the additional parameters given to the analysis tool; default settings are normally used, but the user may also specify them as input.
The main function of analysis task generation is to organize the input of the analysis task setting part and produce a command line for executing each analysis task, as illustrated below. Analysis task dependency calculation computes the dependencies among the analysis tasks and, together with analysis task generation, produces the multi-stage analysis tasks.
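As a minimal illustration of the analysis task generation step, the sketch below turns checker and parameter settings into one command line per task. The tool name "analyzer", the flag names and the checker identifiers are placeholders, since the patent does not specify a concrete command format:

```python
from typing import Dict, List, Optional

def build_command_lines(checkers: List[str], source_dir: str,
                        extra_args: Optional[List[str]] = None) -> Dict[str, List[str]]:
    """Turn the checker setting and parameter setting into one command line per analysis task."""
    extra_args = extra_args or []
    commands = {}
    for checker in checkers:
        # Placeholder command format; a real analysis tool defines its own executable and flags.
        commands[checker] = ["analyzer", "--checker", checker,
                            "--source", source_dir, *extra_args]
    return commands

# Example: build_command_lines(["global_variable_check"], "/path/to/project", ["--timeout", "600"])
```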
After the analysis tasks to be run on the code to be analyzed have been obtained, step 103 is executed.
Step 103: divide the analysis tasks into stages according to the task relationship graph and the analysis tasks to obtain stage task sets; each stage task set comprises at least one task that can be executed in parallel.
After the task relationship graph and the analysis tasks have been obtained, the analysis tasks can be divided into stages according to them, yielding stage task sets, each of which contains at least one task that can be executed in parallel.
In another specific implementation manner of the present invention, step 103 may include:
Substep B1: obtain the target graph nodes of the analysis tasks in the task relationship graph;
Substep B2: obtain, among the target graph nodes, the first graph nodes that act as parent nodes, and assign the first graph nodes to a stage task set;
Substep B3: delete the first graph nodes from the target graph nodes and, for the remaining target graph nodes, cyclically perform the steps of obtaining the first graph nodes that act as parent nodes and assigning them to a stage task set, until all analysis tasks have been assigned to stage task sets.
In this embodiment of the present invention, the dependency graph of all tasks is denoted G, so the elements of S1 are nodes of G. The parent nodes of all elements of S1 are found in G, and the set formed by these parent nodes is denoted S2; elements whose parent nodes cannot be found are placed into another set S0 and removed from S1. The same operation is then carried out on the sets S2 through Sn-1 until Sn is an empty set. The result is a number of sets, which are the task sets run in the different stages: S0 is the task set run in the first stage, followed by the sets in the order Sn-1 down to S1.
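The procedure can be written down directly. The sketch below assumes the TaskNode structure from the earlier sketch; it collects parentless tasks into S0 and returns the stages in the order S0, Sn-1, ..., S1. Duplicate appearances of a task in several layers are not handled, since the description above does not address them:

```python
from typing import List

def divide_into_stages(selected_tasks: List[TaskNode]) -> List[List[TaskNode]]:
    """Split the selected analysis tasks (set S1) into stage task sets.

    Layer k+1 holds the parents (dependencies) of the tasks in layer k; tasks with
    no parents are moved into S0, which runs first, followed by the layers from the
    deepest one back down to S1.
    """
    s0: List[TaskNode] = []          # root tasks with no dependencies: the first stage
    layers: List[List[TaskNode]] = []
    current = list(selected_tasks)   # this is S1
    while current:
        kept, parents = [], []
        for task in current:
            if task.parents:
                kept.append(task)
                for p in task.parents:
                    if p not in parents:
                        parents.append(p)
            elif task not in s0:
                s0.append(task)      # no parent found: schedule in the first stage S0
        layers.append(kept)
        current = parents            # next layer: S2, S3, ...
    # First stage is S0, then S_{n-1} down to S1.
    return [s0] + [layer for layer in reversed(layers) if layer]
```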
After the stage task sets have been obtained, step 104 is performed.
Step 104: run the tasks in each stage task set according to the number of concurrently running tasks, and obtain the task running results.
After the stage task sets have been obtained, the stage tasks can be run according to the number of concurrently running tasks and the task running results obtained, as detailed in the following implementation:
in another specific implementation manner of the present invention, the step 104 may include:
substep C1: and determining the number of the concurrent running tasks according to the equipment performance of the running equipment and the average occupied memory corresponding to the analysis task.
Substep C2: and according to the phase running sequence of the phase task set and the number of the concurrent running tasks, running the analysis tasks in the phase task set in stages to obtain a task running result.
In the embodiment of the invention, while one stage of analysis tasks is being executed, the analysis task scheduler hands the tasks one by one to the analysis task executor, judges from the exit code returned by the executor whether each task succeeded, and does not execute the successor tasks of any task that did not succeed.
The analysis task executor creates a system process for each task, redirects its outputs to files, receives the exit code of the task process and passes that exit code to the analysis task scheduler.
Within any one set Sn obtained above, all tasks could in principle run simultaneously, but the performance of the PC executing them generally does not allow this. The number of tasks that can run at the same time therefore has to be calculated from the PC's performance (i.e. its memory size and number of cores) and the average memory occupied by a task.
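The patent names the quantities involved (the PC's memory size and core count, the average memory occupied by a task and the number of tasks) but gives no exact formula; taking the minimum of the three obvious bounds, as in the sketch below, is one plausible reading:

```python
import os

def concurrency_limit(avg_task_memory_mb: int, total_memory_mb: int, num_tasks: int) -> int:
    """Estimate how many analysis processes may run at the same time."""
    cores = os.cpu_count() or 1                                 # number of CPU cores
    by_memory = total_memory_mb // max(1, avg_task_memory_mb)   # how many tasks fit in memory
    return max(1, min(cores, by_memory, num_tasks))

# Example: concurrency_limit(avg_task_memory_mb=800, total_memory_mb=16000, num_tasks=12)
```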
Following the stage order computed above, a process is created for each task, run stage by stage, with redirection targets specified for the process's output stream and error stream; after execution starts, the scheduler waits for the task to finish and records the process exit code. The number of tasks running at the same time must not exceed the calculated concurrency count.
The exit codes of the processes running the current stage's tasks (say, Sn) are then examined; the tasks whose processes did not exit normally form a failed-task set, denoted Sfn. The scheduler determines which tasks of the next stage (namely Sn-1) depend on tasks in Sfn, removes them from Sn-1, and adds them to the next stage's failed-task set Sfn-1.
The next stage's tasks are then run by repeating the same running process.
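The following sketch of the scheduler/executor loop assumes the TaskNode structure and the command lines from the earlier sketches; the file layout and helper names are illustrative. Each task runs in its own process with stdout and stderr redirected to a per-task file, the process exit code decides success, and tasks of the next stage whose prerequisites failed are dropped:

```python
import os
import subprocess
from typing import Dict, List, Set, Tuple

def run_stage(stage_tasks: List[TaskNode], commands: Dict[str, List[str]],
              limit: int, out_dir: str = "results") -> Set[TaskNode]:
    """Run one stage's tasks with at most `limit` task processes alive at a time."""
    os.makedirs(out_dir, exist_ok=True)
    failed: Set[TaskNode] = set()
    pending = list(stage_tasks)
    running = []  # list of (task, process, output file) triples
    while pending or running:
        # Launch tasks until the concurrency limit is reached.
        while pending and len(running) < limit:
            task = pending.pop(0)
            out = open(os.path.join(out_dir, f"{task.name}.out"), "w")
            proc = subprocess.Popen(commands[task.name], stdout=out, stderr=subprocess.STDOUT)
            running.append((task, proc, out))
        # Wait for the oldest running task and record its exit code.
        task, proc, out = running.pop(0)
        proc.wait()
        out.close()
        if proc.returncode != 0:
            failed.add(task)   # abnormal exit: the task joins the failed set Sf
    return failed

def drop_dependents(next_stage: List[TaskNode],
                    failed: Set[TaskNode]) -> Tuple[List[TaskNode], Set[TaskNode]]:
    """Remove from the next stage every task whose prerequisite failed."""
    kept, newly_failed = [], set()
    for task in next_stage:
        if any(p in failed for p in task.parents):
            newly_failed.add(task)   # its prerequisite did not finish normally
        else:
            kept.append(task)
    return kept, newly_failed
```

Waiting on the oldest running process keeps the sketch simple; a real scheduler would wait on whichever process finishes first.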
After all tasks are completed, result integration is performed: for the tasks in S1 that were actually executed, the contents of their output files produced in the previous step are merged into the final result by several threads, comprising at least one read thread and one write thread. If the calculated concurrency count is greater than 2, more read threads or write threads may be created depending on the number and size of the output files; note that if several write threads are started, the conflicts between them must be resolved.
In the embodiment of the invention, the task scheduler first hands the first-stage tasks to the task executor for execution; after all first-stage tasks have run, it uses the process exit codes returned by the executor to determine which second-stage tasks no longer need to run, removes them, and hands the remaining tasks to the executor. After the run finishes, the result data integration part merges the results of the tasks analyzed by the checkers selected by the user: a result-reading thread is started to read the result files of all tasks one by one, and a result-writing thread gathers the read results into the final result file. If the concurrency count of the user-selected tasks is greater than 2, additional result-reading threads may be started and run concurrently, while only one result-writing thread is started, which avoids write conflicts.
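A sketch of the result data integration part, under the same assumptions: reader threads push file contents onto a queue, and the main thread acts as the single writer, so no write conflict can arise (the patent describes a dedicated result-writing thread; letting the main thread do the writing is an equivalent simplification):

```python
import queue
import threading
from typing import List

def integrate_results(result_files: List[str], final_path: str, num_readers: int = 1) -> None:
    """Merge the per-task result files into one final result file."""
    paths: "queue.Queue[str]" = queue.Queue()
    for path in result_files:
        paths.put(path)
    chunks: "queue.Queue[str]" = queue.Queue()

    def reader() -> None:
        # Each reader pulls file paths until none are left, then exits.
        while True:
            try:
                path = paths.get_nowait()
            except queue.Empty:
                return
            with open(path, encoding="utf-8") as f:
                chunks.put(f.read())

    readers = [threading.Thread(target=reader) for _ in range(max(1, num_readers))]
    for t in readers:
        t.start()

    # Single writer: only this code path ever touches the final result file.
    with open(final_path, "w", encoding="utf-8") as out:
        for _ in range(len(result_files)):
            out.write(chunks.get())

    for t in readers:
        t.join()
```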
In the embodiment of the invention, the number of processes that can run simultaneously is calculated from the number of tasks, the PC's memory size and the PC's core count, and the analysis tasks of a single stage are executed concurrently up to that number, so hardware performance is exploited more fully and the overall analysis time is shortened.
Example two
Referring to FIG. 2, which shows a schematic structural diagram of a parallel task allocation apparatus for multi-stage program analysis according to an embodiment of the present invention, the apparatus may specifically include the following modules:
a task relationship graph building module 210, configured to construct a task relationship graph corresponding to the code to be analyzed according to the dependencies among all tasks in the code to be analyzed;
an analysis task obtaining module 220, configured to obtain the analysis tasks to be run on the code to be analyzed;
a task set obtaining module 230, configured to divide the analysis tasks into stages according to the task relationship graph and the analysis tasks to obtain stage task sets, wherein each stage task set comprises at least one task that can be executed in parallel;
and a running result obtaining module 240, configured to run the tasks in each stage task set according to the number of concurrently running tasks and obtain the task running results.
Optionally, the task relationship graph building module 210 includes:
a graph node obtaining unit, configured to take all the tasks as graph nodes;
and a task relationship graph generating unit, configured to connect the graph nodes that have dependencies, according to those dependencies, to generate the task relationship graph.
Optionally, the task set obtaining module 230 includes:
a target graph node obtaining unit, configured to obtain target graph nodes of the analysis tasks in the task relationship graph;
a first graph node obtaining unit, configured to obtain, among the target graph nodes, first graph nodes that act as parent nodes, and assign the first graph nodes to a stage task set;
and a task set obtaining unit, configured to delete the first graph nodes from the target graph nodes and cyclically run the first graph node obtaining unit on the remaining target graph nodes, until all analysis tasks have been assigned to stage task sets.
Optionally, the running result obtaining module 240 includes:
a concurrent task number determining unit, configured to determine the number of concurrently running tasks according to the performance of the running device and the average memory occupied by an analysis task;
and a task running result obtaining unit, configured to run the analysis tasks in the stage task sets stage by stage, according to the running order of the stage task sets and the number of concurrently running tasks, to obtain the task running results.
Optionally, the task running result obtaining unit includes:
a first-stage execution subunit, configured to send the tasks run in the first stage, through the task scheduler, to the task executor for execution;
a remaining-task execution subunit, configured to, after all first-stage tasks have been executed, compute, according to the process exit codes returned by the executor, the first tasks in the second stage that do not need to be executed, remove the first tasks, and send the remaining tasks to the executor for execution;
and a running result obtaining subunit, configured to integrate, after the run is finished, the results of the analysis tasks selected by the user through the result data integration part to obtain the task running results.
The above description is only a preferred embodiment of the present invention; although the invention has been disclosed in terms of preferred implementations, it is not limited to the above description.
Those skilled in the art will appreciate that those matters not described in detail in the present specification are well known in the art.

Claims (10)

1. A parallel task allocation method for multi-stage program analysis, comprising:
constructing a task relationship graph corresponding to code to be analyzed according to the dependencies among all tasks in the code to be analyzed;
obtaining the analysis tasks to be run on the code to be analyzed;
dividing the analysis tasks into stages according to the task relationship graph and the analysis tasks to obtain stage task sets, wherein each stage task set comprises at least one task that can be executed in parallel;
and running the tasks in each stage task set according to the number of concurrently running tasks, and obtaining task running results.
2. The method according to claim 1, wherein constructing the task relationship graph corresponding to the code to be analyzed according to the dependencies among all tasks in the code to be analyzed comprises:
taking all the tasks as graph nodes;
and connecting the graph nodes that have dependencies, according to those dependencies, to generate the task relationship graph.
3. The method according to claim 1, wherein dividing the analysis tasks into stages according to the task relationship graph and the analysis tasks to obtain stage task sets comprises:
obtaining target graph nodes of the analysis tasks in the task relationship graph;
obtaining, among the target graph nodes, first graph nodes that act as parent nodes, and assigning the first graph nodes to a stage task set;
deleting the first graph nodes from the target graph nodes, and cyclically performing, on the remaining target graph nodes, the steps of obtaining the first graph nodes that act as parent nodes and assigning them to a stage task set, until all analysis tasks have been assigned to stage task sets.
4. The method according to claim 1, wherein running the tasks in each stage task set according to the number of concurrently running tasks and obtaining the task running results comprises:
determining the number of concurrently running tasks according to the performance of the running device and the average memory occupied by an analysis task;
and running the analysis tasks in the stage task sets stage by stage, according to the running order of the stage task sets and the number of concurrently running tasks, to obtain the task running results.
5. The method according to claim 4, wherein running the analysis tasks in the stage task sets stage by stage according to the running order of the stage task sets and the number of concurrently running tasks to obtain the task running results comprises:
sending the tasks run in the first stage, through a task scheduler, to a task executor for execution;
after all first-stage tasks have been executed, computing, according to process exit codes returned by the executor, first tasks in the second stage that do not need to be executed, removing the first tasks, and sending the remaining tasks to the executor for execution;
and after the run is finished, integrating the results of the analysis tasks selected by the user through a result data integration part to obtain the task running results.
6. A parallel task allocation apparatus for multi-stage program analysis, comprising:
a task relationship graph building module, configured to construct a task relationship graph corresponding to code to be analyzed according to the dependencies among all tasks in the code to be analyzed;
an analysis task obtaining module, configured to obtain the analysis tasks to be run on the code to be analyzed;
a task set obtaining module, configured to divide the analysis tasks into stages according to the task relationship graph and the analysis tasks to obtain stage task sets, wherein each stage task set comprises at least one task that can be executed in parallel;
and a running result obtaining module, configured to run the tasks in each stage task set according to the number of concurrently running tasks and obtain task running results.
7. The apparatus of claim 6, wherein the task relationship graph building module comprises:
a graph node obtaining unit, configured to take all the tasks as graph nodes;
and a task relationship graph generating unit, configured to connect the graph nodes that have dependencies, according to those dependencies, to generate the task relationship graph.
8. The apparatus of claim 6, wherein the task set obtaining module comprises:
a target graph node obtaining unit, configured to obtain target graph nodes of the analysis tasks in the task relationship graph;
a first graph node obtaining unit, configured to obtain, among the target graph nodes, first graph nodes that act as parent nodes, and assign the first graph nodes to a stage task set;
and a task set obtaining unit, configured to delete the first graph nodes from the target graph nodes and cyclically run the first graph node obtaining unit on the remaining target graph nodes, until all analysis tasks have been assigned to stage task sets.
9. The apparatus of claim 6, wherein the running result obtaining module comprises:
a concurrent task number determining unit, configured to determine the number of concurrently running tasks according to the performance of the running device and the average memory occupied by an analysis task;
and a task running result obtaining unit, configured to run the analysis tasks in the stage task sets stage by stage, according to the running order of the stage task sets and the number of concurrently running tasks, to obtain the task running results.
10. The apparatus according to claim 9, wherein the task running result obtaining unit comprises:
a first-stage execution subunit, configured to send the tasks run in the first stage, through a task scheduler, to a task executor for execution;
a remaining-task execution subunit, configured to, after all first-stage tasks have been executed, compute, according to process exit codes returned by the executor, first tasks in the second stage that do not need to be executed, remove the first tasks, and send the remaining tasks to the executor for execution;
and a running result obtaining subunit, configured to integrate, after the run is finished, the results of the analysis tasks selected by the user through a result data integration part to obtain the task running results.
CN202011272405.2A, filed 2020-11-13: Parallel task allocation method and device for multi-stage program analysis; granted as CN112269648B (Active)

Priority Applications (1)

Application Number | Filing Date | Title
CN202011272405.2A | 2020-11-13 | Parallel task allocation method and device for multi-stage program analysis (granted as CN112269648B)

Applications Claiming Priority (1)

Application Number | Filing Date | Title
CN202011272405.2A | 2020-11-13 | Parallel task allocation method and device for multi-stage program analysis (granted as CN112269648B)

Publications (2)

Publication Number | Publication Date
CN112269648A | 2021-01-26
CN112269648B | 2024-05-31


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130036425A1 (en) * 2011-08-04 2013-02-07 Microsoft Corporation Using stages to handle dependencies in parallel tasks
CN104915260A (en) * 2015-06-19 2015-09-16 北京搜狐新媒体信息技术有限公司 Hadoop cluster management task distributing method and system
CN105718244A (en) * 2016-01-18 2016-06-29 上海交通大学 Streamline data shuffle Spark task scheduling and executing method
CN107329828A (en) * 2017-06-26 2017-11-07 华中科技大学 A kind of data flow programmed method and system towards CPU/GPU isomeric groups
CN107612886A (en) * 2017-08-15 2018-01-19 中国科学院大学 A kind of Spark platforms Shuffle process compresses algorithm decision-making techniques
CN107885587A (en) * 2017-11-17 2018-04-06 清华大学 A kind of executive plan generation method of big data analysis process
CN108268312A (en) * 2016-12-30 2018-07-10 北京国双科技有限公司 Method for scheduling task and scheduler
US20180373540A1 (en) * 2017-06-21 2018-12-27 International Business Machines Corporation Cluster graphical processing unit (gpu) resource sharing efficiency by directed acyclic graph (dag) generation
CN110225082A (en) * 2019-04-30 2019-09-10 北京奇艺世纪科技有限公司 Task processing method, device, electronic equipment and computer-readable medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ACHIM LOSCH; MARCO PLATZNER: "MigHEFT: DAG-based Scheduling of Migratable Tasks on Heterogeneous Compute Nodes", 2020 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW) *
仇文娟: "Research on a Dynamic Parallel Scheduling Mechanism for Dependent Tasks in Cloud Computing", China Master's Theses Full-text Database, Information Science and Technology Series
陈俊宇; 刘茜萍: "Data-Intensive Workflow Scheduling Based on Stage Partitioning in Cloud Environments", Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition), no. 04


Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant