CN111026516B - Exception handling method, task assigning apparatus, task handling system, and storage medium - Google Patents

Info

Publication number
CN111026516B
CN111026516B CN201811179066.6A CN201811179066A CN111026516B CN 111026516 B CN111026516 B CN 111026516B CN 201811179066 A CN201811179066 A CN 201811179066A CN 111026516 B CN111026516 B CN 111026516B
Authority
CN
China
Prior art keywords
task
current
job
instruction
end information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811179066.6A
Other languages
Chinese (zh)
Other versions
CN111026516A (en)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Cambricon Information Technology Co Ltd
Priority to CN201811179066.6A (CN111026516B)
Priority to PCT/CN2019/110273 (WO2020073938A1)
Publication of CN111026516A
Application granted
Publication of CN111026516B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/485Task life-cycle, e.g. stopping, restarting, resuming execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

According to the exception handling method, the task assigning device, the task processing system, and the storage medium, when the task assigning device receives a task destroying instruction, it executes a destroying operation according to that instruction. When an exception occurs, this prevents the exception from affecting the correctness of the operation, reduces unnecessary processing overhead, and helps ensure operation efficiency and reliability.

Description

Exception handling method, task assigning device, task handling system, and storage medium
Technical Field
The present application relates to the field of computer application technologies, and in particular, to an exception handling method, a task assigning apparatus, a task processing system, and a storage medium.
Background
With the development of computer technology and the advent of the big data era, processors have played an increasingly important role as basic computing devices. If there is an abnormal condition in the process of executing the operation by the processor, the execution result of the operation will be wrong, thereby affecting the efficiency and reliability of the operation. Therefore, how to monitor and process the abnormal situation in the process of executing the operation by the processor becomes an urgent problem to be solved.
Disclosure of Invention
In view of the above, it is necessary to provide an exception handling method, a task assigning apparatus, a task handling system, and a storage medium, which can monitor and handle an exception condition during the execution of a computation by a processor, and ensure the computation efficiency and reliability of the processor.
A method of exception handling, the method comprising the steps of:
when a task destroying instruction is received, executing a destroying operation according to the task destroying instruction, wherein the destroying operation comprises destroying the current task with execution exception and all tasks in the task queue to which the current task belongs.
In one embodiment, the task destruction instruction comprises a first task destruction instruction; the step of executing the destruction operation according to the task destruction instruction comprises the following steps:
and when the first task destroying instruction is received, terminating, according to the first task destroying instruction, the scheduling of the current job with the execution exception and of all jobs after the current job, and generating scheduling end information of the current task so as to destroy the current task.
In one embodiment, the task destruction instruction further comprises a second task destruction instruction; the step of executing the destruction operation according to the task destruction instruction further comprises:
when the second task destroying instruction is received, terminating the scheduling of the current task and other tasks to be processed after the current task according to the second task destroying instruction;
after the current task is cleared from the register, sending a task registration request corresponding to each task to be processed to a state monitoring device so as to obtain a task identifier corresponding to each task to be processed;
and receiving task identifiers corresponding to the tasks to be processed, which are fed back by the state monitoring device, and obtaining scheduling end information corresponding to the tasks to be processed according to the task identifiers corresponding to the tasks to be processed so as to destroy all the tasks to be processed after the current task.
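The second-instruction path described above can be sketched roughly as follows. This is a minimal illustration rather than the patented circuit: the `StateMonitor` class, the `handle_second_destroy` function, the record fields, and the convention that the first queue entry is the current task are all hypothetical names and assumptions introduced here.

```python
from itertools import count


class StateMonitor:
    """Hypothetical stand-in for the state monitoring device."""

    def __init__(self):
        self._ids = count(1)

    def register(self, task_name):
        # Answer a task registration request with a fresh task identifier.
        return next(self._ids)


def handle_second_destroy(queue, monitor):
    """Return scheduling end records for the tasks queued after the current task.

    Assumption: queue[0] is the current task and the rest are pending tasks.
    """
    pending = queue[1:]
    # Terminate scheduling of the current task and every task after it.
    queue.clear()
    end_info = []
    for name in pending:
        # Each pending task is registered so that it has an identifier
        # under which its scheduling end information can be reported.
        task_id = monitor.register(name)
        end_info.append({"task": name, "task_id": task_id, "scheduling_ended": True})
    return end_info
```

With a queue of three tasks, the two pending tasks each receive an identifier and a scheduling end record, and the queue is left empty.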
In one embodiment, the method further comprises the steps of:
and when the task destroying instruction is received, generating a first interrupt signal, transmitting the first interrupt signal to a first processor, and then executing the destroying operation.
In one embodiment, the method further comprises the steps of:
and after the destruction operation is finished, generating a second interrupt signal and transmitting the second interrupt signal to the first processor.
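The ordering of the two interrupt signals around the destroying operation can be sketched as below. The function name, the string labels for the interrupts, and the callback standing in for the first processor are illustrative assumptions, not part of the source.

```python
def destroy_with_interrupts(task_queue, notify_first_processor):
    """Destroy every task in the queue, bracketed by two interrupt signals."""
    # First interrupt: sent when the task destroying instruction is received,
    # before the destroying operation starts.
    notify_first_processor("first_interrupt")
    destroyed = list(task_queue)
    # Destroying operation: the current task and all tasks in its queue.
    task_queue.clear()
    # Second interrupt: sent once the destroying operation has finished.
    notify_first_processor("second_interrupt")
    return destroyed
```

Passing `events.append` as the callback records the signals in the order the first processor would observe them.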
A method of exception handling, the method comprising the steps of:
the second processor obtains job end information of a current job and transmits the job end information of the current job to the state monitoring device, wherein the job end information of the current job comprises an execution state of the current job, and the execution state of the current job is used for identifying whether the current job is abnormal in execution or not;
the state monitoring device judges, according to the job end information of the current job, whether the current job has an execution exception, generates a task destroying instruction when the current job has an execution exception, and transmits the task destroying instruction to a task assigning device;
and the task dispatching device executes destruction operation according to the task destruction instruction, wherein the destruction operation comprises destroying the current task with execution exception and all tasks in the task queue to which the current task belongs.
In one embodiment, the execution exception of the current task includes a first exception condition and a second exception condition, and the task destroying instruction includes a first task destroying instruction and a second task destroying instruction; the step of generating a task destroying instruction by the state monitoring device when the current job has an execution exception further comprises:
if the state monitoring device determines, according to the job end information of the current job, that the current job has the first exception condition, generating a first task destroying instruction according to the job end information of the current job, and transmitting the first task destroying instruction to the task assigning device;
and if the state monitoring device determines, according to the job end information of the current job, that the current job has the second exception condition, generating a second task destroying instruction according to the job end information of the current job, and transmitting the second task destroying instruction to the task assigning device.
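A minimal sketch of how the state monitoring device might map the execution state carried in the job end information to one of the two instructions is given below. The `ExecState` enum, the dictionary layout of the job end information, and the instruction labels are assumptions made for illustration; the source does not specify how the two exception conditions are encoded.

```python
from enum import Enum


class ExecState(Enum):
    """Hypothetical encoding of the execution state in the job end information."""
    NORMAL = "normal"
    FIRST_ABNORMAL = "first_abnormal"
    SECOND_ABNORMAL = "second_abnormal"


def select_destroy_instruction(job_end_info):
    """Return the destroy instruction to send, or None when the job ran normally."""
    state = job_end_info["execution_state"]
    if state is ExecState.FIRST_ABNORMAL:
        return "first_task_destroy"
    if state is ExecState.SECOND_ABNORMAL:
        return "second_task_destroy"
    # No execution exception: no task destroying instruction is generated.
    return None
```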
In one embodiment, the step of executing the destruction operation by the task dispatching device according to the task destruction instruction further includes:
when the task dispatching device receives the first task destroying instruction, the task dispatching device terminates, according to the first task destroying instruction, the scheduling of the current job with the execution exception and of all jobs after the current job, and generates scheduling end information of the current task so as to destroy the current task.
In one embodiment, the step of executing the destruction operation by the task dispatching device according to the task destruction instruction further includes:
when the task dispatching device receives the second task destroying instruction, the task dispatching device terminates, according to the second task destroying instruction, the scheduling of the current task and of the other tasks to be processed after the current task;
after the current task is cleared from the register, the task dispatching device sends a task registration request corresponding to each task to be processed to a state monitoring device;
the state monitoring device distributes a task identifier for each task to be processed respectively according to the task registration request corresponding to each task to be processed, and transmits the task identifier corresponding to each task to be processed to the task assigning device;
when the task assigning device receives the task identifier corresponding to each task to be processed, the scheduling end information corresponding to each task to be processed is obtained according to the task identifier corresponding to each task to be processed, so that all tasks to be processed after the current task are destroyed.
In one embodiment, the method further comprises the steps of:
after the state monitoring device receives scheduling end information of the current task or receives scheduling end information of the current task and all tasks in a task queue to which the current task belongs, the state monitoring device generates exception handling end information and transmits the exception handling end information to the task dispatching device;
and the task dispatching device generates a second interrupt signal according to the exception handling ending information and transmits the second interrupt signal to the first processor.
In one embodiment, the step of the second processor transmitting the job end information of the current job to the status monitoring device includes:
when the information fed back by more than one of the second processor bodies corresponding to the current job is abnormal, the control device of the second processor body marks the execution state of the current job as abnormal;
and when the control device of the second processor body has received the information fed back by all the second processor bodies corresponding to the current job, the control device obtains the job end information of the current job and transmits the job end information of the current job to the state monitoring device.
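The aggregation performed by the control device can be sketched as follows. The class name, the dictionary form of the job end information, and in particular the `abnormal_threshold` parameter are assumptions: the text says "more than one" abnormal feedback, and since it is not settled here whether that count is inclusive, the threshold is left to the caller rather than fixed.

```python
class JobFeedbackCollector:
    """Hypothetical control-device helper that gathers per-body feedback for one job."""

    def __init__(self, body_ids, abnormal_threshold):
        self._pending = set(body_ids)
        self._abnormal_count = 0
        self._threshold = abnormal_threshold

    def report(self, body_id, abnormal):
        """Record one body's feedback; return job end info once all bodies reported."""
        if body_id not in self._pending:
            return None  # unknown or duplicate feedback is ignored in this sketch
        self._pending.remove(body_id)
        if abnormal:
            self._abnormal_count += 1
        if self._pending:
            return None  # still waiting for other processor bodies
        # All feedback received: derive the execution state of the job.
        is_abnormal = self._abnormal_count >= self._threshold
        return {"execution_state": "abnormal" if is_abnormal else "normal"}
```

The job end information is produced only on the final report, which mirrors the requirement that feedback from all second processor bodies be received before anything is sent to the state monitoring device.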
A task assigning device comprises a task destroying circuit, wherein the task destroying circuit is connected with a state monitoring device and a task caching device, and the task caching device is used for storing a task queue;
the task destroying circuit is used for executing a destroying operation according to a task destroying instruction when receiving the task destroying instruction transmitted by the state monitoring device, wherein the destroying operation comprises destroying the current task with execution abnormity and all tasks in the task queue to which the current task belongs.
A task processing system comprises a task scheduler and a second processor connected with the task scheduler, wherein the task scheduler comprises a task dispatching device and a state monitoring device connected with the task dispatching device, and the state monitoring device and the task dispatching device are both connected to the second processor; wherein,
the second processor is used for obtaining the job end information of the current job and transmitting the job end information of the current job to the state monitoring device, wherein the job end information of the current job comprises the execution state of the current job, and the execution state of the current job is used for identifying whether the current job is abnormal in execution or not;
the state monitoring device is used for judging, according to the job end information of the current job, whether the current job has an execution exception, generating a task destroying instruction when the current job has an execution exception, and transmitting the task destroying instruction to the task assigning device;
the task dispatching device is used for executing a destroying operation according to the task destroying instruction, wherein the destroying operation comprises destroying the current task with the execution exception and all tasks in the task queue to which the current task belongs.
A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, carries out the steps of any of the methods described above.
According to the exception handling method, the task assigning device, the task processing system, and the storage medium, when the task assigning device receives a task destroying instruction, it can execute a destroying operation according to the task destroying instruction, wherein the destroying operation comprises destroying the current task with the execution exception and all tasks in the task queue to which the current task belongs. Therefore, when an exception occurs, the task being scheduled and the other tasks after it can be destroyed in time, which prevents the exception from affecting the correctness of the operation, reduces unnecessary processing overhead, and helps ensure operation efficiency and reliability.
Drawings
FIG. 1 is a block diagram of a task processing system in one embodiment;
FIG. 2 is a block diagram of a task processing system in one embodiment;
FIG. 3 is a block diagram of a task processing system in one embodiment;
FIG. 4 is a block diagram of a second processor entity in one embodiment;
FIG. 5 is a block diagram of a second processor entity in another embodiment;
FIG. 6 is a block diagram of a second processor entity in accordance with an alternative embodiment;
FIG. 7 is a flowchart illustrating a method of exception handling in one embodiment;
FIG. 8 is a flowchart of a method for exception handling in another embodiment;
FIG. 9 is a flow diagram of an exception handling method of one embodiment;
FIG. 10 is a flowchart illustrating a method of exception handling in one embodiment;
FIG. 11 is a flowchart illustrating a process of generating a task destroy instruction according to an embodiment;
FIG. 12 is a flowchart illustrating a method of exception handling in one embodiment.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein merely illustrate the present application and are not intended to limit it.
As shown in FIGS. 1 to 3, a task processing system according to an embodiment of the present application may include a task scheduler and a second processor 300 connected to the task scheduler. Further, the task processing system may also be connected to the first processor 200; in particular, the task scheduler may be connected to the first processor 200. The first processor 200 may be a general-purpose processor such as a CPU, and the second processor 300 may be a coprocessor of the first processor 200. Specifically, the second processor 300 may include a second processor body and a control device for controlling the operation of the second processor body, and the second processor body may be an artificial intelligence processor such as an IPU (Intelligence Processing Unit) or an NPU (Neural-network Processing Unit). Of course, in other embodiments, the second processor body may also be a general-purpose processor such as a CPU or a GPU.
Further, the number of the second processor bodies may be plural, and the plural second processor bodies are all connected to the control device of the second processor body. Or, the second processor body may also include a plurality of core processors, and each core processor is connected to the control device of the second processor body. The state monitoring apparatus 100 may be connected to a control apparatus of the second processor body, which is capable of transmitting task execution information of the second processor body to the state monitoring apparatus 100.
In one embodiment, as shown in FIG. 4, the second processor body may include a controller unit 310 and an arithmetic unit 320, wherein the controller unit 310 is connected with the arithmetic unit 320, and the arithmetic unit 320 may include a master processing circuit 321 and a plurality of slave processing circuits 322 that together form a master-slave structure. Optionally, the controller unit 310 is used for acquiring data and calculation instructions. The data may specifically include machine learning data, which may optionally be neural network data. The controller unit 310 is further configured to parse the obtained calculation instruction to obtain operation instructions, and to send a plurality of operation instructions and the data to the master processing circuit 321. The master processing circuit 321 is configured to perform preprocessing on the data and to transfer data and operation instructions between the master processing circuit 321 and the plurality of slave processing circuits 322. The plurality of slave processing circuits 322 are configured to perform intermediate operations in parallel according to the data and the operation instructions transmitted from the master processing circuit 321 to obtain a plurality of intermediate results, and to transmit the plurality of intermediate results to the master processing circuit 321; the master processing circuit 321 is further configured to perform subsequent processing on the plurality of intermediate results to obtain a calculation result of the calculation instruction.
Optionally, the controller unit 310 may include an instruction cache unit 311, an instruction processing unit 312, and a store queue unit 314. The instruction cache unit 311 is configured to store a calculation instruction associated with the machine learning data. The instruction processing unit 312 is configured to parse the calculation instruction to obtain a plurality of operation instructions. The store queue unit 314 is configured to store an instruction queue, which includes a plurality of operation instructions or calculation instructions to be executed in the order of the queue.
Optionally, the controller unit 310 may further include a dependency processing unit 313. When there are multiple operation instructions, the dependency processing unit 313 determines whether a first operation instruction is associated with a zeroth operation instruction that precedes it. If the first operation instruction is associated with the zeroth operation instruction, the first operation instruction is cached in the instruction storage unit and, after the zeroth operation instruction has been executed, is fetched from the instruction storage unit and transmitted to the arithmetic unit. Specifically, the dependency processing unit 313 extracts, according to the first operation instruction, a first storage address interval of the data (e.g., a matrix) required by the first operation instruction, and extracts, according to the zeroth operation instruction, a zeroth storage address interval of the matrix required by the zeroth operation instruction. If the first storage address interval and the zeroth storage address interval have an overlapping region, it is determined that the first operation instruction and the zeroth operation instruction are associated; if they have no overlapping region, it is determined that the two instructions are not associated.
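The address-interval test used by the dependency processing unit can be illustrated with a short sketch. Representing each storage address interval as an inclusive `(start, end)` pair and each instruction as a dictionary with an `addr` field are assumptions made here for illustration only.

```python
def intervals_overlap(a, b):
    """Return True when two inclusive (start, end) address intervals share an address."""
    return a[0] <= b[1] and b[0] <= a[1]


def has_dependency(first_instr, zeroth_instr):
    """The first instruction is associated with the zeroth one if their intervals overlap."""
    return intervals_overlap(first_instr["addr"], zeroth_instr["addr"])
```

When `has_dependency` is true, the first operation instruction would be held back until the zeroth operation instruction has finished; otherwise it can be issued directly.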
In one embodiment, as shown in FIG. 5, the arithmetic unit 320 may further include one or more branch processing circuits 323, each of which is connected to the master processing circuit 321 and to more than one slave processing circuit 322. The branch processing circuit 323 is used to forward data or instructions between the master processing circuit 321 and the slave processing circuits 322. In this embodiment, the master processing circuit 321 is specifically configured to determine that the input neurons are broadcast data and the weights are distribution data, to divide the distribution data into a plurality of data blocks, and to send at least one of the data blocks, the broadcast data, and at least one of the plurality of operation instructions to the branch processing circuit 323; the branch processing circuit 323 is configured to forward the data blocks, the broadcast data, and the operation instructions between the master processing circuit 321 and the plurality of slave processing circuits 322; the slave processing circuits 322 are configured to perform operations on the received data blocks and broadcast data according to the operation instructions to obtain intermediate results, and to transmit the intermediate results to the branch processing circuit 323; the master processing circuit 321 is further configured to perform subsequent processing on the intermediate results sent by the branch processing circuit 323 to obtain the result of the calculation instruction, and to send the result of the calculation instruction to the controller unit 310.
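The split between broadcast data and distribution data can be made concrete with a small software analogy. In this sketch the weights form a matrix whose rows are divided into blocks, every slave receives the full input vector, and the "subsequent processing" is reduced to concatenating the partial results. The function names, the row-wise blocking, and the use of plain dot products are assumptions; the forwarding role of the branch processing circuits is not modelled.

```python
def split_into_blocks(rows, num_blocks):
    """Divide the distribution data (weight rows) into at most num_blocks blocks."""
    size = max(1, -(-len(rows) // num_blocks))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def slave_compute(weight_block, broadcast_input):
    """One slave: dot product of each weight row in its block with the broadcast input."""
    return [sum(w * x for w, x in zip(row, broadcast_input)) for row in weight_block]


def master_forward(weights, input_neurons, num_slaves):
    """Master: distribute weight blocks, broadcast the input, gather intermediate results."""
    blocks = split_into_blocks(weights, num_slaves)
    intermediate = [slave_compute(block, input_neurons) for block in blocks]
    # Subsequent processing is reduced here to concatenating the partial results.
    return [value for part in intermediate for value in part]
```

In hardware the slaves would run in parallel; the list comprehension above only reproduces the data flow, not the timing.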
In another alternative embodiment, as shown in FIG. 6, the arithmetic unit 320 may include a master processing circuit 321 and a plurality of slave processing circuits 322. The plurality of slave processing circuits 322 are distributed in an array; each slave processing circuit 322 is connected to the other adjacent slave processing circuits 322, and the master processing circuit 321 is connected to k slave processing circuits among the plurality of slave processing circuits, where the k slave processing circuits are the n slave processing circuits in the 1st row, the n slave processing circuits in the m-th row, and the m slave processing circuits in the 1st column. It should be noted that, as shown in FIG. 6, the k slave processing circuits include only these circuits; that is, the k slave processing circuits are the slave processing circuits that are directly connected to the master processing circuit. The k slave processing circuits are used for forwarding data and instructions between the master processing circuit and the plurality of slave processing circuits.
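To make the array topology concrete, the sketch below enumerates the positions of the slave processing circuits that are directly connected to the master processing circuit in an m-by-n array: the 1st row, the m-th row, and the 1st column. The function name and the 1-based `(row, column)` coordinates are assumptions, as is counting the two corner circuits that lie in both a row and the 1st column only once.

```python
def directly_connected(m, n):
    """Return the set of (row, col) positions wired directly to the master circuit."""
    first_row = {(1, c) for c in range(1, n + 1)}
    last_row = {(m, c) for c in range(1, n + 1)}
    first_col = {(r, 1) for r in range(1, m + 1)}
    # Set union counts the shared corner positions (1, 1) and (m, 1) once each.
    return first_row | last_row | first_col
```

For a 3-by-4 array this yields nine positions; interior circuits such as the one at row 2, column 2 reach the master only through their neighbours.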
Optionally, the main processing circuit may further include one or any combination of a conversion processing circuit, an activation processing circuit, and an addition processing circuit; wherein the conversion processing circuit is used for performing interchange between the first data structure and the second data structure (such as conversion of continuous data and discrete data) on the data block or the intermediate result received by the main processing circuit; or performing an interchange between the first data type and the second data type (e.g. conversion of a fixed point type to a floating point type) on a data block or intermediate result received by the main processing circuitry; the activation processing circuit is used for executing activation operation of data in the main processing circuit; the addition processing circuit is used for executing addition operation or accumulation operation.
Further, the slave processing circuit comprises a multiplication processing circuit; the multiplication processing circuit is used for executing multiplication operation on the received data block to obtain a product result. Still further, the slave processing circuit may further include a forwarding processing circuit for forwarding the received data block or the multiplied result, and an accumulation processing circuit for performing an accumulation operation on the multiplied result to obtain the intermediate result.
The processor provided by the application sets the arithmetic unit 320 to be a master-slave structure, and for the calculation instruction of forward operation, the processor can split data according to the calculation instruction of forward operation, so that parallel operation can be performed on the part with larger calculation amount through a plurality of slave processing circuits, thereby improving the operation speed, saving the operation time and further reducing the power consumption.
Optionally, the machine learning calculation may specifically include an artificial neural network operation, in which case the input data specifically includes input neuron data and weight data. The calculation result may specifically be the result of the artificial neural network operation, namely the output neuron data.
In the forward operation, after the execution of the artificial neural network of the previous layer is completed, the operation instruction of the next layer takes the output neuron calculated in the operation unit as the input neuron of the next layer to perform operation (or performs some operation on the output neuron and then takes the output neuron as the input neuron of the next layer), and at the same time, the weight value is replaced by the weight value of the next layer; in the reverse operation, after the reverse operation of the artificial neural network of the previous layer is completed, the operation instruction of the next layer takes the input neuron gradient calculated in the operation unit as the output neuron gradient of the next layer to perform operation (or performs some operation on the input neuron gradient and then takes the input neuron gradient as the output neuron gradient of the next layer), and at the same time, the weight value is replaced by the weight value of the next layer.
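The layer-to-layer hand-off in the forward operation can be illustrated as follows. Each layer is reduced to a weight matrix and a plain matrix-vector product; the function name and the omission of any activation or other intermediate operation are simplifying assumptions.

```python
def forward(layers, input_neurons):
    """Run a forward pass; `layers` is a list of weight matrices (lists of rows)."""
    neurons = input_neurons
    for weights in layers:
        # The output neurons of this layer become the input neurons of the next,
        # and the weights are replaced by those of the next layer on each iteration.
        neurons = [sum(w * x for w, x in zip(row, neurons)) for row in weights]
    return neurons
```

The reverse operation follows the same pattern with gradients in place of neurons, which this sketch does not cover.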
The above machine learning calculations may also include support vector machine operations, k-nearest neighbors (k-nn) operations, k-means (k-means) operations, principal component analysis operations, and the like. For convenience of description, the following takes artificial neural network operation as an example to illustrate a specific scheme of machine learning calculation.
For the operation of the artificial neural network, if the operation has multiple layers, the input neurons and the output neurons of the multi-layer operation do not refer to the neurons in the input layer and the neurons in the output layer of the whole neural network. Instead, for any two adjacent layers in the network, the neurons in the lower layer of the forward operation are the input neurons, and the neurons in the upper layer of the forward operation are the output neurons. Taking a convolutional neural network as an example, let the network have $L$ layers, with $K = 1, 2, \dots, L-1$. For the $K$-th layer and the $(K+1)$-th layer, we call the $K$-th layer the input layer, whose neurons are the input neurons, and the $(K+1)$-th layer the output layer, whose neurons are the output neurons. That is, each layer except the topmost layer can be used as an input layer, and the next layer is the corresponding output layer.
The task scheduler may include a task assigning device 400 and a state monitoring device 100 connected to the task assigning device 400. Further, the task assigning device 400 may include a task decomposition device 410 and a task scheduling device 420, wherein the task decomposition device 410 is configured to determine task decomposition information of the current task, and the task scheduling device 420 is configured to determine scheduling information. The task decomposition device 410 and the task scheduling device 420 may both be connected to the second processor. The task decomposition device 410 may obtain the decomposition information of the current task, and the task scheduling device 420 may obtain scheduling information according to the decomposition information of the current task and the processor state information of the second processor, and transmit the scheduling information to the second processor. Meanwhile, the task scheduling device 420 can also transmit the task decomposition information and all the information of the current task to the second processor, and the second processor can then process the current task. Specifically, the control device of the second processor body of the second processor may split the current task into a plurality of jobs according to the decomposition information of the current task and all information thereof, and send each job to the second processor body matched with it for execution.
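The step in which the control device splits a task into jobs and sends each job to a processor body can be sketched as below. The source does not say how a job is matched to a body, so the round-robin assignment here is purely an assumption, as are the function names and the job naming scheme.

```python
def split_task(task_id, num_jobs):
    """Split one task into num_jobs jobs (the job count stands in for decomposition info)."""
    return [f"{task_id}/job{i}" for i in range(num_jobs)]


def dispatch(jobs, body_ids):
    """Assign jobs to processor bodies round-robin (assumed matching policy)."""
    assignment = {body: [] for body in body_ids}
    for i, job in enumerate(jobs):
        assignment[body_ids[i % len(body_ids)]].append(job)
    return assignment
```

A real matching policy would presumably take the processor state information into account; the round-robin loop only shows where such a policy would plug in.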
Further, the control device of the second processor body may receive the information fed back by each second processor body and judge from it whether any second processor body is abnormal. When one or more second processor bodies are abnormal, the control device of the second processor body may transmit this information to the state monitoring device 100, and the state monitoring device 100 may control the task decomposition device 410 to execute the destruction operation, so as to ensure the correctness of the operation result.
In one embodiment, the state monitoring device 100 may include a task registration circuit 110, a check circuit 120, and a state processing circuit 130, which are electrically connected in sequence. The task registration circuit 110 may be connected to the task assigning device 400, the check circuit 120 and the state processing circuit 130 may be connected to the second processor 300, and the state processing circuit 130 may be connected to the first processor 200. The task registration circuit 110 is configured to receive a task registration request of the current task and to allocate a task identifier to the current task according to that request, where the task identifier is used to distinguish different tasks. Specifically, the task registration circuit 110 may be connected to the task decomposition device 410 of the task assigning device 400. The task decomposition device 410 may send a registration request of the current task to the task registration circuit 110, and the task registration circuit 110 may assign a task identifier to the current task according to the received request, thereby completing the registration of the current task. The task registration circuit 110 may then transmit the task identifier to the task decomposition device 410. A current task that has a task identifier can be decomposed by the task decomposition device 410, scheduled by the task scheduling device 420, and sent to the second processor 300 for processing.
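The registration handshake described above can be sketched in software as follows. This is only an illustration, not the circuit itself: the class name `TaskRegistrationCircuit`, the method `register`, and the use of consecutive integers as task identifiers are assumptions made for the example.

```python
from itertools import count


class TaskRegistrationCircuit:
    """Hypothetical software stand-in for task registration circuit 110."""

    def __init__(self):
        # Assumption: identifiers are consecutive integers starting at 0.
        self._next_id = count()
        self.registered = {}  # task identifier -> task name

    def register(self, task_name):
        """Handle one task registration request and return the new identifier."""
        task_id = next(self._next_id)
        self.registered[task_id] = task_name
        return task_id


registry = TaskRegistrationCircuit()
first_id = registry.register("task A")
second_id = registry.register("task B")
```

The only property the text relies on is that different tasks receive different identifiers; any allocation scheme with that property would serve equally well.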
After the task registration circuit 110 completes the registration of the current task, the task decomposition device 410 may determine the decomposition information of the current task and transmit it to the task registration circuit 110. The task registration circuit 110 may obtain the total number of jobs included in the current task from the decomposition information and transmit that total to the check circuit 120. Further, the task assigning device 400 may be connected to the second processor 300; specifically, the task scheduling device 420 of the task assigning device 400 may be connected to the second processor 300 and may send the decomposition information and all task information of the current task to the control device of the second processor body. The control device of the second processor body may split the current task into a plurality of jobs according to the received decomposition information and all task information, and send the jobs to the respective second processor bodies, so that the jobs of the current task are processed in parallel and the processing efficiency of the current task is improved. The check circuit 120 can thereby obtain, in real time, the job receiving number of the second processor transmitted by the control device of the second processor body.
The check circuit 120 may then generate a dispatch complete instruction according to the total number of jobs of the current task and the number of jobs that the second processor 300 has received. The dispatch complete instruction indicates that the second processor 300 has received all jobs sent by the task assigning device 400. In the embodiments of the present application, the decomposition information of a task refers to information such as the total number of jobs included in the task and the size of each job. All task information of a task may include configuration information such as the computer instructions, the input data, and the task category of the task. The task categories may include block (blocking task), cluster (clustering task), and union (join task).
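The comparison performed by the check circuit can be illustrated with a short sketch. It assumes that the dispatch complete instruction is produced when the job receiving number equals the total number of jobs; the class `CheckCircuit` and its `update` method are hypothetical names, and a boolean stands in for the instruction.

```python
class CheckCircuit:
    """Hypothetical model of check circuit 120."""

    def __init__(self, total_jobs):
        self.total_jobs = total_jobs  # taken from the decomposition information
        self.received_jobs = 0

    def update(self, received_jobs):
        """Record the latest job receiving number reported by the second processor.

        Returns True (standing in for the dispatch complete instruction) once
        the second processor has received every job of the current task.
        """
        self.received_jobs = received_jobs
        return self.received_jobs == self.total_jobs


check = CheckCircuit(total_jobs=3)
signals = [check.update(n) for n in (1, 2, 3)]
```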
The state processing circuit 130 is configured to receive the job end information of each job of the current task according to the received dispatch complete instruction, and to transmit that job end information to the first processor 200. Specifically, once the state processing circuit 130 receives the dispatch complete instruction, which indicates that the second processor has completed the scheduling work for the respective jobs, execution of the jobs may begin. The state monitoring device 100 may then wait for the execution state information of each received job fed back by the second processor 300; that is, the state processing circuit 130 may start to receive and buffer the job end information of each job of the current task transmitted by the second processor 300. Optionally, the state processing circuit 130 may be connected to a global memory through DMA, and the global memory may be connected to the first processor 200, so that the state processing circuit 130 can write the obtained job end information of each job into the global memory and thereby transmit it to the first processor 200.
Further, the state processing circuit 130 may include an exception handling circuit 131 capable of monitoring execution exceptions of each task. Specifically, the exception handling circuit 131 may be connected to the second processor 300. The exception handling circuit 131 may obtain the job end information of the current job and determine from it whether the current task to which the current job belongs has an execution exception. If the current task has an execution exception, the exception handling circuit 131 generates a task destruction instruction and transmits it to the task assigning device 400, and the task assigning device 400 may execute a destruction operation according to the task destruction instruction. Specifically, the exception handling circuit 131 may transmit the task destruction instruction to the task decomposition device 410, and the task decomposition device 410 may execute the destruction operation according to the task destruction instruction.
Optionally, the job end information of the current job includes result flag data, and the exception handling circuit 131 may determine whether the current task has an execution exception according to that result flag data. For example, if the current task has an execution exception (i.e., the second processor body corresponding to the current job is abnormal), the control device of the second processor body may set the result flag data in the job end information of the current job to a non-zero value (e.g., 1), and the exception handling circuit 131 may then determine from the result flag data that the current task has an execution exception. If the current task has no execution exception, the control device of the second processor body may set the result flag data to 0, and the exception handling circuit 131 may then determine from the result flag data that the current task has no execution exception.
Further, the execution exception of the current task may include a first exception condition and a second exception condition, and the task destruction instruction may include a first task destruction instruction corresponding to the first exception condition and a second task destruction instruction corresponding to the second exception condition. Optionally, when it is determined that the current task has an execution exception, the exception handling circuit 131 may further determine whether the execution exception is the first exception condition or the second exception condition according to exception flag data included in the job end information of the current job.
For example, if the execution exception of the current task is the first exception condition, the control device of the second processor body may set the exception flag data in the job end information of the current job to a non-zero value (e.g., 1). If the execution exception of the current task is the second exception condition, the control device of the second processor body may set the exception flag data to 0. In this way, the exception handling circuit 131 can determine whether the execution exception of the current task is the first exception condition or the second exception condition from the exception flag data included in the job end information of the current job.
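The two-flag decoding described above can be written as a small decision function. The sketch below follows the example encoding in the text (result flag 0 means no exception; with a non-zero result flag, a non-zero exception flag means the first exception condition and 0 means the second). The names `JobEndInfo`, `Verdict`, and `classify` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    NO_EXCEPTION = 0
    FIRST_EXCEPTION = 1
    SECOND_EXCEPTION = 2


@dataclass
class JobEndInfo:
    """Hypothetical layout of the flags carried in job end information."""
    result_flag: int
    exception_flag: int = 0


def classify(info):
    """Decode the flags the way the exception handling circuit is described to."""
    if info.result_flag == 0:
        return Verdict.NO_EXCEPTION
    if info.exception_flag != 0:
        return Verdict.FIRST_EXCEPTION
    return Verdict.SECOND_EXCEPTION
```

The exception flag is only consulted when the result flag is non-zero, which is why an exception flag of 0 can denote the second exception condition without colliding with the no-exception case.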
Optionally, if the exception handling circuit 131 determines from the job end information of the current job that the current task has the first exception condition, it generates a first task destruction instruction according to the job end information of the current job, so as to destroy the current task. Specifically, the exception handling circuit 131 transmits the first task destruction instruction to the task decomposition device 410 of the task assigning device 400 to notify it to destroy the current task. The task decomposition device 410 may then destroy, according to the received first task destruction instruction, the current job with the exception and all jobs after the current job in the current task, so that all jobs to be processed in the Table ID corresponding to the current task are destroyed. Further, after the task assigning device 400 completes the destruction operation of the current task, it may obtain the scheduling end information of the current task and transmit it to the state monitoring device 100.
If the exception handling circuit 131 determines from the job end information of the current job that the current task has the second exception condition, it generates a second task destruction instruction according to the job end information of the current job, so as to destroy the current task and all tasks after the current task. Specifically, the exception handling circuit 131 may transmit the second task destruction instruction to the task decomposition device 410 of the task assigning device 400 to notify it to destroy the current task and all tasks after it. Optionally, the task assigning device 400 may store a plurality of tasks to be scheduled in one task queue in a certain order, and after the task assigning device 400 receives the second task destruction instruction transmitted by the exception handling circuit 131, the task decomposition device 410 may destroy all tasks in the task queue to which the current task belongs. Specifically, the task assigning device 400 first terminates the scheduling of the current task and of the other tasks to be processed after the current task according to the second task destruction instruction, and notifies a register connected to the task assigning device 400 to clear the current task. After the current task is cleared from the register, the scheduling end information of the current task may be obtained.
Meanwhile, after the current task is cleared from the register, the task decomposition device 410 of the task assigning device 400 may send a task registration request for each task to be processed to the state monitoring device 100, so as to obtain a task identifier for each task to be processed. The state monitoring device 100 may assign a task identifier to each task to be processed. When the task decomposition device 410 receives the task identifiers fed back by the state monitoring device 100, it may obtain the scheduling end information of each task to be processed according to the corresponding task identifier, so as to destroy all tasks to be processed after the current task. Further, the task assigning device 400 may also transmit the scheduling end information of each task to be processed to the state monitoring device 100.
With this exception handling mechanism, the accuracy of the task execution result can be ensured. When an exception exists, the state monitoring device can notify the task assigning device to destroy the corresponding task, or the corresponding task and all tasks after it, which avoids the resource waste caused by the second processor continuing to execute other tasks while the exception exists.
Optionally, the state processing circuit 130 further includes a state cache circuit 132, which may be connected to the check circuit 120 and to the first processor 200; specifically, the state cache circuit 132 may be connected to the first processor 200 through the global memory. The state cache circuit 132 is configured to receive the dispatch complete instruction output by the comparator of the check circuit 120 and to receive the job end information of each job of the current task according to the dispatch complete instruction. When the number of pieces of job end information received reaches a preset number, the state cache circuit 132 reorders the received job end information according to a preset arrangement and transmits it to the first processor 200 in the reordered order. Optionally, the preset arrangement may be the execution order of the jobs. Reordering the job end information in this way ensures that the one or more jobs before the current job have all finished executing, and thus ensures the reliability of the execution result of the current task.
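The buffering and reordering behaviour of the state cache circuit can be sketched as follows. The example assumes that each piece of job end information arrives tagged with the index of its job in execution order and that the preset arrangement is that execution order; `StateCache` and `receive` are hypothetical names.

```python
class StateCache:
    """Hypothetical model of state cache circuit 132."""

    def __init__(self, preset_count):
        self.preset_count = preset_count  # preset number of job end information items
        self._buffer = []

    def receive(self, job_index, end_info):
        """Buffer one item of job end information.

        Returns None until the preset number of items has been received,
        then returns all buffered items sorted by job execution order.
        """
        self._buffer.append((job_index, end_info))
        if len(self._buffer) < self.preset_count:
            return None
        ordered = [info for _, info in sorted(self._buffer, key=lambda item: item[0])]
        self._buffer.clear()
        return ordered


cache = StateCache(preset_count=3)
early = cache.receive(2, "end of job 2")    # arrives out of order
middle = cache.receive(0, "end of job 0")
flushed = cache.receive(1, "end of job 1")  # third item triggers the reorder
```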
The working principle of the task scheduler of the embodiment of the present application is illustrated below with reference to fig. 1 and fig. 3:
When a task needs to be sent to the second processor 300 for processing, the task is first registered. Specifically, the task assigning device 400 may transmit a task registration request of the current task to the state monitoring device 100. The task registration circuit 110 of the state monitoring device 100 may assign a task identifier to the current task and transmit the task identifier to the task assigning device 400.
The task assigning device 400 may obtain the decomposition information of the current task according to the received task identifier of the current task, and transmit the decomposition information to the state monitoring device 100. The task registration circuit 110 of the state monitoring device 100 may obtain the total number of jobs of the current task from the decomposition information and transmit it to the check circuit 120.
Further, the task assigning device 400 may send the decomposition information and all task information of the current task to the second processor 300, specifically to the control device of the second processor body. The control device of the second processor body may split the current task into a plurality of jobs according to the received decomposition information and all task information, and send the jobs to the respective second processor bodies for processing. The check circuit 120 may thus obtain, in real time, the job receiving number of the second processor transmitted by the control device of the second processor body, and obtain the dispatch complete instruction according to that job receiving number. After the dispatch complete instruction is obtained, the state monitoring device 100 may wait for the execution state information of each received job fed back by the second processor 300; that is, the state processing circuit 130 of the state monitoring device 100 may start to receive and buffer the job end information of each job of the current task transmitted by the second processor 300.
Further, the state processing circuit 130 may determine whether the current task has an execution exception according to the job end information of the current job that it receives. If the current task has an execution exception, the state processing circuit 130 may generate a task destruction instruction according to the job end information of the current job and transmit it to a task destruction circuit of the task assigning device. The task destruction circuit may execute the destruction operation according to the task destruction instruction.
For example, if the state processing circuit 130 determines from the job end information of the current job that the current task has the first exception condition, it generates a first task destruction instruction according to the job end information of the current job and transmits it to the task assigning device 400 to notify the task assigning device 400 to destroy the current task. If the state processing circuit 130 determines from the job end information of the current job that the current task has the second exception condition, it may generate a second task destruction instruction according to the job end information of the current job and transmit it to the task assigning device 400 to notify the task assigning device 400 to destroy the current task and all tasks after the current task.
If the state processing circuit 130 determines that there is no abnormal condition in the current task according to the job end information of the current job, the state processing circuit 130 may write the acquired job end information of each job into the global memory, so as to transmit the job end information of each job of the current task to the first processor 200 through the global memory.
By monitoring whether the execution of each task is abnormal while the task is being processed, the task processing system can determine whether each second processor is abnormal and destroy the task currently being processed when an abnormality exists, which ensures the accuracy of the operation result. The exception handling process of the task processing system is described in the exception handling method below and is not repeated here.
In one embodiment, as shown in fig. 7, the present application provides an exception handling method, which can be applied to the task assigning device described above and is used to execute a destruction operation when the second processor is abnormal. Specifically, the method comprises the following steps:
S100, receiving a task destruction instruction;
S200, executing a destruction operation according to the task destruction instruction, wherein the destruction operation comprises destroying the current task with the execution exception and all tasks in the task queue to which the current task belongs.
Specifically, when one or more second processor bodies are abnormal, the control device of the second processor body may mark the job corresponding to the abnormal second processor body as having an execution exception, and add this information to the job end information of the current job by means of a flag bit. The state monitoring device can determine from the acquired job end information of the current job that the current job has an execution exception, generate a task destruction instruction, and transmit it to the task assigning device. The task assigning device can then execute the destruction operation according to the task destruction instruction.
Optionally, the task destruction instruction includes a first task destruction instruction; as shown in fig. 8, the method may specifically include the following steps:
S110, receiving a first task destruction instruction;
S210, according to the first task destruction instruction, terminating the scheduling of the current job with the execution exception and of all jobs after the current job, and generating scheduling end information of the current task, so as to destroy the current task.
Specifically, when receiving the first task destruction instruction, the task assigning device may terminate the scheduling of the current task to which the current job belongs. That is, the task assigning device may terminate the scheduling of the current job with the exception and of all jobs after the current job, so that all jobs to be processed in the Table ID of the current task are destroyed, and may generate the scheduling end information of the current task, so as to destroy the current task. The scheduling end information of the current task indicates that the scheduling of the current task has ended.
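Steps S110 and S210 can be illustrated with a small sketch. It assumes the jobs of the current task are held in a list in scheduling order and that the scheduling end information is a simple record; the function `handle_first_destroy` and the record fields are hypothetical.

```python
def handle_first_destroy(task_id, jobs, failed_index):
    """Sketch of S110/S210 for one task.

    jobs is the list of the task's jobs in scheduling order and failed_index
    is the position of the job that reported the execution exception.
    Returns the jobs whose scheduling is terminated together with the
    scheduling end information of the task.
    """
    if not 0 <= failed_index < len(jobs):
        raise IndexError("failed_index must refer to a job of the task")
    # The failed job and every job after it are no longer scheduled.
    destroyed = jobs[failed_index:]
    # Assumed record layout: it only signals that scheduling of the task ended.
    scheduling_end_info = {"task_id": task_id, "scheduling_ended": True}
    return destroyed, scheduling_end_info


destroyed, end_info = handle_first_destroy(7, ["job0", "job1", "job2", "job3"], 2)
```

Only jobs of the current task are touched here; other tasks in the queue are unaffected, which is the difference from the second task destruction instruction.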
Optionally, the task destruction instruction further includes a second task destruction instruction; as shown in fig. 9, the method may specifically include the following steps:
S120, receiving a second task destruction instruction;
S220, terminating the scheduling of the current task and of the other tasks to be processed after the current task according to the second task destruction instruction.
Specifically, when receiving the second task destruction instruction, the task assigning device first suspends the scheduling of the current task and terminates the task registration of the other tasks to be processed after the current task. Meanwhile, the task assigning device may send a clearing signal to the register, and the register may delete the current task according to the received clearing signal, that is, release the storage space occupied by the current task in the register.
S230, after the current task is cleared from the register, sending a task registration request for each task to be processed to the state monitoring device, so as to obtain a task identifier for each task to be processed.
Specifically, after completing the clearing of the current task in which the exception exists, the task assigning device may send a task registration request of each task to be processed to the status monitoring device, so as to register each task to be processed after the current task, respectively. The task registration circuit of the state monitoring device can allocate a task identifier to each task to be processed according to the task registration request received by the task registration circuit, and transmit the task identifier corresponding to each task to be processed to the task assigning device.
S240, receiving the task identifiers of the tasks to be processed fed back by the state monitoring device, and obtaining the scheduling end information of each task to be processed according to its task identifier, so as to destroy all tasks to be processed after the current task.
Specifically, the task assigning device may obtain scheduling end information corresponding to each to-be-processed task according to the received task identifier of each to-be-processed task, so as to destroy all to-be-processed tasks after the current task.
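The flow of steps S220 to S240 can be sketched as follows. The example assumes the task queue is a list in scheduling order and models the registration round trip with the state monitoring device as a callback; `handle_second_destroy`, `register_task`, and the record fields are hypothetical.

```python
from itertools import count


def handle_second_destroy(task_queue, current_task, register_task):
    """Sketch of S220 to S240.

    task_queue lists the tasks in scheduling order, and register_task stands
    in for the registration request sent to the state monitoring device; it
    returns the task identifier assigned to a task.
    """
    position = task_queue.index(current_task)
    pending = task_queue[position + 1:]
    # S220: scheduling of the current task stops and the task is cleared from
    # the register, after which its scheduling end information is available.
    end_infos = [{"task": current_task, "scheduling_ended": True}]
    # S230/S240: each pending task is registered only to obtain an identifier,
    # then its scheduling end information is produced, which destroys it.
    for task in pending:
        task_id = register_task(task)
        end_infos.append({"task": task, "task_id": task_id, "scheduling_ended": True})
    return end_infos


ids = count(100)  # assumed identifier source for the example
end_infos = handle_second_destroy(["t0", "t1", "t2", "t3"], "t1", lambda task: next(ids))
```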
Optionally, the method further comprises the following steps:
When a task destruction instruction is received, a first interrupt signal is generated and transmitted to the first processor, and the destruction operation is then executed. Specifically, when the task assigning device receives the task destruction instruction, it first terminates the scheduling of the current task, so as to avoid the unnecessary resource consumption of scheduling under an abnormal condition. Meanwhile, after receiving the task destruction instruction, the task assigning device may generate a first interrupt signal and transmit it to the first processor.
Further, after the first processor receives the first interrupt signal, the first processor can also obtain the state information of each second processor body, and determine the second processor body with the abnormality according to the state information of each second processor body.
Optionally, the method further includes the following steps:
After the destruction operation is finished, a second interrupt signal is generated and transmitted to the first processor. Specifically, upon completion of the destruction operation, the task assigning device may generate a second interrupt signal, which may be used to indicate that exception handling is complete. Further, the task assigning device may transmit the second interrupt signal to the first processor.
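The ordering of the two interrupt signals around the destruction operation can be shown with a short sketch. The callbacks `destroy` and `send_interrupt` are hypothetical stand-ins for the destruction operation and for the signal path to the first processor.

```python
def handle_destroy_instruction(destroy, send_interrupt):
    """Bracket the destruction operation with the two interrupt signals."""
    send_interrupt("first interrupt")   # sent when the destruction instruction arrives
    destroy()                           # the destruction operation itself
    send_interrupt("second interrupt")  # indicates that exception handling is complete


events = []
handle_destroy_instruction(
    destroy=lambda: events.append("destruction operation"),
    send_interrupt=events.append,
)
```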
In an embodiment, as shown in fig. 10, an exception handling method is further provided in the embodiment of the present application, and is used in the task processing system described above. Specifically, the method may include the steps of:
S300, the second processor obtains the job end information of the current job and transmits it to the state monitoring device, wherein the job end information of the current job comprises the execution state of the current job, and the execution state of the current job identifies whether the current job has an execution exception;
S400, the state monitoring device judges whether the current job has an execution exception according to the job end information of the current job, generates a task destruction instruction when the current job has an execution exception, and transmits the task destruction instruction to the task assigning device;
S500, the task assigning device executes a destruction operation according to the task destruction instruction, wherein the destruction operation comprises destroying the current task with the execution exception and all tasks in the task queue to which the current task belongs.
In one embodiment, step S300 further includes the following steps:
When the information fed back by one or more of the second processor bodies corresponding to the current job is abnormal, the control device of the second processor body marks the execution state of the current job as an execution exception;
When the control device of the second processor body has received the information fed back by all the second processor bodies corresponding to the current job, it transmits the job end information of the current job to the state monitoring device.
Specifically, the current job may be divided into a plurality of sub-jobs, and the plurality of sub-jobs may be sent to different second processor bodies for parallel execution, so that when there is an exception in the information fed back by the second processor body corresponding to one of the sub-jobs (i.e., when there is an exception in the second processor body corresponding to one of the sub-jobs), the control device of the second processor body may mark the execution status of the current job as an execution exception. When the control device of the second processor body receives the information fed back by the second processor body corresponding to all the sub-jobs, the control device of the second processor body may obtain the job end information of the current job, and transmit the job end information of the current job to the state monitoring device. The job end information of the current job includes an execution state of the current job, and the execution state of the current job is used to identify whether the current job is abnormal in execution (i.e., whether one or more second processor bodies corresponding to the current job are abnormal in execution).
Further, the execution state of the current job may be represented by result flag data. For example, if the current task has an execution exception (i.e., the second processor body corresponding to the current job is abnormal), the control device of the second processor body may set the result flag data in the job end information of the current job to a non-zero value (e.g., 1). If the current task has no execution exception, the control device of the second processor body may set the result flag data in the job end information of the current job to 0.
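The way the control device combines the feedback of several second processor bodies into one result flag can be sketched as follows. The example assumes each body reports a boolean that is true when the body is abnormal; `build_job_end_info` and the dictionary layout are hypothetical.

```python
def build_job_end_info(job_index, expected_bodies, feedback):
    """Combine per-body feedback into job end information.

    feedback maps a second processor body to True if that body reported an
    abnormality. Returns None while feedback from some body is still missing;
    otherwise returns job end information whose result flag is 1 if any body
    was abnormal and 0 if none was.
    """
    if not set(expected_bodies) <= set(feedback):
        return None  # wait until every sub-job has reported
    result_flag = 1 if any(feedback[body] for body in expected_bodies) else 0
    return {"job_index": job_index, "result_flag": result_flag}


bodies = ["body0", "body1"]
partial = build_job_end_info(0, bodies, {"body0": False})
failed = build_job_end_info(0, bodies, {"body0": False, "body1": True})
clean = build_job_end_info(1, bodies, {"body0": False, "body1": False})
```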
Further, the status monitoring apparatus 100 may determine whether there is an execution abnormality of the current task according to result flag data included in the job end information of the current job. Specifically, as shown in fig. 11, the step S400 includes the following steps:
S410, the state monitoring device 100 receives the job end information of the current job. Specifically, when the second processor 300 completes execution of the current job, the second processor 300 may transmit the job end information of the current job to the state monitoring device 100.
S420, the state monitoring device 100 determines whether the current task is abnormal according to the job end information of the current job. Specifically, the state monitoring device 100 may determine whether the current task to which the current job belongs has an execution exception according to the job end information of the current job.
S430, if the current task has an execution exception, the state monitoring device 100 generates a task destruction instruction and transmits it to the task assigning device. If the current task has no execution exception, the other jobs of the current task continue to be executed; the job end information of the next job can then be received, and steps S420 to S430 can be repeated.
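Steps S410 to S430 amount to a loop over incoming job end information that stops at the first execution exception. A minimal sketch, assuming job end information is a dictionary with `task_id`, `job_index`, and `result_flag` fields (all hypothetical names), is:

```python
def monitor(job_end_infos):
    """Sketch of the S410 to S430 loop.

    Returns a task destruction instruction for the first job whose result
    flag is non-zero, or None if every job ended without an exception.
    """
    for info in job_end_infos:            # S410: receive job end information
        if info["result_flag"] != 0:      # S420: check for an execution exception
            return {                      # S430: generate the destruction instruction
                "instruction": "task destruction",
                "task_id": info["task_id"],
                "job_index": info["job_index"],
            }
    return None


stream = [
    {"task_id": 5, "job_index": 0, "result_flag": 0},
    {"task_id": 5, "job_index": 1, "result_flag": 1},
    {"task_id": 5, "job_index": 2, "result_flag": 0},
]
instruction = monitor(stream)
```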
For example, if the current task has an execution exception, the control device of the second processor body may set the result flag data in the job end information of the current job to a non-zero value (for example, 1). In this case, the exception handling circuit 131 of the status monitoring device 100 may determine from the result flag data that the current task has an execution exception, generate a task destruction instruction, and transmit the task destruction instruction to the task destruction circuit of the task assigning device. The task assigning device may then execute the destruction operation according to the received task destruction instruction.
If the current task has no execution exception, the control device of the second processor body may set the result flag data in the job end information of the current job to 0, and at this time, the exception handling circuit may determine that the current task has no execution exception according to the result flag data.
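The check performed in steps S410 to S430 can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the circuit described in this application: the names `JobEndInfo`, `result_flag`, and `monitor_job_end` are hypothetical, and the task destruction instruction is modelled as a plain string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobEndInfo:
    """Hypothetical model of the job end information of one job."""
    task_id: int
    job_id: int
    result_flag: int  # 0: no execution abnormality; non-zero: abnormality


def monitor_job_end(info: JobEndInfo) -> Optional[str]:
    """S420/S430: return a destruction instruction if the task is abnormal."""
    if info.result_flag != 0:
        # Abnormal: this instruction would be sent to the task assigning device.
        return f"DESTROY task={info.task_id}"
    # Normal: nothing is sent; the remaining jobs of the task keep running.
    return None
```

Calling `monitor_job_end` once per received job end information mirrors the repetition of steps S420 to S430 for each subsequent job.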
Optionally, the execution exception of the current task includes a first exception condition and a second exception condition, and the task destruction instruction includes a first task destruction instruction and a second task destruction instruction. Specifically, as shown in fig. 12, when the status monitoring device 100 determines that there is an abnormal condition in the current task, the step S400 specifically includes the following steps:
S421, the status monitoring apparatus 100 determines, according to the job end information of the current job, whether the execution abnormality of the current task is the first abnormal condition. If the job end information of the current job indicates that the current task has the first abnormal condition, step S431 is executed: the status monitoring apparatus 100 generates a first task destruction instruction according to the job end information of the current job, so as to destroy the current task. If the job end information of the current job indicates that the current task has the second abnormal condition, step S432 is executed: the status monitoring apparatus 100 generates a second task destruction instruction according to the job end information of the current job, so as to destroy the current task and all tasks after the current task.
For example, if the execution abnormality of the current task is the first abnormal condition, the control device of the second processor body may set the abnormality flag data in the job end information of the current job to a non-zero value (for example, 1). If the execution abnormality of the current task is the second abnormal condition, the control device of the second processor body may set the abnormality flag data in the job end information of the current job to 0. In this way, the exception handling circuit can determine whether the execution abnormality of the current task is the first abnormal condition or the second abnormal condition based on the abnormality flag data included in the job end information of the current job.
If the exception handling circuit determines that the current task has a first exception condition according to the job end information of the current job, a first task destruction instruction is generated according to the job end information of the current job, the first task destruction instruction is transmitted to the task assigning device 400, and the task assigning device 400 is notified to destroy the current task. If the exception handling circuit determines that the current task has the second exception condition according to the job end information of the current job, a second task destruction instruction may be generated according to the job end information of the current job, and the second task destruction instruction is transmitted to the task assigning device 400, so as to notify the task assigning device 400 to destroy the current task and all tasks after the current task. Alternatively, the task assigning apparatus 400 may store a plurality of tasks to be scheduled into a queue in a certain order, and after the task assigning apparatus 400 receives the second task destruction instruction transmitted by the exception handling circuit 131, the task assigning apparatus 400 may destroy all the tasks in the queue.
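The distinction between the two abnormal conditions and the two destruction instructions can be illustrated with a short sketch. All names here (`Destroy`, `classify`, `apply_destroy`) are assumptions made for illustration, and the task queue is modelled as an ordinary `deque` of task names. The flag encoding follows the example above: a non-zero abnormality flag denotes the first condition, and 0 denotes the second.

```python
from collections import deque
from enum import Enum


class Destroy(Enum):
    FIRST = 1   # destroy only the current task
    SECOND = 2  # destroy the current task and all tasks after it


def classify(result_flag: int, abnormality_flag: int):
    """Return the instruction type, or None if there is no abnormality."""
    if result_flag == 0:
        return None
    # Non-zero abnormality flag: first condition; 0: second condition.
    return Destroy.FIRST if abnormality_flag != 0 else Destroy.SECOND


def apply_destroy(queue: deque, current: str, instruction: Destroy) -> list:
    """Remove the destroyed tasks from the queue and return them."""
    if instruction is Destroy.FIRST:
        queue.remove(current)
        return [current]
    # SECOND: drop the current task and every task queued after it.
    tasks = list(queue)
    idx = tasks.index(current)
    queue.clear()
    queue.extend(tasks[:idx])
    return tasks[idx:]
```

The optional variant in which the whole queue is destroyed would correspond to clearing the queue regardless of the position of the current task.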
Further, the exception handling circuit of the state monitoring apparatus 100 may also transmit current task execution exception information obtained by detection thereof to the first processor 200. For example, when the exception handling circuit determines that the current task has a first exception condition, the exception handling circuit may report the first exception condition to the first processor 200. When the exception handling circuit determines that the current task has the second exception condition, the exception handling circuit may report the second exception condition to the first processor 200.
In one embodiment, when the task assigning device receives the first task destruction instruction, step S510 is executed: the task assigning device terminates, according to the first task destruction instruction, the scheduling of the current job that has the execution exception and of all jobs after the current job, and generates scheduling end information of the current task, so as to destroy the current task. Specifically, on receiving the first task destruction instruction, the task assigning device may terminate the scheduling of the current task to which the current job belongs. That is, it may terminate the scheduling of the abnormal current job and of all jobs after it, thereby destroying all jobs to be processed in the Table ID of the current task to which the current job belongs, and then generate the scheduling end information of the current task. The scheduling end information of the current task indicates that scheduling of the current task has ended.
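Step S510 can be pictured with the following minimal sketch. The `Task` class, the ordered `jobs` list, and the dictionary used as scheduling end information are all hypothetical simplifications introduced here, not structures named in this application.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: int
    jobs: list = field(default_factory=list)  # ordered job names still tracked


def handle_first_destroy(task: Task, failed_job: str) -> dict:
    """Stop scheduling the failed job and every job after it (S510)."""
    idx = task.jobs.index(failed_job)
    cancelled = task.jobs[idx:]
    del task.jobs[idx:]  # these jobs are no longer scheduled
    # Hypothetical scheduling end information for the current task.
    return {
        "task_id": task.task_id,
        "scheduling_ended": True,
        "cancelled": cancelled,
    }
```

In this model the returned dictionary plays the role of the scheduling end information that signals the current task can be destroyed.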
In one embodiment, when the task dispatching device receives the second task destroying instruction, the following steps are executed:
S521, the task assigning device terminates, according to the second task destruction instruction, the scheduling of the current task and of the other to-be-processed tasks after the current task. Specifically, when receiving the second task destruction instruction, the task assigning device first suspends scheduling of the current task and terminates task registration of the other to-be-processed tasks after the current task. Meanwhile, the task assigning device may send a clearing signal to the register, and the register may delete the current task according to the received clearing signal, that is, release the storage space occupied by the current task in the register.
S522, after the current task is cleared from the register, sending a task registration request corresponding to each task to be processed to the state monitoring device so as to obtain a task identifier corresponding to each task to be processed; specifically, after completing the clearing of the current task in which the exception exists, the task assigning device may send a task registration request of each task to be processed to the status monitoring device, so as to register each task to be processed after the current task, respectively.
S523, the state monitoring device allocates a task identifier to each to-be-processed task according to the task registration request corresponding to that task, and transmits the task identifier corresponding to each to-be-processed task to the task assigning device. Specifically, the task registration circuit of the state monitoring device may allocate a task identifier to each to-be-processed task according to the task registration request it receives, and transmit the task identifier corresponding to each to-be-processed task to the task assigning device.
S524, the task assigning device receives the task identifier corresponding to each to-be-processed task fed back by the state monitoring device, and obtains scheduling end information corresponding to each to-be-processed task according to that task identifier, so as to destroy all to-be-processed tasks after the current task. Specifically, the task assigning device may obtain the scheduling end information corresponding to each to-be-processed task according to the received task identifier of that task, so as to destroy all to-be-processed tasks after the current task.
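The sequence S521 to S524 can be sketched as a small interaction between two objects. This is an assumption-laden illustration: `StatusMonitor`, `TaskAssigner`, `register_file`, and the sequential identifier allocation are hypothetical stand-ins for the task registration circuit, the task assigning device, and the register.

```python
from itertools import count


class StatusMonitor:
    """Hypothetical stand-in for the task registration circuit."""

    def __init__(self):
        self._ids = count(1)

    def register(self, task_name: str) -> int:
        # S523: allocate a task identifier for the registration request.
        return next(self._ids)


class TaskAssigner:
    def __init__(self, monitor: StatusMonitor, current: str, pending: list):
        self.monitor = monitor
        self.current = current
        self.pending = list(pending)
        self.register_file = {current}  # models the register holding the task

    def handle_second_destroy(self) -> list:
        # S521: stop scheduling and clear the current task from the register.
        self.register_file.discard(self.current)
        end_infos = []
        for name in self.pending:
            # S522: send a registration request for each pending task.
            task_id = self.monitor.register(name)
            # S524: derive scheduling end information from the identifier.
            end_infos.append(
                {"task": name, "task_id": task_id, "scheduling_ended": True}
            )
        self.pending.clear()
        return end_infos
```

Registering each pending task before ending it gives every destroyed task an identifier under which its scheduling end information can be reported.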
In one embodiment, the exception handling method further includes the following steps:
after the state monitoring device receives scheduling end information of the current task or receives scheduling end information of the current task and all tasks in a task queue to which the current task belongs, the state monitoring device generates exception handling end information and transmits the exception handling end information to the task dispatching device;
the task dispatching device generates a second interrupt signal according to the exception handling ending information and transmits the second interrupt signal to the first processor.
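The completion handshake in these two steps can be modelled as follows. The sketch assumes that the state monitoring device knows in advance which task identifiers it is waiting for; `EndTracker`, `to_interrupt`, and the string messages are hypothetical names used only for illustration.

```python
from typing import Optional, Set


class EndTracker:
    """Hypothetical model of the state monitoring device during cleanup."""

    def __init__(self, expected: Set[int]):
        self.expected = set(expected)
        self.received: Set[int] = set()

    def on_scheduling_end(self, task_id: int) -> Optional[str]:
        """Record one scheduling end; report completion once all arrived."""
        self.received.add(task_id)
        if self.expected <= self.received:
            return "EXCEPTION_HANDLING_END"
        return None


def to_interrupt(message: Optional[str]) -> Optional[str]:
    """Task assigning device: turn the end information into an interrupt."""
    return "SECOND_INTERRUPT" if message == "EXCEPTION_HANDLING_END" else None
```

With a single expected identifier the tracker covers the first destruction case, and with several identifiers it covers the case where the whole task queue is destroyed.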
It should be understood that although the steps in the flow charts of fig. 7-12 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least some of the steps in fig. 7-12 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is specific and detailed, but not to be understood as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (14)

1. A method of exception handling, the method comprising the steps of:
the second processor obtains job end information of a current job, and transmits the job end information of the current job to the state monitoring device, wherein the job end information of the current job comprises an execution state of the current job, and the execution state of the current job is used for identifying whether the current job is abnormal in execution or not;
the state monitoring device judges whether the current operation has execution abnormality according to the operation end information of the current operation, generates a task destroying instruction when the current operation has execution abnormality, and transmits the task destroying instruction to a task assigning device;
the task dispatching device executes destruction operation according to the task destruction instruction, wherein the destruction operation comprises the destruction of the current task with execution exception and all tasks in a task queue to which the current task belongs;
the execution exception of the current task comprises a first exception condition and a second exception condition, and the task destroying instruction comprises a first task destroying instruction and a second task destroying instruction; the step of generating a task destruction instruction by the state monitoring device when the current operation has execution abnormality, further comprises:
if the state monitoring device determines that a first abnormal condition exists in the current operation according to the operation ending information of the current operation, a first task destroying instruction is generated according to the current operation ending information, and the first task destroying instruction is transmitted to the task assigning device;
and if the state monitoring device determines that the current operation has a second abnormal condition according to the operation end information of the current operation, generating a second task destruction instruction according to the current operation end information, and transmitting the second task destruction instruction to the task assigning device.
2. The method according to claim 1, wherein the task assigning device executes the step of destroying according to the task destroying instruction, further comprising:
when the task dispatching device receives the first task destroying instruction, the task dispatching device terminates, according to the first task destroying instruction, the scheduling of the current operation with execution exception and of all operations after the current operation, and generates scheduling end information of the current task so as to destroy the current task.
3. The method according to claim 1, wherein the task assigning device executes a destruction operation according to the task destruction instruction, and further comprises:
when the task dispatching device receives the second task destroying instruction, the scheduling of the current task and of other tasks to be processed after the current task is terminated according to the second task destroying instruction;
after the current task is cleared from the register, the task dispatching device sends a task registration request corresponding to each task to be processed to a state monitoring device;
the state monitoring device allocates a task identifier to each task to be processed according to the task registration request corresponding to each task to be processed, and transmits the task identifier corresponding to each task to be processed to the task dispatching device;
when the task assigning device receives the task identifier corresponding to each task to be processed, the scheduling end information corresponding to each task to be processed is obtained according to the task identifier corresponding to each task to be processed, so that all tasks to be processed after the current task are destroyed.
4. The method of claim 1, further comprising the steps of:
after the state monitoring device receives scheduling end information of the current task or receives scheduling end information of the current task and all tasks in a task queue to which the current task belongs, the state monitoring device generates exception handling end information and transmits the exception handling end information to the task dispatching device;
and the task dispatching device generates a second interrupt signal according to the exception handling ending information and transmits the second interrupt signal to the first processor.
5. The method of claim 1, wherein the step of the second processor transmitting job end information of the current job to the status monitoring device comprises:
when the information fed back by one or more second processor bodies corresponding to the current operation is abnormal, the control device of the second processor body marks the execution state of the current operation as abnormal execution;
and when the control device of the second processor body receives the information fed back by all the second processor bodies corresponding to the operation, acquiring the operation end information of the current operation, and transmitting the operation end information of the current operation to the state monitoring device.
6. The method of claim 1, further comprising the steps of:
when the task dispatching device receives the task destroying instruction, a first interrupt signal is generated and transmitted to a first processor, and then the destroying operation is executed.
7. The method according to claim 1, characterized in that the method further comprises the steps of: and after the destruction operation is finished, generating a second interrupt signal and transmitting the second interrupt signal to the first processor.
8. A task processing system is characterized by comprising a task scheduler and a second processor connected with the task scheduler, wherein the task scheduler comprises a task dispatching device and a state monitoring device connected with the task dispatching device, and the state monitoring device and the task dispatching device are both connected to the second processor; wherein,
the second processor is used for obtaining the job end information of the current job and transmitting the job end information of the current job to the state monitoring device, wherein the job end information of the current job comprises the execution state of the current job, and the execution state of the current job is used for identifying whether the current job is abnormal in execution or not;
the state monitoring device is used for judging whether the current operation has execution abnormity according to the operation end information of the current operation, generating a task destroying instruction when the current operation has execution abnormity, and transmitting the task destroying instruction to the task assigning device;
the task dispatching device is used for executing destruction operation according to the task destruction instruction, wherein the destruction operation comprises destroying the current task with execution exception and all tasks in the task queue to which the current task belongs;
the execution exception of the current task comprises a first exception condition and a second exception condition, and the task destroying instruction comprises a first task destroying instruction and a second task destroying instruction; the state monitoring device is also used for generating a first task destroying instruction according to the operation ending information of the current operation and transmitting the first task destroying instruction to the task assigning device when the first abnormal condition of the current task is determined according to the operation ending information of the current operation;
the state monitoring device is further configured to generate a second task destruction instruction according to the job end information of the current job and transmit the second task destruction instruction to the task assigning device when it is determined that the second abnormal condition exists in the current task according to the job end information of the current job.
9. The system according to claim 8, wherein the task assigning device is further configured to terminate scheduling the current job with execution exception and all jobs after the current job according to the first task destruction instruction when the task assigning device receives the first task destruction instruction, and generate scheduling end information of the current task to destroy the current task.
10. The system of claim 8, wherein said task assigning means is further configured to:
when the task dispatching device receives the second task destroying instruction, the scheduling of the current task and of other tasks to be processed after the current task is terminated according to the second task destroying instruction;
after the current task is cleared from the register, the task dispatching device sends a task registration request corresponding to each task to be processed to a state monitoring device;
the state monitoring device allocates a task identifier to each task to be processed according to the task registration request corresponding to each task to be processed, and transmits the task identifier corresponding to each task to be processed to the task dispatching device;
when the task assigning device receives the task identifier corresponding to each task to be processed, the scheduling end information corresponding to each task to be processed is obtained according to the task identifier corresponding to each task to be processed, so that all tasks to be processed after the current task are destroyed.
11. The system according to claim 8, wherein the status monitoring device is further configured to generate exception handling end information after receiving scheduling end information of the current task or after receiving scheduling end information of the current task and all tasks in a task queue to which the current task belongs, and transmit the exception handling end information to the task dispatching device;
and the task dispatching device is also used for generating a second interrupt signal according to the exception handling ending information and transmitting the second interrupt signal to the first processor.
12. The system according to claim 8, wherein the second processor is further configured to mark the execution status of the current job as execution exception if there is exception in the information fed back by the one or more second processor entities corresponding to the current job;
and the second processor is further configured to obtain the job end information of the current job and transmit the job end information of the current job to the state monitoring device when receiving the information fed back by all the second processor bodies corresponding to the job.
13. The system according to claim 8, wherein the task assigning device is configured to generate a first interrupt signal when receiving the task destruction instruction, and transmit the first interrupt signal to the first processor, and then execute the destruction operation.
14. The system according to claim 8, wherein the task assigning means is configured to generate a second interrupt signal after completion of the destruction operation, and to transmit the second interrupt signal to the first processor.
CN201811179066.6A 2018-10-10 2018-10-10 Exception handling method, task assigning apparatus, task handling system, and storage medium Active CN111026516B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811179066.6A CN111026516B (en) 2018-10-10 2018-10-10 Exception handling method, task assigning apparatus, task handling system, and storage medium
PCT/CN2019/110273 WO2020073938A1 (en) 2018-10-10 2019-10-10 Task scheduler, task processing system, and task processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811179066.6A CN111026516B (en) 2018-10-10 2018-10-10 Exception handling method, task assigning apparatus, task handling system, and storage medium

Publications (2)

Publication Number Publication Date
CN111026516A CN111026516A (en) 2020-04-17
CN111026516B true CN111026516B (en) 2022-12-02

Family

ID=70191851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811179066.6A Active CN111026516B (en) 2018-10-10 2018-10-10 Exception handling method, task assigning apparatus, task handling system, and storage medium

Country Status (1)

Country Link
CN (1) CN111026516B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103034554A (en) * 2012-12-30 2013-04-10 焦点科技股份有限公司 ETL (Extraction-Transformation-Loading) dispatching system and method for error-correction restarting and automatic-judgment starting

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9588804B2 (en) * 2014-01-21 2017-03-07 Qualcomm Incorporated System and method for synchronous task dispatch in a portable device

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103034554A (en) * 2012-12-30 2013-04-10 焦点科技股份有限公司 ETL (Extraction-Transformation-Loading) dispatching system and method for error-correction restarting and automatic-judgment starting

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
嵌入式MPSoC系统中的任务调度管理研究;李橙;《中国优秀硕士学位论文全文数据库 信息科技辑》;20110315(第03期);全文 *

Also Published As

Publication number Publication date
CN111026516A (en) 2020-04-17

Similar Documents

Publication Publication Date Title
JP4016010B2 (en) Real-time scheduling possibility determination method and real-time system
CN111625331B (en) Task scheduling method, device, platform, server and storage medium
CN111026521B (en) Task scheduler, task processing system and task processing method
KR20140080434A (en) Device and method for optimization of data processing in a mapreduce framework
JP2003256221A5 (en)
CN111026540B (en) Task processing method, task scheduler and task processing device
CN111026518B (en) Task scheduling method
CN113010286A (en) Parallel task scheduling method and device, computer equipment and storage medium
CN111026516B (en) Exception handling method, task assigning apparatus, task handling system, and storage medium
CN103823712A (en) Data flow processing method and device for multi-CPU virtual machine system
CN111598768B (en) Image optimization processing method and device, computer equipment and storage medium
CN111026539B (en) Communication task processing method, task cache device and storage medium
CN111026523A (en) Task scheduling control method, task scheduler and task processing device
CN114816777A (en) Command processing device, method, electronic device and computer readable storage medium
JPH07160656A (en) External interruption control method
CN108874548B (en) Data processing scheduling method and device, computer equipment and data processing system
CN111026515B (en) State monitoring device, task scheduler and state monitoring method
CN111026520B (en) Task processing method, control device of processor and processor
CN111026517B (en) Task decomposition device and task scheduler
CN111026513B (en) Task assigning device, task scheduler, and task processing method
JPS59167756A (en) Dispatch control system of virtual computer
CN112463388B (en) SGRT data processing method and device based on multithreading
WO2020073938A1 (en) Task scheduler, task processing system, and task processing method
CN113360186B (en) Task scheduling method, device, electronic equipment and computer readable storage medium
CN112667397B (en) Machine learning system and resource allocation method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant