CN112527471B - Task processing method and device and storage medium - Google Patents

Info

Publication number
CN112527471B
Authority
CN
China
Prior art keywords
task
client
subtask
primary
sending
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910887217.1A
Other languages
Chinese (zh)
Other versions
CN112527471A (en)
Inventor
韩伟森
顾刚
初颖俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Suzhou Software Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201910887217.1A priority Critical patent/CN112527471B/en
Publication of CN112527471A publication Critical patent/CN112527471A/en
Application granted granted Critical
Publication of CN112527471B publication Critical patent/CN112527471B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/461Saving or restoring of program or task context
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/466Transaction processing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiment of the application discloses a task processing method, a task processing device, and a storage medium. The task processing method is applied to a client in a cluster, the cluster comprises a plurality of clients capable of communicating with each other, and the method comprises the following steps: receiving a first task sent by a task sending end; when the first task is determined to be a detachable composite task, splitting the first task to obtain k primary subtasks, wherein k is a positive integer greater than or equal to 1; sending the k primary subtasks to the corresponding clients, wherein the client corresponding to a primary subtask is determined by the type of the primary subtask; receiving the execution result of each primary subtask sent by the corresponding client; determining a first execution result of the first task according to the execution result of each primary subtask; and sending the first execution result to the task sending end. In the embodiment of the application, the split primary subtasks are sent to other clients in the cluster, so that distributed execution of the tasks can be realized and the operation pressure at each client is reduced.

Description

Task processing method and device and storage medium
Technical Field
The embodiment of the application relates to the technical field of computer communication, and relates to but is not limited to a task processing method and device and a storage medium.
Background
Workflow management technology is a technology that abstracts computation into tasks and expresses task dependencies by way of graphs. Through the workflow management technology, the user can concurrently distribute tasks in the workflow to different machines for execution without considering the time and space attributes of calculation, and the development difficulty is greatly simplified.
A device for executing tasks in a workflow by managing the workflow is called a workflow engine. The workflow engine in the related art can implement some beneficial functions, but these functions are mostly added in a statically predefined manner, such as: the failure rollback function can be realized, so that the fault tolerance is improved. But complex real-world scenarios require more efficient performance of workflow processing methods.
Disclosure of Invention
In view of this, embodiments of the present application provide a task processing method and apparatus, and a storage medium.
The embodiment of the application provides a task processing method, which is applied to a client in a cluster, wherein the cluster comprises a plurality of clients capable of communicating with each other, and the method comprises the following steps:
receiving a first task sent by a task sending end;
when the first task is determined to be a detachable composite task, splitting the first task to obtain k primary subtasks; wherein k is a positive integer, and k is more than or equal to 1;
sending the k primary subtasks to corresponding clients; the client corresponding to the primary subtask is determined by the type of the primary subtask;
receiving an execution result of a primary subtask sent by a corresponding client;
determining a first execution result of the first task according to the execution result of each primary subtask;
and sending the first execution result to the task sending end.
An embodiment of the present application further provides a task processing device, where the device is a client in a cluster, where the cluster includes a plurality of clients that can communicate with each other, and the device includes:
the receiving and sending module is used for receiving a first task sent by the task sending end;
the task splitting module is used for splitting the first task to obtain k primary subtasks when the first task is determined to be a detachable composite task; wherein k is a positive integer, and k is more than or equal to 1;
the receiving and sending module is further configured to send the k primary subtasks to corresponding clients; the client corresponding to the primary subtask is determined by the type of the primary subtask;
the receiving and sending module is further used for receiving the execution result of the primary subtask sent by the corresponding client;
the execution result generation module is used for determining a first execution result of the first task according to the execution result of each primary subtask;
the receiving and sending module is further configured to send the first execution result to the task sending end.
An embodiment of the present application further provides a task processing device, where the task processing device includes: the system comprises a memory and a processor, wherein the memory stores a computer program capable of running on the processor, and the processor realizes the task processing method when executing the computer program.
An embodiment of the present application further provides a computer-readable storage medium, in which computer-executable instructions are stored, and the computer-executable instructions are configured to execute the above task processing method.
In the embodiment of the application, the split primary subtask is sent to other clients in the cluster, so that distributed execution of the tasks can be realized, and the operation pressure at each client is reduced. And the corresponding client receives the primary subtask, and sends an execution result to the local client after the primary subtask is executed. Compared with the method that the task sending end queries the database regularly, the task receiving end immediate feedback method in the embodiment of the application can improve the task execution efficiency and reduce the sending times of the instructions, thereby saving the computing resources.
Drawings
FIG. 1 is a schematic flowchart of a task processing method according to an embodiment of the present application;
FIG. 2 is a tree diagram formed by the primary subtasks in the embodiment of the present application;
FIG. 3 is a schematic structural diagram of a client that executes a task processing method in an embodiment of the present application;
FIG. 4 is a schematic diagram of the triggering process of the "any" trigger in the embodiment of the present application;
FIG. 5 is a schematic diagram of the triggering process of the "all" trigger in the embodiment of the present application;
FIG. 6 is a schematic diagram of a client registration process in a cluster in an embodiment of the present application;
FIG. 7 is a schematic diagram of a structure of a client that sends and receives tasks in an embodiment of the present application;
FIG. 8 is a flowchart illustrating the execution of an atomic task according to an embodiment of the present application;
FIG. 9 is a schematic diagram illustrating a configuration of a task processing device according to an embodiment of the present application;
FIG. 10 is a hardware entity diagram of a task processing device according to an embodiment of the present application.
Detailed Description
In the related art, parameterizing tasks can simplify the process of configuring a workflow, but in practical scenarios a task often needs to reference the output of a pre-task. For example, the number of mappings used by a post-task may need to be determined dynamically from the amount of data processed by a preceding data extraction task.
In addition, the distributed workflow engine in the related art needs to store the state into a public database, and continuously queries the public database in the execution process to determine whether the task is finished, and when the task amount is large, the efficiency is low.
Moreover, during workflow orchestration, several different workflows may share a similar sub-workflow and differ only slightly. For example, multiple workflows may process data in the same manner and differ only in the source of the data. In this case, the user needs to duplicate the same structure multiple times and modify the configurations that differ among the copies, which is time-consuming and error-prone.
Based on the above problems, an embodiment of the present application provides a task processing method that implements the functions of a workflow engine for processing workflows. In the embodiment of the application, the workflow is expressed in the form of a composite task, and the context in the composite task is maintained by a symbol table, so that the output of a pre-task can be referenced by its post-task.
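As an illustration only, the following minimal sketch shows how a symbol table could let a post-task reference the output of a pre-task. The `SymbolTable` class, the `${task.field}` reference syntax, and the task and field names are assumptions made for this example; the source does not specify them.

```python
import re


class SymbolTable:
    """Hypothetical context store for one composite task."""

    # Assumed reference syntax: "${task_name.field_name}".
    _REF = re.compile(r"\$\{(\w+)\.(\w+)\}")

    def __init__(self):
        self._outputs = {}

    def record(self, task_name, output):
        # Store the output produced by a finished pre-task.
        self._outputs[task_name] = dict(output)

    def resolve(self, value):
        # Replace a "${task.field}" reference with the recorded output;
        # any other value is returned unchanged.
        if not isinstance(value, str):
            return value
        match = self._REF.fullmatch(value)
        if match is None:
            return value
        task_name, field_name = match.groups()
        return self._outputs[task_name][field_name]


table = SymbolTable()
# The data extraction pre-task reports how many rows it processed.
table.record("extract", {"rows": 1200})

# The post-task's parameters reference that output instead of a fixed value.
params = {"mappings": "${extract.rows}", "mode": "batch"}
resolved = {key: table.resolve(val) for key, val in params.items()}
print(resolved)
```

In this sketch the post-task's parameters are resolved only after the pre-task has recorded its output, which is the dependency the symbol table is meant to capture.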
In addition, a pre-task drives the execution of its post-task by sending a trigger signal to the post-task. This replaces the database state polling used by distributed workflow engines in the related art, which greatly reduces the workflow engine's accesses to the database, effectively reduces the database load when a large number of workflows run simultaneously, and improves the throughput of the workflow engine.
In addition, different atomic tasks are loaded in different clients in the embodiment of the application, so that when a user needs to use the same data processing flow for multiple times, only the required parameters need to be sent to the client where the atomic task corresponding to the data processing flow is located, the data processing efficiency is improved, and the error rate is reduced.
The technical solution of the present application is further elaborated below with reference to the drawings and the embodiments.
Example one
An embodiment of the present application provides a task processing method, as shown in fig. 1, where the task processing method is applied to a client in a cluster, where the cluster includes multiple clients capable of communicating with each other, and the method includes:
S110, receiving a first task sent by a task sending end.
Here, the task sending end may be a task input interface, or may be another client in the cluster. If the task sending end is a task input interface, the first task can be presented in the form of a flow specification file; if the task sending end is another client, the first task is presented in the form of a symbol table and an abstract syntax tree.
The flow specification file is a text file in which the workflow is recorded according to the file's specific document structure, so that the function of each task in the workflow and the dependencies among the tasks are clearly shown. The target attribute of each task shows the execution order of the processes contained in the workflow.
In computer science, an abstract syntax tree (AST) is an abstract representation of the syntax structure of source code. The abstract syntax tree shows, in tree form, the relationships among the execution statements in a function and the variable and constant references involved in those statements.
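To make the notion of an abstract syntax tree concrete, the sketch below uses Python's standard `ast` module on a one-line statement. This only illustrates the general concept; the source does not say which parser or tree format the workflow engine uses, and the statement itself is an invented example.

```python
import ast

# An invented statement with one assignment target, one variable, one constant.
source = "net_pay = gross_pay - gross_pay * 0.2"
tree = ast.parse(source)

# Walk every node in the tree and collect variable and constant references.
names = sorted({node.id for node in ast.walk(tree) if isinstance(node, ast.Name)})
constants = [node.value for node in ast.walk(tree) if isinstance(node, ast.Constant)]

print(names)      # variable names referenced by the statement
print(constants)  # literal constants used by the statement
print(ast.dump(tree.body[0]))  # the statement as a nested tree
```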
S120, when the first task is determined to be a detachable composite task, the first task is split to obtain k primary subtasks; wherein k is a positive integer, and k is more than or equal to 1.
Here, a composite task is defined relative to an atomic task. In the embodiment of the present application, a basic task that can be executed in a client is referred to as an atomic task. Generally, a workflow is a composite task. The functions implemented by atomic tasks are the basic steps shared by most of the workflows that a workflow engine handles. For example, suppose a workflow engine processes workflows related to employee attendance, and both the approval request workflow and the clock-in time query workflow need to obtain an employee's job number from the employee's name. Then the task of obtaining the job number from the name may be an atomic task, and the approval request task may be a composite task.
In general, a compound task is composed of multiple atomic tasks, which may be in the same hierarchy or in different levels of nesting. For example, a compound task may be composed of two atomic tasks executed sequentially, or may be composed of an atomic task executed first and a compound task executed subsequently, and the sub-tasks split from the compound task executed subsequently may include the atomic task and the compound task, so as to form nesting of atomic tasks at different levels.
Based on the composition manner of the compound task, in S120, all of the k primary sub-tasks obtained after splitting the first task may be atomic tasks, or may include both the atomic tasks and the compound task.
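The nesting described above can be sketched with two small data classes. The class names, the `split` helper, and the example task names are assumptions made for illustration; the source does not define a concrete data model.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class AtomicTask:
    name: str


@dataclass
class CompositeTask:
    name: str
    subtasks: List["Task"] = field(default_factory=list)


Task = Union[AtomicTask, CompositeTask]


def split(task: CompositeTask) -> List[Task]:
    """Return only the primary subtasks (the first nesting level)."""
    return list(task.subtasks)


def nesting_depth(task: Task) -> int:
    """Atomic tasks have depth 0; a composite task adds one level."""
    if isinstance(task, AtomicTask):
        return 0
    return 1 + max((nesting_depth(t) for t in task.subtasks), default=0)


workflow = CompositeTask("approval_request", [
    AtomicTask("lookup_job_number"),
    CompositeTask("approve", [
        AtomicTask("notify_manager"),
        AtomicTask("record_decision"),
    ]),
])

primary = split(workflow)
kinds = [type(t).__name__ for t in primary]
print(kinds, nesting_depth(workflow))
```

Here the k = 2 primary subtasks contain one atomic task and one composite task, matching the mixed case described in S120.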
S130, sending the k primary subtasks to corresponding clients; and the client corresponding to the primary subtask is determined by the type of the primary subtask.
In the embodiment of the application, after the local client splits the composite task, the split primary subtask is sent to other clients in the cluster, so that distributed execution of the task is realized, and the operation pressure at each client is reduced.
Before sending a certain primary subtask (for example, the ith primary subtask) to the outside, the local client determines a client corresponding to the ith primary subtask according to the type of the ith primary subtask. For example, when the ith primary subtask is an atomic task, if the client receiving the ith primary subtask cannot execute the ith primary subtask, the ith primary subtask needs to be forwarded to the next client again. This will undoubtedly result in waste of resources and decrease the efficiency of task processing. Therefore, when one client sends an atomic task to the outside, the client capable of executing the atomic task is selected as the corresponding client.
When the ith primary subtask is a compound task, since all clients in the cluster have the capability of splitting the compound task, generally speaking, when one client sends the compound task to the outside, any one other client in the cluster can be selected as a corresponding client. As can be appreciated by those skilled in the art, if the selected client is processing other tasks, the composite task sent to the client definitely needs to be processed after the client has processed the current task, which reduces the efficiency of task processing, and therefore, any idle client in the cluster may be selected as the corresponding client.
S140, receiving the execution result of the primary subtask sent by the corresponding client.
S150, determining a first execution result of the first task according to the execution result of each primary subtask.
S160, sending the first execution result to the task sending end.
Here, after receiving the primary subtask and completing the execution of the primary subtask, the corresponding client sends an execution result to the local client. Just as after the local client finishes executing the first task, the local client also sends a first execution result of the first task to the task sending end. This is a result feedback manner corresponding to the distributed execution manner of the task in the embodiment of the present application. Compared with the method that the task sending end queries the database regularly, the task receiving end immediate feedback method in the embodiment of the application can improve the task execution efficiency and reduce the sending times of the instructions, thereby saving the computing resources.
Generally, the subtasks split by one compound task have an association with each other. For example, the execution result of the previous sub-task may be a parameter required for the execution of the next sub-task. Therefore, in S150, when determining the first execution result of the first task, the execution result of each primary sub-task obtained by splitting needs to be obtained. That is, the first execution result of the first task can be obtained only after all the primary subtasks of the first task have been executed.
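Steps S120 to S150 can be sketched as a single function. This is a simplified, single-process sketch: clients are modeled as plain callables, dispatch is sequential, and dependencies between subtasks are not modeled. The function name, the dictionary layout, and the sample values are all assumptions for illustration.

```python
def process_first_task(first_task, choose_client, combine):
    """Sketch of S120-S150 for a detachable composite task."""
    primary_subtasks = first_task["subtasks"]        # S120: split the task
    results = {}
    for subtask in primary_subtasks:
        client = choose_client(subtask)              # S130: client chosen by subtask type
        results[subtask["name"]] = client(subtask)   # S140: client returns its result
    # S150: the first result is available only once every subtask has a result.
    return combine(results)


# Hypothetical clients, keyed by the type of atomic task they can execute.
clients = {
    "base_salary": lambda task: 5000,
    "bonus": lambda task: 800,
}

first_task = {
    "name": "gross_pay",
    "subtasks": [
        {"name": "base", "type": "base_salary"},
        {"name": "extra", "type": "bonus"},
    ],
}

result = process_first_task(
    first_task,
    choose_client=lambda subtask: clients[subtask["type"]],
    combine=lambda results: sum(results.values()),
)
print(result)
```

In a real cluster the call `client(subtask)` would be a network send followed by an asynchronous reply, which is what lets the receiving end report back immediately instead of being polled.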
Example two
The embodiment of the application provides a task processing method, which comprises the following steps:
S210, receiving a first task sent by a task sending end.
S220, when the first task is determined to be a detachable composite task, the first task is split to obtain k primary subtasks; wherein k is a positive integer, and k is more than or equal to 1.
S230, determining the sending sequence of the k primary subtasks according to the post-tasks of the k primary subtasks.
S240, sending the k primary subtasks to corresponding clients according to the sending sequence; and the client corresponding to the primary subtask is determined by the type of the primary subtask.
In general, the subtasks in a composite task are executed in order. For example, in a composite task that queries the clock-in time by employee name, the subtask that queries the employee job number by employee name is executed first, and its execution result is the employee job number. The subtask that queries the clock-in time by employee job number is executed afterwards, and the employee job number is a parameter required when that subtask is executed. The job number query subtask is a pre-task of the clock-in time query subtask, and the clock-in time query subtask is a post-task of the job number query subtask.
In a flow specification file corresponding to a workflow, each task defines its post-tasks in its target attribute. After the flow specification file is converted into an abstract syntax tree, the pre-tasks and post-tasks of each task can be obtained from the correspondence between the constants and variables in the abstract syntax tree.
In some embodiments, when the flow specification file is processed, an adjacency list may be generated according to the target attribute of the task, and the adjacency list is used to record the post-task of each task. When the clients send tasks to each other, the adjacency list is sent at the same time, so that each client can conveniently search the post-task of each task.
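A minimal sketch of such an adjacency list follows. The task dictionaries with `name` and `target` keys are an assumed in-memory form of the flow specification file, whose actual syntax is not given in the source.

```python
def build_adjacency(tasks):
    """Map each task name to the list of its post-tasks."""
    adjacency = {task["name"]: [] for task in tasks}
    for task in tasks:
        for target in task.get("target", []):
            adjacency[task["name"]].append(target)
    return adjacency


def pre_tasks(adjacency):
    """Invert the adjacency list: task name -> list of its pre-tasks."""
    inverse = {name: [] for name in adjacency}
    for name, targets in adjacency.items():
        for target in targets:
            inverse[target].append(name)
    return inverse


tasks = [
    {"name": "lookup_job_number", "target": ["lookup_clock_in_time"]},
    {"name": "lookup_clock_in_time"},  # no target: this task has no post-task
]

adjacency = build_adjacency(tasks)
inverse = pre_tasks(adjacency)
print(adjacency)
print(inverse)
```

Sending the adjacency list along with a task means the receiving client can look up post-tasks locally, and the inverted form gives the pre-task counts needed later for trigger handling.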
In S230, after determining the post task of each of the k primary subtasks, the execution order of the k primary subtasks may be determined. Here, the sending of the primary subtasks to the corresponding client marks the start of the execution of one primary subtask, and therefore, the execution order of the k primary subtasks is the sending order of the k primary subtasks. And sending the k primary subtasks to the corresponding client according to the sending sequence, so that the distributed execution of the tasks can be realized.
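One way to derive a sending order from the post-task relationships is a topological sort. The source does not name an algorithm; the sketch below assumes Kahn's algorithm, and the function name and sample task names are hypothetical.

```python
from collections import deque


def sending_order(adjacency):
    """Order tasks so that every task comes after all of its pre-tasks.

    `adjacency` maps a task name to the list of its post-tasks.
    """
    # Count the pre-tasks of every task.
    pending = {name: 0 for name in adjacency}
    for targets in adjacency.values():
        for target in targets:
            pending[target] += 1

    # Tasks without pre-tasks can be sent immediately.
    ready = deque(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for target in adjacency[name]:
            pending[target] -= 1
            if pending[target] == 0:
                ready.append(target)

    if len(order) != len(adjacency):
        raise ValueError("the tasks contain a dependency cycle")
    return order


# Three independent pre-tasks feed one post-tax pay calculation.
adjacency = {
    "base_pay": ["net_pay"],
    "tax": ["net_pay"],
    "insurance": ["net_pay"],
    "net_pay": [],
}
order = sending_order(adjacency)
print(order)
```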
In some embodiments, S240 comprises the steps of:
S241, arranging the k primary subtasks according to the sending order.
Here, a tree is formed by arranging the k primary subtasks in the sending order. As shown in FIG. 2, each primary subtask corresponds to a node in the tree, the root node corresponds to the primary subtask executed last, and the leaf nodes correspond to the primary subtasks executed first. The number of upper nodes connected to a node is the number of pre-tasks of the primary subtask corresponding to that node, and the number of lower nodes connected to a node is the number of post-tasks of that primary subtask. For example, suppose the kth primary subtask, which is sent last among the k primary subtasks, calculates post-tax pay, and the calculation requires three parameters that come from three pre-tasks of the kth primary subtask. After the k primary subtasks are arranged in the sending order, the number of upper nodes connected to the root node corresponding to the kth primary subtask is 3.
S242, when the ith primary subtask is sent, if the ith primary subtask does not have a pre-task, the ith primary subtask is sent to the corresponding client; wherein i is a positive integer, and i is more than or equal to 1 and less than or equal to k.
Here, the ith primary subtask having no pre-task means that the ith primary subtask is among the primary subtasks executed first among the k primary subtasks and corresponds to a leaf node in FIG. 2, and the client directly sends the ith primary subtask to the corresponding client. If the tree obtained in S241 has k_1 leaf nodes, then k_1 of the k primary subtasks have no pre-task, and these k_1 primary subtasks are sent directly to the corresponding clients.
S243, if the ith primary subtask has a pre-task, acquiring the number of generated trigger signals of the ith primary subtask to obtain a first numerical value; and the trigger signal is generated according to the execution result of the preposed task of the ith primary subtask.
S244, determining the number of the trigger signals of the ith primary subtask which need to be generated before the ith primary subtask starts to be processed, and obtaining a second value.
S245, when the first numerical value is determined to be equal to the second numerical value, sending the ith primary subtask to the corresponding client.
S246, after receiving the execution result of the ith primary subtask, generating a trigger signal of a post-task of the ith primary subtask according to the execution result.
Here, if the ith primary subtask has a pre-task, that is, the node corresponding to the ith primary subtask is not a leaf node in fig. 2, it indicates that the processing of the ith primary subtask needs to depend on the execution result of the pre-task, and at this time, the ith primary subtask needs to be sent to the corresponding client after the local client generates the trigger signal meeting the condition.
Here, the trigger signal is generated according to the execution result of a pre-task of the ith primary subtask. For example, suppose the (i-1)th primary subtask is a pre-task of the ith primary subtask, so the trigger signal of the ith primary subtask needs to be generated according to the execution result of the (i-1)th primary subtask. Suppose further that the post-tasks of the (i-1)th primary subtask include the (i+1)th primary subtask in addition to the ith primary subtask. If, after receiving the execution result of the (i-1)th primary subtask, the client determines that the trigger signal of the (i+1)th primary subtask should be generated instead, the ith primary subtask cannot start to be executed.
If the local client has generated m_1 trigger signals of the ith primary subtask at this time, and the trigger attribute of the ith primary subtask specifies that m_2 trigger signals must be generated at the local client before its execution can start, then the first value is equal to m_1 and the second value is equal to m_2.
When the client determines that the first value is equal to the second value, it sends the ith primary subtask to the corresponding client. After the corresponding client executes the ith primary subtask, it sends the execution result of the ith primary subtask to the local client, and the local client determines, according to the execution result, for which post-task of the ith primary subtask a trigger signal is generated.
In some embodiments, the second value may be 1. In this case, either the ith primary subtask has only one pre-task, or it has several pre-tasks but may start to execute as soon as the execution result of any one of them causes the local client to generate the trigger signal of the ith primary subtask.
In other embodiments, the second value may be equal to the number of predecessors to the ith primary sub-task. At this time, the ith primary subtask may start to execute only after the execution result of each pre-task of the ith primary subtask generates a trigger signal of the ith primary subtask at the local client.
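The comparison between the first value and the second value can be sketched as a small counter. The `TriggerGate` class and the mode names "any" and "all" are assumptions for illustration, chosen to match the two embodiments just described.

```python
class TriggerGate:
    """Hypothetical gate deciding when a primary subtask may be sent.

    Subtasks without pre-tasks do not need a gate; they are sent directly.
    """

    def __init__(self, pre_task_count, mode="all"):
        if mode not in ("any", "all"):
            raise ValueError("mode must be 'any' or 'all'")
        # Second value: trigger signals required before the subtask starts.
        self.required = 1 if mode == "any" else pre_task_count
        # First value: trigger signals generated so far.
        self.generated = 0
        self.sent = False

    def on_trigger(self):
        """Record one trigger signal; return True when the subtask should be sent."""
        self.generated += 1
        if not self.sent and self.generated == self.required:
            self.sent = True
            return True
        return False


# "all": the subtask waits for a trigger signal from each of its three pre-tasks.
all_gate = TriggerGate(pre_task_count=3, mode="all")
all_decisions = [all_gate.on_trigger() for _ in range(3)]

# "any": the first trigger signal is enough, and later ones do not resend.
any_gate = TriggerGate(pre_task_count=3, mode="any")
any_decisions = [any_gate.on_trigger() for _ in range(3)]

print(all_decisions, any_decisions)
```

Because the gate is updated only when an execution result arrives, no database polling is needed to find out whether a subtask is ready.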
S250, receiving the execution result of the primary subtask sent by the corresponding client.
S260, determining a first execution result of the first task according to the execution result of each primary subtask.
S270, the first execution result is sent to the task sending end.
In some embodiments, sending the ith primary sub-task to the corresponding client in S242 and S245 includes the following steps:
S247a, if the ith primary subtask has a pre-task, determining whether the ith primary subtask is bound to the pre-task of the ith primary subtask, and if the ith primary subtask is bound to the pre-task of the ith primary subtask, determining a client receiving the pre-task of the ith primary subtask as the first client.
S247b, if the ith primary subtask is not bound with the pre-task of the ith primary subtask or the ith primary subtask does not have the pre-task, judging whether the ith primary subtask is an atomic task, an undetachable composite task or a detachable composite task, and obtaining a first judgment result.
S247c, if the first judgment result shows that the ith primary subtask is an atomic task, judging whether the ith primary subtask has a specified execution client;
if yes, determining the execution client specified by the ith primary subtask as a first client;
if not, determining the client with the atomic task interpreter corresponding to the ith primary subtask as the first client; and the atomic task interpreter corresponding to the ith primary subtask is used for executing the ith primary subtask.
S247d, if the first determination result indicates that the ith primary subtask is an undetachable composite task, determining a client having atomic task interpreters corresponding to all subtasks of the ith primary subtask as a first client.
S247e, if the first judgment result shows that the ith primary subtask is a detachable composite task, determining an idle client in the cluster as the first client.
S247a to S247e are used for determining the client corresponding to the ith primary subtask.
S247f, sending the ith primary sub-task to the first client.
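The selection rules in S247a to S247e can be sketched as one function. The dictionary layout for subtasks and clients, the `kind` labels, and the client names are assumptions for this example; returning `None` when no client qualifies is also an assumption, since the source does not describe that case.

```python
def choose_first_client(subtask, clients, bound_client=None):
    """Pick the first client for a primary subtask, following S247a-S247e.

    `bound_client` is the client that received the pre-task, passed only
    when the subtask is bound to that pre-task.
    """
    if bound_client is not None:                       # S247a
        return bound_client

    kind = subtask["kind"]                             # S247b
    if kind == "atomic":                               # S247c
        if subtask.get("client"):
            return subtask["client"]                   # specified execution client
        for name, info in clients.items():
            if subtask["type"] in info["interpreters"]:
                return name
    elif kind == "undetachable":                       # S247d
        needed = set(subtask["subtask_types"])
        for name, info in clients.items():
            if needed <= info["interpreters"]:         # has every needed interpreter
                return name
    elif kind == "detachable":                         # S247e
        for name, info in clients.items():
            if info["idle"]:
                return name
    return None  # assumption: no suitable client found


clients = {
    "c1": {"interpreters": {"sql"}, "idle": False},
    "c2": {"interpreters": {"sql", "http"}, "idle": True},
}

picks = [
    choose_first_client({"kind": "atomic", "type": "http"}, clients),
    choose_first_client({"kind": "atomic", "type": "http", "client": "c1"}, clients),
    choose_first_client({"kind": "undetachable", "subtask_types": ["sql", "http"]}, clients),
    choose_first_client({"kind": "detachable"}, clients),
    choose_first_client({"kind": "atomic", "type": "http"}, clients, bound_client="c1"),
]
print(picks)
```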
Example three
An embodiment of the present application provides a task processing method, where the task processing method includes:
S310, receiving a first task sent by a task sending end.
S320, judging whether the first task needs to be analyzed.
S330, when the first task is determined to need to be analyzed, the first task is analyzed.
Here, when the task sending end is a task input interface, the first task is presented in the form of a flow specification file, and the local client needs to parse the first task into a symbol table and an abstract syntax tree.
S340, when the first task is determined to be a detachable composite task, the first task is split to obtain k primary subtasks; wherein k is a positive integer, and k is more than or equal to 1; sending the k primary subtasks to corresponding clients; the client corresponding to the primary subtask is determined by the type of the primary subtask; receiving an execution result of a primary subtask sent by a corresponding client; and determining a first execution result of the first task according to the execution result of each primary subtask.
Here, the steps in S340 are the same as S120 to S150 in the first embodiment, and are used for processing a detachable composite task.
S350, when the first task is determined to be an atomic task, judging whether the first task has a specified execution client;
if yes, determining the execution client specified by the first task as a second client;
if not, determining the client with the atomic task interpreter corresponding to the first task as a second client; wherein, the atomic task interpreter corresponding to the first task is used for executing the first task.
Here, S350 defines processing that needs to be performed when the first task that the client receives from the task input interface is an atomic task. When the client receives the first task from the task input interface, the first task needs to be analyzed first, and then it can be determined that the first task is an atomic task. In other words, the client has already performed some arithmetic processing before being able to determine that the first task is an atomic task. Therefore, in order to avoid occupying too much computing resources of the local client and improve the distributed execution degree of the task as much as possible, the local client sends the atomic task obtained by analysis to other clients for execution.
S360, the first task is sent to the second client.
S370, receiving a first execution result of the first task sent by the second client.
S380, the first execution result is sent to the task sending end.
Here, the second client has an atomic task interpreter corresponding to the first task, that is, the second client can execute the first task. And when the second client finishes executing the first task and obtains a first execution result of the first task, sending the first execution result to the local client, and sending the first execution result to the task input interface by the local client to mark that the whole workflow is finished.
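The selection of the second client in S350 can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary layout of a task, the `interpreters` mapping from task type to client names, and the function name `choose_second_client` are all assumptions introduced here.

```python
from typing import Dict, Optional, Set


def choose_second_client(task: dict, interpreters: Dict[str, Set[str]]) -> Optional[str]:
    """Pick the client that should execute an atomic task (sketch of S350)."""
    # A task that specifies an execution client must run on that client.
    if task.get("routing"):
        return task["routing"]
    # Otherwise any client holding an interpreter for the task type will do.
    candidates = interpreters.get(task["type"], set())
    # sorted() only makes the choice deterministic for this sketch.
    return sorted(candidates)[0] if candidates else None


# Hypothetical cluster state: which clients hold which atomic task interpreters.
interpreters = {"load.hdfs": {"client B"}, "local": {"client A", "client C"}}

pinned = choose_second_client({"type": "load.hdfs", "routing": "client B"}, interpreters)
free = choose_second_client({"type": "local"}, interpreters)
```

In this sketch the default case simply takes the first capable client in name order; a real cluster could equally choose by load or idleness.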
Example four
The embodiment of the application provides a task processing method, which comprises the following steps:
S410, receiving a first task sent by a task sending end.
S420, judging whether the first task needs to be analyzed.
S430, when the first task is determined to need to be analyzed, the first task is analyzed.
S440, when the first task is determined to be a detachable composite task, the first task is detached to obtain k primary subtasks; wherein k is a positive integer, and k is more than or equal to 1; sending the k primary subtasks to a corresponding client; the client corresponding to the primary subtask is determined by the type of the primary subtask; receiving an execution result of a primary subtask sent by a corresponding client; and determining a first execution result of the first task according to the execution result of each primary subtask.
S450, when the first task is determined to be an undetachable composite task, sending the first task to a third client; and the third client has an atomic task interpreter corresponding to all subtasks of the first task.
S450 defines the processing that needs to be performed when the first task received by the client from the task input interface is a non-splittable compound task. In a non-splittable compound task, all subtasks need to be executed on the same client. In general, the local client may determine that the first task is an un-splittable compound task only if all of the subtasks of the first task are atomic tasks, and determine, for each subtask, a client that can execute the subtask separately.
Those skilled in the art will appreciate that a cluster of a certain size needs to handle multiple workflows simultaneously. In order to improve the processing efficiency of the workflow, an atomic task interpreter corresponding to one atomic task needs to be loaded on at least one client in the cluster, so that the same atomic task in different workflows can be executed simultaneously. Therefore, the client corresponding to each subtask of the first task is a set including at least one client, the local client finds an intersection of the sets, and the clients in the obtained intersection can execute all subtasks of the first task. The local client may determine a client in the intersection as a third client and then send the first task to the third client.
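The intersection step described above can be sketched in a few lines. The mapping `capable` (atomic task type to the set of clients that hold the corresponding interpreter), the task type names, and the function name are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def clients_for_all_subtasks(subtask_types: List[str], capable: Dict[str, Set[str]]) -> Set[str]:
    """Return the clients that can execute every subtask of a non-splittable task."""
    sets = [capable.get(t, set()) for t in subtask_types]
    if not sets:
        return set()
    # The intersection keeps only clients present in every per-subtask set.
    return set.intersection(*sets)


# Hypothetical capability table for a small cluster.
capable = {
    "local": {"A", "B", "C"},
    "load.hdfs": {"B", "C"},
    "extract": {"C", "D"},
}

third_client_candidates = clients_for_all_subtasks(["local", "load.hdfs", "extract"], capable)
```

Any member of the resulting set may serve as the third client; an empty set means no single client can run all subtasks.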
S460, receiving a first execution result of the first task sent by the third client.
S470, sending the first execution result to the task sending end.
Example five
An embodiment of the present application provides a task processing method, where the task processing method includes:
S510, receiving a first task sent by a task sending end.
S520, judging whether the first task needs to be analyzed.
S530, when it is determined that the first task does not need to be parsed and is an atomic task, executing the first task by using an atomic task interpreter corresponding to the first task to obtain a first execution result of the first task.
Here, when the task sender is another client in the cluster, the first task is presented in the form of an abstract syntax tree, and the local client does not need to parse the first task.
S530 defines the processing performed when the first task that the client receives from another client in the cluster is an atomic task. Another client in the cluster sends the first task to the local client because it has determined that the local client has the atomic task interpreter corresponding to the first task and can therefore execute it. If the local client forwarded the first task to yet another client, it would trigger unnecessary data transmission, waste system resources, and reduce task processing efficiency. Therefore, to improve the efficiency of task processing, the local client directly executes the first task by using the atomic task interpreter corresponding to the first task, and obtains a first execution result.
S540, when the first task is determined to be a detachable composite task, the first task is detached to obtain k primary subtasks; wherein k is a positive integer, and k is more than or equal to 1; sending the k primary subtasks to a corresponding client; the client corresponding to the primary subtask is determined by the type of the primary subtask; receiving an execution result of a primary subtask sent by a corresponding client; and determining a first execution result of the first task according to the execution result of each primary subtask.
S550, sending the first execution result to the task sending end.
Example six
S610, receiving a first task sent by a task sending end.
S620, judging whether the first task needs to be analyzed.
S630, when the first task does not need to be analyzed and is determined to be an undetachable composite task, the first task is split, and at least two primary subtasks are obtained.
S640, executing each primary subtask in the at least two primary subtasks by using the atomic task interpreter corresponding to each primary subtask to obtain an execution result of each primary subtask.
S650, determining a first execution result of the first task according to the execution result of each primary subtask.
S630 to S650 define the processing that needs to be performed when the first task that the client receives from the other clients in the cluster is an un-splittable composite task. As described in the fourth embodiment, only when all the subtasks of a compound task are atomic tasks, the client can determine that the compound task is an un-splittable compound task. It can be seen that the subtasks of the first task received by the local client are all atomic tasks. If only one sub-task is available after the first task is split, then the first task is actually an atomic task, not an un-splittable compound task. Thus, it can be concluded that at least two primary subtasks will result after the first task split.
The reason why the local client receives the first task is that the client sending the first task performs the determination process described in the fourth embodiment, which means that the local client has an atomic task interpreter corresponding to each primary subtask of the first task. Therefore, as long as the primary subtasks are executed by the atomic task interpreter corresponding to each primary subtask in turn, the first execution result of the first task can be obtained.
S660, when the first task is determined to be a detachable composite task, the first task is detached to obtain k primary subtasks; wherein k is a positive integer, and k is more than or equal to 1; sending the k primary subtasks to corresponding clients; the client corresponding to the primary subtask is determined by the type of the primary subtask; receiving an execution result of a primary subtask sent by a corresponding client; and determining a first execution result of the first task according to the execution result of each primary subtask.
S670, sending the first execution result to the task sending end.
Example seven
The embodiment of the application provides a task processing method, and the object processed by the method is a workflow. In the embodiment of the application, the workflow is represented as a compound task. The method is applied to clients in a cluster that processes workflows. Here, the cluster is a distributed system, with clients located on multiple devices in the distributed system. The clients in the cluster can exchange information, tasks, and execution results. Each client comprises several functional modules, which are used for exchanging information, tasks, and execution results with other clients, or for splitting and executing received tasks.
In the embodiment of the application, the workflow is recorded in the flow specification file as a compound task. The flow specification file is a text file, and the workflow is recorded in it according to its specific document structure, so that the function of each task in the workflow and the dependency relationships among the tasks are clearly shown. The target attribute of each task shows the execution order of the steps contained in the workflow. An example of a workflow recorded in a flow specification file is shown below:
name: example workflow
configuration:
  workDir: /received
task:
  - type: watch for file
    name: monitoring file
    routing: client B
    target:
      merging: avg(file.sizes) < 10
      loading: otherwise
    configuration:
      monitoring interval: 10
      monitoring directory: ${workDir}/ftp
      number of target files: 1000
  - type: local
    name: merging
    target:
      loading
  - type: load.hdfs
    name: loading
    configuration:
      target directory: ${workDir}
      number of mappers: ${sum(monitoring file.sizes)/1024}
    trigger: any
As shown in the above flow specification file, the task name of the compound task is example workflow. The compound task includes three atomic tasks, whose task names are defined in the task attribute of the compound task as monitoring file, merging, and loading, respectively. The example workflow is the parent task of these three tasks, and the three tasks are child tasks of the example workflow.
In order to reduce the complexity of the task executed on each client, in the embodiment of the present application, a simple task with basic operation and high repetition rate is defined as an atomic task, and a simple operation corresponding to an atomic task may be included in many complex workflows. By splitting the composite task corresponding to the workflow into the combination of the atomic tasks, the atomic tasks can be distributed to different clients to be executed, and therefore the operation pressure on each client is reduced. In addition, different workflows may be split to obtain some identical atomic tasks, and the atomic tasks from different workflows may be dispatched to the same client to be executed, so that the identical atomic tasks do not need to be repeatedly executed on different clients, and thus the computing resources of the clients are saved.
A compound task has no type attributes but contains task attributes. A compound task itself does not actually perform an operation, but only forms a layer of nested scopes. This nested scope can fulfill the following functions:
First, a symbol table scope is provided, and variables defined in the configuration attribute of the parent task can be used in the child tasks.
In computer science, a symbol table is a data structure used by a language translator (e.g., a compiler or an interpreter). In the symbol table, each identifier in the program source code is bound to its declaration or usage information, such as its data type, scope, and memory address. Here, a scope is the region of program code in which a name can be used.
For example, the variable workDir (working directory) is defined in the configuration attribute of the compound task example workflow, and the monitoring file task is located in the task attribute of the example workflow. The monitoring file task can therefore reference workDir to define its monitoring directory, "monitoring directory: ${workDir}/ftp", without restating the value of workDir.
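The nested symbol table scope can be illustrated with a short sketch. It assumes that variable references take the form `${name}` and models the scopes with Python's standard `collections.ChainMap`; the helper name `expand` and the dictionary contents are hypothetical.

```python
import re
from collections import ChainMap

# Scope of the parent (compound) task: variables from its configuration attribute.
parent_scope = {"workDir": "/received"}

# Scope of a child task: its own variables first, then the parent's.
child_scope = ChainMap({"monitoring interval": 10}, parent_scope)


def expand(text, scope):
    """Replace each ${name} reference with the value found in the scope chain."""
    return re.sub(r"\$\{(\w+)\}", lambda m: str(scope[m.group(1)]), text)


monitor_dir = expand("${workDir}/ftp", child_scope)
```

The lookup falls through to the parent scope because the child does not define workDir itself; a child that defined its own workDir would shadow the parent's value.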
Second, a definition of life cycle is provided.
For example, the execution result of an atomic task named as an extraction file includes generating a temporary file, and path information of the temporary file is defined in the configuration attribute of a parent task of the extraction file task, so that all post tasks of the extraction file task in a child task of the parent task can use the temporary file by referring to the path information. When the execution of the parent task is finished, the scope formed by the parent task disappears, the temporary file is deleted, and the life cycle of the temporary file is finished.
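The life cycle behaviour can be sketched with a context manager whose exit corresponds to the end of the parent task. This is only an analogy under stated assumptions: the key name `tmp.file`, the file name, and the use of a temporary directory are all invented for illustration.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def parent_task_scope():
    """Scope of a parent task; the temporary file lives only inside it."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Path information defined in the parent's configuration attribute.
        yield {"tmp.file": os.path.join(tmp_dir, "extracted.tmp")}
    # Leaving the block removes the directory and the temporary file with it.


with parent_task_scope() as config:
    path = config["tmp.file"]
    with open(path, "w") as fh:   # the extraction file task writes the file
        fh.write("payload")
    with open(path) as fh:        # a post-task reads it through the same path
        seen = fh.read()
    existed_inside = os.path.exists(path)

existed_after = os.path.exists(path)
```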
Third, a way to reuse workflows is provided.
For example, a parent task includes two child tasks, an extraction task for extracting a file from an extraction path and a loading task for loading the file to a loading path. Thus, the parent task including the two tasks can be actually used to load the file at the extraction path to the loading path, and the operation of file migration is completed. By defining the variable extraction path extract.dir and load path load.dir in the configuration attributes of this parent task, respectively:
load.dir: /load/20190413;
extract.dir: /extract/20190412,
the parent task may be referenced directly in the other workflows as a "file migration" task. Other tasks referencing the parent task need only define an extraction path and a loading path in the configuration attributes of the parent task, and the parent task can perform an action of migrating a file from the extraction path to the loading path.
Accordingly, based on the first function of the nested scope, an extraction task may refer to an extraction path through a "$ { extract.
Unlike a compound task, an atomic task is a task with a type attribute. One client can send the abstract syntax tree and the symbol table corresponding to a certain atomic task to a corresponding client capable of processing the atomic task according to the type identifier of the atomic task, so that the atomic task is executed at the corresponding client.
In addition to the respective attributes of the above compound task or atomic task, the compound task and the atomic task also include the following common attributes:
Attribute one: name, i.e., the name of the task. To avoid ambiguous references, task names must not be duplicated under the same compound task.
Attribute two: routing, i.e., the routing mode of the task. The routing mode may be the default mode or the designated mode. If a task does not define the routing attribute, the task uses the default mode and can be sent to any client capable of executing it. If a task defines a client to execute it, the task can only be sent to the client specified in the routing attribute.
For example, in the above flow specification file, the merging atomic task does not define a routing attribute, which indicates that it can be dispatched to any client capable of executing it; the routing attribute of the monitoring file atomic task specifies client B, which indicates that the task can only be sent to client B.
Attribute three: target. The target attribute defines the post-tasks of the task, i.e., the tasks that begin execution after the task has completed. In some embodiments, after the task is executed, the trigger signal is sent to a post-task only if a certain trigger condition is met. That is, the target attribute of a task may define both a post-task of the task and the condition for sending a trigger signal to that post-task.
For example, in the above flow specification file, the target attribute of the monitoring file atomic task defines two post-tasks, merging and loading. The merging task has the trigger signal sending condition avg(file.sizes) < 10, which indicates that after the monitoring file task is executed, a trigger signal is sent to the merging task when the average file size is less than 10. The loading task has the trigger signal sending condition "otherwise", which indicates that after the monitoring file task is executed, a trigger signal is sent to the loading task when the average file size is greater than or equal to 10.
In some embodiments, target attributes for all tasks in a workflow may be extracted. Based on the corresponding relation between the names of the tasks and the target attributes, a tree diagram which embodies the execution sequence of all the tasks in the workflow can be generated. In the embodiment of the application, each subtask in the workflow is sent to the corresponding client when starting to execute. Therefore, the execution order of the tasks is the sending order of the tasks. In some embodiments, an adjacency list may be generated based on execution order information in the tree. Then, when needed, the pre-task and the post-task of any one task can be searched from the adjacency list.
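A minimal sketch of building such an adjacency list from the target attributes follows. The `targets` dictionary mirrors the three tasks of the example workflow; the function name and the choice of two plain dictionaries (one for post-tasks, one for pre-tasks) are assumptions of this sketch.

```python
from collections import defaultdict

# Target attribute of each task in the example workflow: task name -> post-tasks.
targets = {
    "monitoring file": ["merging", "loading"],
    "merging": ["loading"],
    "loading": [],
}


def build_adjacency(targets):
    """Return (post_tasks, pre_tasks) lookup tables derived from target attributes."""
    post = {name: list(ts) for name, ts in targets.items()}
    pre = defaultdict(list)
    for name, ts in targets.items():
        for t in ts:
            pre[t].append(name)  # 'name' must finish before 't' may start
    return post, dict(pre)


post_tasks, pre_tasks = build_adjacency(targets)
```

With both tables available, the pre-tasks and post-tasks of any task can be looked up directly, for example `pre_tasks.get("loading", [])`.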
Attribute four: configuration. Each task may define variables and functions in its own configuration attributes.
Variables in the configuration attributes of a compound task may be referenced by the task itself or by subtasks of the task.
For example, in the above flow specification file, the monitoring file task, the loading task, and the merging task are subtasks of the example workflow. The monitoring file task and the loading task both reference the variable workDir defined in the configuration attribute of the compound task example workflow.
The content in the configuration attributes of an atomic task may be referenced by the atomic task interpreter to which the atomic task corresponds. The atomic task interpreter corresponds to a piece of executable code that is executed, i.e., is equivalent to executing the atomic task. After referencing the configuration attributes of the atomic task, the atomic task interpreter parses the content in the configuration attributes into content that can be identified by the executable code.
Attribute five: trigger. The trigger attribute defines the trigger mode of the task. Two trigger modes can be configured. One is "any", which means that the task can be executed after receiving a trigger signal from any one of its pre-tasks; the other is "all", which means that the task can be executed only after receiving trigger signals from all of its pre-tasks. Here, of two tasks executed in sequence, the task executed first is the pre-task of the task executed later, and the task executed later is the post-task of the task executed first.
For example, in the above-mentioned flow specification file, the monitoring file task has two post tasks, which are a merge task and a load task; the merging task has a post-task, which is a loading task. Accordingly, the loading task has two pre-tasks, namely a monitoring file task and a merging task. If the trigger mode defined by the trigger attribute of the loading task is "any", the loading task can start to execute after receiving the trigger signal sent by any one of the monitoring file task and the merging task.
In order to execute the task processing method in the embodiment of the present application, the client needs to include corresponding functional modules. Fig. 3 is an example of a client in the embodiment of the present application. The client 300 includes a coordinator 301, a gatekeeper 302, a parser 303, a splitter 304, an executor 305, a trigger 306, a transmitter 307, and an interpreter 308, where:
a coordinator 301 for receiving and transmitting the tasks, and receiving and transmitting the execution results. Here, the client 300 realizes the uniform reception and the uniform transmission of the tasks through the coordinator 301. The task received by the coordinator 301 may be an atomic task or a composite task sent by the coordinator of another client in the cluster, may be a flow specification file input by a user from a task input interface, or may be an atomic task or a composite task sent from the gatekeeper 302 inside the client 300. The coordinator 301 may send tasks to the coordinators of other clients in the cluster, or may send tasks to the gatekeeper 302 inside the client 300. The tasks sent by the coordinator 301 may be atomic tasks or composite tasks.
In other words, task transceiving between the client 300 and other clients or task input interfaces in the cluster is performed by the coordinator 301, and the coordinator 301 is an external streaming interface of the task; the task transceiving between the modules in the client 300 is executed by the gatekeeper 302, and the gatekeeper 302 is an internal flow interface of the task; the transfer of tasks between the coordinator 301 and the gatekeeper 302 occurs when a task is sent to the client 300 from outside the client 300, or when a task is processed within the client 300 and needs to be sent to other clients in the cluster.
Before the user enters the flow specification file into the cluster from the task input interface, the client 300 has sent the atomic task types that the client 300 can handle to the coordinators of the other clients in the cluster through the coordinator 301. Correspondingly, the coordinators of other clients in the cluster also send the atomic task types that can be processed by the coordinators to the coordinator 301, and the coordinator 301 stores the first mapping relationship between the received atomic task type and the client that sends the atomic task type.
When the client 300 needs to send an atomic task to the outside, the gatekeeper 302 sends the atomic task to the coordinator 301, and the coordinator 301 sends the atomic task to the coordinator of the corresponding client according to the first mapping relationship. The type of an atomic task is recorded in its type attribute in the flow specification file; for example, the type of the atomic task of the monitoring file is file.
Correspondingly, the coordinators of the other clients in the cluster may send the atomic task to the coordinator 301 of the client 300 according to the first mapping relationship stored by the coordinators. The task received by the coordinator 301 from outside the client 300 may be an atomic task.
When the client 300 needs to send a compound task to the outside, the gatekeeper 302 sends the compound task to the coordinator 301, and the coordinator 301 can send the compound task to an idle client in the cluster, and the idle client splits and sends the compound task again. The coordinator of the client can know which clients are idle through information interaction with the coordinators of other clients.
Correspondingly, the coordinators of other clients in the cluster may also send compound tasks to the coordinator 301 when the client 300 is idle. The tasks received by the coordinator 301 from the coordinators of the other clients in the cluster may also be compound tasks.
When a user inputs a process specification file into a cluster from a task input interface, the client 300 may be selected to receive the process specification file, and a task received by the coordinator 301 from outside the client 300 is a workflow described in the process specification file.
In this embodiment of the present application, the coordinator 301 in the client 300 can determine, according to the first mapping relationship, to which client an atomic task is sent. The atomic tasks in a workflow can thus be processed on different clients, which implements distributed execution of tasks and reduces development difficulty.
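The exchange of atomic task types and the resulting first mapping relationship can be sketched as follows. The class and method names (`Coordinator`, `announce_to`, `record`, `route`) are hypothetical, and direct method calls stand in for whatever network messages a real cluster would use.

```python
from collections import defaultdict


class Coordinator:
    """Sketch of a coordinator that keeps the first mapping relationship."""

    def __init__(self, client_id, task_types):
        self.client_id = client_id
        self.task_types = set(task_types)
        # First mapping relationship: atomic task type -> clients that announced it.
        self.type_to_clients = defaultdict(set)

    def announce_to(self, others):
        """Tell the other coordinators which atomic task types this client handles."""
        for other in others:
            other.record(self.client_id, self.task_types)

    def record(self, sender_id, task_types):
        for task_type in task_types:
            self.type_to_clients[task_type].add(sender_id)

    def route(self, task_type):
        """Clients an atomic task of this type may be sent to (sorted for stability)."""
        return sorted(self.type_to_clients.get(task_type, ()))


a = Coordinator("A", ["local"])
b = Coordinator("B", ["load.hdfs", "local"])
c = Coordinator("C", ["load.hdfs"])
a.announce_to([b, c])
b.announce_to([a, c])
c.announce_to([a, b])

hdfs_clients = a.route("load.hdfs")
```

In this sketch a coordinator records only what other clients announce, so `a.route("local")` lists client B but not client A itself.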
The coordinator 301 is also used to create a gatekeeper 302 for each task received from outside the client 300 and to allocate a resource pool for that task.
The resource pool is used for storing a symbol table of the task, and the symbol table comprises variables, corresponding variable values and other information. The resource pool provides uniform resource access and registration addresses for the subtasks of the task, and on one hand, after receiving execution results of the subtasks of the task sent by other clients, the task can update variable values or other data in the resource pool according to the execution results of the subtasks; on the other hand, in this task, a subtask being executed on another client may refer to data in the symbol table to interpret information such as its own variable.
In addition, when the client 300 executes a task and obtains an execution result, the coordinator 301 may send the execution result to the sender of the task. This sender may be a task input interface or may be another client in the cluster.
When the client 300 sends a task to the outside, the coordinator 301 is configured to receive the execution result of the task and send the execution result to the gatekeeper 302.
A gatekeeper 302, configured to send a task to the coordinator 301, the parser 303, the splitter 304, or the executor 305 according to the type of the received task; further configured to send the task corresponding to an execution signal to the coordinator 301 according to the execution signal sent by the trigger 306; and further configured to send received execution results to the transmitter 307.
If the gatekeeper 302 receives the flow specification file from the coordinator 301, the gatekeeper 302 sends the flow specification file to the parser 303; if the gatekeeper 302 receives a compound task from the coordinator 301, the gatekeeper 302 sends the compound task to the splitter 304; if the gatekeeper 302 receives an atomic task from the coordinator 301, the gatekeeper 302 sends the atomic task to the executor 305.
If the gatekeeper 302 receives an atomic task or a non-splittable compound task from the parser 303, the gatekeeper 302 sends the atomic task or the non-splittable compound task to the coordinator 301.
If the gatekeeper 302 receives a splittable compound task from the parser 303, the gatekeeper 302 sends the compound task to the splitter 304.
If the gatekeeper 302 receives an atomic task or a compound task from the splitter 304, the gatekeeper 302 sends the atomic task or compound task to the coordinator 301.
If the gatekeeper 302 receives the execution signal from the trigger 306, a task corresponding to the execution signal is transmitted to the coordinator 301.
If the gatekeeper 302 receives the execution result from the coordinator 301, the execution result is transmitted to the corresponding transmitter 307.
In the embodiment of the present application, each task sent to the coordinator 301 from outside the client 300 is forwarded by the coordinator 301 to the gatekeeper 302, and the gatekeeper 302 is used for coordinating the task transmission inside the client 300.
If the task is a process specification file received from the coordinator 301, the process specification file needs to be parsed into a symbol table and an abstract syntax tree in the client 300 to facilitate passing between clients or between various functional modules of the clients, so the gatekeeper 302 sends the process specification file to the parser 303 for parsing.
If the task is a compound task received from the coordinator 301, the compound task needs to be split into next-level sub-tasks in the client 300, so the gatekeeper 302 sends the compound task to the splitter 304 for splitting.
If the task is an atomic task received from coordinator 301, then the atomic task needs to be executed in client 300, so gatekeeper 302 sends the atomic task to executor 305 for execution.
If the task is an atomic task or a compound task received from the parser 303 or the splitter 304, the task needs to be sent to other clients for execution, so the gatekeeper 302 sends the task to the coordinator 301, which sends it out.
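The forwarding rules of the gatekeeper described above amount to a small dispatch table, sketched below. The string labels for sources, task kinds, and destination modules are assumptions chosen for this illustration.

```python
def gatekeeper_route(source: str, kind: str) -> str:
    """Return the module to which the gatekeeper forwards a task.

    source: module the task came from ("coordinator", "parser", "splitter").
    kind:   "spec_file", "atomic", "compound", "splittable_compound",
            or "non_splittable_compound".
    """
    if source == "coordinator":
        # Tasks arriving from outside are parsed, split, or executed locally.
        return {"spec_file": "parser", "compound": "splitter", "atomic": "executor"}[kind]
    if source == "parser":
        # Only a splittable compound task stays on this client for splitting.
        return "splitter" if kind == "splittable_compound" else "coordinator"
    if source == "splitter":
        # Subtasks produced by a split are sent out through the coordinator.
        return "coordinator"
    raise ValueError(f"unexpected source module: {source}")


route_for_spec = gatekeeper_route("coordinator", "spec_file")
route_for_parsed_atomic = gatekeeper_route("parser", "atomic")
```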
The parser 303 is configured to convert the flow specification file into a task represented in the form of a symbol table and an abstract syntax tree, and to send the parsed task to the gatekeeper 302.
In the embodiment of the application, a task is represented in the form of an abstract syntax tree when it is sent between clients or among the functional modules within a client. In computer science, an abstract syntax tree (AST) is an abstract representation of the syntactic structure of source code. It represents the syntactic structure of the programming language in the form of a tree, where each node of the tree represents a construct in the source code.
The parser 303 may be a compiler that converts a flow specification file input by a user from a task input interface into a symbol table and an abstract syntax tree, and sends the task represented by the symbol table and the abstract syntax tree to the gatekeeper 302. The gatekeeper 302 can then send the task to the coordinator 301, and the coordinator 301 sends the task to the corresponding client according to the type of the task.
Generally, the parser will perform the task parsing operation only when the coordinator receives the flow specification file from the task input interface. The tasks received by the coordinator from the coordinators of other clients are represented in the form of a symbol table and an abstract syntax tree, and the parser does not need to parse the tasks.
The splitter 304 is configured to receive the compound task, split the compound task into at least one next-level subtask, and send the resulting subtasks to the gatekeeper 302.
For example, suppose a compound task is composed of two atomic tasks and one compound task. The gatekeeper 302 receives the compound task from the coordinator 301 and sends it to the splitter 304. The splitter 304 splits the compound task into the two atomic tasks and the one compound task and sends them to the gatekeeper 302.
The splitter 304 performs only one split operation on a compound task and then sends the resulting subtasks to the gatekeeper 302. If a compound task has five levels of nesting, then after the splitter of the first client performs one split, the most deeply nested of the resulting subtasks has only four levels of nesting and is sent to the second client. In this way, after the subsequent four splits are performed in sequence on different clients, the nesting of the compound task is completely removed, and all the atomic tasks constituting the compound task can be executed on the corresponding clients.
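The effect of splitting one level at a time can be checked with a small sketch. Tasks are modelled here as dictionaries in which a compound task carries a `tasks` list and an atomic task carries a `type`; this representation and the helper names are assumptions, and the loop runs in one process rather than across clients.

```python
def depth(task):
    """Nesting depth: 0 for an atomic task, 1 + deepest child for a compound task."""
    subs = task.get("tasks", [])
    return 0 if not subs else 1 + max(depth(s) for s in subs)


def split_once(task):
    """Perform a single split: return only the next-level subtasks."""
    return list(task.get("tasks", []))


# Build a compound task with five levels of nesting.
task = {"name": "leaf0", "type": "local"}
for level in range(1, 6):
    leaf = {"name": f"leaf{level}", "type": "local"}
    task = {"name": f"level{level}", "tasks": [task, leaf]}

after_first_split = split_once(task)

# Repeat single splits until only atomic tasks remain, counting the rounds.
rounds, frontier = 0, [task]
while any("tasks" in t for t in frontier):
    frontier = [s for t in frontier for s in (split_once(t) if "tasks" in t else [t])]
    rounds += 1
```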
The executor 305 is configured to execute the atomic task and send the execution result of the atomic task to the client where the parent task of the atomic task is located.
Generally, the abstract syntax tree corresponding to an atomic task includes variables, constants, and the like that need to be resolved to values, and may also include built-in functions. The executor 305 receives the abstract syntax tree and the symbol table of the atomic task and then recursively traverses the syntax tree. It first sends the symbol table and the abstract syntax tree to the general interpreter, which interprets the variables and constants to obtain their values; it then sends them to the built-in function interpreter, which interprets the built-in functions; finally it sends them to the atomic task interpreter, which interprets the whole atomic task. Through such recursive interpretation, the atomic task is converted in the atomic task interpreter into content that the executable code can recognize.
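The recursive interpretation of variables, constants, and built-in functions can be illustrated with a toy evaluator. Note that this sketch collapses the separate general interpreter and built-in function interpreter into one recursive function, and that the tuple-based node format, the `BUILTINS` table, and the sample values are assumptions made here.

```python
# Built-in functions available to expressions (hypothetical set).
BUILTINS = {"sum": sum, "avg": lambda xs: sum(xs) / len(xs)}


def evaluate(node, symbols):
    """Recursively resolve an expression tree against a symbol table."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "var":
        return symbols[node[1]]          # variable value from the symbol table
    if kind == "call":
        args = [evaluate(a, symbols) for a in node[2]]
        return BUILTINS[node[1]](*args)  # built-in function applied to resolved args
    if kind == "div":
        return evaluate(node[1], symbols) / evaluate(node[2], symbols)
    raise ValueError(f"unknown node kind: {kind}")


# An expression shaped like "sum(sizes) / 1024", with made-up file sizes.
symbols = {"sizes": [2048, 1024, 1024]}
mappers_ast = ("div", ("call", "sum", [("var", "sizes")]), ("const", 1024))
mappers = evaluate(mappers_ast, symbols)
```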
After the atomic task is completed, the executor 305 sends the execution result of the atomic task to the gatekeeper 302.
The trigger 306 is configured to switch to the post-trigger state after receiving trigger signals that satisfy its condition, and to send an execution signal to the gatekeeper 302.
Here, the trigger 306 may be either an "any" trigger or an "all" trigger. FIG. 4 shows the triggering process of the "any" trigger, and FIG. 5 shows the triggering process of the "all" trigger.
As shown in FIG. 4, an "any" trigger has two states: pre-trigger and post-trigger. The pre-trigger state is the initial state of the "any" trigger. After receiving a trigger signal from any pre-task, the "any" trigger enters the post-trigger state and sends an execution signal to the gatekeeper 302, so that the gatekeeper 302 starts processing the task corresponding to the "any" trigger; all subsequent trigger signals are ignored.
For example, in the flow specification file, if the trigger mode defined by the loading task is "any", the trigger corresponding to the task is an "any" trigger. As soon as this trigger receives a trigger signal sent by any one of the pre-tasks of the loading task (the monitoring file task and the merging task), it switches from the pre-trigger state to the post-trigger state. After the "any" trigger switches to the post-trigger state, it sends an execution signal to the gatekeeper 302, causing the gatekeeper 302 to begin processing the loading task.
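A minimal Python sketch of the "any" trigger behavior is given below. The class name `AnyTrigger`, the `receive` method, and the callback standing in for the execution signal sent to the gatekeeper are illustrative assumptions, not names from the application.

```python
from typing import Callable


class AnyTrigger:
    """Fires on the first trigger signal and ignores all later ones."""

    def __init__(self, on_execute: Callable[[str], None]) -> None:
        self.triggered = False          # False = pre-trigger state
        self._on_execute = on_execute   # stands in for the execution signal

    def receive(self, pre_task_id: str) -> bool:
        """Handle a trigger signal; return True only if it fired the trigger."""
        if self.triggered:
            return False                # post-trigger state: signal is ignored
        self.triggered = True
        self._on_execute(pre_task_id)
        return True
```

With a loading task whose pre-tasks are the monitoring file task and the merging task, the first of the two signals fires the trigger and the second has no effect.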
As shown in FIG. 5, the "all" trigger includes a trigger signal storage list. The number of entries in the list equals the number of pre-tasks of the task corresponding to the "all" trigger, and the list stores the IDs of all pre-tasks of the task (ID0, ID1, ..., IDn) together with their trigger signals (ST0, ST1, ..., STn). When the "all" trigger is in its initial state, the list is empty. Each time the "all" trigger receives the trigger signal of a pre-task, it stores the trigger signal in the list. When no empty entry remains in the list, the "all" trigger switches from the pre-trigger state to the post-trigger state and sends an execution signal to the gatekeeper 302, so that the gatekeeper 302 starts processing the task corresponding to the "all" trigger.
In some embodiments, the trigger signal may be the execution result of a pre-task. After the trigger signal storage list has stored the IDs and execution results ST of all the pre-tasks, the "all" trigger combines the execution results of all the pre-tasks with the symbol table of the current scope to generate an execution signal, and sends the execution signal to the gatekeeper 302.
For example, if the trigger mode defined by the trigger attribute of the loading task in the flow specification file is "all", the trigger corresponding to the task is an "all" trigger. The trigger includes a trigger signal storage list with 2 entries. After receiving the trigger signal of the monitoring file task, the trigger stores it in the list. At this point the list has 1 empty entry, and the trigger continues to wait. After receiving the trigger signal of the merging task, the trigger stores it in the list. The list now has 0 empty entries, so the trigger switches from the pre-trigger state to the post-trigger state and sends an execution signal to the gatekeeper 302.
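The "all" trigger and its trigger signal storage list can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name `AllTrigger`, the sentinel `_EMPTY`, and the callback used as the execution signal are hypothetical, and the combination with the symbol table of the current scope is omitted.

```python
from typing import Callable, Dict, Iterable

_EMPTY = object()  # marks an entry whose trigger signal has not arrived yet


class AllTrigger:
    """Fires only after every pre-task has delivered its trigger signal."""

    def __init__(self, pre_task_ids: Iterable[str],
                 on_execute: Callable[[Dict[str, object]], None]) -> None:
        # One entry per pre-task, keyed by the pre-task ID.
        self.signals: Dict[str, object] = {tid: _EMPTY for tid in pre_task_ids}
        self.triggered = False
        self._on_execute = on_execute

    def empty_entries(self) -> int:
        return sum(1 for value in self.signals.values() if value is _EMPTY)

    def receive(self, pre_task_id: str, signal: object) -> bool:
        """Store a signal; return True only if this signal fired the trigger."""
        if self.triggered:
            return False
        if pre_task_id not in self.signals:
            raise KeyError(f"{pre_task_id!r} is not a pre-task of this task")
        self.signals[pre_task_id] = signal
        if self.empty_entries() > 0:
            return False                      # keep waiting for other pre-tasks
        self.triggered = True
        self._on_execute(dict(self.signals))  # execution signal with all results
        return True
```

For the loading task with two pre-tasks, the list has one empty entry after the first signal and fires after the second, as in the example above.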
The transmitter 307 is configured to receive the execution result of the task and to determine, according to the execution result and the trigger signal sending condition, for which post task a trigger signal is generated. Here, the user may define an assertion function as the trigger signal sending condition. The transmitter 307 sends the execution result and the trigger signal sending condition to the interpreter 308, and the interpreter 308 interprets them to obtain a judgment result. According to the judgment result, the transmitter 307 determines for which post task a trigger signal is generated and sends the trigger signal to the trigger corresponding to that post task. If the currently executed task has only one post task, the transmitter 307 directly generates a trigger signal for that post task. If the currently executed task has no post task, the transmitter 307 does not generate a trigger signal.
For example, in the above-mentioned flow specification file, the target attribute of the monitoring file task defines the trigger signal sending condition for its post tasks: avg (file sizes) <10, which is an assertion function. After the monitoring file task is executed, the transmitter of the monitoring file task obtains the execution result and calls the interpreter to interpret avg (file sizes) <10, thereby obtaining a judgment result. If the judgment result is true, the transmitter generates a trigger signal for the merging post task; if the judgment result is false, the transmitter generates a trigger signal for the loading post task.
In some embodiments, the transmitter is connected to the triggers of the post tasks and sends trigger signals to them. For example, in the above-mentioned flow specification file, the transmitter of the monitoring file task is connected to the trigger of the merging task and to the trigger of the loading task, and sends a trigger signal to one of these triggers according to the judgment result.
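The branching decision of the transmitter can be sketched in Python as below. The function `choose_post_task`, its parameters, and the sample file sizes are hypothetical; the sketch assumes a two-way branch as in the monitoring file example and evaluates the assertion function directly instead of going through an interpreter.

```python
from typing import Callable, Optional, Sequence


def avg(values: Sequence[float]) -> float:
    """Built-in function used by the example sending condition."""
    return sum(values) / len(values)


def choose_post_task(result: object,
                     post_tasks: Sequence[str],
                     condition: Optional[Callable[[object], bool]] = None,
                     if_true: Optional[str] = None,
                     if_false: Optional[str] = None) -> Optional[str]:
    """Return the post task that should receive a trigger signal, if any."""
    if not post_tasks:
        return None                 # no post task: no trigger signal
    if len(post_tasks) == 1:
        return post_tasks[0]        # only one post task: signal it directly
    if condition is None:
        raise ValueError("a sending condition is required to pick a branch")
    # Several post tasks: the assertion function selects the branch.
    return if_true if condition(result) else if_false
```

For instance, with the condition `lambda sizes: avg(sizes) < 10`, an execution result of `[4, 8]` selects the merging task, while `[20, 30]` selects the loading task.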
The interpreter 308 is configured to interpret the abstract syntax tree and the symbol table into content recognizable by the executable code, or to convert an execution result of the executable code into an abstract syntax tree.
Here, the interpreter may be a compiler for extracting information from an abstract syntax tree and converting the extracted information into an input of an executable code. The general interpreter is used for interpreting variable references, constants and the like into corresponding values, the built-in function interpreter is used for interpreting built-in functions into contents which can be recognized by executable codes, and the atomic task interpreter is used for interpreting an abstract syntax tree corresponding to an atomic task into contents which can be recognized by the executable codes and converting an execution result of the executable codes into the abstract syntax tree.
In the embodiment of the application, each client in the cluster is allocated the atomic tasks that it needs to execute, and the executable code and interpreter required by a given atomic task need to be loaded only on a few fixed clients. Directed dispatching of tasks is thus achieved through data transmission among the clients, which avoids the resource waste caused by loading the same executable code and interpreter indiscriminately on a large number of clients and improves the utilization efficiency of the clients.
The task processing method provided by the embodiment of the application comprises the following steps:
S701, each client in the cluster registers itself with the other clients in the cluster.
The method comprises the following substeps:
S701a, each client in the cluster broadcasts its own client name and client address.
S701b, each client in the cluster loads an interpreter which is installed respectively, and broadcasts the type of the interpreter; the interpreter comprises an atomic task interpreter and a built-in function interpreter, and the type of the interpreter is the type of an atomic task or a built-in function corresponding to the interpreter.
S701c, other clients in the cluster receive the broadcast message, and store a first mapping relation between the client name and the interpreter type and a second mapping relation between the client name and the client address.
Here, the atomic task interpreter is the module in the client that processes an atomic task, and the built-in function interpreter is the module in the client that processes a built-in function. For example, in the above flow specification file, monitoring the file, merging, and loading are all atomic tasks; in the configuration attribute of the loading task, sum () is a built-in function for determining the total number of files. As shown in FIG. 6, client C installs the interpreters corresponding to load.hdfs and function.avg, so client C can process the loading atomic task and the built-in function avg (), and the types of the interpreters loaded by client C are "load.hdfs" and "function.avg". Client A also loads interpreter types of "load.
FIG. 6 also shows the flow in which the clients broadcast their atomic task interpreter types and built-in function interpreter types to one another. Client C can process the loading atomic task and the built-in function avg (). Therefore, client C broadcasts the type identifier "load.hdfs" of the loading task and the type identifier "function.avg" of the avg () function, and client A and client B each store two first mapping relationships related to client C: the mapping between the name of client C and the interpreter type "load.hdfs", and the mapping between the name of client C and the interpreter type "function.avg". Client B can process the two atomic tasks of monitoring the file and merging, so client B broadcasts the type identifications of the monitoring file and merging tasks "file. Client A can process the loading atomic task and the built-in function avg (). Therefore, client A broadcasts the type identifier "load.hdfs" of the loading task and the type identifier "function.avg" of the avg () function, and client C and client B each store two first mapping relationships related to client A.
In some embodiments, client C stores the first mapping in the coordinator, as shown in FIG. 6. The coordinator is a module in the client for maintaining the first mapping relationship.
In some embodiments, the client broadcasts only the type of atomic task interpreter. In order to interpret an atomic task, a corresponding atomic task interpreter needs to be loaded in the client, and if the atomic task includes a built-in function, a corresponding built-in function interpreter also needs to be loaded in the client. It can be seen that if an atomic task interpreter is included in a client, a built-in function interpreter required for interpreting the atomic task is also included in general. Therefore, the client may also broadcast only the type of the atomic task interpreter when broadcasting the interpreter type.
In addition, each client in the cluster also broadcasts its own client name and client address to other clients in the cluster, thereby registering its own client name and client address with the other clients in the cluster. In this way, each client in the cluster has stored therein a second mapping relationship between names and addresses of other clients.
In the embodiment of the present application, the second mapping relationship is stored in the coordinator. For example, the coordinator of client C also stores the second mapping relationship between the name of client B and the address of client B. When client C needs to send the monitoring file atomic task to another client, the coordinator of client C first determines, according to the first mapping relationship between the interpreter type of the monitoring file task and the name of client B, that the task needs to be sent to client B. After client B is selected to receive the monitoring file task, the coordinator of client C determines the sending address according to the second mapping relationship between the name of client B and the address of client B, and sends the monitoring file atomic task to client B.
In the embodiment of the application, each client in the cluster stores the second mapping relationship between the names and addresses of the other clients, and also stores the first mapping relationship between the names of the other clients and the atomic tasks and built-in functions that they can process. Each client can therefore dispatch tasks to other clients according to the stored information, which realizes distributed processing of tasks and improves task processing efficiency.
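The two mapping relationships maintained by a coordinator can be sketched in Python as follows. The class `ClientRegistry`, its method names, and the addresses used in the usage note are hypothetical; only the interpreter types "load.hdfs" and "function.avg" of clients A and C come from the description.

```python
from typing import Dict, Iterable, List, Set


class ClientRegistry:
    """Stores what one client learns from the broadcasts of the others."""

    def __init__(self) -> None:
        # First mapping: client name -> interpreter types it has loaded.
        self.first_mapping: Dict[str, Set[str]] = {}
        # Second mapping: client name -> client address.
        self.second_mapping: Dict[str, str] = {}

    def on_broadcast(self, name: str, address: str,
                     interpreter_types: Iterable[str]) -> None:
        """Record one registration broadcast received from another client."""
        self.second_mapping[name] = address
        self.first_mapping.setdefault(name, set()).update(interpreter_types)

    def clients_for(self, interpreter_type: str) -> List[str]:
        """Names of the clients able to process the given task or function."""
        return sorted(name for name, types in self.first_mapping.items()
                      if interpreter_type in types)

    def address_of(self, name: str) -> str:
        return self.second_mapping[name]
```

After `on_broadcast("C", "10.0.0.3:9000", ["load.hdfs", "function.avg"])`, a lookup with `clients_for("load.hdfs")` returns the name of client C, and `address_of("C")` returns the address to which the task is then sent.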
S702, after receiving the first task, the first coordinator 704 of the first client 701 creates a first gatekeeper 702 corresponding to the first task, and sends the first task to the first gatekeeper 702.
As shown in FIG. 7, after the first coordinator 704 of the first client 701 receives the first task from the task input interface, it creates a first gatekeeper 702 corresponding to the first task. Here, the first task is a flow specification file.
S703, the first coordinator 704 allocates a first resource pool for the first task.
Here, the first coordinator 704 determines the size of the first resource pool from an empirical value based on the size of the flow specification file, and allocates the first resource pool to the first task. The first resource pool is used for storing the symbol table of the first task. In this way, all the subtasks of the first task can read the values of the constants and variables in the symbol table, and the first task can update these values according to the execution result of each subtask.
S704, the first gatekeeper 702 sends the first task to the first parser 703, the first parser 703 parses the first task into a symbol table and an abstract syntax tree, and the first parser 703 sends the symbol table and the abstract syntax tree corresponding to the first task to the first gatekeeper 702.
Generally, tasks are sent between clients in the form of abstract syntax trees. Therefore, in the process of processing a workflow, only the client receiving the flow specification file corresponding to the workflow needs to call the parser to parse the flow specification file into the abstract syntax tree.
S705, the first gatekeeper 702 sends the symbol table and the abstract syntax tree of the first task to the first coordinator 704 or the first disassembler 705 according to the type of the first task.
The first gatekeeper 702 determines the type of the first task from the abstract syntax tree of the first task. If the first task is an unsplittable compound task or an atomic task, the first gatekeeper 702 sends the symbol table and the abstract syntax tree of the first task to the first coordinator 704; if the first task is a splittable compound task, the first gatekeeper 702 sends the symbol table and the abstract syntax tree of the first task to the first disassembler 705.
In a workflow, if the subtasks of a compound task depend very heavily on the execution results of their pre-tasks, the subtasks of the compound task need to be executed on the same client. In this case, the compound task is an unsplittable compound task.
S706, the first disassembler 705 splits the first task into next-level primary subtasks, namely the first primary subtask, ..., the ith primary subtask, ..., and the kth primary subtask, and sends the k primary subtasks to the first gatekeeper 702, where i and k are positive integers, 1 ≤ i ≤ k, and k ≥ 1.
Here, the first disassembler 705 is configured to split the compound task into k next-level primary subtasks and then send the k primary subtasks to the first gatekeeper 702. In the first client 701, the first task is split only once. If the k primary subtasks obtained by splitting include a compound task, the first coordinator 704 of the first client 701 sends that compound task to the next client, and the second split of the compound task is performed in the disassembler of the next client. In this way, through layer-by-layer splitting by the disassemblers of multiple clients, a compound task can be split into a nested combination of multiple atomic tasks, so that the atomic tasks are executed by the corresponding atomic task interpreters.
In the embodiment of the application, each client splits a compound task only into its next-level subtasks, so the tracking path between a parent task and its subtasks is more concise and clear. Meanwhile, the multi-layer splitting of the compound task is distributed across the disassemblers of different clients according to the nesting level of the compound task, which reduces the operation pressure on each client and increases the task processing speed.
S707, the first gatekeeper 702 creates an ith primary trigger 706 and an ith primary transmitter 707 corresponding to the ith primary subtask according to the in-degree and out-degree of the ith primary subtask.
If the in-degree of the ith primary sub-task is 0 and the out-degree is not 0, the ith primary trigger 706 is a special trigger and the ith primary transmitter 707 is a normal transmitter.
If the out-degree of the ith primary sub-task is 0 and the in-degree is not 0, the ith primary trigger 706 is a normal trigger and the ith primary transmitter 707 is a special transmitter.
If the in-degree and out-degree of the ith primary sub-task are not 0, the ith primary trigger 706 is a normal trigger, and the ith primary transmitter 707 is a normal transmitter.
If the in-degree and out-degree of the ith primary sub-task are both 0, the ith primary trigger 706 is a special trigger and the ith primary transmitter 707 is a special transmitter.
Here, the in-degree of one task indicates the number of tasks preceding the task, and the out-degree of one task indicates the number of tasks succeeding the task. The first gatekeeper 702 can obtain the out-degree and in-degree values of a task by counting the information of the pre-task and the post-task of the task in the adjacency list. In other embodiments, the out-degree and the in-degree of the task may also be directly counted and stored when the adjacency list is generated according to the target attribute of the task, and the first gatekeeper 702 may directly read the counted out-degree and in-degree values of the task.
The in-degree of the task corresponding to a special trigger is 0, and the special trigger receives a trigger signal sent by the gatekeeper; the in-degree of the task corresponding to a normal trigger is greater than 0, and the normal trigger receives trigger signals sent by the normal transmitters of its pre-tasks.
The out-degree of the task corresponding to a special transmitter is 0, and the special transmitter does not send a trigger signal; the out-degree of the task corresponding to a normal transmitter is greater than 0, and the normal transmitter sends a trigger signal to the normal trigger of a post task when the trigger signal sending condition is met.
When creating the k primary triggers and the k primary transmitters, the first gatekeeper 702 also connects the (i-1)th primary transmitter of the (i-1)th primary subtask to the ith primary trigger 706 of the ith primary subtask, so that the (i-1)th primary transmitter can send a trigger signal to the ith primary trigger 706 when the trigger signal sending condition is met. Here, the (i-1)th primary subtask is a pre-task of the ith primary subtask.
For example, in the above-mentioned flow specification file, the target attribute of the monitoring file task defines two tasks, merging and loading; the target attribute of the merging task defines the loading task; and the loading task does not define a target attribute. Therefore, in the workflow, the post tasks of the monitoring file task are the merging task and the loading task, and its out-degree is 2; the post task of the merging task is the loading task, and its out-degree is 1; the loading task has no post task, and its out-degree is 0. Correspondingly, the in-degree of the monitoring file task is 0; the pre-task of the merging task is the monitoring file task, and its in-degree is 1; the pre-tasks of the loading task are the monitoring file task and the merging task, and its in-degree is 2.
According to the out-degree and in-degree values, the gatekeeper creates a special trigger and a normal transmitter for the monitoring file task, a normal trigger and a special transmitter for the loading task, and a normal trigger and a normal transmitter for the merging task. The normal transmitter of the monitoring file task is connected to the normal trigger of the merging task and to the normal trigger of the loading task, and the normal transmitter of the merging task is connected to the normal trigger of the loading task.
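The selection of trigger and transmitter kinds from the in-degree and out-degree can be sketched in Python as follows. The adjacency list layout and the names `ADJACENCY`, `degrees`, and `component_kinds` are hypothetical; the task relationships are those of the monitoring file, merging, and loading example.

```python
from typing import Dict, List, Tuple

# Adjacency list derived from the target attributes: task -> post tasks.
ADJACENCY: Dict[str, List[str]] = {
    "monitor_file": ["merge", "load"],
    "merge": ["load"],
    "load": [],
}


def degrees(adjacency: Dict[str, List[str]]) -> Dict[str, Tuple[int, int]]:
    """Return (in-degree, out-degree) for every task in the adjacency list."""
    in_degree = {name: 0 for name in adjacency}
    for posts in adjacency.values():
        for post in posts:
            in_degree[post] += 1
    return {name: (in_degree[name], len(posts))
            for name, posts in adjacency.items()}


def component_kinds(in_degree: int, out_degree: int) -> Tuple[str, str]:
    """Return (trigger kind, transmitter kind) for one subtask."""
    trigger = "special" if in_degree == 0 else "normal"
    transmitter = "special" if out_degree == 0 else "normal"
    return trigger, transmitter
```

Applied to `ADJACENCY`, the sketch yields a special trigger with a normal transmitter for the monitoring file task, normal components for the merging task, and a normal trigger with a special transmitter for the loading task.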
S708, after the ith primary trigger 706 is triggered, an ith execution signal for starting the execution of the ith primary sub-task is sent to the first gatekeeper 702.
Here, if the ith primary trigger 706 is a special trigger, the first gatekeeper 702 sends the trigger signal directly to the ith primary trigger 706 after creating it. In a compound task, the subtasks are executed in a certain order, and a subtask without a pre-task is a head subtask. The first gatekeeper 702 first starts the execution of all the head subtasks in the compound task. After a head subtask is executed, a trigger signal is sent to the trigger of its post task, so that the post task of the head subtask can be executed.
If the ith primary trigger 706 is a normal trigger, it receives the trigger signals of the pre-tasks of the ith primary subtask. The ith primary trigger 706 may be either an "any" trigger or an "all" trigger. After the ith primary trigger 706 receives a trigger signal, if it determines that the trigger signal is sufficient for it to switch from the pre-trigger state to the post-trigger state, the ith primary trigger 706 is triggered and sends an ith execution signal to the first gatekeeper 702.
For example, in the above-mentioned flow specification file, after the monitoring file task is completed, the normal transmitter corresponding to that task sends a trigger signal to the normal trigger corresponding to the merging task. The merging task has only one pre-task, so the trigger signal sent by the monitoring file task is sufficient to trigger the normal trigger of the merging task, which then sends an execution signal to the gatekeeper.
S709, after receiving the ith execution signal sent by the ith primary trigger 706, the first gatekeeper 702 sends the ith primary subtask corresponding to the ith execution signal to the first coordinator 704.
Here, sending a task to the coordinator means that the task is sent to the corresponding client for execution or splitting, so as to start the execution process of the task.
S710, the first coordinator 704 sends the symbol table and the abstract syntax tree corresponding to the first task or the ith primary sub-task to the corresponding client according to the task sending rule.
Here, if the first task is an unsplittable compound task or an atomic task, the task sent out by the first coordinator 704 is the first task itself; if the first task is a splittable compound task, the task sent out by the first coordinator 704 is the ith primary subtask obtained by splitting the first task.
The task sending rule specifies how the first coordinator 704 sends a task to the corresponding client, and includes the following rules:
If the task is an atomic task, the first coordinator 704 first determines, according to the routing attribute of the atomic task, whether the atomic task has a designated client.
If so, the first coordinator 704 queries the saved first mapping relationship, according to the type identification of the atomic task, to check whether the designated client can execute the atomic task. If it can, the first coordinator 704 sends the atomic task to the designated client; if not, the first coordinator 704 notifies the first gatekeeper 702 that the task sending has failed.
If no client is designated, the first coordinator 704 queries the saved first mapping relationship, according to the type identification of the atomic task, for the clients that can execute the atomic task, and sends the atomic task to one of those clients that is idle. If none of those clients is idle, the first coordinator 704 notifies the first gatekeeper 702 that the task sending has failed.
If the task is an unsplittable compound task, the first coordinator 704 first queries the saved first mapping relationship, according to the type identifications of the atomic tasks included in the compound task, to check whether there is a client that can execute all the subtasks of the compound task. If so, the first coordinator 704 sends the task to that client; if not, the first coordinator 704 notifies the first gatekeeper 702 that the task sending has failed.
If the task is a splittable compound task, the first coordinator 704 sends the task to an idle client in the cluster.
If the routing attribute of the task defines the routing mode as the following policy, the first coordinator 704 first queries which client the pre-task of the task was sent to, and then checks whether that client can execute the current task. If so, the first coordinator 704 sends the task to that client; if not, the first coordinator 704 notifies the first gatekeeper 702 that the task sending has failed.
Here, the following policy differs from an unsplittable compound task in that all the subtasks of an unsplittable compound task need to be bound to the same client, whereas under the following policy only the current task and its pre-task need to be bound to the same client.
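The task sending rules above can be summarized in a minimal Python sketch. Several points are assumptions rather than statements of the application: the `RoutedTask` class and its fields, checking the following policy before the other rules, and breaking ties among candidate clients alphabetically. A return value of `None` stands for notifying the gatekeeper that the task sending has failed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class RoutedTask:
    kind: str                          # "atomic", "unsplittable" or "splittable"
    types: List[str]                   # interpreter types the task requires
    designated: Optional[str] = None   # client named in the routing attribute
    follow_client: Optional[str] = None  # client that received the pre-task


def choose_client(task: RoutedTask,
                  first_mapping: Dict[str, Set[str]],
                  idle: Set[str]) -> Optional[str]:
    """Return the target client name, or None if the sending fails."""

    def can_run(client: str) -> bool:
        return set(task.types) <= first_mapping.get(client, set())

    if task.follow_client is not None:           # following policy
        return task.follow_client if can_run(task.follow_client) else None
    if task.kind == "atomic":
        if task.designated is not None:          # designated client
            return task.designated if can_run(task.designated) else None
        candidates = sorted(c for c in first_mapping
                            if can_run(c) and c in idle)
    elif task.kind == "unsplittable":            # one client must run it all
        candidates = sorted(c for c in first_mapping if can_run(c))
    elif task.kind == "splittable":              # any idle client may split it
        candidates = sorted(idle)
    else:
        raise ValueError(f"unknown task kind: {task.kind!r}")
    return candidates[0] if candidates else None
```

For example, with clients A and C both registered for "load.hdfs" and only C idle, an undesignated loading atomic task is routed to C, while the same task designated for A is routed to A.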
S711, the second coordinator 709 of the second client 708 receives the ith primary subtask.
S712, the second coordinator 709 of the second client 708 creates a second watcher 710 corresponding to the ith primary subtask, and sends the ith primary subtask to the second watcher 710.
S713, the second coordinator 709 allocates a second resource pool for the ith primary subtask.
Here, the operations performed by the second client 708 after receiving the ith primary subtask are substantially the same as those performed by the first client 701 after receiving the first task. The difference is that the second client 708 receives the symbol table and the abstract syntax tree directly, so it does not need to invoke a parser to parse the task into a symbol table and an abstract syntax tree.
In the embodiment of the application, the tasks are sent between the clients in the form of the abstract syntax tree, so that the clients receiving the tasks do not need to perform secondary analysis on the tasks, and the execution efficiency of the tasks is improved.
S714, the second gatekeeper 710 sends the ith primary subtask to the second executor 711 or the second disassembler 712 according to the type of the ith primary subtask.
Here, if the ith primary subtask is an atomic task, the fact that the first coordinator 704 of the first client 701 sent it to the second client 708 means that the second client 708 has an atomic task interpreter corresponding to the atomic task. In this case, the second gatekeeper 710 does not send the atomic task to the second coordinator 709 for secondary dispatch, but sends it to the second executor 711 for execution.
If the ith primary subtask is a compound task, the second gatekeeper 710 sends the ith primary subtask to a second disassembler 712 for splitting.
S715, the second executor 711 executes the ith primary sub-task to obtain an execution result.
Here, the ith primary subtask sent to the second executor 711 is an atomic task and is executed by the second executor 711. As shown in FIG. 8, after the second executor 711 receives the symbol table and the abstract syntax tree of the ith primary subtask, it recursively traverses the syntax tree and first calls the second general interpreter 713 to interpret the variable references and constants. It then calls the built-in function interpreter 714 to interpret the built-in functions based on the interpretation results of the variable references and constants. Finally, it calls the atomic task interpreter 715 to interpret the atomic task as a whole based on the interpretation results of the variable references, the constants, and the built-in functions. Here, the built-in function interpreter 714 is the interpreter corresponding to a built-in function included in the abstract syntax tree of the atomic task, and the second executor 711 calls the corresponding built-in function interpreter according to the type of the built-in function included in the atomic task. The atomic task interpreter 715 is the interpreter corresponding to the ith primary subtask.
Through this recursive interpretation, the atomic task interpreter 715 converts the atomic task into content that the executable code can recognize. The executable code executes the atomic task, and after the execution result is obtained, the atomic task interpreter 715 converts the execution result into an abstract syntax tree and sends the abstract syntax tree to the second executor 711.
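The recursive interpretation order (variable references and constants first, then built-in functions, then the atomic task as a whole) can be sketched in Python as below. The tuple-based node layout, the function `interpret`, and the parameter names in the usage note are hypothetical; the application does not specify the concrete form of the abstract syntax tree.

```python
from typing import Any, Callable, Dict


def interpret(node: tuple,
              symbols: Dict[str, Any],
              builtins: Dict[str, Callable[..., Any]],
              task_interpreters: Dict[str, Callable[[Dict[str, Any]], Any]]) -> Any:
    """Recursively evaluate a hypothetical tuple-based abstract syntax tree."""
    kind = node[0]
    if kind == "const":                 # general interpreter: constant
        return node[1]
    if kind == "var":                   # general interpreter: variable reference
        return symbols[node[1]]
    if kind == "call":                  # built-in function interpreter
        args = [interpret(arg, symbols, builtins, task_interpreters)
                for arg in node[2]]
        return builtins[node[1]](*args)
    if kind == "task":                  # atomic task interpreter
        params = {key: interpret(value, symbols, builtins, task_interpreters)
                  for key, value in node[2].items()}
        return task_interpreters[node[1]](params)
    raise ValueError(f"unknown node kind: {kind!r}")
```

As a usage example, a node such as `("task", "load.hdfs", {"mean_size": ("call", "avg", [("var", "file_sizes")]), "limit": ("const", 10)})` is reduced bottom-up: the variable and constant are resolved from the symbol table, the built-in avg is applied, and the resolved parameters are finally handed to the interpreter registered for "load.hdfs".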
S716, the second executor 711 sends the execution result of the ith primary sub-task to the first coordinator 704 through the second gatekeeper 710 and the second coordinator 709, and the first coordinator 704 forwards the execution result of the ith primary sub-task to the first gatekeeper 702.
S717, the second gatekeeper 710 notifies the second coordinator 709 to release the second resource pool.
Here, after the second executor 711 finishes executing the ith primary subtask, it sends the execution result to the second gatekeeper 710. The second gatekeeper 710 sends the execution result to the first coordinator 704 of the first client 701 through the second coordinator 709, and also notifies the second coordinator 709 to release the second resource pool of the ith primary subtask. After the above steps, the execution of the ith primary subtask is complete, and the first client 701 has obtained the execution result of the ith primary subtask.
S718, the first gatekeeper 702 sends the execution result to the ith primary transmitter 707 corresponding to the ith primary subtask, and updates the symbol table of the current scope according to the execution result of the ith primary subtask.
S719, the ith primary transmitter 707 determines, according to the execution result and the trigger signal sending condition, whether to send a trigger signal; if so, the process goes to S720; if not, the process goes to S722.
S720, the ith primary transmitter 707 sends a trigger signal to the (i+1)th primary trigger, where the (i+1)th primary subtask is a post task of the ith primary subtask.
S721, the (i+1)th primary trigger receives the trigger signal of the ith primary transmitter 707 and determines whether to switch from the pre-trigger state to the post-trigger state.
Here, if the ith primary subtask has more than one post task, the ith primary transmitter determines which post task the trigger signal is to be sent to, based on the judgment result obtained by evaluating the trigger signal sending condition against the execution result. If the judgment result indicates that a trigger signal needs to be sent to the (i+1)th primary subtask, the ith primary transmitter 707 sends a trigger signal to the (i+1)th primary trigger.
If the ith primary subtask has only one post task, namely the (i+1)th primary subtask, the ith primary transmitter 707 sends a trigger signal directly to the (i+1)th primary trigger.
In the embodiment of the application, setting the trigger signal sending condition of a transmitter lets the user specify, when a task has multiple post tasks, which post task to proceed to after the task is executed. This provides a user-defined branching function, so that a cluster can process workflows with complicated flow branches, and the task processing capacity is improved.
In addition, compared with querying a database to learn whether a task has finished executing, in the embodiment of the application the execution result of a finished task is sent to its transmitter, and the transmitter triggers the triggers of the post tasks. The database therefore does not need to be updated and queried frequently, which reduces the number of instruction transmissions and data operations and improves the processing speed of the workflow.
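The branching behaviour described above can be illustrated with a minimal sketch. It assumes that a trigger signal sending condition can be modelled as a predicate over the execution result; the class name `Transmitter`, the `targets` method, and the `exit_code` field are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical model: a sending condition is a predicate over the result.
Condition = Callable[[Any], bool]


@dataclass
class Transmitter:
    """Decides which post tasks receive a trigger signal."""

    # Maps a post-task name to its trigger signal sending condition.
    conditions: Dict[str, Condition] = field(default_factory=dict)

    def targets(self, execution_result: Any) -> List[str]:
        """Return the post tasks whose condition the result satisfies."""
        return [task for task, cond in self.conditions.items()
                if cond(execution_result)]


# A task with two post tasks: branch on an assumed exit code in the result.
transmitter = Transmitter({
    "on_success": lambda result: result["exit_code"] == 0,
    "on_failure": lambda result: result["exit_code"] != 0,
})
```

In this sketch a transmitter with no conditions returns an empty list, which corresponds loosely to the case in which the trigger signal is intercepted because the task has no post task.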
In addition, the first watcher 702 also updates the symbol table of the current scope according to the execution result of the ith primary subtask. In this way, in the k primary subtasks obtained by splitting the first task, the post-task of the ith primary subtask can refer to the execution result of the ith primary subtask by reading the symbol table in the first resource pool, so that the execution efficiency of the post-task is improved.
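As an illustration only, the symbol table of a scope can be pictured as a mapping from subtask name to execution result. The layout below, including the class name `SymbolTable` and the sample subtask name `extract`, is an assumption; the application does not specify the structure of the symbol table.

```python
from typing import Any, Dict


class SymbolTable:
    """Assumed layout: one mapping per scope, keyed by subtask name."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def update(self, subtask: str, result: Any) -> None:
        # Called after a subtask finishes, as the watcher does in S718.
        self._results[subtask] = result

    def lookup(self, subtask: str) -> Any:
        # A post task reads the result of a pre-task from the same scope.
        return self._results[subtask]


table = SymbolTable()
table.update("extract", {"rows": 120})
rows_for_post_task = table.lookup("extract")["rows"]
```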
After the (i+1)th primary trigger receives the trigger signal, it is determined whether the trigger signal is sufficient to switch the (i+1)th primary trigger from the pre-trigger state to the post-trigger state. If so, the first gatekeeper 702 starts executing the (i+1)th primary subtask, in a process similar to that of executing the ith primary subtask; if not, this indicates that the (i+1)th primary trigger is an 'all' trigger and must continue to wait for the trigger signals sent by its other pre-tasks.
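A minimal sketch of an 'all' trigger follows. It assumes the trigger is created with the number of signals it must collect (for example, the in-degree of its subtask) and changes state when the last one arrives; the class and attribute names are hypothetical.

```python
class AllTrigger:
    """Switches to the post-trigger state once every expected signal arrived."""

    def __init__(self, required_signals: int) -> None:
        self.required_signals = required_signals  # assumed: in-degree of the subtask
        self.received_signals = 0

    def receive(self) -> bool:
        """Record one trigger signal.

        Returns True only for the signal that completes the set, so the
        caller starts the subtask exactly once.
        """
        self.received_signals += 1
        return self.received_signals == self.required_signals


trigger = AllTrigger(required_signals=2)
first = trigger.receive()   # still waiting for the other pre-task
second = trigger.receive()  # all pre-tasks have signalled
```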
S722, the ith primary transmitter 707 intercepts the trigger signal, and notifies the first gatekeeper 702 to check whether the first task has a subtask being executed.
S723, the first watcher 702 confirms that the first task has no sub-task in execution, and notifies the first coordinator 704 to release the first resource pool.
Here, if the ith primary transmitter 707 intercepts the trigger signal, this indicates that the ith primary transmitter 707 is a special transmitter and that the ith primary subtask has no post task. After the ith primary subtask has finished executing, the first gatekeeper 702 checks whether the corresponding first task has other subtasks that have not finished executing. If not, the first task has finished executing, and the first gatekeeper 702 notifies the first coordinator 704 to release the first resource pool.
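The completion check in S722 and S723 can be sketched as follows, assuming the watcher keeps a set of unfinished subtasks and is given a callback that releases the resource pool. The names `Watcher`, `finish`, and `release_pool` are hypothetical.

```python
from typing import Callable, Set


class Watcher:
    """Tracks unfinished subtasks of one task and releases its resource pool."""

    def __init__(self, subtasks: Set[str], release_pool: Callable[[], None]) -> None:
        self.unfinished = set(subtasks)
        self.release_pool = release_pool

    def finish(self, subtask: str, has_post_task: bool) -> bool:
        """Mark a subtask as finished; return True if the pool was released."""
        self.unfinished.discard(subtask)
        if has_post_task:
            return False  # the transmitter passes the signal on instead
        if self.unfinished:
            return False  # other subtasks are still pending or executing
        self.release_pool()
        return True


released = []
watcher = Watcher({"a", "b"}, lambda: released.append("first resource pool"))
after_a = watcher.finish("a", has_post_task=False)
after_b = watcher.finish("b", has_post_task=False)
```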
S724, after receiving the ith primary subtask, the second disassembler 712 splits the ith primary subtask into next-level subtasks, namely a first secondary subtask, ..., a jth secondary subtask, ..., and a pth secondary subtask, and sends the p secondary subtasks to the second gatekeeper 710; wherein j and p are positive integers, and p is greater than or equal to 1.
S725, the second gatekeeper 710 creates a jth secondary trigger and a jth secondary transmitter corresponding to the jth secondary subtask according to the in-degree and out-degree of the jth secondary subtask.
S726, after the jth secondary trigger is triggered, a jth execution signal is sent to the second gatekeeper 710, where the jth execution signal is used to start the execution of the jth secondary sub-task.
S727, after receiving the jth execution signal sent by the jth secondary trigger, the second gatekeeper 710 sends the jth secondary subtask corresponding to the jth execution signal to the second coordinator 709.
S728, the second coordinator 709 sends the symbol table and the abstract syntax tree corresponding to the jth secondary subtask to the corresponding client according to the task sending rule.
Here, the second watcher 710 sends the ith primary subtask to the second disassembler 712, indicating that the ith primary subtask received by the second client 708 is a compound task.
If the ith primary subtask is a detachable composite task, the jth secondary subtask is processed in the second client 708 in substantially the same way as the ith primary subtask is processed in the first client 701, and the execution result of the jth secondary subtask can be obtained by performing the steps described above for the ith primary subtask.
If the ith primary subtask is an undetachable composite task, it was sent to the second client 708 because the first coordinator 704 determined that the second client 708 can execute all of the subtasks of the ith primary subtask. In this case, the second client 708 splits the ith primary subtask into p secondary subtasks and creates a secondary trigger and a secondary transmitter for each secondary subtask. When a secondary subtask starts to execute, the second coordinator 709 sends it to the second gatekeeper 710, and the second gatekeeper 710 sends it to the second executor 711. The second executor 711 of the second client 708 executes the secondary subtask and generates an execution result, which in turn triggers the post tasks. This continues until all secondary subtasks split from the ith primary subtask have been completed in the second client 708, at which point the execution result of the ith primary subtask is generated.
In the embodiment of the application, each client splits a composite task only once. Through iterative splitting in different clients, a composite task formed of nested atomic tasks is broken down into individual atomic tasks, and the atomic tasks are distributed to the clients for execution. Distributed processing of the workflow is thereby achieved, and development difficulty is reduced.
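The iterative splitting can be sketched in a single process. The sketch assumes a composite task is a nested list and an atomic task is a string; each pass through the loop stands in for one client splitting a task by exactly one level. The function names and the sample workflow are hypothetical.

```python
from collections import deque
from typing import List, Union

# Assumed representation: str = atomic task, list = composite task.
Task = Union[str, List["Task"]]


def split_once(task: List[Task]) -> List[Task]:
    """Split a composite task by one level only, as a single client would."""
    return list(task)


def decompose(task: Task) -> List[str]:
    """Repeat one-level splits until only atomic tasks remain."""
    atomic: List[str] = []
    queue = deque([task])
    while queue:
        current = queue.popleft()
        if isinstance(current, str):
            atomic.append(current)  # would be sent to a client for execution
        else:
            queue.extend(split_once(current))  # would be sent on for further splitting
    return atomic


workflow: Task = ["a", ["b", ["c", "d"]], "e"]
atomic_tasks = decompose(workflow)
```

The sketch ignores ordering constraints between subtasks; in the described scheme those are enforced by the triggers and transmitters rather than by the splitting step.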
Example eight
Based on the foregoing embodiments, the embodiments of the present application provide a task processing apparatus, where each module included in the apparatus and each unit included in each module can be implemented by a processor in a computer device; of course, the implementation can also be realized through a specific logic circuit; in implementation, the processor may be a Central Processing Unit (CPU), a Microprocessor (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), or the like.
Fig. 9 is a schematic structural diagram of a task processing device according to an embodiment of the present application, and as shown in fig. 9, the task processing device 900 includes a transceiver module 901, a task splitting module 902, and an execution result generating module 903, where:
the transceiver module 901 is configured to receive a first task sent by a task sending end;
a task splitting module 902, configured to split the first task to obtain k primary subtasks when it is determined that the first task is a detachable composite task; wherein k is a positive integer, and k is greater than or equal to 1;
the transceiver module 901 is further configured to send the k primary subtasks to a corresponding client; the client corresponding to the primary subtask is determined by the type of the primary subtask;
the transceiver module 901 is further configured to receive an execution result of the primary sub-task sent by the corresponding client;
an execution result generation module 903, configured to determine a first execution result of the first task according to an execution result of each primary subtask;
the transceiver module 901 is further configured to send the first execution result to the task sending end.
In some embodiments, the transceiver module comprises:
a sending sequence determining unit, configured to determine a sending sequence of the k primary subtasks according to a post task of the k primary subtasks;
and the task sending unit is used for sending the k primary subtasks to the corresponding client according to the sending sequence.
In some embodiments, the task sending unit includes:
a sequential ordering subunit configured to order the k primary subtasks according to the transmission sequence;
the head subtask sending subunit is used for sending the ith primary subtask to the corresponding client when the ith primary subtask has no preposed task; wherein i is a positive integer, i is more than or equal to 1 and less than or equal to k;
the non-head subtask sending subunit is configured to, when the ith primary subtask has a pre-task, acquire the number of trigger signals already generated for the ith primary subtask to obtain a first value; the trigger signal is generated according to the execution result of the pre-task of the ith primary subtask;
the non-head subtask sending subunit is further configured to determine the number of trigger signals of the ith primary subtask that need to be generated before the ith primary subtask starts to be processed, to obtain a second value;
and, when the first value is determined to be equal to the second value, to send the ith primary subtask to the corresponding client;
and the trigger signal generating subunit is used for receiving the execution result of the ith primary subtask, and then generating a trigger signal of a post task of the ith primary subtask according to the execution result.
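The interplay of the subunits listed above can be sketched as a small simulation. It assumes that the second value equals the number of pre-tasks of a subtask and that every finished subtask generates a trigger signal for each of its post tasks (no conditional branching). The function name `dispatch_order` and the sample graph are hypothetical.

```python
from typing import Dict, List


def dispatch_order(pre_tasks: Dict[str, List[str]]) -> List[str]:
    """Return the order in which subtasks would be sent to clients."""
    required = {t: len(p) for t, p in pre_tasks.items()}  # second value (assumed)
    received = {t: 0 for t in pre_tasks}                  # first value
    post_tasks: Dict[str, List[str]] = {t: [] for t in pre_tasks}
    for task, pres in pre_tasks.items():
        for pre in pres:
            post_tasks[pre].append(task)

    # Head subtasks have no pre-task and are sent immediately.
    ready = [t for t in pre_tasks if required[t] == 0]
    sent: List[str] = []
    while ready:
        task = ready.pop(0)
        sent.append(task)  # stands in for "send to client and receive the result"
        for post in post_tasks[task]:
            received[post] += 1  # a trigger signal was generated for the post task
            if received[post] == required[post]:
                ready.append(post)
    return sent


order = dispatch_order({"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]})
```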
In some embodiments, when sending the ith primary subtask to the corresponding client, the head subtask sending subunit or the non-head subtask sending subunit is specifically configured to perform the following:
if the ith primary subtask has a pre-task, judging whether the ith primary subtask is bound with the pre-task of the ith primary subtask, and if the ith primary subtask is bound with the pre-task of the ith primary subtask, determining a client receiving the pre-task of the ith primary subtask as a first client;
if the ith primary subtask is not bound with the pre-task of the ith primary subtask or the ith primary subtask does not have the pre-task, judging whether the ith primary subtask is an atomic task, an undetachable composite task or a detachable composite task, and obtaining a first judgment result;
if the first judgment result shows that the ith primary subtask is an atomic task, judging whether the ith primary subtask has a specified execution client;
if yes, determining the execution client specified by the ith primary subtask as a first client;
if not, determining the client with the atomic task interpreter corresponding to the ith primary subtask as the first client; wherein, the atomic task interpreter corresponding to the ith primary subtask is used for executing the ith primary subtask;
if the first judgment result shows that the ith primary subtask is an undetachable composite task, determining a client having the atomic task interpreters corresponding to all subtasks of the ith primary subtask as the first client;
if the first judgment result shows that the ith primary sub-task is a detachable composite task, determining an idle client in the cluster as a first client;
and sending the ith primary subtask to the first client.
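The selection rule above can be sketched as a single function. The data model is an assumption made for illustration: each client advertises a set of interpreter names and an idle flag, and each subtask records its kind, the interpreters it needs, an optionally specified client, and the client of a bound pre-task. None of these names come from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Client:
    name: str
    interpreters: Set[str] = field(default_factory=set)
    idle: bool = True


@dataclass
class Subtask:
    kind: str  # "atomic", "undetachable" or "detachable"
    needed_interpreters: Set[str] = field(default_factory=set)
    specified_client: Optional[str] = None
    bound_pre_task_client: Optional[str] = None  # set when bound to a pre-task


def select_client(task: Subtask, clients: List[Client]) -> Optional[Client]:
    """Pick the first client for a subtask; None if no client qualifies."""
    by_name = {c.name: c for c in clients}
    if task.bound_pre_task_client is not None:
        return by_name.get(task.bound_pre_task_client)
    if task.kind == "atomic" and task.specified_client is not None:
        return by_name.get(task.specified_client)
    if task.kind in ("atomic", "undetachable"):
        # The client must hold interpreters for every atomic task involved.
        return next((c for c in clients
                     if task.needed_interpreters <= c.interpreters), None)
    if task.kind == "detachable":
        return next((c for c in clients if c.idle), None)
    raise ValueError(f"unknown subtask kind: {task.kind}")
```

The sketch returns the first qualifying client; how ties between several qualifying clients are resolved is not modelled here.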
In some embodiments, the apparatus further comprises:
the analysis requirement judging module is used for judging whether the first task needs to be analyzed or not after the transceiving module receives the first task sent by the task sending end;
and the task analysis module is used for analyzing the first task when the first task needs to be analyzed.
In some embodiments, the transceiver module is further configured to:
when the first task is determined to be an atomic task, judging whether the first task has an appointed execution client;
if yes, determining the execution client specified by the first task as a second client;
if not, determining the client with the atomic task interpreter corresponding to the first task as a second client; wherein, the atomic task interpreter corresponding to the first task is used for executing the first task;
the transceiver module is further configured to send the first task to the second client, and receive a first execution result of the first task sent by the second client.
In some embodiments, the transceiver module is further configured to:
when the first task is determined to be an undetachable compound task, the first task is sent to a third client; the third client side is provided with an atomic task interpreter corresponding to all subtasks of the first task;
and receiving a first execution result of the first task sent by the third client.
In some embodiments, the apparatus further comprises:
and the atomic task execution module is used for executing the first task by using an atomic task interpreter corresponding to the first task when the first task does not need to be analyzed and is an atomic task, and obtaining a first execution result of the first task.
In some embodiments, the apparatus further comprises:
the non-detachable task splitting module is used for splitting the first task to obtain at least two primary subtasks when the first task does not need to be analyzed and is a non-detachable composite task;
the non-detachable task execution module is used for executing each primary subtask in the at least two primary subtasks by utilizing the atomic task interpreter corresponding to each primary subtask to obtain an execution result of each primary subtask;
and the non-detachable task execution result generation module is used for determining a first execution result of the first task according to the execution result of each primary subtask.
The above description of the apparatus embodiments is similar to the description of the method embodiments, and the apparatus embodiments have beneficial effects similar to those of the method embodiments. For technical details not disclosed in the apparatus embodiments of the present application, refer to the description of the method embodiments of the present application.
It should be noted that, in the embodiment of the present application, if the task processing method is implemented in the form of a software functional module and sold or used as a standalone product, it may also be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, an embodiment of the present application provides a task processing device, which includes a memory and a processor, where the memory stores a computer program that can run on the processor, and the processor executes the computer program to implement the steps in the task processing method provided in the foregoing embodiment.
Correspondingly, the embodiment of the present application provides a computer readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps in the task processing method provided in the above embodiment.
Here, it should be noted that the above description of the storage medium and device embodiments is similar to the description of the method embodiments, and these embodiments have beneficial effects similar to those of the method embodiments. For technical details not disclosed in the storage medium and device embodiments of the present application, refer to the description of the method embodiments of the present application.
Fig. 10 is a schematic diagram of a hardware entity of a task processing device according to an embodiment of the present application, and as shown in fig. 10, the hardware entity of the task processing device 1000 includes: a processor 1001, a communication interface 1002, and a memory 1003, wherein:
the processor 1001 generally controls the overall operation of the task processing device 1000.
The communication interface 1002 may enable the task processing device 1000 to communicate with other devices via a network.
The memory 1003 is configured to store instructions and applications executable by the processor 1001, and may also cache data to be processed or already processed by the processor 1001 and each module in the task processing device 1000; it may be implemented by a flash memory (FLASH) or a Random Access Memory (RAM).
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application. The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as a removable Memory device, a Read Only Memory (ROM), a magnetic disk, or an optical disk.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium, which includes several instructions for causing a computer device to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a removable storage device, a ROM, a magnetic or optical disk, or other various media that can store program code.
The above description is only for the embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A task processing method is applied to a client in a cluster, wherein the cluster comprises a plurality of clients capable of communicating with each other, and the method comprises the following steps:
receiving a first task sent by a task sending end;
when the first task is determined to be a detachable composite task, splitting the first task to obtain k primary subtasks; wherein k is a positive integer, and k is more than or equal to 1;
sending the k primary subtasks to corresponding clients; the client corresponding to the primary subtask is determined by the type of the primary subtask;
receiving an execution result of a primary subtask sent by a corresponding client;
determining a first execution result of the first task according to an execution result of each primary subtask;
sending the first execution result to the task sending end;
wherein the sending the k primary subtasks to the corresponding client comprises:
determining the sending sequence of the k primary subtasks according to the post tasks of the k primary subtasks;
according to the sending sequence, the k primary subtasks are sent to the corresponding client;
wherein the sending the k primary subtasks to the corresponding client according to the sending sequence includes:
arranging the k primary subtasks according to the sending sequence;
when the ith primary subtask is sent, if the ith primary subtask has no preposed task, the ith primary subtask is sent to the corresponding client; wherein i is a positive integer, i is more than or equal to 1 and less than or equal to k;
if the ith primary subtask has a pre-task, acquiring the number of generated trigger signals of the ith primary subtask to obtain a first numerical value; the trigger signal is generated according to an execution result of a front task of the ith primary subtask;
determining the number of trigger signals of the ith primary subtask, which need to be generated before the ith primary subtask starts to be processed, to obtain a second value;
when the first numerical value is determined to be equal to the second numerical value, sending the ith primary subtask to the corresponding client;
and after receiving the execution result of the ith primary subtask, generating a trigger signal of a post task of the ith primary subtask according to the execution result.
2. The method of claim 1, wherein the sending the ith primary subtask to the corresponding client comprises:
if the ith primary subtask has a pre-task, judging whether the ith primary subtask is bound with the pre-task of the ith primary subtask, and if the ith primary subtask is bound with the pre-task of the ith primary subtask, determining a client receiving the pre-task of the ith primary subtask as a first client;
if the ith primary subtask is not bound with the pre-task of the ith primary subtask or the ith primary subtask does not have the pre-task, judging whether the ith primary subtask is an atomic task, an undetachable composite task or a detachable composite task, and obtaining a first judgment result;
if the first judgment result shows that the ith primary subtask is an atomic task, judging whether the ith primary subtask has a specified execution client;
if yes, determining the execution client specified by the ith primary subtask as a first client;
if not, determining the client with the atomic task interpreter corresponding to the ith primary subtask as the first client; wherein, the atomic task interpreter corresponding to the ith primary subtask is used for executing the ith primary subtask;
if the first judgment result shows that the ith primary subtask is an undetachable composite task, determining a client having the atomic task interpreters corresponding to all subtasks of the ith primary subtask as the first client;
if the first judgment result shows that the ith primary subtask is a detachable composite task, determining an idle client in the cluster as a first client;
and sending the ith primary subtask to the first client.
3. The method according to claim 1 or 2, wherein after receiving the first task sent by the task sender, the method further comprises:
judging whether the first task needs to be analyzed or not;
and when the first task is determined to need to be analyzed, analyzing the first task.
4. The method of claim 3, further comprising:
when the first task is determined to be an atomic task, judging whether the first task has a specified execution client;
if yes, determining the execution client specified by the first task as a second client;
if not, determining the client with the atomic task interpreter corresponding to the first task as a second client; wherein, the atomic task interpreter corresponding to the first task is used for executing the first task;
sending the first task to the second client;
receiving a first execution result of the first task sent by the second client;
and sending the first execution result to the task sending end.
5. The method of claim 3, further comprising:
when the first task is determined to be an undetachable composite task, sending the first task to a third client; the third client has atomic task interpreters corresponding to all subtasks of the first task;
receiving a first execution result of the first task sent by the third client;
and sending the first execution result to the task sending end.
6. The method of claim 3, further comprising:
when the first task is determined not to be analyzed and is an atomic task, executing the first task by using an atomic task interpreter corresponding to the first task to obtain a first execution result of the first task;
and sending the first execution result to the task sending end.
7. The method of claim 3, further comprising:
when the first task does not need to be analyzed and is determined to be an undetachable composite task, splitting the first task to obtain at least two primary subtasks;
executing each primary subtask of the at least two primary subtasks by using an atomic task interpreter corresponding to each primary subtask to obtain an execution result of each primary subtask;
determining a first execution result of the first task according to the execution result of each primary subtask;
and sending the first execution result to the task sending end.
8. A task processing apparatus, wherein the apparatus is a client in a cluster, the cluster includes a plurality of clients that can communicate with each other, and the apparatus includes:
the receiving and sending module is used for receiving a first task sent by the task sending end;
the task splitting module is used for splitting the first task to obtain k primary subtasks when the first task is determined to be a detachable composite task; wherein k is a positive integer, and k is more than or equal to 1;
the receiving and sending module is further configured to send the k primary subtasks to corresponding clients; the client corresponding to the primary subtask is determined by the type of the primary subtask;
the receiving and sending module is further used for receiving the execution result of the primary subtask sent by the corresponding client;
the execution result generation module is used for determining a first execution result of the first task according to the execution result of each primary subtask;
the receiving and sending module is further configured to send the first execution result to the task sending end;
the transceiver module comprises:
a sending sequence determining unit, configured to determine a sending sequence of the k primary subtasks according to a post task of the k primary subtasks;
the task sending unit is used for sending the k primary subtasks to the corresponding client according to the sending sequence;
the task sending unit comprises:
a sequential ordering subunit configured to order the k primary subtasks according to the transmission sequence;
the head subtask sending subunit is used for sending the ith primary subtask to the corresponding client when the ith primary subtask has no preposed task; wherein i is a positive integer, i is more than or equal to 1 and less than or equal to k;
the non-head subtask sending subunit is used for acquiring, when the ith primary subtask has a pre-task, the number of generated trigger signals of the ith primary subtask to obtain a first numerical value; the trigger signal is generated according to the execution result of the pre-task of the ith primary subtask; the subunit is further used for determining the number of trigger signals of the ith primary subtask that need to be generated before the ith primary subtask starts to be processed, to obtain a second numerical value; and for sending the ith primary subtask to the corresponding client when the first numerical value is determined to be equal to the second numerical value;
and the trigger signal generating subunit is used for receiving the execution result of the ith primary subtask, and then generating a trigger signal of a post task of the ith primary subtask according to the execution result.
9. A task processing apparatus, characterized in that the apparatus comprises a memory and a processor, the memory storing a computer program operable on the processor, and the processor implementing the task processing method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium having computer-executable instructions stored therein, the computer-executable instructions being configured to perform the task processing method of any one of claims 1 to 7.
CN201910887217.1A 2019-09-19 2019-09-19 Task processing method and device and storage medium Active CN112527471B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910887217.1A CN112527471B (en) 2019-09-19 2019-09-19 Task processing method and device and storage medium

Publications (2)

Publication Number Publication Date
CN112527471A CN112527471A (en) 2021-03-19
CN112527471B CN112527471B (en) 2023-04-07

Family

ID=74974205

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910887217.1A Active CN112527471B (en) 2019-09-19 2019-09-19 Task processing method and device and storage medium

Country Status (1)

Country Link
CN (1) CN112527471B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113254506A (en) * 2021-06-18 2021-08-13 浙江口碑网络技术有限公司 Data processing method and device, computer equipment and storage medium
CN113805976A (en) * 2021-09-16 2021-12-17 上海商汤科技开发有限公司 Data processing method and device, electronic equipment and computer readable storage medium
CN115392063B (en) * 2022-10-31 2023-01-31 西安羚控电子科技有限公司 Multi-rate simulation method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100497384B1 (en) * 2003-01-28 2005-06-23 삼성전자주식회사 Distributed processing system using virtual machine, and method thereof
CN102508704A (en) * 2011-11-10 2012-06-20 上海市共进通信技术有限公司 Method for implementing task decomposition and parallel processing in computer software system
CN108566408A (en) * 2018-01-18 2018-09-21 咪咕文化科技有限公司 Service processing method, device and storage medium
CN110113387A (en) * 2019-04-17 2019-08-09 深圳前海微众银行股份有限公司 Processing method, device and system based on a distributed batch processing system

Also Published As

Publication number Publication date
CN112527471A (en) 2021-03-19

Similar Documents

Publication Publication Date Title
CN109889575B (en) Collaborative computing platform system and method under edge environment
CN112527471B (en) Task processing method and device and storage medium
CN108920259B (en) Deep learning job scheduling method, system and related equipment
US20190377604A1 (en) Scalable function as a service platform
US11016673B2 (en) Optimizing serverless computing using a distributed computing framework
CN109240758B (en) Method for supporting synchronous asynchronous unified call of plug-in interface and microkernel system
CN108696381B (en) Protocol configuration method and device
CN110908641B (en) Visualization-based stream computing platform, method, device and storage medium
CN110333941B (en) Big data real-time calculation method based on sql
US20160294929A1 (en) System and method for reusing javascript code available in a soa middleware environment from a process defined by a process execution language
CN112313627B (en) Mapping mechanism of event to serverless function workflow instance
US10225375B2 (en) Networked device management data collection
US20220179711A1 (en) Method For Platform-Based Scheduling Of Job Flow
CN110888736A (en) Application management method and system based on container cloud platform and related components
CN100462956C (en) Method and system for loading programme on computer system
US9996344B2 (en) Customized runtime environment
US10268496B2 (en) System and method for supporting object notation variables in a process defined by a process execution language for execution in a SOA middleware environment
CN110891083B (en) Agent method for supporting multi-job parallel execution in Gaia
KR102441167B1 (en) Apparatus and method for executing function
US20160291941A1 (en) System and method for supporting javascript as an expression language in a process defined by a process execution language for execution in a soa middleware environment
Kang et al. Android RMI: a user-level remote method invocation mechanism between Android devices
Campos et al. The chance for Ada to support distribution and real-time in embedded systems
US10223142B2 (en) System and method for supporting javascript activities in a process defined by a process execution language for execution in a SOA middleware environment
JP2004520641A (en) Event bus architecture
EP3495960A1 (en) Program, apparatus, and method for communicating data between parallel processor cores

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant