CN114721811A - Distributed processing method, system and device

Info

Publication number
CN114721811A

Authority
CN (China)
Prior art keywords
task, node, execution, address, control end

Legal status
Pending
Application number
CN202110010199.6A

Other languages
Chinese (zh)

Inventors
沈村敬
董俊峰
强群力
刘超千
赵彤

Assignee
NetsUnion Clearing Corp

Application filed by NetsUnion Clearing Corp

Classifications

    • G06F9/5072 Grid computing (under G06F9/50 Allocation of resources, G06F9/5061 Partitioning or combining of resources)
    • G06F11/203 Failover techniques using migration (under G06F11/20 Error detection or correction by redundancy in hardware using active fault-masking, G06F11/2023 Failover techniques)
    • G06F9/5027 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals (under G06F9/5005 Allocation of resources to service a request)
    • G06F2209/5017 Task decomposition (indexing scheme relating to G06F9/50)

Abstract

The application discloses a distributed processing method, system and device. The method is based on a distributed processing system that comprises a first control end and a plurality of execution nodes, and is executed by the first control end. The method comprises the following steps: acquiring a task to be processed; determining, according to the task, a task fragment corresponding to each execution node; distributing the task fragment corresponding to each execution node to that execution node, so that each execution node processes the task fragment distributed to it and stores the resulting intermediate result locally on the execution node; when a preset first condition is met, acquiring the intermediate results stored locally on the execution nodes; and aggregating the intermediate results obtained from the execution nodes to obtain the processing result of the task. The distributed processing process in this specification does not depend on a shared file, which effectively avoids the negative effects that faults of the shared file and limits on its storage space would otherwise have on distributed processing.

Description

Distributed processing method, system and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a distributed processing method, system, and apparatus.
Background
Under a distributed task processing architecture, the task processing function of a distributed processing system is dispersed across a plurality of nodes, which helps relieve the task processing pressure on the system.
Generally, after each node in the distributed processing system processes the fragment assigned to it, it generates a file representing the intermediate result obtained from that fragment, and stores the intermediate result in a shared file of the distributed processing system so that the intermediate results in the shared file can be further processed in subsequent steps. If the shared file fails, the intermediate results generated by the nodes are all affected. In addition, the storage space of the shared file is limited; if the intermediate result generated by one node is large, it crowds out the storage of the intermediate results generated by the other nodes.
Disclosure of Invention
The embodiments of the application provide a distributed processing method, system and device, which are used to at least partially solve the problems caused by faults of shared files and their limited storage space.
The embodiments of the application adopt the following technical solutions:
In a first aspect, an embodiment of the present application provides a distributed processing method. The method is based on a distributed processing system that comprises a first control end and a plurality of execution nodes, and is executed by the first control end. The method comprises:
acquiring a task to be processed;
determining, according to the task, a task fragment corresponding to each execution node;
distributing the task fragment corresponding to each execution node to that execution node, so that each execution node processes the task fragment distributed to it and stores the resulting intermediate result locally on the execution node;
when a preset first condition is met, acquiring the intermediate results stored locally on the execution nodes;
and aggregating the intermediate results obtained from the execution nodes to obtain the processing result of the task.
In an optional embodiment of the present specification, the first control end comprises a control node and a coordination component that are connected; the control node is used to determine the correspondence between task fragments and execution nodes, and the coordination component is used to distribute each task fragment to the execution node corresponding to it.
In an optional embodiment of this specification, before determining, according to the task, a task fragment corresponding to each execution node, the method further includes:
and when the coordination component cannot detect the control node connected with the coordination component, determining at least one of the execution nodes as a new control node, and executing distributed processing according to the new control node.
In an optional embodiment of this specification, distributing the task fragment corresponding to each execution node to the corresponding execution node comprises:
for each task fragment, taking the storage location, in a preset data source, of the data on which the task fragment depends as a first address of the task fragment;
and sending the first address of the task fragment to the corresponding execution node, so that the execution node acquires the data according to the received first address in order to process the task fragment distributed to it.
In an optional embodiment of the present specification, the distributed processing system further comprises an address database; the address database is used to store a second address for each task fragment, the second address indicating the local storage location, on the execution node, of the intermediate result corresponding to the task fragment.
In an optional embodiment of this specification, before acquiring the intermediate results stored locally on the execution nodes, the method further comprises:
for each task fragment, determining whether a second address corresponding to the task fragment exists in the address database;
and if so, determining that the preset first condition is met.
In an optional embodiment of the present specification, acquiring the intermediate results stored locally on the execution nodes comprises:
for each execution node, determining, from the address database, the second address generated by the execution node and corresponding to the task;
and acquiring the intermediate result stored locally on the execution node according to the determined second address.
In an optional embodiment of the present specification, the first control end comprises a control node;
acquiring the intermediate results stored locally on the execution nodes comprises:
for each execution node, the control node acquiring the intermediate result corresponding to the task that is stored locally on the execution node, and storing the intermediate result locally on the control node;
aggregating the intermediate results obtained from the execution nodes comprises:
the control node aggregating, locally on the control node, the intermediate results obtained from the execution nodes.
In an optional embodiment of the present specification, the distributed processing system further comprises a second control end;
after obtaining the processing result of the task, the method further comprises:
and locally storing the processing result in the control node, and determining the storage position of the processing result in the control node as a third address, so that the second control end locally acquires the processing result from the control node according to the third address.
In an optional embodiment of the present specification, after determining the local storage location of the processing result on the control node as the third address, the method further comprises:
if a processing result deleting instruction is received, the control node deleting the locally stored processing result corresponding to the task; the processing result deleting instruction is generated by the second control end according to the task.
In an optional embodiment of this specification, before acquiring the intermediate results stored locally on the execution nodes, the method further comprises:
when a failed node exists among the execution nodes, distributing the task fragment corresponding to the task that was originally processed by the failed node to execution nodes other than the failed node.
In an optional embodiment of the present specification, the first control end comprises a coordination component, and the coordination component is separately connected to each execution node;
before distributing the task fragment corresponding to the task originally processed by the failed node to execution nodes other than the failed node, the method further comprises:
when the coordination component detects a disconnected execution node, determining the disconnected execution node as a failed node.
In an optional embodiment of the present specification, acquiring the intermediate results stored locally on the execution nodes comprises:
for each execution node, if the intermediate result corresponding to the task that is stored locally on the execution node is not obtained, determining that the execution node is a failed node;
distributing the task fragment corresponding to the task that was originally processed by the failed node to execution nodes other than the failed node, and acquiring the intermediate result corresponding to the task fragment from those execution nodes.
In an optional embodiment of this specification, distributing the task fragment corresponding to the task originally processed by the failed node to execution nodes other than the failed node comprises:
determining, among the execution nodes other than the failed node, the execution node with the most available data processing resources;
and distributing the task fragment corresponding to the task that was originally processed by the failed node to the execution node with the most available data processing resources.
In an optional embodiment of this specification, after obtaining the processing result of the task, the method further includes:
and generating an intermediate result deleting instruction, and sending the intermediate result deleting instruction to each execution node, so that each execution node deletes the locally stored intermediate result corresponding to the task according to the intermediate result deleting instruction.
In an optional embodiment of the present specification, at least one of the first address and the second address uses the HTTPS protocol.
In a second aspect, an embodiment of the present application further provides a method based on a distributed processing system, where the distributed processing system comprises a second control end and a plurality of machine rooms connected to the second control end; at least one machine room comprises the first control end and the plurality of execution nodes of the first aspect; the tasks processed by the machine rooms belong to the same service. The method is executed by the second control end and comprises:
when a preset second condition is met, acquiring the processing result obtained by each machine room for its task;
and processing the service corresponding to the processing results according to the processing results obtained from the machine rooms; at least one of the processing results is obtained by the method of the first aspect.
In an optional embodiment of the present specification, the first control end comprises a control node and a coordination component that are connected;
before acquiring the processing result obtained by a machine room for its task, the method further comprises:
for each machine room, acquiring a third address generated by the control node of the machine room; the third address indicates the storage location, local to the control node of the machine room, of the processing result obtained by the machine room for the task.
In an optional embodiment of this specification, before acquiring the processing result obtained by a machine room for its task, the method further comprises:
if, for each machine room, the third address generated by the control node of the machine room has been acquired, determining that the preset second condition is met.
In an optional embodiment of this specification, acquiring the processing result obtained by a machine room for its task comprises:
acquiring the third address corresponding to the machine room;
and acquiring, according to the acquired third address, the processing result obtained by the machine room for the task from the local storage of the control node of the machine room.
In a third aspect, an embodiment of the present application further provides a distributed processing method. The method is based on a distributed processing system that comprises a first control end and a plurality of execution nodes, and is executed by any execution node. The method comprises:
processing the task fragment distributed to the execution node by the first control end to obtain an intermediate result; the task fragment is obtained according to a task, and the intermediate result corresponds to the task;
storing the intermediate result locally;
and when the first control end acquires the intermediate result corresponding to the task, sending the intermediate result to the first control end, so that the first control end obtains the processing result of the task according to the intermediate result.
In an optional embodiment of this specification, before processing the task fragment distributed to the execution node by the first control end, the method further comprises:
receiving a first address sent by the first control end, where the first address indicates the storage location, in a preset data source, of the data on which the task fragment depends;
and acquiring the data from the data source according to the first address in order to process the task fragment distributed to the execution node.
In an optional embodiment of the present specification, storing the intermediate result locally comprises:
storing the intermediate result locally, and determining the local storage location of the intermediate result as a second address;
the distributed processing system further comprises an address database; after storing the intermediate result locally, the method further comprises:
sending the second address to the address database, so that the first control end acquires the second address through the address database.
In an optional embodiment of this specification, sending the intermediate result to the first control end when the first control end acquires the intermediate result corresponding to the task comprises:
when the first control end acquires the intermediate result corresponding to the task according to the second address, sending the locally stored intermediate result corresponding to the second address to the first control end.
In an optional embodiment of the present specification, after sending the intermediate result to the first control end, the method further comprises:
if an intermediate result deleting instruction is received, the execution node deleting the locally stored intermediate result corresponding to the task; the intermediate result deleting instruction is generated by the first control end according to the task.
In a fourth aspect, the embodiments of the present specification further provide a first distributed processing apparatus, which is applied to a first control end. The first control end belongs to a distributed processing system that further comprises a plurality of execution nodes. The first distributed processing apparatus comprises one or more of the following modules:
a task acquisition module configured to acquire a task to be processed;
a task fragment determination module configured to determine, according to the task, the task fragment corresponding to each execution node;
a first distribution module configured to distribute the task fragment corresponding to each execution node to that execution node, so that each execution node processes the task fragment distributed to it and stores the resulting intermediate result locally on the execution node;
an intermediate result acquisition module configured to acquire, when a preset first condition is met, the intermediate results stored locally on the execution nodes;
and an aggregation module configured to aggregate the intermediate results obtained from the execution nodes to obtain the processing result of the task.
In an optional embodiment of the present specification, the first control end comprises a control node and a coordination component that are connected; the control node is used to determine the correspondence between task fragments and execution nodes, and the coordination component is used to distribute each task fragment to the execution node corresponding to it.
In an optional embodiment of the present specification, the first distributed processing apparatus may further comprise a control node determination module. The control node determination module is configured to: when the coordination component cannot detect the control node connected to it, determine at least one of the execution nodes as a new control node, and continue the distributed processing with the new control node.
In an optional embodiment of the present specification, the first distribution module is specifically configured to: for each task fragment, take the storage location, in a preset data source, of the data on which the task fragment depends as a first address of the task fragment, and send the first address of the task fragment to the corresponding execution node, so that the execution node acquires the data according to the received first address in order to process the task fragment distributed to it.
In an optional embodiment of the present specification, the distributed processing system further comprises an address database; the address database is used to store a second address for each task fragment, the second address indicating the local storage location, on the execution node, of the intermediate result corresponding to the task fragment.
In an optional embodiment of this specification, the first distributed processing apparatus may further comprise a first judgment module. The first judgment module determines, for each task fragment, whether a second address corresponding to the task fragment exists in the address database, and if so, determines that the preset first condition is met.
In an optional embodiment of this specification, the intermediate result acquisition module is specifically configured to: for each execution node, determine, from the address database, the second address generated by the execution node and corresponding to the task; and acquire the intermediate result stored locally on the execution node according to the determined second address.
In an optional embodiment of the present specification, the first control end comprises a control node. The intermediate result acquisition module is specifically configured to: for each execution node, cause the control node to acquire the intermediate result corresponding to the task that is stored locally on the execution node, and store the intermediate result locally on the control node. The aggregation module is specifically configured to aggregate, locally on the control node, the intermediate results obtained from the execution nodes.
In an optional embodiment of the present specification, the distributed processing system may further comprise a second control end, and the first distributed processing apparatus may further comprise a third address generation module. The third address generation module is configured to: store the processing result locally on the control node, and determine the local storage location of the processing result on the control node as a third address, so that the second control end acquires the processing result from the control node according to the third address.
In an optional embodiment of the present specification, the first distributed processing apparatus may further comprise a first receiving module. The first receiving module is configured to delete the locally stored processing result corresponding to the task if a processing result deleting instruction is received; the processing result deleting instruction is generated by the second control end according to the task.
In an optional embodiment of the present specification, the first distributed processing apparatus may further comprise a second distribution module. The second distribution module is configured to distribute, when a failed node is detected among the execution nodes, the task fragment corresponding to the task that was originally processed by the failed node to execution nodes other than the failed node.
In an optional embodiment of the present specification, the first control end may further comprise a coordination component separately connected to each execution node, and the first distributed processing apparatus may further comprise a detection module. The detection module is configured to: when the coordination component detects a disconnected execution node, determine the disconnected execution node as a failed node.
In an optional embodiment of this specification, the intermediate result acquisition module is specifically configured to: for each execution node, if the intermediate result corresponding to the task that is stored locally on the execution node is not obtained, determine that the execution node is a failed node; distribute the task fragment corresponding to the task that was originally processed by the failed node to execution nodes other than the failed node, and acquire the intermediate result corresponding to the task fragment from those execution nodes.
In an optional embodiment of the present specification, the first distributed processing apparatus may further comprise a resource determination module. The resource determination module is configured to determine, among the execution nodes other than the failed node, the execution node with the most available data processing resources, and distribute the task fragment corresponding to the task that was originally processed by the failed node to that execution node.
In an optional embodiment of the present specification, the first distributed processing apparatus may further comprise a first deletion module. The first deletion module is configured to generate an intermediate result deleting instruction and send it to each execution node, so that each execution node deletes, according to the instruction, its locally stored intermediate result corresponding to the task.
In a fifth aspect, the present specification further provides a second distributed processing apparatus, which is applied to a second control end. The distributed processing system further comprises a plurality of machine rooms; at least one machine room comprises a first control end and a plurality of execution nodes; the tasks processed by the machine rooms belong to the same service. The second distributed processing apparatus comprises one or more of the following modules:
a processing result acquisition module configured to acquire, when a preset second condition is met, the processing result obtained by each machine room for its task;
and a processing module configured to process the service corresponding to the processing results according to the processing results obtained from the machine rooms, at least one of the processing results being obtained through any one of the distributed processing procedures described above.
In an optional embodiment of the present specification, the second distributed processing apparatus may further comprise a second judgment module. The second judgment module is configured to: if, for each machine room, the third address generated by the control node of the machine room has been acquired, determine that the preset second condition is met.
In an optional embodiment of the present specification, the second distributed processing apparatus may further comprise a third address acquisition module. The third address acquisition module is configured to acquire, for each machine room, the third address generated by the control node of the machine room; the third address indicates the storage location, local to the control node of the machine room, of the processing result obtained by the machine room for the task.
In an optional embodiment of this specification, the processing result acquisition module is specifically configured to: acquire the third address corresponding to the machine room; and acquire, according to the acquired third address, the processing result obtained by the machine room for the task from the local storage of the control node of the machine room.
In a sixth aspect, embodiments of the present specification further provide a third distributed processing apparatus. The third distributed processing apparatus is applied to any one execution node in a plurality of execution nodes of a distributed processing system, and the distributed processing system further includes a first control end. The third distributed processing apparatus comprises one or more of the following modules:
The intermediate result generation module is configured to process the task fragment distributed to the execution node by the first control end to obtain an intermediate result; the task fragment is obtained according to a task, and the intermediate result corresponds to the task.
A storage module configured to store the intermediate result locally.
The intermediate result sending module is configured to send the intermediate result to the first control end when the first control end acquires the intermediate result corresponding to the task, so that the first control end obtains the processing result of the task according to the intermediate result.
In an optional embodiment of this specification, the third distributed processing apparatus may further include a first address receiving module and a task fragment processing module.
The first address receiving module is configured to receive a first address sent by the first control end, where the first address indicates the storage location, in a preset data source, of the data on which processing the task fragment depends.
The task fragment processing module is configured to acquire the data from the data source according to the first address in order to process the task fragment distributed to the execution node.
In an optional embodiment of the present specification, the storage module is specifically configured to store the intermediate result locally and determine the local storage location of the intermediate result as a second address.
In an optional embodiment of the present specification, the third distributed processing apparatus may further comprise a second address sending module. The second address sending module is configured to: send the second address to the address database, so that the first control end acquires the second address through the address database.
In an optional embodiment of this specification, the intermediate result sending module is specifically configured to send the locally stored intermediate result corresponding to the second address to the first control end when the first control end obtains the intermediate result corresponding to the task according to the second address.
In an optional embodiment of the present specification, the third distributed processing apparatus may further comprise a second deletion module. The second deletion module is configured to: if an intermediate result deleting instruction is received, delete the locally stored intermediate result corresponding to the task; the intermediate result deleting instruction is generated by the first control end according to the task.
In a seventh aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to perform any one of the methods provided by the first, second or third aspects of the specification.
In an eighth aspect, embodiments of the present application further provide a computer-readable storage medium storing one or more programs which, when executed by an electronic device including a plurality of application programs, cause the electronic device to perform any one of the methods provided in the first, second or third aspects of the present specification.
The embodiments of the application adopt at least one of the above technical solutions, which can achieve the following beneficial effects: the distributed processing method and apparatus in this specification are based on a distributed processing system comprising a first control end and a plurality of execution nodes. After at least one execution node processes the fragment distributed to it and obtains an intermediate result, the intermediate result is stored locally rather than in a shared file, and in the subsequent steps the first control end acquires the intermediate result from the execution node's local storage rather than from a shared file. The distributed processing process in this specification therefore does not depend on a shared file, which effectively avoids the negative effects that faults of the shared file and limits on its storage space would otherwise have on distributed processing. Moreover, because the intermediate results are stored locally on the individual execution nodes in a distributed manner, even if one execution node fails, the intermediate results stored on the other execution nodes are not affected, which helps improve the disaster recovery capability of the distributed processing system.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic diagram of task processing performed by an existing distributed processing system;
fig. 2 is a schematic diagram of a partial architecture of a distributed processing system according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram of a distributed processing process provided in an embodiment of the present specification;
fig. 4 is a schematic diagram of a partial architecture of a distributed processing system according to an embodiment of the present disclosure;
fig. 5 is a schematic view of a scenario of determining a faulty node in a distributed processing process according to an embodiment of the present specification;
fig. 6 is a schematic view of a scenario of determining a faulty node in a distributed processing process according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of a first distributed processing apparatus according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of a second distributed processing apparatus provided in an embodiment of the present specification;
fig. 9 is a schematic structural diagram of a third distributed processing apparatus provided in an embodiment of the present specification;
fig. 10 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present application clearer, the technical solutions of the present application will be described clearly and completely below with reference to specific embodiments of the present application and the accompanying drawings. It is apparent that the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
An existing distributed processing system typically has the architecture shown in fig. 1. The master control end splits a task into a plurality of fragments (fragment 1 to fragment k) and distributes the fragments to the nodes (node 1 to node k). Each node processes the fragments it is responsible for to obtain intermediate results, and then writes the generated intermediate results into a shared file that serves as unified storage. The master control end then reads the intermediate results from the shared file and carries out the next stage of processing.
Task processing based on such an existing distributed processing system therefore depends heavily on the shared file: if the shared file has a problem, the task cannot continue to be executed.
In view of this, the technical solutions provided by the embodiments of the present specification are proposed to at least partially solve the problem caused by the dependence on the shared file.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
The distributed processing process in this specification is based on a distributed processing system. As shown in fig. 2, the distributed processing system comprises a second control end and several machine rooms. Taking machine room i (the ith machine room, which may be any one of the machine rooms) as an example, machine room i comprises a first control end and several execution nodes (execution node 1 to execution node m in fig. 2). The second control end is communicatively connected to the first control end of each machine room.
The second control end interacts with the first control end of each machine room, and each first control end interacts with the execution nodes in the machine room to which it belongs. The first control end manages and controls the distributed processing process of its machine room, while the execution nodes carry out the specific processing.
For convenience of description, this specification describes the distributed processing process using the interaction between the second control end, the first control end in machine room i, and execution node j in machine room i (execution node j may be any one of the execution nodes in machine room i) as an example. As shown in fig. 3, the distributed processing in this specification may include one or more of the following steps:
s300: the first control end obtains a task to be processed.
In this specification, the objects handled by the distributed processing system are "services"; that is, a service is the smallest unit handled by the distributed processing system. After receiving a service, the distributed processing system splits it into a plurality of tasks, one for each machine room involved, and distributes the tasks to the machine rooms, so that each task is processed by exactly one machine room; that is, a task is the smallest unit processed by a machine room. As shown in fig. 2, machine room i obtains task i.
The specific meanings of services and tasks in this specification may be determined according to the actual scenario. For example, where the distributed processing in this specification is applied to a financial scenario, a service may be the clearing of a certain transaction batch, and a task may be the clearing of a base-table-level transaction.
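As a concrete illustration of the service-to-task split described above, the following Python sketch (not taken from the patent; all identifiers are hypothetical) derives one task per machine room involved in a service:

```python
def split_service(service_id: str, machine_rooms: list) -> dict:
    """One task per machine room involved in the service; a task is the
    smallest unit a single machine room processes."""
    return {room: {"task_id": f"{service_id}/{room}", "room": room}
            for room in machine_rooms}

# e.g. split_service("clearing-batch-20210105", ["room-1", "room-2"])
```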
In this specification, a task for which a processing result has not yet been obtained is in the state of being processed.
In an optional embodiment, the task may be obtained by the first control end from the second control end.
S302: the first control end determines the task fragment corresponding to each execution node according to the task.
Existing methods for splitting a task into fragments can be applied to the process in this specification.
In an optional embodiment of this specification, the first control end may divide the task according to the number of execution nodes in the machine room to which it belongs. For example, in the example shown in fig. 2, machine room i contains m execution nodes, and the first control end divides the task into m fragments.
It should be noted that an execution node in this specification is a node that has normal processing capability and is used in the distributed processing. A machine room may also contain other nodes whose processing capability is abnormal (e.g., failed nodes) and/or nodes that are not used in the distributed processing (i.e., non-execution nodes).
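A minimal sketch of the sharding in step S302, assuming the task's input can be addressed as a contiguous range of records; the dataclass, node names, and ceiling-division split policy are illustrative assumptions, not the patent's prescription:

```python
from dataclasses import dataclass

@dataclass
class TaskFragment:
    fragment_id: int
    record_range: tuple  # (start, end) offsets into the task's input data

def shard_task(total_records: int, execution_nodes: list) -> dict:
    """Split a task over len(execution_nodes) fragments and return a
    node -> fragment correspondence."""
    m = len(execution_nodes)
    per_node = (total_records + m - 1) // m  # ceiling division
    assignment = {}
    for j, node in enumerate(execution_nodes):
        start, end = j * per_node, min((j + 1) * per_node, total_records)
        if start < end:
            assignment[node] = TaskFragment(j, (start, end))
    return assignment

# e.g. shard_task(1_000_000, [f"exec-node-{k}" for k in range(1, 5)])
```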
S304: and the first control end distributes the task fragments corresponding to the execution nodes to the corresponding execution nodes.
After the correspondence between execution nodes and task fragments has been determined (the process of determining the correspondence may be as shown in fig. 4), each fragment can be allocated to the execution node corresponding to it. As shown in fig. 2, task fragment 1 is assigned to execution node 1, and so on.
S306: execution node j processes the task fragment distributed to it by the first control end to obtain an intermediate result.
Since the task fragment processed by execution node j is obtained from the task, the intermediate result generated by execution node j corresponds to the task.
This specification does not specifically limit how an execution node processes a fragment; any process capable of processing the task fragment is suitable for this step.
S308: execution node j stores the intermediate result in its own local storage.
After generating an intermediate result from the task fragment it received, execution node j stores the intermediate result directly in its own local storage, instead of in a common storage location such as the shared file of the prior art (as shown in fig. 1). On the one hand, this avoids the use of a common storage location and simplifies the architecture of the distributed processing system; on the other hand, it avoids the data loss and/or corruption that a fault of the common storage location would cause.
Furthermore, a machine room usually contains more than one execution node, and the more execution nodes there are, the larger the total storage space they provide. The process in this specification therefore avoids the storage pressure caused by the limited capacity of a common storage space.
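The local-storage behavior of steps S306 and S308 might look like the following sketch; the directory layout, JSON format, and the placeholder sum/count computation are assumptions for illustration only:

```python
import json
from pathlib import Path

LOCAL_RESULT_DIR = Path("/var/lib/exec-node/intermediate")  # hypothetical path

def process_and_store(task_id: str, fragment_id: int, records: list) -> Path:
    # Placeholder computation: the patent does not prescribe what processing
    # a fragment undergoes, so a sum/count stands in here.
    intermediate = {"task_id": task_id, "fragment_id": fragment_id,
                    "count": len(records), "total": sum(records)}
    LOCAL_RESULT_DIR.mkdir(parents=True, exist_ok=True)
    out = LOCAL_RESULT_DIR / f"{task_id}-{fragment_id}.json"
    out.write_text(json.dumps(intermediate))
    return out  # this local location is what later becomes the "second address"
```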
S310: when a preset first condition is met, the first control end acquires the intermediate results stored locally on the execution nodes in the machine room i to which it belongs.
The first condition in this specification may be determined according to actual requirements; it determines when the first control end is triggered to acquire the intermediate results.
Because each task fragment in this specification is obtained by dividing the task, the intermediate result obtained from a task fragment reflects the processing result of the task only from the viewpoint of that fragment; the intermediate result of a single task fragment can hardly reflect the full picture of the task's processing result. The first control end therefore needs to obtain all the intermediate results and aggregate them to obtain the processing result of the task.
Since the intermediate results in this specification are stored locally on the execution nodes, the first control end may acquire, from each execution node's local storage, the intermediate result corresponding to the task.
S312: when the first control end acquires the intermediate result corresponding to the task, execution node j sends its locally stored intermediate result to the first control end.
In an optional embodiment of this specification, when the preset first condition is met, the first control end generates an intermediate result acquisition request and sends it to each execution node in the machine room to which it belongs. After receiving the intermediate result acquisition request, each execution node sends its locally stored intermediate result corresponding to the task being processed to the first control end.
S314: the first control end aggregates the intermediate results obtained from the execution nodes to obtain the processing result of the task.
This specification does not limit the specific manner in which the first control end aggregates the intermediate results. In an optional embodiment of this specification, the first control end stores each intermediate result corresponding to the task locally at the first control end; once this is done, the aggregation is complete and the processing result of the task is obtained.
In an optional embodiment of this specification, after obtaining the processing result of the task, the first control end stores the processing result locally at the first control end for use in subsequent steps.
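A hedged sketch of the aggregation in step S314, combining per-fragment dicts of the shape produced in the executor sketch above; the sum/count merge is illustrative, since the patent leaves the aggregation manner open:

```python
def aggregate(intermediates: list) -> dict:
    """Combine the per-fragment dicts (shaped as in the executor sketch
    above) into the task's processing result."""
    return {"task_id": intermediates[0]["task_id"],
            "fragments": len(intermediates),
            "count": sum(r["count"] for r in intermediates),
            "total": sum(r["total"] for r in intermediates)}
```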
S316: when a preset second condition is met, the second control end acquires, for each machine room, the processing result obtained by the machine room for its task.
The second condition in this specification may be determined according to actual requirements; it determines when the second control end is triggered to acquire the processing results generated by the machine rooms.
As described above, the processing object of the distributed processing system in this specification is a service, and the service may be split into multiple tasks, so the processing result of a single task can hardly reflect the full picture of the processing result of the whole service. The second control end therefore needs to obtain the processing result generated in each machine room for its task, and further process the obtained per-task processing results to obtain the processing result of the service.
S318: the first control end sends the processing result to the second control end.
In an optional embodiment of this specification, the second control end may obtain the processing result of each task by generating a first acquisition instruction and sending it to each machine room (specifically, to the first control end of each machine room), so that each machine room returns the processing result of its task according to the first acquisition instruction.
S320: the second control end processes the service corresponding to the processing results according to the processing results obtained from the machine rooms.
This specification does not limit the specific manner in which the second control end processes the service. Since the object processed by the distributed processing system in this specification is a service, in one optional embodiment of this specification the result output by the second control end is the processing result of the service; in another optional embodiment, the result output by the second control end is a pending result that needs to be further processed by another end in the distributed processing system (for example, a third control end) to obtain the processing result of the service.
As described above, in machine room i the first control end is responsible, to some extent, for the management and coordination of the distributed processing. How the first control end implements this management and coordination function is now described.
In an optional embodiment of this specification, the first control end comprises a control node and a coordination component, as shown in fig. 4. The control node divides the task received by machine room i into several task fragments and determines, for each task fragment, the correspondence between the task fragment and an execution node; the coordination component distributes the task fragments divided by the control node to the execution nodes corresponding to them.
Optionally, the control node in the embodiments of this specification has not only control and management functions but also the function of processing task fragments. When the control node divides the task into task fragments, it also assigns itself task fragments to process and includes them in the correspondence it generates. When the coordination component later distributes the task fragments, it distributes the fragments to be processed by the control node to the control node according to the correspondence.
This specification does not specifically limit the type of the coordination component; it may be, for example, ZK (ZooKeeper, a distributed application coordination service). The coordination component may be responsible for supervising the operating states of the nodes (including the control node and the execution nodes) in the machine room to which it belongs.
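As one possible realization (the patent names ZK only as an example), node liveness could be supervised with ephemeral znodes and a children watch; the znode layout and the use of the kazoo client library are assumptions:

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")  # hypothetical ZK ensemble address
zk.start()

# Each node in the machine room registers an ephemeral znode; the znode
# disappears automatically if the node's session is lost.
zk.ensure_path("/cluster/nodes")
zk.create("/cluster/nodes/exec-node-1", b"", ephemeral=True)

@zk.ChildrenWatch("/cluster/nodes")
def on_membership_change(children):
    # A node whose ephemeral znode has vanished has disconnected from the
    # coordination component, i.e. it is a failed node in this specification.
    print("live nodes:", children)
```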
In order for each execution node in machine room i to effectively process the task fragment it is responsible for, in an optional embodiment of this specification the first control end (specifically, the control node in the first control end, as shown in fig. 4) may take, for each task fragment, the storage location, in a preset data source, of the data on which processing the task fragment depends as the first address of the task fragment. The first control end (specifically, the coordination component in the first control end) then sends the first address of the task fragment to the corresponding execution node.
After receiving the first address, the execution node acquires the data from the data source according to the first address in order to process the task fragment distributed to it.
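A minimal sketch of fetching a fragment's input data via its first address; since the specification notes the address may use the HTTPS protocol, the sketch uses a plain HTTPS GET (the URL shape and the requests library are assumptions):

```python
import requests

def fetch_fragment_data(first_address: str) -> bytes:
    """Fetch the data a task fragment depends on from the preset data source.
    e.g. first_address = "https://datasource.example/batches/42/part-3"
    (the URL shape is an assumption, not from the patent)."""
    resp = requests.get(first_address, timeout=30)
    resp.raise_for_status()
    return resp.content
```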
The intermediate results generated by the execution nodes in this specification are stored locally on the execution nodes rather than in a shared file. In order for the first control end to obtain the intermediate results from the execution nodes, after an execution node stores an intermediate result locally, the first control end may learn the local storage location of that intermediate result on the execution node and then acquire the intermediate result from that location.
In an optional embodiment, the distributed processing system of this specification may further include an address database, as shown in fig. 2. The address database stores, for each task fragment, a second address. After an execution node generates an intermediate result, it stores the intermediate result locally and determines its local storage location as the second address, then sends the second address to the address database. From the second addresses recorded in the address database, the first control end can learn the specific storage location of the intermediate result generated by each execution node. When the first control end acquires an intermediate result from an execution node according to the second address, the execution node sends the locally stored intermediate result corresponding to that second address to the first control end, as shown in fig. 4.
Furthermore, the operation of acquiring the intermediate results may be performed by the control node. After the control node acquires the intermediate results from the execution nodes, it stores them locally on the control node, and can thereafter directly aggregate the intermediate results in its local storage.
After the control node obtains the processing result of the task through aggregation, it stores the processing result locally and determines the local storage location of the processing result as a third address, so that the second control end can acquire the processing result from the control node's local storage according to the third address. Specifically, after generating the third address, the control node writes it into the address database, so that the second control end can obtain the third address from the address database. The second control end then acquires, according to the obtained third address, the processing result of machine room i for its task from the local storage of the control node of machine room i. The service processing process in this specification can therefore also work across machine rooms.
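The second/third-address handoff might be wired up as in the following sketch; the address-database interface (an in-memory stand-in with put/get/delete) and the HTTPS result endpoint are hypothetical, not part of the patent:

```python
import requests

class AddressDB:
    """Stand-in for the patent's address database; a real deployment would
    use a shared store. The key scheme and methods are assumptions."""
    def __init__(self):
        self._kv = {}
    def put(self, key, value): self._kv[key] = value
    def get(self, key): return self._kv.get(key)
    def delete(self, key): self._kv.pop(key, None)

def publish_third_address(db: AddressDB, task_id: str, host: str):
    # The control node records where the task's processing result can be read.
    db.put(("third", task_id), f"https://{host}/results/{task_id}")

def fetch_processing_result(db: AddressDB, task_id: str) -> bytes:
    # The second control end resolves the third address and pulls the result
    # from the control node's local storage over HTTPS.
    resp = requests.get(db.get(("third", task_id)), timeout=30)
    resp.raise_for_status()
    return resp.content
```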
Furthermore, the second addresses recorded in the address database do not only enable the first control end to obtain the intermediate results; they may also be used to determine whether the first condition is met. Specifically, the first control end may determine, for each task fragment, whether a second address corresponding to the task fragment exists in the address database. If so, it determines that the preset first condition is met; if not, it waits a preset first time and checks again whether a second address corresponding to the task fragment exists in the address database, until the check succeeds.
Similarly, the third addresses recorded in the address database may be used to determine whether the second condition is met. Specifically, the second control end may determine, for each task, whether a third address corresponding to the processing result of the task exists in the address database. If so, it determines that the preset second condition is met; if not, it waits a preset second time and checks again whether a third address corresponding to the task exists in the address database, until the check succeeds.
The first time and the second time may be determined according to actual requirements.
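A sketch of the first-condition check as a polling loop over the address database from the sketch above; the key scheme and the five-second "first time" are illustrative assumptions:

```python
import time

FIRST_WAIT_SECONDS = 5  # the preset "first time"; value is illustrative

def wait_for_first_condition(db, task_id: str, fragment_ids: list):
    """Block until a second address exists for every fragment of the task."""
    while True:
        missing = [f for f in fragment_ids
                   if db.get(("second", task_id, f)) is None]
        if not missing:
            return  # first condition met: every fragment has a second address
        time.sleep(FIRST_WAIT_SECONDS)  # wait the preset time, then recheck
```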
As described above, the coordination component in this specification is also responsible for supervising the state of each node in the machine room to which it belongs. Specifically, this supervision may be implemented as follows: the coordination component is separately connected to each node, and when it detects that a node has disconnected from it, it determines that the disconnected node is a failed node, as shown in fig. 5 and fig. 6.
When the coordination component detects that the control node is a failed node, as shown in fig. 5, in order to improve the disaster recovery capability of the distributed processing system, the coordination component generates a control node selection instruction and sends it to each execution node. After receiving the control node selection instruction, each execution node generates a control node selection request and sends it to the coordination component. The coordination component takes the execution node whose control node selection request arrives first (e.g., execution node 2 in fig. 5) as the new control node. The coordination component then generates an appointment instruction and a notification; it sends the appointment instruction to the new control node, so that the new control node takes over the management of task execution, and sends the notification to the other execution nodes, so that they stop sending control node selection requests and continue to act as execution nodes.
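The first-request-wins election described above reduces to the following sketch, where the real messaging between the coordination component and the execution nodes is replaced by an arrival-ordered list (all names are assumptions):

```python
def elect_new_control_node(selection_requests: list):
    """`selection_requests` lists the requesting execution nodes in the
    order their control node selection requests arrived."""
    new_control = selection_requests[0]   # the first request wins
    others = selection_requests[1:]
    # The coordination component would now send an appointment instruction
    # to `new_control` and a notification to `others` telling them to keep
    # acting as execution nodes.
    return new_control, others

# e.g. elect_new_control_node(["exec-node-2", "exec-node-4", "exec-node-1"])
# -> ("exec-node-2", ["exec-node-4", "exec-node-1"])
```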
When the coordination component detects that an execution node is a failed node (for example, execution node 3 in fig. 6), the handling depends on when the failure is discovered.

Specifically, if the failure is discovered before the failed node has stored the intermediate result of the task fragment it was processing (i.e., before step S308), the task fragment of the task originally assigned to the failed node may simply be reallocated to execution nodes other than the failed node.

A failure may also be discovered because an intermediate result cannot be read: if the intermediate result locally stored by an execution node cannot be obtained according to the corresponding second address in the address database, that execution node is determined to be a failed node. In this case the failure is discovered after the failed node has generated the intermediate result of its task fragment (i.e., after the aforementioned step S308). The task fragment of the task originally assigned to the failed node is then reallocated to execution nodes other than the failed node, and the second address generated by the failed node for the task is also deleted from the address database.
When reallocating the task fragment originally processed by the failed node, the execution node with the most available data processing resources may be selected from the execution nodes other than the failed node, and the fragment allocated to it. This shares the load and improves the overall efficiency of the distributed processing system.
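A minimal sketch of this reallocation (Python; the name reassign_fragments and the surrounding data shapes are hypothetical):

def reassign_fragments(failed_node, assignments, available_resources):
    # Pick the healthy execution node reporting the most available data
    # processing resources and hand it the failed node's task fragments.
    healthy = {n: r for n, r in available_resources.items() if n != failed_node}
    target = max(healthy, key=healthy.get)
    assignments.setdefault(target, []).extend(assignments.pop(failed_node, []))
    return target

assignments = {"node-1": ["frag-a"], "node-2": ["frag-b"], "node-3": ["frag-c"]}
resources = {"node-1": 4, "node-2": 9, "node-3": 0}
print(reassign_fragments("node-3", assignments, resources))  # node-2 takes frag-c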
The distributed processing process in this specification therefore has good disaster recovery capability: even if a node in a machine room fails, the adverse effect of the failure on the execution of tasks is largely contained.
As noted above, the shared files relied on by existing task processing also suffer from limited storage space. The process in this specification effectively expands the storage available for intermediate results through the distributed storage implemented jointly by the nodes. Further, after the first control end generates the processing result of the task, it instructs each execution node to delete its local intermediate data, releasing the execution node's local storage in preparation for the next round of distributed processing.
Specifically, releasing the local storage of the execution nodes may proceed as follows: after obtaining the processing result of the task, the first control end generates an intermediate result deleting instruction and sends it to each execution node. On receiving the instruction, an execution node deletes its locally stored intermediate result corresponding to the task. Optionally, after deleting its local intermediate result, the execution node also deletes the corresponding second address from the address database.

Since the processing result stored on a control node likewise occupies that node's storage, the second control end may, after generating the processing result of the service, generate a processing result deleting instruction and send it to each first control end. The control node of each first control end then deletes its locally stored processing result for the task, releasing the control node's storage. Optionally, after deleting the local processing result, the control node also deletes the corresponding third address from the address database.
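Both cleanup paths could be handled as in the following sketch (Python; the handler names and key scheme are hypothetical, with dicts standing in for local storage and the address database):

def on_delete_intermediate(task_id, node_id, local_store, address_db):
    # Execution node: delete the local intermediate result for the task and,
    # optionally, retract the corresponding second address.
    local_store.pop(task_id, None)
    address_db.pop(("second", node_id, task_id), None)

def on_delete_processing_result(task_id, room_id, local_store, address_db):
    # Control node: delete the local processing result for the task and,
    # optionally, retract the corresponding third address.
    local_store.pop(task_id, None)
    address_db.pop(("third", room_id, task_id), None)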
In an optional embodiment of this specification, when dividing the task into fragments, the control node first determines, for each execution node in the machine room to which it belongs, the storage space locally available to that execution node, according to information about the execution nodes (which may, for example, be derived from data in the address database, such as the second addresses). It then sizes the task fragment for each execution node according to that node's locally available storage, so that the data volume of the intermediate result the node produces for its fragment does not exceed the node's locally available storage.
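A sketch of such capacity-aware fragmentation (Python; the name split_by_capacity is hypothetical, and the per-record size of intermediate results is assumed known):

def split_by_capacity(records, node_capacity, bytes_per_record):
    # Give each execution node a fragment whose estimated intermediate-result
    # size fits within that node's locally available storage.
    fragments, start = {}, 0
    for node, capacity in node_capacity.items():
        quota = capacity // bytes_per_record
        fragments[node] = records[start:start + quota]
        start += quota
    if start < len(records):
        raise RuntimeError("task exceeds the nodes' total available storage")
    return fragments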
It can be seen that in the service processing process and system of this specification, after a node processes its allocated fragments and obtains an intermediate result, the intermediate result is stored locally rather than in a shared file, which improves both the disaster tolerance and the storage scalability of the system. Moreover, once a subsequent processing step completes, the files generated by the preceding step (such as at least one of the intermediate results and the processing result of the task) are deleted, releasing the system's storage in time.
Based on the same idea, as shown in fig. 7, an embodiment of this specification further provides a first distributed processing apparatus corresponding to part of the steps of the process shown in fig. 3. The first distributed processing apparatus is applied to a first control end; the first control end belongs to a distributed processing system that further comprises a plurality of execution nodes. The first distributed processing apparatus comprises one or more of the following modules:
a task obtaining module 700 configured to obtain a task to be processed;
a task fragment determining module 702 configured to determine task fragments corresponding to each execution node according to the task;
a first distribution module 704 configured to distribute the task fragment corresponding to each execution node to that execution node, so that each execution node processes its allocated task fragment and stores the resulting intermediate result locally at the execution node;
an intermediate result obtaining module 706, configured to obtain an intermediate result stored locally at each execution node when a preset first condition is satisfied;
a summarizing module 708 configured to summarize the intermediate results obtained from the execution nodes to obtain the processing result of the task.
In an alternative embodiment of the present description, the first control end includes a control node and a coordination component connected; the control node is used for determining the corresponding relation between the task fragments and the execution nodes, and the coordination component is used for distributing the task fragments to the execution nodes corresponding to the task fragments.
In an optional embodiment of this specification, the first distributed processing apparatus may further include a control node determination module 710. The control node determination module 710 is configured to: when the coordination component cannot detect the control node connected to it, determine at least one of the execution nodes as a new control node, and continue distributed processing with the new control node.
In an optional embodiment of this specification, the first distribution module 704 is specifically configured to: for each task fragment, take the storage location, in a preset data source, of the data on which the task fragment depends as the first address of the task fragment; and send the first address to the execution node corresponding to the task fragment, so that the execution node obtains the data according to the received first address and processes its task fragment.
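By way of illustration (Python; the dispatch and lookup names are hypothetical, with a dict standing in for the preset data source), sending first addresses rather than data might look like:

def dispatch_first_addresses(fragment_locations, send):
    # Control end: send each execution node the first address of the data its
    # task fragment depends on, instead of shipping the data itself.
    for node_id, first_address in fragment_locations.items():
        send(node_id, {"first_address": first_address})

def fetch_fragment_data(first_address, data_source):
    # Execution node: resolve the received first address against the preset
    # data source to obtain the records backing the assigned task fragment.
    return data_source[first_address]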
In an alternative embodiment of the present description, the distributed processing system further comprises an address database; the address database is used for storing a second address of each task fragment aiming at each task fragment; the second address indicates a local storage location of the intermediate result corresponding to the task fragment at the execution node.
In an optional embodiment of this specification, the first distributed processing apparatus may further include a first determining module 712. The first determining module 712 is configured to determine, for each task fragment, whether a second address corresponding to the task fragment exists in the address database, and if so, to determine that the preset first condition is satisfied.
In an optional embodiment of this specification, the intermediate result obtaining module 706 is specifically configured to: for each executing node, determining a second address generated by the executing node and corresponding to the task from the address database; and acquiring an intermediate result locally stored by the execution node according to the determined second address.
In an optional embodiment of this specification, the first control end includes a control node. The intermediate result obtaining module 706 is specifically configured to: for each execution node, have the control node obtain the intermediate result corresponding to the task that is stored locally at the execution node, and store it locally at the control node. The summarizing module 708 is specifically configured to summarize, locally at the control node, the intermediate results obtained from the execution nodes.
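A sketch of this pull-and-aggregate step (Python; the name collect_intermediates and the key scheme are hypothetical, with dicts standing in for the address database and each node's local storage):

def collect_intermediates(address_db, node_stores, task_id, node_ids):
    # For each execution node, look up the second address it generated for
    # the task and pull the intermediate result from that node's local store.
    collected = []
    for node_id in node_ids:
        second_address = address_db[("second", node_id, task_id)]
        collected.append(node_stores[node_id][second_address])
    return collected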
In an optional embodiment of this specification, the distributed processing system may further include a second control end, and the first distributed processing apparatus may further include a third address generation module. The third address generation module is configured to: store the processing result locally at the control node, and determine the local storage location of the processing result at the control node as a third address, so that the second control end obtains the processing result from the control node according to the third address.
In an optional embodiment of this specification, the first distributed processing apparatus may further include a first receiving module 714. The first receiving module 714 is configured to delete the locally stored processing result corresponding to the task if a processing result deleting instruction is received; the processing result deleting instruction is generated by the second control end according to the task.
In an optional embodiment of this specification, the first distributed processing apparatus may further include a second allocation module 718. The second allocation module 718 is configured to, when detecting that a failed node exists among the execution nodes, allocate the task fragment of the task originally processed by the failed node to execution nodes other than the failed node.
In an optional embodiment of the present specification, the first control end may further include a coordination component, and the coordination component is connected to each execution node separately. The first distributed processing apparatus may further comprise a detection module 716. The detection module 716 is configured to: upon the coordinating component detecting the presence of an executing node disconnected therefrom, determining the disconnected executing node as a failed node.
In an optional embodiment of this specification, the intermediate result obtaining module 706 is specifically configured to: for each execution node, if the intermediate result corresponding to the task that is stored locally at the execution node cannot be obtained, determine that the execution node is a failed node; allocate the task fragment of the task originally processed by the failed node to execution nodes other than the failed node; and obtain the intermediate result corresponding to the task fragment from those execution nodes.
In an optional embodiment of this specification, the first distributed processing apparatus may further include a resource determination module 720. The resource determination module 720 is configured to determine, among the execution nodes other than the failed node, the execution node with the most available data processing resources, and to allocate the task fragment of the task originally processed by the failed node to that execution node.
In an optional embodiment of this specification, the first distributed processing apparatus may further include a first deleting module 722. The first deleting module 722 is configured to generate an intermediate result deleting instruction and send it to each execution node, so that each execution node deletes its locally stored intermediate result corresponding to the task according to the instruction.
Based on the same idea, as shown in fig. 8, this specification further provides a second distributed processing apparatus corresponding to part of the steps of the process shown in fig. 3. The second distributed processing apparatus is applied to a second control end of a distributed processing system that further comprises a plurality of machine rooms; at least one machine room comprises a first control end and a plurality of execution nodes, and the tasks processed by the machine rooms belong to the same service. The second distributed processing apparatus comprises one or more of the following modules:
the processing result obtaining module 800 is configured to, when a preset second condition is met, obtain, for each machine room, a processing result obtained by the machine room for the task.
The processing module 802 is configured to process the service corresponding to each processing result according to each processing result obtained from each machine room. At least one of the processing results is obtained through any one of the distributed processing procedures.
In an optional embodiment of this specification, the second distributed processing apparatus may further include a second determining module 804. The second determining module 804 is configured to determine that the preset second condition is satisfied if, for each machine room, the third address generated by the control node of that machine room has been acquired.
In an optional embodiment of this specification, the second distributed processing apparatus may further include a third address obtaining module 806. The third address obtaining module 806 is configured to obtain, for each machine room, the third address generated by the control node of that machine room; the third address indicates the storage location, local to the control node of the machine room, of the processing result obtained by the machine room for the task.
In an alternative embodiment of the present disclosure, the processing result obtaining module 800 is specifically configured to: acquiring a third address corresponding to the machine room; and according to the acquired third address, acquiring a processing result of the machine room for the task locally at the control node of the machine room.
Based on the same idea, as shown in fig. 9, the present specification further provides a third distributed processing apparatus corresponding to a part of the steps of the process shown in fig. 3. The third distributed processing apparatus is applied to any one execution node in a plurality of execution nodes of a distributed processing system, and the distributed processing system further includes a first control end. The third distributed processing apparatus comprises one or more of the following modules:
an intermediate result generating module 900 configured to process the task fragment allocated to the execution node by the first control end to obtain an intermediate result; the task fragments are obtained according to tasks, and the intermediate results correspond to the tasks.
A storage module 902 configured to store the intermediate result locally.
An intermediate result sending module 904, configured to send the intermediate result to the first control end when the first control end obtains the intermediate result corresponding to the task, so that the first control end obtains the processing result of the task according to the intermediate result.
In an alternative embodiment of the present disclosure, the third distributed processing apparatus may further include a first address receiving module 906 and a task fragment processing module 908.
The first address receiving module 906 is configured to receive a first address sent by the first control end, where the first address indicates the storage location, in a preset data source, of the data on which processing of the task fragment depends.
The task fragment processing module 908 is configured to obtain data from the data source according to the first address, so as to process the task fragment allocated to the execution node.
In an alternative embodiment of the present disclosure, the storage module 902 is specifically configured to store the intermediate result locally, and determine a storage location of the intermediate result locally as the second address.
In an optional embodiment of this specification, the third distributed processing apparatus may further include a second address sending module 910. The second address sending module 910 is configured to send the second address to the address database, so that the first control end obtains the second address through the address database.
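The execution-node side of modules 900, 902, and 910 might be sketched as follows (Python; process_and_publish and transform are hypothetical placeholders, with dicts standing in for the node's local storage and the address database):

def transform(record):
    # Placeholder per-record computation; the real work is task-specific.
    return record * 2

def process_and_publish(node_id, task_id, fragment, local_store, address_db):
    # Process the allocated task fragment, keep the intermediate result in
    # the node's local storage (not a shared file), and publish its storage
    # location as the second address so the first control end can find it.
    intermediate = [transform(record) for record in fragment]
    second_address = (node_id, task_id)
    local_store[second_address] = intermediate
    address_db[("second", node_id, task_id)] = second_address
    return second_address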
In an optional embodiment of this specification, the intermediate result sending module 904 is specifically configured to send, when the first control end obtains an intermediate result corresponding to the task according to the second address, a locally stored intermediate result corresponding to the second address to the first control end.
In an optional embodiment of this specification, the third distributed processing apparatus may further include a second deletion module. The second deletion module is configured to: if an intermediate result deleting instruction is received, the execution node deletes the locally stored intermediate result corresponding to the task; and the intermediate result deleting instruction is generated by the first control end according to the task.
Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application. Referring to fig. 10, at the hardware level the electronic device includes a processor and, optionally, an internal bus, a network interface, and a memory. The memory may include internal memory, such as random-access memory (RAM), and may further include non-volatile memory, such as at least one disk storage. Of course, the electronic device may also include hardware required by other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 10, but this does not indicate only one bus or one type of bus.
The memory is used for storing a program. Specifically, the program may include program code comprising computer operation instructions. The memory may include both internal memory and non-volatile storage, and provides instructions and data to the processor.

The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming one of the aforementioned distributed processing apparatuses at the logical level. The processor executes the program stored in the memory and is specifically configured to perform the following operations:
acquiring a task to be processed; determining, according to the task, a task fragment corresponding to each execution node; distributing the task fragment corresponding to each execution node to that execution node, so that each execution node processes its allocated task fragment and stores the resulting intermediate result locally at the execution node; when a preset first condition is satisfied, acquiring the intermediate results stored locally at the execution nodes; and summarizing the intermediate results obtained from the execution nodes to obtain the processing result of the task. Alternatively,
when a preset second condition is satisfied, acquiring, for each machine room, the processing result obtained by the machine room for the task; and processing the service corresponding to the processing results according to the processing results obtained from the machine rooms, at least one of which is obtained by any of the aforementioned methods. Alternatively,
processing the task fragment allocated to this execution node by the first control end to obtain an intermediate result, the task fragment being obtained according to a task and the intermediate result corresponding to the task; storing the intermediate result locally; and, when the first control end acquires the intermediate result corresponding to the task, sending the intermediate result to the first control end, so that the first control end obtains the processing result of the task according to the intermediate result.
Any of the methods performed by the distributed processing apparatuses according to the embodiments shown in fig. 7 to 9 of the present application may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or any conventional processor. The steps of the methods disclosed in connection with the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the method in combination with its hardware.
The electronic device may further perform the method performed by any of the distributed processing apparatuses in fig. 7 to fig. 9, and implement at least part of the functions of the embodiment shown in fig. 3, which is not repeated here.
An embodiment of the present application further provides a computer-readable storage medium storing one or more programs, where the one or more programs include instructions, which, when executed by an electronic device including a plurality of application programs, enable the electronic device to perform a method performed by any one of the distributed processing apparatuses in fig. 7 to 9, and are specifically configured to perform:
acquiring a task to be processed; determining, according to the task, a task fragment corresponding to each execution node; distributing the task fragment corresponding to each execution node to that execution node, so that each execution node processes its allocated task fragment and stores the resulting intermediate result locally at the execution node; when a preset first condition is satisfied, acquiring the intermediate results stored locally at the execution nodes; and summarizing the intermediate results obtained from the execution nodes to obtain the processing result of the task. Alternatively,
when a preset second condition is satisfied, acquiring, for each machine room, the processing result obtained by the machine room for the task; and processing the service corresponding to the processing results according to the processing results obtained from the machine rooms, at least one of which is obtained by any of the aforementioned methods. Alternatively,
processing the task fragment allocated to this execution node by the first control end to obtain an intermediate result, the task fragment being obtained according to a task and the intermediate result corresponding to the task; storing the intermediate result locally; and, when the first control end acquires the intermediate result corresponding to the task, sending the intermediate result to the first control end, so that the first control end obtains the processing result of the task according to the intermediate result.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random-access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may store information by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a/an ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (30)

1. A distributed processing method, wherein the method is based on a distributed processing system, the distributed processing system comprises a first control end and a plurality of execution nodes, the method is executed by the first control end, and the method comprises:
acquiring a task to be processed;
according to the task, determining a task fragment corresponding to each execution node;
distributing the task fragments corresponding to the execution nodes to the corresponding execution nodes, so that the execution nodes process the distributed task fragments and store the intermediate results obtained by the processing locally at the execution nodes;
when a preset first condition is met, acquiring intermediate results stored locally in each execution node;
and summarizing the intermediate results obtained from each execution node to obtain the processing result of the task.
2. The method of claim 1, wherein the first control end comprises a connected control node and a coordinating component; the control node is used for determining the corresponding relation between the task fragments and the execution nodes, and the coordination component is used for distributing the task fragments to the execution nodes corresponding to the task fragments.
3. The method of claim 2, wherein before determining the task fragment corresponding to each execution node according to the task, the method further comprises:
and when the coordination component cannot detect the control node connected with the coordination component, determining at least one of the execution nodes as a new control node, and executing distributed processing according to the new control node.
4. The method of claim 1, wherein distributing the task fragment corresponding to each execution node to the corresponding execution node comprises:
aiming at each task fragment, taking the storage position of the data which is depended on by the task fragment in a preset data source as a first address of the task fragment;
and sending the first address of the task fragment corresponding to the execution node, so that the execution node acquires data according to the received first address to process the task fragment corresponding to the execution node.
5. The method of claim 1, wherein the distributed processing system further comprises an address database; the address database is used for storing a second address of each task fragment aiming at each task fragment; the second address indicates a local storage location of the intermediate result corresponding to the task fragment at the execution node.
6. The method of claim 5, wherein prior to obtaining the intermediate results stored locally at each executing node, the method further comprises:
for each task fragment, determining whether a second address corresponding to the task fragment exists in the address database;
if yes, determining that a preset first condition is met.
7. The method of claim 5, wherein obtaining intermediate results stored locally at each executing node comprises:
for each executing node, determining a second address generated by the executing node and corresponding to the task from the address database;
and acquiring an intermediate result locally stored by the execution node according to the determined second address.
8. The method of claim 1, wherein the first control terminal comprises a control node;
obtaining intermediate results stored locally at each executing node, comprising:
for each execution node, the control node acquires an intermediate result which is locally stored by the execution node and corresponds to the task, and stores the intermediate result locally in the control node;
summarizing intermediate results obtained from each execution node, comprising:
the control node is local to the control node and collects intermediate results obtained from the execution nodes.
9. The method of claim 8, wherein the distributed processing system further comprises a second control end;
after obtaining the processing result of the task, the method further comprises:
and locally storing the processing result in the control node, and determining the local storage position of the processing result in the control node as a third address, so that the second control end locally acquires the processing result from the control node according to the third address.
10. The method of claim 9, wherein after determining the storage location of the processing result local to the control node as the third address, the method further comprises:
if a processing result deleting instruction is received, the control node deletes the locally stored processing result corresponding to the task; and the processing result deleting instruction is generated by the second control end according to the task.
11. The method of claim 1, wherein prior to obtaining the intermediate results stored locally at each executing node, the method further comprises:
when detecting that a fault node exists in each execution node, distributing the task fragments corresponding to the tasks, which are originally processed by the fault node, to other execution nodes except the fault node.
12. The method of claim 11, wherein the first control end comprises a coordination component, and the coordination component is respectively connected with each execution node;
before distributing the task fragments corresponding to the tasks originally processed by the failed node to other execution nodes except the failed node, the method further comprises the following steps:
upon the coordinating component detecting the presence of an executing node disconnected therefrom, determining the disconnected executing node as a failed node.
13. The method of claim 1, wherein obtaining intermediate results stored locally at each executing node comprises:
for each execution node, if the intermediate result corresponding to the task and locally stored by the execution node is not obtained, determining that the node is a fault node;
distributing the task fragments corresponding to the tasks, which are originally processed by the fault nodes, to other execution nodes except the fault nodes, and acquiring intermediate results corresponding to the task fragments from the other execution nodes.
14. The method of claim 11 or 13, wherein distributing the task fragments corresponding to the tasks originally processed by the failed node to other execution nodes except the failed node comprises:
determining an execution node with the most available data processing resources from other execution nodes except the fault node;
distributing the task fragment corresponding to the task, which is originally processed by the fault node, to the execution node with the most available data processing resources.
15. The method of claim 1, wherein after obtaining the processing results of the task, the method further comprises:
and generating an intermediate result deleting instruction, and sending the intermediate result deleting instruction to each execution node, so that each execution node deletes the locally stored intermediate result corresponding to the task according to the intermediate result deleting instruction.
16. The method of claim 4 or 5, wherein at least one of the first address and the second address uses the HTTPS protocol.
17. A distributed processing method is based on a distributed processing system, wherein the distributed processing system comprises a second control end and a plurality of machine rooms connected with the second control end; at least one machine room comprises the first control end and a plurality of execution nodes in any one of claims 1 to 16; tasks processed by the machine rooms belong to the same service, the method is executed by the second control terminal, and the method comprises the following steps:
when a preset second condition is met, acquiring a processing result of each machine room for the task;
processing the services corresponding to the processing results according to the processing results obtained from the machine rooms; at least one of the processing results is obtained by the method of any one of claims 1 to 16.
18. The method of claim 17, wherein the first control terminal comprises a connected control node and a coordinating component;
before obtaining a processing result obtained by the machine room for the task, the method further comprises:
for each machine room, acquiring a third address generated by a control node of the machine room; the third address indicates the storage location, local to the control node of the machine room, of the processing result obtained by the machine room for the task.
19. The method of claim 18, wherein before obtaining the processing result obtained by the machine room for the task, the method further comprises:
and if the third address generated by the control node of the machine room is acquired for each machine room, determining that the preset second condition is met.
20. The method of claim 18, wherein obtaining the processing result obtained by the machine room for the task comprises:
acquiring a third address corresponding to the machine room;
and according to the acquired third address, acquiring a processing result of the machine room for the task locally at the control node of the machine room.
21. A distributed processing method, wherein the method is based on a distributed processing system, the distributed processing system comprises a first control end and a plurality of execution nodes, the method is executed by any execution node, and the method comprises:
processing the task fragments distributed to the execution node by the first control end to obtain an intermediate result; the task fragments are obtained according to tasks, and the intermediate results correspond to the tasks;
storing the intermediate result locally;
and when the first control end acquires an intermediate result corresponding to the task, sending the intermediate result to the first control end, so that the first control end obtains a processing result of the task according to the intermediate result.
22. The method as claimed in claim 21, wherein before processing the task fragment allocated to the current execution node by the first control end, the method further comprises:
receiving a first address sent by the first control end, wherein the first address indicates the storage location, in a preset data source, of the data on which the task fragment processed by the execution node depends;
and acquiring data from the data source according to the first address so as to process the task fragment distributed to the execution node.
23. The method of claim 21, wherein storing the intermediate result locally comprises:
storing the intermediate result locally, and determining the local storage location of the intermediate result as a second address;
the distributed processing system further comprises an address database; after storing the intermediate result locally, the method further comprises:
and sending the second address to the address database, so that the first control end acquires the second address through the address database.
24. The method of claim 23, wherein sending the intermediate result to the first control terminal when the first control terminal obtains the intermediate result corresponding to the task comprises:
when the first control end acquires the intermediate result corresponding to the task according to the second address, sending the locally stored intermediate result corresponding to the second address to the first control end.
25. The method of claim 21, wherein after sending the intermediate result to the first control terminal, the method further comprises:
if an intermediate result deleting instruction is received, the execution node deletes the locally stored intermediate result corresponding to the task; and the intermediate result deleting instruction is generated by the first control end according to the task.
26. A distributed processing device is applied to a first control end, the first control end belongs to a distributed processing system, and the distributed processing system further comprises a plurality of execution nodes; the device is used for realizing the method of any one of claims 1 to 16.
27. A distributed processing device is applied to a second control end of a distributed processing system, and the distributed processing system further comprises a plurality of machine rooms; at least one machine room comprises a first control end and a plurality of execution nodes; tasks processed by all machine rooms belong to the same service; the device is used for realizing the method of any one of claims 17 to 20.
28. A distributed processing device is applied to any execution node in a plurality of execution nodes of a distributed processing system, and the distributed processing system further comprises a first control end; the device is used for realizing the method of any one of claims 21 to 25.
29. A distributed processing system comprises a first control end and a plurality of execution nodes;
the first control terminal is configured to: acquiring tasks, determining task fragments corresponding to each execution node according to the tasks, and distributing the task fragments corresponding to each execution node to the corresponding execution nodes;
any of the execution nodes is configured to: processing the task fragment distributed to the execution node by the first control end to obtain an intermediate result, the task fragment being obtained according to the task and the intermediate result corresponding to the task; and storing the intermediate result locally;
the first control end is configured to: when a preset first condition is met, acquiring intermediate results stored locally in each execution node;
any executing node is also configured as: and when the first control end acquires the intermediate result corresponding to the task, the intermediate result is sent to the first control end, so that the first control end collects the intermediate results acquired from each execution node to obtain the processing result of the task.
30. The system of claim 29, wherein the first control end and the plurality of execution nodes form a computer room, and the distributed processing system comprises a plurality of computer rooms; the distributed processing system also comprises a second control end;
the second control end is configured to: when a preset second condition is satisfied, acquire, for each machine room, the processing result obtained by the machine room for the task, and process the service corresponding to the processing results according to the processing results obtained from the machine rooms; at least one of the processing results is obtained by the method of any one of claims 1 to 16.
CN202110010199.6A 2021-01-04 2021-01-04 Distributed processing method, system and device Pending CN114721811A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110010199.6A CN114721811A (en) 2021-01-04 2021-01-04 Distributed processing method, system and device

Publications (1)

Publication Number Publication Date
CN114721811A 2022-07-08

Family

ID=82233444

Country Status (1)

Country Link
CN (1) CN114721811A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination