CN116126499A - Distributed process scheduling parallel processing device and method - Google Patents
Distributed process scheduling parallel processing device and method
- Publication number
- CN116126499A (application CN202310075129.8A)
- Authority
- CN
- China
- Prior art keywords
- task
- processing
- service module
- module
- management server
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/548—Queue
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multi Processors (AREA)
Abstract
The invention relates to a distributed process scheduling parallel processing device and method. The device comprises: a task management service module for registering and defining data processing tasks, decomposing each task into functionally independent subtasks, performing task scheduling and initiation, and acquiring task processing results for analysis; a control management server service module for registering task names and defining message queues, associating task names with message queues, dispatching tasks through the message queues, and acquiring task processing results through the message queues; and a task processing module for registering task names, processing and computing the relevant data, and producing the task processing results. Compared with the prior art, each module of the invention is functionally independent, no additional distributed supporting software needs to be introduced, and the invention conveniently solves the problems of slow serial processing of large non-real-time data volumes, real-time computation with high complexity and real-time requirements, and multi-task concurrent processing.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a distributed process scheduling parallel processing device and method.
Background
When a CPU processes complex tasks, the computation is heavy and time-consuming; in particular, when large volumes of text must be processed with complex encryption and decryption algorithms, processing can take several hours or even tens of hours to complete.
Processing speed can be greatly improved by a distributed process scheduling parallel processing device; however, adopting an existing large-scale distributed computing system (the currently popular Hadoop, Spark, or Storm architectures) means building software products with complex environments and heavy requirements, which is inconvenient to use.
Most such distributed systems are based on the Hadoop distributed file system and rely on a variety of big-data supporting software. Their drawbacks are:
Faults are difficult to diagnose: many kinds of supporting software are required, much of it is not open source, and mastering it is difficult.
Dependence on tool software: because a distributed file system is adopted and must be managed, various different functional software packages are needed for support.
Security problems: distributed computing introduces data security and sharing issues.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a distributed process scheduling parallel processing device and method that solve the problems of slow serial processing of large non-real-time data volumes and real-time computation with high complexity and real-time requirements.
The aim of the invention can be achieved by the following technical scheme:
a distributed process scheduling parallel processing apparatus comprising:
the task management service module is used for registering and defining a task of data processing and decomposing the task into sub-tasks with independent functions according to data processing requirements; performing task scheduling and initiating according to the acquired tasks and subtasks; acquiring a task processing result and analyzing the result;
the control management server service module is connected with the task management service module and is used for registering task names and defining message queues according to tasks initiated by the task management service module, associating the task names with the message queues, outputting the tasks through the message queues, acquiring task processing results through the message queues and sending the task processing results to the task management service module;
the task processing module is connected with the control management server service module and used for registering task names according to the tasks output by the control management server service module, processing the corresponding tasks, processing and calculating related data and returning the task processing results to the control management server service module; in the data processing and calculating process, data corresponding to the task are subjected to block processing to obtain a plurality of data blocks, and each data block is processed by a corresponding processor.
Further, the control management server service module is provided with an authkey value; when the task management service module and the task processing module connect to the control management server service module, they are verified against the authkey value.
Further, the distributed process scheduling parallel processing device distributes and processes tasks based on the QueueManager.
Further, the task management service module uses Manager-based distributed processes to decompose a single-node task into the same task processed independently by multiple nodes.
Further, the task processing module performs task processing in parallel through a plurality of processors.
A processing method of a distributed process scheduling parallel processing apparatus as described above, comprising the steps of:
s1: the control management server service module is started, loads a task list, creates a key, and registers task node and work processing node information;
s2: the task management service module establishes connection with the control management server service module and performs key verification, and the task management service module generates and registers tasks or subtasks;
s3: the task management service module initiates a task or subtask to the control management server service module, and the control management server service module keeps the task or subtask initiated by the task management service module in a corresponding message queue;
s4: the task processing module establishes connection with the task management service module and the control management server service module and performs key verification;
s5: the task processing module acquires tasks or subtasks to be processed and carries out related processing;
s6: after the task processing module finishes processing, the task processing result is sent to the control management server service module and is kept in a corresponding message queue;
s7: and the task management service module reads the task processing result from the corresponding message queue of the control management server service module, updates and records the processing result and the state.
Further, the key is an authkey value, and the task management service module and the task processing module verify according to the authkey value.
Further, the task node and work processing node information comprises processing addresses and user information corresponding to tasks.
Further, the task or subtask to be processed acquired by the task processing module includes a task identifier and data to be processed.
Further, the method further comprises: and processing each task by controlling the management server service module, the task management service module and the task processing module until all the tasks are completed.
Compared with the prior art, the invention has the following advantages:
(1) The invention is realized based on the QueueManager principle, and the functions of the control management server service module, the task management service module and the task processing module are mutually independent. Task distribution and result-return processing are completed only by the task management service module, while the task processing module only completes the application-level processing of the tasks allocated to it; the two exchange no information directly.
Because the functions of the modules are independent, no other distributed supporting software needs to be introduced, and the problems of slow serial processing of large non-real-time data volumes, real-time computation with high complexity and real-time requirements, and multi-task concurrent processing can be conveniently solved.
(2) Parallel processing is realized by means of distributed process technology, greatly improving the speed of complex task processing.
Drawings
FIG. 1 is a schematic diagram of a distributed process scheduling parallel processing apparatus according to an embodiment of the present invention;
FIG. 2 is a flow chart of the distributed process scheduling parallel processing method provided in an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
Term interpretation:
1. multiprocessing: Python's multi-process module.
2. Manager: a component of the Python multiprocessing module that supports distributing multiple processes across multiple machines.
3. QueueManager: a distributed process message manager built on multiprocessing in the Python language (in practice, a subclass of multiprocessing.managers.BaseManager).
4. Queue: the message queue under the multiprocessing module, used mainly for sending and receiving information between the distributed management server and the work servers.
Example 1
As shown in fig. 1, the present embodiment provides a distributed process scheduling parallel processing apparatus whose main tasks include task decomposition, data blocking, subtask flow, and data transmission and association.
Task decomposition mainly comprises: decomposing a complete data processing task into functionally independent subtasks according to the data processing requirements;
Data blocking mainly comprises: partitioning big data into blocks; for example, 10 million records are split into 10 blocks of 1 million records each, and the 10 blocks are handed to 10 processors for processing;
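The blocking step can be sketched as a small helper. This is a hypothetical illustration, not code from the patent: the function name `split_into_blocks` and the in-memory list input are assumptions.

```python
def split_into_blocks(records, n_blocks=10):
    """Split records into n_blocks nearly equal chunks, one per processor."""
    size = (len(records) + n_blocks - 1) // n_blocks  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]

# e.g. 10,000,000 records -> 10 blocks of 1,000,000 (shown here at smaller scale)
blocks = split_into_blocks(list(range(10_000)), n_blocks=10)
```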
Data transmission and association mainly comprise: defining a task request queue and a reply queue for completing the task, registering the relation between the task request and the request queue, and setting and checking the authkey value when a task applies for secure registration;
During task processing, serial processing is changed into parallel processing: Manager-based distributed processes are used to turn a single-node task into the same task processed independently by multiple nodes, various tasks and their corresponding queues are defined, and the processing code for completing each subtask is written, thereby greatly improving processing efficiency.
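On a single machine, the serial-to-parallel change can be sketched with a process pool; the `heavy` function below is a hypothetical stand-in for the costly per-block computation, and the patent's multi-node case would replace the local pool with Manager-connected worker nodes.

```python
from multiprocessing import Pool

def heavy(block):
    """Stand-in for a costly per-block computation (e.g. decryption)."""
    return sum(x * x for x in block)

if __name__ == '__main__':
    # Ten data blocks processed in parallel by four worker processes.
    blocks = [list(range(i, i + 1000)) for i in range(0, 10000, 1000)]
    with Pool(processes=4) as pool:
        parallel = pool.map(heavy, blocks)
    # Parallel processing yields the same results as serial processing.
    assert parallel == [heavy(b) for b in blocks]
```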
The device mainly comprises:
a control management server service module (monitorserver.py), a task management service module (task.py), and a task processing module (work.py). The control management server service module is the core, covering security detection and the exchange of tasks and messages. All task initiation and registration by the task management service must pass through the control management server service module, which is also the source from which work is supplied to task processing.
In particular, the method comprises the steps of,
the task management service module is used for registering and defining a task of data processing and decomposing the task into sub-tasks with independent functions according to data processing requirements; performing task scheduling and initiating according to the acquired tasks and subtasks; acquiring a task processing result and analyzing the result;
the control management server service module is connected with the task management service module and is used for registering task names and defining message queues according to tasks initiated by the task management service module, associating the task names with the message queues, outputting the tasks through the message queues, acquiring task processing results through the message queues and sending the task processing results to the task management service module;
the task processing module is connected with the control management server service module and used for registering task names according to the tasks output by the control management server service module, processing the corresponding tasks, processing and calculating related data and returning the task processing results to the control management server service module; in the data processing and calculating process, the data corresponding to the task are subjected to block processing to obtain a plurality of data blocks, and each data block is respectively processed by a corresponding processor.
The distributed process scheduling parallel processing device distributes and processes tasks based on the QueueManager.
The task management service module uses Manager-based distributed processes to decompose a single-node task into the same task processed independently by multiple nodes, and the task processing module processes tasks in parallel through multiple processors.
Preferably, the control management server service module is provided with an authkey value, and when the task management service module and the task processing module are connected with the control management server service module, the task management service module and the task processing module are checked according to the authkey value.
In this embodiment, an optional specific function of each of the above modules is provided as follows:
Control management server service module: the core of security detection and of task and message exchange, including registration of the various task names and definition of the queues, the association between task names and queues, and storage management of task information and returned task results. The authkey value is set for verification when a task applies for secure registration.
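A minimal sketch of such a server module, assuming Python's `multiprocessing.managers.BaseManager` as the QueueManager base; the file name monitorserver.py appears in the patent, but the port number, queue names, and authkey value here are illustrative assumptions.

```python
import queue
from multiprocessing.managers import BaseManager

# One request queue and one reply queue; further pairs can be added per task name.
task_queue = queue.Queue()
result_queue = queue.Queue()

class QueueManager(BaseManager):
    """Distributed process message manager (see the term interpretation above)."""

# Register the task names and associate each with its message queue.
QueueManager.register('get_task_queue', callable=lambda: task_queue)
QueueManager.register('get_result_queue', callable=lambda: result_queue)

def serve(port=50000, authkey=b'distproc'):
    """Run the server loop; connecting clients must present the same authkey."""
    manager = QueueManager(address=('', port), authkey=authkey)
    server = manager.get_server()
    server.serve_forever()  # blocks: run this module as the standalone server
```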
Task management service module: performs name registration applications for tasks and subtasks, definition of tasks and subtasks, organization of data, initiation of tasks and subtasks, analysis of task and subtask results after processing completes, scheduling of tasks and subtasks, and the like.
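A corresponding sketch of the task management side (the patent's task.py), again assuming `multiprocessing.managers`; the helper names `connect`, `initiate_subtasks`, and `collect_results`, plus the address and authkey, are illustrative assumptions.

```python
from multiprocessing.managers import BaseManager

class QueueManager(BaseManager):
    """Client-side manager: only the task names are registered, no callables."""

QueueManager.register('get_task_queue')
QueueManager.register('get_result_queue')

def connect(host='127.0.0.1', port=50000, authkey=b'distproc'):
    """Connect to the control management server; the authkey is verified here."""
    manager = QueueManager(address=(host, port), authkey=authkey)
    manager.connect()
    return manager

def initiate_subtasks(task_q, blocks):
    """Initiate one subtask per data block: a task identifier plus its data."""
    for task_id, block in enumerate(blocks):
        task_q.put({'task_id': task_id, 'data': block})

def collect_results(result_q, n):
    """Read n task processing results back for analysis and state recording."""
    return [result_q.get() for _ in range(n)]
```

In use, `manager = connect()` would be followed by `initiate_subtasks(manager.get_task_queue(), blocks)` and, once the workers reply, `collect_results(manager.get_result_queue(), len(blocks))`.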
Task processing module: performs name registration applications for the tasks and subtasks it executes, applies for tasks and subtasks to process, completes the related data processing and calculation for them, replies with their completion results, and the like.
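And a sketch of the worker side (the patent's work.py); `process` is a hypothetical placeholder for the application computation, and the success/failure reply format is an assumption consistent with the processing states described below.

```python
from multiprocessing.managers import BaseManager

class QueueManager(BaseManager):
    """Client-side manager: only the task names are registered, no callables."""

QueueManager.register('get_task_queue')
QueueManager.register('get_result_queue')

def process(data):
    """Placeholder for the application-specific computation on one data block."""
    return sum(data)

def work_loop(task_q, result_q, max_tasks=None):
    """Fetch subtasks, process them, and reply with the result and its state."""
    done = 0
    while max_tasks is None or done < max_tasks:
        task = task_q.get()  # blocks until a subtask is available
        try:
            value = process(task['data'])
            result_q.put({'task_id': task['task_id'],
                          'status': 'success', 'result': value})
        except Exception:
            result_q.put({'task_id': task['task_id'], 'status': 'failure'})
        done += 1
```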
As shown in fig. 2, the present embodiment further provides a processing method of the distributed process scheduling parallel processing apparatus as above, including the following steps:
s1: the control management server service module is started, a task list is loaded, a secret key is created, and task node and work processing node information is registered;
s2: the task management service module establishes connection with the control management server service module, performs key verification, and generates and registers tasks or subtasks;
s3: the task management service module initiates tasks or subtasks to the control management server service module, and the control management server service module keeps the tasks or subtasks initiated by the task management service module in corresponding message queues;
s4: the task processing module establishes connection with the task management service module and the control management server service module, and performs key verification;
s5: the task processing module acquires a task or subtask to be processed and carries out related processing;
s6: after the task processing module finishes processing, the task processing result is sent to the control management server service module and is kept in a corresponding message queue;
s7: the task management service module reads the task processing result from the corresponding message queue of the control management server service module, updates and records the processing result and state.
Optionally, the key is an authkey value, and the task management service module and the task processing module are each verified according to the authkey value.
Optionally, the control management server service module starts its service, loads the task list, creates the key, and registers the task node and work processing node information (including address and user information); this preparation is performed once at startup.
The task management service module and the task processing module each establish a connection with the control management server service module, and the authkey check is completed once per connection.
The task node and work processing node information comprises processing addresses and user information corresponding to tasks.
The task or subtask to be processed acquired by the task processing module comprises a task identifier and data to be processed.
And processing each task by controlling the management server service module, the task management service module and the task processing module until all the tasks are completed.
In this embodiment, an optional specific scheme of the above processing method is provided:
(1) the control management server service module starts service, loads task list, creates key, registers task node and work processing node information (including address and user information).
(2) The task management service module establishes a connection with the control management server service module, and the authkey is checked. It then generates a task or subtask and applies to register it.
(3) The task or subtask (including the task identifier and the key data to be processed) is submitted to the control management server service module and is actually kept in the corresponding message queue.
(4) The task processing module establishes connections with the task management service module and the control management server service module; the authkey is checked, and the module applies to register for the tasks or subtasks to be processed.
(5) The task processing module obtains a task or subtask to be processed (including the task identifier and the key data to be processed) and performs the related functional processing.
(6) After completing a task, the task processing module sends the task result (the key data produced by the processing together with the processing state, success or failure) to the control management server service module; the result is actually kept in the corresponding message queue.
(7) The task management service module reads the processed key data and the processing state from the corresponding message queue of the control management server service module, then updates and records the processing result and state.
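The message flow of steps (2)-(7) can be dry-run in a single process, with plain queues standing in for the server's message queues; the connection and authkey steps are omitted, and the task payloads are illustrative.

```python
import queue

# The server module's request and reply queues (networking omitted).
task_q = queue.Queue()
result_q = queue.Queue()

# Step (3): the task management service submits subtasks (identifier + data).
for task_id, data in enumerate([[1, 2, 3], [4, 5, 6]]):
    task_q.put({'task_id': task_id, 'data': data})

# Steps (5)-(6): a worker fetches each subtask, processes it, and replies.
while not task_q.empty():
    task = task_q.get()
    result_q.put({'task_id': task['task_id'],
                  'status': 'success',
                  'result': sum(task['data'])})

# Step (7): the task management service reads back, updates, and records.
results = {}
while not result_q.empty():
    r = result_q.get()
    results[r['task_id']] = (r['status'], r['result'])
```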
The foregoing describes in detail preferred embodiments of the present invention. It should be understood that numerous modifications and variations can be made in accordance with the concepts of the invention by one of ordinary skill in the art without undue burden. Therefore, all technical solutions which can be obtained by logic analysis, reasoning or limited experiments based on the prior art by the person skilled in the art according to the inventive concept shall be within the scope of protection defined by the claims.
Claims (10)
1. A distributed process scheduling parallel processing apparatus, comprising:
the task management service module is used for registering and defining a task of data processing and decomposing the task into sub-tasks with independent functions according to data processing requirements; performing task scheduling and initiating according to the acquired tasks and subtasks; acquiring a task processing result and analyzing the result;
the control management server service module is connected with the task management service module and is used for registering task names and defining message queues according to tasks initiated by the task management service module, associating the task names with the message queues, outputting the tasks through the message queues, acquiring task processing results through the message queues and sending the task processing results to the task management service module;
the task processing module is connected with the control management server service module and used for registering task names according to the tasks output by the control management server service module, processing the corresponding tasks, processing and calculating related data and returning the task processing results to the control management server service module; in the data processing and calculating process, data corresponding to the task are subjected to block processing to obtain a plurality of data blocks, and each data block is processed by a corresponding processor.
2. A distributed process scheduling parallel processing apparatus according to claim 1, wherein the control management server service module is provided with an authkey value, and the task management service module and the task processing module are checked based on the authkey value when connected to the control management server service module.
3. The distributed process scheduling parallel processing apparatus according to claim 1, wherein the distributed process scheduling parallel processing apparatus performs task distribution and processing based on a QueueManager.
4. The distributed process scheduling parallel processing apparatus according to claim 1, wherein the task management service module decomposes a task of a single node into the same task processed independently by multiple nodes using Manager-based distributed processes.
5. A distributed process scheduling parallel processing apparatus according to claim 1, wherein the task processing module performs task processing in parallel by a plurality of processors.
6. A processing method of a distributed process scheduling parallel processing apparatus according to any one of claims 1 to 5, comprising the steps of:
s1: the control management server service module is started, loads a task list, creates a key, and registers task node and work processing node information;
s2: the task management service module establishes connection with the control management server service module and performs key verification, and the task management service module generates and registers tasks or subtasks;
s3: the task management service module initiates a task or subtask to the control management server service module, and the control management server service module keeps the task or subtask initiated by the task management service module in a corresponding message queue;
s4: the task processing module establishes connection with the task management service module and the control management server service module and performs key verification;
s5: the task processing module acquires tasks or subtasks to be processed and carries out related processing;
s6: after the task processing module finishes processing, the task processing result is sent to the control management server service module and is kept in a corresponding message queue;
s7: and the task management service module reads the task processing result from the corresponding message queue of the control management server service module, updates and records the processing result and the state.
7. The method of claim 6, wherein the key is an authkey value from which the task management service module and the task processing module each verify.
8. The method of claim 6, wherein the task node and work processing node information includes processing addresses and user information corresponding to tasks.
9. The method of claim 6, wherein the task or subtask to be processed acquired by the task processing module includes a task identification and data to be processed.
10. The method of claim 6, wherein the method further comprises: and processing each task by controlling the management server service module, the task management service module and the task processing module until all the tasks are completed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310075129.8A CN116126499A (en) | 2023-01-18 | 2023-01-18 | Distributed process scheduling parallel processing device and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310075129.8A CN116126499A (en) | 2023-01-18 | 2023-01-18 | Distributed process scheduling parallel processing device and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116126499A true CN116126499A (en) | 2023-05-16 |
Family
ID=86295251
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310075129.8A Pending CN116126499A (en) | 2023-01-18 | 2023-01-18 | Distributed process scheduling parallel processing device and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116126499A (en) |
- 2023
- 2023-01-18: CN application CN202310075129.8A filed; publication CN116126499A, status Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101290581B (en) | Compiling system and method | |
US8041790B2 (en) | Dynamic definition for concurrent computing environments | |
CN113220431B (en) | Cross-cloud distributed data task scheduling method, device and storage medium | |
CN107807815A (en) | The method and apparatus of distributed treatment task | |
US20130117755A1 (en) | Apparatuses, systems, and methods for distributed workload serialization | |
Bellettini et al. | Distributed CTL model checking using MapReduce: theory and practice | |
CN111641678A (en) | Task scheduling method and device, electronic equipment and medium | |
US20200310828A1 (en) | Method, function manager and arrangement for handling function calls | |
US7428486B1 (en) | System and method for generating process simulation parameters | |
CN114006815B (en) | Automatic deployment method and device for cloud platform nodes, nodes and storage medium | |
CN113051049A (en) | Task scheduling system, method, electronic device and readable storage medium | |
CN116126499A (en) | Distributed process scheduling parallel processing device and method | |
Gmys | Exactly solving hard permutation flowshop scheduling problems on peta-scale gpu-accelerated supercomputers | |
US7159012B2 (en) | Computational data processing system and computational process implemented by means of such a system | |
CN112422331A (en) | Operation and maintenance operation node monitoring method and related equipment | |
CN111651509A (en) | Data importing method and device based on Hbase database, electronic device and medium | |
JP2007213576A (en) | Method and system for selectively tracing semantic web data using distributed update event, and storage device | |
CN114265997B (en) | Page information output method, device, storage medium and terminal | |
CN108614731B (en) | Method, device and system for operating MapReduce operation | |
CN111782482B (en) | Interface pressure testing method and related equipment | |
CN111324472B (en) | Method and device for judging garbage items of information to be detected | |
CN117573329B (en) | Multi-brain collaborative task scheduling method, task scheduling device and storage medium | |
WO2009041801A2 (en) | Trusted node for grid computing | |
Crespo Abril et al. | Scheduling resource-constrained projects using branch and bound and parallel computing techniques | |
CN114791799A (en) | Modeling method, device, equipment and storage medium based on BPMN |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||