CN116126499A - Distributed process scheduling parallel processing device and method - Google Patents

Info

Publication number
CN116126499A
Authority
CN
China
Prior art keywords
task
processing
service module
module
management server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310075129.8A
Other languages
Chinese (zh)
Inventor
肖飞军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Financial Technology Co Ltd
Original Assignee
Bank of China Financial Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Financial Technology Co Ltd filed Critical Bank of China Financial Technology Co Ltd
Priority to CN202310075129.8A priority Critical patent/CN116126499A/en
Publication of CN116126499A publication Critical patent/CN116126499A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

The invention relates to a distributed process scheduling parallel processing device and method. The device comprises: a task management service module, used for registering and defining a data processing task, decomposing the task into functionally independent subtasks, scheduling and initiating tasks, and acquiring task processing results for result analysis; a control management server service module, used for registering task names and defining message queues, associating the task names with the message queues, outputting tasks through the message queues, and acquiring task processing results through the message queues; and a task processing module, used for registering task names, processing and computing the related data, and obtaining task processing results. Compared with the prior art, the functions of the modules are independent of one another and no other distributed supporting software needs to be introduced, so the invention conveniently addresses slow serial processing of large non-real-time data volumes, real-time computation with high complexity and strict real-time requirements, and concurrent processing of multiple tasks.

Description

Distributed process scheduling parallel processing device and method
Technical Field
The invention relates to the technical field of data processing, in particular to a distributed process scheduling parallel processing device and method.
Background
When a CPU processes complex tasks, the amount of computation is large and the processing takes a long time. In particular, processing a large volume of text may require running complex encryption and decryption algorithms, and the work can take several hours or even tens of hours to complete.
A distributed process scheduling parallel processing device can greatly improve query speed. However, adopting an existing large-scale distributed computing product (such as the currently popular Hadoop, Spark, or Storm architectures) means building a software environment that is complex and demanding, which is inconvenient to use.
Most such distributed systems are based on the Hadoop Distributed File System and rely on a variety of big data support software. Their drawbacks are:
Faults are difficult to troubleshoot: many supporting software packages are required, much of that software is not open source, and it is difficult to master.
Dependence on supporting tools: because a distributed file system is used and must be managed, software with a variety of different functions is needed to support it.
Security: distributed computing raises data security and data sharing problems.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a distributed process scheduling parallel processing device and method that address slow serial processing of large non-real-time data volumes and real-time computation with high complexity and strict real-time requirements.
The aim of the invention can be achieved by the following technical scheme:
a distributed process scheduling parallel processing apparatus comprising:
the task management service module is used for registering and defining a task of data processing and decomposing the task into sub-tasks with independent functions according to data processing requirements; performing task scheduling and initiating according to the acquired tasks and subtasks; acquiring a task processing result and analyzing the result;
the control management server service module is connected with the task management service module and is used for registering task names and defining message queues according to tasks initiated by the task management service module, associating the task names with the message queues, outputting the tasks through the message queues, acquiring task processing results through the message queues and sending the task processing results to the task management service module;
the task processing module is connected with the control management server service module and used for registering task names according to the tasks output by the control management server service module, processing the corresponding tasks, processing and calculating related data and returning the task processing results to the control management server service module; in the data processing and calculating process, data corresponding to the task are subjected to block processing to obtain a plurality of data blocks, and each data block is processed by a corresponding processor.
Further, the control management server service module is provided with an authkey value, and when the task management service module and the task processing module connect to the control management server service module, each is verified against the authkey value.
Further, the distributed process scheduling parallel processing device distributes and processes tasks based on the QueueManager.
Further, the task management service module uses the Manager distributed process mechanism to decompose a single-node task into identical tasks that are processed independently by multiple nodes.
Further, the task processing module performs task processing in parallel through a plurality of processors.
A processing method of a distributed process scheduling parallel processing apparatus as described above, comprising the steps of:
S1: the control management server service module is started, loads a task list, creates a key, and registers task node and work processing node information;
S2: the task management service module establishes a connection with the control management server service module and performs key verification, and the task management service module generates and registers tasks or subtasks;
S3: the task management service module initiates a task or subtask to the control management server service module, and the control management server service module holds the task or subtask initiated by the task management service module in a corresponding message queue;
S4: the task processing module establishes connections with the task management service module and the control management server service module and performs key verification;
S5: the task processing module acquires the tasks or subtasks to be processed and carries out the related processing;
S6: after the task processing module finishes processing, the task processing result is sent to the control management server service module and is held in a corresponding message queue;
S7: the task management service module reads the task processing result from the corresponding message queue of the control management server service module, and updates and records the processing result and the state.
Further, the key is an authkey value, and the task management service module and the task processing module verify according to the authkey value.
Further, the task node and work processing node information comprises processing addresses and user information corresponding to tasks.
Further, the task or subtask to be processed acquired by the task processing module includes a task identifier and data to be processed.
Further, the method further comprises: processing each task by means of the control management server service module, the task management service module and the task processing module until all tasks are completed.
Compared with the prior art, the invention has the following advantages:
(1) The invention is implemented on the QueueManager principle, and the functions of the control management server service module, the task management service module and the task processing module are independent of one another. Task distribution and result return are handled only by the task management service module, while the task processing module handles only the application processing of the tasks assigned to it, and the two do not exchange information directly.
Because the functions of the modules are independent, no other distributed supporting software needs to be introduced, and the invention conveniently addresses slow serial processing of large non-real-time data volumes, real-time computation with high complexity and strict real-time requirements, and concurrent processing of multiple tasks.
(2) Parallel processing is realized with distributed process technology, which greatly increases the speed of complex task processing.
Drawings
FIG. 1 is a schematic diagram of a distributed process scheduling parallel processing apparatus according to an embodiment of the present invention;
FIG. 2 is a flow chart of a distributed process scheduling parallel processing method provided in an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
Term interpretation:
1. multiprocessing: the multi-process management module of the Python language.
2. Manager: the managers submodule of the Python multiprocessing module, which supports distributing multiple processes across multiple machines.
3. QueueManager: a distributed process message manager built on multiprocessing in the Python language (conventionally a subclass of multiprocessing.managers.BaseManager).
4. Queue: the message queue under the multiprocessing module, mainly used for sending and receiving information between the distributed management server and the work servers.
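As an illustration of the terms above, the following is a minimal sketch of how a QueueManager is commonly built on `multiprocessing.managers.BaseManager`. The names `get_task_queue`, `get_result_queue`, `start_server`, `connect` and the key `b"demo-key"` are assumptions made for this example and do not come from the patent text. The server runs in a background thread only so that the sketch is self-contained; in a deployment it would be its own process on the management server.

```python
import queue
import threading
from multiprocessing.managers import BaseManager

# Queues owned by the server side (hypothetical names).
task_queue = queue.Queue()
result_queue = queue.Queue()


class QueueManager(BaseManager):
    """Manager subclass that exposes the two queues over the network."""


# Register the accessor names that remote clients will call.
QueueManager.register("get_task_queue", callable=lambda: task_queue)
QueueManager.register("get_result_queue", callable=lambda: result_queue)


def start_server(authkey: bytes = b"demo-key"):
    """Serve the queues from a background thread and return the bound address."""
    manager = QueueManager(address=("127.0.0.1", 0), authkey=authkey)
    server = manager.get_server()  # port 0 lets the OS pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.address


def connect(address, authkey: bytes = b"demo-key") -> QueueManager:
    """Connect a client (task side or worker side) to the running server."""
    client = QueueManager(address=address, authkey=authkey)
    client.connect()
    return client
```

A client that calls `connect(address).get_task_queue()` receives a proxy whose `put` and `get` calls are forwarded to the queue held by the server.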
Example 1
As shown in FIG. 1, this embodiment provides a distributed process scheduling parallel processing device whose main work includes task decomposition, data blocking, subtask flow, and data transmission and association.
Task decomposition: according to the data processing requirements, a complete data processing task is decomposed into functionally independent subtasks.
Data blocking: large data sets are split into blocks. For example, 10 million records are split into 10 blocks of 1 million records each, and the 10 blocks are handed to 10 processors for processing.
Data transmission and association: a task request queue and a reply queue for completed tasks are defined, the relation between each task request and its request queue is registered, and an authkey value is set and checked when a task applies for secure registration.
During task processing, serial processing is changed into parallel processing: the Manager distributed process mechanism turns a single-node task into identical tasks processed independently by multiple nodes. The various tasks and their corresponding queues are defined, and the processing code that completes each subtask is written, which greatly improves processing efficiency.
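The data blocking step described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `split_into_blocks` and `make_subtasks` and the subtask id format (for example `encrypt-3`) are hypothetical, and the patent does not specify how block boundaries are chosen.

```python
from typing import List, Sequence, Tuple


def split_into_blocks(records: Sequence, n_blocks: int) -> List[Sequence]:
    """Split records into n_blocks contiguous blocks of near-equal size."""
    if n_blocks <= 0:
        raise ValueError("n_blocks must be positive")
    size, extra = divmod(len(records), n_blocks)
    blocks, start = [], 0
    for i in range(n_blocks):
        # The first `extra` blocks take one more record when sizes are uneven.
        end = start + size + (1 if i < extra else 0)
        blocks.append(records[start:end])
        start = end
    return blocks


def make_subtasks(task_name: str, records: Sequence,
                  n_blocks: int) -> List[Tuple[str, Sequence]]:
    """Pair each block with a hypothetical subtask id such as 'encrypt-3'."""
    return [(f"{task_name}-{i}", block)
            for i, block in enumerate(split_into_blocks(records, n_blocks))]
```

With `records = range(10_000_000)` and `n_blocks = 10`, each subtask carries a block of 1,000,000 records, matching the example in the text. Each `(subtask id, block)` pair could then be placed on the task request queue.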
The device mainly comprises:
a control management server service module (monitor server.py), a task management service module (task.py), and a task processing module (work.py). The control management server service module is the core and covers security checking, tasks, and message exchange. Task registration initiated by the task management service must go through the control management server service module, and the control management server service module is also the source from which the task processing module obtains its tasks.
Specifically:
the task management service module is used for registering and defining a task of data processing and decomposing the task into sub-tasks with independent functions according to data processing requirements; performing task scheduling and initiating according to the acquired tasks and subtasks; acquiring a task processing result and analyzing the result;
the control management server service module is connected with the task management service module and is used for registering task names and defining message queues according to tasks initiated by the task management service module, associating the task names with the message queues, outputting the tasks through the message queues, acquiring task processing results through the message queues and sending the task processing results to the task management service module;
the task processing module is connected with the control management server service module and used for registering task names according to the tasks output by the control management server service module, processing the corresponding tasks, processing and calculating related data and returning the task processing results to the control management server service module; in the data processing and calculating process, the data corresponding to the task are subjected to block processing to obtain a plurality of data blocks, and each data block is respectively processed by a corresponding processor.
The distributed process scheduling parallel processing device distributes and processes tasks based on the QueueManager.
The task management service module uses the Manager distributed process mechanism to decompose a single-node task into identical tasks that are processed independently by multiple nodes, and the task processing module processes tasks in parallel through multiple processors.
Preferably, the control management server service module is provided with an authkey value, and when the task management service module and the task processing module are connected with the control management server service module, the task management service module and the task processing module are checked according to the authkey value.
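The authkey check can be illustrated with Python's `multiprocessing.managers.BaseManager`, which rejects a client whose key does not match the server's. The sketch below is an assumption about how such a check might look, not the patent's own code; `serve`, `try_connect`, `get_task_queue` and the key values are hypothetical names, and the server runs in a background thread only to keep the example self-contained.

```python
import queue
import threading
from multiprocessing import AuthenticationError
from multiprocessing.managers import BaseManager

_tasks = queue.Queue()


class QueueManager(BaseManager):
    """Hypothetical manager exposing a single task queue."""


QueueManager.register("get_task_queue", callable=lambda: _tasks)


def serve(authkey: bytes):
    """Start the server with the given authkey and return its address."""
    manager = QueueManager(address=("127.0.0.1", 0), authkey=authkey)
    server = manager.get_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.address


def try_connect(address, authkey: bytes) -> bool:
    """Return True if the authkey check passes, False if it is rejected."""
    client = QueueManager(address=address, authkey=authkey)
    try:
        client.connect()  # the key exchange happens during connect()
    except AuthenticationError:
        return False
    return True
```

A task or worker client holding the wrong key never obtains a queue proxy, so it can neither submit tasks nor read results.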
In this embodiment, an optional specific function of each of the above modules is provided as follows:
Control management server service module: the core of the device, covering security checking and the exchange of tasks and messages. It handles registration of the various task names and definition of the queues, maintains the association between task names and queues, and stores and manages task information and returned task results. An authkey value is set and used for verification when a task applies for secure registration.
Task management service module: applies for name registration of tasks and subtasks, defines tasks and subtasks, organizes data, initiates tasks and subtasks, analyzes the results of completed tasks and subtasks, schedules tasks and subtasks, and so on.
Task processing module: applies for registration of the names of the tasks and subtasks it executes, requests tasks and subtasks to process, performs the related data processing and computation, returns the completion results of tasks and subtasks, and so on.
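The association between task names and message queues kept by the control management server service module might be modelled as below. This is a local, single-process sketch under stated assumptions: the class `TaskRegistry` and its method names are hypothetical, and a request queue plus a reply queue per task name is one reading of the description rather than a confirmed design.

```python
import queue
from typing import Any, Dict, Tuple


class TaskRegistry:
    """Maps each registered task name to a (request queue, reply queue) pair."""

    def __init__(self) -> None:
        self._queues: Dict[str, Tuple[queue.Queue, queue.Queue]] = {}

    def register(self, task_name: str) -> None:
        """Create the queue pair for a task name; re-registering is a no-op."""
        if task_name not in self._queues:
            self._queues[task_name] = (queue.Queue(), queue.Queue())

    def _lookup(self, task_name: str) -> Tuple[queue.Queue, queue.Queue]:
        if task_name not in self._queues:
            raise KeyError(f"task name not registered: {task_name}")
        return self._queues[task_name]

    def submit(self, task_name: str, payload: Any) -> None:
        """Task management side: place a task on the request queue."""
        self._lookup(task_name)[0].put(payload)

    def fetch(self, task_name: str, timeout: float = 1.0) -> Any:
        """Task processing side: take the next task from the request queue."""
        return self._lookup(task_name)[0].get(timeout=timeout)

    def reply(self, task_name: str, result: Any) -> None:
        """Task processing side: place a result on the reply queue."""
        self._lookup(task_name)[1].put(result)

    def collect(self, task_name: str, timeout: float = 1.0) -> Any:
        """Task management side: read the next result from the reply queue."""
        return self._lookup(task_name)[1].get(timeout=timeout)
```

Keeping requests and replies in separate queues per task name means the two client modules never address each other directly; both talk only to the registry.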
As shown in fig. 2, the present embodiment further provides a processing method of the distributed process scheduling parallel processing apparatus as above, including the following steps:
S1: the control management server service module is started, a task list is loaded, a secret key is created, and task node and work processing node information is registered;
S2: the task management service module establishes a connection with the control management server service module, performs key verification, and generates and registers tasks or subtasks;
S3: the task management service module initiates tasks or subtasks to the control management server service module, and the control management server service module holds the tasks or subtasks initiated by the task management service module in corresponding message queues;
S4: the task processing module establishes connections with the task management service module and the control management server service module, and performs key verification;
S5: the task processing module acquires a task or subtask to be processed and carries out the related processing;
S6: after the task processing module finishes processing, the task processing result is sent to the control management server service module and is held in a corresponding message queue;
S7: the task management service module reads the task processing result from the corresponding message queue of the control management server service module, and updates and records the processing result and state.
Optionally, the key is an authkey value, and the task management service module and the task processing module are each verified according to the authkey value.
Optionally, the control management server service module starts the service, loads the task list, creates the key, and registers task node and work processing node information (including address and user information); this preparation is done once at startup.
The task management service module and the task processing module each establish a connection with the control management server service module, and the authkey check is completed once.
The task node and work processing node information comprises processing addresses and user information corresponding to tasks.
The task or subtask to be processed acquired by the task processing module comprises a task identifier and data to be processed.
Each task is processed by means of the control management server service module, the task management service module and the task processing module until all tasks are completed.
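The message contents named above (a task identifier plus the data to be processed) and the "until all tasks are completed" loop can be sketched with plain data classes. The field and function names (`TaskMessage`, `TaskResult`, `process_all`, `ok`) are assumptions for illustration; the patent states only that a task carries an identifier and data, and that a result carries data and a processing state.

```python
import queue
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable


@dataclass
class TaskMessage:
    task_id: str  # task identifier
    data: Any     # data to be processed


@dataclass
class TaskResult:
    task_id: str
    data: Any     # processed data, or an error description on failure
    ok: bool      # processing state: True for success, False for failure


def process_all(tasks: Iterable[TaskMessage],
                handler: Callable[[Any], Any]) -> Dict[str, TaskResult]:
    """Drain a local queue of tasks, recording one result per task id."""
    pending: "queue.Queue[TaskMessage]" = queue.Queue()
    for task in tasks:
        pending.put(task)
    results: Dict[str, TaskResult] = {}
    while not pending.empty():  # loop until all tasks are completed
        msg = pending.get()
        try:
            results[msg.task_id] = TaskResult(msg.task_id, handler(msg.data), True)
        except Exception as exc:  # a failed task is recorded, not dropped
            results[msg.task_id] = TaskResult(msg.task_id, repr(exc), False)
    return results
```

Recording failures as results rather than raising lets the task management side update the state of every task, including the ones that did not succeed.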
In this embodiment, an optional specific scheme of the above processing method is provided:
(1) The control management server service module starts the service, loads the task list, creates the key, and registers task node and work processing node information (including address and user information).
(2) The task management service module establishes a connection with the control management server service module and passes the authkey check. It then generates a task or subtask and applies to register it.
(3) The task or subtask (including the task identifier and the key data to be computed) is submitted to the control management server service module, where it is actually held in the corresponding message queue.
(4) The task processing module establishes connections with the task management service module and the control management server service module, passes the authkey check, and applies to register the tasks or subtasks it will process.
(5) The task processing module obtains the task or subtask to be processed (including the task identifier and the key data to be computed) and performs the relevant functional processing.
(6) After the task processing module completes the task, it sends the task result (the key data produced by the processing and the processing state, success or failure) to the control management server service module, where the result is actually held in the corresponding message queue.
(7) The task management service module reads the result (the key data produced by the processing and the processing state) from the corresponding message queue of the control management server service module, then updates and records the processing result and state.
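Steps (1) to (7) can be walked through end to end with the sketch below. It is a simplified illustration and not the patent's implementation: the server and the worker run as threads in one process so that the example is self-contained, whereas the patent describes separate modules that would normally run as separate processes or machines. The names `AUTHKEY`, `start_control_server`, `connect`, `worker`, `run_demo` and the use of `sum` as the processing function are all assumptions.

```python
import queue
import threading
from multiprocessing.managers import BaseManager

AUTHKEY = b"demo-key"  # hypothetical shared authkey
_task_q = queue.Queue()
_result_q = queue.Queue()


class QueueManager(BaseManager):
    """Hypothetical manager standing in for the control management server."""


QueueManager.register("get_task_queue", callable=lambda: _task_q)
QueueManager.register("get_result_queue", callable=lambda: _result_q)


def start_control_server():
    """Step 1: start the service with its key and return the bound address."""
    manager = QueueManager(address=("127.0.0.1", 0), authkey=AUTHKEY)
    server = manager.get_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.address


def connect(address):
    """Steps 2 and 4: connect to the server; the authkey is checked here."""
    client = QueueManager(address=address, authkey=AUTHKEY)
    client.connect()
    return client


def worker(address, n_tasks):
    """Steps 5 and 6: fetch subtasks, process them, send results back."""
    client = connect(address)
    tasks, results = client.get_task_queue(), client.get_result_queue()
    for _ in range(n_tasks):
        task_id, block = tasks.get(timeout=5)
        try:
            results.put((task_id, sum(block), "success"))
        except Exception as exc:
            results.put((task_id, repr(exc), "failure"))


def run_demo(blocks):
    """Drive the whole flow and return {task_id: (value, state)}."""
    address = start_control_server()
    client = connect(address)
    tasks, results = client.get_task_queue(), client.get_result_queue()
    for i, block in enumerate(blocks):  # step 3: submit subtasks
        tasks.put((f"sum-{i}", block))
    thread = threading.Thread(target=worker, args=(address, len(blocks)))
    thread.start()
    collected = {}
    for _ in blocks:                    # step 7: read results and record state
        task_id, value, state = results.get(timeout=5)
        collected[task_id] = (value, state)
    thread.join()
    return collected
```

Calling `run_demo([[1, 2, 3], [4, 5], [6]])` submits three subtasks and collects one result per subtask id. Starting several workers against the same address would let them share the task queue, which is the parallel case the description targets.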
The foregoing describes in detail preferred embodiments of the present invention. It should be understood that numerous modifications and variations can be made in accordance with the concepts of the invention by one of ordinary skill in the art without undue burden. Therefore, all technical solutions which can be obtained by logic analysis, reasoning or limited experiments based on the prior art by the person skilled in the art according to the inventive concept shall be within the scope of protection defined by the claims.

Claims (10)

1. A distributed process scheduling parallel processing apparatus, comprising:
the task management service module is used for registering and defining a task of data processing and decomposing the task into sub-tasks with independent functions according to data processing requirements; performing task scheduling and initiating according to the acquired tasks and subtasks; acquiring a task processing result and analyzing the result;
the control management server service module is connected with the task management service module and is used for registering task names and defining message queues according to tasks initiated by the task management service module, associating the task names with the message queues, outputting the tasks through the message queues, acquiring task processing results through the message queues and sending the task processing results to the task management service module;
the task processing module is connected with the control management server service module and used for registering task names according to the tasks output by the control management server service module, processing the corresponding tasks, processing and calculating related data and returning the task processing results to the control management server service module; in the data processing and calculating process, data corresponding to the task are subjected to block processing to obtain a plurality of data blocks, and each data block is processed by a corresponding processor.
2. A distributed process scheduling parallel processing apparatus according to claim 1, wherein the control management server service module is provided with an authkey value, and the task management service module and the task processing module are checked based on the authkey value when connected to the control management server service module.
3. The distributed process scheduling parallel processing apparatus according to claim 1, wherein the distributed process scheduling parallel processing apparatus performs task distribution and processing based on a QueueManager.
4. The distributed process scheduling parallel processing apparatus according to claim 1, wherein the task management service module uses a Manager distributed process to decompose a task of a single node into identical tasks processed independently by multiple nodes.
5. A distributed process scheduling parallel processing apparatus according to claim 1, wherein the task processing module performs task processing in parallel by a plurality of processors.
6. A processing method of a distributed process scheduling parallel processing apparatus according to any one of claims 1 to 5, comprising the steps of:
S1: the control management server service module is started, loads a task list, creates a key, and registers task node and work processing node information;
S2: the task management service module establishes a connection with the control management server service module and performs key verification, and the task management service module generates and registers tasks or subtasks;
S3: the task management service module initiates a task or subtask to the control management server service module, and the control management server service module holds the task or subtask initiated by the task management service module in a corresponding message queue;
S4: the task processing module establishes connections with the task management service module and the control management server service module and performs key verification;
S5: the task processing module acquires the tasks or subtasks to be processed and carries out the related processing;
S6: after the task processing module finishes processing, the task processing result is sent to the control management server service module and is held in a corresponding message queue;
S7: the task management service module reads the task processing result from the corresponding message queue of the control management server service module, and updates and records the processing result and the state.
7. The method of claim 6, wherein the key is an authkey value from which the task management service module and the task processing module each verify.
8. The method of claim 6, wherein the task node and work processing node information includes processing addresses and user information corresponding to tasks.
9. The method of claim 6, wherein the task or subtask to be processed acquired by the task processing module includes a task identification and data to be processed.
10. The method of claim 6, wherein the method further comprises: processing each task by means of the control management server service module, the task management service module and the task processing module until all tasks are completed.
CN202310075129.8A 2023-01-18 2023-01-18 Distributed process scheduling parallel processing device and method Pending CN116126499A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310075129.8A CN116126499A (en) 2023-01-18 2023-01-18 Distributed process scheduling parallel processing device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310075129.8A CN116126499A (en) 2023-01-18 2023-01-18 Distributed process scheduling parallel processing device and method

Publications (1)

Publication Number Publication Date
CN116126499A (en) 2023-05-16

Family

ID=86295251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310075129.8A Pending CN116126499A (en) 2023-01-18 2023-01-18 Distributed process scheduling parallel processing device and method

Country Status (1)

Country Link
CN (1) CN116126499A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination