CN117472530A

CN117472530A - Centralized management-based data intelligent scheduling method and system

Info

Publication number: CN117472530A
Application number: CN202311395544.8A
Authority: CN
Inventors: 彭云苹
Original assignee: Shanghai Kuanrui Information Technology Co ltd
Current assignee: Shanghai Kuanrui Information Technology Co ltd
Priority date: 2023-10-25
Filing date: 2023-10-25
Publication date: 2024-01-30
Anticipated expiration: 2043-10-25
Also published as: CN117472530B

Abstract

The invention provides a data intelligent scheduling method and system based on centralized management. A scheduling code, a scheduling statement and a scheduling object type are acquired from a first scheduling table. When the scheduling object type is the first scheduling type, SQL scheduling is executed: a scheduling operation instruction is sent to a server, and the server executes an SQL stored procedure on a timed basis so that data in a database is added and/or updated on a timed basis, wherein the server comprises a plurality of local servers and/or a plurality of remote servers. When the scheduling object type is the second scheduling type, shell scheduling is executed: the scheduling statement comprises server ID information, a scheduling operation instruction is sent to a designated server according to the server ID information, and the designated server executes the SQL stored procedure on a timed basis. By handling the first scheduling type and the second scheduling type separately, the method and system execute scheduling tasks automatically and schedule them accurately.

Description

Centralized management-based data intelligent scheduling method and system

Technical Field

The invention relates to the technical field of data scheduling, in particular to a data intelligent scheduling method and system based on centralized management.

Background

In existing scheduling technology in China, professional developers are generally required to process the information to be scheduled: they write code according to the requirements and use the resulting program to run a single schedule in a fixed time period on a fixed server. Meanwhile, the result of a scheduling run is usually surfaced only as an error reported inside the program, and developers are not notified automatically when the program fails. Developers must therefore check the scheduler's results and the related data themselves at regular intervals, either to determine the cause of an abnormal condition or to confirm that the program is running normally, and must communicate with the relevant business personnel to resolve certain scheduling abnormalities.

Therefore, it is necessary to provide a data intelligent scheduling method and system based on centralized management, which can solve the above problems.

Disclosure of Invention

Aiming at the problems and the shortcomings of the prior art, the invention provides a data intelligent scheduling method and system based on centralized management.

The invention solves the technical problems by the following technical proposal:

the invention provides a data intelligent scheduling method based on centralized management, which comprises the following steps:

acquiring a scheduling code, a scheduling statement and a scheduling object type in a first scheduling table, wherein the scheduling object type comprises a first scheduling type and a second scheduling type;

when the scheduling object type is the first scheduling type, SQL scheduling is executed, a scheduling operation instruction is sent to a server, and the server executes an SQL stored procedure on a timed basis, so that data in the database is added and/or updated on a timed basis, wherein the server comprises a plurality of local servers and/or a plurality of remote servers;

and when the scheduling object type is the second scheduling type, shell scheduling is executed, wherein the scheduling statement comprises server ID information, a scheduling operation instruction is sent to a designated server according to the server ID information, and the designated server executes an SQL stored procedure on a timed basis, wherein the scheduling operation instruction is a concatenation of the server ID information, the scheduler ID information and the scheduler log ID information.

Preferably, the method further comprises the step of acquiring scheduling group information in the second scheduling table, wherein the scheduling group information comprises scheduling tasks executed at the same time and scheduling tasks of the same item.

Preferably, the method further comprises obtaining a scheduling execution type in the third scheduling table, wherein the scheduling execution type comprises serial waiting, parallel execution and abandoning execution.

Preferably, the method further comprises obtaining a scheduling type, a scheduler path and a scheduler log path in the fourth schedule.

Preferably, the method further comprises the step of obtaining an upper-level code in a fifth scheduling table, wherein the upper-level code comprises a first code, a second code and a third code, and the schedules named by the upper-level code are executed first when a scheduled task is executed; the fifth scheduling table is used for associating the first scheduling table with the second scheduling table and comprises a scheduling group code and a scheduling code, the scheduling group code being used for associating the second scheduling table and the scheduling code being used for associating the first scheduling table.

Preferably, the method further comprises the step of obtaining log information after scheduling operation in a sixth schedule, wherein the sixth schedule comprises monitoring scheduling codes, and the monitoring scheduling codes are used for associating the scheduling codes in the first schedule.

Preferably, the method further comprises obtaining a running condition of a scheduling group in a seventh scheduling table, wherein the seventh scheduling table comprises a scheduling group code, and the scheduling group code is used for associating the scheduling group code in the second scheduling table.

Preferably, when the upper-level code comprises the first code, all upstream schedules must be completed before the current schedule is executed; when the upper-level code comprises the second code, the current schedule can be executed as soon as any one upstream schedule is completed; and when the upper-level code comprises the third code, the current schedule has no upstream schedule and is executed directly.

Preferably, when the scheduling object type is the first scheduling type, scheduling is performed on the database, and corresponding SQL is executed; and when the scheduling object type is the second scheduling type, scheduling the remote task, and executing the corresponding shell script.

The invention also provides a data intelligent scheduling system based on centralized management, which comprises:

the scheduling information acquisition module is used for acquiring a scheduling statement and a scheduling object type in the first scheduling table, wherein the scheduling object type comprises a first scheduling type and a second scheduling type;

the SQL scheduling module is used for executing SQL scheduling when the scheduling object type is the first scheduling type, sending a scheduling operation instruction to a server, and having the server execute an SQL stored procedure on a timed basis, so that data in the database is added and/or updated on a timed basis, wherein the server comprises a plurality of local servers and/or a plurality of remote servers;

and the Shell scheduling module is used for executing shell scheduling when the scheduling object type is the second scheduling type, wherein the scheduling statement comprises server ID information, a scheduling operation instruction is sent to a designated server according to the server ID information, and the designated server executes the SQL stored procedure on a timed basis, and wherein the scheduling operation instruction comprises server ID information, scheduler ID information and scheduler log ID information.

Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:

the embodiment of the invention provides a data intelligent scheduling method and system based on centralized management. A scheduling code, a scheduling statement and a scheduling object type are acquired from a first scheduling table, the scheduling object type comprising a first scheduling type and a second scheduling type. When the scheduling object type is the first scheduling type, SQL scheduling is executed: a scheduling operation instruction is sent to a server and the server executes an SQL stored procedure on a timed basis, so that data in the database is added and/or updated on a timed basis, the server comprising a plurality of local servers and/or a plurality of remote servers. When the scheduling object type is the second scheduling type, shell scheduling is executed: the scheduling statement comprises server ID information, a scheduling operation instruction is sent to a designated server according to the server ID information, and the designated server executes an SQL stored procedure on a timed basis, the scheduling operation instruction being a concatenation of the server ID information, the scheduler ID information and the scheduler log ID information. Scheduling tasks are thereby executed automatically and scheduled accurately;

further, when the upper-level code comprises the first code, all upstream schedules must be completed before the current schedule is executed; when the upper-level code comprises the second code, the current schedule can be executed as soon as any one upstream schedule is completed; when the upper-level code comprises the third code, the current schedule has no upstream schedule and is executed directly, so that upstream schedules are effectively given priority.

Drawings

Fig. 1 is a schematic flow chart of a centralized management-based data intelligent scheduling method according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a centralized management-based data intelligent scheduling system according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of the scheduling group basic information configuration for executing shell scheduling in the centralized management-based data intelligent scheduling method according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of the front-end display panel for executing shell scheduling in the centralized management-based data intelligent scheduling method according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of a shell scheduling statement for executing shell scheduling in the centralized management-based data intelligent scheduling method according to an embodiment of the present invention;

Fig. 6 is a schematic diagram of a scheduling operation log for executing shell scheduling in the centralized management-based data intelligent scheduling method according to an embodiment of the present invention;

Fig. 7 is a schematic diagram of the scheduling group basic information configuration for executing SQL scheduling in the centralized management-based data intelligent scheduling method according to an embodiment of the present invention;

Fig. 8 is a schematic diagram of the front-end display panel for executing SQL scheduling in the centralized management-based data intelligent scheduling method according to an embodiment of the present invention;

Fig. 9 is a schematic diagram of a scheduling operation log for executing SQL scheduling in the centralized management-based data intelligent scheduling method according to an embodiment of the present invention;

Fig. 10 is a schematic diagram of a scheduling error log for executing SQL scheduling in the centralized management-based data intelligent scheduling method according to an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The technical scheme of the invention is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.

Based on the problems existing in the prior art, as shown in fig. 1, the invention provides a data intelligent scheduling method based on centralized management, which comprises the following steps:

step S101: acquiring a scheduling code, a scheduling statement and a scheduling object type in a first scheduling table, wherein the scheduling object type comprises a first scheduling type and a second scheduling type;

step S102: when the scheduling object type is the first scheduling type, SQL scheduling is executed, a scheduling operation instruction is sent to a server, and the server executes an SQL stored procedure on a timed basis, so that data in the database is added and/or updated on a timed basis, wherein the server comprises a plurality of local servers and/or a plurality of remote servers;

step S103: when the scheduling object type is the second scheduling type, shell scheduling is executed, wherein the scheduling statement comprises server ID information, a scheduling operation instruction is sent to a designated server according to the server ID information, and the designated server executes an SQL stored procedure on a timed basis, wherein the scheduling operation instruction is a concatenation of the server ID information, the scheduler ID information and the scheduler log ID information.
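Steps S101 to S103 can be illustrated with a minimal sketch of the type-based dispatch. The `ScheduleRow` class, its field names and the returned tuple layout are hypothetical and chosen only for illustration; the text specifies only that type 1 means SQL scheduling, type 2 means shell scheduling, and that a shell scheduling statement carries the server ID.

```python
from dataclasses import dataclass

SQL_SCHEDULING = 1    # first scheduling type: SQL stored procedure scheduling
SHELL_SCHEDULING = 2  # second scheduling type: shell scheduling on a designated server


@dataclass
class ScheduleRow:
    """One row of the first scheduling table (field names are hypothetical)."""
    code: str         # scheduling code, e.g. "000001"
    object_type: int  # 1 = SQL scheduling, 2 = shell scheduling
    statement: str    # scheduling statement


def dispatch(row: ScheduleRow) -> tuple[str, str, str]:
    """Return (kind, target, payload) describing where the statement should run."""
    if row.object_type == SQL_SCHEDULING:
        # SQL scheduling: the statement is passed to the database unchanged.
        return ("sql", "database", row.statement)
    if row.object_type == SHELL_SCHEDULING:
        # Shell scheduling: the text before the first "@" is the server ID.
        server_id, sep, command = row.statement.partition("@")
        if not sep or not server_id or not command:
            raise ValueError(f"schedule {row.code}: expected '<server_id>@<command>'")
        return ("shell", server_id, command)
    raise ValueError(f"schedule {row.code}: unknown object type {row.object_type}")
```

A caller would then hand the payload either to a database connection or to the scheduling client on the designated server; both of those steps are outside this sketch.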

Table structure

As shown in the above table, the first scheduling table, i.e. the platform schedule list, records the scheduling statement, description and other details of each individual schedule; it holds the actual scheduling content of every schedule under a scheduling group.

The fields in the first schedule are interpreted as follows:

scheduling code: a 6-digit code starting from 000001 that uniquely identifies a schedule;

scheduling object: distinguishes SQL scheduling from server shell scheduling, where 1 denotes SQL stored procedure scheduling and 2 denotes server execution scheduling;

scheduling database: when the scheduling object is 1 (SQL scheduling), this field records the database in which the stored procedure is executed. When the scheduling object is 2 (shell scheduling), the field may hold the value "server", indicating that the scheduling statement is a non-standardized statement, i.e. one that is not normalized through the platform scheduling model configuration table; for example, the scheduling statement 1000401@PYTHONIOENCODING=utf-8 /opt/mybin/python3 -u /data/program/highfrequency/SH/download/rsyncGetFile.py means that the Python program following the @ symbol is executed on machine 1000401. For a standardized schedule, the field may instead hold a value of the form "scheduling type_specific scheduling configuration ID", such as cll-into_240122, which denotes crawler program number 240122;

scheduling statement: when the scheduling object is 1 (SQL scheduling), this field stores the SQL scheduling statement, such as call ods web.face_fibasicinfoIB_1 (); when the scheduling object is 2 (shell scheduling) and the scheduling database is "server", this field stores the original scheduling statement, interpreted as described for the previous field; when the scheduling object is 2 and the scheduling database holds a standardized "scheduling type_specific scheduling configuration ID" value such as cll-into_240122, this field holds "server machine code@scheduler number@scheduler log number", such as 1000301@240122@240122. For such a schedule, the scheduler uses the value stored in the scheduling database field to look up the matching scheduling type in the platform scheduling model configuration table (the fourth scheduling table); for cll-into_240122 it looks up the cll-into program configuration. It then splits the scheduling statement 1000301@240122@240122 of the first scheduling table on the @ symbol, splices the second number with the program path from the fourth scheduling table and the third number with the log path from the fourth scheduling table, and runs the spliced statement on the machine identified by the first number. For example, with scheduling database cll-into_240122 and scheduling statement 1000301@240122@240122, the complete scheduling statement finally spliced is: 1000301@/data/program/web/kr_web_client/linux/tools/python/bin/python3 -u /data/program/web/kr_web_client/linux/program/webcams/src/main.py 240122 >> /data/program/web/kr_web_client/linux/webcams/bin/date "+%Y%m%d%H%M%S" "240122.log 2>&1;

scheduling description: a detailed description of the function the schedule actually implements, such as: Shenzhen Stock Exchange convertible bond conversion details - crawl;

user code: records the operator code;

effective mark: 1 means effective and 0 means ineffective; when a scheduling group starts to execute, the effective schedules in the group are selected and executed, and ineffective schedules are not executed.
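The splicing of a standardized shell scheduling statement described above can be sketched as follows. This is only an illustration under stated assumptions: `MODEL_CONFIG` stands in for the platform scheduling model configuration table and contains placeholder paths, and the exact way the numbers are joined to the paths (a space before the program number, a `.log` suffix after the log number) is assumed rather than taken from the text.

```python
# Hypothetical stand-in for the platform scheduling model configuration table:
# scheduling type -> (program path, log path). The paths are placeholders.
MODEL_CONFIG = {
    "cll-into": ("/opt/example/python3 -u /opt/example/main.py", "/opt/example/logs/"),
}


def splice_statement(sched_type: str, statement: str, config=MODEL_CONFIG) -> tuple[str, str]:
    """Split '<server>@<program no.>@<log no.>' and build the command to run.

    Returns (server_id, command). The join format is an assumption.
    """
    parts = statement.split("@")
    if len(parts) != 3:
        raise ValueError("expected '<server>@<program number>@<log number>'")
    server_id, program_id, log_id = parts
    program_path, log_path = config[sched_type]
    # Second number goes with the program path, third number with the log path.
    command = f"{program_path} {program_id} >> {log_path}{log_id}.log 2>&1"
    return server_id, command
```

For instance, `splice_statement("cll-into", "1000301@240122@240122")` yields the server ID `1000301` together with a command that runs program 240122 and appends its output to a log file named after 240122.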

In a specific implementation, the method further comprises the step of obtaining scheduling group information in the second scheduling table, wherein the scheduling group information comprises scheduling tasks executed at the same time and scheduling tasks of the same item.

Table information

Table structure

As shown in the above table, the second scheduling table, i.e. the platform schedule group table, stores the information of each scheduling group and allows schedules with the same characteristics to be controlled centrally. For example, schedules executed at the same time are put in the same scheduling group for centralized management, and schedules belonging to the same item are put in the same scheduling group: a crawler program that grabs basic stock information, a loading program that stores the basic stock information in the database, and a stored procedure that processes the basic stock information into another table can all be placed in one scheduling group, so that running the scheduling group executes a whole business process from end to end;

the fields in the second scheduling table are interpreted as follows:

scheduling group code: a 6-character code starting from A00001;

scheduling group description: descriptive text expressing the overall characteristics of the schedules under the scheduling group;

user code: records the code of the operator who created the scheduling group;

effective mark: 1 means effective and 0 means ineffective; when the scheduler runs, all effective scheduling groups are selected, and the effective schedules under each group are executed according to the scheduling operation rules recorded in the platform scheduling operation table.

In a specific implementation, the method further includes obtaining a scheduling execution type in the third schedule, wherein the scheduling execution type includes serial waiting, parallel execution and abandoning execution.

Table information

Table structure

As shown in the above table, the third schedule, i.e. the platform schedule running table, is used for recording the running mode of the schedule group in the platform schedule group table.

The fields in the third schedule are defined as follows:

running code: records the scheduling group code, which associates this table with the platform schedule group table (the second scheduling table);

operation type: records the type of the scheduling group, where 1 is single execution and 2 is cyclic execution;

execution type: controls how the scheduling group is handled when run times overlap; the value is 1, 2 or 3. For example, scheduling group A is set to execute every 30 minutes; if the previous execution of A started at 09:00 and has not finished by 09:30, the 09:30 execution is handled according to exec_type as follows:

1. 1 (serial wait): the 09:30 execution waits for the 09:00 execution to finish and then runs;

2. 2 (parallel execution): the 09:30 execution runs at the same time as the 09:00 execution;

3. 3 (abandon execution): the 09:30 execution is abandoned;

operation failure result: controls how a schedule is handled when one of its upstream schedules fails; the value is 1 or 2:

1. 1 (terminate): when the upstream schedule fails, the current schedule is not executed;

2. 2 (continue): when the upstream schedule fails, the current schedule is still executed;

run time: the start time of the scheduling group. When the operating frequency is an integer multiple of one day, a single time means the group runs once per day and several times mean it runs several times per day; when the operating frequency is not an integer multiple of one day, only a single time or a time interval may be used, denoting the daily start time or the daily time interval of the cyclic run;

operating frequency: the interval between the current run and the next run, in minutes;

applicable trading-day market: the applicable market code. The trading-day and non-trading-day information of each market code is stored in odsse, trade; once a market is selected, the group runs on that market's trading days and does not run on non-trading days;

start date: the date from which each run time is calculated;

expiration date: the date on which the schedule expires; the group is not executed after this date;

addressee mailbox: the mail addresses to which scheduling abnormalities are reported;

user code: the operator code.
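The overlap handling (execution type) and the upstream-failure handling (operation failure result) described in the field list above can be sketched as two small decision functions. The enum and function names and the returned action strings are hypothetical; only the numeric values and their meanings come from the field descriptions.

```python
from enum import IntEnum


class ExecType(IntEnum):
    SERIAL_WAIT = 1  # wait for the previous run to finish, then run
    PARALLEL = 2     # run alongside the previous run
    ABANDON = 3      # drop this run


class FailurePolicy(IntEnum):
    TERMINATE = 1  # upstream failed: do not run the current schedule
    CONTINUE = 2   # upstream failed: run the current schedule anyway


def on_trigger(exec_type: int, previous_still_running: bool) -> str:
    """Decide what to do with a run that has just come due."""
    exec_type = ExecType(exec_type)  # raises ValueError for values other than 1, 2, 3
    if not previous_still_running:
        return "run"  # no overlap, so the execution type does not matter
    if exec_type is ExecType.SERIAL_WAIT:
        return "wait"
    if exec_type is ExecType.PARALLEL:
        return "run"
    return "skip"


def run_after_upstream_failure(policy: int) -> bool:
    """True if the current schedule should still run after an upstream failure."""
    return FailurePolicy(policy) is FailurePolicy.CONTINUE
```

In the 09:00 / 09:30 example above, `on_trigger(1, True)` returns `"wait"`, `on_trigger(2, True)` returns `"run"`, and `on_trigger(3, True)` returns `"skip"`.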

In an implementation, the method further includes obtaining a scheduling type, a scheduler path, and a scheduler log path in a fourth schedule.

As shown in the above figure, the fourth scheduling table, i.e., the platform scheduling model configuration table, records scheduling model configuration information.

The fields in the fourth schedule are defined as follows:

scheduling type: the type of function implemented by a specific scheduling statement in server execution scheduling; for example, file-into denotes a file loading program, API-into denotes an API access program, db-into denotes a database access program, ftp-into denotes an FTP file download program, cll-into denotes a web crawler program, nlp-into denotes an NLP announcement parsing program, and db-push denotes a database push program;

program path: the path on the server where the program for this scheduling type is stored;

log path: the path where the logs generated by running the program are stored;

by storing data in the above three fields, a scheduling statement can be expressed in a standardized way using numbers.

In a specific implementation, the method further comprises the step of obtaining an upper-level code in a fifth scheduling table, wherein the upper-level code comprises a first code, a second code and a third code, and the schedules named by the upper-level code are executed first when a scheduled task is executed; the fifth scheduling table is used for associating the first scheduling table with the second scheduling table and comprises a scheduling group code and a scheduling code, the scheduling group code being used for associating the second scheduling table and the scheduling code being used for associating the first scheduling table.

As shown in the above table, the fifth schedule, i.e. the platform schedule group association table, is used for recording the association relationship between the first schedule, i.e. the schedule in the platform schedule detail table, and the second schedule, i.e. the schedule group in the platform schedule group table.

The fields in the fifth scheduling table are defined as follows:

scheduling group code: can be associated with the scheduling group code of the platform schedule group table;

scheduling code: can be associated with the scheduling code of the platform schedule list;

priority order: Arabic numerals increasing from 1, giving the execution order of the schedules within the same scheduling group;

upper-level code: lists the schedules that should be executed before the current schedule; when a single schedule is executed, this field is looked up and the upper-level schedules are executed first. Multiple codes are joined with two connective symbols, one meaning "and" (&) and the other meaning "or";

user code: records the operator ID;

effective mark: 1 means effective and 0 means ineffective; setting it to ineffective indicates that the schedule does not belong to the scheduling group.

In a specific implementation, the method further comprises the step of obtaining log information after scheduling operation in a sixth scheduling table, wherein the sixth scheduling table comprises monitoring scheduling codes, and the monitoring scheduling codes are used for associating the scheduling codes in the first scheduling table.

Table structure

As shown in the table, the sixth scheduling table, i.e. the platform monitoring scheduling log table, records the log produced after each scheduling run, so that the front end can read this table and display the scheduling status directly on the front-end large screen, which helps in resolving scheduling errors and adjusting scheduling content. The visual large screen updates its scheduling information from all the scheduling model tables, and the scheduling page displays the state of each schedule after its most recent execution. If the schedule ran normally, the page shows a green "normal" box and the normal operation log and basic run information are recorded. If the schedule ran abnormally, the page shows a red "abnormal" box, a notification mail is sent and the abnormal operation log is recorded; business personnel can then troubleshoot from the log and, once the abnormality has been confirmed, manually change the red abnormal box to a blue "confirmed" box.

The fields in the sixth schedule are explained as follows:

log type: 1 denotes a scheduling log;

monitoring scheduling code: can be associated with the scheduling code of the platform schedule list;

operation result: the format is (normal|abnormal|confirmed)_YYYY-MM-DD HH:mm:ss.SSS;

for shell scheduling, normal means that the program's return value is 0 and abnormal means that the return value is not 0;

for SQL scheduling, the 't_error' field in the SQL statement result is checked:

- 0 represents normal

- 1 represents abnormal

the confirmed state applies after a program has run abnormally: once the relevant personnel have identified the abnormality at the front end, they manually confirm the abnormal schedule, changing its state from abnormal to confirmed;

last run time: records the start time of the schedule's most recent run;

next run time: the calculated time of the schedule's next run;

error log description: records the error and warning logs generated by the scheduling run;

correct log description: records the normal log returned when the scheduling run succeeds;

user code: records the operator's user code.
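A minimal sketch of how the operation result string could be produced from a shell return value or an SQL 't_error' value is given below. The function names are hypothetical, the English status labels are stand-ins for whatever labels the platform actually stores, and treating any non-zero 't_error' as abnormal is an assumption (the text only lists 0 and 1).

```python
from datetime import datetime


def shell_status(return_code: int) -> str:
    """Shell scheduling: return value 0 is normal, anything else is abnormal."""
    return "normal" if return_code == 0 else "abnormal"


def sql_status(t_error: int) -> str:
    """SQL scheduling: t_error 0 is normal; non-zero is treated as abnormal (assumption)."""
    return "normal" if t_error == 0 else "abnormal"


def run_result(status: str, when: datetime) -> str:
    """Format '<status>_YYYY-MM-DD HH:mm:ss.SSS' with millisecond precision."""
    if status not in ("normal", "abnormal", "confirmed"):
        raise ValueError(f"unknown status: {status}")
    millis = when.microsecond // 1000
    return f"{status}_{when.strftime('%Y-%m-%d %H:%M:%S')}.{millis:03d}"
```

The "confirmed" status is never derived from a return value in this sketch, because the text states that it is set manually at the front end after an abnormal run has been reviewed.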

In a specific implementation, the method further includes obtaining a running condition of a scheduling group in a seventh schedule, where the seventh schedule includes a scheduling group code, and the scheduling group code is used to associate the scheduling group code in the second schedule.

Table information

Table structure

As shown in the table above, the seventh scheduling table, i.e. the platform schedule log table, records the running condition of each scheduling group. It is the scheduling-group counterpart of plfm_DQA_disp_log and makes it easier for the front end to display, and for users to understand, how each scheduling group is running.

The fields in the seventh scheduling table are explained as follows:

log type: same as plfm_dqa_disp_log.log_type;

operation result: same as plfm_dqa_disp_log.run_rst;

last run time: same as plfm_dqa_disp_log.lstm_run_time;

next run time: same as plfm_dqa_disp_log.next_run_time;

user code: the operator code.

In a specific implementation, when the upper-level code comprises the first code, all upstream schedules must be completed before the current schedule is executed; when the upper-level code comprises the second code, the current schedule can be executed as soon as any one upstream schedule is completed; and when the upper-level code comprises the third code, the current schedule has no upstream schedule and is executed directly.
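The three cases above can be sketched as a readiness check on the upper-level code field. Several points here are assumptions made only for illustration: that the "all upstream" case is written with `&` between scheduling codes, that the "any one upstream" case is written with `|` (the text does not preserve the "or" symbol), that an empty field means no upstream schedule, and that mixed expressions do not need to be handled.

```python
def upstream_ready(upper_code: str, finished: set[str]) -> bool:
    """Return True if the current schedule may run, given the finished upstream codes."""
    expr = upper_code.strip()
    if not expr:
        # Third case: no upstream schedule, run directly.
        return True
    if "&" in expr and "|" in expr:
        raise ValueError("mixed '&' and '|' expressions are not handled in this sketch")
    if "|" in expr:
        # Second case: any one completed upstream schedule is enough.
        return any(code.strip() in finished for code in expr.split("|"))
    # First case (also covers a single code): every upstream schedule must be finished.
    return all(code.strip() in finished for code in expr.split("&"))
```

For example, with `finished = {"000001"}`, the expression `"000001&000002"` is not yet ready, whereas `"000001|000002"` is.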

In specific implementation, when the scheduling object type is the first scheduling type, scheduling is performed on the database, and corresponding SQL is executed; and when the scheduling object type is the second scheduling type, scheduling the remote task, and executing the corresponding shell script.

Based on the problems existing in the prior art, as shown in fig. 2, the invention also provides a data intelligent scheduling system based on centralized management, which comprises:

a scheduling information acquisition module 21, configured to acquire a scheduling statement and a scheduling object type in a first scheduling table, wherein the scheduling object type comprises a first scheduling type and a second scheduling type;

an SQL scheduling module 22, configured to execute SQL scheduling when the scheduling object type is the first scheduling type, send a scheduling operation instruction to a server, and have the server execute an SQL stored procedure on a timed basis, so that data in the database is added and/or updated on a timed basis, wherein the server comprises a plurality of local servers and/or a plurality of remote servers;

and a Shell scheduling module 23, configured to execute shell scheduling when the scheduling object type is the second scheduling type, wherein the scheduling statement comprises server ID information, a scheduling operation instruction is sent to a designated server according to the server ID information, and the designated server executes an SQL stored procedure on a timed basis, and wherein the scheduling operation instruction comprises server ID information, scheduler ID information and scheduler log ID information.

The scheduling operation processing procedure of the intelligent data scheduling method and system based on centralized management is described by a specific example:

the scheduling service is divided into a server side and a client side. The server side is responsible for the scheduling itself, i.e. deciding when to schedule and what to schedule; the client side is the end called by the scheduling service program and is mainly responsible for tasks related to the server on which it is located.

The algorithm logic of the program is as follows:

A. Deploy the scheduling server, and deploy scheduling clients to other machines as required so that the central scheduling service can call them.

B. After the server starts, it scans the second schedule table and the third schedule table for valid schedule group information, using the condition eff_flag=1, and then filters out the list of schedule groups to be executed in the next minute.

The schedule group filtering logic is as follows:

1. If the run time is configured as a time interval such as 19:20-23:55, check whether the current day is a transaction day. If it is not, the group does not need to run in the next minute and is removed from the schedule set. If it is, start from the interval's start time and repeatedly add the scheduling frequency to calculate the execution time nearest to the current time;

2. If the run time is configured as comma-separated time points such as 10:30,10:30 and 15:00, the next execution time of the configuration is determined by looping. If the running interval is evenly divisible by one day (1440 minutes), counting starts from the runTime of the latest transaction day to reduce the number of loop iterations. The calculation is the same: the scheduling frequency is added to the configured start time until the first time later than the current time is reached, which is the next scheduling time;

3. Finally, the minute following the current time is compared with the next execution time of each task group; the groups for which the two are equal form the set of schedule groups to be scheduled next, and the screening is complete.
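The core of the filtering steps above, adding the frequency to the start time until the first time later than now and then comparing against the next minute, can be sketched as follows. This is an illustrative sketch only: the function names are assumptions, the transaction-day check is omitted, and integer division replaces the loop described in the text (it yields the same result).

```python
from datetime import datetime, timedelta
from typing import Dict, List


def next_run_time(start: datetime, frequency_minutes: int, now: datetime) -> datetime:
    """Return the first time start + k * frequency that is later than now."""
    if frequency_minutes <= 0:
        raise ValueError("frequency_minutes must be positive")
    if start > now:
        return start
    step = timedelta(minutes=frequency_minutes)
    elapsed_steps = (now - start) // step  # whole intervals already elapsed
    return start + (elapsed_steps + 1) * step


def groups_due_next_minute(next_runs: Dict[str, datetime], now: datetime) -> List[str]:
    """Keep the groups whose next execution time equals the next minute."""
    next_minute = now.replace(second=0, microsecond=0) + timedelta(minutes=1)
    return sorted(code for code, run_at in next_runs.items() if run_at == next_minute)
```

For instance, a group starting at 00:05 on 2022-11-26 with a 60-minute frequency, evaluated at 10:04:30 on a later day, would be selected because its next run falls at 10:05.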

C. Assemble the code-level scheduling group operation object from the configuration information retrieved from the database.

The object information includes:

readyQueue: the task queue within the scheduling group;

disprop: the configuration information of the scheduling group;

state: the status of the scheduling group;

scheduletime: the execution time;

manual: whether the run is manual (true when triggered manually through the interface, otherwise false);

argbegindate: the start time maintained in the interface for a manual run;

argenddate: the end time maintained in the interface for a manual run;

tasks: the specific task information within the scheduling group;

the states are:

INITIALIZED: the initial state, in which the task waits for a ready notification; the task cannot be executed in this state;

READY: ready state, the round of dispatch can be executed;

RUNNING: an executing state;

SUCCESS: successful execution;

FAILED: execution failed;

KILLED: actively killed;

TIMEOUT: execution timed out.
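The scheduling group operation object and its states can be modelled compactly. The sketch below is an assumption-laden illustration rather than the actual implementation: the field names mirror the list above, but the Python types, the defaults, and the dictionary key used in the usage example are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Any, Dict, List, Optional


class State(Enum):
    INITIALIZED = "INITIALIZED"  # waiting for a ready notification
    READY = "READY"              # may be executed in this round
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"
    KILLED = "KILLED"
    TIMEOUT = "TIMEOUT"


@dataclass
class TaskGroup:
    disprop: Dict[str, Any]                  # configuration of the scheduling group
    scheduletime: datetime                   # execution time
    ready_queue: List[str] = field(default_factory=list)
    state: State = State.INITIALIZED         # assigned when the object is constructed
    manual: bool = False                     # True only for runs triggered from the interface
    argbegindate: Optional[datetime] = None  # only meaningful for manual runs
    argenddate: Optional[datetime] = None    # only meaningful for manual runs
    tasks: List[str] = field(default_factory=list)
```

A group built this way starts in INITIALIZED and is moved to READY once execution resources have been allocated.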

D. When a scheduling group operation object is constructed, its state is set to INITIALIZED and it waits to be updated to READY. A scheduling group has three execution modes: 1. serial execution; 2. parallel execution; 3. discard execution (if the previous task has not finished, the current execution is abandoned).

A thread pool is allocated for each mode according to the plfm_disp_run.exec_type field:

If the value is 1, a thread pool with one thread is allocated, so that a later task queues until the earlier task has finished executing;

if the value is 2, each task is allocated its own thread pool with one thread, so that the tasks can be executed concurrently;

if the value is 3, it is judged whether the task is currently executing; if so, the run is skipped directly and no execution resources are allocated.
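The allocation rule for the three execution modes might look like the sketch below, written with Python's standard ThreadPoolExecutor. The function name and parameters are assumptions, and the text does not say which resources mode 3 receives when nothing is running, so a single one-thread pool is assumed in that case.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional

SERIAL, PARALLEL, DISCARD = 1, 2, 3  # values of plfm_disp_run.exec_type


def allocate_pools(exec_type: int,
                   task_count: int,
                   already_running: bool = False) -> Optional[List[ThreadPoolExecutor]]:
    """Return the thread pools for a group, or None when the run is skipped."""
    if exec_type == SERIAL:
        # One single-thread pool: later tasks queue behind earlier ones.
        return [ThreadPoolExecutor(max_workers=1)]
    if exec_type == PARALLEL:
        # One single-thread pool per task, so tasks can run concurrently.
        return [ThreadPoolExecutor(max_workers=1) for _ in range(task_count)]
    if exec_type == DISCARD:
        if already_running:
            return None  # skip directly, allocate nothing
        # Assumption: an idle group in discard mode runs on one single-thread pool.
        return [ThreadPoolExecutor(max_workers=1)]
    raise ValueError(f"unknown exec_type: {exec_type}")
```

Callers are expected to shut the returned pools down once the group has finished.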

E. After the execution resources are allocated, the state field of the scheduling group execution object is set to READY and the tasks are submitted to the allocated execution resources. A random execution delay of up to 30 seconds is set so that the tasks to be executed within one minute are spread out, avoiding a sudden surge in resource usage. The tasks then wait for the thread pool to execute them, and the following log is printed: "task group {} has scheduled the next run @ {}".
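The random delay in step E can be illustrated as follows. This is a sketch under stated assumptions: a threading.Timer stands in for the thread pool submission, and the function name, the logger name, and the max_delay and rng parameters are hypothetical; only the 30-second bound and the log wording come from the text.

```python
import logging
import random
import threading
from datetime import datetime, timedelta
from typing import Callable, Optional

logger = logging.getLogger("scheduler")


def submit_with_jitter(group_code: str,
                       run: Callable[[], None],
                       max_delay: float = 30.0,
                       rng: Optional[random.Random] = None) -> threading.Timer:
    """Start `run` after a random delay of at most `max_delay` seconds."""
    rng = rng or random.Random()
    delay = rng.uniform(0.0, max_delay)
    run_at = datetime.now() + timedelta(seconds=delay)
    timer = threading.Timer(delay, run)
    timer.daemon = True  # do not keep the process alive for a pending run
    timer.start()
    logger.info("task group %s has scheduled the next run @ %s", group_code, run_at)
    return timer
```

Spreading start times over the window in this way keeps groups that are due in the same minute from all starting at the same instant.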

F. After the thread pool executes the task group, the specific execution logic is as follows:

1. changing state to RUNNING;

2. adding the scheduling group to runningTaskgroups (the set of running scheduling groups) to record the currently running scheduling group;

3. recording a pre-execution log and storing it in the database;

4. finding all tasks in the group according to the scheduling group code, and printing the log: "scheduling group {} tasks to be executed have {}";

5. calling the buildTaskDAG function to construct a directed acyclic graph of the running order of the tasks in the group.

buildTaskDAG logic: if the plfm_disp_grp_rela.supra_cd field of a task has a value, the execution of the task depends on an upstream task, and the dependency information of the task is maintained in the program; if it has no value, the execution order of the task is determined by the prio_ordr field of the scheduling configuration information, with a smaller sequence number meaning a higher priority. The state of the task is then set to INITIALIZED.

Wherein the supra_cd field is described as follows:

when the upstream codes are connected by '&', all of the upstream schedules must be completed before the current schedule can be executed;

when the upstream codes are connected by 's', the current schedule can be executed as soon as any one of the upstream schedules is completed;

when the upstream code is 'null' or an empty string, the current schedule has no upstream schedule and can be executed when the schedule group starts.
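The three supra_cd cases translate naturally into a small readiness check. The sketch below is illustrative: the function names are assumptions, a single upstream code without any connector is assumed to behave like the "all" case, and because the connector for the "any one" case is not clearly legible in the text, it is passed as a parameter with '|' used purely as a placeholder.

```python
from typing import Optional, Set, Tuple

ALL_SEP = "&"
ANY_SEP = "|"  # placeholder assumption, not confirmed by the text


def parse_supra_cd(supra_cd: Optional[str],
                   any_sep: str = ANY_SEP) -> Tuple[str, Tuple[str, ...]]:
    """Return (mode, upstream codes), where mode is 'none', 'all' or 'any'."""
    if supra_cd is None or supra_cd.strip() in ("", "null"):
        return ("none", ())
    if ALL_SEP in supra_cd:
        return ("all", tuple(c.strip() for c in supra_cd.split(ALL_SEP) if c.strip()))
    if any_sep in supra_cd:
        return ("any", tuple(c.strip() for c in supra_cd.split(any_sep) if c.strip()))
    return ("all", (supra_cd.strip(),))  # single upstream code


def can_run(supra_cd: Optional[str], completed: Set[str],
            any_sep: str = ANY_SEP) -> bool:
    """Decide whether a schedule may run given the completed upstream codes."""
    mode, upstream = parse_supra_cd(supra_cd, any_sep)
    if mode == "none":
        return True
    if mode == "all":
        return all(code in completed for code in upstream)
    return any(code in completed for code in upstream)
```

A schedule with no upstream code is runnable immediately, which matches the rule that such schedules can execute as soon as the schedule group starts.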

6. adding the sorted task information (tasks) to the scheduling execution object (TaskGroup);

7. calling TaskGroup.updateReadyQueue() to update the readyQueue of the TaskGroup;

8. updating the state of each task according to the dependency information maintained earlier, and marking tasks without a predecessor task as READY;

9. entering a while (true) infinite loop that continuously takes a task from readyQueue and calls runSingleTask. The runSingleTask execution flow is:

setting the task state to RUNNING;

printing a pre-execution log;

executing according to the configured scheduling information: for database scheduling, the corresponding SQL is executed; for remote task scheduling, the corresponding client executes the corresponding shell script; if execution fails, the state is changed to FAILED, and if execution succeeds, the state is changed to SUCCESS;

printing an execution state log and storing it in the database;

10. after the execution is finished, calling TaskGroup.updateReadyQueue() to update readyQueue, and continuing the loop;

11. when the last task is finished, the task group puts a task R (an enumeration variable) into readyQueue;

12. checking the running results of all tasks, maintaining the log information corresponding to success or failure, setting the corresponding task group state, and storing the corresponding scheduling group log in the database;

13. finally, deleting the completed schedule from runningTaskgroups.
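The ready-queue loop of steps 6 to 11 can be sketched in a single-threaded form. Everything here is a simplifying assumption made for illustration: tasks are plain dictionaries with hypothetical "prio" and "deps" keys, an upstream task counts as completed only when it reaches SUCCESS, all dependencies are treated as "all must complete", and a sentinel object stands in for the enumeration value that is placed in readyQueue after the last task.

```python
from collections import deque
from typing import Callable, Dict

END = object()  # stand-in for the end-of-group marker put into the ready queue


def run_group(tasks: Dict[str, dict], execute: Callable[[str], bool]) -> Dict[str, str]:
    """Run tasks in dependency and priority order; return the final state per task.

    tasks maps a task code to {"prio": int, "deps": set of upstream codes}.
    """
    state = {code: "INITIALIZED" for code in tasks}
    ready = deque()

    def update_ready_queue() -> None:
        done = {code for code, s in state.items() if s == "SUCCESS"}
        # Smaller prio value means higher priority.
        for code in sorted(tasks, key=lambda c: tasks[c]["prio"]):
            if state[code] == "INITIALIZED" and tasks[code]["deps"] <= done:
                state[code] = "READY"
                ready.append(code)
        if not ready:
            ready.append(END)  # nothing left to run: signal the end of the group

    update_ready_queue()
    while True:
        item = ready.popleft()
        if item is END:
            break
        state[item] = "RUNNING"
        state[item] = "SUCCESS" if execute(item) else "FAILED"
        update_ready_queue()
    return state
```

Tasks whose upstream task failed stay in INITIALIZED in this sketch; how a real deployment treats them would depend on the configured failure handling.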

The scheduling operation processing procedure of the intelligent data scheduling method and system based on centralized management is specifically illustrated below.

When the scheduling object type is the second scheduling type, shell scheduling is executed. Take the A00188 scheduling group and the 001609 schedule under it as an example; please refer to figs. 3-6.

When the scheduling object type is the first scheduling type, SQL scheduling is executed. Take the A00102 scheduling group and the 000111 schedule under it as an example; please refer to figs. 7-10.

As shown in the above table, the second schedule table in the database, i.e., the platform schedule group table, stores the schedule group codes, schedule group descriptions, and user codes of the A00102 and A00188 schedule groups, and their valid flags are 1.

As shown in the table above, the third schedule table, i.e., the platform schedule operation table, stores the operation configuration of each schedule group. For the A00102 group, the scheduling type is 2, the execution type is serial waiting, the operation failure result is failure continuation, the operation time is 00:05, and the operation frequency is 60 minutes, i.e., one hour. The applicable transaction day is empty, i.e., the group does not run according to a transaction calendar. The operation start date is 2022-11-26 and the operation expiration date is 9999-01-01. The operation times are therefore calculated from 2022-11-26: the group first runs at 00:05 each day and then runs once every hour.

For the A00188 group, the scheduling type is 2, the execution type is serial waiting, the operation failure result is failure continuation, the operation times are 08:00, 18:00 and 23:30, and the operation frequency is 1440 minutes, i.e., one day. The applicable transaction date is the uploading place, the operation start date is 2023-06-16, and the operation expiration date is 9999-01-01. From 2023-06-16, the group therefore runs once at each of 08:00, 18:00 and 23:30 every day, according to the transaction calendar of the uploading place.

As shown in the table, the first schedule table, i.e., the platform schedule list, records the specific information of each schedule. The scheduling object of the 000111 schedule is 1, i.e., SQL scheduling; its scheduling database is dwfin, and its scheduling statement is a database stored procedure statement. The scheduling description and the user code of the schedule are also recorded, and the valid flag is set to 1. When this schedule is executed within its schedule group, the SQL statement is executed locally and the corresponding stored procedure is run to update the data. The scheduling object of the 001609 schedule is 2, i.e., shell scheduling; its scheduling database is file_into_24010047, and its scheduling statement is 1000301@24010047@24010047. The scheduling description and the user code of the schedule are stored as well.
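The shell scheduling statement in the example, 1000301@24010047@24010047, appears to be three identifiers joined by '@'. A minimal parsing sketch is given below. The mapping of positions to server ID, scheduler ID and scheduler log ID is an assumption taken from the order in which this specification lists those three items; the class and function names are hypothetical.

```python
from typing import NamedTuple


class ShellInstruction(NamedTuple):
    server_id: str         # assumed to be the first segment
    scheduler_id: str      # assumed to be the second segment
    scheduler_log_id: str  # assumed to be the third segment


def parse_shell_statement(statement: str) -> ShellInstruction:
    """Split an '@'-joined shell scheduling statement into its three parts."""
    parts = statement.split("@")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three non-empty '@'-separated parts: {statement!r}")
    return ShellInstruction(*parts)
```

With the server ID extracted in this way, the central service can select the designated server to which the scheduling operation instruction is sent.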

As shown in the above table, the fifth schedule table, i.e., the platform schedule group association table, records that the 000111 schedule belongs to the A00102 schedule group and is the third schedule to run among all schedules of the A00102 schedule group, and that the 001609 schedule belongs to the A00188 schedule group and is the 7th schedule to run among all schedules of the A00188 schedule group.

As shown in the above table, the fourth schedule, that is, the platform scheduling model configuration table, records the scheduling statement templates of each shell schedule, standardizes the contents of the scheduling statements, and standardizes the storage addresses of the scheduling logs and the association relationship between the schedule and the scheduler.

As shown in the above table, the sixth schedule, i.e. the platform monitoring schedule log table, records basic information of each time the schedule group runs.

As shown in the above table, the seventh schedule, i.e., the platform schedule group log table, records basic information for each time of scheduling.

In summary, in the data intelligent scheduling method and system based on centralized management provided by the embodiments of the invention, the scheduling code, the scheduling statement and the scheduling object type in the first schedule table are acquired, wherein the scheduling object type comprises a first scheduling type and a second scheduling type. When the scheduling object type is the first scheduling type, SQL scheduling is executed: a scheduling operation instruction is sent to a server, and the server executes an SQL stored procedure at scheduled times so that data in the database is added and/or updated at scheduled times, wherein the server comprises a plurality of local servers and/or a plurality of remote servers. When the scheduling object type is the second scheduling type, shell scheduling is executed: the scheduling statement comprises server ID information, the scheduling operation instruction is sent to a designated server according to the server ID information, and the designated server executes the SQL stored procedure at scheduled times, wherein the scheduling operation instruction is a concatenation of the server ID information, the scheduler ID information and the scheduler log ID information. Automatic execution of scheduling tasks and accurate scheduling are thereby achieved.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims

1. The intelligent data scheduling method based on centralized management is characterized by comprising the following steps of:

2. The intelligent scheduling method for data based on centralized management according to claim 1, further comprising obtaining scheduling group information in the second schedule, wherein the scheduling group information includes scheduling tasks executed at the same time and scheduling tasks of the same item.

3. The intelligent scheduling method for data based on centralized management according to claim 1, further comprising obtaining a scheduling execution type in a third schedule, wherein the scheduling execution type includes serial waiting, parallel execution, and discard execution.

4. The intelligent scheduling method based on centralized management of claim 1, further comprising obtaining a scheduling type, a scheduler path, and a scheduler log path in a fourth schedule.

5. The intelligent scheduling method for data based on centralized management according to claim 2, further comprising obtaining a superior code in a fifth schedule, wherein the superior code comprises a first code, a second code and a third code, the superior code is preferentially executed when a scheduling task is executed, the fifth schedule is used for associating the first schedule with the second schedule, the fifth schedule comprises a schedule group code and a schedule code, the schedule group code is used for associating the second schedule, and the schedule code is used for associating the first schedule.

6. The intelligent scheduling method for data based on centralized management according to claim 1, further comprising obtaining log information after scheduling operation in a sixth schedule, wherein the sixth schedule includes a monitoring schedule code, and the monitoring schedule code is used for associating the schedule code in the first schedule.

7. The intelligent scheduling method for data based on centralized management according to claim 2, further comprising obtaining the running condition of a scheduling group in a seventh scheduling table, wherein the seventh scheduling table includes a scheduling group code, and the scheduling group code is used for associating the scheduling group code in the second scheduling table.

8. The intelligent scheduling method for data based on centralized management according to claim 5, wherein when the upper level code includes the first code, the upstream scheduling must be completed entirely to perform the current scheduling; when the upper level code comprises the second code, the upstream scheduling can execute the current scheduling only if one is completed; and when the upper-level code comprises the third code, the current scheduling does not have the upstream scheduling, and the current scheduling is directly executed.

9. The intelligent scheduling method for data based on centralized management according to claim 1, wherein when the scheduling object type is a first scheduling type, scheduling is performed for a database, and corresponding SQL is executed; and when the scheduling object type is the second scheduling type, scheduling the remote task, and executing the corresponding shell script.

10. An intelligent data scheduling system based on centralized management, which is characterized by comprising: