CN108829505A - A distributed scheduling system and method - Google Patents

A distributed scheduling system and method

Publication number
CN108829505A
Authority
CN
China
Prior art keywords
unit
subtask
scheduling
task
running
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810689488.1A
Other languages
Chinese (zh)
Inventor
王肖磊
刘陟
王志超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Abstract

The present invention provides a distributed scheduling system and method. The system comprises a scheduling section, a running section, and a storage section: the scheduling section includes at least one scheduling unit, the running section includes multiple running units, and the storage section includes at least one storage unit. The scheduling unit is adapted to extract a pending task from a file system and divide it into multiple subtasks; to distribute the subtasks to corresponding running units, which execute them; and to record the execution process of each subtask, generate record information, and store the record information in the storage unit. This solves the single-point problem caused by a central device handling all tasks: when a task fails or an executing device breaks down, another device node continues the execution, so failed tasks are retried automatically across multiple machines and tasks run in a timely and correct manner. If other problems occur during task execution, the system can also send alarm notifications to the user.

Description

A distributed scheduling system and method
Technical field
The present invention relates to the field of computer technology, and in particular to a distributed scheduling system and method.
Background
Heimdall is a massive-data mining and analysis system with fully independent intellectual property rights. It can mine and process massive data and provides easy-to-use tools for data mining staff and operations analysts. At present, when analysts query files with the system, the files they find are usually raw logs, which must then be processed and analyzed again. This increases the analysts' workload and lowers their efficiency. To make their work easier, further extraction and refinement of raw logs needs to be implemented directly in the Heimdall system.
However, data mining and data extraction tasks in the system are currently chained into a complete flow by scripts, a computing platform, or hard coding. In this approach a central device handles all scheduling tasks: the scheduling information of every data processing task is aggregated at this single management node, which congests the information flow. If the management node fails, the data processing tasks of the entire system are affected. In addition, the task processing efficiency of the current system is low and cannot meet user demand.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a distributed scheduling system and a corresponding method that overcome, or at least partially solve, those problems.
According to one aspect of the present invention, a distributed scheduling system is provided, comprising a scheduling section, a running section, and a storage section. The scheduling section includes at least one scheduling unit, the running section includes multiple running units, and the storage section includes at least one storage unit.
The scheduling unit is adapted to extract a pending task from a file system and to divide the pending task into multiple subtasks;
The scheduling unit is further adapted to distribute the multiple subtasks to corresponding running units, each of which executes its corresponding subtask;
The scheduling unit is further adapted to record the execution process of each subtask, generate record information, and store the record information in the storage unit.
Optionally, the system further includes:
A front end, including at least one front-end unit, the front-end unit being adapted to obtain developed code and to generate the corresponding pending task from the code;
and to upload the pending task to the file system for storage.
Optionally, the scheduling unit is further adapted to:
Divide the pending task into multiple subtasks according to the source of the pending task, and distribute the multiple subtasks to corresponding running units according to the operating status of each running unit.
Optionally, the running unit is further adapted to:
Create a corresponding running environment according to the received subtask, and execute the subtask in that running environment.
Optionally, the scheduling unit is further adapted to:
Monitor and record the execution process of the subtask in each running unit and, when the execution of any subtask is found to be abnormal, automatically start another running unit to continue executing that subtask.
Optionally, the scheduling unit is further adapted to:
Extract the meta information and temporary information of each subtask from its execution process, where the meta information includes subtask identification information and/or subtask type, and the temporary information includes subtask count and/or execution time;
Generate the record information from the meta information and the temporary information.
Optionally, the storage unit is further adapted to:
If the storage unit includes an etcd database, a MySQL database, and a Redis database, store the meta information in the etcd database and/or the MySQL database, and store the temporary information in the Redis database.
Optionally, the front-end unit is further adapted to:
Display the execution process of each subtask, monitor the displayed state, and issue an alarm notification when the displayed state is abnormal.
Optionally, the front-end unit is further adapted to:
Receive a user trigger to pause a subtask that is being executed or to start a subtask that needs to be executed.
According to another aspect of the present invention, a distributed scheduling method is also provided, including:
Extracting a pending task from a file system, and dividing the pending task into multiple subtasks;
Distributing the multiple subtasks to corresponding running units, each of which executes its corresponding subtask;
Recording the execution process of each subtask, generating record information, and storing the record information in a storage unit.
Optionally, before the pending task is extracted from the file system, the method further includes:
Obtaining developed code, and generating the corresponding pending task from the code;
Uploading the pending task to the file system for storage.
Optionally, dividing the pending task into multiple subtasks includes:
Dividing the pending task into multiple subtasks according to the source of the pending task.
Optionally, distributing the multiple subtasks to corresponding running units includes:
Distributing the multiple subtasks to corresponding running units according to the operating status of each running unit.
Optionally, executing the corresponding subtask by the running unit includes:
Creating, by the running unit, a corresponding running environment according to the received subtask, and executing the subtask in that running environment.
Optionally, the method further includes:
Monitoring and recording the execution process of the subtask in each running unit and, when the execution of any subtask is found to be abnormal, automatically starting another running unit to continue executing that subtask.
Optionally, recording the execution process of each subtask and generating record information includes:
Extracting the meta information and temporary information of each subtask from its execution process, where the meta information includes subtask identification information and/or subtask type, and the temporary information includes subtask count and/or execution time;
Generating the record information from the meta information and the temporary information.
Optionally, storing the record information in the storage unit includes:
If the storage unit includes an etcd database, a MySQL database, and a Redis database, storing the meta information in the etcd database and/or the MySQL database, and storing the temporary information in the Redis database.
Optionally, the method further includes:
Displaying the execution process of each subtask, monitoring the displayed state, and issuing an alarm notification when the displayed state is abnormal.
Optionally, the method further includes:
Receiving a user trigger to pause a subtask that is being executed or to start a subtask that needs to be executed.
According to another aspect of the present invention, a computer storage medium is also provided. The computer storage medium stores computer program code which, when run on a computer, causes the computer to execute the distributed scheduling method described in any of the above.
According to another aspect of the present invention, a computing device is also provided, including: a processor; and a memory storing computer program code. When the computer program code is run by the processor, the computing device executes the distributed scheduling method described in any of the above.
The distributed scheduling system of the present invention comprises a scheduling section, a running section, and a storage section, where the scheduling section includes at least one scheduling unit, the running section includes multiple running units, and the storage section includes at least one storage unit. First, the scheduling unit extracts a pending task from the file system and divides it into multiple subtasks. Second, the scheduling unit distributes the subtasks to corresponding running units, and each running unit executes the subtask it receives. The scheduling unit then records the execution process of each subtask, generates record information, and stores the record information in the storage unit. Through this distributed architecture of at least one scheduling unit, multiple running units, and at least one storage unit, the pending task is divided into multiple subtasks that the running units execute in parallel, which solves the single-point problem caused in the prior art by a central device handling all tasks. Because tasks are executed in parallel on multiple nodes, when a task fails or an executing device breaks down, another device node can continue the execution, so failed tasks are retried automatically across multiple machines and tasks run in a timely and correct manner. In addition, the record information is stored in real time, so that analysts or other processing devices can later look up the corresponding information, which indirectly improves the timeliness of task processing. The distributed scheduling system can also automatically classify the extracted task code, which simplifies the handling of each task: users need not care how the task code runs and only have to upload it to the scheduling system, which classifies the code and schedules it onto a suitable machine, with automatic retry on failure. Furthermore, if other problems occur during task execution, the system can send alarm notifications to the user.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
From the following detailed description of specific embodiments of the invention, read in conjunction with the accompanying drawings, the above and other objects, advantages, and features of the invention will become clearer to those skilled in the art.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings only illustrate the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is the structural schematic diagram of distributed scheduling system according to an embodiment of the invention;
Fig. 2 is another structural schematic diagram of distributed scheduling system according to an embodiment of the invention;
Fig. 3 is the design structure schematic diagram of distributed scheduling system according to an embodiment of the invention;And
Fig. 4 is the flow chart of distributed scheduling method according to an embodiment of the invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
To solve the above technical problems, an embodiment of the present invention provides a distributed scheduling system. Fig. 1 shows a structural schematic diagram of the distributed scheduling system according to one embodiment of the invention. Referring to Fig. 1, the distributed scheduling system of this embodiment comprises a scheduling section, a running section, and a storage section. The scheduling section includes at least one scheduling unit 10, the running section includes multiple running units 20, and the storage section includes at least one storage unit 30.
The function of each component of the distributed scheduling system of this embodiment, and the connections between the components, are as follows:
Scheduling unit 10 is adapted to extract a pending task from the file system, divide the acquired pending task into multiple subtasks, and distribute the subtasks to corresponding running units 20;
Running unit 20 is coupled with scheduling unit 10 and is adapted to receive and execute the subtask that scheduling unit 10 distributes to it;
Storage unit 30 is coupled with scheduling unit 10 and is adapted to receive and save the record information after scheduling unit 10 has recorded the execution process of each subtask and generated that record information.
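The interaction of the three components can be illustrated with a minimal Python sketch. It is not the patent's implementation: the class names, the `source` field used to split the task, the round-robin assignment, and the in-memory record list are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RunningUnit:
    """Hypothetical stand-in for running unit 20."""
    name: str

    def execute(self, subtask):
        # "Executing" a subtask here simply counts its items.
        return len(subtask)


@dataclass
class StorageUnit:
    """Hypothetical stand-in for storage unit 30 (in-memory only)."""
    records: list = field(default_factory=list)

    def save(self, record):
        self.records.append(record)


class SchedulingUnit:
    """Hypothetical stand-in for scheduling unit 10."""

    def __init__(self, running_units, storage):
        self.running_units = running_units
        self.storage = storage

    def split(self, task):
        # Assumed rule: one subtask per distinct source of the task items.
        groups = defaultdict(list)
        for item in task:
            groups[item["source"]].append(item)
        return dict(groups)

    def run(self, task):
        subtasks = self.split(task)
        for i, (source, subtask) in enumerate(sorted(subtasks.items())):
            # Assumed rule: round-robin assignment to running units.
            unit = self.running_units[i % len(self.running_units)]
            result = unit.execute(subtask)
            # Record the execution and hand the record to storage.
            self.storage.save({"subtask": source, "unit": unit.name, "result": result})
        return self.storage.records


pending_task = [
    {"source": "hdfs", "line": "a"},
    {"source": "s3", "line": "b"},
    {"source": "hdfs", "line": "c"},
]
storage = StorageUnit()
scheduler = SchedulingUnit([RunningUnit("worker-1"), RunningUnit("worker-2")], storage)
records = scheduler.run(pending_task)
```

In this sketch the subtasks run one after another; a real deployment would execute them in parallel on separate machines.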
It should be noted that, for convenience and clarity, Fig. 1 shows the connections of only one scheduling unit 10 with the other parts. It should be understood that every other scheduling unit 10 in the distributed scheduling system of this embodiment has the same structure and function as the one illustrated, and the same connections as shown in Fig. 1; these are not described again here or shown in the figure.
Through a distributed architecture of at least one scheduling unit 10, multiple running units 20, and at least one storage unit 30, the present invention has the scheduling unit divide the acquired pending task into multiple subtasks, which the running units 20 execute in parallel. This solves the single-point problem caused in the prior art by a central device handling all tasks. Because tasks are executed in parallel on multiple nodes, when a task fails or an executing device breaks down, another device node can continue the execution, so failed tasks are retried automatically across multiple machines and tasks run in a timely and correct manner. In addition, the record information is stored in real time, so that analysts or other processing devices can later look up the corresponding information, which indirectly improves the timeliness of task processing. The distributed scheduling system can also automatically classify the extracted task code, which simplifies the handling of each task: users need not care how the task code runs and only have to upload it to the scheduling system, which classifies the code and schedules it onto a suitable machine, with automatic retry on failure. Furthermore, if other problems occur during task execution, the system can send alarm notifications to the user.
Further, Fig. 2 shows another structural schematic diagram of the distributed scheduling system according to one embodiment of the invention. As shown in Fig. 2, the distributed scheduling system of this embodiment further includes a front end, which includes at least one front-end unit 40. Front-end unit 40 is coupled with scheduling unit 10. It is adapted to let data processing staff develop code, to generate the corresponding pending task from the resulting code, and to upload the pending task to the file system for storage.
In this embodiment, the file system can be a file system that stores massive logs, such as hdfs (Hadoop Distributed File System) or S3 (Simple Storage Service), or any other file system. The pending task stored in the file system can be a data conversion task or another kind of task; this embodiment places no specific limit on it.
After scheduling unit 10 extracts the pending task from the file system, it can divide the task into multiple subtasks according to the task's source. Scheduling unit 10 can of course also classify the pending task by any other feasible rule. Dividing the pending task into multiple subtasks is a substantial improvement over chaining the whole flow together with scripts, a computing platform, or hard coding: the flow is handled module by module, so the task modules can be processed in parallel. This improves task processing efficiency and avoids a single point of failure, so tasks are processed in a timely and stable manner.
In this embodiment, after scheduling unit 10 divides the acquired pending task into multiple subtasks, it can distribute them to corresponding running units 20 according to the state of each running unit 20, and the running units 20 then execute the subtasks in parallel. Specifically, when distributing subtasks according to the state of the running units 20, scheduling unit 10 can first determine whether each running unit 20 is currently executing a task. If the distributed scheduling system has just started, most running units 20 in the running section are usually idle. In that case, after scheduling unit 10 obtains the pending task and divides it into multiple subtasks, it can arbitrarily choose the required number of running units 20 from among the idle ones to receive the subtasks. Alternatively, the running units 20 can be sorted by address information or another unique identifier, and scheduling unit 10 then distributes the subtasks to a preset number of running units 20 in that order. The above is given only as an example and does not limit the invention.
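The idle-unit selection described above can be sketched in a few lines of Python. The dictionary layout (`address`, `busy`) and the function names are hypothetical; the embodiment does not prescribe a data model, and the plain string sort used here is only one possible ordering of addresses.

```python
def pick_running_units(units, needed):
    """Return `needed` idle running units, ordered by address."""
    idle = [u for u in units if not u["busy"]]
    if len(idle) < needed:
        raise RuntimeError(f"need {needed} idle running units, found {len(idle)}")
    # Lexicographic sort on the address string (an assumed ordering rule).
    idle.sort(key=lambda u: u["address"])
    return idle[:needed]


def dispatch(subtasks, units):
    """Map each subtask to one idle running unit, keyed by unit address."""
    chosen = pick_running_units(units, len(subtasks))
    return {unit["address"]: subtask for unit, subtask in zip(chosen, subtasks)}


units = [
    {"address": "10.0.0.3", "busy": False},
    {"address": "10.0.0.1", "busy": True},
    {"address": "10.0.0.2", "busy": False},
]
assignment = dispatch(["subtask-a", "subtask-b"], units)
```

Here the busy unit at `10.0.0.1` is skipped and the two subtasks go to the remaining units in address order.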
After running unit 20 receives its subtask, it can execute it. Specifically, in this embodiment, running unit 20 can first create a running environment suited to the received subtask and then execute the subtask in that environment. While running unit 20 executes the subtask, scheduling unit 10 can monitor and record the execution process of the subtask in each running unit 20 and, when it detects that the execution of any subtask is abnormal, automatically start another running unit 20 to continue executing that subtask. When starting another running unit 20 automatically, scheduling unit 10 can start any running unit 20 at random, select a suitable running unit 20 based on the status information of the other running units 20, or choose a running unit 20 according to preset rules. This embodiment only requires that, when a subtask fails in any running unit 20, another running unit 20 is started automatically to continue executing it, so that the subtask is processed in a timely and correct manner.
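A minimal sketch of this failover behavior follows, assuming running units are modeled as plain Python callables and that "continuing" a subtask means re-running it from the start on the next unit. Both are simplifying assumptions; the embodiment leaves the selection rule and the resume semantics open.

```python
def run_with_failover(subtask, running_units):
    """Try each (name, execute) pair in turn until one succeeds."""
    errors = []
    for name, execute in running_units:
        try:
            return name, execute(subtask)
        except Exception as exc:  # any failure counts as "abnormal execution"
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all running units failed: " + "; ".join(errors))


def broken_unit(subtask):
    # Simulates an executing device that breaks down.
    raise IOError("disk failure")


def healthy_unit(subtask):
    return subtask.upper()


unit_name, result = run_with_failover(
    "parse-log",
    [("worker-1", broken_unit), ("worker-2", healthy_unit)],
)
```

The first unit fails, so the subtask is handed to `worker-2`, which completes it.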
Further, in this embodiment, scheduling unit 10 can also extract the meta information and temporary information of each subtask from its execution process. The meta information includes the subtask's identification information and/or type, and the temporary information includes the subtask count and/or execution time. The record information is then generated from the extracted meta information and temporary information.
In addition, after scheduling unit 10 has recorded the execution process of the subtask in each running unit 20 and generated the record information, it can store that record information in storage unit 30. Specifically, in this embodiment, storage unit 30 may include an etcd database, a MySQL database, and a Redis database. etcd is a highly available key-value store mainly used for shared configuration and service discovery. The MySQL database is mainly used to store the metadata of some data, for example log metadata after it has been aggregated. Redis is an open-source, network-capable key-value database written in ANSI C; it is memory-based, can also persist data, and is log-structured. In this embodiment, when storage unit 30 includes an etcd database, a MySQL database, and a Redis database, the meta information is stored in the etcd database and/or the MySQL database, and the temporary information is stored in the Redis database. Storing the recorded information by category makes the tasks executed by the distributed scheduling system easier to identify and makes it easier for analysts or data processing devices to query and extract the corresponding information later.
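The split between meta information and temporary information can be sketched as follows. To stay self-contained, the sketch uses two plain dictionaries as stand-ins for the etcd/MySQL and Redis stores rather than real client libraries; the field names are hypothetical.

```python
# Assumed field names; the source only names the categories of information.
META_FIELDS = ("subtask_id", "subtask_type")
TEMP_FIELDS = ("subtask_count", "execution_seconds")


def store_record(record, meta_store, temp_store):
    """Route one record's fields to the meta store or the temporary store."""
    key = record["subtask_id"]
    meta_store[key] = {f: record[f] for f in META_FIELDS if f in record}
    temp_store[key] = {f: record[f] for f in TEMP_FIELDS if f in record}


meta_store = {}  # stand-in for the etcd and/or MySQL database
temp_store = {}  # stand-in for the Redis database

store_record(
    {
        "subtask_id": "t1-0",
        "subtask_type": "spark",
        "subtask_count": 4,
        "execution_seconds": 12.5,
    },
    meta_store,
    temp_store,
)
```

Keeping long-lived identifying data apart from short-lived counters mirrors the reasoning in the text: the former is queried later by analysts, the latter only matters while the task is in flight.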
Further, in this embodiment, front-end unit 40 is also adapted to display the execution process of each subtask, monitor the displayed state, and send an alarm notification to the user when the displayed state is abnormal. Front-end unit 40 can also receive a user trigger to pause a subtask that is being executed or to start a subtask that needs to be executed.
The distributed scheduling system of the present invention is described in detail below with a specific embodiment.
The core of the distributed scheduling system of this embodiment is distributed task scheduling. The task may be a data conversion task or another kind of task, and the system is an infrastructure component. The distributed scheduling system is designed on a master-slave structure, recovers automatically and retries after a task fails, and supports multiple task types, for example extracting log metadata from offline logs in the file system based on the MapReduce model and using the Spark engine, scheduling and downloading files, and loading data into hdfs for storage. MapReduce is a programming model for parallel computation over large datasets (larger than 1 TB), and Spark is a fast, general-purpose computing engine designed for large-scale data processing. The specific working process of the distributed scheduling system is now described with reference to Fig. 3.
The etcd cluster in the distributed scheduling system, that is, the master cluster (the scheduling units of the present invention), contains multiple masters. Any master can extract a pending task from file storage (S3/hdfs) and distribute it to corresponding worker nodes (the running units of the present invention). For example, the master leader can distribute the extracted task to 4 corresponding worker nodes, and these 4 worker nodes can execute the task in parallel.
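The fan-out from a master to 4 workers can be imitated on a single machine with a thread pool. This is only an analogy under stated assumptions: threads stand in for worker nodes, the task is a list of numbers, the work is a sum, and the function names are invented for the sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def worker(chunk):
    """Hypothetical worker node: processes one chunk of the task."""
    return sum(chunk)


def master_dispatch(task, worker_count=4):
    """Split the task into `worker_count` chunks and run them in parallel."""
    chunks = [task[i::worker_count] for i in range(worker_count)]
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        # pool.map returns results in the same order as the input chunks.
        return list(pool.map(worker, chunks))


partial_sums = master_dispatch(list(range(1, 9)))
total = sum(partial_sums)
```

With the numbers 1 to 8, the four chunks are `[1, 5]`, `[2, 6]`, `[3, 7]`, and `[4, 8]`, and the master combines the partial results afterwards.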
While the worker nodes execute the task, the master can record the task currently being executed and the execution process of each corresponding worker node. The master can also store the task metadata generated during execution (such as log metadata) in the etcd and MySQL databases, and store other generated records (such as the task count) and temporary information in memory or the Redis database.
When data processing tasks are scheduled with the distributed scheduling system of this embodiment, other nodes can re-execute a task after it fails on one node, which effectively prevents the single-point problem. It also greatly simplifies data processing tasks: users need not care how a task is executed. Once a task is uploaded to the distributed scheduler, it is automatically scheduled onto a suitable machine and retried on failure.
Further, an embodiment of the present invention also provides a distributed data scheduling system whose core function is scheduling and converting data. It can use the distributed scheduling system described above to schedule data analysis tasks, which can be conversion tasks or other tasks. After the distributed scheduling system has scheduled a task, for example an offline log to be processed, the distributed data scheduling system performs further data processing on the scheduled task, for example by providing elastic, programmable processing flows, data stream monitoring, easy-to-use storage, and modular data processing flows. The distributed data scheduling system can be designed on the basis of the data processing framework described above.
In this embodiment, the distributed data scheduling system redefines tasks in terms of concepts such as node, rdd, and meta. A node represents one kind of data processing; the output of one node can serve as the input of the next. Nodes are logically independent of one another but can be chained together through configuration or XML. Nodes can be of the following types:
Filter: a filter-type node processes the input rdd with a user-defined filter condition;
Event: an event-type node extracts results according to a user-defined event;
Fill: a fill-type node processes the input rdd with user-defined fill rules;
Map/reduce: a node that processes data with a map/reduce program;
Spark: a node that processes data with a Spark program;
Script: a node that processes data with a script.
rdd: a concept taken from Spark, the resilient distributed dataset. The result set of a node is an rdd. The storage of an rdd can be user-defined, or chosen automatically according to a defined data volume. An rdd can also define rules for cutting, which trim its output.
meta: metadata describing the data type each node can process. For example, when samples are processed, the data structure of a sample can be defined in the form of the virtual table described above.
It follows that the core function of the distributed data scheduling system is to execute, in each individual node, the business logic specified by that node's configuration.
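The idea that each node's output feeds the next node can be sketched with ordinary Python functions. A plain list of dictionaries stands in for an rdd, and the three node constructors below are hypothetical simplifications of the Filter, Fill, and Script node types, not the system's actual interfaces.

```python
def filter_node(condition):
    """Filter-type node: keep rows that satisfy a user-defined condition."""
    return lambda rdd: [row for row in rdd if condition(row)]


def fill_node(defaults):
    """Fill-type node: add default values for fields a row is missing."""
    return lambda rdd: [{**defaults, **row} for row in rdd]


def script_node(func):
    """Script-type node: apply a user-supplied function to every row."""
    return lambda rdd: [func(row) for row in rdd]


def run_pipeline(nodes, rdd):
    """Chain the nodes: the output of one node is the input of the next."""
    for node in nodes:
        rdd = node(rdd)
    return rdd


pipeline = [
    fill_node({"pv": 0}),
    filter_node(lambda row: row["pv"] > 10),
    script_node(lambda row: row["md5"]),
]
result = run_pipeline(
    pipeline,
    [{"md5": "aa", "pv": 50}, {"md5": "bb"}, {"md5": "cc", "pv": 11}],
)
```

Because every node has the same shape (rdd in, rdd out), nodes stay logically independent and the order of the list is the only thing that ties them together, which is what makes configuration-driven chaining possible.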
A data extraction task is illustrated below with a simple example: extracting Baidu samples.
First, md5 (Message-Digest Algorithm 5) and sha1 (Secure Hash Algorithm 1) values are extracted from the cloud antivirus scan logs. The daily pv (page views) / uv (unique visitors) of the corresponding samples is then calculated from the extracted md5 and sha1 values, and the parent_url (parent uniform resource locator) is obtained for samples whose pv/uv exceeds 1,000,000. If a sample involves Baidu, all of its child processes are obtained, and the details of the first hundred child processes are extracted and displayed.
When the distributed data scheduler of the embodiment of the present invention executes the above data extraction task, the specific steps may be:
Step1: monitor samples whose pv is greater than 1000000; the specific code may be
Step2: pull the parent_url attribute of each sample; the specific code may be
Step3: filter out the samples whose parent_url includes Baidu; the specific code may be
{filter rdd
Calculate whether parent_url includes Baidu }
Step4: filter out all child-process samples; the specific code may be
Step5: display the samples of the first one hundred child processes in the front end.
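The five steps can be strung together as a small pipeline. The following is a minimal sketch under assumptions: the record fields (`pv`, `parent_url`, `children`), the substring test for "baidu" and the in-memory sample list are hypothetical and merely stand in for the embodiment's own code.

```python
from typing import Dict, List

PV_THRESHOLD = 1_000_000  # "100w" in the example above
TOP_N = 100


def extract_baidu_children(samples: List[Dict]) -> List[str]:
    # Step1: keep samples whose pv exceeds the threshold.
    hot = [s for s in samples if s["pv"] > PV_THRESHOLD]
    # Step2: pull the parent_url attribute (missing values become "").
    with_url = [{**s, "parent_url": s.get("parent_url", "")} for s in hot]
    # Step3: keep samples whose parent_url contains "baidu".
    baidu = [s for s in with_url if "baidu" in s["parent_url"].lower()]
    # Step4: collect all child-process samples of the remaining samples.
    children = [c for s in baidu for c in s.get("children", [])]
    # Step5: hand the first one hundred child processes to the front end.
    return children[:TOP_N]


samples = [
    {"md5": "m1", "pv": 2_000_000, "parent_url": "http://www.baidu.com/x",
     "children": [f"proc{i}" for i in range(150)]},
    {"md5": "m2", "pv": 5_000_000, "parent_url": "http://example.com/y",
     "children": ["other"]},
    {"md5": "m3", "pv": 10, "parent_url": "http://www.baidu.com/z",
     "children": ["small"]},
]
shown = extract_baidu_children(samples)
```

Each numbered comment corresponds to one node in the scheduling system, so in a real deployment every step could run as a separate node whose output rdd feeds the next.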
In the embodiment of the present invention, the scheduling core, which implements the logic of the distributed data scheduling system, can string the above steps together according to the overall configuration, manage the storage associated with each rdd, and distribute the task of each node to that node for execution. In this embodiment, a single node can be executed independently.
In addition, the data processing system can provide a visual editing function through a front-end page. The configuration produced by visual editing is generated as a conf (configuration file) in json format and submitted to the scheduling core. Moreover, the front-end page can not only display the progress of each node, but can also let the user manually start or stop the task of a single node.
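To make the json conf more concrete, the sketch below builds one such configuration and derives the execution order from it. The source only states that the conf is in json format, so every key name (`task`, `nodes`, `id`, `type`, `input`, `condition`, `script`) and the `execution_order` helper are assumptions, and the helper only handles a linear chain of nodes.

```python
import json

# Hypothetical conf produced by the visual editor; every key name here is
# an assumption, since the source only says the conf is in json format.
conf = {
    "task": "extract_baidu_samples",
    "nodes": [
        {"id": "n1", "type": "filter", "input": "cloud_log", "condition": "pv > 1000000"},
        {"id": "n2", "type": "script", "input": "n1", "script": "pull_parent_url"},
        {"id": "n3", "type": "filter", "input": "n2", "condition": "parent_url contains baidu"},
    ],
}

conf_text = json.dumps(conf, indent=2)  # what the front end would submit
loaded = json.loads(conf_text)          # what the scheduling core would parse


def execution_order(cfg: dict) -> list:
    """Order nodes so that each node runs after the node named in its input."""
    by_input = {n["input"]: n["id"] for n in cfg["nodes"]}
    ids = {n["id"] for n in cfg["nodes"]}
    # The first node is the one whose input is not another node.
    current = next(n["id"] for n in cfg["nodes"] if n["input"] not in ids)
    order = [current]
    while current in by_input:
        current = by_input[current]
        order.append(current)
    return order


order = execution_order(loaded)
```

Keeping the chain in the conf rather than in code is what allows the front end to rearrange nodes without touching the logic of any single node.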
Through a distributed architecture of at least one scheduling unit, multiple running units and at least one storage unit, the present invention has the scheduling unit divide an acquired pending task into multiple subtasks, which the running units execute in parallel. This solves the single-point problem caused in the prior art by a central device processing tasks centrally. Moreover, because multiple nodes execute tasks in parallel, when a task fails or an executing device fails, another device node can continue the execution; failed tasks are thus retried automatically across machines, which ensures that tasks run promptly and correctly. Furthermore, by storing the recorded information in real time, the present invention makes it easy for analysts or other processing devices to look up and obtain the corresponding information later, which indirectly further improves the timeliness of task processing. In addition, the distributed scheduling system provided by the present invention can automatically classify the extracted task code, which facilitates the processing of each task. Users need not be concerned with how the task code runs: they only upload it to the scheduling system of the present invention, and the code to be run is automatically classified and scheduled onto a suitable machine, which further realizes automatic retry of failed tasks. Finally, if other problems occur during task execution, the system can send the user an alarm notification.
Based on the same inventive concept, an embodiment of the present invention also provides a distributed scheduling method. Fig. 4 shows a flowchart of the distributed scheduling method according to an embodiment of the present invention. As shown in Fig. 4, the distributed scheduling method includes at least steps S402 to S406:
Step S402: extract a pending task from the file system, and divide the pending task into multiple subtasks;
Step S404: distribute the multiple subtasks to corresponding running units, which execute the corresponding subtasks;
Step S406: record the execution process of each subtask to generate record information, and store the record information in the storage unit.
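Steps S402 to S406 can be sketched as follows. This is a minimal single-process sketch under assumptions: worker threads stand in for running units, a Python list stands in for the storage unit, the pending task is a list of integers, and summation is a placeholder workload. None of these choices comes from the source.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def split_task(task: List[int], parts: int) -> List[List[int]]:
    """S402: divide a pending task (here, a list of items) into subtasks."""
    return [task[i::parts] for i in range(parts)]


def run_subtask(subtask: List[int]) -> int:
    """Work done by a running unit; summing is a placeholder workload."""
    return sum(subtask)


def schedule(task: List[int], running_units: int, storage: List[Dict]) -> int:
    subtasks = split_task(task, running_units)
    # S404: distribute subtasks; each worker thread stands in for a running unit.
    with ThreadPoolExecutor(max_workers=running_units) as pool:
        results = list(pool.map(run_subtask, subtasks))
    # S406: record the execution of each subtask and store the record.
    for index, (sub, res) in enumerate(zip(subtasks, results)):
        storage.append({"subtask_id": index, "size": len(sub), "result": res})
    return sum(results)


records: List[Dict] = []
total = schedule(list(range(10)), running_units=3, storage=records)
```

In a real deployment the running units would be separate machines and the records would go to a database, but the control flow (split, dispatch in parallel, record) is the same.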
In an embodiment of the present invention, before step S402 is executed, data mining personnel may develop code, the front-end unit may generate the corresponding pending task from the code, and the generated pending task is then uploaded to the file system for storage.
In an embodiment of the present invention, when the pending task is divided into multiple subtasks in step S402, the division may take the source of the pending task into account; in step S404, the multiple subtasks are then distributed to corresponding running units according to the operating state of each running unit. Each running unit creates a running environment for the subtask it receives and executes the subtask in that running environment.
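One possible reading of "according to the operating state of each running unit" is least-loaded dispatch. The sketch below assumes that the operating state is simply the number of subtasks a unit is currently running; the source does not specify the state or the policy, so both are assumptions, as are the unit names.

```python
import heapq
from typing import Dict, List, Tuple


def dispatch(subtasks: List[str], load: Dict[str, int]) -> Dict[str, List[str]]:
    """Assign each subtask to the running unit with the fewest active subtasks."""
    heap: List[Tuple[int, str]] = [(n, unit) for unit, n in load.items()]
    heapq.heapify(heap)
    plan: Dict[str, List[str]] = {unit: [] for unit in load}
    for sub in subtasks:
        n, unit = heapq.heappop(heap)      # least-loaded unit right now
        plan[unit].append(sub)
        heapq.heappush(heap, (n + 1, unit))
    return plan


plan = dispatch(["s1", "s2", "s3", "s4"], {"unit-a": 2, "unit-b": 0, "unit-c": 1})
```

Ties are broken by unit name because the heap compares `(load, name)` tuples; a production scheduler would more likely use richer state such as CPU or memory usage.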
In an embodiment of the present invention, while the above steps are executed, the scheduling unit may also monitor and record the execution process of the subtasks in each running unit and, on detecting that any subtask executes abnormally, automatically start another running unit to continue executing that subtask. Specifically, when the execution process of each subtask is recorded to generate record information, meta information and temporary information of each subtask may be extracted from its execution process. In this embodiment, the meta information includes subtask identification information and/or the subtask type, and the temporary information includes the number of subtasks and/or the execution time. Once the meta information and temporary information have been obtained, they can be used to generate the record information.
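The failover behaviour and the generation of record information can be sketched together. In the sketch below, running units are modelled as plain callables and an abnormal execution is modelled as a `RuntimeError`; both are assumptions. The `attempts` field is also an addition for illustration, since the source lists only the number of subtasks and the execution time as temporary information.

```python
import time
from typing import Callable, Dict, List


def run_with_failover(subtask_id: str, subtask_type: str,
                      units: List[Callable[[str], str]]) -> Dict:
    """Try running units in turn until one executes the subtask."""
    attempts = 0
    start = time.monotonic()
    for unit in units:
        attempts += 1
        try:
            output = unit(subtask_id)
            break
        except RuntimeError:
            continue                      # abnormal execution: start another unit
    else:
        output = None                     # every unit failed
    # Meta information identifies the subtask; temporary information
    # describes this particular execution.
    return {
        "meta": {"subtask_id": subtask_id, "subtask_type": subtask_type},
        "temp": {"attempts": attempts, "elapsed_s": time.monotonic() - start},
        "output": output,
    }


def broken_unit(subtask_id: str) -> str:
    raise RuntimeError("running unit unavailable")


def healthy_unit(subtask_id: str) -> str:
    return f"{subtask_id}: done"


record = run_with_failover("sub-1", "filter", [broken_unit, healthy_unit])
```

Separating `meta` from `temp` in the returned record mirrors the distinction the embodiment draws between long-lived identifying data and per-execution data.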
Further, in an embodiment of the present invention, when step S406 is executed, the record information may be stored selectively according to its type. Specifically, if the storage unit includes an etcd database, a mysql database and a redis database, the meta information is stored in the etcd database and/or the mysql database, and the temporary information is stored in the redis database.
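The selective storage rule can be sketched as a small routing function. To avoid implying any particular client API, the sketch uses an in-memory `DictStore` class as a stand-in for the etcd, mysql and redis clients; the class, its `put` method and the record layout are all hypothetical.

```python
from typing import Dict


class DictStore:
    """In-memory stand-in for a database client (etcd, mysql or redis)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, object] = {}

    def put(self, key: str, value: object) -> None:
        self.data[key] = value


def store_record(record: Dict, stores: Dict[str, DictStore]) -> None:
    """Route meta information to etcd/mysql and temporary information to redis."""
    key = record["meta"]["subtask_id"]
    for name in ("etcd", "mysql"):
        if name in stores:
            stores[name].put(key, record["meta"])
    if "redis" in stores:
        stores["redis"].put(key, record["temp"])


stores = {name: DictStore(name) for name in ("etcd", "mysql", "redis")}
store_record({"meta": {"subtask_id": "sub-1", "subtask_type": "filter"},
              "temp": {"subtask_count": 4, "execution_time_s": 1.5}}, stores)
```

A plausible motivation for this split is that meta information is small and must be durable, while temporary information changes often and suits an in-memory store; the source does not state the reason, so this is an interpretation.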
In addition, while the above steps are executed, the front-end unit may display the execution process of each subtask and monitor the display state, and may issue an alarm notification when the display state is abnormal. In another optional embodiment, the front-end unit may also receive a user trigger to pause a subtask that is being executed or to start a subtask that needs to be executed.
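The front-end behaviour (display, alarm, pause and start) can be sketched as a small state holder. The `FrontEndUnit` class, the state names (`pending`, `running`, `paused`, `failed`) and the rule that `failed` counts as an abnormal state are assumptions made for illustration.

```python
from typing import Dict, List


class FrontEndUnit:
    """Hypothetical front-end view of subtask states."""

    def __init__(self) -> None:
        self.states: Dict[str, str] = {}
        self.alerts: List[str] = []

    def show(self, subtask_id: str, state: str) -> None:
        """Display a subtask state and raise an alert if it is abnormal."""
        self.states[subtask_id] = state
        if state == "failed":
            self.alerts.append(f"subtask {subtask_id} is in an abnormal state")

    def pause(self, subtask_id: str) -> None:
        """User trigger: pause a subtask that is currently running."""
        if self.states.get(subtask_id) == "running":
            self.states[subtask_id] = "paused"

    def start(self, subtask_id: str) -> None:
        """User trigger: start a subtask that is pending or paused."""
        if self.states.get(subtask_id) in ("pending", "paused"):
            self.states[subtask_id] = "running"


ui = FrontEndUnit()
ui.show("sub-1", "running")
ui.show("sub-2", "failed")
ui.pause("sub-1")
paused_state = ui.states["sub-1"]
ui.start("sub-1")
```

Guarding each transition on the current state keeps a stray click (for example, pausing a subtask that already failed) from corrupting the displayed state.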
An embodiment of the present invention also provides a computer storage medium. The computer storage medium stores computer program code which, when run on a computing device, causes the computing device to execute the distributed scheduling method of any of the above embodiments.
In addition, an embodiment of the present invention also provides a computing device, including a processor and a memory storing computer program code. When the computer program code is run by the processor, it causes the computing device to execute the distributed scheduling method of any of the above embodiments.
According to any one of the above preferred embodiments or a combination of several of them, the embodiments of the present invention can achieve the following beneficial effects:
Through a distributed architecture of at least one scheduling unit, multiple running units and at least one storage unit, the present invention has the scheduling unit divide an acquired pending task into multiple subtasks, which the running units execute in parallel. This solves the single-point problem caused in the prior art by a central device processing tasks centrally. Moreover, because multiple nodes execute tasks in parallel, when a task fails or an executing device fails, another device node can continue the execution; failed tasks are thus retried automatically across machines, which ensures that tasks run promptly and correctly. Furthermore, by storing the recorded information in real time, the present invention makes it easy for analysts or other processing devices to look up and obtain the corresponding information later, which indirectly further improves the timeliness of task processing. In addition, the distributed scheduling system provided by the present invention can automatically classify the extracted task code, which facilitates the processing of each task. Users need not be concerned with how the task code runs: they only upload it to the scheduling system of the present invention, and the code to be run is automatically classified and scheduled onto a suitable machine, which further realizes automatic retry of failed tasks. Finally, if other problems occur during task execution, the system can send the user an alarm notification.
Those skilled in the art will clearly understand that, for the specific working processes of the systems, devices and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; for brevity, they are not repeated here.
In addition, the functional units in the embodiments of the present invention may be physically independent of one another, two or more functional units may be integrated together, or all functional units may be integrated in one processing unit. The integrated functional units may be implemented in hardware, or in software or firmware.
Those of ordinary skill in the art will appreciate that, if the integrated functional units are implemented in software and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in whole or in part, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that, when run, cause a computing device (for example a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
Alternatively, all or part of the steps of the foregoing method embodiments may be completed by hardware (a computing device such as a personal computer, a server, or a network device) under the control of program instructions. The program instructions may be stored in a computer-readable storage medium; when they are executed by the processor of the computing device, the computing device executes all or part of the steps of the methods of the embodiments of the present invention.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that, within the spirit and principles of the present invention, the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced; such modifications or replacements do not cause the corresponding technical solutions to depart from the protection scope of the present invention.
According to one aspect of the present invention, there is provided A1. a distributed scheduling system, including a scheduling section, a running section and a storage section, wherein the scheduling section includes at least one scheduling unit, the running section includes multiple running units, and the storage section includes at least one storage unit;
the scheduling unit is adapted to extract a pending task from a file system and divide the pending task into multiple subtasks;
the scheduling unit is further adapted to distribute the multiple subtasks to corresponding running units, the corresponding subtasks being executed by the running units;
the scheduling unit is further adapted to record the execution process of each subtask to generate record information, and to store the record information in the storage unit.
A2. The system according to A1, further including:
a front-end section including at least one front-end unit, the front-end unit being adapted to obtain developed code and to generate the corresponding pending task from the code;
and to upload the pending task to the file system for storage.
A3. The system according to A1 or A2, wherein the scheduling unit is further adapted to:
divide the pending task into multiple subtasks with reference to the source of the pending task, and distribute the multiple subtasks to corresponding running units according to the operating state of each running unit.
A4. The system according to A1 or A2, wherein the running unit is further adapted to:
create a corresponding running environment according to the received subtask, and execute the subtask in the running environment.
A5. The system according to A4, wherein the scheduling unit is further adapted to:
monitor and record the execution process of the subtasks in each running unit and, on detecting that any subtask executes abnormally, automatically start another running unit to continue executing that subtask.
A6. The system according to A5, wherein the scheduling unit is further adapted to:
extract meta information and temporary information of each subtask from the execution process of each subtask, wherein the meta information includes subtask identification information and/or the subtask type, and the temporary information includes the number of subtasks and/or the execution time;
generate the record information from the meta information and the temporary information.
A7. The system according to A6, wherein the storage unit is further adapted to:
if the storage unit includes an etcd database, a mysql database and a redis database, store the meta information in the etcd database and/or the mysql database, and store the temporary information in the redis database.
A8. The system according to A7, wherein the front-end unit is further adapted to:
display the execution process of each subtask, monitor the display state, and issue an alarm notification when the display state is abnormal.
A9. The system according to A8, wherein the front-end unit is further adapted to:
receive a user trigger to pause a subtask that is being executed or to start a subtask that needs to be executed.
According to another aspect of the present invention, there is provided B10. a distributed scheduling method, including:
extracting a pending task from a file system, and dividing the pending task into multiple subtasks;
distributing the multiple subtasks to corresponding running units, the corresponding subtasks being executed by the running units;
recording the execution process of each subtask to generate record information, and storing the record information in a storage unit.
B11. The method according to B10, further including, before the pending task is extracted from the file system:
obtaining developed code, and generating the corresponding pending task from the code;
uploading the pending task to the file system for storage.
B12. The method according to B10 or B11, wherein dividing the pending task into multiple subtasks includes:
dividing the pending task into multiple subtasks with reference to the source of the pending task.
B13. The method according to B12, wherein distributing the multiple subtasks to corresponding running units includes:
distributing the multiple subtasks to corresponding running units according to the operating state of each running unit.
B14. The method according to B10 or B11, wherein executing the corresponding subtasks by the running units includes:
creating, by the running unit, a corresponding running environment according to the received subtask, and executing the subtask in the running environment.
B15. The method according to B14, further including:
monitoring and recording the execution process of the subtasks in each running unit and, on detecting that any subtask executes abnormally, automatically starting another running unit to continue executing that subtask.
B16. The method according to B15, wherein recording the execution process of each subtask to generate record information includes:
extracting meta information and temporary information of each subtask from the execution process of each subtask, wherein the meta information includes subtask identification information and/or the subtask type, and the temporary information includes the number of subtasks and/or the execution time;
generating the record information from the meta information and the temporary information.
B17. The method according to B16, wherein storing the record information in the storage unit includes:
if the storage unit includes an etcd database, a mysql database and a redis database, storing the meta information in the etcd database and/or the mysql database, and storing the temporary information in the redis database.
B18. The method according to B17, further including:
displaying the execution process of each subtask, monitoring the display state, and issuing an alarm notification when the display state is abnormal.
B19. The method according to B18, further including:
receiving a user trigger to pause a subtask that is being executed or to start a subtask that needs to be executed.
According to a further aspect of the present invention, there is also provided C20. a computer storage medium, the computer storage medium storing computer program code which, when run on a computer, causes the computer to execute the distributed scheduling method according to any one of B10 to B19.
According to a further aspect of the present invention, there is also provided D21. a computing device, including: a processor; and a memory storing computer program code; wherein the computer program code, when run by the processor, causes the computing device to execute the distributed scheduling method according to any one of B10 to B19.

Claims (10)

1. A distributed scheduling system, including a scheduling section, a running section and a storage section, wherein the scheduling section includes at least one scheduling unit, the running section includes multiple running units, and the storage section includes at least one storage unit;
the scheduling unit is adapted to extract a pending task from a file system and divide the pending task into multiple subtasks;
the scheduling unit is further adapted to distribute the multiple subtasks to corresponding running units, the corresponding subtasks being executed by the running units;
the scheduling unit is further adapted to record the execution process of each subtask to generate record information, and to store the record information in the storage unit.
2. The system according to claim 1, further including:
a front-end section including at least one front-end unit, the front-end unit being adapted to obtain developed code and to generate the corresponding pending task from the code;
and to upload the pending task to the file system for storage.
3. The system according to claim 1 or 2, wherein the scheduling unit is further adapted to:
divide the pending task into multiple subtasks with reference to the source of the pending task, and distribute the multiple subtasks to corresponding running units according to the operating state of each running unit.
4. The system according to claim 1 or 2, wherein the running unit is further adapted to:
create a corresponding running environment according to the received subtask, and execute the subtask in the running environment.
5. The system according to claim 4, wherein the scheduling unit is further adapted to:
monitor and record the execution process of the subtasks in each running unit and, on detecting that any subtask executes abnormally, automatically start another running unit to continue executing that subtask.
6. The system according to claim 5, wherein the scheduling unit is further adapted to:
extract meta information and temporary information of each subtask from the execution process of each subtask, wherein the meta information includes subtask identification information and/or the subtask type, and the temporary information includes the number of subtasks and/or the execution time;
generate the record information from the meta information and the temporary information.
7. The system according to claim 6, wherein the storage unit is further adapted to:
if the storage unit includes an etcd database, a mysql database and a redis database, store the meta information in the etcd database and/or the mysql database, and store the temporary information in the redis database.
8. A distributed scheduling method, including:
extracting a pending task from a file system, and dividing the pending task into multiple subtasks;
distributing the multiple subtasks to corresponding running units, the corresponding subtasks being executed by the running units;
recording the execution process of each subtask to generate record information, and storing the record information in a storage unit.
9. A computer storage medium, the computer storage medium storing computer program code which, when run on a computer, causes the computer to execute the distributed scheduling method according to claim 8.
10. A computing device, including: a processor; and a memory storing computer program code; wherein the computer program code, when run by the processor, causes the computing device to execute the distributed scheduling method according to claim 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810689488.1A CN108829505A (en) 2018-06-28 2018-06-28 A kind of distributed scheduling system and method

Publications (1)

Publication Number Publication Date
CN108829505A true CN108829505A (en) 2018-11-16

Family

ID=64133787

Country Status (1)

Country Link
CN (1) CN108829505A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181116