CN114020566A - Job monitoring method, device, medium and electronic equipment of job scheduling system - Google Patents

Job monitoring method, device, medium and electronic equipment of job scheduling system

Info

Publication number
CN114020566A
Authority
CN
China
Prior art keywords
job
information
monitoring
execution unit
state information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111266458.8A
Other languages
Chinese (zh)
Inventor
洪钧煌
王呈炎
陈良龙
苏建清
林淇
翁志山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCB Finetech Co Ltd
Original Assignee
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCB Finetech Co Ltd filed Critical CCB Finetech Co Ltd
Priority to CN202111266458.8A priority Critical patent/CN114020566A/en
Publication of CN114020566A publication Critical patent/CN114020566A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3017Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is implementing multitasking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiments of the application disclose a job monitoring method, apparatus, medium and electronic device for a job scheduling system, and relate to the technical field of big data. The method comprises: if creation of a job execution unit is detected, acquiring execution unit information; if initialization of a job pipeline is detected, acquiring job pipeline information; during job execution, acquiring state information of the job according to a preset monitoring data acquisition rule through buried points (instrumentation hooks) embedded in advance on the job critical path of the job scheduling system; and aggregating the execution unit information, the job pipeline information and the state information, and pushing the aggregation result to a monitoring database for storage. The technical solution solves the problem that a job scheduling system which must support multiple big data frameworks has to be developed and integrated with each framework separately, and achieves general and effective job monitoring.

Description

Job monitoring method, device, medium and electronic equipment of job scheduling system
Technical Field
The embodiment of the application relates to the technical field of big data, in particular to a job monitoring method, a job monitoring device, a job monitoring medium and electronic equipment of a job scheduling system.
Background
With the rapid development of internet big data platforms, job scheduling systems or workflow systems have become important components of most big data platforms. A job scheduling system provides a job monitoring function that monitors the real-time running state of jobs and collects and reports indexes such as job running time and resource consumption, supporting resource consumption statistics and optimization of the job pipeline.
In the prior art, the job scheduling system interfaces with the relevant services of a big data framework, collects the real-time running state of jobs through a service interface, and obtains the indexes that the service interface provides. However, interfacing with framework services has the following drawbacks: if customized indexes need to be collected, the related service components require secondary development. In addition, the service interfaces supported by different big data frameworks differ, so a job scheduling system that must support multiple big data frameworks has to be developed and integrated with each of them separately.
Disclosure of Invention
The embodiments of the application provide a job monitoring method, apparatus, medium and electronic device for a job scheduling system, which solve the technical problem that a job scheduling system that must support multiple big data frameworks has to be developed and integrated with each framework separately, and achieve general and effective job monitoring.
In a first aspect, an embodiment of the present application provides a job monitoring method for a job scheduling system, where the method includes:
if creation of a job execution unit is detected, acquiring execution unit information;
if initialization of a job pipeline is detected, acquiring job pipeline information;
during job execution, acquiring state information of the job according to a preset monitoring data acquisition rule through buried points embedded in advance on the job critical path of the job scheduling system;
and aggregating the execution unit information, the job pipeline information and the state information, and pushing the aggregation result to a monitoring database for storage.
In a second aspect, an embodiment of the present application provides a job monitoring apparatus of a job scheduling system, where the apparatus includes:
an execution unit information acquisition module, configured to acquire execution unit information if creation of a job execution unit is detected;
a job pipeline information acquisition module, configured to acquire job pipeline information if initialization of a job pipeline is detected;
a state information acquisition module, configured to acquire, during job execution, state information of the job according to a preset monitoring data acquisition rule through buried points embedded in advance on the job critical path of the job scheduling system;
and an aggregation storage module, configured to aggregate the execution unit information, the job pipeline information and the state information, and to push the aggregation result to a monitoring database for storage.
In a third aspect, an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements a job monitoring method of a job scheduling system according to an embodiment of the present application.
In a fourth aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the job monitoring method of the job scheduling system according to the embodiment of the present application.
According to the technical solution provided by the embodiments of the application, execution unit information is acquired if creation of a job execution unit is detected; job pipeline information is acquired if initialization of a job pipeline is detected; during job execution, state information of the job is acquired according to a preset monitoring data acquisition rule through buried points embedded in advance on the job critical path of the job scheduling system; and the execution unit information, the job pipeline information and the state information are aggregated and the aggregation result is pushed to a monitoring database for storage. Because buried points on the job critical path are used instead of interface integration, the solution solves the technical problem that the service interfaces supported by different big data frameworks differ, so that a job scheduling system supporting multiple big data frameworks would have to be developed and integrated with each one separately, and achieves general and effective job monitoring.
Drawings
Fig. 1 is a flowchart of a job monitoring method of a job scheduling system according to an embodiment of the present application;
Fig. 2 is a flowchart of a job monitoring method of a job scheduling system according to a second embodiment of the present invention;
Fig. 3 is a flowchart of a job monitoring method of a job scheduling system according to a third embodiment of the present invention;
Fig. 4 is a block diagram of a job monitoring apparatus of a job scheduling system according to a fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to a sixth embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be further noted that, for the convenience of description, only some of the structures related to the present application are shown in the drawings, not all of the structures.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Example one
Fig. 1 is a flowchart of a job monitoring method of a job scheduling system according to an embodiment of the present application, where the present embodiment is applicable to a scenario of monitoring a job running condition in a big data platform, and the method may be executed by a job monitoring apparatus of the job scheduling system according to the embodiment of the present application, where the apparatus may be implemented by software and/or hardware, and may be integrated in an electronic device.
As shown in fig. 1, the job monitoring method of the job scheduling system includes:
S110, if creation of a job execution unit is detected, acquiring the execution unit information.
In the present application, the job execution unit and the execution unit mean the same. The job Execution Unit (EU) represents a virtual module for executing the job logic code, and may be, for example, a job Execution container, or a job loading process, or a logically defined executor Unit, such as Task Slot of Flink. That is, the job execution unit is a functional module that can complete a certain function, corresponds to a process or a virtual machine that can execute a certain job, resides in the scheduling system, and executes a specific job scheduling task.
When the job scheduling system initializes, it creates a job execution unit for completing the job and generates execution unit information for that execution unit. The execution unit information mainly includes an Execution Unit Identity (EUID). The EUID may be generated by an algorithm with good randomness, and a different EUID is assigned to each execution unit so that the execution units can be distinguished.
In practice, after a job is completed in one execution unit, it may enter another execution unit for further processing; that is, a pending job may involve multiple execution units. For example, the original pending job is processed by execution unit A to obtain a second job, the second job is processed by execution unit B to obtain a third job, and the third job is processed by execution unit C, which completes the pending job. In this case, the job monitoring apparatus assigns different EUIDs to execution unit A, execution unit B and execution unit C. If the three execution units shared the same EUID, the system would become more complex and consume more resources.
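As a minimal sketch of EUID assignment, the snippet below uses Python's `uuid.uuid4` as the "algorithm with good randomness". The choice of UUIDs and the helper name `new_euid` are assumptions for illustration; the source does not name a specific algorithm.

```python
import uuid


def new_euid() -> str:
    """Return a random execution unit ID (hypothetical helper, not from the source)."""
    return uuid.uuid4().hex


# Execution units A, B and C each receive their own EUID.
euids = {name: new_euid() for name in ("A", "B", "C")}
```

Random 128-bit identifiers make collisions between execution units practically negligible without requiring a central ID allocator.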
S120, if initialization of the job pipeline is detected, acquiring the job pipeline information.
After receiving a pending job, the job scheduling system establishes a job pipeline for it. A job pipeline is generally executed in stages across multiple execution units. For example, if the pending job must pass through execution unit A, execution unit B and execution unit C in sequence, a job pipeline is created for the pending job and corresponding information is set for it; the job pipeline information mainly includes a job pipeline ID (Trace ID). As the pending job flows through the execution units of the job pipeline, the job pipeline ID flows with it, so that it is possible to identify which of the execution units is currently processing the pending job and which step of the job pipeline has been reached. If the job monitoring apparatus detects that the job pipeline has been initialized, it acquires the corresponding job pipeline ID.
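The way a job pipeline ID travels with the pending job can be sketched as follows. The class `JobPipeline`, its `enter` method and the `eu-A`/`eu-B`/`eu-C` identifiers are hypothetical names chosen for this illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class JobPipeline:
    # Trace ID generated once, when the pipeline is initialized.
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    # (trace_id, euid) pairs, in the order the job visited each execution unit.
    stages: list = field(default_factory=list)

    def enter(self, euid: str) -> None:
        """Record that the pending job has moved into the given execution unit."""
        self.stages.append((self.trace_id, euid))


pipeline = JobPipeline()
for euid in ("eu-A", "eu-B", "eu-C"):
    pipeline.enter(euid)
```

Every stage carries the same Trace ID, so the position of the job in the pipeline can be read from the last recorded pair.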
S130, during job execution, acquiring the state information of the job according to a preset monitoring data acquisition rule through buried points embedded in advance on the job critical path of the job scheduling system.
In this technical solution, optionally, the buried points of the job critical path are embedded in the job scheduling system as follows: determining the job key nodes in the job execution path to form a critical path; and registering a callback function for each job key node, where the callback function is triggered when execution reaches that job key node, thereby forming a buried point (an instrumentation hook) of the critical path.
The job execution path refers to the entire process that the pending job goes through from the start of the job to its completion. For example, after the pending job enters execution unit A, it may experience one or more of the following events in execution unit A: job creation, job completion, job pause, job resumption and job sampling. The time points of these events are the key time nodes of the pending job, also called job key nodes, and all the job key nodes together form the critical path of the pending job.
A callback function is registered for each job key node, and these callback functions are the buried points of the critical path. For example, callback functions are registered for job creation, job completion, job pause, job resumption and job sampling respectively. When the pending job enters execution unit A and is paused, the callback function corresponding to job pause is triggered, and the triggered callback function acquires the state information of the pending job according to the preset rule.
The advantage of this arrangement is that no interface integration is required, and the existing system under the big data framework needs less modification.
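The callback-based buried points described above can be sketched as a small registry. The class `Instrumentation`, the event names in `KEY_NODES` and the `record_state` callback are assumptions made for this example, not identifiers from the source.

```python
import time
from collections import defaultdict

# Job key nodes named in the description (the string labels are hypothetical).
KEY_NODES = ("create", "complete", "pause", "resume", "sample")


class Instrumentation:
    """Registry that maps each job key node to its callback functions."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def register(self, node, callback):
        if node not in KEY_NODES:
            raise ValueError(f"unknown key node: {node}")
        self._callbacks[node].append(callback)

    def fire(self, node, job_id):
        # Called by the scheduler when execution reaches a job key node.
        for callback in self._callbacks[node]:
            callback(node, job_id)


collected = []


def record_state(node, job_id):
    # Preset acquisition rule (assumed here): store the event and a timestamp.
    collected.append({"job": job_id, "event": node, "ts": time.time()})


inst = Instrumentation()
for node in KEY_NODES:
    inst.register(node, record_state)

inst.fire("create", "job-1")
inst.fire("pause", "job-1")
```

Because the scheduler only calls `fire` at its own key nodes, no framework-specific service interface is involved.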
S140, aggregating the execution unit information, the job pipeline information and the state information, and pushing the aggregation result to a monitoring database for storage.
The monitoring database is a storage area in the job monitoring apparatus of the job scheduling system. Aggregating the execution unit information, the job pipeline information and the state information means collecting them, converting them to a unified format and organizing them, after which the result is placed in the storage area.
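A minimal sketch of the aggregation and storage step, assuming an in-memory SQLite table stands in for the monitoring database. The table name `monitor`, its columns and the helpers `aggregate` and `push` are hypothetical; the source does not specify a storage engine or schema.

```python
import sqlite3


def aggregate(euid, trace_id, state):
    """Merge the three kinds of information into one uniformly formatted record."""
    return {"euid": euid, "trace_id": trace_id,
            "event": state["event"], "ts": state["ts"]}


# In-memory database standing in for the monitoring database (assumption).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE monitor (trace_id TEXT, euid TEXT, event TEXT, ts REAL)")


def push(record):
    db.execute(
        "INSERT INTO monitor VALUES (:trace_id, :euid, :event, :ts)", record
    )
    db.commit()


push(aggregate("eu-A", "trace-1", {"event": "create", "ts": 100.0}))
push(aggregate("eu-A", "trace-1", {"event": "complete", "ts": 160.5}))

rows = db.execute(
    "SELECT euid, event, ts FROM monitor WHERE trace_id = ? ORDER BY ts",
    ("trace-1",),
).fetchall()
```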
In this technical solution, optionally, after the aggregation result is pushed to the monitoring database for storage, the event corresponding to each piece of state information on the job pipeline may be determined, in response to a job query request, according to the job pipeline information to which the job belongs.
That is, the user can query the information stored in the monitoring database as needed to determine what state the pending job has reached. Optionally, the user may query through a query interface: the state information of the job pipeline of interest is aggregated through a query menu and displayed on the query interface. Since the job pipeline ID correlates the events of every execution unit on the job pipeline, global monitoring data for the job pipeline can be obtained.
The advantage of this arrangement is that the user can conveniently query the state information to determine the progress of the pending job.
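Answering a job query by job pipeline ID can be sketched as a filter and sort over stored records. The record layout and the function `query_pipeline` are assumptions for illustration.

```python
def query_pipeline(records, trace_id):
    """Return the events of one job pipeline, ordered by timestamp."""
    timeline = sorted(
        (r for r in records if r["trace_id"] == trace_id),
        key=lambda r: r["ts"],
    )
    return [(r["euid"], r["event"], r["ts"]) for r in timeline]


# Records as they might arrive from different execution units, out of order.
records = [
    {"trace_id": "t1", "euid": "eu-B", "event": "create", "ts": 5.0},
    {"trace_id": "t1", "euid": "eu-A", "event": "create", "ts": 1.0},
    {"trace_id": "t2", "euid": "eu-A", "event": "create", "ts": 2.0},
    {"trace_id": "t1", "euid": "eu-A", "event": "complete", "ts": 4.0},
]

timeline = query_pipeline(records, "t1")
```

The result interleaves events from execution units `eu-A` and `eu-B` into one global view of pipeline `t1`, while records of pipeline `t2` are excluded.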
According to the technical solution provided by this embodiment of the application, execution unit information is acquired if creation of a job execution unit is detected; job pipeline information is acquired if initialization of a job pipeline is detected; during job execution, state information of the job is acquired according to a preset monitoring data acquisition rule through buried points embedded in advance on the job critical path of the job scheduling system; and the execution unit information, the job pipeline information and the state information are aggregated and the aggregation result is pushed to a monitoring database for storage. Because buried points on the job critical path are used instead of interface integration, the solution solves the technical problem that the service interfaces supported by different big data frameworks differ, so that a job scheduling system supporting multiple big data frameworks would have to be developed and integrated with each one separately, and achieves general and effective job monitoring.
Example two
Fig. 2 is a flowchart of a job monitoring method of a job scheduling system according to a second embodiment of the present invention; this embodiment is optimized on the basis of the above embodiment. Specifically, acquiring the state information of the job according to a preset monitoring data acquisition rule through buried points embedded in advance on the job critical path of the job scheduling system during job execution comprises: during job execution, acquiring current state information of the job through the buried points embedded in advance on the job critical path of the job scheduling system; identifying whether the event corresponding to the current state information has a preceding event; and if a preceding event exists, acquiring the state information of the preceding event in addition to the current state information. Correspondingly, aggregating the execution unit information, the job pipeline information and the state information, and pushing the aggregation result to a monitoring database for storage comprises: aggregating the execution unit information, the job pipeline information, the current state information and the state information of the preceding event, and pushing the aggregation result to a monitoring database for storage.
As shown in fig. 2, the method of the present embodiment specifically includes the following steps, and for brevity of description, detailed description of the steps that are the same as or similar to those of the first embodiment is omitted:
S210, if creation of a job execution unit is detected, acquiring the execution unit information.
S220, if initialization of the job pipeline is detected, acquiring the job pipeline information.
S230, during job execution, acquiring the current state information of the job through the buried points embedded in advance on the job critical path of the job scheduling system.
In the second embodiment, the state information of the job is divided into the current state information of the job and the state information of the preceding event. For example, after the pending job enters execution unit A, if the job has already proceeded to the job resumption stage, the preceding events of the pending job include job pause.
S240, identifying whether the event corresponding to the current state information has a preceding event; if so, executing S260; if not, executing S250.
For example, after the pending job enters execution unit A, if the job is at the job creation stage and has no preceding job, the job creation event is the first event on the whole job pipeline, and S250 is executed; if the job has proceeded to the job resumption stage, the current event has a preceding event, such as job pause, and S260 is executed.
S250, aggregating the execution unit information, the job pipeline information and the state information, and pushing the aggregation result to a monitoring database for storage.
S260, acquiring the state information of the preceding event in addition to the current state information.
Like the current state information, the state information of the preceding event may be a timestamp or other relevant information that indicates the progress of the preceding event.
S270, aggregating the execution unit information, the job pipeline information, the current state information and the state information of the preceding event, and pushing the aggregation result to a monitoring database for storage.
The monitoring database is a storage area in the job monitoring apparatus of the job scheduling system. Aggregating the execution unit information, the job pipeline information, the current state information and the state information of the preceding event means collecting them, converting them to a unified format and organizing them, after which the result is placed in the storage area.
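The branch between S250 and S260/S270 can be sketched as below. The sketch assumes that the preceding event of an event is the most recent event already recorded on the same job pipeline; the source does not define how the preceding event is located, and the function `collect` and the record fields are hypothetical.

```python
def collect(history, euid, trace_id, event, ts):
    """Build the aggregated record for one event and append it to history."""
    record = {"euid": euid, "trace_id": trace_id, "event": event, "ts": ts}
    # S240: look for a preceding event on the same job pipeline (assumed rule:
    # the most recent event already recorded for this trace ID).
    previous = next(
        (h for h in reversed(history) if h["trace_id"] == trace_id), None
    )
    if previous is not None:
        # S260/S270: attach the state information of the preceding event.
        record["preceding"] = {"event": previous["event"], "ts": previous["ts"]}
    # S250 or S270: hand the record over for storage.
    history.append(record)
    return record


history = []
created = collect(history, "eu-A", "t1", "create", 1.0)
paused = collect(history, "eu-A", "t1", "pause", 2.0)
resumed = collect(history, "eu-A", "t1", "resume", 3.0)
```

Here `created` has no `preceding` entry because job creation is the first event on the pipeline, while `resumed` carries the state information of the job pause event.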
According to the technical solution provided by this embodiment of the application, the current state information of the job is acquired through the buried points embedded in advance on the job critical path of the job scheduling system; whether the event corresponding to the current state information has a preceding event is identified; if a preceding event exists, the state information of the preceding event is acquired in addition to the current state information; and the execution unit information, the job pipeline information, the current state information and the state information of the preceding event are aggregated and the aggregation result is pushed to a monitoring database for storage. This solves the technical problem that the information obtained during job monitoring is incomplete, and allows indexes and data to be monitored at a finer granularity.
On the basis of the above technical solutions, optionally, acquiring the state information of the job according to a preset monitoring data acquisition rule comprises: using the current timestamp information as the state information of the job.
For example, where the events include job creation, job completion, job pause, job resumption and job sampling, the events always occur sequentially within the same execution unit. Therefore, when the state information of the job is represented by an event ID, the event ID should include the time information of the event, and the current timestamp can be used as the event ID.
The advantage is that, with timestamp information as the state information of the job, indexes such as job running time can be computed directly from the state information, which saves storage resources and computation cost.
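With timestamps as state information, a running-time index follows from simple subtraction. The sketch below additionally excludes paused intervals, which is an assumption about how such an index might be defined; the function `active_seconds` and the event labels are hypothetical.

```python
def active_seconds(events):
    """Sum the time a job spent running, given (event, timestamp) pairs in order."""
    total = 0.0
    running_since = None
    for name, ts in events:
        if name in ("create", "resume"):
            running_since = ts
        elif name in ("pause", "complete") and running_since is not None:
            total += ts - running_since
            running_since = None
    return total


events = [("create", 0.0), ("pause", 10.0), ("resume", 25.0), ("complete", 40.0)]
running = active_seconds(events)            # 10 s before the pause + 15 s after
wall_clock = events[-1][1] - events[0][1]   # completion minus creation
```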
Optionally, a source library (Parent) may also be created in this embodiment, because a dependency relationship generally exists between the jobs on a job pipeline. For example, the original pending job is processed by execution unit A to obtain a second job, the second job is processed by execution unit B to obtain a third job, and the third job is processed by execution unit C, which completes the pending job. In this case, the original pending job, the second job and the third job have a defined order and depend on one another. The original pending job, the second job and the third job are all called sub-jobs of the pending job.
When a sub-job is created, the related events of its preceding jobs and the job pipeline ID may be recorded in the source library of the sub-job being created, so that the entire flow of job completion can later be collected, analyzed and reconstructed. For example, when the third job is created, each event of the original pending job and of the second job, together with the job pipeline ID, may be extracted and recorded; since there are two preceding jobs at this point, two source libraries are created, storing the relevant information of the original pending job and of the second job respectively.
The advantage is that creating one or more source libraries in a sub-job makes it convenient to trace the history of each sub-job, which facilitates the collection and analysis of various kinds of information.
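One possible shape for the source library is one `Parent` record per preceding job, copied into the sub-job at creation time. The dataclasses `Parent` and `SubJob`, their fields and the helper `create_sub_job` are assumptions for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Parent:
    """Snapshot of one preceding job (hypothetical layout of a source library)."""
    trace_id: str
    job_name: str
    events: list


@dataclass
class SubJob:
    name: str
    trace_id: str
    events: list = field(default_factory=list)
    parents: list = field(default_factory=list)


def create_sub_job(name, trace_id, predecessors):
    """Create a sub-job and record one Parent per preceding job."""
    job = SubJob(name, trace_id)
    for p in predecessors:
        # Copy the event list so later changes to p do not alter the snapshot.
        job.parents.append(Parent(p.trace_id, p.name, list(p.events)))
    return job


original = SubJob("original", "t1", events=[("create", 1.0), ("complete", 2.0)])
second = create_sub_job("second", "t1", [original])
second.events = [("create", 2.0), ("complete", 3.5)]
third = create_sub_job("third", "t1", [original, second])
```

The third job ends up with two `Parent` records, matching the two source libraries in the example above.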
Example three
Fig. 3 is a flowchart of a job monitoring method of a job scheduling system according to a third embodiment of the present invention; this embodiment is optimized on the basis of the first embodiment. Specifically, after acquiring the job pipeline information, the method further comprises: during job execution, acquiring custom information of the job according to pre-configured custom monitoring indexes; and aggregating the execution unit information, the custom information, the job pipeline information and the state information, and pushing the aggregation result to a monitoring database for storage.
As shown in fig. 3, the method of the present embodiment specifically includes the following steps, and for brevity of description, detailed description of the steps that are the same as or similar to those of the first embodiment is omitted:
S310, if creation of a job execution unit is detected, acquiring the execution unit information.
S320, if initialization of the job pipeline is detected, acquiring the job pipeline information.
S330, during job execution, acquiring the custom information of the job according to the pre-configured custom monitoring indexes.
In this technical solution, optionally, the configuration process of the custom monitoring indexes comprises: displaying a configuration interface in response to an operation that triggers the configuration interface; and, in response to a selection request for custom monitoring indexes, determining the selected candidate indexes as the custom monitoring indexes, where the custom monitoring indexes are acquired through the buried points of the job critical path and/or through custom buried points.
In the foregoing embodiments, the state information of the job is acquired through the buried points, and the states of the job are distinguished by the defined events. As described in the second embodiment, the state information of the job may be the current timestamp information, that is, the timestamps of job creation, job completion, job pause, job resumption and job sampling. The custom information in S330 is additional information that the user configures to be acquired alongside the state information of the job, such as CPU utilization, memory utilization and remaining storage space. The custom information is acquired according to the pre-configured custom monitoring indexes.
The advantage is that the user can select custom monitoring indexes from a defined set of candidates, which gives users who are unfamiliar with the job scheduling system a bounded range to choose from.
In this technical solution, optionally, the custom monitoring indexes include at least one of a job running time index, a resource consumption index and an error reporting index; the selection request for custom monitoring indexes includes either the selection result of the custom monitoring indexes or a selection rule for the custom monitoring indexes.
The buried point of a custom monitoring index may be located at the same position as a buried point for the job state information, or it may be embedded separately at a different position. As with the state information of the job, a custom monitoring index is embedded by registering a callback function.
The advantage of this arrangement is that it provides concrete examples of custom monitoring indexes.
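Selecting custom monitoring indexes from a set of candidates and collecting them in a callback can be sketched as follows. The candidate names, the collectors behind them and the helpers `select_metrics` and `make_callback` are assumptions for illustration; only standard-library calls are used.

```python
import shutil
import time

# Candidate indexes offered for selection (names and collectors are hypothetical).
CANDIDATES = {
    "cpu_seconds": time.process_time,
    "free_disk_bytes": lambda: shutil.disk_usage(".").free,
    "wall_clock": time.time,
}


def select_metrics(names):
    """Validate a selection request and return the chosen collectors."""
    unknown = set(names) - set(CANDIDATES)
    if unknown:
        raise KeyError(f"not a candidate index: {sorted(unknown)}")
    return {n: CANDIDATES[n] for n in names}


def make_callback(selected, sink):
    """Build a buried-point callback that gathers the selected custom indexes."""
    def on_key_node(node, job_id):
        sink.append({
            "job": job_id,
            "event": node,
            "custom": {n: collect() for n, collect in selected.items()},
        })
    return on_key_node


sink = []
callback = make_callback(select_metrics(["cpu_seconds", "free_disk_bytes"]), sink)
callback("sample", "job-1")
```

Restricting the selection to `CANDIDATES` mirrors the bounded choice described above: a request for an index outside the candidate set is rejected instead of being silently ignored.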
S340, during job execution, acquiring the state information of the job according to a preset monitoring data acquisition rule, through the buried points of the job critical path pre-buried in the job scheduling system.
S350, aggregating the execution unit information, the custom information, the job pipeline information and the state information, and pushing the aggregation result to a monitoring database for storage.
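A minimal sketch of the aggregation step S350 is given below, assuming each kind of information is a flat dictionary and using an in-memory list as a stand-in for the monitoring database. The function name `aggregate`, the class `InMemoryMonitoringStore`, and all field names are hypothetical.

```python
import json


def aggregate(execution_unit, pipeline, state, custom=None):
    """Combine the separately collected pieces into one monitoring record."""
    record = {
        "execution_unit": dict(execution_unit),
        "job_pipeline": dict(pipeline),
        "state": dict(state),
    }
    if custom:
        # Custom information is only present when custom indexes are configured.
        record["custom"] = dict(custom)
    return record


class InMemoryMonitoringStore:
    """Stand-in for the monitoring database; a real system would write to a DB."""

    def __init__(self):
        self.rows = []

    def push(self, record):
        # Serialize so that each stored row is an immutable snapshot.
        self.rows.append(json.dumps(record, sort_keys=True))


store = InMemoryMonitoringStore()
store.push(aggregate(
    {"unit_id": "exec-7"},
    {"pipeline_id": "pipe-3"},
    {"event": "job_completed", "timestamp": 175.0},
    custom={"cpu_utilization": 0.42},
))
print(store.rows[0])
```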
Optionally, in this embodiment, an Annotation library (Annotation) may also be created for storing the custom monitoring index and the custom information.
In the technical solution provided by this embodiment of the application, after the job pipeline information is obtained, the custom information of the job is obtained during job execution according to the pre-configured custom monitoring index, and the execution unit information, the custom information, the job pipeline information and the state information are aggregated and the aggregation result is pushed to the monitoring database for storage. This solves the problem of poor extensibility in common job monitoring methods: the job monitoring method becomes extensible, and the information to be acquired can be customized by the user.
Example four
Fig. 4 is a block diagram of a job monitoring apparatus of a job scheduling system according to a fourth embodiment of the present invention. The apparatus can execute the job monitoring method of a job scheduling system provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method. The job monitoring apparatus of the job scheduling system comprises:
an execution unit information obtaining module 410, configured to obtain execution unit information if it is detected that a job execution unit is created;
a job pipeline information obtaining module 420, configured to obtain job pipeline information if it is detected that a job pipeline is initialized;
a state information obtaining module 430, configured to, during job execution, acquire state information of the job according to a preset monitoring data acquisition rule through buried points of the job critical path pre-buried in the job scheduling system;
and an aggregation storage module 440, configured to aggregate the execution unit information, the job pipeline information, and the state information, and push the aggregation result to the monitoring database for storage.
Optionally, the status information obtaining module 430 includes:
a current state information acquisition submodule, configured to acquire, during job execution, current state information of the job through the buried points of the job critical path pre-buried in the job scheduling system;
a preceding event identification submodule, configured to identify whether the event corresponding to the current state information has a preceding event;
a preceding event state information acquisition submodule, configured to, if a preceding event exists, acquire the state information of the preceding event together with the current state information;
correspondingly, the aggregation storage module comprises:
an aggregation storage submodule, configured to aggregate the execution unit information, the job pipeline information, the current state information and the state information of the preceding event, and push the aggregation result to the monitoring database for storage.
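The preceding-event lookup performed by these submodules can be sketched as follows. The mapping of which event precedes which (`PRECEDING_EVENT`) and the use of a dictionary as the event history are assumptions for illustration; the application does not fix a particular data structure.

```python
# Hypothetical mapping from an event to the event that must have preceded it.
PRECEDING_EVENT = {
    "job_completed": "job_created",
    "job_resumed": "job_paused",
}


def collect_state(job_id, event, timestamp, history):
    """Record the current state and attach the preceding event's state if one exists.

    history maps (job_id, event) to the timestamp at which that event was seen.
    """
    current = {"job_id": job_id, "event": event, "timestamp": timestamp}
    history[(job_id, event)] = timestamp
    result = {"current": current}
    preceding = PRECEDING_EVENT.get(event)
    if preceding is not None and (job_id, preceding) in history:
        result["preceding"] = {
            "job_id": job_id,
            "event": preceding,
            "timestamp": history[(job_id, preceding)],
        }
    return result


history = {}
first = collect_state("job-1", "job_created", 100.0, history)
second = collect_state("job-1", "job_completed", 175.0, history)
print(second)
```

Acquiring both states together means a single stored record is enough to compute, for example, the interval between creation and completion.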
Optionally, the status information obtaining module 430 includes:
a timestamp information acquisition submodule, configured to adopt the current timestamp information as the state information of the job.
Optionally, the status information obtaining module 430 includes:
a critical path forming submodule, configured to determine job critical nodes in the job execution path so as to form the critical path;
a callback function submodule, configured to register a callback function for each job critical node; the callback function is triggered when execution reaches the job critical node, thereby forming a buried point of the critical path.
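Forming buried points by registering callbacks on critical nodes can be sketched as below. This is a minimal illustration under assumed names (`CriticalPath`, `register`, `reach`, and the node names); the scheduler calling `reach` at each critical node is the assumed integration point.

```python
import time
from collections import defaultdict


class CriticalPath:
    """Holds the job critical nodes and the callbacks registered on them."""

    def __init__(self, nodes):
        self.nodes = tuple(nodes)
        self._callbacks = defaultdict(list)

    def register(self, node, callback):
        """Bury a point: attach a callback to one critical node."""
        if node not in self.nodes:
            raise KeyError(f"{node!r} is not a critical node")
        self._callbacks[node].append(callback)

    def reach(self, node, job_id):
        """Called by the scheduler when execution reaches a critical node."""
        for callback in self._callbacks[node]:
            callback(job_id, node, time.time())


collected = []
path = CriticalPath(["job_created", "job_completed"])
for node in path.nodes:
    path.register(node, lambda job_id, n, ts: collected.append((job_id, n, ts)))

path.reach("job_created", "job-1")
path.reach("job_completed", "job-1")
print([(job_id, n) for job_id, n, _ in collected])
```

Because the callbacks live inside the scheduler's own execution path, no per-framework service interface is involved in collecting the data.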
Optionally, the job monitoring apparatus of the job scheduling system further includes:
a custom information acquisition module, configured to acquire, during job execution, custom information of the job according to a pre-configured custom monitoring index;
correspondingly, the aggregation storage module comprises:
an aggregation storage submodule, configured to aggregate the execution unit information, the custom information, the job pipeline information and the state information, and push the aggregation result to the monitoring database for storage.
Optionally, the custom information obtaining module includes:
a configuration interface display submodule, configured to display a configuration interface in response to a trigger operation on the configuration interface;
a custom monitoring index determining submodule, configured to, in response to a selection request for the custom monitoring index, determine the selected candidate indexes as the custom monitoring indexes; wherein the custom monitoring indexes are obtained through the buried points of the job critical path and/or through custom buried points.
Optionally, the custom monitoring index includes at least one of a job run time consumption index, a resource consumption index, and an error reporting index;
the selection request for the custom monitoring index includes either a selection result of the custom monitoring index or a selection rule of the custom monitoring index.
Optionally, the job monitoring apparatus of the job scheduling system further includes:
an event determining module, configured to, in response to a job query request, determine the events corresponding to the state information on the job pipeline according to the information of the job pipeline to which the job belongs.
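One way the event determining module could answer a job query is sketched below: look up the pipeline the job belongs to, then list the events stored for that pipeline in time order. The row layout and the function name `events_for_job` are hypothetical, and the in-memory list stands in for rows read back from the monitoring database.

```python
def events_for_job(rows, job_id):
    """Return (pipeline_id, events on that pipeline ordered by timestamp)."""
    job_rows = [r for r in rows if r["job_id"] == job_id]
    if not job_rows:
        return None, []
    pipeline_id = job_rows[0]["pipeline_id"]
    on_pipeline = sorted(
        (r for r in rows if r["pipeline_id"] == pipeline_id),
        key=lambda r: r["timestamp"],
    )
    return pipeline_id, [(r["job_id"], r["event"]) for r in on_pipeline]


rows = [
    {"job_id": "job-2", "pipeline_id": "pipe-3", "event": "job_created", "timestamp": 120.0},
    {"job_id": "job-1", "pipeline_id": "pipe-3", "event": "job_created", "timestamp": 100.0},
    {"job_id": "job-9", "pipeline_id": "pipe-8", "event": "job_created", "timestamp": 90.0},
    {"job_id": "job-1", "pipeline_id": "pipe-3", "event": "job_completed", "timestamp": 175.0},
]
pipeline_id, events = events_for_job(rows, "job-1")
print(pipeline_id, events)
```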
According to the technical solution provided by this embodiment of the application, execution unit information is acquired if it is detected that a job execution unit is created; job pipeline information is acquired if it is detected that a job pipeline is initialized; during job execution, state information of the job is acquired according to a preset monitoring data acquisition rule through buried points of the job critical path pre-buried in the job scheduling system; and the execution unit information, the job pipeline information and the state information are aggregated and the aggregation result is pushed to the monitoring database for storage. Because buried points on the job critical path are used instead of interface docking, this solves the technical problem that the service interfaces supported by different big data frameworks differ, so that a job scheduling system supporting several big data frameworks would otherwise need a separately developed interface integration for each one, and it achieves general and effective monitoring of the jobs of the job scheduling system.
Example five
The fifth embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the job monitoring method of the job scheduling system provided by the embodiments of the present invention, the method comprising:
if it is detected that a job execution unit is created, acquiring execution unit information;
if it is detected that a job pipeline is initialized, acquiring job pipeline information;
during job execution, acquiring state information of the job according to a preset monitoring data acquisition rule through buried points of a job critical path pre-buried in the job scheduling system;
and aggregating the execution unit information, the job pipeline information and the state information, and pushing the aggregation result to a monitoring database for storage.
Any combination of one or more computer-readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Example six
The sixth embodiment of the application provides an electronic device. Fig. 5 is a schematic structural diagram of an electronic device according to the sixth embodiment of the present application. As shown in fig. 5, the electronic device includes: one or more processors 510; and a storage device 520 configured to store one or more programs which, when executed by the one or more processors 510, cause the one or more processors 510 to implement the job monitoring method of the job scheduling system provided by the embodiments of the present application, the method including:
if it is detected that a job execution unit is created, acquiring execution unit information;
if it is detected that a job pipeline is initialized, acquiring job pipeline information;
during job execution, acquiring state information of the job according to a preset monitoring data acquisition rule through buried points of a job critical path pre-buried in the job scheduling system;
and aggregating the execution unit information, the job pipeline information and the state information, and pushing the aggregation result to a monitoring database for storage.
Of course, those skilled in the art can understand that the processor 510 also implements the technical solution of the job monitoring method of the job scheduling system provided in any embodiment of the present application.
The electronic device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 5, the electronic device includes a processor 510, a storage device 520, an input device 530, and an output device 540; the number of processors 510 in the electronic device may be one or more, and one processor 510 is taken as an example in fig. 5; the processor 510, the storage device 520, the input device 530, and the output device 540 in the electronic device may be connected by a bus or by other means, and fig. 5 takes connection by a bus as an example.
The storage device 520 is a computer-readable storage medium, and can be used to store software programs, computer-executable programs, and module units, such as program instructions corresponding to the job monitoring method of the job scheduling system in the embodiment of the present application.
The storage device 520 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the storage 520 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, storage 520 may further include memory located remotely from processor 510, which may be connected via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 530 may be used to receive input numbers, character information, or voice information, and to generate key signal inputs related to user settings and function control of the electronic device. The output device 540 may include a display screen, a speaker, and the like of the electronic device.
The job monitoring device, the medium, and the electronic device of the job scheduling system provided in the above embodiments may execute the job monitoring method of the job scheduling system provided in any embodiment of the present application, and have functional modules and beneficial effects corresponding to the execution of the method. For details of the job scheduling system, reference may be made to the job monitoring method of the job scheduling system provided in any embodiment of the present application.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (11)

1. A job monitoring method of a job scheduling system, the method comprising:
if it is detected that a job execution unit is created, acquiring execution unit information;
if it is detected that a job pipeline is initialized, acquiring job pipeline information;
during job execution, acquiring state information of the job according to a preset monitoring data acquisition rule through buried points of a job critical path pre-buried in the job scheduling system;
and aggregating the execution unit information, the job pipeline information and the state information, and pushing the aggregation result to a monitoring database for storage.
2. The method according to claim 1, wherein acquiring, during job execution, the state information of the job according to the preset monitoring data acquisition rule through the buried points of the job critical path pre-buried in the job scheduling system comprises:
during job execution, acquiring current state information of the job through the buried points of the job critical path pre-buried in the job scheduling system;
identifying whether the event corresponding to the current state information has a preceding event;
if a preceding event exists, acquiring the state information of the preceding event together with the current state information;
correspondingly, aggregating the execution unit information, the job pipeline information and the state information, and pushing the aggregation result to the monitoring database for storage comprises:
aggregating the execution unit information, the job pipeline information, the current state information and the state information of the preceding event, and pushing the aggregation result to the monitoring database for storage.
3. The method of claim 2, wherein acquiring the status information of the job by using the preset monitoring data acquisition rule comprises:
adopting the current timestamp information as the state information of the job.
4. The method of claim 1, wherein pre-burying the buried points of the job critical path in the job scheduling system comprises:
determining job critical nodes in the job execution path to form the critical path;
registering a callback function for each job critical node; the callback function is triggered when execution reaches the job critical node, thereby forming a buried point of the critical path.
5. The method of claim 1, wherein after obtaining job pipeline information, the method further comprises:
during job execution, acquiring custom information of the job according to a pre-configured custom monitoring index;
correspondingly, aggregating the execution unit information, the job pipeline information and the state information, and pushing the aggregation result to the monitoring database for storage comprises:
aggregating the execution unit information, the custom information, the job pipeline information and the state information, and pushing the aggregation result to the monitoring database for storage.
6. The method of claim 5, wherein the configuration process of the custom monitoring index comprises:
displaying a configuration interface in response to a trigger operation on the configuration interface;
in response to a selection request for the custom monitoring index, determining the selected candidate indexes as the custom monitoring indexes; wherein the custom monitoring indexes are obtained through the buried points of the job critical path and/or through custom buried points.
7. The method of claim 6, wherein the custom monitoring index comprises at least one of a job run time consumption index, a resource consumption index, and an error reporting index;
the selection request for the custom monitoring index includes either a selection result of the custom monitoring index or a selection rule of the custom monitoring index.
8. The method of claim 1, wherein after pushing the aggregation result to the monitoring database for storage, the method further comprises:
in response to a job query request, determining the events corresponding to the state information on the job pipeline according to the information of the job pipeline to which the job belongs.
9. An apparatus for monitoring a job of a job scheduling system, the apparatus comprising:
an execution unit information acquisition module, configured to acquire execution unit information if it is detected that a job execution unit is created;
a job pipeline information acquisition module, configured to acquire job pipeline information if it is detected that a job pipeline is initialized;
a state information acquisition module, configured to, during job execution, acquire state information of the job according to a preset monitoring data acquisition rule through buried points of a job critical path pre-buried in the job scheduling system;
and an aggregation storage module, configured to aggregate the execution unit information, the job pipeline information and the state information, and push the aggregation result to a monitoring database for storage.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method of job monitoring of a job scheduling system according to any one of claims 1 to 8.
11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements a method of job monitoring of a job scheduling system according to any of claims 1-8 when executing the computer program.
CN202111266458.8A 2021-10-28 2021-10-28 Job monitoring method, device, medium and electronic equipment of job scheduling system Pending CN114020566A (en)


Publications (1)

Publication Number Publication Date
CN114020566A (en) 2022-02-08

Family

ID=80058628


Country Status (1)

Country Link
CN (1) CN114020566A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination