CN114154962A - Batch processing monitoring method, device and equipment - Google Patents

Info

Publication number
CN114154962A
CN114154962A CN202111486700.2A CN202111486700A CN114154962A CN 114154962 A CN114154962 A CN 114154962A CN 202111486700 A CN202111486700 A CN 202111486700A CN 114154962 A CN114154962 A CN 114154962A
Authority
CN
China
Prior art keywords
application
job
scene
abnormal
progress
Prior art date
Legal status
Pending
Application number
CN202111486700.2A
Other languages
Chinese (zh)
Inventor
赖海滨
高伟钦
翁世清
Current Assignee
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202111486700.2A
Publication of CN114154962A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/904 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0633 Workflow analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/20 Administration of product repair or maintenance

Abstract

The embodiments of the present application provide a batch processing monitoring method, device and equipment. The method comprises: obtaining, according to the configuration information of jobs, the jobs that an application is scheduled to run, and generating a job run schedule; obtaining key run indicators of the jobs from their historical run data, and generating a job run history index table; obtaining the jobs that have finished running by means of the job run schedule, so as to calculate and display the completion progress of the application; and displaying abnormal conditions and drilling down into the information of abnormal jobs in combination with the job run history index table, so as to obtain an abnormal job list of the application. The technical solution provides global monitoring: batch processing monitoring is organized by the dimension of the scene, so the overall processing progress can be tracked in real time and operation and maintenance efficiency is improved; the abnormal job list of each application in the scene allows targeted operation and maintenance; and because the job run schedule fixes the range of jobs to be monitored in advance, monitoring does not need to wait for job flows to be instantiated, which improves monitoring accuracy.

Description

Batch processing monitoring method, device and equipment
Technical Field
The present application relates to the field of computer technology, and in particular to a batch processing monitoring method, device, equipment, and processor.
Background
Traditional batch scheduling monitoring faces several challenges. As business grows, the number of jobs run each day keeps increasing and the relationships between jobs become more and more complex, so monitoring of batch runs becomes inefficient. From the perspective of global monitoring, different processing stages differ in importance, yet the important stages cannot be monitored in a targeted way. Real-time monitoring is also difficult: because batch processing is complex, it is not possible to display the processing progress of each application in real time, or to obtain the abnormal conditions in the current batch run and the "critical path" that affects the final target application.
As a result, job scheduling systems built on traditional batch scheduling monitoring have the following problems: the monitoring they provide is essentially fine-grained monitoring at the level of job flows or jobs, and nothing is displayed from a higher dimension; some products can show the dependency relationships and processing status of all jobs on a single canvas, but the user experience is poor, the display is cluttered, and the actual business meaning is hard to convey.
Disclosure of Invention
The embodiments of the present application aim to provide a batch processing monitoring method, device, equipment, storage medium, and processor.
To this end, a batch processing monitoring method is established that presents information by the scene dimension, displays in real time the processing progress and the abnormal job list of each application in the scene for a specific business date/batch, and supports drilling down to finer-grained job instances. The method can solve at least one of the following technical problems:
(1) traditional scheduling software mainly works from the dimension of the job flow, which is not abstract enough; as a result, the monitoring display often shows many jobs on a single canvas, resembling a spider web, and when the number of jobs is large, global batch monitoring cannot be refreshed in real time because the amount of computation is too great;
(2) traditional scheduling software generally monitors only instantiated job flows/job instances; it makes poor use of historical run information, such as the average run time and the normal start time of a job, and does not monitor against planned data, so users become aware of a problem only after a job has failed, and effective real-time monitoring that draws on historical data is not possible;
(3) traditional scheduling software generally provides only first-level monitoring and lacks a drill-down function, so it cannot adequately meet operation and maintenance needs.
In order to achieve the above object, a first aspect of the present application provides a batch processing monitoring method, including: obtaining, according to the configuration information of jobs, the jobs that an application is scheduled to run, and generating a job run schedule; obtaining key run indicators of the jobs from their historical run data, and generating a job run history index table; obtaining the jobs that have finished running by means of the job run schedule, so as to calculate and display the completion progress of the application; and displaying abnormal conditions and drilling down into the information of abnormal jobs in combination with the job run history index table, so as to obtain an abnormal job list of the application.
In an embodiment of the present application, obtaining the jobs that an application is scheduled to run according to the configuration information of the jobs, and generating the job run schedule, includes: identifying the end node application of each scene, where each scene corresponds to one end node application; determining the dependency relationships between the end node application and the jobs of the corresponding scene; and working forward (that is, upstream) step by step along the dependency relationships among the jobs to determine the application at each node, thereby obtaining the AOE (activity-on-edge) network of the scene and generating the actual job list.
Further, identifying an end node application of the scene includes: classifying each scene according to the configuration information of the operation; and identifying the target application associated with each scene as an end node application.
Further, obtaining the key run indicators of a job from its historical run data includes using the median of the historical run data as the key run indicator of the job.
Further, the key run indicators of a job comprise: the execution duration of the job, the time at which the job starts running, and the time at which the job finishes running.
Further, the method further comprises: periodically refreshing the abnormal job list, and keeping the refreshed data in a cache so that the refreshed data can be read directly.
Further, obtaining the jobs that have finished running by means of the job run schedule, so as to calculate and display the completion progress of each application, includes: polling the current completion status of the jobs according to the number of jobs contained in each application in the scene, where the real-time progress of each application is the number of successfully executed jobs divided by the total number of jobs that the application contains in the scene; and representing the overall progress of the scene by the progress of the end node application.
Further, displaying the abnormal conditions and drilling down into the information of abnormal jobs to obtain the abnormal job list of each application includes: obtaining the path with the maximum path length from the AOE network of the scene and marking it as the critical path; and marking the abnormal conditions according to the critical path, where the abnormal conditions include job failure, job timeout, and delayed job start.
With this method, global batch processing is monitored by the dimension of the scene; by identifying the end node application of the scene and the job set of that end node application, the whole AOE graph is calculated and constructed automatically, which greatly reduces the amount of configuration required of the user.
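The critical path described above is the longest path through the scene's AOE network. The following is a minimal sketch of that calculation, assuming the network is a directed acyclic graph whose nodes are applications and whose edge weights are expected durations in minutes; the function name, the graph layout, and the sample scene are hypothetical and are not taken from the application.

```python
from functools import lru_cache

def critical_path(edges, source, sink):
    """Return (length, path) of the longest source-to-sink path in a DAG.

    edges maps each node to {successor: weight}. The weights are assumed to
    be expected durations in minutes. Returns None if sink is unreachable.
    """
    @lru_cache(maxsize=None)
    def longest_from(node):
        if node == sink:
            return 0, (sink,)
        best = None
        for nxt, weight in edges.get(node, {}).items():
            tail = longest_from(nxt)
            if tail is None:
                continue  # this successor never reaches the sink
            candidate = (weight + tail[0], (node,) + tail[1])
            if best is None or candidate[0] > best[0]:
                best = candidate
        return best

    return longest_from(source)

# Hypothetical scene: applications as nodes, edge weights in minutes.
scene = {
    "source": {"warehouse": 30, "risk": 10},
    "warehouse": {"report": 20},
    "risk": {"report": 25},
    "report": {},
}
length, path = critical_path(scene, "source", "report")
print(length, path)
```

Memoizing the per-node result keeps the search linear in the number of edges, which matters when a scene contains many applications.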
A second aspect of the present application provides a batch processing monitoring device, comprising: a first module, configured to obtain, according to the configuration information of jobs, the jobs that an application is scheduled to run, and to generate a job run schedule; a second module, configured to obtain key run indicators of the jobs from their historical run data, and to generate a job run history index table; a third module, configured to obtain the jobs that have finished running by means of the job run schedule, so as to calculate and display the completion progress of the application; and a fourth module, configured to display abnormal conditions and drill down into the information of abnormal jobs in combination with the job run history index table, so as to obtain an abnormal job list of the application.
In an embodiment of the present application, the first module is configured to: classify the scenes according to the configuration information of the jobs; identify the target application associated with each scene as the end node application of that scene, where each scene corresponds to one end node application; determine the dependency relationships between the end node application and the jobs of the corresponding scene; and work forward step by step along the dependency relationships among the jobs to determine the application at each node, thereby obtaining the AOE network of the scene and generating the actual job list.
Further, the third module is configured to: poll the current completion status of the jobs according to the number of jobs contained in each application in the scene, where the real-time progress of each application is the number of successfully executed jobs divided by the total number of jobs that the application contains in the scene; and represent the overall progress of the scene by the progress of the end node application.
Further, the fourth module is configured to: obtain the classification of abnormal conditions; calculate the proportion of abnormal jobs according to the classification of abnormal conditions; and display the abnormal conditions distinctly according to the proportion of abnormal jobs.
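As an illustration of the proportion-based display described above, the sketch below counts abnormal jobs per class, divides by the total number of jobs, and maps each proportion to a display level. The class names, the color levels, and the 5% and 20% thresholds are assumptions made for the example; the application does not specify them.

```python
from collections import Counter

def abnormal_proportions(abnormal_jobs, total_jobs):
    """Return {abnormal class: share of all jobs} for (job, class) pairs."""
    if total_jobs <= 0:
        raise ValueError("total_jobs must be positive")
    counts = Counter(kind for _, kind in abnormal_jobs)
    return {kind: n / total_jobs for kind, n in counts.items()}

def display_level(proportion, warn=0.05, critical=0.20):
    """Map a proportion to a display color (thresholds are assumptions)."""
    if proportion >= critical:
        return "red"
    if proportion >= warn:
        return "yellow"
    return "green"

# Hypothetical abnormal job list for an application with 10 planned jobs.
abnormal_jobs = [("job_a", "failed"), ("job_b", "timeout"), ("job_c", "timeout")]
proportions = abnormal_proportions(abnormal_jobs, total_jobs=10)
levels = {kind: display_level(p) for kind, p in proportions.items()}
print(levels)
```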
The technical effect of the batch processing monitoring device is the same as that of the batch processing monitoring method.
The present application further provides a processor configured to execute the batch processing monitoring method.
In another aspect, the present application provides batch processing monitoring equipment comprising a processor and a memory, the processor being configured to execute the batch processing monitoring method.
A fifth aspect of the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the batch processing monitoring method.
The present application further provides a machine-readable storage medium having instructions stored thereon which, when executed by a processor, cause the processor to perform the batch processing monitoring method.
With the above technical solution, global batch processing monitoring is organized by the dimension of the scene, and the overall processing progress is tracked in real time. Operation and maintenance efficiency is improved: using the abnormal job list of each application in the scene, operation and maintenance personnel can work on specific jobs in a targeted way, which reduces the labor required.
According to the technical solution, global batch processing is monitored by the dimension of scenes, and each scene represents a piece of batch processing logic with business meaning; by identifying the end node application of the scene and the job set of that end node application, the whole AOE graph is calculated and constructed automatically, which greatly reduces the amount of configuration required of the user.
Representing the scene as an AOE network makes it easy to display the relationships among the applications in the scene and to calculate the critical path; whether a job is currently running normally is judged in combination with historical data; and the set of jobs that each application needs to run for a specific business date/batch is calculated from the job run schedule, which fixes the monitoring range in advance without waiting for job flows to be instantiated.
Additional features and advantages of embodiments of the present application will be described in detail in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the embodiments of the disclosure, but are not intended to limit the embodiments of the disclosure. In the drawings:
FIG. 1 schematically illustrates a block diagram of a prior art Control-M system architecture;
FIG. 2 schematically illustrates a flow diagram of a method of batch monitoring according to an embodiment of the present application;
FIG. 3A schematically shows a job run schedule generation flow diagram according to an embodiment of the present application;
FIG. 3B is a schematic diagram illustrating a job run history index table generation flow according to an embodiment of the present application;
FIG. 3C is a schematic diagram illustrating an abnormal job list generation flow according to an embodiment of the present application;
FIG. 4 schematically shows a scene configuration flow diagram according to an embodiment of the application;
FIG. 5 schematically shows an exemplary diagram of a scene AOE according to an embodiment of the application;
FIG. 6 is a block diagram schematically illustrating the structure of a batch monitoring apparatus according to an embodiment of the present application;
FIG. 7 schematically shows an internal structure diagram of a computer device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it should be understood that the specific embodiments described herein are only used for illustrating and explaining the embodiments of the present application and are not used for limiting the embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that if directional indications (such as up, down, left, right, front, and back) are referred to in the embodiments of the present application, the directional indications are only used to explain the relative positional relationship between the components, their movement, and the like in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indications change accordingly.
In addition, if there is a description of "first", "second", etc. in the embodiments of the present application, the description of "first", "second", etc. is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present application.
FIG. 1 schematically shows a block diagram of a prior art Control-M system architecture.
Referring to FIG. 1, the Control-M system separates the three functions of management, scheduling, and job execution by means of a three-layer architecture. The first layer, Control-M/EM, is the management module and is responsible for defining/uploading jobs, monitoring jobs, and so on. The second layer, Control-M/Server, is the core of the whole scheduling system and handles job flow instantiation, job scheduling, resource allocation, job run management, and so on. The third layer, Control-M/Agent, is the job run node and is responsible for running the jobs issued by the Server, storing the job run status, and synchronizing it with Control-M/Server. The monitoring function is performed mainly by Control-M/EM: it obtains the dependency relationships between jobs from the data in the Control-M/EM DB and, through Control-M/Server, obtains the run status of job instances in real time from the Control-M/EM DB data; the user performs real-time monitoring by specifying the target job flows/job set to be monitored.
The disadvantages of this prior art are as follows: the user must select all the job flows/jobs to be monitored, which requires familiarity with the entire processing chain being monitored; the relationships of the selected jobs are displayed on a single canvas at too fine a granularity, so effective monitoring from a global perspective is not possible; and the overall processing progress of the target jobs monitored by the user cannot be shown.
For a prior-art scheduling system based on Airflow (not shown in FIG. 1), monitoring is performed mainly from the perspective of the job flow (i.e., the DAG); visual displays such as a tree view, a graph view, and a Gantt chart are provided, and job flows can be monitored well. The tree and graph views conveniently show the run state of each job instance in a job flow, while the Gantt chart identifies the bottlenecks of a job flow by analyzing the start and end times of the jobs that make it up. In essence, this is still monitoring from the job flow perspective.
The disadvantages of this prior art are as follows: the Airflow-based scheduling system provides monitoring at the job flow dimension but lacks monitoring at a higher dimension, namely the scene-dimension monitoring introduced below, so it is difficult to meet the needs of global batch processing monitoring; and the monitoring display reflects only the run state of instantiated jobs, so job flows that have not been instantiated cannot be displayed.
For a prior-art scheduling system based on DolphinScheduler (not shown in FIG. 1), the approach is similar to that of the Airflow-based scheduling system: monitoring is likewise performed from the perspective of the workflow (equivalent to the job flow), and a tree view shows the types and task states of the task nodes; in addition, DolphinScheduler provides a Gantt chart for analyzing workflow bottlenecks. The disadvantages are the same as above.
The batch monitoring method provided by the application can be applied to an Internet application environment. Wherein, the internet application environment includes: a terminal, a server and a network. The terminal communicates with the server through a network. The terminal can be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers and portable wearable devices, and the server can be implemented by an independent server or a server cluster formed by a plurality of servers.
FIG. 2 schematically shows a flow diagram of a batch processing monitoring method according to an embodiment of the present application. As shown in FIG. 2, an embodiment of the present application provides a batch processing monitoring method. This embodiment is described mainly by taking the application of the method to the terminal (or the server) as an example, and includes the following steps:
S202: obtain, according to the configuration information of jobs, the jobs that an application is scheduled to run, and generate a job run schedule.
According to an embodiment, there are multiple applications, each with a number of jobs under it. An application may be understood simply as a project; it may contain multiple job flows, each with multiple jobs. Each job belongs to one application.
First, the dependency relationships between jobs are found from the input conditions and output conditions that the user initially configured for each job. A job is started when its input conditions are satisfied.
The dependency relationships between jobs are then converted into dependency relationships between applications.
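The two steps above (matching input conditions to output conditions, then lifting job dependencies to application dependencies) can be sketched as follows. This is a minimal illustration under assumed data structures: the `app`, `in`, and `out` fields and all job and condition names are hypothetical.

```python
from collections import defaultdict

def job_dependencies(jobs):
    """Map each job to the set of jobs that produce one of its input conditions."""
    producers = defaultdict(set)
    for name, job in jobs.items():
        for cond in job["out"]:
            producers[cond].add(name)
    deps = {name: set() for name in jobs}
    for name, job in jobs.items():
        for cond in job["in"]:
            deps[name] |= producers.get(cond, set()) - {name}
    return deps

def application_dependencies(jobs, job_deps):
    """Lift job-level dependencies to the applications that own the jobs."""
    app_deps = {job["app"]: set() for job in jobs.values()}
    for name, upstream in job_deps.items():
        app = jobs[name]["app"]
        for up in upstream:
            if jobs[up]["app"] != app:  # ignore dependencies inside one application
                app_deps[app].add(jobs[up]["app"])
    return app_deps

# Hypothetical configuration: three jobs in three applications.
jobs = {
    "load_a": {"app": "A", "in": [], "out": ["a_loaded"]},
    "calc_b": {"app": "B", "in": ["a_loaded"], "out": ["b_done"]},
    "report_c": {"app": "C", "in": ["b_done"], "out": []},
}
job_deps = job_dependencies(jobs)
app_deps = application_dependencies(jobs, job_deps)
print(app_deps)
```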
S204: obtain the key run indicators of the jobs from their historical run data, and generate a job run history index table. Specifically, the key run indicators of a job include its execution duration, the time at which it starts running, and the time at which it finishes running.
S206: obtain the jobs that have finished running by means of the job run schedule, so as to calculate and display the completion progress of the application.
S208: display abnormal conditions and drill down into the information of abnormal jobs in combination with the job run history index table, so as to obtain an abnormal job list of the application.
The present application organizes overall (global) batch processing monitoring by the dimension of the scene and tracks the overall processing progress in real time. Using the abnormal job list of each application in the scene, operation and maintenance personnel can work on specific jobs in a targeted way, which reduces the labor required.
FIG. 3A schematically shows a job run schedule generation flow diagram according to an embodiment of the present application.
The following detailed description refers to FIG. 3A together with the method flow diagram of FIG. 2.
According to an embodiment, the method as a whole comprises three flows: a job run schedule generation flow, a job run history index refresh flow, and a scene AOE run information refresh flow (the main flow). The main purpose of the job run schedule is to obtain, from the configuration information (frequency) of the jobs, the list of jobs that each application needs to run each day. The job configuration information used here is mainly the execution frequency of the job, such as daily, weekly, monthly, at the beginning of each month, or at the end of each month, or a specific time point, such as 09:00 every day.
At S301, the scheduling configuration information is obtained from the database. Once a scene configuration takes effect, the real-time run information of the scene must be calculated every day. In a concrete implementation, it is only necessary to obtain the business date or batch of the end node (end point) application and work forward step by step along the dependency relationships among the jobs in order to obtain the actual list of jobs (job set) that the scene needs to run for that business date or batch.
According to an embodiment, the job set is found by following the dependency relationships of the end point job and is drawn from all the jobs under the application. In other words, for the end point job to execute successfully, attention must be paid to the upstream jobs on which it depends; the job set consists of these jobs requiring attention, which may also be called jobs of interest.
According to the embodiment, the business date can be understood as the date to which a batch of jobs belongs. The business date is usually the same as the job execution date, but the two can differ.
In the present embodiment, the job run schedule is refreshed periodically at S303. Preferably, the job run schedule is refreshed once a day; it may also be refreshed at other regular intervals as actual conditions require.
According to an embodiment, the job run schedule is set to be loaded automatically at the start of each new day. The configured information includes the operation authority, the selected scheduling date, the job name, the job type, the user who executes the command, the dates on which the job is permitted to be scheduled (for example, every day), the exception handling logic for the job, and the scene configuration information.
Because the schedule fixes the range of jobs to be monitored in advance, monitoring does not need to wait for job flows to be instantiated, which greatly improves monitoring accuracy.
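The schedule generation described above can be illustrated by expanding each job's configured frequency into the list of jobs due on a given day. This is a minimal sketch under assumptions: the frequency keywords, the `weekday` and `day` fields, and the job and application names are hypothetical and do not reflect the actual configuration format.

```python
import calendar
from datetime import date

def is_due(config, day):
    """Decide whether a job with the given frequency config runs on `day`."""
    freq = config["freq"]
    if freq == "daily":
        return True
    if freq == "weekly":
        return day.weekday() == config["weekday"]  # 0 = Monday
    if freq == "monthly":
        return day.day == config["day"]
    if freq == "month_start":
        return day.day == 1
    if freq == "month_end":
        return day.day == calendar.monthrange(day.year, day.month)[1]
    raise ValueError(f"unknown frequency: {freq!r}")

def build_run_schedule(job_configs, day):
    """Return {application: [jobs planned to run on `day`]}."""
    schedule = {}
    for job, config in job_configs.items():
        if is_due(config, day):
            schedule.setdefault(config["app"], []).append(job)
    return schedule

# Hypothetical job configuration.
job_configs = {
    "load_daily": {"app": "warehouse", "freq": "daily"},
    "load_weekly": {"app": "warehouse", "freq": "weekly", "weekday": 0},
    "close_month": {"app": "report", "freq": "month_end"},
    "open_month": {"app": "report", "freq": "month_start"},
}
schedule = build_run_schedule(job_configs, date(2021, 11, 30))
print(schedule)
```

Because the schedule depends only on configuration and the calendar, it can be computed before any job flow has been instantiated.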
FIG. 3B schematically shows a job run history index table generation flow diagram according to an embodiment of the present application.
Referring to FIG. 3B, according to the embodiment, the purpose of the job run history index table is to obtain the key run indicators of a job, which may also be understood as its important run indicators, from the historical run data of the job.
According to an embodiment, at S311, the run records of successfully executed jobs are obtained, such as the execution duration of the job, the time at which the job started running (time node), and the time at which the job finished running (time node).
At S313, the job run duration, start time node, and end time node are refreshed periodically using the median method. Specifically, each indicator is calculated as the median of the historical data; for the run duration, for example, this is the median of the job's historical run durations. Some jobs run in several batches each day, and each run has its own run time; a stored procedure (Oracle) computes the median run time of every job each day.
At S315, the job run history index table is refreshed.
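The median-based refresh in S311 to S315 can be sketched in a few lines. The embodiment performs this step in an Oracle stored procedure; the Python version below only illustrates the calculation, and the record layout (times expressed as minutes after midnight) and job name are assumptions.

```python
from statistics import median

def build_history_index(runs):
    """Build {job: median duration, start and end} from successful run records.

    Each record is assumed to hold the job name plus start and end times
    expressed as minutes after midnight.
    """
    by_job = {}
    for run in runs:
        by_job.setdefault(run["job"], []).append(run)
    return {
        job: {
            "duration_min": median(r["end_min"] - r["start_min"] for r in records),
            "start_min": median(r["start_min"] for r in records),
            "end_min": median(r["end_min"] for r in records),
        }
        for job, records in by_job.items()
    }

# Hypothetical history: three successful runs of one job.
runs = [
    {"job": "load_a", "start_min": 60, "end_min": 80},
    {"job": "load_a", "start_min": 65, "end_min": 90},
    {"job": "load_a", "start_min": 90, "end_min": 100},
]
history_index = build_history_index(runs)
print(history_index)
```

Unlike the mean, the median is not pulled upward by a single unusually slow run, which makes it a steadier baseline for judging whether a job is late.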
FIG. 3C schematically shows an abnormal job list generation flow diagram according to an embodiment of the present application.
Referring to FIG. 3C, according to an embodiment, at S321 the scene AOE refresh main flow may run every 5 minutes. When it is determined that the refresh has not been completed, the flow proceeds to S323, where the currently configured scene information is obtained; at S325, all job sets that the applications need to run for the current business date or batch are obtained from the job run schedule.
According to the embodiment, at S327, the number of jobs that each application has run successfully is updated and the progress information is refreshed; that is, the jobs that have finished running are obtained in order to calculate the progress of the application.
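The progress calculation is a simple ratio: successfully executed jobs divided by the jobs planned for the application, with the end node application's progress standing for the scene as a whole. A minimal sketch follows; the function names and sample data are hypothetical.

```python
def application_progress(planned, succeeded):
    """Return {application: successful jobs / planned jobs}."""
    progress = {}
    for app, jobs in planned.items():
        done = sum(1 for job in jobs if job in succeeded)
        progress[app] = done / len(jobs) if jobs else 0.0
    return progress

def scene_progress(progress, end_app):
    """The scene's overall progress is that of its end node application."""
    return progress[end_app]

# Hypothetical planned job sets (from the job run schedule) and finished jobs.
planned = {"warehouse": ["w1", "w2"], "report": ["r1", "r2", "r3", "r4"]}
succeeded = {"w1", "w2", "r1"}
progress = application_progress(planned, succeeded)
print(progress, scene_progress(progress, "report"))
```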
At S329, an abnormal job list for each application is obtained by examining the job run records in combination with the job run history index table, and further drill-down into the information of abnormal jobs is supported. The scene AOE refresh main flow stores all the refreshed data in a Redis cache, and the page reads the Redis data directly, which avoids the situation in which the page cannot be refreshed because the amount of computation is too great.
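One way to combine current job states with the history index is sketched below. It covers the three abnormal conditions named in the application (failure, timeout, delayed start), but the decision rules themselves are assumptions: a 10-minute grace period for delayed starts and a 1.5x tolerance on the median duration for timeouts. Status values and job names are likewise hypothetical, and the Redis caching step is omitted.

```python
def classify(state, index, now_min, grace_min=10, tolerance=1.5):
    """Return 'failed', 'delayed_start', 'timeout', or None for one job.

    grace_min and tolerance are assumed thresholds, not values from the source.
    """
    if state["status"] == "failed":
        return "failed"
    if index is None:
        return None  # no history yet, so nothing to compare against
    if state["status"] == "waiting" and now_min > index["start_min"] + grace_min:
        return "delayed_start"
    if state["status"] == "running":
        elapsed = now_min - state["start_min"]
        if elapsed > tolerance * index["duration_min"]:
            return "timeout"
    return None

def abnormal_job_list(states, history_index, now_min):
    """Return [(job, abnormal class)] for every job judged abnormal."""
    result = []
    for job, state in states.items():
        kind = classify(state, history_index.get(job), now_min)
        if kind is not None:
            result.append((job, kind))
    return result

# Hypothetical history index and current states (minutes after midnight).
history_index = {
    "calc_b": {"start_min": 95, "duration_min": 20},
    "report_c": {"start_min": 120, "duration_min": 10},
}
states = {
    "load_a": {"status": "success", "start_min": 60},
    "calc_b": {"status": "running", "start_min": 100},
    "report_c": {"status": "waiting", "start_min": None},
    "export_d": {"status": "failed", "start_min": 110},
}
abnormal = abnormal_job_list(states, history_index, now_min=140)
print(abnormal)
```

Checking waiting and running jobs against their historical baseline is what lets a late or slow job be flagged before it actually fails.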
Fig. 4 schematically shows a scene configuration flow diagram according to an embodiment of the present application.
Referring to FIG. 4, a scene represents one piece of batch business processing logic, such as regulatory reporting or data warehouse processing, so the scene itself has business meaning. The scene configuration process is as follows:
At S401, the scenes are classified according to the configuration information of the jobs. Specifically, each scene is determined according to the batch business logic of interest.
At S403, the target application associated with each scene is identified as the end node application of that scene, where one scene corresponds to one end node application.
At S405, the dependency relationship between the end node application and the jobs of the corresponding scene is determined.
According to the embodiment, the following are determined: the job set related to the scene, the business date/batch rule of interest, and the calculation level. The rule computes the date and batch to be monitored from the current natural date. For example, T(1) represents the day after the current system date and T(-1) represents the day before the current system date; a batch is a 4-digit number covering 1-9999, and B(1), for example, represents the 1st batch. The calculation level is mainly used to determine the maximum level of automatic calculation.
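A minimal sketch of how the date and batch rules could be evaluated is shown below. The text only gives the notations T(n) and B(n); the parsing with regular expressions, the function names, and the zero-padded 4-digit string output are assumptions for illustration.

```python
import re
from datetime import date, timedelta


def resolve_business_date(rule, today):
    """Resolve a rule such as "T(1)" or "T(-1)" relative to `today`."""
    match = re.fullmatch(r"T\((-?\d+)\)", rule)
    if match is None:
        raise ValueError(f"not a date rule: {rule!r}")
    return today + timedelta(days=int(match.group(1)))


def resolve_batch(rule):
    """Resolve a rule such as "B(1)" to a 4-digit batch string."""
    match = re.fullmatch(r"B\((\d+)\)", rule)
    if match is None:
        raise ValueError(f"not a batch rule: {rule!r}")
    number = int(match.group(1))
    if not 1 <= number <= 9999:
        raise ValueError("batch must be in the range 1-9999")
    return f"{number:04d}"
```

For instance, with a current system date of 2021-01-01, "T(-1)" resolves to 2020-12-31 and "B(1)" resolves to "0001".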
At S407, the applications of each node are calculated level by level, working upstream from the end node application according to the dependency relationships between jobs, to obtain the AOE network of the scene and thereby generate the actual job list.
Because each scene has business meaning and represents an important piece of batch processing logic, sorting out the scenes helps the batch processing business to be understood from the business perspective and makes the monitoring more targeted. During scene configuration, once a user has identified a scene, only the end node application needs to be specified; the system automatically calculates and constructs the AOE graph of the scene, which reduces configuration complexity and is more user-friendly.
According to the embodiment, the system automatically calculates the upstream applications level by level according to the dependency relationships between jobs and the application to which each job belongs, thereby producing the AOE graph of the scene, which represents the processing link of the scene. After the scene configuration is finished, the actual job list can be obtained.
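The level-by-level upstream calculation can be sketched as a bounded breadth-first traversal. This is an illustration under stated assumptions: `job_deps` (job to its upstream jobs), `job_app` (job to owning application), and the function name are hypothetical, and every job of an application is treated as belonging to the scene.

```python
def build_scene_aoe(end_app, job_deps, job_app, max_level):
    """Return (applications, edges) of the scene graph.

    Edges are (upstream_app, downstream_app) pairs derived from job
    dependencies. `max_level` caps how many levels are expanded,
    mirroring the "calculation level" in the text.
    """
    edges = set()
    visited = {end_app}
    frontier = {end_app}
    for _ in range(max_level):
        next_frontier = set()
        for job, upstream_jobs in job_deps.items():
            downstream_app = job_app[job]
            if downstream_app not in frontier:
                continue
            for upstream_job in upstream_jobs:
                upstream_app = job_app[upstream_job]
                if upstream_app == downstream_app:
                    continue  # dependency inside one application: no edge
                edges.add((upstream_app, downstream_app))
                if upstream_app not in visited:
                    visited.add(upstream_app)
                    next_frontier.add(upstream_app)
        if not next_frontier:
            break
        frontier = next_frontier
    return visited, edges
```

The user supplies only `end_app`; all other applications and edges follow from dependency data the scheduler already holds, which is where the reduction in configuration effort comes from.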
Fig. 5 schematically shows an exemplary diagram of a scene AOE according to an embodiment of the present application.
Referring to FIG. 5, a typical scene AOE graph is shown. The data that can be displayed includes the dependency relationships, the real-time processing progress, and the abnormal conditions of the scene (that is, whether the current progress is at a normal level). Each is described below.
Acquiring the jobs that have finished running through the actual job list, in order to calculate and display the completion progress of each application, comprises: polling the current completion status of the jobs according to the jobs contained in each application in the scene, where the real-time progress of each application = number of successfully executed jobs / total number of jobs contained in the application in the scene; the overall progress of the scene is characterized by the progress of the end node application.
According to an embodiment, the dependency relationships in FIG. 5 are obtained by converting the dependencies between jobs into dependencies between applications: the application to which an upstream job belongs has an edge pointing to the downstream application. For example, the start point application A1 is an upstream application of application A2 and application A3. For each edge leaving application A1, the corresponding upstream and downstream job sets are determined.
According to the embodiment, for the real-time processing progress, a background thread periodically polls the current completion status of the jobs in the job set contained in each application in the scene of FIG. 5. For example, the progress of the start point application A1 is displayed as 100% and the progress of application A2 as 30%. The calculation is: real-time progress of an application = number of successfully executed jobs / total number of jobs contained in the application in the scene.
According to an embodiment, the overall progress of the scene is characterized by the progress of the end node application (end point application); for example, the progress of application A6 is 30%.
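The progress formula can be written directly in code. In this sketch, `app_jobs` maps each application to the set of its jobs in the scene and `succeeded` is the set of jobs that have run successfully; both names and the data layout are assumptions.

```python
def application_progress(app_jobs, succeeded):
    """Progress per application = successful jobs / total jobs in the scene."""
    progress = {}
    for app, jobs in app_jobs.items():
        jobs = set(jobs)
        progress[app] = len(jobs & succeeded) / len(jobs) if jobs else 0.0
    return progress


def scene_progress(progress, end_app):
    """The overall scene progress is that of the end node application."""
    return progress[end_app]
```

Using the end node application as the proxy for the whole scene works because the end node can only complete after everything upstream of it has completed.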
According to an embodiment, the abnormal conditions comprise three cases: job run failure, job run timeout, and job delayed start. Displaying the abnormal conditions, drilling down into job instance information, and obtaining the abnormal job list of each application comprises: acquiring the path with the maximum path length from the AOE network of the scene and marking it as the critical path; and marking the abnormal conditions according to the critical path, where the abnormal conditions include job run failure, job run timeout, and job delayed start.
According to the embodiment, a job run failure can be detected in real time from the job run flow. Timeouts and delayed starts must be determined in combination with historical run data: a fixed background thread calculates the historical run indices (execution duration and start time node) of each job, and the real-time run status is then compared against them to determine whether the job has timed out or started late. In summary, abnormal job ratio = (number of failed jobs + number of timed-out jobs + number of delayed-start jobs) / total number of jobs contained in the application in the scene.
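The classification and the ratio can be sketched as below. The text does not state exact comparison rules, so the thresholds here are assumptions: a job counts as timed out when its elapsed time exceeds the historical duration, and as delayed when it starts (or has still not started) after the historical start time node; `tolerance` is a hypothetical slack parameter. All times share one numeric unit, for example minutes.

```python
def classify_job(status, start, end, now, hist_start, hist_duration, tolerance=0):
    """Return "failed", "timeout", "delayed", or "normal" for one job."""
    if status == "failed":
        return "failed"
    if start is None:
        # Not started yet: delayed once the historical start has passed.
        return "delayed" if now > hist_start + tolerance else "normal"
    elapsed = (end if end is not None else now) - start
    if elapsed > hist_duration + tolerance:
        return "timeout"
    if start > hist_start + tolerance:
        return "delayed"
    return "normal"


def abnormal_ratio(labels):
    """(failed + timeout + delayed) / total jobs of the application."""
    if not labels:
        return 0.0
    abnormal = sum(1 for label in labels if label != "normal")
    return abnormal / len(labels)
```

Each job receives a single label, so no job is counted twice in the numerator of the ratio.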
According to the embodiment, the severity of the abnormal condition can be displayed in different colors. For example, an abnormal job ratio of 80-100% corresponds to dark red, 50-80% to orange, 20-50% to light yellow, 10-20% to blue, and 0-10% to green.
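A minimal sketch of the color mapping follows. The ranges in the text share their boundary values (for example, 80% appears in two ranges), so this sketch assumes each lower bound is inclusive; that choice and the function name are assumptions.

```python
def severity_color(ratio):
    """Map an abnormal job ratio in [0, 1] to a display color."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    # Checked from most to least severe; lower bounds are inclusive.
    for lower_bound, color in (
        (0.8, "dark red"),
        (0.5, "orange"),
        (0.2, "light yellow"),
        (0.1, "blue"),
    ):
        if ratio >= lower_bound:
            return color
    return "green"
```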
According to an embodiment, when an application in a scene is drilled into, the status of each job in the application's job set is shown. Referring to Table 1, each job name links further to the job instance page. In this way, the processing status of each job under the current application can be shown in more detail.
Application name | Job flow name | Job name | Business date | Batch | Anomaly flag
Application 1 | Flowa | Job1 | 20210101 | 0001 | Execution failure
Application 1 | Flowb | Job2 | 20210101 | 0001 | Delayed start
Application 1 | Flowc | Job3 | 20210101 | 0001 | Normal
Table 1. Scene application drill-down information
According to an embodiment, in the AOE graph described above, the path with the largest path length (the sum of the durations of the activities on the path) from the start point to the end point is the critical path. For example, referring to FIG. 5, the path from A1 through A2 and A4 to the end point application A6 is a critical path. The key to calculating the critical path is how to define the weight of an edge, and the present application provides two strategies for doing so. Strategy one: the weight of an edge is represented by the longest running duration in the job set of the upstream application associated with the edge. For example, the upstream applications of A4 are A2 and A3; under strategy one, the edge from A2 (whose progress is 30%) carries the larger weight, so the critical path runs from A1 through A2 and A4 to the end point application A6. The running duration of each job can be represented by the median of its running durations over the last 30 dates.
According to an embodiment, strategy two: the weight of an edge is represented by the latest end time in the job set of the upstream application associated with the edge, where the end time of each job is the median of its end times over the past 30 dates. For convenience of calculation, the latest end time is converted into an integer value relative to a reference time, and that integer represents the weight of the edge. After the weight of each edge is determined, the critical path can be calculated: the path from the start point application to the end point application with the maximum sum of edge weights is the critical path.
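Once edge weights are fixed, the critical path is the maximum-weight path in a directed acyclic graph. The sketch below uses a topological ordering followed by a single relaxation pass, which is a standard technique and not necessarily the one used in the embodiment. The helper `end_time_weight` illustrates strategy two under the assumption that the weight is expressed in whole minutes after a reference time; all names are hypothetical.

```python
from collections import defaultdict, deque
from datetime import datetime


def end_time_weight(end_times, reference):
    """Strategy two: latest end time as whole minutes after `reference`."""
    return int((max(end_times) - reference).total_seconds() // 60)


def critical_path(edges, start, end):
    """Return (length, path) of the maximum-weight path from start to end.

    `edges` maps (upstream_app, downstream_app) to a numeric weight.
    """
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for (u, v), weight in edges.items():
        successors[u].append((v, weight))
        indegree[v] += 1
        nodes.update((u, v))

    # Kahn's algorithm for a topological order.
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")

    best = {start: 0}
    previous = {}
    for u in order:
        if u not in best:
            continue  # not reachable from the start point
        for v, weight in successors[u]:
            if best[u] + weight > best.get(v, float("-inf")):
                best[v] = best[u] + weight
                previous[v] = u
    if end not in best:
        raise ValueError("end point is not reachable from start point")

    path = [end]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[end], path[::-1]
```

With hypothetical weights in which the edge from A2 to A4 outweighs the edge from A3 to A4, the function returns the path A1, A2, A4, A6, matching the example in the text.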
The present application organizes overall batch processing monitoring along the dimension of the scene and tracks the overall processing progress in real time. Operation and maintenance efficiency is improved: through the abnormal job list of each application in the scene, operation and maintenance personnel can target specific jobs, which reduces their labor input.
According to the technical solution, the global batch processing status is monitored along the dimension of scenes, and each scene represents batch processing logic with business meaning. By identifying the end point application of the scene and its job set, the whole AOE graph is automatically calculated and constructed, which greatly reduces the amount of configuration required of the user.
Using an AOE network to represent the scene makes it convenient to display the relationships among the applications in the scene and to calculate the critical path. Whether a job's current run is normal is judged in combination with historical data. The job set that each application needs to run for a specific business date/batch is calculated in combination with the job running schedule, which fixes the monitoring scope without waiting for job flows to be instantiated.
Fig. 6 schematically shows a block diagram of a batch monitoring apparatus according to an embodiment of the present application.
In one embodiment, as shown in FIG. 6, there is provided a batch monitoring apparatus 600 comprising a first module 601, a second module 603, a third module 605, and a fourth module 607, wherein:
the first module 601 is configured to obtain a job scheduled to run by an application according to configuration information of the job, and generate a job running schedule.
The second module 603 is configured to obtain a key operation index of the job according to the operation history data of the job, and generate a job operation history index table.
The third module 605 is configured to obtain the jobs that have finished running through the job running schedule, in order to calculate and display the completion progress of the application.
The fourth module 607 is configured to display abnormal conditions and drill down into the information of abnormal jobs in combination with the job operation history index table, and to obtain the abnormal job list of the application.
According to an embodiment, the first module 601 is configured to: classifying each scene according to the configuration information of the operation; identifying target applications associated with each scene as end node applications; identifying end node applications for a scene, wherein a scene corresponds to an end node application; determining a dependency relationship between the end node application and the jobs of the corresponding scene; and calculating the application of each node forward step by step according to the dependency relationship among the jobs to obtain the AOE network of the scene, thereby generating the actual job list.
According to an embodiment, the second module 603 is configured to: polling the current completion condition of the job according to the number of jobs contained in each application in the scene, wherein the real-time progress of each application is the number of successfully executed jobs/the total number of jobs contained in the application in the scene; the overall progress of the scenario is characterized by the progress of the end node application.
According to an embodiment, the fourth module 607 is configured to: obtaining abnormal condition classification; calculating the proportion of abnormal operation through the classification of the abnormal conditions; and performing distinguishing display according to the proportion of the abnormal operation.
The batch monitoring device comprises a processor and a memory, wherein the first module 601, the second module 603, the third module 605 and the fourth module 607 are all stored in the memory as program units, and the processor executes the program modules stored in the memory to realize corresponding functions.
The batch processing monitoring device has the beneficial effects of the batch processing monitoring method, and is not repeated herein.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more than one kernel can be set, and the batch processing monitoring method is realized by adjusting kernel parameters.
The memory may include computer-readable media in the form of volatile memory, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
An embodiment of the present application provides a storage medium, on which a program is stored, and the program, when executed by a processor, implements the batch monitoring method described above.
The embodiment of the application provides a processor, wherein the processor is used for running a program, and the batch monitoring method is executed when the program runs.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in FIG. 7. The computer device includes a processor A01, a network interface A02, a memory (not shown), and a database (not shown) connected by a system bus. The processor A01 of the computer device provides computing and control capabilities. The memory of the computer device comprises an internal memory A03 and a non-volatile storage medium A04. The non-volatile storage medium A04 stores an operating system B01, a computer program B02, and a database (not shown in the figure). The internal memory A03 provides an environment for running the operating system B01 and the computer program B02 stored in the non-volatile storage medium A04. The database of the computer device stores batch processing monitoring data. The network interface A02 of the computer device communicates with an external terminal through a network connection. The computer program B02 is executed by the processor A01 to implement a batch monitoring method.
Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, the batch monitoring apparatus provided herein may be implemented in the form of a computer program that is executable on a computer device such as that shown in FIG. 7. The memory of the computer device may store various program modules that make up the batch monitoring apparatus, such as the first, second, third, and fourth modules shown in FIG. 6. The respective program modules constitute computer programs that cause the processors to execute the steps in the batch monitoring method of the embodiments of the present application described in the present specification.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include computer-readable media in the form of volatile memory, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (16)

1. A batch monitoring method, the method comprising:
acquiring the jobs that an application is scheduled to run according to configuration information of the jobs, and generating a job running schedule;
acquiring key operation indices of the jobs according to operation history data of the jobs, and generating a job operation history index table;
acquiring the jobs that have finished running through the job running schedule, to calculate and display the completion progress of the application;
and displaying abnormal conditions and drilling down into information of abnormal jobs in combination with the job operation history index table, to obtain an abnormal job list of the application.
2. The method according to claim 1, wherein acquiring the job scheduled to run by the application according to the configuration information of the job, and generating the job run schedule comprises:
identifying end node applications for a scene, wherein a scene corresponds to an end node application;
determining a dependency relationship between the end node application and a job of a corresponding scene;
and calculating the application of each node step by step forward according to the dependency relationship among the jobs to obtain the AOE network of the scene, thereby generating an actual job list.
3. The method of claim 2, wherein identifying an end node application of a scene comprises:
classifying each scene according to the configuration information of the operation;
and identifying the target application associated with each scene as an end node application.
4. The method according to claim 1, wherein obtaining a key operation index of a job from operation history data of the job comprises using a median of the operation history data as the key operation index of the job.
5. The method of claim 1 or 4, wherein the key operational indicators of the job comprise:
the execution time length of the job, the starting running time node of the job and the ending running time node of the job.
6. The method of claim 1, further comprising:
and regularly refreshing the abnormal operation list, and keeping the refreshed data in a cache so as to directly read the refreshed data.
7. The method according to claim 3, wherein acquiring, through the job running schedule, the jobs that have been completed by running to calculate and show the progress of completion of each application comprises:
polling the current completion condition of the job according to the number of jobs contained in each application in the scene, wherein the real-time progress of each application is the number of successfully executed jobs/the total number of jobs contained in the application in the scene;
and representing the overall progress of the scene by the progress of the terminal node application.
8. The method of claim 2, wherein displaying the abnormal conditions and drilling down into the information of abnormal jobs to obtain the abnormal job list of each application comprises:
acquiring the path with the maximum path length from the AOE network of the scene, and marking the path as a critical path;
and marking the abnormal conditions according to the critical path, wherein the abnormal conditions comprise job run failure, job run timeout, and job delayed start.
9. A batch monitoring apparatus, the apparatus comprising:
a first module, configured to acquire the jobs that an application is scheduled to run according to configuration information of the jobs, and generate a job running schedule;
a second module, configured to acquire key operation indices of the jobs from operation history data of the jobs, and generate a job operation history index table;
a third module, configured to acquire the jobs that have finished running through the job running schedule, to calculate and display the completion progress of the application;
and a fourth module, configured to display abnormal conditions and drill down into information of abnormal jobs in combination with the job operation history index table, to obtain an abnormal job list of the application.
10. The apparatus of claim 9, wherein the first module is configured to:
classifying each scene according to the configuration information of the operation;
identifying the target application associated with each scene as an end node application;
identifying end node applications for a scene, wherein a scene corresponds to an end node application;
determining a dependency relationship between the end node application and a job of a corresponding scene;
and calculating the application of each node step by step forward according to the dependency relationship among the jobs to obtain the AOE network of the scene, thereby generating an actual job list.
11. The apparatus of claim 10, wherein the second module is configured to:
polling the current completion condition of the job according to the number of jobs contained in each application in the scene, wherein the real-time progress of each application is the number of successfully executed jobs/the total number of jobs contained in the application in the scene;
and representing the overall progress of the scene by the progress of the terminal node application.
12. The apparatus of claim 11, wherein the fourth module is configured to:
obtaining abnormal condition classification;
calculating the proportion of abnormal operation through the classification of the abnormal conditions;
and performing distinguishing display according to the proportion of the abnormal operation.
13. A batch monitoring apparatus comprising a processor and a memory, wherein the processor is configured to perform the method of any one of claims 1 to 8.
14. A processor configured to perform the batch monitoring method according to any one of claims 1 to 8.
15. A machine-readable storage medium having instructions stored thereon, which when executed by a processor causes the processor to be configured to perform the method of any one of claims 1 to 8.
16. A computer program product comprising a computer program, characterized in that the computer program realizes the method according to any one of claims 1 to 8 when executed by a processor.
CN202111486700.2A 2021-12-07 2021-12-07 Batch processing monitoring method, device and equipment Pending CN114154962A (en)

Publications (1)

Publication Number Publication Date
CN114154962A true CN114154962A (en) 2022-03-08

Family

ID=80453075

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114971438A (en) * 2022-08-02 2022-08-30 中国工业互联网研究院 Industrial equipment monitoring method, device, equipment and medium based on industrial internet

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination