CN113778803A - Task resource monitoring system, method and storage medium - Google Patents

Task resource monitoring system, method and storage medium

Publication number
CN113778803A
Authority
CN
China
Prior art keywords
task
monitoring
information
hardware
software
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111078983.7A
Other languages
Chinese (zh)
Inventor
余辉
马万铮
王志国
邢焱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Coocaa Network Technology Co Ltd
Original Assignee
Shenzhen Coocaa Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Coocaa Network Technology Co Ltd
Priority to CN202111078983.7A
Publication of CN113778803A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3024 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a task resource monitoring system, method, and storage medium. The system comprises a task monitoring module, which is used for: acquiring a big data task to be monitored on a task scheduler and determining the target server to which the big data task to be monitored is distributed; performing hardware monitoring, through a preset hardware monitoring program on the target server, on the hardware information of the target server while it executes the big data task to be monitored, and generating hardware monitoring information; performing software monitoring, through a preset software monitoring program, on the software information on a distributed scheduling system while the target server executes the big data task to be monitored, and generating software monitoring information; and storing the hardware monitoring information and the software monitoring information in a log storage module. Because the invention monitors from both the hardware side and the software side, it can monitor the stability and resource rationality of big data tasks, unlike existing approaches that monitor only specific operation parameters.

Description

Task resource monitoring system, method and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a task resource monitoring system, method, and storage medium.
Background
With the rise of the big data era, companies are deploying their own big data platforms, and as its business grows each organization needs a big data task scheduling system. However, the number of big data tasks at each company only increases from year to year and does not decrease, so a monitoring system is urgently needed to monitor and evaluate the operation of all tasks.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The main object of the invention is to provide a task resource monitoring system, a task resource monitoring method, and a storage medium that solve the technical problem that, in the prior art, monitoring only the running parameters of big data tasks does not allow the stability and resource rationality of all big data tasks to be monitored.
In order to achieve the above object, the present invention provides a task resource monitoring system, which at least includes: a task monitoring module;
the task monitoring module is used for acquiring a big data task to be monitored on the task scheduler and determining a target server to which the big data task to be monitored is distributed;
the task monitoring module is further configured to perform hardware monitoring on hardware information of the target server in the process of executing the big data task to be monitored through a preset hardware monitoring program on the target server, and generate hardware monitoring information;
the task monitoring module is also used for carrying out software monitoring on software information on a distributed scheduling system in the process that the target server executes the big data task to be monitored through a preset software monitoring program and generating software monitoring information;
the task monitoring module is further configured to store the hardware monitoring information and the software monitoring information in a log storage module.
Optionally, the task resource monitoring system further includes a resource display module;
the resource display module is used for acquiring a query instruction of a user, selecting target parameter information from the hardware monitoring information and the software monitoring information according to the query instruction, and displaying the target parameter information according to a preset display strategy.
Optionally, the task resource monitoring system further includes a task early warning module;
the task early warning module is used for judging whether each piece of operation parameter information in the hardware monitoring information and the software monitoring information meets a corresponding preset early warning condition, and for performing early warning in a preset prompting mode when the operation parameter information meets the preset early warning condition.
Optionally, the task early warning module is further configured to obtain an operating parameter value range of each operating parameter within a preset time period;
the task early warning module is also used for determining the maximum value and the minimum value of the operation parameter according to the range of the operation parameter value;
the task early warning module is also used for determining a target early warning threshold value according to the maximum value and the minimum value of the operation parameters;
the task early warning module is further configured to determine whether a current operation parameter value is greater than the target early warning threshold value, and perform early warning in a preset prompting manner when the current operation parameter value is greater than the target early warning threshold value.
Optionally, the task early warning module is further configured to obtain a preset task allocation table when the current operation parameter value is greater than the target early warning threshold value;
the task early warning module is also used for searching an object to be notified corresponding to the big data task to be monitored according to the preset task allocation table;
the task early warning module is further configured to send hardware monitoring information and software monitoring information corresponding to the big data task to be monitored to the object to be notified, so that the object to be notified corrects the current big data task according to the hardware monitoring information and the software monitoring information.
Optionally, the log storage module is further configured to generate a hardware resource log according to the hardware monitoring information, and store the hardware resource log in a hardware information table of a preset MySQL database;
the log storage module is further used for generating a software resource log according to the software monitoring information and storing the software resource log into a software information table of the preset MySQL database.
Optionally, the hardware monitoring information includes a server IP, a server process, a server CPU occupancy rate, a server memory, a task scheduling project name, and a task scheduling state;
the software monitoring information comprises the total number of cluster resources, the utilization rate of the cluster resources, the queue name, the queue utilization rate, the task name, the task execution time and the task CPU consumption information.
In addition, to achieve the above object, the present invention further provides a task resource monitoring method, where the method includes:
acquiring a big data task to be monitored on a task scheduler, and determining a target server to which the big data task to be monitored is distributed;
performing hardware monitoring on hardware information of the target server in the process of executing the big data task to be monitored through a preset hardware monitoring program on the target server, and generating hardware monitoring information;
performing software monitoring on software information on a distributed scheduling system in the process of executing the big data task to be monitored by the target server through a preset software monitoring program, and generating software monitoring information;
and storing the hardware monitoring information and the software monitoring information into a log storage module.
Optionally, after the step of storing the hardware monitoring information and the software monitoring information in a log storage module, the method further includes:
acquiring the operating parameter value ranges of the operating parameters in the hardware monitoring information and the software monitoring information within a preset time period;
determining a maximum value and a minimum value of the operation parameter according to the range of the operation parameter value;
determining a target early warning threshold value according to the maximum value and the minimum value of the operation parameters;
and judging whether the current operation parameter value is larger than the target early warning threshold value or not, and when the current operation parameter value is larger than the target early warning threshold value, adopting a preset prompting mode to carry out early warning.
In addition, in order to achieve the above object, the present invention further provides a storage medium, wherein the storage medium stores a task resource monitoring program, and the task resource monitoring program implements the steps of the task resource monitoring method when executed by a processor.
In the invention, a task monitoring module acquires a big data task to be monitored on a task scheduler and determines the target server to which the big data task to be monitored is distributed; performs hardware monitoring, through a preset hardware monitoring program on the target server, on the hardware information of the target server while it executes the big data task to be monitored, and generates hardware monitoring information; performs software monitoring, through a preset software monitoring program, on the software information on a distributed scheduling system while the target server executes the big data task to be monitored, and generates software monitoring information; and stores the hardware monitoring information and the software monitoring information in a log storage module. Because the invention monitors from both the hardware side and the software side, it can monitor the stability, orderliness, and resource rationality of big data tasks, unlike existing approaches that monitor only specific operation parameters.
Drawings
FIG. 1 is a schematic diagram of a task resource monitoring system according to a first embodiment of the present invention;
FIG. 2 is a diagram of a task resource monitoring system according to a second embodiment of the present invention;
FIG. 3 is a flowchart illustrating a task resource monitoring method according to a first embodiment of the present invention;
The implementation, functional features, and advantages of the invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a block diagram of a task resource monitoring system according to a first embodiment of the present invention.
The task resource monitoring system at least comprises: a task monitoring module 10;
The task monitoring module 10 is used for acquiring the big data task to be monitored on the task scheduler and determining the target server to which the big data task to be monitored is distributed.
It should be noted that the task scheduler may be Azkaban, a batch workflow task scheduler that runs a set of jobs and processes in a specific order within a workflow. Azkaban uses job configuration files to establish the dependency relationships between tasks and provides an easy-to-use web user interface for maintaining and tracking workflows. The big data task to be monitored may be all tasks on the task scheduler or a specified big data task that needs monitoring. An Azkaban cluster generally has several hardware servers, and their number is set according to the specific usage scenario; for example, in a certain scenario there are 5 Azkaban cluster hardware servers, server A to server E. The target server may be the server that executes the big data task to be monitored.
The task monitoring module is further configured to perform hardware monitoring on hardware information of the target server in the process of executing the big data task to be monitored through a preset hardware monitoring program on the target server, and generate hardware monitoring information.
It should be noted that the preset hardware monitoring program may be a program installed in advance on the server that can monitor, for each big data task on the server, information such as how many gigabytes of the machine's memory the task occupies, the CPU utilization rate, and the CPU load rate; for example, it may be a shell script that obtains the hardware monitoring information. The hardware monitoring information comprises information such as the server IP, the service process, the overall CPU occupancy rate of the server, the overall memory of the server, the AZ (Azkaban) project name, the AZ job information, the AZ flow state, the AZ flow execution ID, the memory occupied by AZ task execution, and the AZ task running time.
It should be understood that an Azkaban cluster generally has several hardware servers, each with the preset hardware monitoring program installed. The program is invoked once every preset period by the crontab scheduling tool, so hardware monitoring information is acquired once per preset period. The preset period can be customized according to monitoring requirements; for example, with a preset period of 5 minutes, crontab is configured to invoke the preset hardware monitoring program every 5 minutes to obtain the hardware monitoring information.
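As an illustration of the periodic invocation described above, the following minimal sketch builds a crontab entry for a given period. The function name and the script path are hypothetical; only the standard five-field cron syntax is assumed.

```python
def crontab_entry(period_minutes: int, command: str) -> str:
    """Build a crontab line that runs `command` every `period_minutes` minutes.

    Only periods that divide 60 evenly are accepted, because the `*/N`
    step in the minute field restarts at every hour boundary.
    """
    if not 1 <= period_minutes <= 59 or 60 % period_minutes != 0:
        raise ValueError("period must be between 1 and 59 and divide 60 evenly")
    # Fields: minute hour day-of-month month day-of-week command
    return f"*/{period_minutes} * * * * {command}"


# Hypothetical path to the preset hardware monitoring script.
entry = crontab_entry(5, "/opt/monitor/hw_monitor.sh")
print(entry)
```

Restricting the period to divisors of 60 is a design choice of this sketch; other periods would need a different scheduling mechanism.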
In a specific implementation, for example, there are 5 Azkaban cluster hardware servers, each with the preset hardware monitoring program installed, and the program is started every 5 minutes by the crontab scheduling tool. After it starts, the program obtains the current date and time (year-month-day hour:minute:second) of each server and the basic information of each server, such as the server IP, total memory, and load value. It then obtains the information of all big data tasks on each server, including the task process, the task execution ID, the occupied memory, the AZ process, the memory occupied by AZ task execution, and the AZ task running time. Through the AZ process, the project name, job information, and flow information of the big data task on Azkaban can be obtained, so that the running big data task can be located precisely. The AZ interface is called with the execution ID of the task, and the hardware monitoring information of the task is acquired through the AZ interface, including the AZ project name, the AZ job information, the AZ flow state, and the AZ flow execution ID. Because the values for a big data task are reported in bytes on Linux, the monitoring information of each task needs unit conversion, for example into M or G, to obtain the converted hardware monitoring information. A hardware information table can then be generated from the hardware monitoring information, as shown in Table 1 below:
TABLE 1 hardware information Table
(Table 1 is provided as image BDA0003259822480000061 in the original publication and is not reproduced here.)
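The unit conversion and the resulting hardware record can be sketched as follows. This is only an illustration: the field names, the sample values, and the choice of base 1024 for M and G are assumptions, since the text lists the monitored quantities but not a schema.

```python
from dataclasses import dataclass


def bytes_to_unit(n_bytes: int, unit: str = "M") -> float:
    """Convert a byte count to megabytes ("M") or gigabytes ("G"), base 1024."""
    factors = {"M": 1024 ** 2, "G": 1024 ** 3}
    return round(n_bytes / factors[unit], 2)


@dataclass
class HardwareRecord:
    # Field names are illustrative, not taken from the patent.
    collected_at: str
    server_ip: str
    az_project: str
    az_flow_exec_id: int
    task_memory_mb: float


record = HardwareRecord(
    collected_at="2021-09-08 10:10:10",
    server_ip="10.0.0.1",          # made-up address
    az_project="demo_project",     # made-up project name
    az_flow_exec_id=12345,         # made-up execution ID
    task_memory_mb=bytes_to_unit(536_870_912, "M"),
)
print(record.task_memory_mb)  # 512.0
```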
The task monitoring module is further configured to perform software monitoring on software information on the distributed scheduling system in the process that the target server executes the big data task to be monitored through a preset software monitoring program, and generate software monitoring information.
It should be noted that the preset software monitoring program may be a program installed in advance on the server that can monitor each parameter in the software monitoring information, such as the size of the distributed scheduling system queue occupied by each task, the memory used by the task, and the size of the JVM used by the task; for example, it may be a shell script that obtains the software monitoring information on the distributed scheduling system. The software monitoring information comprises the total number of cluster resources, the utilization rate of the cluster resources, the queue name, the queue utilization rate, the task name, the task execution time, and the task CPU consumption information. The distributed scheduling system is a new Hadoop resource manager (Apache Hadoop YARN, hereinafter referred to as YARN), which is a general-purpose resource management system that provides unified resource management and scheduling for upper-layer applications. The software information of the distributed scheduling system only needs to be obtained on one server in the cluster.
In a specific implementation, for example, on a certain server in the cluster, the shell script for software tasks (that is, the preset software monitoring program) is started every 5 minutes by the crontab scheduling tool. The script obtains the current date and time on the server (year-month-day hour:minute:second, e.g., 2021-09-08 10:10:10) and acquires all task information on YARN, including the total number of YARN cluster resources, the utilization rate of the YARN cluster resources, the YARN single queue name, the YARN single queue utilization rate, the YARN task name, the YARN task creator, and the YARN task ID. It also acquires all task information on the server, including the YARN task ID, the Az task execution time, the Az task JVM process, the Az task CPU consumption information, and the Az task allocated memory (M). The tasks on YARN are matched with the tasks on the server by the YARN task ID to obtain the software monitoring information of the tasks on YARN, unit conversion is performed on each parameter in the software monitoring information, and a software information table is generated from the converted software monitoring information, as shown in Table 2 below:
TABLE 2 software information Table
(Table 2 is provided as image BDA0003259822480000071 in the original publication and is not reproduced here.)
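The matching of YARN-side tasks with server-side tasks by YARN task ID amounts to a join on that ID. A minimal sketch follows; the dictionary keys and sample values are hypothetical, and unmatched YARN tasks are simply skipped, which the text does not specify.

```python
def match_tasks(yarn_tasks, server_tasks):
    """Join YARN-side and server-side task records on the YARN task ID."""
    by_id = {t["yarn_task_id"]: t for t in server_tasks}
    merged = []
    for yarn_task in yarn_tasks:
        server_task = by_id.get(yarn_task["yarn_task_id"])
        if server_task is None:
            continue  # no server-side counterpart; skipped in this sketch
        merged.append({**yarn_task, **server_task})
    return merged


# Made-up sample records for illustration only.
yarn_tasks = [
    {"yarn_task_id": "application_1_0001", "queue": "default", "task_name": "etl_daily"},
    {"yarn_task_id": "application_1_0002", "queue": "adhoc", "task_name": "report"},
]
server_tasks = [
    {"yarn_task_id": "application_1_0001", "az_exec_seconds": 420, "az_alloc_memory_mb": 2048},
]
rows = match_tasks(yarn_tasks, server_tasks)
print(rows)
```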
The task monitoring module is further configured to store the hardware monitoring information and the software monitoring information in a log storage module.
It should be noted that storing the hardware monitoring information and the software monitoring information in the log storage module may consist of generating a hardware resource log from the hardware monitoring information and storing it in a hardware information table of a preset MySQL database, and generating a software resource log from the software monitoring information and storing it in a software information table of the preset MySQL database.
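The two-table storage can be sketched as below. To keep the example self-contained, it uses Python's built-in SQLite module as a stand-in for the preset MySQL database; the table names, column names, and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the preset MySQL database
conn.execute(
    "CREATE TABLE hardware_info (collected_at TEXT, server_ip TEXT, cpu_percent REAL)"
)
conn.execute(
    "CREATE TABLE software_info (collected_at TEXT, queue_name TEXT, queue_usage REAL)"
)


def store(conn, table, row):
    """Insert one log row; `table` must be one of the two known table names."""
    if table not in ("hardware_info", "software_info"):
        raise ValueError(f"unknown table: {table}")
    placeholders = ", ".join("?" for _ in row)
    conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)
    conn.commit()


# Made-up sample rows.
store(conn, "hardware_info", ("2021-09-08 10:10:10", "10.0.0.1", 37.5))
store(conn, "software_info", ("2021-09-08 10:10:10", "default", 0.62))
hardware_rows = conn.execute("SELECT * FROM hardware_info").fetchall()
print(hardware_rows)
```

The table name is checked against a fixed list because it is interpolated into the SQL text, while the values are passed as bound parameters.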
In this embodiment, the task monitoring module acquires a big data task to be monitored on a task scheduler and determines the target server to which the big data task to be monitored is distributed; performs hardware monitoring, through a preset hardware monitoring program on the target server, on the hardware information of the target server while it executes the big data task to be monitored, and generates hardware monitoring information; performs software monitoring, through a preset software monitoring program, on the software information on a distributed scheduling system while the target server executes the big data task to be monitored, and generates software monitoring information; and stores the hardware monitoring information and the software monitoring information in a log storage module. Because this embodiment monitors from both the hardware side and the software side, it can monitor the stability, orderliness, and resource rationality of big data tasks, unlike existing approaches that monitor only specific operation parameters.
Referring to fig. 2, fig. 2 is a block diagram of a task resource monitoring system according to a second embodiment of the present invention, and the task resource monitoring system according to the second embodiment of the present invention is provided based on the first embodiment.
In this embodiment, the task resource monitoring system includes a resource display module 30 and a task early warning module 40 in addition to the task monitoring module 10 and the log storage module 20;
the resource display module 30 is configured to obtain a query instruction of a user, select target parameter information from the hardware monitoring information and the software monitoring information according to the query instruction, and display the target parameter information according to a preset display policy.
It should be noted that the query instruction may be a command containing, for example, the date and the parameter that the user wants to query, such as a query for the CPU utilization of server A on a certain day or for the total number of tasks on Azkaban. The target parameter information may be the parameter information the user wants to query; for example, if the query instruction asks for the total number of tasks on Azkaban, the target parameter information may be how that total changes over time. The preset display policy may be a preset manner of displaying the target parameter information corresponding to the query instruction to the user; for example, the change of a parameter value over one day may be displayed as a curve or as a table, and this embodiment is not limited in this respect.
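Selecting target parameter information for a query can be sketched as a filter over the stored records. In this illustration the record layout, the parameter name, and the tabular output are assumptions; a curve display would consume the same (timestamp, value) pairs.

```python
def select_target(records, parameter, day):
    """Return (timestamp, value) pairs for `parameter` on the given day."""
    return [
        (r["collected_at"], r[parameter])
        for r in records
        if r["collected_at"].startswith(day) and parameter in r
    ]


# Made-up monitoring records for server A.
records = [
    {"collected_at": "2021-09-08 10:10:10", "server": "A", "cpu_percent": 35.0},
    {"collected_at": "2021-09-08 10:15:10", "server": "A", "cpu_percent": 52.0},
    {"collected_at": "2021-09-09 10:10:10", "server": "A", "cpu_percent": 41.0},
]
series = select_target(records, "cpu_percent", "2021-09-08")
for timestamp, value in series:
    print(f"{timestamp}  {value}")
```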
The task early warning module 40 is configured to determine whether each piece of operation parameter information in the hardware monitoring information and the software monitoring information meets a corresponding preset early warning condition, and perform early warning in a preset prompting manner when the operation parameter information meets the preset early warning condition.
It should be noted that the preset early warning condition may be a preset threshold condition that triggers an early warning. For example, if the condition is that the CPU utilization rate reaches 80%, the task early warning module issues an early warning when the hardware monitoring information or the software monitoring information shows a CPU utilization rate greater than or equal to 80% at some point in time. The preset prompting mode may be a short message or an e-mail; for example, the warning information is sent to the corresponding maintenance personnel by short message.
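A minimal sketch of the preset-condition check, using the 80% CPU example from the text, is given below. The parameter names are hypothetical, and the `notify` callable merely stands in for a short message or mail gateway, which is not modeled.

```python
PRESET_CONDITIONS = {"cpu_percent": 80.0}  # example threshold from the text


def triggered(sample, conditions=PRESET_CONDITIONS):
    """Return the names of parameters whose value is >= its preset threshold."""
    return [
        name
        for name, limit in conditions.items()
        if name in sample and sample[name] >= limit
    ]


def send_warning(names, notify=print):
    """Emit one warning per parameter; `notify` is a placeholder for SMS or mail."""
    for name in names:
        notify(f"early warning: {name} reached its preset threshold")


alerts = triggered({"cpu_percent": 83.0, "memory_gb": 12.0})
send_warning(alerts)
```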
Furthermore, to achieve resource rationality in big data task monitoring, the early warning threshold of each parameter can be determined from the monitored operation information of that parameter. The task early warning module is also used for: acquiring the operating parameter value range of each operating parameter within a preset time period; determining the maximum value and the minimum value of the operating parameter according to that range; determining a target early warning threshold according to the maximum value and the minimum value; and judging whether the current operating parameter value is greater than the target early warning threshold and, if so, issuing an early warning in a preset prompting mode.
It should be noted that the preset time period is a previously configured time window whose data is used to determine the target early warning threshold. The operating parameter value range may be the range bounded by the minimum and maximum values of each operating parameter within the preset time period. The target early warning threshold may be the value at which an early warning needs to be issued.
In a specific implementation, for example, the early warning threshold for the number of tasks on YARN is determined from the range of the total number of tasks on YARN over the past week. If the maximum total was 80 and the minimum was 10, the range is 10 to 80; based on this range, the threshold can be set to 60, and an early warning is issued in the preset prompting mode when the number of tasks on YARN is greater than or equal to 60.
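The text does not state how the threshold is computed from the maximum and minimum values. Purely as an assumed illustration, the sketch below places the threshold at a fixed fraction of the way from the minimum to the maximum; the fraction 5/7 is chosen only so that the example range of 10 to 80 yields 60, and the weekly counts are made up.

```python
def target_threshold(values, position=5 / 7):
    """Place the threshold `position` of the way from min to max (assumed rule)."""
    lo, hi = min(values), max(values)
    return round(lo + position * (hi - lo))


def needs_warning(current, threshold):
    """Follow the worked example: warn when the value is >= the threshold."""
    return current >= threshold


# Made-up task counts for the past week; min is 10 and max is 80.
weekly_counts = [10, 35, 80, 42, 57, 61, 23]
threshold = target_threshold(weekly_counts)
print(threshold, needs_warning(60, threshold), needs_warning(59, threshold))
```

Any other rule that depends only on the minimum and maximum would fit the description equally well.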
Further, in order to make the management of the big data task more efficient, the big data task can be corrected in time after the early warning is performed, and the task early warning module is further configured to obtain a preset task allocation table when the current operation parameter value is greater than the target early warning threshold value; the task early warning module is also used for searching an object to be notified corresponding to the big data task to be monitored according to the preset task allocation table; the task early warning module is further configured to send hardware monitoring information and software monitoring information corresponding to the big data task to be monitored to the object to be notified, so that the object to be notified corrects the current big data task according to the hardware monitoring information and the software monitoring information.
It should be noted that the preset task allocation table may be a corresponding relationship table between a big data task and an object to be notified corresponding to the big data task, and the object to be notified may be a research and development or maintenance worker corresponding to the big data task. When the current operation parameter value is larger than the target early warning threshold value, searching an object to be notified corresponding to the big data task to be monitored according to the preset task allocation table; and sending the hardware monitoring information and the software monitoring information corresponding to the big data task to be monitored to the object to be notified, so that the object to be notified can correct the current big data task in time according to the hardware monitoring information and the software monitoring information.
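The lookup in the preset task allocation table followed by notification can be sketched as follows. The table contents, addresses, and the `send` callable are hypothetical placeholders for whatever delivery channel is actually used.

```python
# Hypothetical allocation table: big data task name -> object to be notified.
TASK_ALLOCATION = {
    "etl_daily": "alice@example.com",
    "report": "bob@example.com",
}


def notify_owner(task_name, hardware_info, software_info, send):
    """Look up the task's owner and send both kinds of monitoring information."""
    owner = TASK_ALLOCATION.get(task_name)
    if owner is None:
        return None  # no entry in the allocation table; nothing is sent
    send(owner, {"task": task_name, "hardware": hardware_info, "software": software_info})
    return owner


outbox = []
owner = notify_owner(
    "etl_daily",
    {"cpu_percent": 91.0},
    {"queue_usage": 0.97},
    send=lambda to, payload: outbox.append((to, payload)),
)
print(owner, len(outbox))
```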
In this embodiment, the task resource monitoring system further includes a resource display module 30 and a task early warning module 40; the resource display module 30 obtains a query instruction of a user, selects target parameter information from the hardware monitoring information and the software monitoring information according to the query instruction, and displays the target parameter information according to a preset display strategy. The task early warning module 40 judges whether each operation parameter information in the hardware monitoring information and the software monitoring information meets a corresponding preset early warning condition, and when the operation parameter information meets the preset early warning condition, a preset prompting mode is adopted for early warning. The task early warning module is also used for acquiring the operating parameter value range of each operating parameter in a preset time period; determining a maximum value and a minimum value of the operation parameter according to the range of the operation parameter value; determining a target early warning threshold value according to the maximum value and the minimum value of the operation parameters; and judging whether the current operation parameter value is larger than the target early warning threshold value or not, and adopting a preset prompting mode to carry out early warning when the current operation parameter value is larger than the target early warning threshold value. According to the embodiment, through the modules, hardware monitoring and software monitoring are carried out on the big data task together, display and early warning are carried out timely, and the stability and the resource rationality of the big data task are improved.
Referring to fig. 3, fig. 3 is a flowchart illustrating a task resource monitoring method according to a first embodiment of the present invention.
As shown in fig. 3, a task resource monitoring method provided in an embodiment of the present invention includes:
Step S100: acquiring a big data task to be monitored on a task scheduler, and determining the target server to which the big data task to be monitored is allocated.
Step S200: performing hardware monitoring on hardware information of the target server, while it executes the big data task to be monitored, through a preset hardware monitoring program on the target server, and generating hardware monitoring information.
Step S300: performing software monitoring, through a preset software monitoring program, on software information on the distributed scheduling system while the target server executes the big data task to be monitored, and generating software monitoring information.
Step S400: storing the hardware monitoring information and the software monitoring information in a log storage module.
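The four steps above can be sketched as a single function. This is a minimal sketch under stated assumptions: the scheduler is modelled as a plain mapping from task name to server, the hardware and software monitoring programs are passed in as callables (`hardware_probe`, `software_probe`), and the log storage module is a list. All of these names and shapes are hypothetical and stand in for whatever the real system provides.

```python
import time


def monitor_task(task_name, scheduler_assignments,
                 hardware_probe, software_probe, log_store):
    """Run steps S100 to S400 once for a single big data task."""
    # S100: find the target server the scheduler allocated the task to.
    target_server = scheduler_assignments[task_name]

    # S200: collect hardware information from the target server.
    hardware_info = {
        "task": task_name,
        "server": target_server,
        "timestamp": time.time(),
        **hardware_probe(target_server),
    }

    # S300: collect software information from the distributed scheduling system.
    software_info = {
        "task": task_name,
        "timestamp": time.time(),
        **software_probe(task_name),
    }

    # S400: store both records in the log storage module.
    log_store.append(("hardware", hardware_info))
    log_store.append(("software", software_info))
    return hardware_info, software_info


# Stand-in probes returning fixed values, for illustration only.
def fake_hardware_probe(server):
    return {"server_cpu_occupancy": 0.35, "server_memory": 0.60}


def fake_software_probe(task):
    return {"queue_name": "default", "queue_utilization": 0.50}


logs = []
hw, sw = monitor_task(
    "hourly_log_etl", {"hourly_log_etl": "10.0.0.7"},
    fake_hardware_probe, fake_software_probe, logs,
)
```

Passing the probes in as callables keeps the sketch independent of any particular monitoring tool; a real deployment would replace them with the preset hardware and software monitoring programs.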
In this embodiment, a big data task to be monitored is acquired on a task scheduler, and the target server to which the big data task to be monitored is allocated is determined; hardware monitoring is performed on hardware information of the target server, while it executes the big data task to be monitored, through a preset hardware monitoring program on the target server, and hardware monitoring information is generated; software monitoring is performed, through a preset software monitoring program, on software information on a distributed scheduling system while the target server executes the big data task to be monitored, and software monitoring information is generated; and the hardware monitoring information and the software monitoring information are stored in a log storage module. Because this embodiment monitors from both the hardware side and the software side, it can, compared with the existing approach of monitoring only specific operation parameters, monitor the stability, orderliness and resource rationality of the big data task.
The embodiments or specific implementation manners of the task resource monitoring method of the present invention may refer to the embodiments of the systems described above, and are not described herein again.
It should be noted that the above-described work flows are only exemplary, and do not limit the scope of the present invention, and in practical applications, a person skilled in the art may select some or all of them to achieve the purpose of the solution of the embodiment according to actual needs, and the present invention is not limited herein.
In addition, for technical details that are not described in detail in this embodiment, reference may be made to the task resource monitoring method provided in any embodiment of the present invention, and they are not described herein again.
Based on the first embodiment of the task resource monitoring method of the present invention, a second embodiment of the task resource monitoring method of the present invention is provided.
In this embodiment, after the step of storing the hardware monitoring information and the software monitoring information in a log storage module, the method further includes:
acquiring the operating parameter value ranges of the operating parameters in the hardware monitoring information and the software monitoring information within a preset time period;
determining a maximum value and a minimum value of the operation parameter according to the range of the operation parameter value;
determining a target early warning threshold value according to the maximum value and the minimum value of the operation parameters;
and judging whether the current operation parameter value is larger than the target early warning threshold value or not, and when the current operation parameter value is larger than the target early warning threshold value, adopting a preset prompting mode to carry out early warning.
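The steps above can be sketched as follows. The document does not say how the target early warning threshold is derived from the maximum and minimum values, so the formula used here, a point a fixed fraction of the way from the minimum to the maximum, is purely an assumption for illustration, as are the function names and the default `ratio`.

```python
def target_threshold(history, ratio=0.9):
    """Derive a threshold from the values observed in the preset time period.

    Assumption: threshold = minimum + ratio * (maximum - minimum).
    """
    if not history:
        raise ValueError("history must contain at least one value")
    lowest, highest = min(history), max(history)
    return lowest + ratio * (highest - lowest)


def should_warn(current_value, history, ratio=0.9):
    """Warn only when the current value is strictly greater than the threshold."""
    return current_value > target_threshold(history, ratio)


# CPU occupancy samples (percent) from a hypothetical preset time period.
cpu_history = [20, 40, 60, 80]
midpoint = target_threshold(cpu_history, ratio=0.5)
```

With `ratio=0.5` the threshold is simply the midpoint of the observed range. Because the threshold is recomputed from recent history, it adapts to each parameter's normal range instead of relying on one fixed limit.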
Other embodiments or specific implementation manners of the task resource monitoring method of the present invention may refer to the above method embodiments, and are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and can certainly also be implemented by hardware, but in many cases the former is the better implementation. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., a ROM/RAM, a magnetic disk, an optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A task resource monitoring system, comprising at least: a task monitoring module;
the task monitoring module is used for acquiring a big data task to be monitored on the task scheduler and determining a target server to which the big data task to be monitored is distributed;
the task monitoring module is further configured to perform hardware monitoring on hardware information of the target server in the process of executing the big data task to be monitored through a preset hardware monitoring program on the target server, and generate hardware monitoring information;
the task monitoring module is also used for carrying out software monitoring on software information on a distributed scheduling system in the process that the target server executes the big data task to be monitored through a preset software monitoring program and generating software monitoring information;
the task monitoring module is further configured to store the hardware monitoring information and the software monitoring information in a log storage module.
2. The task resource monitoring system of claim 1, further comprising a resource presentation module;
the resource display module is used for acquiring a query instruction of a user, selecting target parameter information from the hardware monitoring information and the software monitoring information according to the query instruction, and displaying the target parameter information according to a preset display strategy.
3. The task resource monitoring system of claim 2, further comprising a task early warning module;
the task early warning module is used for judging whether the hardware monitoring information and the software monitoring information meet corresponding preset early warning conditions or not, and when the operation parameter information meets the preset early warning conditions, early warning is carried out in a preset prompting mode.
4. The task resource monitoring system of claim 3, wherein the task early warning module is further configured to obtain an operating parameter value range of each operating parameter within a preset time period;
the task early warning module is also used for determining the maximum value and the minimum value of the operation parameter according to the range of the operation parameter value;
the task early warning module is also used for determining a target early warning threshold value according to the maximum value and the minimum value of the operation parameters;
the task early warning module is further configured to determine whether a current operation parameter value is greater than the target early warning threshold value, and perform early warning in a preset prompting manner when the current operation parameter value is greater than the target early warning threshold value.
5. The task resource monitoring system of claim 4, wherein the task early warning module is further configured to obtain a preset task allocation table when the current operating parameter value is greater than the target early warning threshold value;
the task early warning module is also used for searching an object to be notified corresponding to the big data task to be monitored according to the preset task allocation table;
the task early warning module is further configured to send hardware monitoring information and software monitoring information corresponding to the big data task to be monitored to the object to be notified, so that the object to be notified corrects the current big data task according to the hardware monitoring information and the software monitoring information.
6. The task resource monitoring system of claim 1, wherein the log storage module is further configured to generate a hardware resource log according to the hardware monitoring information, and store the hardware resource log in a hardware information table of a preset Mysql database;
the log storage module is further used for generating a software resource log according to the software monitoring information and storing the software resource log into a software information table of a preset Mysql database.
7. The task resource monitoring system of claim 1, wherein the hardware monitoring information includes server IP, server process, server CPU occupancy, server memory, task scheduling project name, task scheduling status;
the software monitoring information comprises the total number of cluster resources, the utilization rate of the cluster resources, the queue name, the queue utilization rate, the task name, the task execution time and the task cpu consumption information.
8. A task resource monitoring method is characterized by comprising the following steps:
acquiring a big data task to be monitored on a task scheduler, and determining a target server to which the big data task to be monitored is distributed;
performing hardware monitoring on hardware information of the target server in the process of executing the big data task to be monitored through a preset hardware monitoring program on the target server, and generating hardware monitoring information;
performing software monitoring on software information on a distributed scheduling system in the process of executing the big data task to be monitored by the target server through a preset software monitoring program, and generating software monitoring information;
and storing the hardware monitoring information and the software monitoring information into a log storage module.
9. A task resource monitoring method as recited in claim 8, wherein after the step of storing the hardware monitoring information and the software monitoring information in a log storage module, further comprising:
acquiring the operating parameter value ranges of the operating parameters in the hardware monitoring information and the software monitoring information within a preset time period;
determining a maximum value and a minimum value of the operation parameter according to the range of the operation parameter value;
determining a target early warning threshold value according to the maximum value and the minimum value of the operation parameters;
and judging whether the current operation parameter value is larger than the target early warning threshold value or not, and when the current operation parameter value is larger than the target early warning threshold value, adopting a preset prompting mode to carry out early warning.
10. A storage medium having stored thereon a task resource monitor program, which when executed by a processor implements the steps of the task resource monitoring method according to any one of claims 8 to 9.
CN202111078983.7A 2021-09-13 2021-09-13 Task resource monitoring system, method and storage medium Pending CN113778803A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111078983.7A CN113778803A (en) 2021-09-13 2021-09-13 Task resource monitoring system, method and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111078983.7A CN113778803A (en) 2021-09-13 2021-09-13 Task resource monitoring system, method and storage medium

Publications (1)

Publication Number Publication Date
CN113778803A true CN113778803A (en) 2021-12-10

Family

ID=78843967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111078983.7A Pending CN113778803A (en) 2021-09-13 2021-09-13 Task resource monitoring system, method and storage medium

Country Status (1)

Country Link
CN (1) CN113778803A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114900447A (en) * 2022-05-07 2022-08-12 北京红山信息科技研究院有限公司 Software and hardware resource management monitoring system based on Pass platform
CN114900447B (en) * 2022-05-07 2023-12-12 北京红山信息科技研究院有限公司 Software and hardware resource management monitoring system based on Pass platform

Similar Documents

Publication Publication Date Title
CN110427252B (en) Task scheduling method, device and storage medium based on task dependency relationship
CN106406993A (en) Timed task management method and system
CN111338791A (en) Method, device and equipment for scheduling cluster queue resources and storage medium
CN109450693B (en) Hybrid cloud monitoring system and monitoring method using same
CN109669835B (en) MySQL database monitoring method, device, equipment and readable storage medium
CN110347494B (en) Context information management method, device, system and computer readable storage medium
CN107430526B (en) Method and node for scheduling data processing
CN114035925A (en) Workflow scheduling method, device and equipment and readable storage medium
CN111339062B (en) Data monitoring method and device, electronic equipment and storage medium
US9607275B2 (en) Method and system for integration of systems management with project and portfolio management
CN113778803A (en) Task resource monitoring system, method and storage medium
CN112200505A (en) Cross-business system process monitoring device and method, corresponding equipment and storage medium
CN108984290A (en) Method for scheduling task and system
CN111709723A (en) RPA business process intelligent processing method, device, computer equipment and storage medium
CN109639490B (en) Downtime notification method and device
CN112948109B (en) Quota flexible scheduling method, device and medium for AI computing cluster
CN113157569A (en) Automatic testing method and device, computer equipment and storage medium
CN112685160A (en) Scheduling method and device of timing task, terminal equipment and computer storage medium
CN111580948A (en) Task scheduling method and device and computer equipment
TW201737108A (en) Method for copying clustered data, and method and device for determining priority
CN114564249B (en) Recommendation scheduling engine, recommendation scheduling method and computer readable storage medium
CN116149829A (en) Task management method, device, equipment and storage medium
CN112748990A (en) Quartz-based data quality task execution method and device and computer equipment
CN113434591B (en) Data processing method and device
CN115509716A (en) Task scheduling method, system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination