CN114036032A - Real-time program monitoring method and device - Google Patents


Publication number
CN114036032A
Authority
CN
China
Prior art keywords
real
program
time program
time
running
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210020065.7A
Other languages
Chinese (zh)
Inventor
王禹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sohu Internet Information Service Co Ltd
Original Assignee
Beijing Sohu Internet Information Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sohu Internet Information Service Co Ltd
Priority to CN202210020065.7A
Publication of CN114036032A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a method and a device for monitoring a real-time program, wherein the method comprises the following steps: according to a timing task, periodically acquiring the program running number and other monitoring indexes corresponding to each real-time program; for each real-time program, determining the running state of the real-time program based on the program running number corresponding to the real-time program and the other monitoring indexes; and for each real-time program, if the running state of the real-time program is an abnormal state, processing the real-time program according to a preset program exception handling scheme and the basic information corresponding to the real-time program. Technicians no longer need to monitor the running states of the real-time programs one by one by opening Web UI pages, and abnormal real-time programs can be processed automatically according to the program exception handling scheme, which improves monitoring efficiency and ensures that monitoring is timely.

Description

Real-time program monitoring method and device
Technical Field
The invention relates to the technical field of data processing, in particular to a method and a device for monitoring a real-time program.
Background
With the continuous expansion of services, there are more and more needs for real-time data statistics, and these needs are usually satisfied by Spark real-time programs and Flink real-time programs. In order to ensure stable operation of the service, the running state of each real-time program needs to be monitored.
The current way of monitoring the running state of a real-time program is as follows: each real-time program has an independent Web UI page, and technicians monitor the running state of a certain real-time program through the Web UI page of the real-time program. However, in general, a service includes a large number of real-time programs, and a technician can only open the Web UI pages of each real-time program one by one to monitor the running state of each real-time program, which consumes a lot of time, cannot find the abnormal state of the real-time program in time, and has low monitoring efficiency and poor real-time performance.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for monitoring a real-time program, so as to solve the problems of low monitoring efficiency and poor real-time performance in the existing method for monitoring a real-time program.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
the first aspect of the embodiments of the present invention discloses a method for monitoring a real-time program, where the method includes:
according to the timing task, periodically acquiring the program running quantity and other monitoring indexes corresponding to each real-time program;
for each real-time program, determining the running state of the real-time program based on the program running number corresponding to the real-time program and the other monitoring indexes, wherein the running state is a normal state or an abnormal state;
and for each real-time program, if the running state of the real-time program is an abnormal state, processing the real-time program according to a preset program abnormal processing scheme and basic information corresponding to the real-time program.
Preferably, the periodically obtaining the program running number and other monitoring indexes corresponding to each real-time program according to the timing task includes:
for each real-time program, periodically acquiring a value corresponding to the program name of the real-time program from a dictionary according to a timing task, and taking the acquired value as the program running number corresponding to the real-time program, wherein the dictionary stores multiple groups of key-value pairs in advance, and the key and the value of each group of key-value pairs are respectively the program name of a real-time program and the program running number corresponding to that real-time program;
and for each real-time program, periodically acquiring other monitoring indexes of the real-time program through an API (application programming interface) according to the timing task.
Preferably, for each real-time program, determining the running state of the real-time program based on the program running number and the other monitoring indexes corresponding to the real-time program includes:
for each real-time program, when the real-time program meets an abnormal condition, determining that the running state of the real-time program is an abnormal state, wherein the abnormal condition is as follows: the running number of the programs corresponding to the real-time programs is not 1, and/or the state parameters corresponding to the real-time programs are out of the threshold range, and the state parameters are determined by other monitoring indexes;
and when the real-time program does not meet the abnormal condition, determining that the running state of the real-time program is a normal state.
Preferably, for each real-time program, if the running state of the real-time program is an abnormal state, processing the real-time program according to a preset program exception handling scheme and basic information corresponding to the real-time program, including:
for each real-time program, if the running state of the real-time program is an abnormal state, sending alarm information at least carrying basic information of the real-time program in a preset mode under the condition that the running number of the programs corresponding to the real-time program is 0, and starting the real-time program;
for each real-time program, if the running state of the real-time program is an abnormal state, sending alarm information at least carrying basic information of the real-time program in a preset mode under the condition that the running number of the programs corresponding to the real-time program is not 0.
Preferably, the basic information of the real-time program at least comprises: program principal information, program name, start command, and program type.
A second aspect of the embodiments of the present invention discloses a device for monitoring a real-time program, where the device includes:
the acquisition unit is used for periodically acquiring the program running quantity and other monitoring indexes corresponding to each real-time program according to the timing task;
the determining unit is used for determining the running state of each real-time program based on the program running number corresponding to the real-time program and the other monitoring indexes, wherein the running state is a normal state or an abnormal state;
and the processing unit is used for processing each real-time program according to a preset program exception handling scheme and basic information corresponding to the real-time program if the running state of the real-time program is an exception state.
Preferably, the acquiring unit includes:
a first acquisition module, configured to, for each real-time program, periodically acquire a value corresponding to the program name of the real-time program from a dictionary according to the timing task and take the acquired value as the program running number corresponding to the real-time program, wherein the dictionary stores multiple groups of key-value pairs in advance, and the key and the value of each group of key-value pairs are respectively the program name of a real-time program and the program running number corresponding to that real-time program;
and the second acquisition module is used for periodically acquiring other monitoring indexes of each real-time program through an API (application programming interface) according to the timing task.
Preferably, the determining unit is specifically configured to: for each real-time program, when the real-time program meets an abnormal condition, determining that the running state of the real-time program is an abnormal state, wherein the abnormal condition is as follows: the running number of the programs corresponding to the real-time programs is not 1, and/or the state parameters corresponding to the real-time programs are out of the threshold range, and the state parameters are determined by other monitoring indexes; and when the real-time program does not meet the abnormal condition, determining that the running state of the real-time program is a normal state.
Preferably, the processing unit is specifically configured to: for each real-time program, if the running state of the real-time program is an abnormal state, sending alarm information at least carrying basic information of the real-time program in a preset mode under the condition that the running number of the programs corresponding to the real-time program is 0, and starting the real-time program; for each real-time program, if the running state of the real-time program is an abnormal state, sending alarm information at least carrying basic information of the real-time program in a preset mode under the condition that the running number of the programs corresponding to the real-time program is not 0.
Preferably, the basic information of the real-time program at least comprises: program principal information, program name, start command, and program type.
Based on the above method and apparatus for monitoring a real-time program provided by the embodiments of the present invention, the method includes: according to the timing task, periodically acquiring the program running quantity and other monitoring indexes corresponding to each real-time program; for each real-time program, determining the running state of the real-time program based on the running number of the programs corresponding to the real-time program and other monitoring indexes; and for each real-time program, if the running state of the real-time program is an abnormal state, processing the real-time program according to a preset program abnormal processing scheme and basic information corresponding to the real-time program. In the scheme, the program running number and other monitoring indexes corresponding to each real-time program are periodically acquired, and the running state of the real-time program is judged according to the program running number and other monitoring indexes. And for the real-time program with the abnormal running state, processing the real-time program according to the program exception handling scheme and the basic information of the real-time program. The running states of the real-time programs are not required to be monitored one by technicians in a mode of opening a Web UI page, abnormal real-time programs can be automatically processed according to a program exception handling scheme, monitoring efficiency is improved, and monitoring instantaneity is guaranteed.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart of a monitoring method for a real-time program according to an embodiment of the present invention;
Fig. 2 is a block diagram of a monitoring apparatus for a real-time program according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
As known from the background art, each real-time program has an independent Web UI page, and at present, when monitoring a real-time program, a technician needs to monitor the running state of the real-time program through a Web UI page of a certain real-time program. However, in general, a service includes a large number of real-time programs, and a technician can only open the Web UI pages of each real-time program one by one to monitor the running state of each real-time program, which consumes a lot of time, cannot find the abnormal state of the real-time program in time, and has low monitoring efficiency and poor real-time performance.
Therefore, embodiments of the present invention provide a method and an apparatus for monitoring a real-time program, which periodically obtain a program running number and other monitoring indexes corresponding to each real-time program, and accordingly determine a running state of the real-time program. And for the real-time program with the abnormal running state, processing the real-time program according to the program exception handling scheme and the basic information of the real-time program. The running states of the real-time programs are not required to be monitored one by technicians in a mode of opening a Web UI page, and abnormal real-time programs can be automatically processed according to a program exception handling scheme, so that the monitoring efficiency is improved, and the monitoring instantaneity is ensured.
It should be noted that the method and the device for monitoring a real-time program according to the embodiments of the present invention are used to monitor the running states of real-time programs, such as Spark real-time programs and Flink real-time programs, and to process any real-time program in which an exception occurs. Details of the method for monitoring a real-time program are described in the following embodiments.
Referring to fig. 1, a flowchart of a monitoring method for a real-time program according to an embodiment of the present invention is shown, where the monitoring method includes:
Step S101: periodically acquiring the program running number and other monitoring indexes corresponding to each real-time program according to the timing task.
In the process of specifically implementing step S101, the program running number and other monitoring indexes are periodically acquired, according to the timing task, for every real-time program; that is, each real-time program has its own program running number and its own other monitoring indexes.
It should be noted that, for a certain real-time program, the program running number corresponding to the real-time program indicates the number of instances of the real-time program that have been started and are currently running. The program running number is itself one of the monitoring indexes; the other monitoring indexes in the embodiment of the invention are the monitoring indexes other than the program running number.
Specifically, a script executed on a schedule by crontab, which ships with Linux (this scheduled script is the timing task), can be used to periodically acquire the program running number and other monitoring indexes corresponding to each real-time program; for example, the program running number and other monitoring indexes corresponding to each real-time program are acquired every 5 minutes.
It should be noted that the interval time for acquiring the program running number and other monitoring indexes corresponding to each real-time program may be set according to actual service requirements and the tolerance of the unavailable time of the service, and is not specifically limited herein.
In some embodiments, the real-time program may be a real-time computing task (or real-time computing program).
It will be appreciated that all real-time programs typically run in some resource management system; for example, all real-time programs run on YARN (the resource manager of Apache Hadoop). All running real-time programs are obtained in advance through a specified command line, and the program name and program running number corresponding to each running real-time program are stored in a dictionary. The content stored in the dictionary is a plurality of groups of key-value pairs, and each group of key-value pairs corresponds to one real-time program; in each group of key-value pairs, the key is the program name of the real-time program, and the value is the program running number corresponding to the real-time program.
For example, assuming that all real-time programs run on YARN, all running real-time programs can be obtained with the following command line: yarn application -list | grep -E 'RUNNING|ACCEPTED'
With the above, the specific way of periodically obtaining the program running number and other monitoring indexes corresponding to each real-time program is as follows: and for each real-time program, periodically acquiring a value corresponding to the program name of the real-time program from a dictionary according to the timing task, and taking the acquired value as the program running number corresponding to the real-time program.
That is, for a certain real-time program, the value corresponding to the key is looked up in the dictionary by using the program name of the real-time program as the key, and the looked-up value is the program running number corresponding to the real-time program.
It should be noted that, for a certain real-time program, the program running number corresponding to the real-time program may be used to determine whether the real-time program has died or has been started more than once; that is, when the program running number corresponding to the real-time program is 0, the real-time program has died (or has been stopped); and when the program running number corresponding to the real-time program is greater than 1, duplicate instances of the real-time program have been started.
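The dictionary described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the embodiment: it assumes the listing is tab-separated with the application ID, name, and state in the first, second, and sixth columns, and the names `SAMPLE_LISTING`, `count_running_programs`, and `running_number`, as well as the sample rows, are hypothetical.

```python
from collections import Counter

# Illustrative rows only (made-up IDs, names, and URLs), laid out as
# tab-separated columns: id, name, type, user, queue, state, final state,
# progress, tracking URL. This column layout is an assumption.
SAMPLE_LISTING = "\n".join([
    "Total number of applications: 4",
    "\t".join(["application_1_0001", "pv_stats", "Apache Flink", "etl",
               "default", "RUNNING", "UNDEFINED", "100%", "http://host-a:8081"]),
    "\t".join(["application_1_0002", "pv_stats", "Apache Flink", "etl",
               "default", "ACCEPTED", "UNDEFINED", "0%", "N/A"]),
    "\t".join(["application_1_0003", "uv_stats", "SPARK", "etl",
               "default", "RUNNING", "UNDEFINED", "10%", "http://host-b:4040"]),
    "\t".join(["application_1_0004", "old_job", "SPARK", "etl",
               "default", "FINISHED", "SUCCEEDED", "100%", "N/A"]),
])


def count_running_programs(listing: str) -> Counter:
    """Build the dictionary: key = program name, value = program running number."""
    counts = Counter()
    for line in listing.splitlines():
        fields = [field.strip() for field in line.split("\t")]
        # Skip summary lines, column headers, and anything that is not a record.
        if len(fields) < 6 or not fields[0].startswith("application_"):
            continue
        if fields[5] in ("RUNNING", "ACCEPTED"):
            counts[fields[1]] += 1
    return counts


def running_number(counts: Counter, program_name: str) -> int:
    """Look up the program running number; a missing key means 0 instances."""
    return counts.get(program_name, 0)
```

With the sample listing, `pv_stats` maps to 2 (started more than once), `uv_stats` maps to 1, and a name that does not appear in the listing maps to 0 (the program has died or was stopped).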
For each real-time program, the other monitoring indexes of the real-time program are periodically acquired through an API according to the timing task. For example, if the real-time program is a Spark real-time program or a Flink real-time program, the other monitoring indexes of the real-time program can be obtained by using the REST API of Spark or Flink, respectively.
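As a hedged sketch for the Flink case, the snippet below derives two state parameters from REST responses. The patent does not name any endpoint; the paths `/jobmanager/metrics` and `/jobs/<job_id>/checkpoints`, the metric IDs, and the `counts` fields are assumptions based on the Flink REST API and should be checked against the Flink version in use. All function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=10.0):
    """Fetch a URL and decode its JSON body (standard library only)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def jobmanager_heap_ratio(metrics):
    """Used/max heap ratio from a list of {"id": ..., "value": ...} items."""
    values = {m["id"]: float(m["value"]) for m in metrics}
    return values["Status.JVM.Memory.Heap.Used"] / values["Status.JVM.Memory.Heap.Max"]


def checkpoint_failure_rate(stats):
    """Failed checkpoints divided by all triggered checkpoints (0.0 if none)."""
    counts = stats.get("counts", {})
    total = counts.get("total", 0)
    return counts.get("failed", 0) / total if total else 0.0


def collect_state_parameters(base_url, job_id, fetch=fetch_json):
    """Collect state parameters; `fetch` is injectable so it can be stubbed."""
    metrics = fetch(
        f"{base_url}/jobmanager/metrics"
        "?get=Status.JVM.Memory.Heap.Used,Status.JVM.Memory.Heap.Max"
    )
    stats = fetch(f"{base_url}/jobs/{job_id}/checkpoints")
    return {
        "jobmanager_heap_ratio": jobmanager_heap_ratio(metrics),
        "checkpoint_failure_rate": checkpoint_failure_rate(stats),
    }
```

Passing the fetch function as a parameter keeps the parsing logic testable without a running cluster; a Spark program would need a different set of endpoints.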
Step S102: for each real-time program, determining the running state of the real-time program based on the program running number corresponding to the real-time program and the other monitoring indexes.
It should be noted that the running state is either a normal state or an abnormal state. The other monitoring indexes of the real-time program may be used to determine state parameters of the real-time program; for example, the JobManager memory usage ratio and the Checkpoint failure rate of the real-time program can be determined.
It should be noted that the JobManager is a component running under YARN that is mainly used to coordinate the scheduling of Spark or Flink tasks with the resource manager, and that Checkpoint is used to recover a program after a failure.
In the process of implementing step S102 specifically, for each real-time program, when the real-time program meets an abnormal condition, it is determined that the running state of the real-time program is an abnormal state, where the abnormal condition is: the running number of the programs corresponding to the real-time programs is not 1, and/or the state parameters corresponding to the real-time programs are out of the threshold range and are determined by other monitoring indexes.
As can be seen from the above, for a certain real-time program, if the program running number corresponding to the real-time program is 0, the real-time program has died, and its running state is an abnormal state; if the program running number corresponding to the real-time program is greater than 1, duplicate instances of the real-time program have been started, and its running state is likewise an abnormal state.
That is, for a certain real-time program, if the program running number corresponding to the real-time program is not 1, and/or if the state parameter corresponding to the real-time program is outside the threshold range, it indicates that the real-time program satisfies the abnormal condition, and determines that the running state of the real-time program is the abnormal state.
For example: for a certain real-time program, if the JobManager memory usage ratio of the real-time program exceeds 0.8, the running state of the real-time program is determined to be an abnormal state.
Another example is: for a certain real-time program, if the program running number corresponding to the real-time program is 0, determining that the running state of the real-time program is an abnormal state.
For each real-time program, when the real-time program does not meet the abnormal condition, the running state of the real-time program is determined to be a normal state. That is, for a certain real-time program, if the program running number corresponding to the real-time program is 1 and the state parameter corresponding to the real-time program is within the threshold range, the running state of the real-time program is determined to be a normal state.
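The decision rule of step S102 can be sketched as a small Python function. This is an illustration under assumptions: the function name `determine_state`, the parameter names, and the representation of each threshold range as a `(low, high)` pair are hypothetical; only the rule itself (a running number other than 1, or any state parameter outside its range, means abnormal) comes from the description.

```python
def determine_state(running_number, state_params, thresholds):
    """Return "abnormal" if the abnormal condition is met, else "normal".

    state_params: mapping of parameter name -> measured value
    thresholds:   mapping of parameter name -> (low, high) allowed range;
                  parameters without a configured range are not checked.
    """
    # 0 means the program has died; more than 1 means duplicate instances.
    if running_number != 1:
        return "abnormal"
    for name, value in state_params.items():
        low, high = thresholds.get(name, (float("-inf"), float("inf")))
        if not (low <= value <= high):
            return "abnormal"
    return "normal"
```

For instance, with a threshold range of `(0.0, 0.8)` for the JobManager memory usage ratio, a program with a running number of 1 and a measured ratio of 0.9 is classified as abnormal, matching the 0.8 example above.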
Step S103: for each real-time program, if the running state of the real-time program is an abnormal state, processing the real-time program according to a preset program exception handling scheme and the basic information corresponding to the real-time program.
It should be noted that, the program exception handling scheme and the basic information corresponding to the real-time program are configured in advance, and the program exception handling scheme and the basic information corresponding to each real-time program are stored in the database.
In some embodiments, the basic information of the real-time program comprises at least: information on the person in charge of the program, the program name, the start command, the program type, and the like.
In some specific embodiments, the program exception handling scheme corresponding to a real-time program can be configured at a client through the command line according to actual requirements, where the program exception handling scheme specifies how the real-time program is handled when an exception occurs. For example, the program exception handling scheme of a Flink real-time program may be: when the Flink real-time program is detected to have stopped, send alarm information by mail and/or WeChat, and restart the Flink real-time program from a checkpoint.
In the process of implementing step S103 specifically, a program exception handling scheme and basic information corresponding to the real-time program are acquired from the database.
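A minimal sketch of such a lookup is shown below. The patent does not specify the database or its schema, so the use of SQLite, the table name `program_info`, its columns, and the `ProgramInfo` record are all assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProgramInfo:
    """Basic information plus the configured exception handling scheme."""
    name: str
    owner: str
    start_command: str
    program_type: str
    handling_scheme: str


def create_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical program_info table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE program_info ("
        "name TEXT PRIMARY KEY, owner TEXT, start_command TEXT, "
        "program_type TEXT, handling_scheme TEXT)"
    )
    return conn


def save_info(conn: sqlite3.Connection, info: ProgramInfo) -> None:
    """Configure (insert or overwrite) the record for one real-time program."""
    conn.execute(
        "INSERT OR REPLACE INTO program_info VALUES (?, ?, ?, ?, ?)",
        (info.name, info.owner, info.start_command,
         info.program_type, info.handling_scheme),
    )


def load_info(conn: sqlite3.Connection, name: str) -> Optional[ProgramInfo]:
    """Fetch the record for a program name, or None if none is configured."""
    row = conn.execute(
        "SELECT name, owner, start_command, program_type, handling_scheme "
        "FROM program_info WHERE name = ?",
        (name,),
    ).fetchone()
    return ProgramInfo(*row) if row else None
```

Keying the table on the program name mirrors the dictionary used for the program running number, so the same key serves both lookups.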
For each real-time program, if the running state of the real-time program is a normal state, the real-time program is not processed.
For each real-time program, if the running state of the real-time program is an abnormal state and the program running number corresponding to the real-time program is 0 (namely, the real-time program has died), alarm information carrying at least the basic information of the real-time program is sent in a preset mode, and the real-time program is started. For example, if the running state of a certain real-time program is an abnormal state and the real-time program has died, alarm information carrying at least the basic information of the real-time program is sent by WeChat and/or mail, and the real-time program is restarted from a checkpoint.
For each real-time program, if the running state of the real-time program is an abnormal state, sending alarm information at least carrying basic information of the real-time program in a preset mode under the condition that the running number of programs corresponding to the real-time program is not 0 (namely the real-time program is not hung).
In this way, when the running state of a real-time program is an abnormal state, alarm information carrying at least the basic information of the real-time program is sent, so that a user can learn of the exception in time from the basic information carried by the alarm information and handle it quickly.
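The two handling branches above can be sketched as follows. The function `handle_abnormal`, the dictionary keys, and the callback parameters `send_alert` and `start_program` are hypothetical; how an alarm is actually delivered (mail, WeChat) and how a program is restarted from a checkpoint are left to the callbacks and are not shown.

```python
def handle_abnormal(info, running_number, send_alert, start_program):
    """Process one abnormal real-time program and return the actions taken.

    info:          mapping with the program's basic information
    send_alert:    callable taking the alarm text (delivery channel assumed)
    start_program: callable taking the configured start command
    """
    actions = []
    # An alarm carrying the basic information is sent in both branches.
    send_alert(
        f"Real-time program '{info['name']}' ({info['program_type']}) is abnormal: "
        f"running number = {running_number}; owner = {info['owner']}; "
        f"start command = {info['start_command']}"
    )
    actions.append("alert")
    # Only a program with no running instance is started again.
    if running_number == 0:
        start_program(info["start_command"])
        actions.append("start")
    return actions
```

Note that a program with a running number greater than 1 only triggers an alarm here, as in the description; stopping the surplus instances is left to the person who receives the alarm.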
In the embodiment of the invention, the program running number and other monitoring indexes corresponding to each real-time program are periodically acquired, and the running state of the real-time program is judged according to the program running number and other monitoring indexes. And for the real-time program with the abnormal running state, processing the real-time program according to the program exception handling scheme and the basic information of the real-time program. The running states of the real-time programs are not required to be monitored one by technicians in a mode of opening a Web UI page, abnormal real-time programs can be automatically processed according to a program exception handling scheme, monitoring efficiency is improved, and monitoring instantaneity is guaranteed.
Corresponding to the monitoring method of the real-time program provided by the embodiment of the present invention, referring to fig. 2, an embodiment of the present invention further provides a structural block diagram of a monitoring device of the real-time program, where the monitoring device includes: an acquisition unit 201, a determination unit 202, and a processing unit 203;
the acquiring unit 201 is configured to periodically acquire the program running number and other monitoring indexes corresponding to each real-time program according to the timing task.
The determining unit 202 is configured to determine, for each real-time program, an operating state of the real-time program based on the program operating quantity and other monitoring indicators corresponding to the real-time program, where the operating state is a normal state or an abnormal state.
In a specific implementation, the determining unit 202 is specifically configured to: for each real-time program, when the real-time program meets an abnormal condition, determining that the running state of the real-time program is an abnormal state, wherein the abnormal condition is as follows: the running number of the programs corresponding to the real-time programs is not 1, and/or the state parameters corresponding to the real-time programs are out of the threshold range and are determined by other monitoring indexes; and when the real-time program does not meet the abnormal condition, determining that the running state of the real-time program is a normal state.
The processing unit 203 is configured to, for each real-time program, process the real-time program according to a preset program exception handling scheme and basic information corresponding to the real-time program if the running state of the real-time program is an exception state.
In a specific implementation, the processing unit 203 is specifically configured to: for each real-time program, if the running state of the real-time program is an abnormal state, sending alarm information at least carrying basic information of the real-time program in a preset mode under the condition that the running number of the programs corresponding to the real-time program is 0, and starting the real-time program; and for each real-time program, if the running state of the real-time program is an abnormal state, sending alarm information at least carrying basic information of the real-time program in a preset mode under the condition that the running number of the programs corresponding to the real-time program is not 0.
In some embodiments, the basic information of the real-time program comprises at least: program principal information, program name, start command, and program type.
In the embodiment of the invention, the program running number and other monitoring indexes corresponding to each real-time program are periodically acquired, and the running state of the real-time program is judged according to the program running number and other monitoring indexes. And for the real-time program with the abnormal running state, processing the real-time program according to the program exception handling scheme and the basic information of the real-time program. The running states of the real-time programs are not required to be monitored one by technicians in a mode of opening a Web UI page, abnormal real-time programs can be automatically processed according to a program exception handling scheme, monitoring efficiency is improved, and monitoring instantaneity is guaranteed.
Preferably, in conjunction with the content shown in fig. 2, the obtaining unit 201 includes a first acquisition module and a second acquisition module, and the execution principle of each module is as follows:
the first acquisition module is configured to, for each real-time program, periodically acquire, according to a timing task, the value corresponding to the program name of the real-time program from a dictionary, and take the acquired value as the program running number corresponding to the real-time program, wherein the dictionary stores multiple key-value pairs in advance, and the key and the value of each key-value pair are respectively the program name of a real-time program and the program running number corresponding to that real-time program;
and the second acquisition module is configured to, for each real-time program, periodically acquire the other monitoring indexes of the real-time program through an API according to the timing task.
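The two acquisition modules can be sketched in Python as follows. This is only an illustration under stated assumptions: `run_counts` stands in for the pre-populated dictionary, `fetch_metrics` is a hypothetical callable standing in for the unnamed API, treating a missing dictionary key as a running number of 0 is a choice made here rather than something the specification states, and the simple loop in `run_periodically` stands in for whatever timing-task mechanism is actually used.

```python
import threading
from typing import Any, Callable, Dict, Iterable


def collect_once(
    program_names: Iterable[str],
    run_counts: Dict[str, int],
    fetch_metrics: Callable[[str], Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Gather one snapshot of run counts and other monitoring indexes."""
    snapshot = {}
    for name in program_names:
        snapshot[name] = {
            # First module: look up the running number by program name.
            # Assumption: a name absent from the dictionary counts as 0.
            "run_count": run_counts.get(name, 0),
            # Second module: other monitoring indexes come from an API call.
            "metrics": fetch_metrics(name),
        }
    return snapshot


def run_periodically(
    interval_seconds: float,
    task: Callable[[], None],
    stop_event: threading.Event,
) -> None:
    """Call `task` after every interval until `stop_event` is set."""
    # Event.wait returns False on timeout, so the task runs once per interval.
    while not stop_event.wait(interval_seconds):
        task()
```

In use, `task` would wrap a call to `collect_once` followed by the state judgment and exception handling for each program in the snapshot.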
In summary, embodiments of the present invention provide a method and an apparatus for monitoring real-time programs, which periodically acquire the program running number and other monitoring indexes corresponding to each real-time program and determine the running state of the real-time program according to the acquired program running number and other monitoring indexes. A real-time program whose running state is abnormal is processed according to the program exception handling scheme and the basic information of the real-time program. Technicians therefore do not need to monitor the running state of each real-time program one by one by opening a Web UI page, and abnormal real-time programs can be handled automatically according to the program exception handling scheme, which improves monitoring efficiency and keeps the monitoring real-time.
The embodiments in this specification are described in a progressive manner; for parts that are the same or similar across embodiments, reference may be made between them, and each embodiment focuses on its differences from the others. In particular, because the system or apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts of the method embodiment descriptions may be consulted. The system and apparatus embodiments described above are only illustrative: the units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. A person of ordinary skill in the art can understand and implement this without inventive effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for monitoring a real-time program, the method comprising:
periodically acquiring, according to a timing task, the program running number and other monitoring indexes corresponding to each real-time program;
for each real-time program, determining the running state of the real-time program based on the program running number corresponding to the real-time program and the other monitoring indexes, wherein the running state is a normal state or an abnormal state;
and for each real-time program, if the running state of the real-time program is an abnormal state, processing the real-time program according to a preset program exception handling scheme and basic information corresponding to the real-time program.
2. The method of claim 1, wherein periodically acquiring, according to the timing task, the program running number and other monitoring indexes corresponding to each real-time program comprises:
for each real-time program, periodically acquiring, according to the timing task, the value corresponding to the program name of the real-time program from a dictionary, and taking the acquired value as the program running number corresponding to the real-time program, wherein the dictionary stores multiple key-value pairs in advance, and the key and the value of each key-value pair are respectively the program name of a real-time program and the program running number corresponding to that real-time program;
and for each real-time program, periodically acquiring the other monitoring indexes of the real-time program through an API (application programming interface) according to the timing task.
3. The method of claim 1, wherein, for each real-time program, determining the running state of the real-time program based on the program running number corresponding to the real-time program and the other monitoring indexes comprises:
for each real-time program, when the real-time program meets an abnormal condition, determining that the running state of the real-time program is an abnormal state, wherein the abnormal condition is: the program running number corresponding to the real-time program is not 1, and/or the state parameter corresponding to the real-time program is outside the threshold range, the state parameter being determined by the other monitoring indexes;
and when the real-time program does not meet the abnormal condition, determining that the running state of the real-time program is a normal state.
4. The method according to claim 3, wherein, for each real-time program, if the running state of the real-time program is an abnormal state, processing the real-time program according to a preset program exception handling scheme and basic information corresponding to the real-time program comprises:
for each real-time program, if the running state of the real-time program is an abnormal state and the program running number corresponding to the real-time program is 0, sending alarm information carrying at least the basic information of the real-time program in a preset manner, and starting the real-time program;
for each real-time program, if the running state of the real-time program is an abnormal state and the program running number corresponding to the real-time program is not 0, sending alarm information carrying at least the basic information of the real-time program in the preset manner.
5. The method according to any of claims 1-4, wherein the basic information of the real-time program at least comprises: program principal information, program name, start command, and program type.
6. A real-time program monitoring apparatus, the apparatus comprising:
an acquisition unit, configured to periodically acquire, according to a timing task, the program running number and other monitoring indexes corresponding to each real-time program;
a determining unit, configured to determine, for each real-time program, the running state of the real-time program based on the program running number corresponding to the real-time program and the other monitoring indexes, wherein the running state is a normal state or an abnormal state;
and a processing unit, configured to, for each real-time program, process the real-time program according to a preset program exception handling scheme and basic information corresponding to the real-time program if the running state of the real-time program is an abnormal state.
7. The apparatus of claim 6, wherein the acquisition unit comprises:
a first acquisition module, configured to, for each real-time program, periodically acquire, according to the timing task, the value corresponding to the program name of the real-time program from a dictionary, and take the acquired value as the program running number corresponding to the real-time program, wherein the dictionary stores multiple key-value pairs in advance, and the key and the value of each key-value pair are respectively the program name of a real-time program and the program running number corresponding to that real-time program;
and a second acquisition module, configured to, for each real-time program, periodically acquire the other monitoring indexes of the real-time program through an API (application programming interface) according to the timing task.
8. The apparatus according to claim 6, wherein the determining unit is specifically configured to: for each real-time program, determine that the running state of the real-time program is an abnormal state when the real-time program meets an abnormal condition, wherein the abnormal condition is: the program running number corresponding to the real-time program is not 1, and/or the state parameter corresponding to the real-time program is outside the threshold range, the state parameter being determined by the other monitoring indexes; and determine that the running state of the real-time program is a normal state when the real-time program does not meet the abnormal condition.
9. The apparatus according to claim 6, wherein the processing unit is specifically configured to: for each real-time program, if the running state of the real-time program is an abnormal state and the program running number corresponding to the real-time program is 0, send alarm information carrying at least the basic information of the real-time program in a preset manner, and start the real-time program; and for each real-time program, if the running state of the real-time program is an abnormal state and the program running number corresponding to the real-time program is not 0, send alarm information carrying at least the basic information of the real-time program in the preset manner.
10. The apparatus according to any one of claims 6-9, wherein the basic information of the real-time program at least comprises: program principal information, program name, start command, and program type.
CN202210020065.7A 2022-01-10 2022-01-10 Real-time program monitoring method and device Pending CN114036032A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210020065.7A CN114036032A (en) 2022-01-10 2022-01-10 Real-time program monitoring method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210020065.7A CN114036032A (en) 2022-01-10 2022-01-10 Real-time program monitoring method and device

Publications (1)

Publication Number Publication Date
CN114036032A true CN114036032A (en) 2022-02-11

Family

ID=80147369

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210020065.7A Pending CN114036032A (en) 2022-01-10 2022-01-10 Real-time program monitoring method and device

Country Status (1)

Country Link
CN (1) CN114036032A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115129573A (en) * 2022-08-31 2022-09-30 国汽智控(北京)科技有限公司 Program operation monitoring method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407076A (en) * 2016-09-22 2017-02-15 山东浪潮云服务信息科技有限公司 A monitoring method for the operation information of software and hardware based on a domestic CPU and operating system environment
CN108509212A (en) * 2018-02-07 2018-09-07 平安科技(深圳)有限公司 Application program update test method, device, terminal device and storage medium
CN110794800A (en) * 2019-12-11 2020-02-14 河南中烟工业有限责任公司 Monitoring system for wisdom mill information management
CN112416712A (en) * 2020-11-20 2021-02-26 常州微亿智造科技有限公司 Monitoring method and device based on industrial cloud edge service data acquisition
CN112631913A (en) * 2020-12-23 2021-04-09 平安银行股份有限公司 Method, device, equipment and storage medium for monitoring operation fault of application program
WO2021164267A1 (en) * 2020-02-21 2021-08-26 平安科技(深圳)有限公司 Anomaly detection method and apparatus, and terminal device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FELIXAPFF: "《Linux下同进程多进程号实时监控》", 《HTTP://CNBLOGS.COM/APFF/P/7592585.HTML》 *

Similar Documents

Publication Publication Date Title
CN109039833B (en) Method and device for monitoring bandwidth state
CN109586952B (en) Server capacity expansion method and device
CN108710544B (en) Process monitoring method of database system and rail transit comprehensive monitoring system
US5668944A (en) Method and system for providing performance diagnosis of a computer system
US7340654B2 (en) Autonomic monitoring in a grid environment
CN111901422B (en) Method, system and device for managing nodes in cluster
CN110427307A (en) Log analytic method, device, computer equipment and storage medium
CN112751726B (en) Data processing method and device, electronic equipment and storage medium
CN110445650B (en) Detection alarm method, equipment and server
CN110417586B (en) Service monitoring method, service node, server and computer readable storage medium
CN114356499A (en) Kubernetes cluster alarm root cause analysis method and device
CN114036032A (en) Real-time program monitoring method and device
US7206975B1 (en) Internal product fault monitoring apparatus and method
CN107426012B (en) Fault recovery method and device based on super-fusion architecture
WO2014196982A1 (en) Identifying log messages
US5559726A (en) Method and system for detecting whether a parameter is set appropriately in a computer system
CN115712521A (en) Cluster node fault processing method, system and medium
CN115378794A (en) Gateway fault detection method and device based on snapshot mode
CN111885159B (en) Data acquisition method and device, electronic equipment and storage medium
CN111082964B (en) Distribution method and device of configuration information
CN110932926B (en) Container cluster monitoring method, system and device
CN106487599B (en) Method and system for distributed monitoring of running state of cloud access controller
CN113645099B (en) High availability monitoring method, device, equipment and storage medium
CN115022209A (en) Monitoring method, monitoring device and computer-readable storage medium
CN114218050A (en) Cloud platform fault processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220211