WO2018001048A1 - Multi-process monitoring method, apparatus and service system - Google Patents

Publication number
WO2018001048A1
Authority
WO
WIPO (PCT)
Prior art keywords
detection
detected
processes
detection result
configuration information
Application number
PCT/CN2017/087185
Other languages
French (fr)
Chinese (zh)
Inventor
还超
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Publication of WO2018001048A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring

Definitions

  • the present application relates to the field of multi-process monitoring technologies, for example, to a multi-process monitoring method, apparatus, and service system.
  • the detection items are limited to a single kind;
  • the detection items are checked serially, so a check may block midway and a single detection pass may take too long.
  • the present disclosure provides a multi-process monitoring method, apparatus and service system, which address the problems of limited detection items, difficult extension, and a tendency to block midway in the multi-process monitoring schemes of the related art.
  • an embodiment of the present disclosure provides a multi-process monitoring method, including:
  • the detection operation is stopped, and the detection result of the process to be detected corresponding to the detection operation is obtained.
  • the step of performing parallel detection on multiple processes to be detected running in the system includes:
  • the step of obtaining the detection result of the process to be detected corresponding to the detecting operation includes:
  • a detection result of detection abnormality or detection failure is obtained for the process to be detected corresponding to the detection operation.
  • it also includes:
  • the step of performing a corresponding processing operation on the to-be-detected process according to the detection result and the configuration information includes:
  • the process to be detected corresponding to the detection result is restarted according to the configuration information, and the process to be detected is detected after the restart is completed;
  • the entire system is restarted according to the configuration information.
  • the step of performing parallel detection on multiple processes to be detected running in the system includes:
  • the step of acquiring the detection result and the configuration information of the process to be detected that performs the detecting operation includes:
  • the second monitoring period is the next period of the first monitoring period.
  • the step of performing parallel detection on multiple processes to be detected running in the system includes:
  • the detection period is an integer multiple of the monitoring period.
  • the step of performing parallel detection on multiple processes to be detected running in the system includes:
  • the present disclosure also provides a multi-process monitoring device, including:
  • a detection module configured to perform parallel detection on a plurality of processes to be detected running in the system
  • the processing module is configured to stop the detecting operation if the detecting operation is not completed within the preset duration, and obtain a detection result of the to-be-detected process corresponding to the detecting operation.
  • the present disclosure also provides a multi-process service system, including: the multi-process monitoring device described above.
  • Embodiments of the present disclosure also provide a non-transitory computer readable storage medium storing computer executable instructions arranged to perform the above method.
  • An embodiment of the present disclosure further provides an electronic device, including:
  • at least one processor; and
  • the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to cause the at least one processor to perform the method described above.
  • the multi-process monitoring method performs parallel detection on multiple processes to be detected running in the system, stops any detection operation that has not completed within a preset duration, and continues with subsequent operations, so that the detection items are diversified, the detection process and result collection of the entire system are not blocked, and the scheme is easy to extend (detected processes can be added or removed flexibly).
  • FIG. 1 is a schematic flow chart of a multi-process monitoring method according to Embodiment 1 of the present disclosure
  • FIG. 2 is a schematic diagram of system startup according to Embodiment 1 of the present disclosure
  • FIG. 3 is a schematic diagram of a detection architecture according to Embodiment 1 of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a multi-process monitoring apparatus according to Embodiment 2 of the present disclosure.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • the present disclosure provides several solutions to the problems of limited detection items, difficult extension, and a tendency to block midway in the multi-process monitoring schemes of the related art, as follows:
  • a multi-process monitoring method provided in Embodiment 1 of the present disclosure includes:
  • Step 11: perform parallel detection on multiple processes to be detected running in the system;
  • Step 12: if a detection operation is not completed within the preset duration, stop that detection operation and obtain the detection result of the process to be detected corresponding to it.
  • the preset duration refers to the detection timeout threshold.
  • the multi-process monitoring method provided by Embodiment 1 of the present disclosure performs parallel detection on multiple processes to be detected running in the system, stops any detection operation that has not completed within a preset duration, and continues with subsequent operations, so that the detection items are diversified, the detection process and result collection of the entire system are not blocked, and the scheme is easy to extend (detected processes can be added or removed flexibly).
  • the step of performing parallel detection on multiple processes to be detected running in the system may include: starting and initializing the system; and after the system is initialized, performing parallel detection on multiple processes to be detected running in the system.
  • the measures that can be taken are: when the system is powered on, first call the module with the stop function, then call the module with the startup and initialization functions to start and initialize the system.
  • the step of obtaining the detection result of the process to be detected corresponding to the detection operation includes: obtaining a detection result of detection abnormality or detection failure for the process to be detected corresponding to the detection operation.
  • the multi-process monitoring method may further include: acquiring a detection result and configuration information of a process to be detected that performs a detection operation; and performing a corresponding processing operation on the to-be-detected process according to the detection result and the configuration information.
  • the step of performing a corresponding processing operation on the process to be detected according to the detection result and the configuration information includes: when the detection result is a detection abnormality or a detection failure, restarting the process to be detected corresponding to the detection result according to the configuration information, and detecting that process again after the restart is completed; or, when the detection result is a detection abnormality or a detection failure, restarting the entire system according to the configuration information.
  • the configuration information includes the processing operations to be performed when detection of a process fails or is abnormal.
  • the processing operations for failure and for abnormality may be the same or different, and are not limited herein.
  • the step of performing parallel detection on a plurality of processes to be detected running in the system includes: performing parallel detection on a plurality of processes to be detected running in the system in a first monitoring period;
  • the step of acquiring the detection result and the configuration information of the process to be detected that performs the detection operation includes: acquiring, in a second monitoring period, the detection result and configuration information of the process to be detected that performs the detection operation; the second monitoring period is the period following the first monitoring period.
  • the step of performing parallel detection on the plurality of processes to be detected in the system includes: separately detecting each process to be detected according to the monitoring period and the detection period of each process to be detected;
  • the detection period is an integer multiple of the monitoring period.
  • to support flexibly adding or removing detected processes, the step of performing parallel detection on the plurality of processes to be detected running in the system includes: acquiring detection information of the processes to be detected; and performing parallel detection on the plurality of processes to be detected running in the system according to the detection information.
  • the detection information includes the identity of the process to be detected.
  • the system startup unit is responsible for the initialization of the multi-process system environment and the startup of all processes.
  • when a detected process needs to be started, for example in certain abnormal situations, the startup operation is performed on the registered detected process through the system startup unit.
  • the system monitoring unit is responsible for timed execution (triggering the configured functional units), checking the detection result of each detected process, and applying the corresponding strategy (calling the corresponding functional unit). This can be done by creating a scheduled task in the system.
  • the system stop unit is responsible for stopping all processes in the multi-process system.
  • the system stop unit performs a stop operation on the registered detected processes.
  • the system control unit is responsible for the timeout control of the detected process and the detection of each process.
  • the system control unit periodically performs a state detection operation on the registered detected process (triggered by the system monitoring unit).
  • the environment configuration unit is responsible for configuring environment variables (process parameters, including operations to be performed after failure detection), timeout period of detection, time interval for monitoring, and submodules (processes) to be monitored.
  • the process function units are the functions that implement each sub-function, including start, stop, and detection.
  • the multi-process monitoring method provided in Embodiment 1 of the present disclosure can be summarized into two processes:
  • the system monitoring unit first calls the system stop unit to ensure that the environment is clean, and then calls the system startup unit.
  • the system monitoring unit sends a command to the system control unit according to the configuration of the environment configuration unit, and then the monitoring unit returns.
  • the system control unit concurrently calls the detection interface function of each process and controls the detection timeout of each process. All detections must complete within the specified time; any detection that blocks (a detection abnormality) is killed (the corresponding detection operation is stopped, which can be regarded as a detection failure).
  • the system monitoring unit executes the corresponding processing operation according to the detection result of the previous cycle; if there is an error, the process stop function interface (system stop unit) is called.
  • the solution provided by Embodiment 1 of the present disclosure can uniformly monitor multiple detected processes in the system and adopts parallel detection, so that the detection process and result collection of the entire system are not blocked, and detected processes can be added or removed flexibly.
  • a Linux multi-process service system has four processes, namely process A, process B, process C, and process D, which need to be monitored.
  • the detection items are written as function files and added directly to the environment configuration unit in modular form, so it is no longer necessary to stop detection and modify the relevant system code.
  • the system monitoring unit first calls the system stop unit and then calls the system startup unit to start and initialize the system, ensuring that the system does not start from an abnormal environment.
  • in the first monitoring period, the system monitoring unit creates asynchronous detection tasks for the four detected processes (new tasks are created with scripts, and the detection operations are performed by the system control unit); when the next monitoring period arrives, it collects the detection results of the previous monitoring period and performs the corresponding processing according to those results (calling the corresponding functional unit).
  • the detection period of a process is an integer multiple of the monitoring period. For example, if the monitoring period is 3 s, the detection period may be 3 s, 6 s, or 9 s.
  • if detection finds a problem with detected process A, the corresponding processing is performed according to the configuration: either only process A is restarted, or the entire system is restarted.
  • process A is not detected again during the detection period until it has finished restarting.
  • the multi-process monitoring apparatus provided in Embodiment 2 of the present disclosure includes:
  • the detecting module 41 is configured to perform parallel detection on multiple processes to be detected running in the system;
  • the first processing module 42 is configured to stop the detecting operation if the detecting operation is not completed within the preset time period, and obtain the detection result of the to-be-detected process corresponding to the detecting operation.
  • the preset duration refers to the detection timeout threshold.
  • the multi-process monitoring apparatus performs parallel detection on a plurality of processes to be detected running in the system, stops any detection operation that has not completed within a preset duration, and continues with subsequent operations, so that the detection items are diversified, the detection process and result collection of the entire system are not blocked, and the apparatus is easy to extend (detected processes can be added or removed flexibly).
  • the detecting module may include: a first processing sub-module configured to start and initialize the system; and the first detecting sub-module configured to perform parallel detection on the plurality of processes to be detected running in the system after the system is initialized.
  • the first processing module includes: a second processing sub-module configured to obtain a detection result of a detection abnormality or a detection failure of the to-be-detected process corresponding to the detecting operation.
  • the multi-process monitoring device may further include: an obtaining module configured to acquire a detection result and configuration information of a process to be detected that performs a detection operation; and a second processing module configured to perform a corresponding processing operation on the process to be detected according to the detection result and the configuration information.
  • the second processing module includes: a third processing submodule configured to, when the detection result is a detection abnormality or a detection failure, restart the process to be detected corresponding to the detection result according to the configuration information, and detect that process again after the restart is completed; or
  • a restarting submodule configured to restart the entire system according to the configuration information when the detection result is a detection abnormality or a detection failure.
  • the configuration information includes the processing operations to be performed when detection of a process fails or is abnormal.
  • the processing operations for failure and for abnormality may be the same or different, and are not limited herein.
  • the detection module includes: a second detection submodule configured to perform parallel detection on a plurality of processes to be detected running in the system in a first monitoring period; the obtaining module includes: a first acquiring submodule configured to acquire, in a second monitoring period, a detection result and configuration information of a process to be detected that performs a detection operation; the second monitoring period is the period following the first monitoring period.
  • the detection module includes: a third detection submodule configured to separately detect each process to be detected according to a monitoring period and the detection period of each process to be detected; the detection period is an integer multiple of the monitoring period.
  • the detection module includes: a second acquiring submodule configured to acquire detection information of the processes to be detected; and a fourth detection submodule configured to perform parallel detection on the plurality of processes to be detected running in the system according to the detection information.
  • the detection information includes the identity of the process to be detected.
  • the embodiment of the present disclosure further provides a multi-process service system, including: the multi-process monitoring device described above.
  • the implementation examples of the multi-process monitoring device are applicable to the embodiment of the multi-process service system, and the same technical effects can be achieved.
  • Embodiments of the present disclosure also provide a non-transitory computer readable storage medium storing computer executable instructions arranged to perform the method of any of the above embodiments.
  • the embodiment of the present disclosure further provides an electronic device, whose structure is shown schematically in FIG. 5.
  • the electronic device includes:
  • at least one processor 50 (one processor 50 is shown as an example in FIG. 5) and a memory 51; it may further include a communication interface 52 and a bus 53.
  • the processor 50, the communication interface 52, and the memory 51 can communicate with each other through the bus 53.
  • Communication interface 52 can be used for information transmission.
  • Processor 50 can invoke logic instructions in memory 51 to perform the methods of the above-described embodiments.
  • logic instructions in the memory 51 described above may be implemented in the form of software functional units and sold or used as separate products, and may be stored in a computer readable storage medium.
  • the memory 51 is used as a computer readable storage medium for storing software programs, computer executable programs, and program instructions/modules corresponding to the methods in the embodiments of the present disclosure.
  • the processor 50 performs functional applications and data processing by running the software programs, instructions, and modules stored in the memory 51, that is, it implements the multi-process monitoring method in the above method embodiments.
  • the memory 51 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application required for at least one function, and the data storage area may store data created according to use of the terminal device, and the like. Further, the memory 51 may include a high speed random access memory, and may also include a nonvolatile memory.
  • the technical solution of the embodiments of the present disclosure may be embodied in the form of a software product stored in a storage medium, including one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method described in the embodiments of the present disclosure.
  • the foregoing storage medium may be a non-transitory storage medium, including: a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and the like.
  • modules/sub-modules may be implemented in software for execution by various types of processors.
  • an identified executable code module can comprise one or more physical or logical blocks of computer instructions, which can be constructed, for example, as an object, procedure, or function. Nonetheless, the executable code of the identified module need not be physically located together, but may include different instructions stored in different locations that, when logically combined, constitute the module and implement the functionality of the module.
  • the executable code module can be a single instruction or a plurality of instructions, and can even be distributed across multiple different code segments, distributed among different programs, and distributed across multiple memory devices.
  • operational data may be identified within the modules and may be implemented in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed at different locations (including on different storage devices), and may at least partially exist as an electronic signal on a system or network.
  • where a module can be implemented in software, then, considering the level of existing hardware technology, a technician can also, cost aside, build a corresponding hardware circuit to realize the corresponding function.
  • the hardware circuits include conventional Very Large Scale Integration (VLSI) circuits or gate arrays and related semiconductors such as logic chips, transistors, or other discrete components.
  • the modules can also be implemented with programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices, and the like.
  • the multi-process monitoring method, device and service system provided by the present application make the detection items diversified, the detection process and result collection of the entire system are not blocked, and are easy to expand (the detection process can be flexibly added or deleted).
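The power-on sequence described in Embodiment 1 (call the stop unit first so the environment is clean, then the startup unit) can be sketched as follows. This is only an illustrative sketch: `EnvironmentConfig`, `power_on`, the field names, and the 2 s timeout are hypothetical and do not come from the patent; the four process names and the 3 s monitoring period follow the Linux example in the text.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentConfig:
    """Hypothetical stand-in for the environment configuration unit."""
    monitor_period_s: int = 3      # monitoring interval (3 s in the text's example)
    detect_timeout_s: int = 2      # assumed detection timeout threshold
    processes: list = field(default_factory=lambda: ["A", "B", "C", "D"])


def power_on(config, stop_unit, start_unit):
    """Stop every registered process first, then start every one of them."""
    for name in config.processes:
        stop_unit(name)            # clean up anything left from a previous run
    for name in config.processes:
        start_unit(name)           # start and initialize each process


# Record the calls instead of touching real processes.
call_log = []
power_on(
    EnvironmentConfig(),
    stop_unit=lambda name: call_log.append(("stop", name)),
    start_unit=lambda name: call_log.append(("start", name)),
)
print(call_log)
```

Passing the stop and start units in as callables keeps the sequence independent of how a given process is actually stopped or started.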

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

A multi-process monitoring method, apparatus and service system, wherein said multi-process monitoring method comprises: carrying out parallel detection on a plurality of processes to be detected that run within a system (11); if a detection operation is not completed within a pre-set time, said detection operation is stopped, and detection results of the processes to be detected corresponding to the detection operation are obtained (12). By means of carrying out parallel detection on a plurality of processes to be detected that run within a system and stopping a detection operation when the detection operation has not been completed within a pre-set time, subsequent operations continue to be executed, allowing for diversification of detection items and preventing detection processes and result-collecting of the entire system from being blocked, which is beneficial for expansion and allows for flexibility when adding or removing detected processes.

Description

Multi-process monitoring method, apparatus and service system
Technical field
The present application relates to the field of multi-process monitoring technologies, for example, to a multi-process monitoring method, apparatus, and service system.
Background
A typical system has multiple service processes. Common monitoring approaches use a resident process or a scripting language as the detection means, with each process detected independently, but they have the following shortcomings:
1. The detection items are limited to a single kind;
2. Extending the detection items (adding or removing them) is difficult (the detection system must be stopped, and secondary development is needed to add code);
3. The detection items are checked serially, so a check may block midway and a single detection pass may take too long.
Summary of the invention
The present disclosure provides a multi-process monitoring method, apparatus and service system, which address the problems of limited detection items, difficult extension, and a tendency to block midway in the multi-process monitoring schemes of the related art.
To solve the above technical problems, an embodiment of the present disclosure provides a multi-process monitoring method, including:
performing parallel detection on multiple processes to be detected running in the system; and
if a detection operation is not completed within a preset duration, stopping that detection operation and obtaining the detection result of the process to be detected corresponding to that detection operation.
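The two steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not the patent's implementation: each detection item is assumed to be an external command, the names `run_check`, `detect_all`, and `TIMEOUT_S` are hypothetical, and mapping a non-zero exit code to "abnormal" and a timeout to "failed" is an illustrative choice. It relies on `subprocess.run` killing the child process when its `timeout` expires, so a blocked check is actually stopped.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

TIMEOUT_S = 3.0  # hypothetical preset duration (detection timeout threshold)


def run_check(command, timeout_s=TIMEOUT_S):
    """Run one detection command; return 'ok', 'abnormal', or 'failed'."""
    try:
        done = subprocess.run(command, timeout=timeout_s,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child on timeout, so the check is stopped.
        return "failed"
    return "ok" if done.returncode == 0 else "abnormal"


def detect_all(checks, timeout_s=TIMEOUT_S):
    """Run every check in parallel and return {process_name: result}."""
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        futures = {name: pool.submit(run_check, cmd, timeout_s)
                   for name, cmd in checks.items()}
        return {name: fut.result() for name, fut in futures.items()}


# Hypothetical checks: A passes, B reports an error, C blocks past the timeout.
checks = {
    "A": [sys.executable, "-c", "raise SystemExit(0)"],
    "B": [sys.executable, "-c", "raise SystemExit(1)"],
    "C": [sys.executable, "-c", "import time; time.sleep(60)"],
}
results = detect_all(checks)
print(results)
```

Because the checks run concurrently, the whole pass takes roughly one timeout at most, rather than the sum of all check durations as in serial detection.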
Optionally, the step of performing parallel detection on multiple processes to be detected running in the system includes:
starting and initializing the system; and
after the system is initialized, performing parallel detection on the multiple processes to be detected running in the system.
Optionally, the step of obtaining the detection result of the process to be detected corresponding to the detection operation includes:
obtaining a detection result of detection abnormality or detection failure for the process to be detected corresponding to the detection operation.
Optionally, the method further includes:
acquiring a detection result and configuration information of a process to be detected that performs a detection operation; and
performing a corresponding processing operation on the process to be detected according to the detection result and the configuration information.
Optionally, the step of performing a corresponding processing operation on the process to be detected according to the detection result and the configuration information includes:
when the detection result is a detection abnormality or a detection failure, restarting the process to be detected corresponding to the detection result according to the configuration information, and detecting that process again after the restart is completed; or
when the detection result is a detection abnormality or a detection failure, restarting the entire system according to the configuration information.
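One way to picture how configuration information selects the processing operation is the sketch below. All names (`Action`, `ProcessConfig`, `plan_actions`) and the result strings are hypothetical, and the rule that a system restart supersedes per-process restarts is an assumption added for the example; the source only says that either the process or the entire system is restarted according to the configuration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RESTART_PROCESS = "restart_process"
    RESTART_SYSTEM = "restart_system"


@dataclass
class ProcessConfig:
    on_abnormal: Action   # operation when detection is abnormal
    on_failed: Action     # operation when detection fails


def plan_actions(results, config):
    """Map detection results to configured operations.

    Assumption: one system restart replaces all per-process restarts.
    """
    plan = []
    for name, result in results.items():
        if result == "ok":
            continue
        cfg = config[name]
        action = cfg.on_abnormal if result == "abnormal" else cfg.on_failed
        if action is Action.RESTART_SYSTEM:
            return [("system", Action.RESTART_SYSTEM)]
        plan.append((name, action))
    return plan


# Hypothetical configuration: B escalates to a system restart on failure.
config = {
    "A": ProcessConfig(Action.RESTART_PROCESS, Action.RESTART_PROCESS),
    "B": ProcessConfig(Action.RESTART_PROCESS, Action.RESTART_SYSTEM),
}
plan_one = plan_actions({"A": "abnormal", "B": "ok"}, config)
plan_two = plan_actions({"A": "abnormal", "B": "failed"}, config)
print(plan_one, plan_two)
```

Keeping separate `on_abnormal` and `on_failed` fields reflects the statement that the two operations may be the same or different.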
Optionally, the step of performing parallel detection on multiple processes to be detected running in the system includes:
performing parallel detection on the multiple processes to be detected running in the system in a first monitoring period;
the step of acquiring the detection result and the configuration information of the process to be detected that performs the detection operation includes:
acquiring, in a second monitoring period, the detection result and configuration information of the process to be detected that performs the detection operation;
the second monitoring period is the period following the first monitoring period.
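The split between launching detection in one monitoring period and collecting results in the next can be sketched as a `tick` method called once per period. This is a simplified sketch, not the patent's design: `Monitor`, `tick`, and `grace_s` are hypothetical names, the checks are assumed to be Python callables returning a boolean, and the short grace wait at collection time is an added assumption. Python threads cannot be killed, so an unfinished check is only reported as failed here; a subprocess-based check would be needed to actually stop it.

```python
from concurrent.futures import ThreadPoolExecutor, wait


class Monitor:
    def __init__(self, checks, grace_s=1.0):
        self.checks = checks      # {name: zero-argument callable returning bool}
        self.grace_s = grace_s    # assumed grace time when collecting results
        self.pool = ThreadPoolExecutor(max_workers=max(1, len(checks)))
        self.pending = {}         # futures launched in the previous period

    def tick(self):
        """Collect the previous period's results, then launch new checks."""
        previous = self._collect()
        self.pending = {name: self.pool.submit(fn)
                        for name, fn in self.checks.items()}
        return previous

    def _collect(self):
        wait(self.pending.values(), timeout=self.grace_s)
        results = {}
        for name, fut in self.pending.items():
            if not fut.done():
                fut.cancel()      # only prevents a queued task from starting
                results[name] = "failed"
            elif fut.exception() is not None or not fut.result():
                results[name] = "abnormal"
            else:
                results[name] = "ok"
        return results


monitor = Monitor({"A": lambda: True, "B": lambda: False})
first_period = monitor.tick()    # nothing to collect yet
second_period = monitor.tick()   # results of the checks launched in period one
monitor.pool.shutdown()
print(first_period, second_period)
```

The first call returns an empty result because no detection has been launched before it, which matches the idea that results always lag one monitoring period behind.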
Optionally, the step of performing parallel detection on multiple processes to be detected running in the system includes:
separately detecting each process to be detected according to the monitoring period and the detection period of each process to be detected;
the detection period is an integer multiple of the monitoring period.
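Because each detection period is an integer multiple of the monitoring period, deciding which processes are due in a given monitoring period reduces to a modulo test. The sketch below uses the 3 s monitoring period and the 3 s, 6 s, and 9 s detection periods given as an example in Embodiment 1; the function name `due_processes` and the tick counter are hypothetical.

```python
def due_processes(tick, monitor_period_s, detection_periods_s):
    """Return the processes whose detection is due at monitoring tick `tick`."""
    due = []
    for name, period_s in detection_periods_s.items():
        if period_s <= 0 or period_s % monitor_period_s != 0:
            raise ValueError(
                f"detection period of {name} must be a positive integer "
                f"multiple of the monitoring period")
        every_n_ticks = period_s // monitor_period_s
        if tick % every_n_ticks == 0:
            due.append(name)
    return due


# Monitoring period 3 s; detection periods 3 s, 6 s, and 9 s.
periods = {"A": 3, "B": 6, "C": 9}
schedule = [due_processes(t, 3, periods) for t in range(4)]
print(schedule)
```

With these values, process A is checked on every tick, B on every second tick, and C on every third tick.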
Optionally, the step of performing parallel detection on multiple processes to be detected running in the system includes:
acquiring detection information of the processes to be detected; and
performing parallel detection on the multiple processes to be detected running in the system according to the detection information.
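Reading the set of detected processes from detection information on each pass is what allows processes to be added or removed without changing the monitoring code. A minimal sketch, assuming the detection information is a mapping from a process identifier to a check function; `DetectionRegistry` and its methods are hypothetical names.

```python
class DetectionRegistry:
    """Hypothetical store of detection information, keyed by process identifier."""

    def __init__(self):
        self._items = {}

    def register(self, process_id, check):
        """Add or replace the detection item for a process."""
        self._items[process_id] = check

    def unregister(self, process_id):
        """Remove a process from detection; unknown identifiers are ignored."""
        self._items.pop(process_id, None)

    def snapshot(self):
        """Return a copy to iterate over during one detection pass."""
        return dict(self._items)


registry = DetectionRegistry()
registry.register("A", lambda: True)
registry.register("B", lambda: True)
registry.unregister("B")
registry.register("E", lambda: False)
current_ids = sorted(registry.snapshot())
print(current_ids)
```

Taking a snapshot per pass means a registration change made mid-pass takes effect on the next pass rather than disturbing the one in progress.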
The present disclosure also provides a multi-process monitoring apparatus, including:
a detection module configured to perform parallel detection on multiple processes to be detected running in the system; and
a processing module configured to, if a detection operation is not completed within a preset duration, stop that detection operation and obtain the detection result of the process to be detected corresponding to that detection operation.
The present disclosure also provides a multi-process service system, including the multi-process monitoring apparatus described above.
Embodiments of the present disclosure also provide a non-transitory computer readable storage medium storing computer executable instructions arranged to perform the above method.
An embodiment of the present disclosure further provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to cause the at least one processor to perform the method described above.
本公开的上述技术方案的有益效果如下:The beneficial effects of the above technical solutions of the present disclosure are as follows:
上述方案中,所述多进程监测方法通过对系统中运行的多个待检测进程进行并行检测,并在存在预设时长内未完成的检测操作时,停止该检测操作,继续执行后续操作,使得检测项目多样化、整个系统的检测过程和结果收集不会被阻塞、便于扩展(可以灵活的添加或删除被检测进程)。In the above solution, the multi-process monitoring method performs parallel detection on multiple processes to be detected running in the system, and stops the detection operation when there is an unfinished detection operation within a preset duration, and continues to perform subsequent operations, so that The detection project is diversified, the detection process and result collection of the entire system are not blocked, and it is easy to expand (the process can be added or deleted flexibly).
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a multi-process monitoring method according to Embodiment 1 of the present disclosure;
FIG. 2 is a schematic diagram of system startup according to Embodiment 1 of the present disclosure;
FIG. 3 is a schematic diagram of a detection architecture according to Embodiment 1 of the present disclosure;
FIG. 4 is a schematic structural diagram of a multi-process monitoring apparatus according to Embodiment 2 of the present disclosure; and
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
To make the technical problems to be solved, the technical solutions, and the advantages of the present disclosure clearer, a detailed description is given below with reference to the accompanying drawings and embodiments.
To address the problems of related-art multi-process monitoring schemes, namely limited detection items, difficulty of extension, and a tendency to block midway, the present disclosure provides several solutions, as follows:
Embodiment 1
As shown in FIG. 1, the multi-process monitoring method provided in Embodiment 1 of the present disclosure includes:
Step 11: Perform parallel detection on multiple processes to be detected running in the system;
Step 12: If there is a detection operation that has not completed within a preset duration, stop that detection operation and obtain a detection result for the corresponding process to be detected.
Here, the preset duration is the detection timeout threshold.
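By way of illustration only, steps 11 and 12 might be sketched as follows. The sketch assumes that each detection item can be written as a Python coroutine and uses asyncio concurrency as a stand-in for the parallel detection; the names run_check, detect_all, healthy and stalled, and the result strings, are hypothetical and do not come from the disclosure.

```python
import asyncio

async def run_check(name, check, timeout_s):
    """Run one detection coroutine; it is cancelled if it exceeds timeout_s."""
    try:
        ok = await asyncio.wait_for(check(), timeout=timeout_s)
        return name, "ok" if ok else "abnormal"
    except asyncio.TimeoutError:
        # wait_for has already cancelled the stalled check at this point,
        # so the result for this process is recorded as a detection failure.
        return name, "failed"

async def detect_all(checks, timeout_s):
    """Run every check concurrently and return one result per process."""
    pairs = await asyncio.gather(
        *(run_check(name, check, timeout_s) for name, check in checks.items())
    )
    return dict(pairs)

async def healthy():
    return True

async def stalled():
    await asyncio.sleep(10)  # simulates a detection operation that blocks
    return True

def demo():
    # Hypothetical processes "A" and "B"; 0.1 s plays the preset duration.
    return asyncio.run(detect_all({"A": healthy, "B": stalled}, timeout_s=0.1))
```

In this sketch the stalled check for B does not delay the result for A, and the whole round ends shortly after the timeout rather than after the full ten seconds.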
The multi-process monitoring method provided in Embodiment 1 of the present disclosure performs parallel detection on the multiple processes to be detected running in the system and, when a detection operation has not completed within the preset duration, stops that detection operation and continues with subsequent operations. As a result, detection items can be diversified, the detection process and result collection of the whole system are not blocked, and the scheme is easy to extend (detected processes can be flexibly added or removed).
The step of performing parallel detection on the multiple processes to be detected running in the system may include: starting and initializing the system; and, after the system is initialized, performing parallel detection on the multiple processes to be detected running in the system.
To ensure that the system does not start in an abnormal environment, the following measure may be taken: when the system is powered on, first call the module that provides the stop function, and then call the module that provides the start and initialization functions to start and initialize the system.
The step of obtaining a detection result for the process to be detected corresponding to the detection operation includes: obtaining a detection result indicating a detection abnormality or a detection failure for that process.
The multi-process monitoring method may further include: obtaining the detection result and configuration information of a process to be detected on which a detection operation has been performed; and performing a corresponding processing operation on that process according to the detection result and the configuration information.
The detection result can be handled in several ways, and the present disclosure provides two examples. The step of performing a corresponding processing operation on the process to be detected according to the detection result and the configuration information includes: when the detection result is a detection abnormality or a detection failure, restarting the process to be detected corresponding to that detection result according to the configuration information, and detecting that process again only after the restart has completed; or, when the detection result is a detection abnormality or a detection failure, restarting the entire system according to the configuration information.
The configuration information includes the processing operation to be performed when detection of a process fails or is abnormal. The processing operations for failure and for abnormality may be the same or different, which is not limited here.
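As a non-limiting sketch, the mapping from detection result and configuration information to a processing operation could look like the following. The Action enum, the CONFIG table and the result strings "ok", "abnormal" and "failed" are hypothetical names chosen for illustration; the disclosure does not fix a configuration format.

```python
from enum import Enum

class Action(Enum):
    NONE = "none"
    RESTART_PROCESS = "restart_process"
    RESTART_SYSTEM = "restart_system"

# Hypothetical configuration information: one action per process and per
# non-ok detection result. Failure and abnormality may map to the same
# action (process A) or to different actions (process B).
CONFIG = {
    "A": {"abnormal": Action.RESTART_PROCESS, "failed": Action.RESTART_PROCESS},
    "B": {"abnormal": Action.RESTART_PROCESS, "failed": Action.RESTART_SYSTEM},
}

def choose_action(process, result, config=CONFIG):
    """Return the processing operation configured for this detection result."""
    if result == "ok":
        return Action.NONE
    return config[process][result]
```

For example, choose_action("B", "failed") selects a restart of the entire system, while choose_action("B", "abnormal") restarts only process B.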
To prevent blocking, the step of performing parallel detection on the multiple processes to be detected running in the system includes: performing the parallel detection within a first monitoring period. The step of obtaining the detection result and configuration information of a process to be detected on which a detection operation has been performed includes: obtaining that detection result and configuration information within a second monitoring period, the second monitoring period being the period immediately following the first monitoring period.
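The two-period arrangement can be pictured with the following sketch, which assumes that detection items are plain Python callables run on a thread pool. The Monitor class and its tick method are hypothetical; timeout enforcement is deliberately left out here, so a check that is still running at collection time is simply recorded as failed.

```python
from concurrent.futures import ThreadPoolExecutor, wait
from concurrent.futures import TimeoutError as FutureTimeout

class Monitor:
    """Launch checks in one monitoring period, read the results in the next."""

    def __init__(self, checks):
        self.checks = checks    # process name -> callable returning True/False
        self.pool = ThreadPoolExecutor(max_workers=max(1, len(checks)))
        self.pending = {}       # futures launched in the previous period

    def tick(self):
        """Called once per monitoring period; never waits on a running check."""
        collected = {}
        for name, future in self.pending.items():
            try:
                collected[name] = "ok" if future.result(timeout=0) else "abnormal"
            except FutureTimeout:
                # Still running: stopping it is out of scope for this sketch.
                collected[name] = "failed"
        self.pending = {n: self.pool.submit(fn) for n, fn in self.checks.items()}
        return collected

def demo():
    monitor = Monitor({"A": lambda: True, "B": lambda: False})
    first = monitor.tick()            # first period: nothing to collect yet
    wait(monitor.pending.values())    # stand-in for the time between periods
    second = monitor.tick()           # second period: results of the first
    monitor.pool.shutdown(wait=True)
    return first, second
```

Because tick only reads results that are already available, the monitoring side returns immediately in every period, which is the property the paragraph above relies on to avoid blocking.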
To ensure normal operation of both the system and the monitoring, the step of performing parallel detection on the multiple processes to be detected running in the system includes: detecting each process to be detected separately according to the monitoring period and the detection period of that process, the detection period being an integer multiple of the monitoring period.
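One possible reading of the integer-multiple rule is that a process with a detection period of k monitoring periods is checked on every k-th tick of the monitor. The sketch below follows that assumption; the function due_processes and its parameters are hypothetical, and the 3 s, 6 s and 9 s values are borrowed from the application example given later in this description.

```python
def due_processes(tick, monitor_period_s, detection_periods_s):
    """Return the processes whose detection is due at this monitoring tick."""
    due = []
    for name, period_s in detection_periods_s.items():
        if period_s <= 0 or period_s % monitor_period_s != 0:
            raise ValueError(
                f"detection period of {name} must be a positive integer "
                f"multiple of the monitoring period"
            )
        every = period_s // monitor_period_s  # check on every 'every'-th tick
        if tick % every == 0:
            due.append(name)
    return due

# Monitoring period 3 s; detection periods 3 s, 6 s and 9 s.
PERIODS = {"A": 3, "B": 6, "C": 9}
```

With these values, A is due on every tick, B on every second tick and C on every third tick, so no detection ever falls between two monitoring ticks.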
To support flexibly adding or removing detected processes, the step of performing parallel detection on the multiple processes to be detected running in the system includes: obtaining detection information of the processes to be detected; and performing parallel detection on the multiple processes to be detected running in the system according to the detection information.
The detection information includes the identifier of the process to be detected.
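A minimal sketch of how detection information keyed by process identifier could make adding and removing detected processes a local change is given below. The DetectionInfo and Registry types are hypothetical, and the disclosure only states that the detection information includes the process identifier; the check field is an added assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class DetectionInfo:
    process_id: str              # identifier of the process to be detected
    check: Callable[[], bool]    # assumed: the detection item for that process

class Registry:
    """Holds the detection information the detector reads on each round."""

    def __init__(self) -> None:
        self._items: Dict[str, DetectionInfo] = {}

    def add(self, info: DetectionInfo) -> None:
        self._items[info.process_id] = info

    def remove(self, process_id: str) -> None:
        self._items.pop(process_id, None)

    def ids(self) -> List[str]:
        return sorted(self._items)
```

Since the detector only iterates over whatever the registry currently holds, registering or removing a process does not require touching the detection loop itself.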
The multi-process monitoring method provided in Embodiment 1 of the present disclosure is described below.
The solution provided by the embodiments of the present disclosure can be implemented with the following functional modules:
System startup unit: responsible for initializing the multi-process system environment and starting all processes. When a detected process needs to be started under certain abnormal conditions, the system startup unit performs the start operation on the registered detected process.
System monitoring unit: responsible for timed execution (triggering the configured functional units), for checking the detection result of each detected process, and for applying the corresponding policy (calling the corresponding functional unit). It can be implemented by creating a scheduled task in the system.
System stop unit: responsible for stopping all processes of the multi-process system. When a detected process needs to be stopped under certain abnormal conditions, the system stop unit performs the stop operation on the registered detected process.
System control unit: responsible for timeout control of the detected processes and for the detection of each process. Triggered by the system monitoring unit, the system control unit periodically performs a state detection operation on the registered detected processes.
Environment configuration unit: responsible for configuring the environment variables (process parameters, including the operation to perform after a detection failure), the detection timeout, the monitoring interval, and the submodules (processes) to be monitored.
The units are the functions that implement the individual sub-functions, including start, stop, and detection. Processes fall into two types: core processes (if detection is judged to have failed, the system is restarted) and general processes (if detection is judged to have failed, the process is restarted).
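For illustration, the items held by the environment configuration unit and the core/general distinction might be represented as below. The dataclass names, field names and the values in EXAMPLE are hypothetical; only the kinds of settings (timeout, interval, monitored processes, action on failure) are taken from the description above.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class MonitoredProcess:
    name: str
    core: bool                   # True: core process, False: general process

@dataclass(frozen=True)
class EnvironmentConfig:
    monitor_interval_s: int      # monitoring interval
    detection_timeout_s: float   # detection timeout
    processes: Tuple[MonitoredProcess, ...]

def action_on_failure(process: MonitoredProcess) -> str:
    """Core processes lead to a system restart, general ones to a process restart."""
    return "restart_system" if process.core else "restart_process"

# Hypothetical example values.
EXAMPLE = EnvironmentConfig(
    monitor_interval_s=3,
    detection_timeout_s=2.0,
    processes=(MonitoredProcess("A", core=True), MonitoredProcess("B", core=False)),
)
```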
The multi-process monitoring method provided in Embodiment 1 of the present disclosure can be summarized as two flows:
System power-on startup flow
As shown in FIG. 2, when the server is powered on, the system monitoring unit first calls the system stop unit to ensure a clean environment, and then calls the system startup unit.
System detection flow
As shown in FIG. 3, the system monitoring unit sends a command to the system control unit according to the configuration in the environment configuration unit, and the monitoring unit then returns. The system control unit concurrently calls the detection interface function of each process and enforces the detection timeout of each process separately. All detection finishes within the specified time; any detection that blocks (a detection abnormality) is killed, that is, the corresponding detection operation is stopped, and the detection can then be regarded as failed. When the next period arrives, the system monitoring unit performs the corresponding processing operation according to the results of the previous period's detection; if there is an error, it calls the process's stop and start interface functions (system stop unit).
As can be seen from the above, the solution provided in Embodiment 1 of the present disclosure can monitor multiple detected processes in the system in a unified manner and uses a parallel detection method, so that the detection process and result collection of the whole system are not blocked and detected processes can be flexibly added or removed.
An application example is as follows:
Suppose a Linux multi-process service system has four processes, A, B, C, and D, all of which need to be monitored.
First, the detection items of these four processes are added to the environment configuration unit. Each detection item is written as a function file and added to the environment configuration unit directly in modular form, so it is no longer necessary to stop detection or modify the related system code.
Then, when the system is powered on, the system monitoring unit first calls the system stop unit and then calls the system startup unit to start and initialize the system, ensuring that the system does not start in an abnormal environment.
While the system is running normally, the system monitoring unit creates an asynchronous detection task for each of the four detected processes within the first monitoring period (the new tasks are created by script, and the detection operations are performed by the system control unit). When the next monitoring period arrives, it collects the results of the previous period's detection and handles them accordingly (calling the corresponding functional unit). The detection period of a process is an integer multiple of the monitoring period; for example, if the monitoring period is 3 s, the detection period is 3 s, 6 s, 9 s, and so on.
For example, if detection finds a problem with detected process A, the problem is handled according to the configuration: either only process A is restarted, or the entire system is restarted, and so on.
If only process A is restarted, process A is not detected again during the detection period; detection resumes only after process A has finished restarting.
If the entire system needs to be restarted, the whole system is first stopped through the system stop unit and then started through the system startup unit (the system should be started only once the hardware environment, such as the hardware and the network card, meets the startup conditions).
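The whole-system restart just described can be sketched as a stop, a wait for the hardware environment, and a start. In the sketch below, restart_system and its callables stop_all, start_all and hardware_ready are hypothetical stand-ins for the system stop unit, the system startup unit and a readiness probe; the polling interval and upper wait bound are assumptions added for the example.

```python
import time

def restart_system(stop_all, start_all, hardware_ready,
                   poll_s=0.5, max_wait_s=30.0):
    """Stop everything, wait until the hardware is ready, then start again."""
    stop_all()
    deadline = time.monotonic() + max_wait_s
    while not hardware_ready():
        if time.monotonic() >= deadline:
            # Assumed behaviour: give up rather than start in a bad environment.
            raise TimeoutError("hardware environment not ready for startup")
        time.sleep(poll_s)
    start_all()
```

Passing the stop and start steps as callables keeps the ordering (stop first, start last) in one place, which is the same ordering the power-on flow uses.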
Embodiment 2
As shown in FIG. 4, the multi-process monitoring apparatus provided in Embodiment 2 of the present disclosure includes:
A detection module 41, configured to perform parallel detection on multiple processes to be detected running in the system;
A first processing module 42, configured to, if there is a detection operation that has not completed within a preset duration, stop that detection operation and obtain a detection result for the corresponding process to be detected.
Here, the preset duration is the detection timeout threshold.
The multi-process monitoring apparatus provided in Embodiment 2 of the present disclosure performs parallel detection on the multiple processes to be detected running in the system and, when a detection operation has not completed within the preset duration, stops that detection operation and continues with subsequent operations. As a result, detection items can be diversified, the detection process and result collection of the whole system are not blocked, and the scheme is easy to extend (detected processes can be flexibly added or removed).
The detection module may include: a first processing submodule, configured to start and initialize the system; and a first detection submodule, configured to perform parallel detection on the multiple processes to be detected running in the system after the system is initialized.
The first processing module includes: a second processing submodule, configured to obtain a detection result indicating a detection abnormality or a detection failure for the process to be detected corresponding to the detection operation.
The multi-process monitoring apparatus may further include: an obtaining module, configured to obtain the detection result and configuration information of a process to be detected on which a detection operation has been performed; and a second processing module, configured to perform a corresponding processing operation on that process according to the detection result and the configuration information.
The detection result can be handled in several ways, and the present disclosure provides two examples. The second processing module includes: a third processing submodule, configured to, when the detection result is a detection abnormality or a detection failure, restart the process to be detected corresponding to that detection result according to the configuration information, and detect that process again only after the restart has completed; or a restart submodule, configured to, when the detection result is a detection abnormality or a detection failure, restart the entire system according to the configuration information.
The configuration information includes the processing operation to be performed when detection of a process fails or is abnormal. The processing operations for failure and for abnormality may be the same or different, which is not limited here.
To prevent blocking, the detection module includes: a second detection submodule, configured to perform parallel detection on the multiple processes to be detected running in the system within a first monitoring period. The obtaining module includes: a first obtaining submodule, configured to obtain, within a second monitoring period, the detection result and configuration information of a process to be detected on which a detection operation has been performed; the second monitoring period is the period immediately following the first monitoring period.
To ensure normal operation of both the system and the monitoring, the detection module includes: a third detection submodule, configured to detect each process to be detected separately according to the monitoring period and the detection period of that process, the detection period being an integer multiple of the monitoring period.
To support flexibly adding or removing detected processes, the detection module includes: a second obtaining submodule, configured to obtain detection information of the processes to be detected; and a fourth detection submodule, configured to perform parallel detection on the multiple processes to be detected running in the system according to the detection information.
The detection information includes the identifier of the process to be detected.
All implementation embodiments of the multi-process monitoring method described above are applicable to the embodiments of the multi-process monitoring apparatus and achieve the same technical effects.
To solve the above technical problems, an embodiment of the present disclosure further provides a multi-process service system, including the multi-process monitoring apparatus described above.
All implementation embodiments of the multi-process monitoring apparatus described above are applicable to the embodiments of the multi-process service system and achieve the same technical effects.
Embodiments of the present disclosure also provide a non-transitory computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being configured to perform the method of any of the above embodiments.
An embodiment of the present disclosure further provides a schematic structural diagram of an electronic device. Referring to FIG. 5, the electronic device includes:
At least one processor 50 (one processor 50 is taken as an example in FIG. 5) and a memory 51; the electronic device may further include a communications interface 52 and a bus 53. The processor 50, the communications interface 52, and the memory 51 can communicate with one another through the bus 53. The communications interface 52 can be used for information transmission. The processor 50 can invoke logic instructions in the memory 51 to perform the methods of the above embodiments.
In addition, when the logic instructions in the memory 51 are implemented in the form of software functional units and sold or used as a standalone product, they may be stored in a computer-readable storage medium.
The memory 51, as a computer-readable storage medium, can be used to store software programs and computer-executable programs, such as the program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 50 runs the software programs, instructions, and modules stored in the memory 51 to execute functional applications and data processing, that is, to implement the multi-process monitoring method of the above method embodiments.
The memory 51 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application required by at least one function, and the data storage area may store data created through use of the terminal device, and the like. In addition, the memory 51 may include high-speed random access memory and may also include non-volatile memory.
The technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present disclosure. The storage medium may be a non-transitory storage medium, including any of various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc; it may also be a transitory storage medium.
It should be noted that many of the functional components described in this specification are referred to as modules/submodules in order to emphasize more particularly the independence of their implementation.
In the embodiments of the present disclosure, a module/submodule may be implemented in software so as to be executed by various types of processors. For example, an identified module of executable code may include one or more physical or logical blocks of computer instructions, which may, for example, be organized as an object, a procedure, or a function. Nevertheless, the executable code of an identified module need not be physically located together; it may include different instructions stored in different locations which, when logically joined together, constitute the module and achieve the function of the module.
Indeed, a module of executable code may be a single instruction or many instructions, and may even be distributed over several different code segments, among different programs, and across multiple memory devices. Similarly, operational data may be identified within modules, may be implemented in any suitable form, and may be organized within any suitable type of data structure. The operational data may be collected as a single data set or may be distributed over different locations (including over different storage devices), and may exist, at least in part, merely as electronic signals on a system or network.
Where a module can be implemented in software, then, given the level of existing hardware technology and leaving cost aside, a person skilled in the art could also build a corresponding hardware circuit to achieve the corresponding function. Such hardware circuits include conventional very-large-scale integration (VLSI) circuits or gate arrays, as well as existing semiconductors such as logic chips and transistors, or other discrete components. A module may also be implemented with programmable hardware devices such as field-programmable gate arrays, programmable array logic, programmable logic devices, and the like.
The foregoing describes embodiments of the present disclosure. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the scope of the embodiments of the present disclosure, and such improvements and refinements shall also be regarded as falling within the scope of protection of the present disclosure.
Industrial Applicability
The multi-process monitoring method, apparatus, and service system provided by the present application allow detection items to be diversified, keep the detection process and result collection of the whole system from being blocked, and are easy to extend (detected processes can be flexibly added or removed).

Claims (11)

  1. A multi-process monitoring method, comprising:
    performing parallel detection on multiple processes to be detected running in a system;
    if there is a detection operation that has not completed within a preset duration, stopping that detection operation and obtaining a detection result for the process to be detected corresponding to that detection operation.
  2. The method of claim 1, wherein the step of performing parallel detection on the multiple processes to be detected running in the system comprises:
    starting and initializing the system;
    after the system is initialized, performing parallel detection on the multiple processes to be detected running in the system.
  3. The method of claim 1, wherein the step of obtaining a detection result for the process to be detected corresponding to the detection operation comprises:
    obtaining a detection result indicating a detection abnormality or a detection failure for the process to be detected corresponding to the detection operation.
  4. The method of claim 1, further comprising:
    obtaining the detection result and configuration information of a process to be detected on which a detection operation has been performed;
    performing a corresponding processing operation on the process to be detected according to the detection result and the configuration information.
  5. The method of claim 4, wherein the step of performing a corresponding processing operation on the process to be detected according to the detection result and the configuration information comprises:
    when the detection result is a detection abnormality or a detection failure, restarting the process to be detected corresponding to that detection result according to the configuration information, and detecting that process again after the restart has completed; or
    when the detection result is a detection abnormality or a detection failure, restarting the entire system according to the configuration information.
  6. The method of claim 4, wherein the step of performing parallel detection on the multiple processes to be detected running in the system comprises:
    performing parallel detection on the multiple processes to be detected running in the system within a first monitoring period;
    the step of obtaining the detection result and configuration information of a process to be detected on which a detection operation has been performed comprises:
    obtaining, within a second monitoring period, the detection result and configuration information of the process to be detected on which the detection operation has been performed;
    the second monitoring period being the period immediately following the first monitoring period.
  7. The method of claim 1, wherein the step of performing parallel detection on the multiple processes to be detected running in the system comprises:
    detecting each process to be detected separately according to a monitoring period and a detection period of that process;
    the detection period being an integer multiple of the monitoring period.
  8. The method of claim 1, wherein the step of performing parallel detection on the multiple processes to be detected running in the system comprises:
    obtaining detection information of the processes to be detected;
    performing parallel detection on the multiple processes to be detected running in the system according to the detection information.
  9. A multi-process monitoring apparatus, comprising:
    a detection module configured to perform parallel detection on a plurality of processes to be detected running in a system;
    a processing module configured to, if there is a detection operation that has not completed within a preset duration, stop that detection operation and obtain the detection result of the process to be detected that corresponds to that detection operation.
  10. A multi-process service system, comprising the multi-process monitoring apparatus of claim 9.
  11. A non-transitory computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being configured to perform the method of any one of claims 1 to 8.
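
For illustration only, the behaviour recited in the claims above (parallel detection, stopping a detection operation that exceeds a preset duration, a per-process detection period that is an integer multiple of the monitoring period, and a configured restart action) might be sketched as follows. This is a minimal sketch and not the applicant's implementation. The names `Target`, `run_check`, `due_targets`, `monitor_tick` and `plan_actions` are hypothetical, as are the assumptions that each detection operation is an external command whose exit code 0 means "normal" and that a timed-out detection is reported as a detection failure.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical configuration information for one process to be detected."""
    name: str
    check_cmd: list                        # assumed: exit code 0 means "normal"
    period_ticks: int = 1                  # detection period = period_ticks * monitoring period
    on_failure: str = "restart_process"    # or "restart_system"


def run_check(target, timeout_s):
    """Run one detection operation and stop it if it exceeds timeout_s."""
    try:
        done = subprocess.run(target.check_cmd, timeout=timeout_s,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so the hung
        # detection operation cannot block later monitoring periods.
        return "failed"                    # assumption: timeout counts as detection failure
    return "normal" if done.returncode == 0 else "abnormal"


def due_targets(targets, tick):
    """Select the processes whose detection period falls on this monitoring tick."""
    return [t for t in targets if tick % t.period_ticks == 0]


def monitor_tick(targets, tick, timeout_s=1.0):
    """Detect all due processes in parallel and return {name: detection result}."""
    due = due_targets(targets, tick)
    if not due:
        return {}
    with ThreadPoolExecutor(max_workers=len(due)) as pool:
        results = list(pool.map(lambda t: run_check(t, timeout_s), due))
    return {t.name: r for t, r in zip(due, results)}


def plan_actions(targets, results):
    """Map each abnormal or failed result to the action named in its configuration."""
    by_name = {t.name: t for t in targets}
    return {name: by_name[name].on_failure
            for name, result in results.items() if result != "normal"}
```

Because every detection operation runs in its own worker and is bounded by the same timeout, one monitoring tick takes at most roughly `timeout_s` regardless of how many checks hang. The sketch only returns the planned action as a string; actually restarting a process or the system is left out.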
PCT/CN2017/087185 2016-06-30 2017-06-05 Multi-process monitoring method, apparatus and service system WO2018001048A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610513899.6A CN107562597A (en) 2016-06-30 2016-06-30 A kind of multi-process monitoring method, device and service system
CN201610513899.6 2016-06-30

Publications (1)

Publication Number Publication Date
WO2018001048A1 (en) 2018-01-04

Family

ID=60785755

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/087185 WO2018001048A1 (en) 2016-06-30 2017-06-05 Multi-process monitoring method, apparatus and service system

Country Status (2)

Country Link
CN (1) CN107562597A (en)
WO (1) WO2018001048A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020184295A1 (en) * 2001-05-24 2002-12-05 Ibm Corporation Method for mutual computer process monitoring and restart
CN101727390A (en) * 2009-12-28 2010-06-09 金蝶软件(中国)有限公司 Method and device for debugging performance test scripts
CN102591765A (en) * 2011-12-31 2012-07-18 珠海市君天电子科技有限公司 Progress automatic management system
CN104156299A (en) * 2014-08-21 2014-11-19 江苏惠居乐信息科技有限公司 Monitoring method for parallel systems
CN104899006A (en) * 2015-05-25 2015-09-09 山东中孚信息产业股份有限公司 Multiprocess parallel processing method for multisystem platform

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103793288B (en) * 2014-02-14 2017-07-18 北京邮电大学 A kind of software watchdog system and method
CN104199772A (en) * 2014-09-02 2014-12-10 浪潮(北京)电子信息产业有限公司 Progress supervising method and device
CN105302637B (en) * 2015-10-13 2019-04-23 Oppo广东移动通信有限公司 System process is operating abnormally restoration methods, device and the mobile terminal for causing Caton

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112631868A (en) * 2020-12-28 2021-04-09 浙江中控技术股份有限公司 Performance monitoring method and device of CentOS system
CN112631868B (en) * 2020-12-28 2023-06-16 浙江中控技术股份有限公司 Performance monitoring method and device of CentOS system
CN113447059A (en) * 2021-06-03 2021-09-28 北京百度网讯科技有限公司 Detection method and device for sensor of automatic driving automobile and electronic equipment
CN113447059B (en) * 2021-06-03 2022-11-29 北京百度网讯科技有限公司 Detection method and device for sensor of automatic driving automobile and electronic equipment
CN113779570A (en) * 2021-09-18 2021-12-10 深信服科技股份有限公司 Process monitoring method and device, electronic equipment and readable storage medium
CN113779570B (en) * 2021-09-18 2024-02-23 深信服科技股份有限公司 Security protection method and device based on core process and electronic equipment
CN115292140A (en) * 2022-09-01 2022-11-04 摩尔线程智能科技(北京)有限责任公司 Method, apparatus and computer readable medium for monitoring system startup
CN115292140B (en) * 2022-09-01 2023-08-15 摩尔线程智能科技(北京)有限责任公司 Method, apparatus and computer readable medium for monitoring system start-up
CN116185772A (en) * 2023-02-10 2023-05-30 安芯网盾(北京)科技有限公司 File batch detection method and device
CN116185772B (en) * 2023-02-10 2023-09-19 安芯网盾(北京)科技有限公司 File batch detection method and device

Also Published As

Publication number Publication date
CN107562597A (en) 2018-01-09

Similar Documents

Publication Publication Date Title
WO2018001048A1 (en) Multi-process monitoring method, apparatus and service system
CN110928743B (en) Computing system, automatic diagnosis method and medium storing instructions thereof
CN104320308B (en) A kind of method and device of server exception detection
WO2018095414A1 (en) Method and apparatus for detecting and recovering fault of virtual machine
US10146626B2 (en) Detecting and handling an expansion card fault during system initialization
CN107111595B (en) Method, device and system for detecting early boot errors
CN102761439A (en) Device and method for detecting and recording abnormity on basis of watchdog in PON (Passive Optical Network) access system
TWI668567B (en) Server and method for restoring a baseboard management controller automatically
US11055416B2 (en) Detecting vulnerabilities in applications during execution
US10169137B2 (en) Dynamically detecting and interrupting excessive execution time
CN110618853B (en) Detection method, device and equipment for zombie container
US10838815B2 (en) Fault tolerant and diagnostic boot
CN101989936B (en) Test method and system of single plate fault
JP2016181055A (en) Information processing apparatus
US20180336086A1 (en) System state information monitoring
CN107179911B (en) Method and equipment for restarting management engine
CN107133130B (en) Computer operation monitoring method and device
CN115904793A (en) Memory unloading method, system and chip based on multi-core heterogeneous system
CN104572332B (en) The method and apparatus of processing system collapse
CN110399258B (en) Stability testing method, system and device for server system
CN108255667B (en) Service monitoring method and device and electronic equipment
CN111522677A (en) Method and device for recording start information based on embedded system
KR102695389B1 (en) Systems, methods, and apparatus for crash recovery in storage devices
US10108499B2 (en) Information processing device with watchdog timer
US20240241779A1 (en) Signaling host kernel crashes to dpu

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17819045; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 17819045; Country of ref document: EP; Kind code of ref document: A1)