CN116048847A - Health early warning method and system for real-time container - Google Patents

Health early warning method and system for real-time container

Publication number
CN116048847A
Authority
CN
China
Prior art keywords
container
health
real
warning system
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211613911.2A
Other languages
Chinese (zh)
Inventor
朱晓宁
周霆
任晓瑞
郝继锋
尹超
黄凡帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Aeronautics Computing Technique Research Institute of AVIC
Original Assignee
Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Aeronautics Computing Technique Research Institute of AVIC
Priority to CN202211613911.2A
Publication of CN116048847A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/006 Identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/815 Virtual
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/865 Monitoring of software

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention relates to the technical field of computer software, and provides a health early warning method and system for a real-time container, wherein the health early warning method and system comprise a historical monitoring information acquisition system, a real-time container monitoring system and a container health early warning system.

Description

Health early warning method and system for real-time container
Technical Field
The invention relates to the technical field of computer software, in particular to a health early warning method and system for a real-time container.
Background
In recent years, cloud computing systems in the general IT field have continued to evolve toward cloud-native technology, typified by container technology. Container technology significantly accelerates the development, deployment and operation of modern software and improves its availability. A container is a unified, lightweight virtualization technology at the operating-system level that achieves view isolation. Mission-critical fields with strong real-time and high determinism requirements, such as aviation, aerospace, industrial control and medical equipment, often need a real-time operating system. To extend the benefits of cloud-native technology to real-time systems, real-time containers built on a real-time operating system have emerged.
For the mission-critical fields described above, a core requirement is to ensure high availability of the system. Because all tasks run inside containers once real-time container technology is adopted, ensuring fault-free operation of the containers is a core problem. Existing container products generally monitor the running state of a container with techniques such as probes, but a probe can only detect whether the container has already failed; it cannot predict a possible failure of the container in advance.
Disclosure of Invention
In view of this, an embodiment of the present invention provides a health warning system for a real-time container, so as to solve the technical problem of operational failure of the container in the prior art, where the system includes:
the container history monitoring information acquisition system comprises a first collector, a first memory and a first manager, and is connected with the container health early warning system;
the real-time container monitoring system comprises a second collector, a second memory and a second manager;
the optimized container health early warning system comprises a task-level health early warning system, a container-level health early warning system and a module-level health early warning system, and the optimized container health early warning system is connected with the real-time container monitoring system.
Further, the real-time container monitoring system further comprises an interface layer, and the interface layer provides an access interface for the optimized container health early warning system.
Further, the optimized container health early warning system is obtained from the container health early warning system through a model quantization method.
Furthermore, the task-level health early-warning system, the container-level health early-warning system and the module-level health early-warning system are respectively realized through the management of the health monitoring table.
Furthermore, the container history monitoring information acquisition system, the real-time container monitoring system and the optimized container health early warning system are all run and controlled by a core operating system. When a container operation error occurs, the core operating system calls the event injection service and submits an event to the health monitoring table; the core operating system then looks up the event dispatch level in the health monitoring table according to the state of the event and dispatches the event to the task-level health early warning system, the container-level health early warning system or the module-level health early warning system according to that event dispatch level. The core operating system is real-time and deterministic.
Further, the processing of the event by the task-level health early warning system includes cold starting and hot starting the container.
Further, the container-level health early warning system obtains dispatch information of the event by searching the health monitoring table, wherein the dispatch information comprises a system state and an error processing program.
Further, the processing operation of the event by the container-level health-warning system includes stopping and/or restarting the container.
In addition, the invention also provides a health early-warning method for the real-time container, which is applied to a health early-warning system for the real-time container, and comprises the following steps:
collecting application data information running in a real-time container, and storing the application data information;
according to an artificial intelligence algorithm, the application data information is used as artificial intelligence training data, an artificial intelligence model for predicting the failure time of the container is obtained through training, and an initial container health early warning system is built by using the artificial intelligence model;
optimizing and pruning the artificial intelligence model to obtain an optimized artificial intelligence model for predicting the failure time of the container, and constructing an optimized container health early warning system by using the optimized model;
collecting second application data information running in a real-time container, and providing the second application data information to the optimized container health early warning system in real time;
and the container health early warning system analyzes the failure probability of the container according to the second application data information, obtains early warning information and provides the early warning information for the management system.
Compared with the prior art, the beneficial effects achievable by at least one of the technical solutions adopted in the embodiments of this specification include at least the following: the invention provides a health early warning system for a real-time container, comprising a historical monitoring information acquisition system, a real-time container monitoring system and a container health early warning system. A container fault early warning model is designed from historical data of system operation by an artificial intelligence method, and a container that is about to fail can be acted upon in advance, so that the management system can adopt a suitable strategy to ensure reliable operation of the whole system, thereby improving the safety of the system.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system for collecting historical monitoring information of a container according to an embodiment of the present invention;
FIG. 2 is a diagram of a real-time container monitoring system architecture provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a container health warning system according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of a real-time container health warning method according to an embodiment of the present invention.
Detailed Description
Embodiments of the present application are described in detail below with reference to the accompanying drawings.
Other advantages and effects of the present application will become apparent to those skilled in the art from the present disclosure, when the following description of the embodiments is taken in conjunction with the accompanying drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. The present application may be embodied or carried out in other specific embodiments, and the details of the present application may be modified or changed from various points of view and applications without departing from the spirit of the present application. It should be noted that the following embodiments and features in the embodiments may be combined with each other without conflict. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
In the present invention, there is provided a health warning system for a real-time container, comprising: the container history monitoring information acquisition system comprises a first collector, a first memory and a first manager, and is connected with the container health early warning system; the real-time container monitoring system comprises a second collector, a second memory and a second manager; the optimized container health early warning system comprises a task-level health early warning system, a container-level health early warning system and a module-level health early warning system, and the optimized container health early warning system is connected with the real-time container monitoring system.
Further, the real-time container monitoring system further comprises an interface layer, and the interface layer provides an access interface for the optimized container health early warning system.
Further, the optimized container health early warning system is obtained from the container health early warning system through a model quantization method.
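The text does not say which quantization scheme is applied to the model. As one common illustration only, the sketch below shows symmetric 8-bit quantization of a list of model weights in Python. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map float weights to the range [-127, 127] with one symmetric scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Fall back to a scale of 1.0 when every weight is zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]


# Illustrative weights only; a real model would supply its own parameters.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Storing the weights as small integers plus a single scale reduces memory use and allows integer arithmetic, which is one reason quantization is attractive on resource-constrained real-time targets. The price is a bounded rounding error of at most half a scale step per weight.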
Specifically, the invention provides a container health monitoring and early warning system applied to a real-time container, which comprises a container history monitoring information acquisition system, a real-time container monitoring system and a container health early warning system. The history monitoring information acquisition system is used for acquiring data generated during the historical operation of the system and comprises a collector, a memory and a manager. Its architecture is shown in FIG. 1: the manager is the master control of the whole monitoring system and can start periodic tasks, acquire the container list, automatically discover containers and so on; the main function of the collector is to collect container indicators, which are typically provided by the real-time operating system through an interface; the memory is responsible for storing the historical data of the monitoring system. The real-time container monitoring system is used for monitoring the running state of the container system in real time and likewise comprises a collector, a memory and a manager. Its architecture is shown in FIG. 2: the manager is the master control of the whole monitoring system and can start periodic tasks, acquire the container list, automatically discover containers and so on; the main function of the collector is to collect container indicators, which are typically provided by the real-time operating system through an interface; the memory is responsible for storing the data of the monitoring system; the API layer (interface layer) is responsible for providing an external access interface and may provide node information, container operation information, event information and the like.
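The relationship between the manager, the collector, the memory and the interface layer can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, the `run_period` and `latest` methods, the metric keys and the `fake_read` function standing in for the real-time operating system interface are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical metric record; the patent does not list the exact indicators.
Metrics = Dict[str, float]


@dataclass
class Store:
    """Keeps collected samples per container (the 'memory' component)."""
    samples: Dict[str, List[Metrics]] = field(default_factory=dict)

    def append(self, container: str, metrics: Metrics) -> None:
        self.samples.setdefault(container, []).append(metrics)


@dataclass
class Collector:
    """Reads container indicators through an interface supplied by the OS."""
    read_metrics: Callable[[str], Metrics]

    def collect(self, container: str) -> Metrics:
        return self.read_metrics(container)


@dataclass
class Manager:
    """Master control: discovers containers and runs the periodic task."""
    list_containers: Callable[[], List[str]]
    collector: Collector
    store: Store

    def run_period(self) -> None:
        # The container list is re-read every period, so containers that
        # appear later are discovered automatically.
        for name in self.list_containers():
            self.store.append(name, self.collector.collect(name))

    def latest(self, container: str) -> Metrics:
        # Interface-layer style accessor for the early warning system.
        return self.store.samples[container][-1]


def fake_read(name: str) -> Metrics:
    # Stand-in for the real-time operating system interface (assumption).
    return {"cpu": 0.4, "mem": 0.7}


manager = Manager(lambda: ["c1", "c2"], Collector(fake_read), Store())
manager.run_period()
```

Passing the operating system access in as callables keeps the sketch testable without a real-time operating system; the same structure would serve both the history acquisition system and the real-time monitoring system, with only the latter exposing the interface-layer accessor.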
Further, the container health early warning system comprises a three-level health monitoring system, namely a task level, a container level and a module level. The overall architecture of the container health early warning system is shown in fig. 3, wherein the health management application monitors and processes faults of the system according to a three-level health monitoring system; the health warning application receives the container performance monitoring data and accordingly makes container fault warnings. The health management application and the health early warning application are in unified docking with the management system, and the management system is used for arranging and scheduling the containers.
Furthermore, the task-level health early warning system, the container-level health early warning system and the module-level health early warning system are each realized through the management of health monitoring tables, which comprise a health monitoring system table, a module table and a container table. Defining an event requires defining both an error code and an error level. When a running error occurs in an application program, the event is submitted to the health monitoring table by calling the event injection service, and the core operating system looks up the event dispatch level in the health monitoring table according to the event state, so that the event is dispatched to the appropriate container health early warning system for processing. Task-level health monitoring runs as a high-priority task within a container, and health events dispatched to the task level are sent to this task for processing. Task-level health monitoring finds the handler corresponding to the health event code; the handling includes closing or suspending the task, releasing semaphores, and cold starting, hot starting or closing the container.
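The table-driven dispatch described above can be sketched as a lookup keyed by system state and error code. The table contents, the `Level` enumeration, the `inject_event` function and the fallback to module level for unknown events are assumptions made for illustration; the patent does not give the table layout.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Level(Enum):
    TASK = "task"
    CONTAINER = "container"
    MODULE = "module"


# Hypothetical health monitoring table: (system state, error code) -> level.
HEALTH_MONITORING_TABLE: Dict[Tuple[str, int], Level] = {
    ("running", 1): Level.TASK,
    ("running", 2): Level.CONTAINER,
    ("init", 2): Level.MODULE,
}

# One queue per level stands in for the three early warning subsystems.
queues: Dict[Level, List[Tuple[str, int]]] = {level: [] for level in Level}


def inject_event(state: str, error_code: int) -> Level:
    """Event injection service: look up the dispatch level and enqueue."""
    # Unknown events escalate to module level in this sketch (an assumption).
    level = HEALTH_MONITORING_TABLE.get((state, error_code), Level.MODULE)
    queues[level].append((state, error_code))
    return level
```

Because the same error code can map to different levels depending on the system state, the key is the pair rather than the code alone. For example, `inject_event("running", 1)` returns `Level.TASK` with the table shown.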
Further, container-level health monitoring looks up the container health monitoring table to obtain the handling for events dispatched to the container-level health early warning system, including the system state and the error handling procedure (error recovery action). Error handling at the container level includes stopping the container, restarting the container and the like. Module-level health monitoring looks up the module health monitoring table to obtain the handling policy for health events dispatched to the module level. The error handler entries at this level all run in the core operating system, so redevelopment of the handlers is constrained by the operating system. Error handling at the module level includes soft reset of the CPU, shutdown and the like.
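As a rough illustration of container-level handling, the sketch below maps error codes to recovery actions and reports when no action is defined, so that a caller could escalate to the module level. The `Container` class, the error codes 10 and 11 and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Container:
    name: str
    state: str = "running"
    restarts: int = 0


def stop(c: Container) -> None:
    c.state = "stopped"


def restart(c: Container) -> None:
    c.restarts += 1
    c.state = "running"


# Hypothetical container health monitoring table: error code -> recovery action.
CONTAINER_TABLE: Dict[int, Callable[[Container], None]] = {
    10: stop,
    11: restart,
}


def handle_container_event(c: Container, error_code: int) -> bool:
    """Run the recovery action for error_code; return False if none exists."""
    action = CONTAINER_TABLE.get(error_code)
    if action is None:
        return False  # the caller may escalate to module-level handling
    action(c)
    return True
```

Keeping the actions in a table rather than in branching code mirrors the table-managed design in the text and lets the recovery policy change without touching the dispatcher.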
Based on the same inventive concept, an embodiment of the invention also provides a health early warning method for a real-time container, as described in the following embodiment. Because the principle by which the health early warning method for a real-time container solves the problem is similar to that of the health early warning system for a real-time container, the implementation of the method may refer to the implementation of the system, and repeated description is omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. While the system described in the following embodiments is preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
FIG. 4 is a schematic flow chart of a health warning method for a real-time container according to an embodiment of the present invention, and the specific steps include:
1) An application running in a real-time container is run continuously for a long time; information such as system state and system time is collected during operation, together with the state and time of any faults that occur, and all collected data are stored in a nonvolatile memory;
2) Selecting a suitable artificial intelligence algorithm according to the performance and characteristics of the system, taking the data acquired in step 1) as artificial intelligence training data, training to obtain an artificial intelligence model for predicting the failure time of the container, and constructing an initial container health early warning system by using the model;
3) Optimizing and pruning the artificial intelligence model generated in step 2) according to the performance and characteristics of the system to form an optimized artificial intelligence model for predicting the failure time of the container, and constructing an optimized container health early warning system by using the model;
4) Collecting information such as system state and system time of an application running in a real-time container, and providing all collected data in real time to the container health early warning system constructed in step 3);
5) The container health early warning system analyzes the data generated in step 4) to estimate the probability that the container will fail within a future period, and provides early warning information to the management system according to characteristics such as the failure time and the importance of the application in the container.
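Step 5) can be sketched as scoring the latest metrics and turning the score into a warning. In the sketch below a logistic function stands in for the trained model, which the patent does not specify; the weights, the 0.5 threshold, the 0.8 importance cut-off and all names are illustrative assumptions.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EarlyWarning:
    container: str
    probability: float
    severity: str


def failure_probability(metrics: Dict[str, float],
                        weights: Dict[str, float],
                        bias: float) -> float:
    """Stand-in for the trained model: logistic score over collected metrics."""
    z = bias + sum(w * metrics.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


def make_warning(container: str,
                 metrics: Dict[str, float],
                 weights: Dict[str, float],
                 bias: float,
                 importance: float,
                 threshold: float = 0.5) -> Optional[EarlyWarning]:
    """Return a warning for the management system, or None below threshold."""
    p = failure_probability(metrics, weights, bias)
    if p < threshold:
        return None
    # The importance of the application decides how urgent the warning is.
    severity = "critical" if importance >= 0.8 else "advisory"
    return EarlyWarning(container, p, severity)


# Illustrative parameters only; real values would come from steps 2) and 3).
weights = {"mem": 4.0}
warning = make_warning("c1", {"mem": 0.9}, weights, bias=-2.0, importance=0.9)
quiet = make_warning("c1", {"mem": 0.1}, weights, bias=-2.0, importance=0.9)
```

A management system receiving a `critical` warning could, for instance, start a standby container before the predicted failure, which is the kind of pre-emptive measure the description mentions.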
The embodiment of the invention realizes the following technical effects:
the invention designs a health early warning method and a system for a real-time container, which utilize historical data of system operation to design a container fault early warning model by an artificial intelligence method, can monitor the operation state of the real-time container and early warn the container which is likely to be faulty so that a management system can take processing measures in advance, such as starting a standby container, thereby ensuring safe operation of application and greatly improving the reliability and safety of the system.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations can be made to the embodiments of the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A health warning system for a real-time container, comprising:
the container history monitoring information acquisition system comprises a first collector, a first memory and a first manager, and is connected with the container health early warning system;
the real-time container monitoring system comprises a second collector, a second memory and a second manager;
the optimized container health early warning system comprises a task-level health early warning system, a container-level health early warning system and a module-level health early warning system, and the optimized container health early warning system is connected with the real-time container monitoring system.
2. The health warning system for a real-time container of claim 1, further comprising an interface layer that provides an access interface to the optimized container health warning system.
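3. The health warning system for a real-time container according to claim 1, wherein the optimized container health early warning system is obtained from the container health early warning system by a model quantization method.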
4. The health warning system for a real-time container according to claim 1, wherein the task level health warning system, the container level health warning system and the module level health warning system are implemented by management of health monitoring tables, respectively.
5. The health warning system for a real-time container according to claim 4, wherein the container history monitoring information acquisition system, the real-time container monitoring system and the optimized container health warning system are all controlled by a core operating system, the core operating system calls an event injection service according to a container operation error and submits an event to the health monitoring table, the core operating system looks up an event dispatch level in the health monitoring table according to a state of the event, and dispatches the event to a task-level health warning system, a container-level health warning system or a module-level health warning system according to the event dispatch level, and the core operating system is real-time and deterministic.
6. The health warning system for a real-time container according to claim 5, wherein the processing of the event by the task-level health warning system comprises cold starting and hot starting the container.
7. The health warning system for a real-time container of claim 5, wherein said container level health warning system obtains dispatch information for said event by looking up said health monitoring table, said dispatch information including system status, error handling procedures.
8. A health warning system for a real time container as in claim 7, wherein the handling of the event by the container level health warning system comprises stopping and/or restarting the container.
9. A health warning method for a real-time container, the health warning method for a real-time container being applied to a health warning system for a real-time container according to any one of claims 1 to 8, the method comprising:
collecting application data information running in a real-time container, and storing the application data information;
according to an artificial intelligence algorithm, the application data information is used as artificial intelligence training data, an artificial intelligence model for predicting the failure time of the container is obtained through training, and an initial container health early warning system is built by using the artificial intelligence model;
optimizing and pruning the artificial intelligence model to obtain an optimized artificial intelligence model for predicting the failure time of the container, and constructing an optimized container health early warning system by using the optimized model;
collecting second application data information running in a real-time container, and providing the second application data information to the optimized container health early warning system in real time;
and the container health early warning system analyzes the failure probability of the container according to the second application data information, obtains early warning information and provides the early warning information for the management system.
CN202211613911.2A 2022-12-15 2022-12-15 Health early warning method and system for real-time container Pending CN116048847A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211613911.2A CN116048847A (en) 2022-12-15 2022-12-15 Health early warning method and system for real-time container

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211613911.2A CN116048847A (en) 2022-12-15 2022-12-15 Health early warning method and system for real-time container

Publications (1)

Publication Number Publication Date
CN116048847A (en) 2023-05-02

Family

ID=86120834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211613911.2A Pending CN116048847A (en) 2022-12-15 2022-12-15 Health early warning method and system for real-time container

Country Status (1)

Country Link
CN (1) CN116048847A (en)

Similar Documents

Publication Publication Date Title
US6629266B1 (en) Method and system for transparent symptom-based selective software rejuvenation
CN105357038B (en) Monitor the method and system of cluster virtual machine
EP2659371B1 (en) Predicting, diagnosing, and recovering from application failures based on resource access patterns
US8314694B2 (en) System and method for suppressing redundant alarms
US9275172B2 (en) Systems and methods for analyzing performance of virtual environments
US8549536B2 (en) Performing a workflow having a set of dependancy-related predefined activities on a plurality of task servers
CN103092746B (en) The localization method of thread exception and system
US9399526B2 (en) Method, devices and program for computer-aided preventive diagnostics of an aircraft system, using critical event charts
TWI731146B (en) Aircraft malfunction handling system, method for handling aircraft malfunction, and computer equipment using thereof
US20130274991A1 (en) Method, devices and program for computer-aided analysis of the failure tolerance of an aircraft system, using critical event charts
CN111897671A (en) Failure recovery method, computer device, and storage medium
Pankratova Creation of Physical Models for Cyber-Physical Systems
CN115115030A (en) System monitoring method and device, electronic equipment and storage medium
CN112579267A (en) Decentralized big data job flow scheduling method and device
CN112084004A (en) Container detection and maintenance method and system for container application
CN111930561B (en) Streaming task automatic monitoring alarm restarting system and method
CN116048847A (en) Health early warning method and system for real-time container
CN112286762A (en) System information analysis method and device based on cloud environment, electronic equipment and medium
CN108021463B (en) GPU fault management method based on finite-state machine
Li et al. Redundant and fault-tolerant algorithms for real-time measurement and control systems for weapon equipment
Yang et al. Software rejuvenation in cluster computing systems with dependency between nodes
Pattanaik et al. Recovery and reliability prediction in fault tolerant automotive embedded system
CN114239538A (en) Assertion processing method and device, computer equipment and storage medium
CN111159237A (en) System data distribution method and device, storage medium and electronic equipment
Jia et al. Application and design of PHM in aircraft’s integrated modular mission system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination