CN113377626B - Visual unified alarm method, device, equipment and medium based on service tree - Google Patents

Info

Publication number
CN113377626B
Authority
CN
China
Prior art keywords
alarm
message
service
alarm notification
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110916619.7A
Other languages
Chinese (zh)
Other versions
CN113377626A (en)
Inventor
何育伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Linkedcare Information Technology Co ltd
Original Assignee
Shanghai Linkedcare Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Linkedcare Information Technology Co ltd filed Critical Shanghai Linkedcare Information Technology Co ltd
Priority to CN202110916619.7A priority Critical patent/CN113377626B/en
Publication of CN113377626A publication Critical patent/CN113377626A/en
Application granted granted Critical
Publication of CN113377626B publication Critical patent/CN113377626B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3093 Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs

Abstract

The application provides a visual unified alarm method, device, equipment and medium based on a service tree. The method receives one or more alarm notification messages from an alarm manager; standardizes each alarm notification message to extract the valid information, parses the message content to obtain the associated business service tree from a third-party service, and adds the business service tree to the standardized alarm notification message; and pushes the standardized alarm notification message to one or more topics of a publish-subscribe message system so that it can be consumed by different message consumers. The application provides a friendly front-end configuration page, reduces the cost of learning to use AlertManager, the alarm notification management module of Prometheus, and avoids manual configuration errors; it supports alarm escalation and changes of notification recipients, which helps promote the adoption of monitoring alarms within enterprises; and it unifies the handling of alarm topics, alarm content, alarm channels and alarm subscriptions.

Description

Visual unified alarm method, device, equipment and medium based on service tree
Technical Field
The invention relates to the technical field of micro-services, in particular to a visual unified alarm method, a visual unified alarm device, visual unified alarm equipment and a visual unified alarm medium based on a service tree.
Background
The microservice architecture is a software architecture mode commonly adopted by all large internet companies at present. In the micro-service architecture, a system is split into a plurality of small and mutually independent services, and the services run in own processes and can be independently developed and deployed. When the service changes rapidly, the micro-service has the characteristics of single responsibility and autonomy, so that the boundary of the system is clearer, and the maintainability of the system is improved; meanwhile, the complexity of system deployment is simplified, and the micro-service can be upgraded and released independently; when the service is increased, independent expansion can be conveniently carried out. The microservice architecture, while providing many benefits, also introduces new problems.
In the previous single application, the problem of troubleshooting is usually to locate error information and abnormal stacks by checking logs; however, in the micro-service architecture, the services are numerous, and problem location becomes very difficult when a problem occurs. In addition, micro-services often create new services by combining existing services, and a failure of one service is likely to produce an avalanche effect, resulting in unavailability of the entire system.
Therefore, how to monitor the operating condition of the microservice is extremely important, and when an abnormality occurs, the microservice can quickly and accurately give a corresponding alarm to inform corresponding service development and operation and maintenance personnel.
Currently, the underlying infrastructure supporting microservices consists of cloud-native technologies such as Kubernetes (a production-grade container orchestration system) for container orchestration and Service Mesh (an infrastructure layer for inter-service communication) for service governance, and a microservice monitoring and alarm system built on Prometheus is commonly used in the industry to achieve comprehensive monitoring and unified alarming of microservices.
Although the technical solution of using Prometheus together with Grafana (an open-source metric monitoring and visualization tool) to monitor microservices is relatively mature, alarming on the monitoring data relies on AlertManager, the alarm notification management module of Prometheus, which has the following problems.
1) Prometheus configuration files must be modified to associate AlertManager with Prometheus; alarm escalation cannot be realized; and index (metric) values and index labels cannot be acquired dynamically. Taken together, these problems show that the alarm configuration of Prometheus and AlertManager is not convenient or flexible enough: the configuration files need to be changed frequently and the learning cost is high, which hinders the adoption of monitoring alarms within enterprises.
2) In addition, in the face of a huge number of micro services, the prior art adopts a uniform alarm strategy based on AlertManager for all Prometheus systems in an enterprise, and cannot adapt to complex and variable business alarm requirements of the enterprise, for example, monitoring alarm rules required by development groups in the enterprise are configured differently, monitored applications are different, alarm notification levels and personnel are also different, while the prior art only provides alarm configuration capability of AlertManager, does not provide an interface for expanding an alarm system of the enterprise, and has poor expandability.
3) In addition, for hundreds of micro services, how to reasonably aggregate, display and automatically process alarm messages according to calling relations and time sequence relations after receiving the alarm messages is a pain point and a difficulty point in actual services.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, it is an object of the present application to provide a visual unified alarm method, apparatus, device and medium based on service tree to solve the problems in the prior art.
To achieve the above and other related objects, the present application provides a visual unified alarm method based on a service tree, including: receiving one or more alarm notification messages of an alarm manager; wherein, various alarm channels are respectively and uniformly pre-registered to the alarm manager in a Webhook form; standardizing the alarm notification message to obtain an effective message, analyzing the content of the alarm notification message to obtain a business service tree in the associated third-party service, and adding the business service tree to the alarm notification message after standardized processing; pushing the alarm notification message after standardized processing to one or more topics of a publish-subscribe message system for consumption by different message consumers; starting message consumers of a plurality of different functional modules to consume the alarm notification messages in the topics of the distributed publish-subscribe message system so as to perform customized processing or storage according to alarm requirements; and/or initiating a message consumer to subscribe to the alarm notification messages in the topics of the distributed publish-subscribe message system to receive and store the alarm notification messages of all the histories to a search engine for analysis of the alarm notification messages.
In an embodiment of the present application, the entities of the alarm notification message include: any one or more of a source, environment, type, node, and label.
In an embodiment of the present application, the method further includes: according to dynamic automatic generation of each hierarchy of the business service tree, an alarm state page corresponding to each hierarchy of the business service tree is generated and displayed visually; and providing alarm levels, alarm channel configuration and updating functions on an alarm state page of the visual business service tree so that each development, operation and maintenance personnel of the micro-service can configure and subscribe the required alarm message by themselves.
In an embodiment of the present application, the method further includes: when the state of a received alarm notification message is triggered (firing), the node panel corresponding to the service tree on the alarm state page turns red, and its numerical value increases as the number of different received alarm notification messages increases; when the state of a received alarm notification message is processing completed (resolved), the numerical value decreases as the number of different resolved alarm notification messages received increases; and when the numerical value is zero, the node panel corresponding to the service tree on the alarm state page turns green.
In an embodiment of the present application, the method further includes: and calling a corresponding fault processing service interface according to the type of each standardized alarm notification message so as to provide an alarm fault self-healing processing function.
In an embodiment of the present application, the method further includes: pushing the alarm notification message to a registered and configured alarm channel according to a given format style; the alarm channel includes any one of Enterprise WeChat (WeCom), DingTalk, email and SMS; alarm channels of the same type are further distinguished by the different groups created according to the service tree; wherein a group may correspond to any node on the service tree, so as to provide notification control of alarm notification messages at different granularities.
In an embodiment of the present application, the method further includes: acquiring the calling relations among the microservices in real time based on a cloud-native service registry and a service topology tool; and receiving the alarm information of each microservice in time order and combining it with the topology graph to quickly locate the source of the incident and the affected services.
In an embodiment of the present application, the method further includes: according to the alarm severity level and scope, adopting a corresponding alarm handling policy to achieve rapid recovery from microservice cluster faults; wherein the alarm handling policy comprises any one or more of rate limiting, circuit breaking, degradation, ejection and restart.
To achieve the above and other related objects, the present application provides a visual unified alarm system based on a service tree, the system comprising: a receiving module for receiving one or more alarm notification messages of an alarm manager; wherein, various alarm channels are respectively and uniformly pre-registered to the alarm manager in a Webhook form; the processing module is used for carrying out standardization processing on the alarm notification message to obtain an effective message, analyzing the content of the alarm notification message to obtain a business service tree in the associated third-party service, and adding the business service tree to the alarm notification message after the standardization processing; pushing the alarm notification message after standardized processing to one or more topics of a publish-subscribe message system for consumption by different message consumers; the consumer module is used for starting message consumers of a plurality of different functional modules to consume the alarm notification messages in the topics of the distributed publish-subscribe message system so as to perform customized processing or storage according to the alarm requirements; and/or initiating a message consumer to subscribe to the alarm notification messages in the topics of the distributed publish-subscribe message system to receive and store the alarm notification messages of all the histories to a search engine for analysis of the alarm notification messages.
To achieve the above and other related objects, the present application provides a computer apparatus, comprising: a memory, and a processor; the memory is to store computer instructions; the processor executes computer instructions to implement the method as described above.
To achieve the above and other related objects, the present application provides a computer readable storage medium storing computer instructions which, when executed, perform the method as described above.
In summary, the method, apparatus, device and medium for visual unified alarm based on service tree according to the present application receives one or more alarm notification messages from an alarm manager; wherein, various alarm channels are respectively and uniformly pre-registered to the alarm manager in a Webhook form; standardizing the alarm notification message to obtain an effective message, analyzing the content of the alarm notification message to obtain a business service tree in the associated third-party service, and adding the business service tree to the alarm notification message after standardized processing; pushing the alarm notification message after standardized processing to one or more topics of a publish-subscribe message system for consumption by different message consumers; starting message consumers of a plurality of different functional modules to consume the alarm notification messages in the topics of the distributed publish-subscribe message system so as to perform customized processing or storage according to alarm requirements; and/or initiating a message consumer to subscribe to the alarm notification messages in the topics of the distributed publish-subscribe message system to receive and store the alarm notification messages of all the histories to a search engine for analysis of the alarm notification messages.
Has the following beneficial effects:
The application provides a friendly front-end configuration page, reduces the cost of learning to use AlertManager, the alarm notification management module of Prometheus, and avoids manual configuration errors; it supports alarm escalation and changes of notification recipients, which helps promote the adoption of monitoring alarms within enterprises; and it unifies the handling of alarm topics, alarm content, alarm channels and alarm subscriptions, so that the visual unified alarm system based on the business service tree can be fully utilized. Based on a cloud-native service registry, microservice alarm dependency management and visualization are realized, providing a decision basis for automated service governance; AlertManager sends alarm messages to the visual unified alarm system based on the business service tree via webhook, so that a Topic can be set independently for different services and different alarm levels, enabling more precise delivery and focus of notifications.
Drawings
Fig. 1 is a flowchart illustrating a visualization unified alarm method based on a service tree according to an embodiment of the present application.
Fig. 2 is a schematic traffic flow diagram illustrating a visualization unified alarm method based on a service tree according to an embodiment of the present application.
Fig. 3 is a block diagram of a visual unified alarm system based on a service tree according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application is provided by way of specific examples, and other advantages and effects of the present application will be readily apparent to those skilled in the art from the disclosure herein. The present application is capable of other and different embodiments and its several details are capable of modifications and/or changes in various respects, all without departing from the spirit of the present application. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only schematic and illustrate the basic idea of the present application, and although the drawings only show the components related to the present application and are not drawn according to the number, shape and size of the components in actual implementation, the type, quantity and proportion of the components in actual implementation may be changed at will, and the layout of the components may be more complex.
Throughout the specification, when a part is referred to as being "connected" to another part, this includes not only a case of being "directly connected" but also a case of being "indirectly connected" with another element interposed therebetween. In addition, when a certain part is referred to as "including" a certain component, unless otherwise stated, other components are not excluded, but it means that other components may be included.
The terms first, second, third, etc. are used herein to describe various elements, components, regions, layers and/or sections, but are not limited thereto. These terms are only used to distinguish one element, component, region, layer or section from another element, component, region, layer or section. Thus, a first element, component, region, layer or section discussed below could be termed a second element, component, region, layer or section without departing from the scope of the present application.
Also, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms "comprises", "comprising" and/or "including", when used in this specification, specify the presence of stated features, operations, elements, components, items, species, and/or groups, but do not preclude the presence or addition of one or more other features, operations, elements, components, items, species, and/or groups thereof. The terms "or" and "and/or" as used herein are to be construed as inclusive, meaning any one or any combination. Thus, "A, B or C" or "A, B and/or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition will occur only when a combination of elements, functions or operations is inherently mutually exclusive in some way.
The aforementioned Prometheus is open-source monitoring software hosted by the CNCF. Its workflow is roughly as follows: according to its configuration, Prometheus periodically collects data from each node; the default collection mode is pull, and data from monitored nodes can also be collected in push mode through the Pushgateway. The collected data is stored in a TSDB (time series database). Once Prometheus has acquired the monitoring data, it can be queried with the built-in PromQL. The alarm function is provided by the alarm manager (Alertmanager), the Prometheus component that manages and sends alarms. The native charting function of Prometheus is too simple, so Prometheus data can be fed into Grafana and managed uniformly there.
The present application addresses the insufficient alarm configuration capability of the alarm manager of a Prometheus server. Using the Prometheus server as a framework, it provides a visual unified alarm method, system and storage medium based on a service tree, offering alarm state pages at different alarm notification levels, diversified alarm channel configuration, friendly visualization of the service tree and service dependencies at each level, an automatic alarm processing function, and storage of historical alarm notifications.
Fig. 1 is a schematic flow chart of a visualization unified alarm method based on a service tree in an embodiment of the present application. As shown, the method comprises:
Step S101: Receiving one or more alarm notification messages of an alarm manager; wherein various alarm channels are each pre-registered with the alarm manager, in a unified manner, in Webhook form.
A Webhook is an API concept and one of the usage paradigms of microservice APIs; it is also called a reverse API, that is, the front end does not actively send requests and data is pushed entirely by the back end. As a common example, when a friend posts to their Moments feed, the back end pushes the message to the clients of all of that friend's contacts, which is a typical Webhook scenario. Briefly, a Webhook is a URL that receives HTTP POST (or GET, PUT, DELETE) requests. An API provider that implements Webhooks sends a message to the configured URL when an event occurs. With Webhooks, changes can be received in real time, unlike the request-response model. This inverts the client-server model: in the traditional approach, the client requests data from the server and the server then provides it (the client pulls the data), whereas in the Webhook paradigm the server updates the resource to be provided and then automatically sends the update to the client (the server pushes the data), so the client is not a requester but a passive recipient. This inversion of control can replace many interactions that would otherwise require more complex requests and constant polling of the remote server. By simply receiving resources rather than requesting them directly, a remote code base can be updated, resources can be allocated easily, and the data can even be integrated into an existing system to update endpoints and related data as the API requires. Webhooks are also well suited to actively pushing data in scenarios such as authentication and login through a third-party platform that has no front-end interface to relay through, or payment scenarios with strong security requirements.
The alarm manager (Alertmanager) is an independent alarm module. It receives alarms sent by clients such as Prometheus (open-source monitoring software), processes them through grouping, deduplication and the like, and routes them to the correct receiver; alarms can be sent to the persons responsible for different modules according to different rules.
The alarm channel includes: Enterprise WeChat (WeCom), DingTalk, email and SMS. For example, Alertmanager supports alarm modes such as Email and Slack, and domestic IM tools such as DingTalk can be connected through a webhook.
Briefly, by pre-registering the various alarm channels (e.g., Enterprise WeChat, DingTalk, email and SMS) with the alarm manager (Alertmanager), each in Webhook form, one or more alarm notification messages from the alarm manager can be received in a unified manner.
Preferably, the entities of the alarm notification message include: any one or more of a source, environment, type, node, and label.
Therefore, AlertManager sends alarm messages to the visual unified alarm system based on the business service tree via webhook, so that a Topic can be set independently for different services and different alarm levels, enabling more precise delivery and focus of notifications.
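As an illustration of step S101, the following is a minimal sketch of a webhook receiver, not the implementation described in this application. It assumes the JSON body that Alertmanager posts to a webhook receiver (a top-level `status` and an `alerts` array whose items carry `status`, `labels`, `annotations`, `startsAt` and `fingerprint`). The names `parse_alertmanager_webhook`, `AlertWebhookHandler`, `serve` and the port are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_alertmanager_webhook(body: bytes) -> list:
    """Return the individual alerts contained in one webhook POST body."""
    payload = json.loads(body)
    alerts = []
    for alert in payload.get("alerts", []):
        alerts.append({
            # Fall back to the group-level status if an alert omits its own.
            "status": alert.get("status", payload.get("status")),
            "labels": alert.get("labels", {}),
            "annotations": alert.get("annotations", {}),
            "startsAt": alert.get("startsAt"),
            "fingerprint": alert.get("fingerprint"),
        })
    return alerts


class AlertWebhookHandler(BaseHTTPRequestHandler):
    # Hypothetical in-memory buffer; a real system would hand alerts
    # to the standardization step instead of keeping them here.
    received = []

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            alerts = parse_alertmanager_webhook(self.rfile.read(length))
        except (ValueError, AttributeError):
            # Body was not valid JSON or not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        self.received.extend(alerts)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver; the port is an arbitrary choice for this sketch."""
    HTTPServer(("0.0.0.0", port), AlertWebhookHandler).serve_forever()
```

Calling `serve()` starts the listener; the URL of this endpoint is what would be registered as the webhook address in the alarm manager's receiver configuration.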
Step S102: Standardizing the alarm notification message, parsing the content of the alarm notification message to obtain the service tree from a third-party service, and adding the service tree to the standardized alarm notification message.
Preferably, standardizing the alarm notification message means extracting the valid information and discarding irrelevant data. Alarm messages typically have many levels and much redundant content; if each alarm notification message were passed on in full, the processing speed of the message consumers would be reduced and network resources would be wasted.
Specifically, the content of the alarm notification message is parsed to obtain the service tree from the third-party service as follows: according to the environment, type and node information in the entity of the alarm notification message, the data of the service tree corresponding to the alarm is acquired from the third-party service and added to the standardized alarm message.
Preferably, the service tree is a tree-shaped structure abstraction for information of each service Department, each service Project, service Cluster and service Role, and the abstraction is helpful for the alarm message to be quickly and accurately positioned to a development responsible person of the micro-service.
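Step S102 can be sketched roughly as follows. This is an illustrative sketch under stated assumptions: the label names `env`, `type` and `node`, the `NormalizedAlert` class, and the in-memory `SERVICE_TREE` dictionary (standing in for the third-party service that returns Department, Project, Cluster and Role data) are all hypothetical and are not specified by this application.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the third-party service-tree lookup,
# keyed by (environment, type, node).
SERVICE_TREE = {
    ("prod", "app", "order-api-1"): {
        "department": "retail",
        "project": "orders",
        "cluster": "order-api",
        "role": "backend",
    },
}


@dataclass
class NormalizedAlert:
    source: str
    environment: str
    type: str
    node: str
    labels: dict
    status: str
    service_tree: dict = field(default_factory=dict)


def normalize(raw: dict, source: str = "alertmanager") -> NormalizedAlert:
    """Keep only the entity fields and attach the matching service-tree data."""
    labels = raw.get("labels", {})
    alert = NormalizedAlert(
        source=source,
        environment=labels.get("env", "unknown"),
        type=labels.get("type", "unknown"),
        node=labels.get("node", "unknown"),
        # Remaining labels are kept; every other raw field is discarded.
        labels={k: v for k, v in labels.items() if k not in ("env", "type", "node")},
        status=raw.get("status", "firing"),
    )
    key = (alert.environment, alert.type, alert.node)
    alert.service_tree = dict(SERVICE_TREE.get(key, {}))
    return alert
```

An alert whose key is not found keeps an empty `service_tree`, so downstream consumers can still process it even though it cannot be attributed to a responsible team.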
Step S103: Pushing the standardized alarm notification message to one or more topics of a distributed publish-subscribe message system so that it can be consumed by different message consumers.
For example, the standardized alert notification messages are pushed to one or more topics (Topic) of a Kafka (a high-throughput distributed publish-subscribe messaging system) for subsequent consumption processing by different message consumers (consumers).
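The publishing step might look like the sketch below. The topic naming scheme (`alerts.all` plus `alerts.<project>.<severity>`) and the `InMemoryBroker` class are assumptions made for illustration; in a real deployment the broker object would be replaced by a Kafka producer client, and the application does not prescribe any particular topic layout.

```python
import json
from collections import defaultdict


def topics_for(alert: dict) -> list:
    """Derive target topics from the service-tree project and the severity label."""
    project = alert.get("service_tree", {}).get("project", "unassigned")
    severity = alert.get("labels", {}).get("severity", "info")
    return ["alerts.all", f"alerts.{project}.{severity}"]


class InMemoryBroker:
    """Stand-in for a publish-subscribe system; each topic is an append-only list."""

    def __init__(self):
        self.topics = defaultdict(list)

    def send(self, topic: str, value: bytes) -> None:
        self.topics[topic].append(value)


def publish(broker, alert: dict) -> list:
    """Serialize the standardized alert once and send it to every target topic."""
    value = json.dumps(alert, sort_keys=True).encode("utf-8")
    targets = topics_for(alert)
    for topic in targets:
        broker.send(topic, value)
    return targets
```

Publishing to a per-service, per-level topic in addition to a catch-all topic is one way to let a team subscribe only to what it owns while an archive consumer still sees everything.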
So far, compared with the prior art, steps S101 to S103 of the present application receive the alarm notification messages from the alarm manager and then process and publish them in a unified manner with the help of the service tree data. Flexible alarm configuration is then realized through each level or each node of the business service tree, and on this basis dynamic acquisition of index values and index labels is supported. Reference may be made to the flow diagram shown in Fig. 2.
Step S104: starting message consumers of a plurality of different functional modules to consume the alarm notification messages in the topics of the distributed publish-subscribe message system so as to perform customized processing or storage according to alarm requirements; and/or initiating a message consumer to subscribe to the alarm notification messages in the topics of the distributed publish-subscribe message system to receive and store the alarm notification messages of all the histories to a search engine for analysis of the alarm notification messages.
For example, Consumers of several different functional modules are started simultaneously to consume the alarm notification messages in the Kafka Topic and perform customized processing or storage according to the requirements of the alarm system of the present application. And/or, an additional Consumer is started that subscribes to the alarm messages in the Kafka Topic and stores a copy of all historical alarm messages in Elasticsearch (a distributed, multi-tenant-capable full-text search engine) to provide intelligent analysis capability for alarm notification messages.
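The consumption pattern of step S104 can be modelled as below. This is a simplified in-memory sketch, assuming consumer-group semantics in which each group tracks its own offset and therefore receives every message independently; `Topic`, `HistoryArchive` and `run_consumers` are hypothetical names, and `HistoryArchive` only imitates a search-engine index.

```python
class Topic:
    """Append-only log with one read offset per consumer group."""

    def __init__(self):
        self.log = []
        self.offsets = {}  # consumer group name -> next index to read

    def append(self, message: dict) -> None:
        self.log.append(message)

    def poll(self, group: str) -> list:
        start = self.offsets.get(group, 0)
        batch = self.log[start:]
        self.offsets[group] = len(self.log)
        return batch


class HistoryArchive:
    """Stand-in for a search-engine index holding all historical alerts."""

    def __init__(self):
        self.documents = []

    def index(self, doc: dict) -> None:
        self.documents.append(doc)

    def search(self, **terms) -> list:
        # Exact-match filter on the given fields.
        return [d for d in self.documents
                if all(d.get(k) == v for k, v in terms.items())]


def run_consumers(topic: Topic, handlers: dict) -> None:
    """Let every functional module (one group each) consume its unread messages."""
    for group, handler in handlers.items():
        for message in topic.poll(group):
            handler(message)
```

Because the archive consumer is just another group, adding or removing it does not affect the modules that notify people or trigger automatic handling.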
In an embodiment of the present application, the method further includes:
A. Dynamically and automatically generating, according to each level of the business service tree, an alarm state page corresponding to that level, and displaying the alarm state page visually.
For example, this step may correspond to a state management module of the present system, which displays the alarm state pages of each level of the service tree in a friendly, visual manner; the alarm state pages are generated dynamically and automatically according to the levels of the service tree. Preferably, when the state of a received alarm notification message is firing, the state on the alarm state page turns red, and the value increases as the number of received firing alarm notification messages increases; when the state of a received alarm notification message is completed or recovered (resolved), the value decreases as the number of received resolved alarm notification messages increases; when the value is zero, the state on the alarm state page turns green.
B. Providing alarm level and alarm channel configuration and update functions on the alarm state page of the visual business service tree, so that the development and operation and maintenance personnel of each microservice can configure and subscribe to the alarm messages they need by themselves.
For example, this step may correspond to a configuration update module of the present system, which provides alarm level and alarm channel configuration and update functions on a visual interface, so that the development and operation and maintenance personnel of a microservice can configure and subscribe to the alarm notification messages they want to receive.
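The red/green panel behaviour of the alarm state page described in item A can be sketched as a small state model. This is an assumption-laden illustration: identifying distinct alarms by a `fingerprint` string and propagating counts from a node to its ancestors in the service tree are choices made for this sketch, and `NodePanel` is a hypothetical name.

```python
class NodePanel:
    """Alarm state of one service-tree node as shown on the status page."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.active = set()  # fingerprints of alarms currently firing

    @property
    def count(self) -> int:
        return len(self.active)

    @property
    def color(self) -> str:
        return "red" if self.active else "green"

    def apply(self, fingerprint: str, status: str) -> None:
        """Update this node and all of its ancestors for one alarm message."""
        node = self
        while node is not None:
            if status == "firing":
                node.active.add(fingerprint)      # repeated firing is counted once
            elif status == "resolved":
                node.active.discard(fingerprint)  # unknown fingerprints are ignored
            node = node.parent
```

For example, two different firing alarms on a project node make both the project and its department panel red with a value of 2; once both are resolved the value returns to zero and the panels turn green.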
In an embodiment of the present application, the method further includes: calling a corresponding fault handling service interface according to the type of each standardized alarm notification message, so as to provide a self-healing function for alarm faults.
For example, this embodiment may correspond to an alarm processing module of the system, which calls the corresponding fault handling service interface according to the type of the standardized alarm notification message to provide a self-healing function for alarm faults.
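One way to picture the type-based dispatch is a small handler registry, sketched below in Python. The alarm type names (`disk_full`, `pod_crash_loop`), the handler functions, and the message fields are hypothetical; a real deployment would call actual fault handling service interfaces instead of returning strings.

```python
# Registry from alarm type to fault handling function (all names hypothetical).
HANDLERS = {}


def handles(alarm_type):
    """Decorator that registers a fault handler for one alarm type."""
    def register(func):
        HANDLERS[alarm_type] = func
        return func
    return register


@handles("disk_full")
def clean_disk(alarm):
    # Placeholder for a call to a disk cleanup service interface.
    return f"cleanup requested on {alarm['node']}"


@handles("pod_crash_loop")
def restart_pod(alarm):
    # Placeholder for a call to a restart service interface.
    return f"restart requested for {alarm['node']}"


def self_heal(alarm):
    """Call the handler matching the alarm type; return None if there is none."""
    handler = HANDLERS.get(alarm["type"])
    if handler is None:
        return None
    return handler(alarm)
```

Alarm types without a registered handler fall through untouched, so they can still be delivered to people through the alarm channels.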
In an embodiment of the present application, the method further includes:
A. Pushing the alarm notification message to the registered and configured alarm channels in a given format and style; the alarm channels include any one of Enterprise WeChat (WeCom), DingTalk, email and SMS;
B. Alarm channels of the same type are further distinguished by the different groups created according to the service tree; a group may correspond to any node on the service tree, so as to provide notification control of alarm notification messages at different granularities.
For example, this embodiment may correspond to a channel pushing module of the present system, which pushes the alarm notification message to the various registered and configured alarm channels in a given format and style. The types of alarm channels include, but are not limited to, Enterprise WeChat (WeCom), DingTalk, email and SMS. Alarm channels of the same type are further distinguished by the different groups created according to a particular business service tree; a group may correspond to any node on that service tree, so as to provide alarm message notification control at different granularities.
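The granularity control described above can be sketched as follows in Python. This is a hedged illustration: the `ChannelGroup` type, the slash-separated node paths, the message format, and the rule that a group bound to a node also receives alarms from that node's descendants are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelGroup:
    channel: str  # e.g. "wecom", "dingtalk", "email", "sms"
    name: str     # group name inside that channel
    node: str     # service-tree node the group is bound to


def ancestors_and_self(node_path):
    """'a/b/c' -> ['a', 'a/b', 'a/b/c']."""
    parts = node_path.split("/")
    return ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]


def route(alarm_node, groups):
    """Return the groups bound to the alarm's node or to one of its ancestors."""
    covered = set(ancestors_and_self(alarm_node))
    return [group for group in groups if group.node in covered]


def render(alarm):
    """Format an alarm in one fixed, hypothetical text style."""
    return f"[{alarm['severity'].upper()}] {alarm['node']}: {alarm['summary']}"
```

Binding a group high in the tree gives a coarse subscription (everything under a department), while binding it to a leaf gives a fine one (a single micro-service).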
In short, the method and the system provide alarm escalation and changes of the notified party, which helps promote monitoring and alarming within an enterprise; alarm topics, alarm content, alarm channels and alarm subscriptions are handled in a unified way, so that the visual unified alarm system based on the business service tree can be fully utilized.
It should be noted that, as the number of services grows, the dependency relationships between micro-services become more and more complex; an alarm on a micro-service cluster may be triggered by the alarm of a single micro-service and may affect upstream and downstream micro-services. The service dependency relationships, the time sequence of alarms and the like are therefore of great significance for managing micro-services effectively.
To this end, the present application also proposes: based on a cloud-native K8s service registry (a public container K8s cluster) and the service topology tool Kiali, the alarm processing module obtains the calling relationships of the micro-services in real time, receives the alarm messages of each micro-service in time order, and, in combination with the topology graph, quickly locates the source of an incident and the services it affects.
For example, based on the cloud-native K8s service registry and the service topology tool Kiali, the calling relationships of the micro-services are obtained in real time, the alarm messages of each micro-service are received in time order, and the source of the incident and the affected services are quickly located in combination with the topology graph.
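The text does not spell out how the topology graph and the time-ordered alarms are combined. The Python sketch below shows one plausible heuristic, stated as an assumption: the likely source is an alarming service none of whose callees are alarming, with ties broken by the earliest alarm time, and the affected services are all direct or transitive callers of that source. The call graph is passed in as a plain dictionary; fetching it from Kiali is outside the sketch.

```python
def locate_source(calls, alarms):
    """Pick the likely incident source.

    calls:  dict mapping a caller to the set of services it calls.
    alarms: list of (timestamp, service) tuples.
    """
    first_seen = {}
    for timestamp, service in sorted(alarms):
        first_seen.setdefault(service, timestamp)
    alarming = set(first_seen)
    # Candidates are alarming services whose own dependencies look healthy.
    candidates = [s for s in alarming if not (calls.get(s, set()) & alarming)]
    if not candidates:
        # Every alarming service depends on another one (a cycle): fall back.
        candidates = list(alarming)
    if not candidates:
        return None
    return min(candidates, key=lambda s: (first_seen[s], s))


def impacted(calls, source):
    """Return every service that calls `source` directly or transitively."""
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    seen, stack = set(), [source]
    while stack:
        current = stack.pop()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen
```

With a chain gateway → orders → payments in which all three services alarm, the heuristic points at payments and reports orders and gateway as affected.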
For example, in different K8s clusters, the distributed scheduling platform XXL-JOB pushes data to Pushgateway; the data is then provided from Pushgateway to a proxy server, and from the proxy server to the alarm manager Alertmanager and the open-source metrics monitoring and visualization tool Grafana, and is finally aggregated into the present system, so as to obtain the calling relationships of the micro-services in real time, receive the alarm messages of each micro-service in time order, and quickly locate the source of the incident and the affected services in combination with the topology graph.
Kiali is a system with a separate front end and back end; when the image is built, the front end and the back end are packaged into the same image. Kiali relies on two external services. One is Prometheus, a monitoring and alerting system, from which Kiali queries data to produce the topology graph and other statistical charts. The other is the cluster API, from which Kiali obtains data such as services and deployments, as well as the YAML configuration of resources such as VirtualService and DestinationRule for configuration validation. Kiali can also be configured with two optional services, Jaeger and Grafana. Jaeger is a distributed tracing system developed by Uber, and Grafana is a data visualization system. These functions of Kiali are all based on Istio.
Based on the cloud-native service registry, the present application thus achieves management and visualization of micro-service alarm dependencies, and provides a decision basis for automated service governance.
Further, the method also comprises: adopting a corresponding alarm processing strategy according to the severity level and scope of the alarm, so as to quickly recover from micro-service cluster faults; the alarm processing strategy includes any one or more of rate limiting, circuit breaking, degradation, elastic scaling and restarting.
For example, the alarm processing module of the system may have built-in service governance means such as rate limiting, circuit breaking, degradation, elastic scaling and restarting, and adopts an appropriate alarm processing strategy according to the severity level and scope of the alarm, so as to quickly recover from micro-service cluster faults.
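How a strategy is chosen from severity and scope is not specified in the text. The following Python sketch uses a purely hypothetical policy table to show the shape such a selection could take; the severity names, the single/multiple scope split, and the mapping itself are assumptions.

```python
from enum import Enum


class Strategy(Enum):
    RATE_LIMIT = "rate_limit"
    CIRCUIT_BREAK = "circuit_break"
    DEGRADE = "degrade"
    SCALE_OUT = "scale_out"
    RESTART = "restart"


# Hypothetical policy: (severity, scope) -> ordered list of strategies.
POLICY = {
    ("warning", "single"): [Strategy.RATE_LIMIT],
    ("warning", "multiple"): [Strategy.RATE_LIMIT, Strategy.SCALE_OUT],
    ("critical", "single"): [Strategy.RESTART],
    ("critical", "multiple"): [Strategy.CIRCUIT_BREAK, Strategy.DEGRADE],
}


def choose_strategies(severity, affected_services):
    """Select strategies from the alarm severity and the number of affected services."""
    scope = "single" if len(affected_services) <= 1 else "multiple"
    # Unknown combinations yield no automatic action.
    return list(POLICY.get((severity, scope), []))
```

Keeping the policy in a table rather than in branching code makes it easy to review and adjust which automatic action applies to which class of alarm.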
In summary, the present application provides a service tree-based visual unified alarm method, device, equipment and medium, and aims to implement a service tree-based visual unified alarm scheme to provide different alarm notification levels, diversified alarm channel configurations, friendly alarm state pages at each level of the visual service tree, an automated alarm processing function and a historical alarm notification storage medium.
Fig. 3 is a schematic block diagram of a visual unified alarm system based on a service tree according to an embodiment of the present application. As shown, the system 300 includes:
A receiving module 301, configured to receive one or more alarm notification messages from an alarm manager; wherein the various alarm channels are each uniformly pre-registered with the alarm manager in the form of a Webhook.
A processing module 302, configured to perform standardization processing on the alarm notification message to obtain an effective message, analyze the content of the alarm notification message to obtain a service tree in the associated third-party service, and add the service tree to the alarm notification message after the standardization processing; and pushing the alarm notification message after standardized processing to one or more topics of a publish-subscribe message system so as to be consumed by different message consumers.
A consumer module 303, configured to start message consumers of multiple different function modules to consume the alarm notification messages in each topic of the distributed publish-subscribe message system, so as to perform customized processing or storage according to alarm requirements; and/or initiating a message consumer to subscribe to the alarm notification messages in the topics of the distributed publish-subscribe message system to receive and store the alarm notification messages of all the histories to a search engine for analysis of the alarm notification messages.
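The three modules can be pictured end to end with a short Python sketch. It rests on several assumptions: the incoming payload follows the Alertmanager webhook layout (a top-level `receiver` and an `alerts` list whose items carry `status` and `labels`), the `service` and `env` label names are hypothetical, the `SERVICE_TREE` dictionary stands in for the third-party service that owns the service tree, and the in-memory `Broker` stands in for a real distributed publish-subscribe system.

```python
from collections import defaultdict

# Hypothetical stand-in for the third-party service holding the service tree.
SERVICE_TREE = {"invoice-api": "clinic/billing/invoice-api"}


def normalize(payload):
    """Turn one webhook payload into standardized alarm notification messages."""
    messages = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        service = labels.get("service")  # assumed label name
        if service is None:
            continue  # treated here as irrelevant data and discarded
        messages.append({
            "source": payload.get("receiver", "unknown"),
            "environment": labels.get("env", "unknown"),  # assumed label name
            "type": labels.get("alertname", "unknown"),
            "node": labels.get("instance", "unknown"),
            "labels": labels,
            "status": alert.get("status", "firing"),
            "service_tree": SERVICE_TREE.get(service, "unassigned"),
        })
    return messages


class Broker:
    """Minimal in-memory topic broker used only for illustration."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, consumer):
        self._subscribers[topic].append(consumer)

    def publish(self, topic, message):
        # Every consumer subscribed to the topic receives the message.
        for consumer in self._subscribers[topic]:
            consumer(message)
```

A history consumer could simply be `history.append` subscribed to a topic such as `alarms.prod`, while other consumers on the same topic update the state page or push to alarm channels independently.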
It should be noted that the information interaction between the modules/units of the system, their execution processes and the like are based on the same concept as the method embodiments of this application and bring the same technical effects; for details, refer to the description of the foregoing method embodiments, which is not repeated here.
It should be further noted that the division of the above system into modules is only a logical division; in an actual implementation the modules may be wholly or partially integrated into one physical entity, or may be physically separate. These modules may all be implemented in the form of software invoked by a processing element.
Fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present invention. As shown, the computer device 400 includes: a memory 401, and a processor 402; the memory 401 is used for storing computer instructions; the processor 402 executes computer instructions to implement the method described in fig. 1.
In some embodiments, the computer device 400 may include one or more memories 401 and one or more processors 402; Fig. 4 shows one of each as an example.
In an embodiment of the present application, the processor 402 in the computer device 400 loads one or more instructions corresponding to processes of an application program into the memory 401 according to the steps described in fig. 1, and the processor 402 executes the application program stored in the memory 401, thereby implementing the method described in fig. 1.
The memory 401 may include a Random Access Memory (RAM), or may also include a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 401 stores an operating system and operating instructions, executable modules or data structures, or a subset thereof, or an expanded set thereof, wherein the operating instructions may include various operating instructions for implementing various operations. The operating system may include various system programs for implementing various basic services and for handling hardware-based tasks.
The processor 402 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In some specific applications, the various components of the computer device 400 are coupled together by a bus system, which may include a power bus, a control bus, a status signal bus and the like in addition to a data bus. For clarity of explanation, however, the various buses are collectively referred to as the bus system in Fig. 4.
In an embodiment of the present application, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the method described in fig. 1.
The present application may be embodied as systems, methods, and/or computer program products, in any combination of technical details. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present application.
The computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable programs described herein may be downloaded from a computer-readable storage medium to a variety of computing/processing devices, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present application may be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, integrated circuit configuration data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++ or the like, and procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), can be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuitry can execute the computer-readable program instructions to implement aspects of the present application.
In summary, the present application provides a visual unified alarm method, apparatus, device and medium based on a service tree, by receiving one or more alarm notification messages from an alarm manager; wherein, various alarm channels are respectively and uniformly pre-registered to the alarm manager in a Webhook form; standardizing the alarm notification message to obtain an effective message, analyzing the content of the alarm notification message to obtain a business service tree in the associated third-party service, and adding the business service tree to the alarm notification message after standardized processing; pushing the alarm notification message after standardized processing to one or more topics of a publish-subscribe message system for consumption by different message consumers; starting message consumers of a plurality of different functional modules to consume the alarm notification messages in the topics of the distributed publish-subscribe message system so as to perform customized processing or storage according to alarm requirements; and/or initiating a message consumer to subscribe to the alarm notification messages in the topics of the distributed publish-subscribe message system to receive and store the alarm notification messages of all the histories to a search engine for analysis of the alarm notification messages.
The application effectively overcomes various defects in the prior art and has high industrial utilization value.
The above embodiments are merely illustrative of the principles and utilities of the present application and are not intended to limit the invention. Any person skilled in the art can modify or change the above-described embodiments without departing from the spirit and scope of the present application. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present application.

Claims (11)

1. A visualization unified alarm method based on a service tree is characterized by comprising the following steps:
receiving one or more alarm notification messages from an alarm manager; wherein a plurality of alarm channels corresponding to the alarm notification messages are each uniformly pre-registered with the alarm manager in the form of a Webhook;
standardizing the alarm notification message to obtain an effective message and abandon irrelevant data, analyzing the content of the alarm notification message to obtain a business service tree in the associated third-party service, and adding the business service tree to the alarm notification message after standardized processing;
pushing the alarm notification message after standardized processing to one or more topics of a publish-subscribe message system for consumption by different message consumers;
starting message consumers of a plurality of different functional modules to consume the alarm notification messages in the topics of the distributed publish-subscribe message system so as to perform customized processing or storage according to alarm requirements; and/or initiating a message consumer to subscribe to the alarm notification messages in the topics of the distributed publish-subscribe message system to receive and store the alarm notification messages of all the histories to a search engine for analysis of the alarm notification messages.
2. The method of claim 1, wherein the entity of the alert notification message comprises: any one or more of a source, environment, type, node, and label.
3. The method of claim 1, further comprising:
dynamically and automatically generating, according to each level of the business service tree, an alarm state page corresponding to that level, and displaying the alarm state page visually;
providing alarm level and alarm channel configuration and update functions on the alarm state page of the visualized business service tree, so that the development and operation and maintenance personnel of each micro-service can configure and subscribe to the alarm messages they need by themselves.
4. The method of claim 3, further comprising:
when the status of a received alarm notification message is triggered, the state of the node panel corresponding to the service tree on the alarm state page turns red, and the value increases as the number of different received alarm notification messages increases;
when the status of a received alarm notification message is processing completed, the value decreases as the number of different received alarm notification messages increases; and when the value is zero, the state of the node panel corresponding to the service tree on the alarm state page turns green.
5. The method of claim 1, further comprising:
calling a corresponding fault handling service interface according to the type of each standardized alarm notification message, so as to provide a self-healing function for alarm faults.
6. The method of claim 1, further comprising:
pushing the alarm notification message to the registered and configured alarm channels in a given format and style; the alarm channels include any one of Enterprise WeChat (WeCom), DingTalk, email and SMS;
alarm channels of the same type are further distinguished by the different groups created according to the service tree; wherein a group may correspond to any node on the service tree, so as to provide notification control of alarm notification messages at different granularities.
7. The method of claim 1, further comprising:
obtaining the calling relationships of the micro-services in real time based on a cloud-native service registry and a service topology tool;
receiving the alarm messages of each micro-service in time order, and quickly locating the source of an incident and the affected services in combination with the topology graph.
8. The method of claim 1, further comprising:
adopting a corresponding alarm processing strategy according to the severity level and scope of the alarm, so as to quickly recover from micro-service cluster faults; wherein the alarm processing strategy comprises any one or more of rate limiting, circuit breaking, degradation, elastic scaling and restarting.
9. A visual unified alarm system based on a service tree, characterized in that the system comprises:
a receiving module for receiving one or more alarm notification messages from an alarm manager; wherein a plurality of alarm channels corresponding to the alarm notification messages are each uniformly pre-registered with the alarm manager in the form of a Webhook;
the processing module is used for carrying out standardization processing on the alarm notification message to obtain an effective message and abandon irrelevant data, analyzing the content of the alarm notification message to obtain a business service tree in the associated third-party service, and adding the business service tree to the alarm notification message after the standardization processing; pushing the alarm notification message after standardized processing to one or more topics of a publish-subscribe message system for consumption by different message consumers;
the consumer module is used for starting message consumers of a plurality of different functional modules to consume the alarm notification messages in the topics of the distributed publish-subscribe message system so as to perform customized processing or storage according to the alarm requirements; and/or initiating a message consumer to subscribe to the alarm notification messages in the topics of the distributed publish-subscribe message system to receive and store the alarm notification messages of all the histories to a search engine for analysis of the alarm notification messages.
10. A computer device, the device comprising: a memory, and a processor; the memory is to store computer instructions; the processor executes computer instructions to implement the method of any one of claims 1 to 8.
11. A computer-readable storage medium having stored thereon computer instructions which, when executed, perform the method of any one of claims 1 to 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110916619.7A CN113377626B (en) 2021-08-11 2021-08-11 Visual unified alarm method, device, equipment and medium based on service tree

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110916619.7A CN113377626B (en) 2021-08-11 2021-08-11 Visual unified alarm method, device, equipment and medium based on service tree

Publications (2)

Publication Number Publication Date
CN113377626A CN113377626A (en) 2021-09-10
CN113377626B true CN113377626B (en) 2021-11-23

Family

ID=77576694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110916619.7A Active CN113377626B (en) 2021-08-11 2021-08-11 Visual unified alarm method, device, equipment and medium based on service tree

Country Status (1)

Country Link
CN (1) CN113377626B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114253807B (en) * 2021-12-20 2023-04-07 深圳前海微众银行股份有限公司 Alarm information notification method and device
CN114500248B (en) * 2022-04-01 2022-08-05 北京锐融天下科技股份有限公司 Monitoring and alarming method and system for service in Internet software system
CN117149897B (en) * 2023-10-31 2024-01-26 成都交大光芒科技股份有限公司 Big data alarm information hierarchical display system and method based on double-buffer technology

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111611137A (en) * 2020-06-30 2020-09-01 平安银行股份有限公司 Alarm monitoring method and device, computer equipment and storage medium
CN113141485A (en) * 2020-10-22 2021-07-20 西安天和防务技术股份有限公司 Alarm system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8111814B2 (en) * 2006-03-20 2012-02-07 Microsoft Corporation Extensible alert types
US10853160B2 (en) * 2018-05-04 2020-12-01 Vmware, Inc. Methods and systems to manage alerts in a distributed computing system
CN109710487A (en) * 2018-11-29 2019-05-03 同盾控股有限公司 A kind of monitoring method and device
CN112511339B (en) * 2020-11-09 2023-04-07 宝付网络科技(上海)有限公司 Container monitoring alarm method, system, equipment and storage medium based on multiple clusters
CN112559281A (en) * 2020-12-07 2021-03-26 恩亿科(北京)数据科技有限公司 Alarm routing system and method based on configuration

Also Published As

Publication number Publication date
CN113377626A (en) 2021-09-10

Similar Documents

Publication Publication Date Title
US11934417B2 (en) Dynamically monitoring an information technology networked entity
CN113377626B (en) Visual unified alarm method, device, equipment and medium based on service tree
CN110310034B (en) Service arrangement and business flow processing method and device applied to SaaS
US10057152B2 (en) Providing an unseen message count across devices
CN107491488B (en) Page data acquisition method and device
CN111190888A (en) Method and device for managing graph database cluster
US10521263B2 (en) Generic communication architecture for cloud microservice infrastructure
US11818152B2 (en) Modeling topic-based message-oriented middleware within a security system
CN112235130A (en) Method and device for realizing operation and maintenance automation based on SDN network
CN113704065A (en) Monitoring method, device, equipment and computer storage medium
CN111782672B (en) Multi-field data management method and related device
CN114185734A (en) Cluster monitoring method and device and electronic equipment
CN113220342A (en) Centralized configuration method and device, electronic equipment and storage medium
CN113486095A (en) Civil aviation air traffic control cross-network safety data exchange management platform
CN114756301B (en) Log processing method, device and system
CN111274032A (en) Task processing system and method, and storage medium
CN116112342A (en) Alarm information processing method, device, electronic equipment and storage medium
CN113242148B (en) Method, device, medium and electronic equipment for generating monitoring alarm related information
CN113762910A (en) Document monitoring method and device
CN112488462A (en) Unified pushing method, device and medium for workflow data
CN111858260A (en) Information display method, device, equipment and medium
CN112749204A (en) Method and device for reading data
CN114844957B (en) Link message conversion method, device, equipment, storage medium and program product
CN111143408B (en) Event processing method and device based on business rule
CN115208764A (en) Resource pool-based request response method, device and medium thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Visual unified alarm method, device, equipment, and medium based on service tree

Effective date of registration: 20231127

Granted publication date: 20211123

Pledgee: China Minsheng Banking Corp Shanghai branch

Pledgor: SHANGHAI LINKEDCARE INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2023310000785
