CN113657715A - Root cause positioning method and system based on kernel density estimation calling chain - Google Patents

Publication number
CN113657715A
Authority
CN
China
Prior art keywords
node
root cause
kpi
kernel density
index
Prior art date
Legal status
Pending
Application number
CN202110799721.3A
Other languages
Chinese (zh)
Inventor
李立泓
闫二乐
郑康秋
林诚汉
陈立峰
林俊德
Current Assignee
Fujia Newland Software Engineering Co ltd
Original Assignee
Fujia Newland Software Engineering Co ltd
Priority date
2021-07-15
Filing date
2021-07-15
Publication date
2021-11-16
Application filed by Fujia Newland Software Engineering Co ltd
Priority to CN202110799721.3A
Publication of CN113657715A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/06 - Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 - Operations research, analysis or management
    • G06Q10/0639 - Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 - Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 - Complex mathematical operations
    • G06F17/18 - Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 - Computing arrangements using knowledge-based models
    • G06N5/01 - Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Operations Research (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Pure & Applied Mathematics (AREA)
  • Strategic Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a root cause positioning method and system based on a kernel density estimation call chain, in the technical field of computers. The method comprises the following steps: step S10, collecting the availability index and KPI indexes of each business service, and the response time and success rate indexes between business services on each call chain; step S20, monitoring the availability index based on a set threshold; step S30, based on kernel density estimation, converting the KPI indexes of the call chain into node abnormal scores, and converting the response time and success rate indexes of the call chain into node edge abnormal scores; step S40, loading the node abnormal scores and the node edge abnormal scores onto a static topological graph of the IT system to obtain a fault propagation graph; and step S50, performing a random walk on the fault propagation graph using a random walk algorithm, and locating the faulty nodes and KPI indexes. The invention has the advantages that root cause positioning efficiency is greatly improved and the operation and maintenance cost of the IT system is greatly reduced.

Description

Root cause positioning method and system based on kernel density estimation calling chain
Technical Field
The invention relates to the technical field of computers, in particular to a root cause positioning method and system based on a kernel density estimation calling chain.
Background
With the development of information technology and the migration of many systems to the cloud, IT architecture has evolved from front-end/back-end separation into complex architectures such as distributed systems, microservices and domain-driven design (DDD). Today, a large-scale IT system often contains thousands of applications and is highly dynamic and complex. A single business service in such a system contains from several to thousands of instances, each running in a different container or on a different server, and keeping these instances available has become a key challenge for large-scale IT systems.
Under distributed, microservice and DDD architectures, a complete service request involves a plurality of service units. The business systems and service units call one another to form a call chain, and any exception on the call chain may propagate along it and ultimately cause the service request to fail, which is a problem commonly encountered in large-scale IT systems. Because a failed service request directly affects user experience and enterprise revenue, operation and maintenance engineers need to monitor the service-level KPIs (e.g., response time) and the host-level KPIs (e.g., CPU usage) on each host serving the request. When a service request fails, the operation and maintenance engineer must locate the failing machine (the root cause) as soon as possible and resolve the failure quickly.
For root cause positioning, the prior art relies on operation and maintenance engineers manually troubleshooting faults. However, because an IT system has complex calling relationships between services and a large number of indexes, it is difficult for engineers to quickly pinpoint the problem among the many services and indexes, so root cause positioning is inefficient.
Therefore, how to provide a root cause positioning method and system based on a kernel density estimation call chain that improves root cause positioning efficiency and reduces the operation and maintenance cost of an IT system has become a problem to be solved urgently.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a root cause positioning method and system based on a kernel density estimation call chain, so as to improve the root cause positioning efficiency and reduce the operation and maintenance cost of an IT system.
In a first aspect, the present invention provides a root cause positioning method based on a kernel density estimation call chain, comprising the following steps:
step S10, collecting the availability index and KPI indexes of each business service in the IT system, and the response time and success rate indexes between business services on each call chain;
step S20, setting a threshold, monitoring the availability index against the threshold, and judging whether the IT system has a fault;
step S30, based on kernel density estimation, converting the KPI indexes of the call chains associated with the business service into node abnormal scores, and converting the response time and success rate indexes of those call chains into node edge abnormal scores;
step S40, loading the node abnormal scores and the node edge abnormal scores onto a static topological graph of the IT system to obtain a fault propagation graph;
and step S50, performing a random walk on the fault propagation graph using a random walk algorithm, and locating the faulty nodes and KPI indexes to complete root cause positioning.
Further, in step S10, the KPI indexes at least include CPU utilization and memory utilization.
Further, the step S20 is specifically:
setting a threshold and judging, for each business service in turn, whether its availability index is greater than the threshold; if so, a fault exists and the method proceeds to step S30; if not, no fault exists and monitoring continues.
Further, the step S30 is specifically:
based on kernel density estimation, a KDE model is constructed separately from the historical data of each of the KPI indexes, the response time and the success rate index, yielding a probability density function for each; the KPI indexes, response time and success rate index observed on the call chains associated with the business service within the fault time window are input into the corresponding probability density functions to obtain probability densities, which are then converted into node abnormal scores and node edge abnormal scores through a logarithmic function.
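The density-to-score conversion described above can be written out as a short worked formula. This is a sketch under stated assumptions, not the patent's exact definition: the source only says that a KDE model yields a probability density function and that a logarithmic function produces the score. The Gaussian kernel K, the bandwidth h, and the negative sign on the logarithm (so that a less likely observation receives a higher score) are assumptions introduced here for illustration. Here x_1, ..., x_n are the historical samples of one index and x is a value observed in the fault time window.

```latex
\hat{f}_h(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right),
\qquad
K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}

s(x) = -\log \hat{f}_h(x)
```

Under these assumptions, a value that falls far from every historical sample has a density close to zero and therefore a large score s(x), while a value in the usual range has a small score.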
Further, the step S50 is specifically:
performing a random walk on the fault propagation graph using a random walk algorithm, at each step randomly visiting an adjacent node, recording the number of visits to each node, sorting the nodes in descending order of visit count, and thereby locating the faulty node and its corresponding KPI indexes to complete root cause positioning.
In a second aspect, the present invention provides a root cause positioning system based on a kernel density estimation call chain, comprising the following modules:
the data acquisition module is used for collecting the availability index and KPI indexes of each business service in the IT system, and the response time and success rate indexes between business services on each call chain;
the availability index monitoring module is used for setting a threshold, monitoring the availability index against the threshold, and judging whether the IT system has a fault;
the kernel density estimation module is used for converting, based on kernel density estimation, the KPI indexes of the call chains associated with the business service into node abnormal scores, and converting the response time and success rate indexes of those call chains into node edge abnormal scores;
the fault propagation graph building module is used for loading the node abnormal scores and the node edge abnormal scores onto a static topological graph of the IT system to obtain a fault propagation graph;
and the root cause positioning module is used for performing a random walk on the fault propagation graph using a random walk algorithm, and locating the faulty nodes and KPI indexes to complete root cause positioning.
Further, in the data acquisition module, the KPI indexes at least include CPU utilization and memory utilization.
Further, the availability index monitoring module specifically includes:
setting a threshold and judging, for each business service in turn, whether its availability index is greater than the threshold; if so, a fault exists and control passes to the kernel density estimation module; if not, no fault exists and monitoring continues.
Further, the kernel density estimation module specifically includes:
based on kernel density estimation, a KDE model is constructed separately from the historical data of each of the KPI indexes, the response time and the success rate index, yielding a probability density function for each; the KPI indexes, response time and success rate index observed on the call chains associated with the business service within the fault time window are input into the corresponding probability density functions to obtain probability densities, which are then converted into node abnormal scores and node edge abnormal scores through a logarithmic function.
Further, the root cause positioning module specifically comprises:
performing a random walk on the fault propagation graph using a random walk algorithm, at each step randomly visiting an adjacent node, recording the number of visits to each node, sorting the nodes in descending order of visit count, and thereby locating the faulty node and its corresponding KPI indexes to complete root cause positioning.
The invention has the advantages that:
the availability index, KPI indexes, response time and success rate indexes are collected, and a fault is detected when the availability index exceeds the set threshold; based on kernel density estimation, the KPI indexes of the call chains associated with the business service are converted into node abnormal scores, and the response time and success rate indexes are converted into node edge abnormal scores; these scores are loaded onto the static topological graph of the IT system to obtain a fault propagation graph; finally, a random walk over the fault propagation graph automatically locates the faulty nodes and KPI indexes, thereby improving root cause positioning efficiency and reducing the operation and maintenance cost of the IT system.
Drawings
The invention is further described below by way of embodiments, with reference to the accompanying drawings.
FIG. 1 is a flow chart of the root cause positioning method based on a kernel density estimation call chain according to the present invention.
FIG. 2 is a schematic structural diagram of the root cause positioning system based on a kernel density estimation call chain according to the present invention.
Detailed Description
The general idea of the technical scheme in the embodiments of the application is as follows: collect the availability index of each business service in the IT system, the response time and success rate indexes between business services on each call chain, and the KPI indexes (CPU utilization and memory utilization) of each business service; trigger root cause positioning when the availability index of a business service exceeds a set threshold; through kernel density estimation, convert the KPI indexes of the business services on all call chains associated with the faulty business service into node abnormal scores, and convert the response time and success rate indexes between the business services on those call chains into node edge abnormal scores; map the node abnormal scores and node edge abnormal scores onto the static topological graph to form a fault propagation graph; and perform a random walk on the fault propagation graph using a random walk algorithm to automatically locate the faulty node and the faulty KPI indexes, thereby improving root cause positioning efficiency and reducing the operation and maintenance cost of the IT system.
Referring to FIG. 1 and FIG. 2, a preferred embodiment of the root cause positioning method based on a kernel density estimation call chain of the present invention includes the following steps:
step S10, continuously collecting, at 1-minute intervals, the availability index and KPI indexes of each business service in the IT system and the response time and success rate indexes between business services on each call chain, and storing the collected data in Elasticsearch;
step S20, setting a threshold, monitoring the availability index against the threshold, and judging whether the IT system has a fault; root cause positioning is triggered when the availability index exceeds the threshold;
step S30, based on kernel density estimation (KDE), converting the KPI indexes of the call chains associated with the business service into node abnormal scores, and converting the response time and success rate indexes of those call chains into node edge abnormal scores;
kernel density estimation is used in probability theory to estimate an unknown density function and is a nonparametric method;
step S40, loading the node abnormal scores and the node edge abnormal scores onto a static topological graph of the IT system to obtain a fault propagation graph; the static topological graph is updated periodically by a collection program;
and step S50, performing a random walk on the fault propagation graph using a random walk algorithm, locating the faulty nodes and KPI indexes to complete automatic root cause positioning, storing and displaying the root cause positioning result, and automatically executing the corresponding fault repair operation.
In step S10, the KPI indexes at least include CPU utilization and memory utilization.
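Step S40 above can be illustrated with a minimal Python sketch that attaches scores to a call topology. The dictionary layout, the function name `build_fault_propagation_graph`, the service names and the score values are all hypothetical, chosen only for illustration; the source does not specify how the fault propagation graph is represented.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (caller, callee)


def build_fault_propagation_graph(
    topology: Dict[str, List[str]],
    node_scores: Dict[str, float],
    edge_scores: Dict[Edge, float],
) -> Dict[str, dict]:
    """Attach node and edge abnormal scores to a static call topology.

    Services without a score default to 0.0 (an assumption of this sketch).
    """
    graph: Dict[str, dict] = {}
    for caller, callees in topology.items():
        node = graph.setdefault(
            caller, {"score": node_scores.get(caller, 0.0), "calls": {}}
        )
        for callee in callees:
            # Edge score comes from response time / success rate of this call.
            node["calls"][callee] = edge_scores.get((caller, callee), 0.0)
            # Make sure services that only appear as callees are also nodes.
            graph.setdefault(
                callee, {"score": node_scores.get(callee, 0.0), "calls": {}}
            )
    return graph


# Hypothetical static topology: which service calls which.
topology = {"gateway": ["order"], "order": ["payment", "inventory"]}
node_scores = {"gateway": 1.2, "order": 2.0, "payment": 25.0, "inventory": 1.5}
edge_scores = {
    ("gateway", "order"): 3.1,
    ("order", "payment"): 18.4,
    ("order", "inventory"): 1.1,
}
graph = build_fault_propagation_graph(topology, node_scores, edge_scores)
print(graph["order"])
```

Keeping the topology separate from the scores mirrors the description: the static topological graph is refreshed on its own schedule, while the scores are computed only for the fault time window.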
The step S20 specifically includes:
setting a threshold and judging, for each business service in turn, whether its availability index is greater than the threshold; if so, a fault exists and the method proceeds to step S30; if not, no fault exists and monitoring continues.
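The threshold check in step S20 can be sketched in a few lines of Python. The function name `find_faulty_services`, the service names and the numeric values are hypothetical. Because the source treats "greater than the threshold" as the fault condition, this sketch assumes the availability index grows as the service degrades (for example, a failure ratio); the source does not define the index itself.

```python
from typing import Dict, List


def find_faulty_services(availability: Dict[str, float], threshold: float) -> List[str]:
    """Return the services whose availability index is greater than the threshold."""
    # Assumption: a larger index means a less healthy service.
    return [name for name, value in availability.items() if value > threshold]


# Hypothetical latest sample of the availability index per business service.
latest = {"order-service": 0.02, "payment-service": 0.31, "user-service": 0.00}
faulty = find_faulty_services(latest, threshold=0.10)

if faulty:
    print("fault detected, trigger root cause positioning for:", faulty)
else:
    print("no fault, continue monitoring")
```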
The step S30 specifically includes:
based on kernel density estimation, a KDE model is constructed separately from the historical data of each of the KPI indexes, the response time and the success rate index, yielding a probability density function for each; the KPI indexes, response time and success rate index observed on the call chains associated with the business service within the fault time window are input into the corresponding probability density functions to obtain probability densities, which are then converted into node abnormal scores and node edge abnormal scores through a logarithmic function.
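A minimal Python sketch of step S30 follows, using only the standard library. It is one possible reading of the step, not the patent's implementation: the Gaussian kernel, the fixed bandwidth, the negative logarithm, the averaging over the fault window and the small constant `eps` that guards against log(0) are all assumptions, and the function names and sample values are hypothetical.

```python
import math
from typing import Callable, Sequence


def gaussian_kde_pdf(history: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Build a probability density function from historical samples of one index."""
    n = len(history)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x: float) -> float:
        # Sum of Gaussian kernels centred on each historical sample.
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in history)

    return pdf


def abnormal_score(pdf: Callable[[float], float], window: Sequence[float], eps: float = 1e-12) -> float:
    """Average negative log density of the values seen in the fault time window."""
    return sum(-math.log(pdf(x) + eps) for x in window) / len(window)


# Hypothetical CPU utilization history (percent) for one business service.
cpu_history = [31.0, 35.0, 33.0, 30.0, 36.0, 34.0, 32.0, 35.0]
cpu_pdf = gaussian_kde_pdf(cpu_history, bandwidth=2.0)

normal_score = abnormal_score(cpu_pdf, [33.0, 34.0])  # values in the usual range
fault_score = abnormal_score(cpu_pdf, [92.0, 95.0])   # values far outside it
print(normal_score, fault_score)
```

The same two functions would be applied to the response time and success rate history of each call between services to obtain node edge abnormal scores. In practice the bandwidth would be chosen from the data rather than fixed.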
The step S50 specifically includes:
performing a random walk on the fault propagation graph using a random walk algorithm, at each step randomly visiting an adjacent node, recording the number of visits to each node, sorting the nodes in descending order of visit count, and thereby locating the faulty node and its corresponding KPI indexes to complete root cause positioning. The random walk algorithm calculates, for each node, the probabilities of a forward transition, a backward transition and a self-transition (remaining at the current node).
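The random walk in step S50 can be sketched as follows. The source states only that forward, backward and self-transition probabilities are calculated and that nodes are ranked by visit count; the specific weights used here (the callee's score for a forward step, a damped caller score for a backward step, the node's own score for staying) are assumptions, as are the function name `random_walk_rank`, the `backward_factor` parameter and the example data. Every service must appear as a key in `calls`, and all scores must be positive.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple


def random_walk_rank(
    calls: Dict[str, List[str]],
    node_scores: Dict[str, float],
    start: str,
    steps: int = 10_000,
    backward_factor: float = 0.1,
    seed: int = 0,
) -> List[Tuple[str, int]]:
    """Walk the graph and return (node, visit count) pairs in descending order."""
    rng = random.Random(seed)
    # Reverse index: who calls each node (used for backward transitions).
    callers: Dict[str, List[str]] = {node: [] for node in calls}
    for caller, callees in calls.items():
        for callee in callees:
            callers[callee].append(caller)

    visits: Counter = Counter()
    current = start
    for _ in range(steps):
        visits[current] += 1
        candidates: List[str] = []
        weights: List[float] = []
        for nxt in calls[current]:        # forward: towards a called service
            candidates.append(nxt)
            weights.append(node_scores[nxt])
        for prev in callers[current]:     # backward: back to a caller, damped
            candidates.append(prev)
            weights.append(backward_factor * node_scores[prev])
        candidates.append(current)        # self: stay on the current node
        weights.append(node_scores[current])
        current = rng.choices(candidates, weights=weights, k=1)[0]
    return visits.most_common()


# Hypothetical call topology and node abnormal scores.
calls = {"gateway": ["order"], "order": ["payment", "inventory"], "payment": [], "inventory": []}
scores = {"gateway": 1.0, "order": 2.0, "payment": 25.0, "inventory": 1.5}
ranking = random_walk_rank(calls, scores, start="gateway")
print(ranking[0][0])  # the most visited node is the root cause candidate
```

Because the walker is drawn towards high-score neighbours and tends to stay on a high-score node, the visit counts concentrate on the most abnormal service, which is what the descending sort then surfaces.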
A preferred embodiment of the root cause positioning system based on a kernel density estimation call chain of the present invention comprises the following modules:
the data acquisition module is used for continuously collecting, at 1-minute intervals, the availability index and KPI indexes of each business service in the IT system and the response time and success rate indexes between business services on each call chain, and storing the collected data in Elasticsearch;
the availability index monitoring module is used for setting a threshold, monitoring the availability index against the threshold, and judging whether the IT system has a fault; root cause positioning is triggered when the availability index exceeds the threshold;
the kernel density estimation module is used for converting, based on kernel density estimation (KDE), the KPI indexes of the call chains associated with the business service into node abnormal scores, and converting the response time and success rate indexes of those call chains into node edge abnormal scores;
kernel density estimation is used in probability theory to estimate an unknown density function and is a nonparametric method;
the fault propagation graph building module is used for loading the node abnormal scores and the node edge abnormal scores onto a static topological graph of the IT system to obtain a fault propagation graph; the static topological graph is updated periodically by a collection program;
and the root cause positioning module is used for performing a random walk on the fault propagation graph using a random walk algorithm, locating the faulty nodes and KPI indexes to complete automatic root cause positioning, storing and displaying the root cause positioning result, and automatically executing the corresponding fault repair operation.
In the data acquisition module, the KPI indexes at least include CPU utilization and memory utilization.
The availability index monitoring module specifically comprises:
setting a threshold and judging, for each business service in turn, whether its availability index is greater than the threshold; if so, a fault exists and control passes to the kernel density estimation module; if not, no fault exists and monitoring continues.
The kernel density estimation module specifically comprises:
based on kernel density estimation, a KDE model is constructed separately from the historical data of each of the KPI indexes, the response time and the success rate index, yielding a probability density function for each; the KPI indexes, response time and success rate index observed on the call chains associated with the business service within the fault time window are input into the corresponding probability density functions to obtain probability densities, which are then converted into node abnormal scores and node edge abnormal scores through a logarithmic function.
The root cause positioning module specifically comprises:
performing a random walk on the fault propagation graph using a random walk algorithm, at each step randomly visiting an adjacent node, recording the number of visits to each node, sorting the nodes in descending order of visit count, and thereby locating the faulty node and its corresponding KPI indexes to complete root cause positioning. The random walk algorithm calculates, for each node, the probabilities of a forward transition, a backward transition and a self-transition (remaining at the current node).
In summary, the invention has the advantages that:
the availability index, KPI indexes, response time and success rate indexes are collected, and a fault is detected when the availability index exceeds the set threshold; based on kernel density estimation, the KPI indexes of the call chains associated with the business service are converted into node abnormal scores, and the response time and success rate indexes are converted into node edge abnormal scores; these scores are loaded onto the static topological graph of the IT system to obtain a fault propagation graph; finally, a random walk over the fault propagation graph automatically locates the faulty nodes and KPI indexes, thereby improving root cause positioning efficiency and reducing the operation and maintenance cost of the IT system.
Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.

Claims (10)

1. A root cause positioning method based on a kernel density estimation call chain, characterized by comprising the following steps:
step S10, collecting the availability index and KPI indexes of each business service in the IT system, and the response time and success rate indexes between business services on each call chain;
step S20, setting a threshold, monitoring the availability index against the threshold, and judging whether the IT system has a fault;
step S30, based on kernel density estimation, converting the KPI indexes of the call chains associated with the business service into node abnormal scores, and converting the response time and success rate indexes of those call chains into node edge abnormal scores;
step S40, loading the node abnormal scores and the node edge abnormal scores onto a static topological graph of the IT system to obtain a fault propagation graph;
and step S50, performing a random walk on the fault propagation graph using a random walk algorithm, and locating the faulty nodes and KPI indexes to complete root cause positioning.
2. The root cause positioning method based on a kernel density estimation call chain according to claim 1, characterized in that: in step S10, the KPI indexes at least include CPU utilization and memory utilization.
3. The root cause positioning method based on a kernel density estimation call chain according to claim 1, characterized in that: the step S20 specifically includes:
setting a threshold and judging, for each business service in turn, whether its availability index is greater than the threshold; if so, a fault exists and the method proceeds to step S30; if not, no fault exists and monitoring continues.
4. The root cause positioning method based on a kernel density estimation call chain according to claim 1, characterized in that: the step S30 specifically includes:
based on kernel density estimation, a KDE model is constructed separately from the historical data of each of the KPI indexes, the response time and the success rate index, yielding a probability density function for each; the KPI indexes, response time and success rate index observed on the call chains associated with the business service within the fault time window are input into the corresponding probability density functions to obtain probability densities, which are then converted into node abnormal scores and node edge abnormal scores through a logarithmic function.
5. The root cause positioning method based on a kernel density estimation call chain according to claim 1, characterized in that: the step S50 specifically includes:
performing a random walk on the fault propagation graph using a random walk algorithm, at each step randomly visiting an adjacent node, recording the number of visits to each node, sorting the nodes in descending order of visit count, and thereby locating the faulty node and its corresponding KPI indexes to complete root cause positioning.
6. A root cause positioning system based on a kernel density estimation call chain, characterized by comprising the following modules:
the data acquisition module is used for collecting the availability index and KPI indexes of each business service in the IT system, and the response time and success rate indexes between business services on each call chain;
the availability index monitoring module is used for setting a threshold, monitoring the availability index against the threshold, and judging whether the IT system has a fault;
the kernel density estimation module is used for converting, based on kernel density estimation, the KPI indexes of the call chains associated with the business service into node abnormal scores, and converting the response time and success rate indexes of those call chains into node edge abnormal scores;
the fault propagation graph building module is used for loading the node abnormal scores and the node edge abnormal scores onto a static topological graph of the IT system to obtain a fault propagation graph;
and the root cause positioning module is used for performing a random walk on the fault propagation graph using a random walk algorithm, and locating the faulty nodes and KPI indexes to complete root cause positioning.
7. The root cause positioning system based on a kernel density estimation call chain according to claim 6, characterized in that: in the data acquisition module, the KPI indexes at least include CPU utilization and memory utilization.
8. The root cause positioning system based on a kernel density estimation call chain according to claim 6, characterized in that: the availability index monitoring module specifically comprises:
setting a threshold and judging, for each business service in turn, whether its availability index is greater than the threshold; if so, a fault exists and control passes to the kernel density estimation module; if not, no fault exists and monitoring continues.
9. The root cause positioning system based on a kernel density estimation call chain according to claim 6, characterized in that: the kernel density estimation module specifically comprises:
based on kernel density estimation, a KDE model is constructed separately from the historical data of each of the KPI indexes, the response time and the success rate index, yielding a probability density function for each; the KPI indexes, response time and success rate index observed on the call chains associated with the business service within the fault time window are input into the corresponding probability density functions to obtain probability densities, which are then converted into node abnormal scores and node edge abnormal scores through a logarithmic function.
10. The root cause positioning system based on a kernel density estimation call chain according to claim 6, characterized in that: the root cause positioning module specifically comprises:
performing a random walk on the fault propagation graph using a random walk algorithm, at each step randomly visiting an adjacent node, recording the number of visits to each node, sorting the nodes in descending order of visit count, and thereby locating the faulty node and its corresponding KPI indexes to complete root cause positioning.
CN202110799721.3A 2021-07-15 2021-07-15 Root cause positioning method and system based on kernel density estimation calling chain Pending CN113657715A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110799721.3A CN113657715A (en) 2021-07-15 2021-07-15 Root cause positioning method and system based on kernel density estimation calling chain

Publications (1)

Publication Number Publication Date
CN113657715A true CN113657715A (en) 2021-11-16

Family

ID=78489382


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114024837A (en) * 2022-01-06 2022-02-08 杭州大乘智能科技有限公司 Fault root cause positioning method of micro-service system
CN114598539A (en) * 2022-03-16 2022-06-07 京东科技信息技术有限公司 Root cause positioning method and device, storage medium and electronic equipment
CN114710397A (en) * 2022-04-24 2022-07-05 中国工商银行股份有限公司 Method, device, electronic equipment and medium for positioning fault root cause of service link
CN115296978A (en) * 2022-07-06 2022-11-04 北京三快在线科技有限公司 Root cause positioning method, device and equipment
CN115333921A (en) * 2022-08-20 2022-11-11 海南大学 Micro-service abnormal root cause positioning method and device
CN115941545A (en) * 2022-10-14 2023-04-07 华能信息技术有限公司 Log management method and platform based on micro-service
CN117370064A (en) * 2023-10-31 2024-01-09 河北东软软件有限公司 Micro-service system based on container technology

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101741626B1 (en) * 2016-09-08 2017-05-31 주식회사 모비즈 Service server, apparatus, and its method for changing structure of group using contributiveness
CN111597070A (en) * 2020-07-27 2020-08-28 北京必示科技有限公司 Fault positioning method and device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
周春蕾 et al. *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114024837B (en) * 2022-01-06 2022-04-05 杭州乘云数字技术有限公司 Fault root cause positioning method of micro-service system
CN114024837A (en) * 2022-01-06 2022-02-08 杭州大乘智能科技有限公司 Fault root cause positioning method of micro-service system
CN114598539A (en) * 2022-03-16 2022-06-07 京东科技信息技术有限公司 Root cause positioning method and device, storage medium and electronic equipment
CN114598539B (en) * 2022-03-16 2024-03-01 京东科技信息技术有限公司 Root cause positioning method and device, storage medium and electronic equipment
CN114710397B (en) * 2022-04-24 2024-02-06 中国工商银行股份有限公司 Service link fault root cause positioning method and device, electronic equipment and medium
CN114710397A (en) * 2022-04-24 2022-07-05 中国工商银行股份有限公司 Method, device, electronic equipment and medium for positioning fault root cause of service link
CN115296978A (en) * 2022-07-06 2022-11-04 北京三快在线科技有限公司 Root cause positioning method, device and equipment
CN115296978B (en) * 2022-07-06 2023-09-12 北京三快在线科技有限公司 Root cause positioning method, root cause positioning device and root cause positioning equipment
CN115333921A (en) * 2022-08-20 2022-11-11 海南大学 Micro-service abnormal root cause positioning method and device
CN115333921B (en) * 2022-08-20 2024-03-29 海南大学 Micro-service abnormal root cause positioning method and device
CN115941545A (en) * 2022-10-14 2023-04-07 华能信息技术有限公司 Log management method and platform based on micro-service
CN115941545B (en) * 2022-10-14 2023-06-23 华能信息技术有限公司 Log management method and platform based on micro-service
CN117370064A (en) * 2023-10-31 2024-01-09 河北东软软件有限公司 Micro-service system based on container technology
CN117370064B (en) * 2023-10-31 2024-05-28 河北东软软件有限公司 Micro-service system based on container technology

Similar Documents

Publication Publication Date Title
CN113657715A (en) Root cause positioning method and system based on kernel density estimation calling chain
US11269718B1 (en) Root cause detection and corrective action diagnosis system
CN109933452B (en) Micro-service intelligent monitoring method facing abnormal propagation
US9424157B2 (en) Early detection of failing computers
US9672085B2 (en) Adaptive fault diagnosis
CN110516971B (en) Anomaly detection method, device, medium and computing equipment
Gainaru et al. Fault prediction under the microscope: A closer look into HPC systems
US9612892B2 (en) Creating a correlation rule defining a relationship between event types
US9632861B1 (en) Computer-implemented method, system, and storage medium
US10489232B1 (en) Data center diagnostic information
US20160055044A1 (en) Fault analysis method, fault analysis system, and storage medium
CN110096437A (en) The test method and Related product of micro services framework
WO2021236278A1 (en) Automatic tuning of incident noise
CN115373888A (en) Fault positioning method and device, electronic equipment and storage medium
CN116049146A (en) Database fault processing method, device, equipment and storage medium
CN110659147B (en) Self-repairing method and system based on module self-checking behavior
CN112506802B (en) Test data management method and system
Naksinehaboon et al. Benefits of software rejuvenation on HPC systems
Meng et al. Driftinsight: detecting anomalous behaviors in large-scale cloud platform
CN114760190A (en) Service-oriented converged network performance anomaly detection method
US20190324832A1 (en) Metric for the assessment of distributed high-availability architectures using survivability modeling
CN112162528A (en) Fault diagnosis method, device, equipment and storage medium of numerical control machine tool
US11929867B1 (en) Degradation engine execution triggering alerts for outages
US20240179044A1 (en) Monitoring service health statuses to raise alerts
CN113535528B (en) Log management system, method and medium for distributed graph iterative computation job

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20211116)