CN109800052A - Abnormality detection and localization method and device applied to distributed container cloud platform - Google Patents
- Publication number: CN109800052A
- Application number: CN201811537333.2A
- Authority: CN (China)
- Prior art keywords: component, tcp, information, subgraph, exception
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Debugging And Monitoring (AREA)
Abstract
The present invention relates to the field of container cloud platforms, and in particular to an abnormality detection and localization method and device applied to a distributed container cloud platform. The method and device first obtain the TCP delay information of each container component; analyze the TCP delay information of each container component with a sliding-window cumulative-sum anomaly detection algorithm to obtain the state information of each component and generate component state information key-value pairs; construct a component abnormality subgraph from the component state information key-value pairs; and locate the container component node where the abnormality occurred according to the component abnormality subgraph. By using TCP delay information to judge the abnormal state, the method and device reduce the overhead of data collection and improve the accuracy and timeliness of abnormal-state judgment. In addition, taking into account the interference among components and between physical machines and components, the invention proposes the component abnormality subgraph to represent the propagation of the abnormal state, which improves the accuracy of abnormality localization.
Description
Technical field
The present invention relates to the field of container cloud platforms, and in particular to an abnormality detection and localization method and device applied to a distributed container cloud platform.
Background
Cloud computing, as a new mode of service delivery, has won favor in both industry and academia. Its key technology is virtualization: by virtualizing all kinds of resources, cloud service providers can easily customize those resources and deliver them to users, and many applications have gradually begun to migrate to cloud computing clusters. Traditional virtualization technologies include KVM, Xen and others. However, traditional virtualization is heavyweight, so creating, modifying or migrating a single component of an application cluster is complicated, and cloud service providers therefore need a more lightweight virtualization technology. Container technology is a lightweight, operating-system-level virtualization technology. Whereas traditional virtualization virtualizes the hardware layer, container virtualization stays at the operating-system layer, so containers are easy to create, modify and migrate, and container technology has been adopted quickly by cloud service providers. Because of these features, users deploying an application often run each component in its own container to make the application easier to maintain, which makes the internal structure of a container cloud complex. The weaker isolation of containers also makes interference between containers more serious. Once one container becomes abnormal, the abnormality propagates rapidly and affects other application components. Cloud service providers therefore need a method that can localize abnormalities in complex application clusters built from containers.
Typically, an application deployed on a container cloud consists of hundreds of components that depend on one another, forming a complex graph with the components as nodes. Graph theory can be used to locate the root cause of an abnormality in such a graph. A cloud computing platform based on container technology usually consists of thousands of physical machines, each of which typically runs dozens of containers, so it is more complex than a traditional cloud computing platform. Compared with traditional virtual machines, containers are less well isolated and interfere with one another more severely, so they also affect one another more easily. Moreover, because containers are deployed on the operating system of the physical machine, an abnormality of a physical machine can also cause the containers deployed on it to become abnormal. Existing abnormality detection and localization schemes do not analyze the correlation between components or between components and physical machines. They also rely on performance metric data for detection and localization, which incurs large storage and transmission overhead, so they are poorly suited to a distributed container cloud platform with severe interference.
Nguyen et al., in Section 3 of "Insight: in-situ online service failure path inference in production computing infrastructures", propose an online black-box abnormality localization system for locating abnormal components. The system builds a model of the normal fluctuation of virtual machine performance metrics, identifies data points that change abnormally, and locates the abnormal component by combining the timing of the changed data points with the dependencies between components. Although the system can detect and localize abnormalities, it relies on performance metrics for detection and judgment, and for a complex distributed container cloud platform the overhead of monitoring performance metrics would be very large.
Summary of the invention
Embodiments of the present invention provide an abnormality detection and localization method and device applied to a distributed container cloud platform, to solve at least the technical problem that traditional abnormality detection methods based on a single component are not suitable for a distributed container cloud.
According to one embodiment of the present invention, an abnormality detection and localization method applied to a distributed container cloud platform is provided, comprising the following steps:
Obtaining the TCP delay information of each container component;
Analyzing the TCP delay information of each container component with a sliding-window cumulative-sum anomaly detection algorithm to obtain the state information of each component and generate component state information key-value pairs;
Constructing a component abnormality subgraph from the component state information key-value pairs;
Locating the container component node where the abnormality occurred according to the component abnormality subgraph.
Further, analyzing the TCP delay information of each container component with the sliding-window cumulative-sum anomaly detection algorithm to obtain the state information of each component and generate component state information key-value pairs comprises:
Initializing the sliding window $[L_0, L_k]$ of the component and inputting TCP delay information until the number of TCP delay samples in the sliding window reaches $k$; initializing the mean $\bar{L}_k$ (initial value not reproduced in this text) and the cumulative sum $S_k = 0$; where $[L_0, L_k]$ is a queue storing TCP delay information from 0 to $k$, and $k$ is an integer with $0 < k < 60$;
Inputting further TCP delay information $L_t$, inserting $L_t$ into the sliding window and deleting the earliest TCP delay information $L_{t-k}$ from the window; calculating the mean $\bar{L}_t$ within the window and the cumulative sum $S_t$ (the formulas are not reproduced in this text); where $L_t$ is the TCP delay information at time $t$, and $t$ is an integer with $t > k$;
Calculating the warning value $S_{diff} = S_{max} - S_{min}$, where $S_{max}, S_{min} \in [S_{t-k}, S_t]$ and $S_{t-k}$ is the cumulative sum at the time of the earliest TCP delay information;
Judging whether $S_{diff}$ lies within the normal threshold range $[-h, h]$; if so, the state Status of the component is judged to be normal, otherwise it is judged to be abnormal;
Generating the component state information key-value pair <CID:MID:Status> from the state information of each component, where CID is the number of the component, MID is the number of the physical machine on which the component is located, and Status is the state of the component, taking the value 1 when the component is abnormal and 0 when it is normal.
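The detection steps above can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: the formulas for the window mean and the cumulative sum are missing from this text, so the code assumes the window mean is the arithmetic mean of the last $k$ samples and that the cumulative sum follows the standard CUSUM update $S_t = S_{t-1} + (L_t - \bar{L}_t)$. The class name `SlidingWindowCusum`, the method `update` and the default threshold `h=50.0` are hypothetical.

```python
from collections import deque


class SlidingWindowCusum:
    """Sliding-window cumulative-sum detector for one component's TCP delays."""

    def __init__(self, k=10, h=50.0):
        if not 0 < k < 60:
            raise ValueError("k must satisfy 0 < k < 60")
        self.k = k                               # window size
        self.h = h                               # normal threshold for S_diff
        self.delays = deque(maxlen=k)            # the k most recent delay samples
        self.sums = deque([0.0], maxlen=k + 1)   # S_{t-k} .. S_t, starting at S_k = 0

    def update(self, delay):
        """Feed one sample; return 1 (abnormal), 0 (normal) or None while filling the window."""
        warming_up = len(self.delays) < self.k
        self.delays.append(delay)                # a full deque drops its oldest sample
        if warming_up:
            return None                          # cumulative sum stays at its initial 0
        mean = sum(self.delays) / self.k
        # ASSUMPTION: standard CUSUM update; the source formula is not legible.
        self.sums.append(self.sums[-1] + (delay - mean))
        s_diff = max(self.sums) - min(self.sums)
        # S_diff is never negative, so the range check [-h, h] reduces to S_diff <= h.
        return 0 if s_diff <= self.h else 1


detector = SlidingWindowCusum(k=10, h=50.0)
for sample in [10.0] * 11 + [1000.0]:
    print(sample, detector.update(sample))
```

The first $k$ samples only fill the window; a steady delay then keeps the warning value at 0, while a sudden spike pushes it above `h` and flips Status to 1.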
Further, constructing the component abnormality subgraph from the component state information key-value pairs comprises:
Inputting the component dependency graph $G$, whose matrix representation is $G = (E_{ij})$, where $i$ and $j$ denote components in the application cluster and $E_{ij}$ denotes the dependency between component $i$ and component $j$: $E_{ij} = 1$ if component $i$ depends on component $j$, and $E_{ij} = 0$ otherwise;
Traversing the component state information key-value pairs and, whenever Status is 0, deleting from the component dependency graph $G$ the row and column with $i = \mathrm{CID}$ or $j = \mathrm{CID}$; when the traversal finishes, the component dependency subgraph $G_1$ is obtained;
Determining whether the component dependency subgraph $G_1$ contains independent component nodes, an independent component node being one that depends on no other component node and on which no other component node depends; deleting such nodes yields the component abnormality subgraph $G'$.
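The two pruning steps above can be illustrated with a short Python sketch. It assumes that components are numbered 0..n-1, that the dependency matrix is a list of lists, and that a component does not depend on itself; the function name `build_abnormal_subgraph` and the sample data are hypothetical.

```python
def build_abnormal_subgraph(E, status):
    """Return (kept, sub): the component indices kept in G' and E restricted to them.

    E[i][j] == 1 means component i depends on component j.
    status[i] is 1 for an abnormal component and 0 for a normal one.
    """
    # Step 1: delete the rows and columns of normal components, giving G1.
    abnormal = [i for i in range(len(E)) if status[i] == 1]
    # Step 2: delete independent nodes, i.e. nodes with no edge to or from
    # any other node of G1 (ASSUMPTION: self-dependencies are ignored).
    kept = [
        i for i in abnormal
        if any(E[i][j] for j in abnormal if j != i)
        or any(E[j][i] for j in abnormal if j != i)
    ]
    sub = [[E[i][j] for j in kept] for i in kept]
    return kept, sub


# Components 0 -> 1 -> 2 form a dependency chain, 3 is isolated, 4 depends on 2.
E = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
status = [1, 1, 1, 1, 0]   # component 4 is normal, the others are abnormal
kept, sub = build_abnormal_subgraph(E, status)
print(kept)
print(sub)
```

Deleting independent nodes cannot make any other node independent, because the deleted nodes have no edges, so a single pass is enough.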
Further, locating the container component node where the abnormality occurred according to the component abnormality subgraph comprises:
Traversing the component abnormality subgraph $G'$ and calculating $\delta_i = \sum_{j \in G'} E_{ij}$; if $\delta_i = 0$, component node $i$ is an abnormal root node.
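As a sketch of this rule, the row sum of each node in $G'$ gives $\delta_i$, and a node whose row sum is zero depends on no other abnormal component, so it is reported as a root. The function name `find_abnormal_roots` and the sample subgraph are hypothetical, and the matrix is assumed to be a list of lists aligned with a list of component ids.

```python
def find_abnormal_roots(ids, sub):
    """Return the ids whose out-degree delta_i inside G' is zero.

    ids -- component ids of the nodes of G', in matrix order
    sub -- adjacency matrix of G'; sub[a][b] == 1 if ids[a] depends on ids[b]
    """
    return [cid for cid, row in zip(ids, sub) if sum(row) == 0]


# G' for a chain: component 0 depends on 1, and 1 depends on 2.
ids = [0, 1, 2]
sub = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
roots = find_abnormal_roots(ids, sub)
print(roots)
```

In this chain only component 2 has $\delta_i = 0$: the abnormality of components 0 and 1 can be explained by the component each of them depends on.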
Further, after locating the container component node where the abnormality occurred according to the component abnormality subgraph, the method further comprises:
Judging whether the MIDs of the abnormal root nodes are identical; if they are, the physical machine numbered MID is judged to be abnormal.
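The physical machine check can be sketched as a comparison of the MIDs of the abnormal root nodes. The function name `locate_faulty_machine` and the sample mapping are hypothetical. Note that the rule is applied literally here, so a single root node trivially satisfies "identical MIDs"; a deployment might prefer to require more than one root before blaming the machine.

```python
def locate_faulty_machine(root_ids, machine_of):
    """Return the shared MID if all abnormal root nodes sit on one machine, else None.

    root_ids   -- CIDs of the abnormal root nodes
    machine_of -- mapping CID -> MID, taken from the <CID:MID:Status> pairs
    """
    mids = {machine_of[cid] for cid in root_ids}
    # Exactly one distinct MID means every root runs on the same physical machine.
    return next(iter(mids)) if len(mids) == 1 else None


machine_of = {"c1": "m1", "c2": "m1", "c3": "m2"}
print(locate_faulty_machine(["c1", "c2"], machine_of))  # roots share machine m1
print(locate_faulty_machine(["c1", "c3"], machine_of))  # roots on different machines
```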
Further, obtaining the TCP delay information of each container component comprises:
Collecting the TCP delay information of each component with the software tool tcprstat.
According to another embodiment of the present invention, an abnormality detection and localization device applied to a distributed container cloud platform is provided, comprising:
A delay information acquisition unit, configured to obtain the TCP delay information of each container component;
A state information acquisition unit, configured to analyze the TCP delay information of each container component with a sliding-window cumulative-sum anomaly detection algorithm, obtain the state information of each component and generate component state information key-value pairs;
A component abnormality subgraph construction unit, configured to construct a component abnormality subgraph from the component state information key-value pairs;
An abnormality localization unit, configured to locate the container component node where the abnormality occurred according to the component abnormality subgraph.
Further, the device also comprises:
An abnormality judgment unit, configured to judge whether the MIDs of the abnormal root nodes are identical and, if they are, to judge that the physical machine numbered MID is abnormal.
A storage medium storing a program file capable of implementing any one of the above abnormality detection and localization methods applied to a distributed container cloud platform.
A processor configured to run a program, wherein the program, when run, executes any one of the above abnormality detection and localization methods applied to a distributed container cloud platform.
The abnormality detection and localization method and device applied to a distributed container cloud platform in the embodiments of the present invention use TCP delay information to judge the abnormal state, which reduces the overhead of data collection and improves the accuracy and timeliness of abnormal-state judgment. In addition, taking into account the interference among components and between physical machines and components, the component abnormality subgraph is proposed to represent the propagation of the abnormal state, which improves the accuracy of abnormality localization.
Brief description of the drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flow chart of the abnormality detection and localization method applied to a distributed container cloud platform according to the present invention;
Fig. 2 is a preferred flow chart of the abnormality detection and localization method applied to a distributed container cloud platform according to the present invention;
Fig. 3 is a module diagram of the abnormality detection and localization device applied to a distributed container cloud platform according to the present invention;
Fig. 4 is a preferred module diagram of the abnormality detection and localization device applied to a distributed container cloud platform according to the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence. Data used in this way are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "comprise" and "have" and any variations of them are intended to cover non-exclusive inclusion: a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
As container technology has matured, cloud computing systems based on container technology, that is, container clouds, have gradually begun to replace traditional cloud computing systems based on virtual machines. Because containers are lightweight, they are easier to deploy, so the internal composition of a container cloud is more complex than that of a traditional cloud computing platform. Furthermore, containers isolate system resources less strongly than virtual machines do, and multiple containers run on the same physical host, so interference between containers is comparatively strong. As a result, once a container in the container cloud becomes abnormal, the abnormality propagates rapidly and affects the whole cluster. Because of the complex internal environment of a container cloud, traditional abnormality detection methods based on a single component are no longer suitable for a distributed container cloud environment. The prior art analyzes abnormalities using performance metrics, which increases the overhead of data collection, and it also needs to build a normal fluctuation model, which is inaccurate and lacks timeliness on container cloud platforms whose metrics fluctuate frequently and in complex ways.
For container cloud platforms, the present invention provides an abnormality detection and localization method and device applied to a distributed container cloud platform. The method and device can detect and localize abnormalities on more complex distributed container cloud platforms, and the component abnormality subgraph improves the accuracy of abnormality localization.
Embodiment 1
According to one embodiment of the present invention, an abnormality detection and localization method applied to a distributed container cloud platform is provided. Referring to Fig. 1, it comprises the following steps:
S101: obtaining the TCP delay information of each container component;
S102: analyzing the TCP delay information of each container component with a sliding-window cumulative-sum anomaly detection algorithm to obtain the state information of each component and generate component state information key-value pairs;
S103: constructing a component abnormality subgraph from the component state information key-value pairs;
S104: locating the container component node where the abnormality occurred according to the component abnormality subgraph.
The method uses TCP delay information to judge the abnormal state, which reduces the overhead of data collection and improves the accuracy and timeliness of abnormal-state judgment. In addition, taking into account the interference among components and between physical machines and components, the component abnormality subgraph is proposed to represent the propagation of the abnormal state, which improves the accuracy of abnormality localization.
As a preferred technical solution, analyzing the TCP delay information of each container component with the sliding-window cumulative-sum anomaly detection algorithm to obtain the state information of each component and generate component state information key-value pairs comprises:
Initializing the sliding window $[L_0, L_k]$ of the component and inputting TCP (Transmission Control Protocol) delay information until the number of TCP delay samples in the sliding window reaches $k$; initializing the mean $\bar{L}_k$ (initial value not reproduced in this text) and the cumulative sum $S_k = 0$; these are initial values, with $S_k = S_{k-1} = \dots = S_0 = 0$ and $L_k = L_{k-1} = \dots = L_0 = 0$; where $[L_0, L_k]$ is a queue storing TCP delay information from 0 to $k$, the size of the queue is $k$, and $k$ is an input parameter, an integer with $0 < k < 60$ that is usually set to 10;
Inputting further TCP delay information $L_t$, inserting $L_t$ into the sliding window and deleting the earliest TCP delay information $L_{t-k}$ from the window; calculating the mean $\bar{L}_t$ within the window and the cumulative sum $S_t$ (the formulas are not reproduced in this text); this calculation is iterative, and when $t = k + 1$, $S_{t-1} = S_k = 0$; where $L_t$ is the TCP delay information at time $t$, and $t$ is an integer with $t > k$;
Calculating the warning value $S_{diff} = S_{max} - S_{min}$, where $S_{max}, S_{min} \in [S_{t-k}, S_t]$ and $S_{t-k}$ is the cumulative sum at the time of the earliest TCP delay information;
Judging whether $S_{diff}$ lies within the normal threshold range $[-h, h]$; if so, the state Status of the component is judged to be normal, otherwise it is judged to be abnormal; $h$ defines the acceptable range of $S_{diff}$ and is one of the input parameters.
Generating the component state information key-value pair <CID:MID:Status> from the state information of each component, where CID is the number of the component, MID is the number of the physical machine on which the component is located, and Status is the state of the component, taking the value 1 when the component is abnormal and 0 when it is normal.
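The key-value pair <CID:MID:Status> is the only data a component's monitor has to hand on, so it helps to see how small it is. The sketch below shows one possible way to encode and decode it. The text does not specify a wire format beyond the three colon-separated fields, so the class name `ComponentState`, its methods and the sample identifiers are hypothetical, and the code assumes that CID and MID never contain a colon.

```python
from typing import NamedTuple


class ComponentState(NamedTuple):
    cid: str      # number of the component
    mid: str      # number of the physical machine hosting the component
    status: int   # 1 = abnormal, 0 = normal

    def encode(self) -> str:
        """Render the pair in the <CID:MID:Status> form used in the text."""
        return f"<{self.cid}:{self.mid}:{self.status}>"

    @classmethod
    def decode(cls, text: str) -> "ComponentState":
        """Parse a <CID:MID:Status> string back into its three fields."""
        # ASSUMPTION: CID and MID never contain ':' themselves.
        cid, mid, status = text.strip().strip("<>").split(":")
        if status not in ("0", "1"):
            raise ValueError(f"Status must be 0 or 1, got {status!r}")
        return cls(cid, mid, int(status))


pair = ComponentState("c7", "m2", 1)
print(pair.encode())
print(ComponentState.decode(pair.encode()))
```

Because only this short record leaves the monitored host, the data transmitted per component is far smaller than a stream of raw performance metrics, which is the overhead argument the text makes.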
As a preferred technical solution, constructing the component abnormality subgraph from the component state information key-value pairs comprises:
Inputting the component dependency graph $G$, whose matrix representation is $G = (E_{ij})$, where $i$ and $j$ denote components in the application cluster and $E_{ij}$ denotes the dependency between component $i$ and component $j$: $E_{ij} = 1$ if component $i$ depends on component $j$, and $E_{ij} = 0$ otherwise;
Traversing the component state information key-value pairs and, whenever Status is 0, deleting from the component dependency graph $G$ the row and column with $i = \mathrm{CID}$ or $j = \mathrm{CID}$; when the traversal finishes, the component dependency subgraph $G_1$ is obtained;
Determining whether the component dependency subgraph $G_1$ contains independent component nodes, an independent component node being one that depends on no other component node and on which no other component node depends; deleting such nodes yields the component abnormality subgraph $G'$.
As a preferred technical solution, locating the container component node where the abnormality occurred according to the component abnormality subgraph comprises:
Traversing the component abnormality subgraph $G'$ and calculating $\delta_i = \sum_{j \in G'} E_{ij}$; if $\delta_i = 0$, component node $i$ is an abnormal root node.
As a preferred technical solution, referring to Fig. 2, after locating the container component node where the abnormality occurred according to the component abnormality subgraph, the method further comprises:
S105: judging whether the MIDs of the abnormal root nodes are identical; if they are, the physical machine numbered MID is judged to be abnormal.
As a preferred technical solution, obtaining the TCP delay information of each container component comprises:
Collecting the TCP delay information of each component with the software tool tcprstat.
The method is described in detail below with a specific example. The abnormality detection and localization method applied to a distributed container cloud platform according to the present invention comprises the following steps:
The service manager submits an abnormality localization request to the service agents;
After receiving the abnormality localization request, a service agent collects the TCP delay information of the components with the software tool tcprstat. tcprstat is a free, open-source TCP-layer analysis tool that computes statistics on request response times; it can be used for ad hoc analysis or run as a scheduled task to collect information;
The service agent analyzes the collected TCP delay information of the components with the sliding-window cumulative-sum anomaly detection algorithm, obtains the state information Status of each component and generates the component state information key-value pair <CID:MID:Status>;
The service agent submits the component state information key-value pairs <CID:MID:Status> to the service manager;
After collecting all component state information key-value pairs, the service manager constructs the component abnormality subgraph $G'$;
The service manager traverses the component abnormality subgraph $G'$ and calculates $\delta_i = \sum_{j \in G'} E_{ij}$; if $\delta_i = 0$, component node $i$ is an abnormal root node;
Whether the MIDs of the abnormal root nodes are identical is then judged; if they are, the physical machine numbered MID is abnormal.
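The service manager's part of this flow can be sketched end to end. The sketch below is illustrative only: it represents the dependency graph as adjacency sets keyed by CID rather than as a matrix, it takes the agents' reports as an already collected list of (CID, MID, Status) tuples, and it leaves out the request/response exchange with the agents. The function name `locate`, its parameters and the sample component names are hypothetical.

```python
def locate(deps, reports):
    """Return (roots, machine) from the dependency graph and the agents' reports.

    deps    -- {cid: set of cids that this component depends on}
    reports -- list of (cid, mid, status) tuples, status 1 = abnormal
    """
    machine_of = {cid: mid for cid, mid, _ in reports}
    abnormal = {cid for cid, _, status in reports if status == 1}
    # G1: keep only dependencies between abnormal components.
    out_edges = {c: (deps.get(c, set()) & abnormal) - {c} for c in abnormal}
    has_incoming = set().union(*out_edges.values())
    # G': drop independent nodes (no dependency in either direction).
    nodes = {c for c in abnormal if out_edges[c] or c in has_incoming}
    # Abnormal roots: nodes of G' that depend on no other abnormal component.
    roots = sorted(c for c in nodes if not out_edges[c])
    mids = {machine_of[c] for c in roots}
    machine = next(iter(mids)) if len(mids) == 1 else None
    return roots, machine


deps = {"web": {"api"}, "api": {"db", "cache"}, "db": set(), "cache": set(), "batch": set()}
reports = [
    ("web", "m1", 1),
    ("api", "m1", 1),
    ("db", "m2", 1),
    ("cache", "m2", 0),
    ("batch", "m3", 1),
]
print(locate(deps, reports))
```

In this example `batch` is abnormal but isolated, so it is removed from $G'$; `db` is the only abnormal root node, and machine `m2` is reported because every root runs on it.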
Embodiment 2
According to another embodiment of the present invention, an abnormality detection and localization device applied to a distributed container cloud platform is provided. Referring to Fig. 3, it comprises:
A delay information acquisition unit 201, configured to obtain the TCP delay information of each container component;
A state information acquisition unit 202, configured to analyze the TCP delay information of each container component with a sliding-window cumulative-sum anomaly detection algorithm, obtain the state information of each component and generate component state information key-value pairs;
A component abnormality subgraph construction unit 203, configured to construct a component abnormality subgraph from the component state information key-value pairs;
An abnormality localization unit 204, configured to locate the container component node where the abnormality occurred according to the component abnormality subgraph.
The abnormality detection and localization device for a distributed container cloud platform according to the present invention uses TCP delay information to judge the abnormal state, which reduces the overhead of data collection and improves the accuracy and timeliness of abnormal-state judgment. In addition, taking into account the interference among components and between physical machines and components, the component abnormality subgraph is proposed to represent the propagation of the abnormal state, which improves the accuracy of abnormality localization.
As a preferred technical solution, referring to Fig. 4, the device further comprises:
An abnormality judgment unit 205, configured to judge whether the MIDs of the abnormal root nodes are identical and, if they are, to judge that the physical machine numbered MID is abnormal.
Embodiment 3
A storage medium storing a program file capable of implementing any one of the above abnormality detection and localization methods applied to a distributed container cloud platform.
Embodiment 4
A processor configured to run a program, wherein the program, when run, executes any one of the above abnormality detection and localization methods applied to a distributed container cloud platform.
The serial numbers of the above embodiments are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The system embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between units or modules through some interfaces, and may be electrical or take other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods of the embodiments of the invention. The storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art may make various improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also be regarded as falling within the scope of protection of the invention.
Claims (10)
1. An abnormality detection and localization method applied to a distributed container cloud platform, characterized by comprising the following steps:
Obtaining the TCP delay information of each container component;
Analyzing the TCP delay information of each container component with a sliding-window cumulative-sum anomaly detection algorithm to obtain the state information of each component and generate component state information key-value pairs;
Constructing a component abnormality subgraph from the component state information key-value pairs;
Locating the container component node where the abnormality occurred according to the component abnormality subgraph.
2. The method according to claim 1, characterized in that analyzing the TCP delay information of each container component by the sliding-window cumulative sum anomaly detection algorithm, obtaining the status information of each component, and generating the component status information key-value pairs comprises:
initializing a sliding window $[L_0, L_k]$ for the component, inputting TCP delay information until the number of TCP delay data items in the sliding window reaches k, and initializing the average value and the cumulative sum $S_k = 0$; where $[L_0, L_k]$ is the queue storing the TCP delay information from index 0 to k, and k is an integer with 0 < k < 60;
inputting further TCP delay information $L_t$, inserting $L_t$ into the sliding window, deleting the earliest TCP delay information $L_{t-k}$ from the sliding window, calculating the average value within the window, and calculating the cumulative sum $S_t$; where $L_t$ is the TCP delay information at time t, and t is an integer with t > k;
calculating the early-warning value $S_{diff} = S_{max} - S_{min}$, where $S_{max}, S_{min} \in [S_{t-k}, S_t]$ and $S_{t-k}$ is the cumulative sum corresponding to the earliest TCP delay information;
determining whether $S_{diff}$ lies within the normal threshold range $[-h, h]$; if so, determining that the state Status of the component is normal, and otherwise determining that the state Status of the component is abnormal;
generating a component status information key-value pair <CID:MID:Status> from the status information of each component, where CID is the number of the component, MID is the number of the physical machine on which the component runs, and Status is the state of the component; Status is 1 when the component is abnormal and 0 when it is normal.
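The detection loop of claim 2 can be sketched in Python as follows. The formulas for the window average and the cumulative sum are not legible in the text above, so this sketch assumes the common CUSUM recurrence $S_t = S_{t-1} + (L_t - \bar{L}_t)$, with $\bar{L}_t$ the mean of the current window. The function names `detect_states` and `status_pair` are hypothetical and do not come from the patent.

```python
from collections import deque


def detect_states(delays, k, h):
    """Return one 0/1 status per sample received after the window is full.

    delays: TCP delay samples in arrival order; k: window size;
    h: normal threshold. 1 means abnormal, 0 means normal.
    """
    window = deque(delays[:k], maxlen=k)   # the k most recent delays
    sums = deque([0.0], maxlen=k + 1)      # S_{t-k} .. S_t, seeded with S_k = 0
    statuses = []
    for l_t in delays[k:]:
        window.append(l_t)                 # maxlen evicts the oldest delay L_{t-k}
        mean = sum(window) / len(window)   # average value within the window
        # ASSUMPTION: standard CUSUM update; the source formula is not legible.
        sums.append(sums[-1] + (l_t - mean))
        s_diff = max(sums) - min(sums)     # early-warning value S_diff
        statuses.append(0 if -h <= s_diff <= h else 1)
    return statuses


def status_pair(cid, mid, status):
    """Format the <CID:MID:Status> key-value pair as a string."""
    return f"{cid}:{mid}:{status}"
```

Because $S_{diff}$ is a maximum minus a minimum, it is never negative, so in this sketch the check against $[-h, h]$ reduces to $S_{diff} \le h$.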
3. The method according to claim 2, characterized in that constructing the component abnormality subgraph from the component status information key-value pairs comprises:
inputting a component dependency graph G, whose matrix representation is $G = (E_{ij})$, where i and j denote components in the application cluster and $E_{ij}$ denotes the dependency between component i and component j; $E_{ij}$ is 1 if component i depends on component j, and 0 otherwise;
traversing the component status information key-value pairs and, whenever Status is 0, deleting from the component dependency graph G the row and column with i = CID or j = CID; after the traversal, obtaining the component dependency subgraph G1;
determining whether isolated component nodes exist in the component dependency subgraph G1, an isolated component node being one that neither depends on any other component node nor is depended upon by any other component node, and constructing the component abnormality subgraph G' after deleting such component nodes.
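The two pruning steps of claim 3 can be sketched as below, assuming the dependency graph is held as a list-of-lists 0/1 matrix alongside a list of component IDs. The function name `build_abnormal_subgraph` and the argument layout are illustrative assumptions, and self-dependencies are ignored when testing for isolated nodes.

```python
def build_abnormal_subgraph(components, E, status):
    """Build the component abnormality subgraph G'.

    components: list of CIDs, in matrix order.
    E: square 0/1 matrix; E[i][j] == 1 if components[i] depends on components[j].
    status: dict mapping CID -> 0 (normal) or 1 (abnormal).
    Returns (kept_cids, sub_matrix).
    """
    # G1: delete the row and column of every component whose Status is 0.
    keep = [n for n, cid in enumerate(components) if status.get(cid, 0) == 1]
    # G': delete isolated nodes, i.e. nodes with no edge to or from
    # another node that is still in G1.
    connected = [
        i for i in keep
        if any(E[i][j] or E[j][i] for j in keep if j != i)
    ]
    kept_cids = [components[i] for i in connected]
    sub_matrix = [[E[i][j] for j in connected] for i in connected]
    return kept_cids, sub_matrix
```

For example, with A depending on B, B depending on C, and D depending on nothing, marking A, B and D as abnormal leaves only A and B in G', because D is isolated once C has been removed.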
4. The method according to claim 3, characterized in that locating, according to the component abnormality subgraph, the container component node at which the abnormality occurred comprises:
traversing the component abnormality subgraph G' and calculating $\delta_i = \sum_{j \in G'} E_{ij}$; if $\delta_i = 0$, component node i is an abnormal root node.
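The root test of claim 4 amounts to finding rows of the G' matrix that sum to zero, that is, abnormal components that depend on no other abnormal component. A minimal sketch, assuming G' is given as a list of CIDs plus a matching 0/1 matrix (the name `abnormal_roots` is hypothetical):

```python
def abnormal_roots(cids, E):
    """Return the CIDs whose out-degree delta_i in G' is zero.

    cids: component IDs in matrix order.
    E: 0/1 matrix of G'; E[i][j] == 1 if cids[i] depends on cids[j].
    """
    return [cid for cid, row in zip(cids, E) if sum(row) == 0]
```

If A depends on B and both are abnormal, only B is reported, since the abnormality of A can be explained by its dependency on B.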
5. The method according to claim 4, characterized in that, after locating the container component node at which the abnormality occurred according to the component abnormality subgraph, the method further comprises:
determining whether the MIDs of all abnormal root nodes are identical and, if so, determining that the physical machine numbered MID is abnormal.
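The physical-machine check of claim 5 can be sketched as a set comparison over the MIDs of the abnormal root nodes. The function name `abnormal_machine` and the choice to return `None` when the roots span several machines (or when there are no roots) are assumptions made for illustration.

```python
def abnormal_machine(root_mids):
    """Return the shared MID if every abnormal root node runs on the same
    physical machine, otherwise None.

    root_mids: iterable of MIDs, one per abnormal root node.
    """
    mids = set(root_mids)
    return mids.pop() if len(mids) == 1 else None
```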
6. The method according to claim 1, characterized in that obtaining the TCP delay information of each container component comprises:
collecting the TCP delay information of each component using the software tcprstat.
7. An abnormality detection and localization device applied to a distributed container cloud platform, characterized by comprising:
a delay information acquisition unit, configured to obtain TCP delay information of each container component;
a status information acquisition unit, configured to analyze the TCP delay information of each container component by a sliding-window cumulative sum anomaly detection algorithm, obtain status information of each component, and generate component status information key-value pairs;
a component abnormality subgraph construction unit, configured to construct a component abnormality subgraph from the component status information key-value pairs;
an abnormality locating unit, configured to locate, according to the component abnormality subgraph, the container component node at which the abnormality occurred.
8. The device according to claim 7, characterized in that the device further comprises:
an abnormality determination unit, configured to determine whether the MIDs of all abnormal root nodes are identical and, if so, to determine that the physical machine numbered MID is abnormal.
9. A storage medium, characterized in that the storage medium stores a program file capable of implementing the abnormality detection and localization method applied to a distributed container cloud platform according to any one of claims 1 to 6.
10. A processor, characterized in that the processor is configured to run a program, wherein the program, when run, executes the abnormality detection and localization method applied to a distributed container cloud platform according to any one of claims 1 to 6.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811537333.2A CN109800052B (en) | 2018-12-15 | 2018-12-15 | Anomaly detection and positioning method and device applied to distributed container cloud platform |
PCT/CN2019/123989 WO2020119627A1 (en) | 2018-12-15 | 2019-12-09 | Abnormality detection and positioning method and apparatus applied to distributed container cloud platform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811537333.2A CN109800052B (en) | 2018-12-15 | 2018-12-15 | Anomaly detection and positioning method and device applied to distributed container cloud platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109800052A (en) | 2019-05-24 |
CN109800052B (en) | 2020-11-24 |
Family
ID=66556890
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811537333.2A Active CN109800052B (en) | 2018-12-15 | 2018-12-15 | Anomaly detection and positioning method and device applied to distributed container cloud platform |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109800052B (en) |
WO (1) | WO2020119627A1 (en) |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3345626B2 (en) * | 1994-09-29 | 2002-11-18 | 富士通株式会社 | Processor error countermeasure device in multiprocessor system and processor error countermeasure method in multiprocessor system |
CN101505243B (en) * | 2009-03-10 | 2011-01-05 | 中国科学院软件研究所 | Performance exception detecting method for Web application |
CN105242971B (en) * | 2015-10-20 | 2019-02-22 | 北京航空航天大学 | Memory object management method and system towards Stream Processing system |
CN108306879B (en) * | 2018-01-30 | 2020-11-06 | 福建师范大学 | Distributed real-time anomaly positioning method based on Web session flow |
CN109800052B (en) * | 2018-12-15 | 2020-11-24 | 深圳先进技术研究院 | Anomaly detection and positioning method and device applied to distributed container cloud platform |
- 2018-12-15: CN CN201811537333.2A, patent CN109800052B (en), Active
- 2019-12-09: WO PCT/CN2019/123989, patent WO2020119627A1 (en), Application Filing
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180032903A1 (en) * | 2016-07-28 | 2018-02-01 | International Business Machines Corporation | Optimized re-training for analytic models |
CN106487633A (en) * | 2016-10-11 | 2017-03-08 | 中国银联股份有限公司 | A kind of abnormal monitoring method of virtual machine and device |
WO2018084912A1 (en) * | 2016-11-02 | 2018-05-11 | Qualcomm Incorporated | Methods and systems for anomaly detection using function specifications derived from server input/output (i/o) behavior |
CN106776005A (en) * | 2016-11-23 | 2017-05-31 | 华中科技大学 | A kind of resource management system and method towards containerization application |
CN108306747A (en) * | 2017-01-11 | 2018-07-20 | 阿里巴巴集团控股有限公司 | A kind of cloud security detection method, device and electronic equipment |
CN107612787A (en) * | 2017-11-06 | 2018-01-19 | 南京易捷思达软件科技有限公司 | A kind of cloud hostdown detection method for cloud platform of being increased income based on Openstack |
CN108337108A (en) * | 2017-12-28 | 2018-07-27 | 天津麒麟信息技术有限公司 | A kind of cloud platform failure automation localization method based on association analysis |
CN108259241A (en) * | 2018-01-11 | 2018-07-06 | 上海有云信息技术有限公司 | A kind of abnormal localization method and device of cloud platform monitoring system |
CN108491306A (en) * | 2018-03-19 | 2018-09-04 | 广东电网有限责任公司珠海供电局 | One kind being based on enterprise's private clound credibility monitoring method and system |
Non-Patent Citations (3)
Title |
---|
JORDAN HOCHENBAUM: "Automatic Anomaly Detection in the Cloud", 《ARXIV》 * |
TAO WANG: "Self-adaptive cloud monitoring with online anomaly detection", 《FUTURE GENERATION COMPUTER SYSTEMS》 * |
王桂平: "云环境下面向可信的虚拟机异常检测关键技术研究", 《中国博士学位论文全文数据库 信息科技辑》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020119627A1 (en) * | 2018-12-15 | 2020-06-18 | 深圳先进技术研究院 | Abnormality detection and positioning method and apparatus applied to distributed container cloud platform |
CN111061586A (en) * | 2019-12-05 | 2020-04-24 | 深圳先进技术研究院 | Container cloud platform anomaly detection method and system and electronic equipment |
WO2021109048A1 (en) * | 2019-12-05 | 2021-06-10 | 深圳先进技术研究院 | Container cloud platform abnormality detection method and system, and electronic device |
CN111061586B (en) * | 2019-12-05 | 2023-09-19 | 深圳先进技术研究院 | Container cloud platform anomaly detection method and system and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
WO2020119627A1 (en) | 2020-06-18 |
CN109800052B (en) | 2020-11-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||