CN114185756A - Distributed system state analysis method, device and computer readable storage medium - Google Patents


Info

Publication number
CN114185756A
Authority
CN
China
Prior art keywords
state
distributed system
decision tree
tree model
analyzing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111514847.8A
Other languages
Chinese (zh)
Inventor
李传文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Yuncongxihe Artificial Intelligence Co ltd
Original Assignee
Jiangsu Yuncongxihe Artificial Intelligence Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Yuncongxihe Artificial Intelligence Co ltd filed Critical Jiangsu Yuncongxihe Artificial Intelligence Co ltd
Priority to CN202111514847.8A priority Critical patent/CN114185756A/en
Publication of CN114185756A publication Critical patent/CN114185756A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3055Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available

Abstract

The invention relates to the technical field of computer processing, in particular to a distributed system state analysis method, a distributed system state analysis device and a computer readable storage medium, and aims to solve the technical problem of accurately analyzing whether a distributed system is in a fault state or another state. To this end, the method of the invention comprises: acquiring state characteristics of a distributed system in a target scene; inputting the state characteristics of the distributed system into a preset decision tree model, wherein the decision tree model is trained by using a sample state characteristic set of the distributed system and is used to analyze whether the distributed system is in one of multiple states; and outputting the state of the distributed system by the decision tree model. The decision tree model used in the invention preferentially judges whether the distributed system has a fault that is easier to identify. If the system has that fault, it can be identified quickly and accurately, which avoids analyzing other faults when that fault is already present and improves the accuracy of fault judgment.

Description

Distributed system state analysis method, device and computer readable storage medium
Technical Field
The invention relates to the technical field of computers, and particularly provides a distributed system state analysis method and device and a computer readable storage medium.
Background
A distributed system is a software system built on top of a network. Owing to the nature of software, distributed systems are highly cohesive and transparent. In a distributed system, a set of independent computers appears to the user as a unified whole, as if it were a single system. The system has various general physical and logical resources, can dynamically allocate tasks, and exchanges information among its dispersed physical and logical resources through a computer network.
A distributed system consists of a plurality of different nodes interconnected through a communication network, and a user on one node can use resources on other nodes, which realizes resource sharing. A task can be divided into a plurality of subtasks that run in parallel, and the subtasks can be distributed to different nodes and run on them simultaneously, which increases computing speed. In addition, a distributed system has high reliability. If one of the nodes fails, the remaining nodes can continue to operate, and the entire system does not crash because of the failure of one or a few nodes. Therefore, a distributed system has good fault tolerance.
However, to ensure high reliability, a distributed system must be able to detect the failure of a node and take appropriate measures to recover from it. It is desirable to quickly determine which node has failed and to stop using that node to provide service until it returns to normal operation. Current solutions, however, depend heavily on professionals such as operation and maintenance personnel, and the time needed to locate a fault depends on how experienced those personnel are.
Disclosure of Invention
In order to overcome the above drawbacks, the present invention provides a distributed system state analysis method, apparatus, and computer-readable storage medium that solve, or at least partially solve, the technical problem of quickly and accurately analyzing whether a distributed system is in a fault state or another state.
In a first aspect, the present invention provides a method for analyzing a distributed system state, the method comprising:
acquiring state characteristics of a distributed system in a target scene;
inputting the state characteristics of the distributed system into a preset decision tree model, wherein the decision tree model is trained by using a sample state characteristic set of the distributed system and is used for analyzing whether the distributed system is in one of a plurality of states; in training the decision tree model, for a first state and a second state of the plurality of states, a first gap between the state features related to the first state and the state features unrelated to the first state in the sample state feature set is calculated, together with a second gap between the state features related to the second state and the state features unrelated to the second state in the sample state feature set; if the first gap is larger than the second gap, the decision tree model is controlled to first analyze whether the distributed system is in the first state and, when the distributed system is not in the first state, to then analyze whether it is in the second state;
and outputting the state of the distributed system by the decision tree model.
In an embodiment of the method for analyzing the state of the distributed system, the step of "calculating a first gap between the state features related to the first state and the state features unrelated to the first state in the sample state feature set, and a second gap between the state features related to the second state and the state features unrelated to the second state in the sample state feature set" includes:
calculating the first gap according to the difference between the average value of each state feature related to the first state and the average value of each state feature unrelated to the first state, and calculating the second gap according to the difference between the average value of each state feature related to the second state and the average value of each state feature unrelated to the second state;
and/or,
before the step of inputting the state characteristics of the distributed system into a preset decision tree model, the method further comprises:
identifying states that the distributed system will not be in within the target scene;
and the step of inputting the state characteristics of the distributed system into a preset decision tree model further comprises:
inputting information on the states that will not occur in the distributed system into the decision tree model, wherein the decision tree model does not analyze those states in the process of analyzing the state;
and/or,
there are multiple state characteristics of the distributed system, and before the step of inputting the state characteristics of the distributed system into a preset decision tree model, the method further comprises:
analyzing the service influence degree of each of the state characteristics of the distributed system in the target scene, and scaling the state characteristics of the distributed system according to the service influence degree;
and/or,
the state characteristics of the distributed system comprise network connection basic characteristics, network content and traffic characteristics, traffic load balancing statistical characteristics, data consistency characteristics and high availability characteristics of the distributed system.
In a second aspect, there is provided a distributed system state analysis apparatus, the apparatus comprising:
the state characteristic acquisition module is used for acquiring the state characteristics of the distributed system in a target scene;
a model input module, which inputs the state characteristics of the distributed system into a preset decision tree model, wherein the decision tree model is trained by using a sample state characteristic set of the distributed system and is used for analyzing whether the distributed system is in one of a plurality of states; when the decision tree model is trained, for a first state and a second state in the plurality of states, a first gap between the state characteristics related to the first state and the state characteristics unrelated to the first state in the sample state characteristic set and a second gap between the state characteristics related to the second state and the state characteristics unrelated to the second state in the sample state characteristic set are calculated, and if the first gap is larger than the second gap, the decision tree model is controlled to first analyze whether the distributed system is in the first state and, when the distributed system is not in the first state, to then analyze whether it is in the second state;
and the model output module outputs the state of the distributed system by the decision tree model.
In an embodiment of the apparatus for analyzing a state of a distributed system, there are a plurality of state features of the distributed system, and when the decision tree model is trained, the first gap is calculated according to a difference between an average value of each state feature related to the first state and an average value of each state feature unrelated to the first state, and the second gap is calculated according to a difference between an average value of each state feature related to the second state and an average value of each state feature unrelated to the second state;
and/or, further comprising:
a state identification module, which, before the state characteristics of the distributed system are input into the preset decision tree model, identifies states that will not occur in the distributed system in the target scene;
the model input module also inputs information on the states that will not occur in the distributed system into the decision tree model, and the decision tree model does not analyze those states in the process of analyzing the state;
and/or,
there are multiple state characteristics of the distributed system, and the apparatus further comprises:
a state characteristic scaling module, which, before the state characteristics of the distributed system are input into the preset decision tree model, analyzes the service influence degree of each state characteristic of the distributed system in the target scene and scales the state characteristics according to the service influence degree;
and/or,
the state characteristics of the distributed system comprise network connection basic characteristics, network content and traffic characteristics, traffic load balancing statistical characteristics, data consistency characteristics and high availability characteristics of the distributed system.
In a third aspect, a control device is provided, which includes a processor and a storage device, wherein the storage device is adapted to store a plurality of program codes, and the program codes are adapted to be loaded and run by the processor to execute any one of the above-mentioned technical solutions of the distributed system state analysis method.
In a fourth aspect, a computer-readable storage medium is provided, in which a plurality of program codes are stored, the program codes being adapted to be loaded and executed by a processor to perform the distributed system state analysis method according to any one of the above technical solutions.
One or more technical schemes of the invention at least have one or more of the following beneficial effects:
In one embodiment of the present invention, the method for analyzing the state of the distributed system may include the following steps: acquiring state characteristics of a distributed system in a target scene; inputting the state characteristics of the distributed system into a preset decision tree model, wherein the decision tree model is trained by using a sample state characteristic set of the distributed system and is used for analyzing whether the distributed system is in one of a plurality of states; and outputting the state of the distributed system by the decision tree model. When the decision tree model is trained, for a first state and a second state in the plurality of states, a first gap between the state characteristics related to the first state and the state characteristics unrelated to the first state in the sample state characteristic set is calculated, together with a second gap between the state characteristics related to the second state and the state characteristics unrelated to the second state. If the first gap is larger than the second gap, the decision tree model is controlled to first analyze whether the distributed system is in the first state and, when it is not, to then analyze whether it is in the second state. Because the gap for the first state is larger, the state characteristics differ clearly depending on whether the distributed system is in the first state, so whether it is in the first state can be judged accurately from its state characteristics. For the second state the gap is smaller, the state characteristics do not differ clearly depending on whether the system is in the second state, and it is therefore difficult to judge accurately from the state characteristics whether the system is in the second state. The decision tree model used by the method therefore preferentially judges whether the distributed system is in the first state, which is easier to identify. In other words, the decision tree model preferentially judges whether the distributed system has a fault that is easier to identify; if the system has that fault, it can be identified quickly and accurately, which avoids analyzing other faults when that fault is present and improves the accuracy of fault judgment.
Drawings
The disclosure of the present invention will become more readily understood with reference to the accompanying drawings. As is readily understood by those skilled in the art: these drawings are for illustrative purposes only and are not intended to constitute a limitation on the scope of the present invention. Wherein:
FIG. 1 is a schematic flow chart of the main steps of a distributed system state analysis method according to one embodiment of the present invention;
FIG. 2 is a schematic flow chart of the main steps of a distributed system state analysis method according to one embodiment of the present invention;
FIG. 3 is a schematic diagram of a decision tree model used by a distributed system state analysis method according to one embodiment of the invention;
FIG. 4 is a schematic diagram of a main configuration of a distributed system state analysis apparatus according to another embodiment of the present invention;
FIG. 5 is a schematic diagram of a main structure of a distributed system state analysis apparatus according to another embodiment of the present invention.
Detailed Description
Some embodiments of the invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.
In the description of the present invention, a "module" or "processor" may include hardware, software, or a combination of both. A module may comprise hardware circuitry, various suitable sensors, communication ports, and memory, may comprise software components such as program code, or may be a combination of software and hardware. The processor may be a central processing unit, microprocessor, image processor, digital signal processor, or any other suitable processor. The processor has data and/or signal processing functionality. The processor may be implemented in software, hardware, or a combination thereof. Non-transitory computer readable storage media include any suitable medium that can store program code, such as magnetic disks, hard disks, optical disks, flash memory, read-only memory, random-access memory, and the like. The term "A and/or B" denotes all possible combinations of A and B, such as A alone, B alone, or A and B. The term "at least one A or B" or "at least one of A and B" has a meaning similar to "A and/or B" and may include only A, only B, or both A and B. The singular forms "a", "an" and "the" may include the plural forms as well.
Referring to fig. 1, fig. 1 is a flow chart illustrating main steps of a distributed system state analysis method according to an embodiment of the present invention. As shown in fig. 1, the distributed system state analysis method in the embodiment of the present invention mainly includes the following steps:
Step S110, acquiring the state characteristics of the distributed system in the target scene.
The target scene is not limited herein and may be any time, occasion, or scene in a business environment. The distributed system has multiple kinds of state characteristics, including but not limited to network connection basic characteristics, network content and traffic characteristics, traffic load balancing statistical characteristics, data consistency characteristics and high availability characteristics of the distributed system. For example, there may be 50 state characteristics, of which the 1st to 9th are basic network connection characteristics of the distributed system, the 10th to 23rd are network content and traffic characteristics of the distributed system's connections, the 24th to 30th are traffic load balancing statistical characteristics based on a time grid, the 31st to 41st are data consistency characteristics based on network partitions, and the 42nd to 50th are high availability characteristics of the distributed system.
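The 50-feature example above can be pictured as a fixed index layout. The following is a minimal sketch under that assumption; the group names and the helper split_by_group are hypothetical and do not come from the original text.

```python
# Hypothetical index layout for the 50-feature example. The text numbers
# features from 1, so the 1-based ranges are converted to 0-based slices.
FEATURE_GROUPS = {
    "network_connection_basic": slice(0, 9),      # features 1-9
    "network_content_and_traffic": slice(9, 23),  # features 10-23
    "load_balancing_statistics": slice(23, 30),   # features 24-30
    "data_consistency": slice(30, 41),            # features 31-41
    "high_availability": slice(41, 50),           # features 42-50
}


def split_by_group(features):
    """Split one 50-element state feature vector into its named groups."""
    if len(features) != 50:
        raise ValueError("expected 50 state features")
    return {name: list(features[s]) for name, s in FEATURE_GROUPS.items()}


# Use the 1-based feature numbers themselves as placeholder values.
groups = split_by_group(list(range(1, 51)))
```

With this layout, groups["high_availability"] holds the values placed at positions 42 to 50 of the vector.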
Step S120, inputting the state characteristics of the distributed system into a preset decision tree model, where the decision tree model is trained by using the sample state characteristic set of the distributed system, and is used to analyze whether the distributed system is in one of multiple states.
The various states of the distributed system herein include not only various types of faults but also other states such as the degree of system load. The following description will be given taking an example of analyzing a fault of a distributed system, and it is not intended to indicate that the technical solution of the present invention can only analyze a fault of a system.
When the decision tree model is trained, for a first state and a second state among the plurality of states, a first gap between the state features related to the first state and the state features unrelated to the first state in the sample state feature set is calculated, together with a second gap between the state features related to the second state and the state features unrelated to the second state in the sample state feature set. If the first gap is larger than the second gap, the decision tree model is controlled to first analyze whether the distributed system is in the first state and, when the distributed system is not in the first state, to then analyze whether it is in the second state.
The sample data set is classified according to whether the distributed system is in the first state. If the gap between the two classes of samples is large, the state characteristics of the distributed system differ markedly depending on whether it is in the first state. The sample data set is likewise classified according to whether the distributed system is in the second state. If the gap between the two classes of samples is small, the state characteristics do not differ markedly depending on whether the system is in the second state. A decision tree model trained on this sample data set can therefore identify quickly and accurately whether the distributed system is in the first state, but identifies whether it is in the second state with lower accuracy. For this reason the decision tree model is controlled to analyze first whether the distributed system is in the first state, and to analyze whether it is in the second state only when it is not in the first state. If the distributed system is in the first state, it is identified accurately, and the situation in which the distributed system is already in the first state but the decision tree model still analyzes whether it is in the second state is avoided, which improves the accuracy of analyzing the state of the distributed system.
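The ordering rule described above (analyze the state with the larger gap first) can be sketched as a simple sort. This is an illustration only: the function name analysis_order and the gap values are hypothetical, and the gaps are assumed to have been computed already.

```python
def analysis_order(gaps):
    """Return state names sorted so that the state with the largest gap comes first."""
    return sorted(gaps, key=lambda state: gaps[state], reverse=True)


# Hypothetical, precomputed gap values for two states.
gaps = {"second_state": 0.4, "first_state": 2.5}
order = analysis_order(gaps)
```

Here order places "first_state" ahead of "second_state", because its gap is larger and it is therefore assumed to be the easier state to identify.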
Step S130, the decision tree model outputs what state the distributed system is in.
The decision tree model used in this embodiment preferentially judges whether the distributed system has a fault that is easier to identify. If the system has that fault, it can be identified quickly and accurately, which avoids analyzing other faults when that fault is present and improves the accuracy of fault judgment.
Referring to fig. 2, fig. 2 is a flow chart illustrating the main steps of a distributed system status analysis method according to an embodiment of the present invention. As shown in fig. 2, the distributed system state analysis method in the embodiment of the present invention mainly includes the following steps:
Step S210, acquiring the state characteristics of the distributed system in the target scene.
Step S220, analyzing the service influence degree of each of the state characteristics of the distributed system in the target scene, and scaling the state characteristics of the distributed system according to the service influence degree.
In different scenes, the state characteristics of the distributed system correlate to different degrees with the business carried out in the current scene, so they also influence the state of the system to different degrees. The state characteristics are therefore scaled appropriately to reflect how strongly each one influences the state of the system.
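The text does not say how the scaling is performed. One simple reading, shown here purely as an assumption, is to multiply each state characteristic by a scene-specific weight that reflects its service influence degree; the function scale_features and the weight values are hypothetical.

```python
def scale_features(features, impact_weights):
    """Multiply each state feature by its scene-specific impact weight.

    Multiplicative weighting is an assumption; the source only states that
    features are scaled according to their service influence degree.
    """
    if len(features) != len(impact_weights):
        raise ValueError("one weight per feature is required")
    return [x * w for x, w in zip(features, impact_weights)]


# Hypothetical values: the second feature matters most in this scene.
scaled = scale_features([0.2, 0.5, 0.9], [1.0, 2.0, 0.5])
```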
Step S230, identifying states that the distributed system will not be in within the target scene.
In a specific scene, some faults or other states cannot occur, so the decision tree model does not need to spend resources analyzing them, which helps improve the efficiency of state analysis of the distributed system.
Step S240, inputting the state characteristics of the distributed system and information on the states that will not occur in the distributed system into a preset decision tree model, wherein the decision tree model is trained by using the sample state characteristic set of the distributed system and is used for analyzing whether the distributed system is in one of a plurality of states, and the decision tree model does not analyze the states that will not occur in the process of analyzing the state.
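Skipping states that cannot occur in the target scene can be sketched as a filter over the analysis order. The helper states_to_analyze and the state labels are hypothetical; how the actual model consumes this information is not specified in the text.

```python
def states_to_analyze(ordered_states, impossible_states):
    """Keep the analysis order but drop states that cannot occur in the scene."""
    excluded = set(impossible_states)
    return [s for s in ordered_states if s not in excluded]


# Hypothetical example: states C and E cannot occur in this scene.
remaining = states_to_analyze(["A", "B", "C", "D", "E"], ["C", "E"])
```

The relative order of the remaining states is preserved, so the easier-to-identify states are still checked first.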
The distributed system training data set is preprocessed by converting the data to numerical form and normalizing it, and a PCA (principal component analysis) dimensionality reduction is then applied to the data before it is used to train the decision tree model. In this embodiment, the CART algorithm is adopted to construct the multi-class distributed system fault decision tree model.
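The numeric conversion and normalization steps can be illustrated as follows. The text names neither the encoding scheme nor the normalization formula, so integer coding and min-max scaling are assumptions here, and both function names are hypothetical. The dimensionality reduction and CART training steps are not shown.

```python
def encode_categories(values):
    """Map each distinct categorical value to an integer code,
    assigned in order of first appearance (an assumed encoding)."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]


def min_max_normalize(column):
    """Rescale one numeric feature column to the range [0, 1] (assumed formula)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


protocol_codes = encode_categories(["tcp", "udp", "tcp"])
normalized = min_max_normalize([10, 20, 30])
```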
When the decision tree model is trained, for a first state and a second state among the plurality of states, a first gap is calculated according to the difference between the average value of each state feature related to the first state and the average value of each state feature unrelated to the first state, and a second gap is calculated according to the difference between the average value of each state feature related to the second state and the average value of each state feature unrelated to the second state. If the first gap is larger than the second gap, the decision tree model is controlled to first analyze whether the distributed system is in the first state and, when the distributed system is not in the first state, to then analyze whether it is in the second state.
Here, the specific way to calculate the gap is as follows:
$$M_j = \frac{1}{N} \sum_{i=1}^{N} X_{i,j}$$
$$M'_j = \frac{1}{N'} \sum_{i'=1}^{N'} X'_{i',j}$$
The sample data set is divided, according to whether the system is in a certain state, into a class A related to the state and a class B unrelated to the state. M_j represents the average value of the jth feature over all samples of class A, and M'_j represents the average value of the jth feature over all samples of class B; N and N' represent the total number of samples in class A and class B, respectively; n represents the number of features in a sample; i and i' represent the serial numbers of the class A and class B samples, respectively; j represents the serial number of a feature in a sample; X_{i,j} represents the jth feature of the ith sample in class A; X'_{i',j} represents the jth feature of the i'th sample in class B. Finally, the distance between class A and class B is obtained as follows:
Figure BDA0003405273000000093
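The class-center calculation can be sketched in a few lines. The per-feature averages follow the definitions in the text. The distance formula itself is not recoverable from the text, so the Euclidean distance between the two class centers is used here as an assumption; the function names are hypothetical.

```python
import math


def class_center(samples):
    """Average each feature over all samples of one class."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def center_distance(class_a, class_b):
    """Distance between the centers of class A and class B.

    Euclidean distance is an assumption; the source formula is not available.
    """
    center_a = class_center(class_a)
    center_b = class_center(class_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(center_a, center_b)))


# Hypothetical two-feature samples.
class_a = [[1.0, 2.0], [3.0, 4.0]]  # samples related to the state
class_b = [[5.0, 7.0], [5.0, 7.0]]  # samples unrelated to the state
d = center_distance(class_a, class_b)
```

A larger value of d suggests that the state separates the samples more clearly, which is why such a state would be analyzed earlier.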
Step S250, outputting the state of the distributed system by the decision tree model.
A specific implementation of this embodiment is used for analyzing distributed system faults. The distributed system states are classified into 5 types: type A represents normal, type B represents an omission fault, type C represents an arbitrary fault (Byzantine fault), type D represents a timing fault, and type E represents a reliability fault of one-to-one communication. Training uses the sample state characteristic set: the set is divided according to whether each sample belongs to type A, the class centers (namely, the average values) of the two classes of features are calculated, and the distance between the two class centers is computed. The distances corresponding to types B to E are calculated in the same way, and the analysis order of the decision tree model is determined by the size of these distances. If the distances corresponding to types A to E decrease progressively, the decision tree model is constructed as shown schematically in FIG. 3. Using this decision tree model effectively improves the accuracy of fault judgment.
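Once the order of the types is fixed, analysis proceeds as a cascade: each type is checked in turn and the first match is returned. The sketch below illustrates only that control flow. The per-type checks are hypothetical threshold rules standing in for the trained decision tree nodes, and the feature names are invented for the example.

```python
def analyze_state(features, ordered_checks, default="unknown"):
    """Check each state in order and return the first one that matches."""
    for state, is_in_state in ordered_checks:
        if is_in_state(features):
            return state
    return default


# Hypothetical rules, listed in descending order of class-center distance.
checks = [
    ("A", lambda f: f["error_rate"] == 0.0),         # normal
    ("B", lambda f: f["dropped_messages"] > 0),      # omission fault
    ("C", lambda f: f["conflicting_replies"] > 0),   # arbitrary fault
]

observation = {"error_rate": 0.3, "dropped_messages": 4, "conflicting_replies": 1}
result = analyze_state(observation, checks)
```

In this example the observation matches both the type B and type C rules, but type B is reported because it is checked first and later types are not analyzed.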
Referring to fig. 4, fig. 4 is a schematic block diagram of a main structure of a distributed system state analysis apparatus according to an embodiment of the present invention. As shown in fig. 4, the distributed system state analysis apparatus in the embodiment of the present invention mainly includes the following modules:
The state feature acquisition module 410 acquires the state characteristics of the distributed system in the target scene.
The target scene is not limited herein and may be any time, occasion, or scene in a business environment. The distributed system has multiple kinds of state characteristics, including but not limited to network connection basic characteristics, network content and traffic characteristics, traffic load balancing statistical characteristics, data consistency characteristics and high availability characteristics of the distributed system. For example, there may be 50 state characteristics, of which the 1st to 9th are basic network connection characteristics of the distributed system, the 10th to 23rd are network content and traffic characteristics of the distributed system's connections, the 24th to 30th are traffic load balancing statistical characteristics based on a time grid, the 31st to 41st are data consistency characteristics based on network partitions, and the 42nd to 50th are high availability characteristics of the distributed system.
The model input module 420 inputs the state characteristics of the distributed system into a preset decision tree model, and the decision tree model is trained by using the sample state characteristic set of the distributed system and is used for analyzing whether the distributed system is in one of multiple states.
The various states of the distributed system herein include not only various types of faults but also other states such as the degree of system load. The following description will be given taking an example of analyzing a fault of a distributed system, and it is not intended to indicate that the technical solution of the present invention can only analyze a fault of a system.
When the decision tree model is trained, for a first state and a second state in a plurality of states, calculating a first difference between state features related to the first state and state features unrelated to the first state in the sample state feature set and a second difference between state features related to the second state and state features unrelated to the second state in the sample state feature set, and if the first difference is larger than the second difference, controlling the decision tree model to firstly analyze whether the distributed system is in the first state and secondly analyze whether the distributed system is in the second state when the distributed system is not in the first state.
The sample data set is first split according to whether the distributed system is in the first state. A large difference between the two groups indicates that the state features of the system in the first state differ markedly from those of the system not in the first state. The sample data set is likewise split according to whether the distributed system is in the second state; a small difference between those two groups indicates that the state features in the second state are not clearly distinguishable from those outside it. A decision tree model trained on this sample data set can therefore identify the first state quickly and accurately, whereas its accuracy in identifying the second state is lower. The decision tree model is accordingly controlled to analyze the first state first and to analyze the second state only when the distributed system is not in the first state. If the system is in the first state, it is identified accurately at once, which avoids the situation in which the system is in the first state while the model is still analyzing whether it is in the second state, and so improves the accuracy of the state analysis.
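The resulting analysis order can be pictured as a cascade of yes/no checks. The sketch below is a simplified illustration, not the trained model itself: the detectors are hypothetical stand-ins for the tree's per-state decisions, and the feature names and thresholds are invented for the example.

```python
def order_states_by_difference(differences):
    """States sorted so that the largest difference is analyzed first."""
    return sorted(differences, key=differences.get, reverse=True)

def analyze(features, ordered_states, detectors, default="undetermined"):
    """Check each state in order and stop at the first one that matches."""
    for state in ordered_states:
        if detectors[state](features):
            return state
    return default

# Hypothetical differences and detectors for two states.
differences = {"second": 1.5, "first": 4.0}
detectors = {
    "first": lambda f: f["latency_ms"] > 500,
    "second": lambda f: f["error_rate"] > 0.1,
}
order = order_states_by_difference(differences)
result = analyze({"latency_ms": 900, "error_rate": 0.5}, order, detectors)
```

Here the sample satisfies both detectors, but because the first state has the larger difference it is checked first and the second state is never evaluated.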
The model output module 430 outputs the state of the distributed system from the decision tree model.
With the decision tree model of this embodiment, the fault that is easiest to identify is checked first. If the system has that fault, it is identified quickly and accurately, the remaining faults need not be analyzed, and the accuracy of fault judgment improves.
Referring to fig. 5, fig. 5 is a schematic block diagram illustrating a main structure of a distributed system state analysis apparatus according to an embodiment of the present invention. As shown in fig. 5, the distributed system state analysis apparatus in the embodiment of the present invention mainly includes the following modules:
The state feature acquisition module 510 acquires the state features of the distributed system in the target scene.
The status feature scaling module 520 analyzes the service influence degree of the various status features of the distributed system in the target scene, and scales the various status features of the distributed system according to the service influence degree.
In different scenes, the state features of the distributed system are relevant to the business at hand to different degrees, so their influence on the system state also differs. The state features are therefore scaled appropriately to reflect the influence of each feature on the system state.
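One simple way to realize such scaling is to multiply each feature by a scene-specific weight. The text does not specify the scaling rule, so the multiplicative form, the feature names, and the weights below are all assumptions made for illustration.

```python
def scale_features(features, impact):
    """Multiply each feature by its business-impact weight.

    Features without an entry in `impact` keep a weight of 1.0.
    Multiplicative scaling is an assumption; the text only says the
    features are scaled according to their business impact.
    """
    return {name: value * impact.get(name, 1.0) for name, value in features.items()}

# Hypothetical normalized features and scene-specific weights.
features = {"connection_count": 0.5, "replica_lag": 0.25, "failover_count": 1.0}
impact = {"replica_lag": 2.0, "failover_count": 0.5}
scaled = scale_features(features, impact)
```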
The state identification module 530 identifies states of the distributed system that will not occur in the target scene.
In a specific scene, some faults or other states cannot occur, so the decision tree model need not spend resources analyzing them, which improves the efficiency of the state analysis of the distributed system.
The model input module 540 inputs the state features of the distributed system, together with information about the states that will not occur, into a preset decision tree model. The decision tree model is trained using a sample state feature set of the distributed system and is used to analyze whether the distributed system is in one of multiple states; during state analysis it does not analyze the states that will not occur.
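Skipping states that will not occur can be sketched as a filter applied while walking the ordered checks. As before, the detectors are hypothetical stand-ins for the tree's per-state decisions; the `calls` list exists only to show which detectors were actually evaluated.

```python
def analyze_with_exclusions(features, ordered_states, detectors, excluded,
                            default="undetermined"):
    """Walk the ordered states, skipping any state listed in `excluded`."""
    for state in ordered_states:
        if state in excluded:
            continue  # this state will not occur in the scene: no analysis
        if detectors[state](features):
            return state
    return default

calls = []  # records which detectors were evaluated

def make_detector(name, key):
    """Build a hypothetical detector that reads a boolean flag from the features."""
    def detect(features):
        calls.append(name)
        return features[key]
    return detect

detectors = {
    "B": make_detector("B", "omission"),
    "C": make_detector("C", "byzantine"),
}
result = analyze_with_exclusions(
    {"omission": True, "byzantine": True}, ["B", "C"], detectors, excluded={"B"}
)
```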
The distributed system training data set is preprocessed by converting the data to numeric form and normalizing it, and PCA (principal component analysis) dimensionality reduction is then applied before the data is used to train the decision tree model. In this embodiment, the CART algorithm is used to construct the multi-class distributed system fault decision tree model.
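To make the preprocessing and split selection concrete, the sketch below shows min-max normalization and a single Gini-based binary split, which is the criterion CART uses for classification. It is a minimal pure-Python illustration on invented data; the dimensionality reduction step and the recursive tree construction are omitted, and all function names are hypothetical.

```python
def min_max_normalize(rows):
    """Rescale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

# Toy data (hypothetical): two numeric features, two classes.
raw = [[10, 1], [20, 1], [30, 5], [40, 5]]
labels = ["normal", "normal", "fault", "fault"]
rows = min_max_normalize(raw)
split = best_split(rows, labels)
```

A full CART implementation would apply best_split recursively to each side of the split until a stopping rule is met.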
When the decision tree model is trained, for a first state and a second state in a plurality of states, a first difference is calculated according to the difference between the average value of each state feature related to the first state and the average value of each state feature unrelated to the first state, and a second difference is calculated according to the difference between the average value of each state feature related to the second state and the average value of each state feature unrelated to the second state, if the first difference is larger than the second difference, the decision tree model is controlled to firstly analyze whether the distributed system is in the first state, and then analyze whether the distributed system is in the second state when the distributed system is not in the first state.
Here, the differences are calculated as follows:
$$M_j = \frac{1}{N}\sum_{i=1}^{N} X_{ij}$$
$$M'_j = \frac{1}{N'}\sum_{i'=1}^{N'} X'_{i'j}$$
According to whether each sample is in a given state, the sample data set is divided into class A (related to the state) and class B (unrelated to the state). $M_j$ denotes the mean of the $j$-th feature over all class A samples, and $M'_j$ denotes the mean of the $j$-th feature over all class B samples; $N$ and $N'$ denote the total numbers of samples in class A and class B, respectively; $n$ denotes the number of features in a sample; $i$ and $i'$ denote the indices of the class A and class B samples, respectively; $j$ denotes the index of a feature within a sample; $X_{ij}$ denotes the $j$-th feature of the $i$-th sample in class A; and $X'_{i'j}$ denotes the $j$-th feature of the $i'$-th sample in class B. Finally, the distance between class A and class B is obtained as:
[Figure BDA0003405273000000123: formula for the distance between the class A and class B centers]
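As a small worked illustration of the class centers and their distance, consider two features per sample and two samples per class. The numbers are invented, and the Euclidean form of the distance is an assumption, since the distance formula itself is not legible in this text.

```latex
% Assumed toy data: two features per sample (n = 2).
% Class A (in the state):     X_1  = (1, 2), X_2  = (3, 4), N  = 2
% Class B (not in the state): X'_1 = (0, 0), X'_2 = (2, 2), N' = 2
\begin{align*}
M  &= \left(\tfrac{1+3}{2},\ \tfrac{2+4}{2}\right) = (2,\ 3) \\
M' &= \left(\tfrac{0+2}{2},\ \tfrac{0+2}{2}\right) = (1,\ 1) \\
d(M, M') &= \sqrt{(2-1)^2 + (3-1)^2} = \sqrt{5} \approx 2.236
\end{align*}
```

A state whose two class centers are farther apart than those of another state would be analyzed earlier by the decision tree model.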
The model output module 550 outputs the state of the distributed system from the decision tree model.
In one specific implementation of this embodiment, the apparatus is used to analyze distributed system faults. The distributed system states are divided into five classes: class A represents normal operation, class B an omission fault, class C an arbitrary (Byzantine) fault, class D a timing fault, and class E a reliability fault in one-to-one communication. During training, the sample state feature set is split according to whether each sample belongs to class A, the class center (that is, the mean) of each of the two resulting groups is computed, and the distance between the two class centers is calculated. The corresponding distances for classes B through E are calculated in the same way, and the analysis order of the decision tree model is determined by the size of these distances. If the distances for classes A through E decrease in that order, the decision tree model is constructed as shown schematically in FIG. 3. Using this decision tree model effectively improves the accuracy of fault judgment.
The distributed system state analysis apparatus shown in FIGS. 4 and 5 is used to execute the embodiments of the distributed system state analysis method shown in FIGS. 1 and 2. The technical principles, the technical problems solved, and the technical effects produced are similar. Those skilled in the art will clearly understand that, for convenience and brevity, the specific working process of the apparatus and the related description may be found in the method embodiments and are not repeated here.
It will be understood by those skilled in the art that all or part of the flow of the method according to the above-described embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and which implements the steps of the above method embodiments when executed by a processor. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable storage medium may include any entity or device capable of carrying the computer program code, a medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, computer memory, read-only memory, random access memory, an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable storage medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable storage media do not include electrical carrier signals and telecommunication signals.
Furthermore, the invention also provides a control device. In an embodiment of the control device according to the present invention, the control device comprises a processor and a storage device. The storage device may be configured to store a program for executing the distributed system state analysis method of the above method embodiment, and the processor may be configured to execute the programs in the storage device, including but not limited to the program for executing the distributed system state analysis method of the above method embodiment. For convenience of explanation, only the parts related to the embodiments of the present invention are shown, and specific technical details are not disclosed. The control device may be a device formed from various electronic apparatuses.
Further, the invention also provides a computer-readable storage medium. In an embodiment of the computer-readable storage medium according to the present invention, the computer-readable storage medium may be configured to store a program for executing the distributed system state analysis method of the above method embodiment, and the program may be loaded and executed by a processor to implement that method. For convenience of explanation, only the parts related to the embodiments of the present invention are shown, and specific technical details are not disclosed. The computer-readable storage medium may be a storage device formed from various electronic apparatuses; optionally, the computer-readable storage medium in the embodiment of the present invention is a non-transitory computer-readable storage medium.
Further, it should be understood that, since the configuration of each module is only for explaining the functional units of the apparatus of the present invention, the corresponding physical devices of the modules may be the processor itself, or a part of software, a part of hardware, or a part of a combination of software and hardware in the processor. Thus, the number of individual modules in the figures is merely illustrative.
Those skilled in the art will appreciate that the various modules in the apparatus may be adaptively split or combined. Such splitting or combining of specific modules does not cause the technical solutions to deviate from the principle of the present invention, and therefore, the technical solutions after splitting or combining will fall within the protection scope of the present invention.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (12)

1. A method for analyzing a state of a distributed system, the method comprising:
acquiring state characteristics of a distributed system in a target scene;
inputting the state features of the distributed system into a preset decision tree model, wherein the decision tree model is trained using a sample state feature set of the distributed system and is used for analyzing whether the distributed system is in one of a plurality of states, and wherein, when the decision tree model is trained, for a first state and a second state among the plurality of states, a first gap between the state features related to the first state and the state features unrelated to the first state in the sample state feature set is calculated, and a second gap between the state features related to the second state and the state features unrelated to the second state in the sample state feature set is calculated;
if the first gap is larger than the second gap, controlling the decision tree model to first analyze whether the distributed system is in the first state and, when the distributed system is not in the first state, then analyze whether the distributed system is in the second state;
and outputting, by the decision tree model, the state of the distributed system.
2. The method according to claim 1, wherein the step of "calculating a first gap between the state features related to the first state and the state features unrelated to the first state in the sample state feature set, and calculating a second gap between the state features related to the second state and the state features unrelated to the second state in the sample state feature set" comprises:
calculating the first gap from the difference between the average value of each state feature related to the first state and the average value of each state feature unrelated to the first state, and calculating the second gap from the difference between the average value of each state feature related to the second state and the average value of each state feature unrelated to the second state.
3. The method for analyzing the state of the distributed system according to claim 1, further comprising, before the step of inputting the state feature of the distributed system into a preset decision tree model:
identifying states of the distributed system that will not occur in the target scene;
the step of inputting the state features of the distributed system into a preset decision tree model further comprises:
inputting information about the states that will not occur into the decision tree model, wherein the decision tree model does not analyze the states that will not occur during state analysis.
4. The method for analyzing the state of the distributed system according to claim 1, wherein the distributed system has a plurality of state features, and before the step of inputting the state features of the distributed system into a preset decision tree model, the method further comprises:
analyzing the service influence degree of the various state characteristics of the distributed system in the target scene, and scaling the various state characteristics of the distributed system according to the service influence degree.
5. The distributed system state analysis method according to claim 1,
the state characteristics of the distributed system comprise network connection basic characteristics, network content and traffic characteristics, traffic load balancing statistical characteristics, data consistency characteristics and high availability characteristics of the distributed system.
6. A distributed system state analysis apparatus, the apparatus comprising:
the state characteristic acquisition module is used for acquiring the state characteristics of the distributed system in a target scene;
a model input module, which inputs the state features of the distributed system into a preset decision tree model, wherein the decision tree model is trained using a sample state feature set of the distributed system and is used for analyzing whether the distributed system is in one of a plurality of states, and wherein, when the decision tree model is trained, for a first state and a second state among the plurality of states, a first gap between the state features related to the first state and the state features unrelated to the first state in the sample state feature set and a second gap between the state features related to the second state and the state features unrelated to the second state in the sample state feature set are calculated, and if the first gap is larger than the second gap, the decision tree model is controlled to first analyze whether the distributed system is in the first state and, when the distributed system is not in the first state, then analyze whether the distributed system is in the second state;
and the model output module outputs the state of the distributed system by the decision tree model.
7. The distributed system state analysis apparatus according to claim 6, wherein there are a plurality of state features of the distributed system, and when the decision tree model is trained, the first gap is calculated according to a difference between the average value of each state feature related to the first state and the average value of each state feature unrelated to the first state, and the second gap is calculated according to a difference between the average value of each state feature related to the second state and the average value of each state feature unrelated to the second state.
8. The distributed system state analysis device according to claim 6, further comprising:
a state identification module, which, before the state features of the distributed system are input into the preset decision tree model, identifies states of the distributed system that will not occur in the target scene;
wherein the model input module also inputs information about the states that will not occur into the decision tree model, and the decision tree model does not analyze the states that will not occur during state analysis.
9. The distributed system state analysis apparatus according to claim 6, wherein the distributed system has a plurality of state characteristics, and further comprising:
a state feature scaling module, which, before the state features of the distributed system are input into the preset decision tree model, analyzes the degree of business impact of the various state features of the distributed system in the target scene and scales the various state features of the distributed system according to the degree of business impact.
10. The distributed system state analysis apparatus according to claim 6,
the state characteristics of the distributed system comprise network connection basic characteristics, network content and traffic characteristics, traffic load balancing statistical characteristics, data consistency characteristics and high availability characteristics of the distributed system.
11. A control apparatus comprising a processor and a storage device, the storage device being adapted to store a plurality of program codes, wherein the program codes are adapted to be loaded and run by the processor to perform the distributed system state analysis method of any of claims 1 to 5.
12. A computer-readable storage medium having stored therein a plurality of program codes, characterized in that the program codes are adapted to be loaded and run by a processor to perform the distributed system state analysis method of any one of claims 1 to 5.
CN202111514847.8A 2021-12-10 2021-12-10 Distributed system state analysis method, device and computer readable storage medium Pending CN114185756A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111514847.8A CN114185756A (en) 2021-12-10 2021-12-10 Distributed system state analysis method, device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114185756A true CN114185756A (en) 2022-03-15

Family

ID=80604601

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination