CN115878415A - Cluster server intelligent fault prediction method, system, terminal and storage medium - Google Patents

Info

Publication number
CN115878415A
Authority
CN
China
Prior art keywords
state
data
equipment
cluster server
steps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211433292.9A
Other languages
Chinese (zh)
Inventor
张嘉谣
牛玉峰
陈亮甫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Chaoyue Shentai Information Technology Co Ltd
Original Assignee
Xian Chaoyue Shentai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Chaoyue Shentai Information Technology Co Ltd filed Critical Xian Chaoyue Shentai Information Technology Co Ltd
Priority to CN202211433292.9A priority Critical patent/CN115878415A/en
Publication of CN115878415A publication Critical patent/CN115878415A/en
Pending legal-status Critical Current

Abstract

The invention relates to the technical field of servers, in particular to an intelligent fault prediction method, an intelligent fault prediction system, a terminal and a storage medium for a cluster server. The method comprises the following steps: acquiring real-time information of equipment; analyzing the loss state data of the equipment according to the real-time information of the equipment, and analyzing the loss state data of the equipment to obtain the equipment state data; predicting equipment faults according to the equipment state data; the invention realizes the detection and evaluation of the condition of the component in the server and can find the fault in time.

Description

Cluster server intelligent fault prediction method, system, terminal and storage medium
Technical Field
The invention relates to the technical field of servers, in particular to an intelligent fault prediction method, an intelligent fault prediction system, a terminal and a storage medium for a cluster server.
Background
A server is a type of computer that runs faster, carries a heavier load, and costs more than an ordinary computer. It provides computing or application services to other clients in the network (terminals such as PCs, smartphones and ATMs, and even large equipment such as train systems). A server offers high-speed CPU computing capability, long-term reliable operation, strong external I/O data throughput, and good expandability.
In order to solve the technical problem, an intelligent fault prediction method, a system, a terminal and a storage medium for a cluster server are provided.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides an intelligent failure prediction method, an intelligent failure prediction system, a terminal and a storage medium for a cluster server.
In order to achieve the above purpose, the embodiment of the present invention provides the following technical solutions:
in a first aspect, in an embodiment provided by the present invention, a method for cluster server intelligent failure prediction is provided, the method including the following steps:
acquiring real-time information of equipment;
analyzing the loss state data of the equipment according to the real-time information of the equipment, and analyzing the loss state data of the equipment to obtain the equipment state data;
and predicting the equipment fault according to the equipment state data.
As a further scheme of the invention, the real-time information of the equipment comprises data such as vibration conditions, temperature changes, current stability and the like.
As a further scheme of the present invention, analyzing the loss state data of the equipment according to the real-time information of the equipment, and analyzing the loss state data to obtain the equipment state data, includes: analyzing the loss state data of the equipment through a BP (back-propagation) neural network prediction model to obtain the equipment state data.
As a further scheme of the invention, the construction step of the BP neural network prediction model comprises the following steps:
s201, constructing a BP neural network prediction model according to vibration conditions, temperature changes and current stability elements;
s202, establishing a sample data set according to { (vibration condition, temperature change, current stability) and fault type };
s203, normalizing the sample data set value by using a normalization formula to enable the sample data set value to be in a range from 0 to 1;
s204, inputting the sample data set into the constructed BP neural network prediction model, and outputting fault type data;
s205, judging a calculation error according to the fault type data, and adjusting the weight from a hidden layer to an output layer and the weight from an input layer to the hidden layer of the BP neural network prediction model;
and S206, repeating the steps S204-S205 until the error meets the set value.
As a further scheme of the present invention, determining the calculation error according to the fault type data and adjusting the hidden-layer-to-output-layer weights and the input-layer-to-hidden-layer weights of the BP neural network prediction model includes: calculating the error of the model by the least square method, and updating the weights layer by layer from back to front (from the output layer toward the input layer) by the gradient descent method.
As a further aspect of the present invention, the S30 predicting the device failure according to the device status data includes the following steps:
s301, generating a state transition probability distribution matrix of each associated component;
and S302, predicting the state of the associated component according to the Markov chain.
As a further aspect of the present invention, the generating S301 a state transition probability distribution matrix of each related component includes:
s3011, selecting a history period width T, and acquiring the state of a main component and the state of a related component in each unit time period;
s3012, defining all predicted states of the principal component and all states of the associated components; partitioning the history according to the state of the principal component; for each state of the principal component, counting the frequency with which the associated component transitions between states in adjacent time periods; and obtaining the predicted state probability distribution of the principal component and the state transition probability distribution of the associated components.
In a second aspect, in another embodiment provided by the present invention, a cluster server intelligent failure prediction system is provided, which includes: the device monitoring system comprises a device monitoring terminal 100, a device state analysis module 200 and a device state prediction module 300;
the device monitoring terminal 100 is configured to acquire real-time information of a device, where the real-time information of the device includes vibration conditions, temperature changes, and current stability data;
the device state analysis module 200 is configured to analyze the loss state data of the device according to the real-time information of the device, and analyze the loss state data of the device to obtain device state data;
the device status prediction module 300 is configured to predict a device fault according to the device status data.
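The division of labor among the three modules can be sketched as a simple composition. This is only an illustrative skeleton: the class name `FaultPredictionSystem`, the field names, and the stub functions are hypothetical and do not appear in the patent, which does not prescribe any particular interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Reading = Dict[str, float]  # hypothetical container for one real-time sample


@dataclass
class FaultPredictionSystem:
    """Hypothetical wiring of the three modules named in the second aspect."""
    acquire: Callable[[], Reading]         # device monitoring terminal 100
    analyze: Callable[[Reading], str]      # device state analysis module 200
    predict: Callable[[str], List[float]]  # device state prediction module 300

    def run_once(self) -> List[float]:
        # Acquire real-time information, derive a device state, predict faults.
        return self.predict(self.analyze(self.acquire()))


# Stub implementations (assumed values, for illustration only).
system = FaultPredictionSystem(
    acquire=lambda: {"vibration": 0.2, "temperature": 45.0, "current_stability": 0.95},
    analyze=lambda reading: "normal" if reading["temperature"] < 70.0 else "worn",
    predict=lambda state: [0.9, 0.1] if state == "normal" else [0.3, 0.7],
)
result = system.run_once()
```

In a real deployment the stubs would be replaced by the sensor interface, the BP neural network model, and the Markov-chain predictor described in the first aspect.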
In a third aspect, in a further embodiment provided by the present invention, a terminal is provided, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the cluster server intelligent failure prediction method when loading and executing the computer program.
In a fourth aspect, in a further embodiment provided by the present invention, a storage medium is provided, which stores a computer program that is loaded by a processor and executed to implement the steps of the cluster server intelligent failure prediction method.
The technical scheme provided by the invention has the following beneficial effects:
the invention provides a cluster server intelligent fault prediction method, a system, a terminal and a storage medium, wherein the method acquires real-time information of equipment; analyzing the loss state data of the equipment according to the real-time information of the equipment, and analyzing the loss state data of the equipment to obtain the equipment state data; predicting equipment faults according to the equipment state data; the invention realizes the detection and evaluation of the condition of the component in the server and can find the fault in time.
These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiments described hereinafter. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other embodiments can be obtained according to the drawings without creative efforts.
FIG. 1 is a flow chart of a cluster server intelligent failure prediction method according to an embodiment of the present invention;
fig. 2 is a flowchart of S30 in the cluster server intelligent failure prediction method according to an embodiment of the present invention;
fig. 3 is a flowchart of S302 in the method for cluster server intelligent failure prediction according to an embodiment of the present invention;
FIG. 4 is a diagram of a neural network model;
FIG. 5 is a block diagram of an embodiment of an intelligent cluster server failure prediction system;
fig. 6 is a block diagram of a terminal according to an embodiment of the present invention.
In the figure: the device monitoring system comprises a device monitoring terminal-100, a device state analysis module-200, a device state prediction module-300, a processor-401, a communication interface-402, a memory-403 and a communication bus-404.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
It is to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
Specifically, the embodiments of the present invention are further explained below with reference to the drawings.
Referring to fig. 1, fig. 1 is a flowchart of an intelligent failure prediction method for a cluster server according to an embodiment of the present invention, and as shown in fig. 1, the intelligent failure prediction method for a cluster server includes steps S10 to S30.
S10, acquiring real-time information of equipment;
in an embodiment of the present invention, the real-time information of the device includes data of vibration condition, temperature variation, current stability, and the like.
In an embodiment of the present invention, the acquiring the real-time information of the device includes performing noise reduction processing on the acquired real-time information of the device, so as to remove noise data.
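The patent does not say which noise reduction technique is used. As one possible illustration, the sketch below applies a sliding median filter, a common way to suppress isolated spikes in sensor readings. The function name `median_filter`, the window size, and the sample values are assumptions made for this example.

```python
from statistics import median


def median_filter(samples, window=3):
    """Replace each sample with the median of a centered window.

    Near the edges the window is truncated, so fewer samples are used.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    return [
        median(samples[max(0, i - half): i + half + 1])
        for i in range(len(samples))
    ]


# Hypothetical temperature readings with a single spurious spike (95.0).
temperature = [41.0, 41.2, 95.0, 41.1, 41.3]
smoothed = median_filter(temperature)
```

The spike at index 2 is replaced by a neighboring value, while the remaining readings stay close to their original level.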
S20, analyzing the loss state data of the equipment according to the real-time information of the equipment, and analyzing the loss state data of the equipment to obtain the equipment state data;
Specifically, the loss state data of the equipment is analyzed through a BP neural network prediction model to obtain the equipment state data.
In the embodiment of the present invention, the BP neural network prediction model constructing step includes the steps of:
s201, constructing a BP neural network prediction model according to vibration conditions, temperature changes and current stability elements;
s202, establishing a sample data set according to { (vibration condition, temperature change, current stability) and fault type };
s203, normalizing the numerical value by using a normalization formula to enable the range of the numerical value to be between 0 and 1;
s204, inputting the sample data set into the constructed BP neural network prediction model, and outputting fault type data;
s205, judging a calculation error according to the fault type data, and adjusting the weight from a hidden layer to an output layer and the weight from an input layer to the hidden layer of the BP neural network prediction model;
and S206, repeating the steps S204-S205 until the error meets the set value.
In the embodiment of the present invention, in step S201, a BP neural network prediction model is constructed according to the vibration condition, the temperature change, and the current stability element; the method comprises the following steps:
the variation range of each of the three elements (vibration condition, temperature change and current stability) is divided into segments, and the segments of each element form its variation range space: $V = \{V_1, V_2, \ldots, V_v\}$, $T = \{T_1, T_2, \ldots, T_t\}$, $E = \{E_1, E_2, \ldots, E_e\}$;
constructing the early fault symptom feature space $A = \{A_1, A_2, \ldots, A_m\}$ and the component fault feature space $B = \{B_1, B_2, \ldots, B_n\}$;
each element $A_i$ of the fault symptom feature space is a triplet $A_i = \{V_j, T_k, E_l\}$, $j \in \{1, 2, \ldots, v\}$, $k \in \{1, 2, \ldots, t\}$, $l \in \{1, 2, \ldots, e\}$;
collecting the mapping relation data set $\{A_i \to B_j \mid i \in \{1, 2, \ldots, m\},\ j \in \{1, 2, \ldots, n\}\}$, and normalizing the values in the data set with a normalization formula so that they range from 0 to 1.
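The segmentation and normalization described above can be sketched as follows. The patent does not state the normalization formula or the segment boundaries; min-max scaling and the boundary values below are assumptions chosen only for illustration, as are the helper names `segment_index`, `symptom`, and `min_max_normalize`.

```python
from itertools import product


def min_max_normalize(values):
    """Scale values linearly into [0, 1] (assumed normalization formula)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def segment_index(value, boundaries):
    """Return the 1-based index of the segment that contains value."""
    for idx, upper in enumerate(boundaries, start=1):
        if value <= upper:
            return idx
    return len(boundaries) + 1


# Hypothetical upper boundaries of the segments of each element.
VIB_BOUNDS = [0.5, 1.5]     # three vibration segments V_1..V_3
TEMP_BOUNDS = [50.0, 70.0]  # three temperature segments T_1..T_3
CUR_BOUNDS = [0.1]          # two current-stability segments E_1..E_2


def symptom(vibration, temperature, current_ripple):
    """Map one raw reading to a symptom triplet (j, k, l) of segment indices."""
    return (
        segment_index(vibration, VIB_BOUNDS),
        segment_index(temperature, TEMP_BOUNDS),
        segment_index(current_ripple, CUR_BOUNDS),
    )


# With these boundaries the symptom space contains 3 * 3 * 2 = 18 triplets.
all_symptoms = list(product(range(1, 4), range(1, 4), range(1, 3)))
```

Each triplet paired with an observed fault type would then form one entry of the mapping relation data set used for training.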
In an embodiment of the present invention, determining the calculation error according to the fault type data and adjusting the hidden-layer-to-output-layer weights and the input-layer-to-hidden-layer weights of the BP neural network prediction model includes: calculating the error of the model by the least square method, and updating the weights layer by layer from back to front (from the output layer toward the input layer) by the gradient descent method.
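Steps S204 to S206 can be illustrated with a minimal back-propagation network written in plain Python. This is a sketch under stated assumptions, not the patented model: the single hidden layer, sigmoid activations, hidden-layer size, learning rate, tolerance, and the four training samples are all hypothetical choices.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_ih, w_ho):
    """Return hidden and output activations; a constant 1.0 acts as bias input."""
    xb = x + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_ih]
    hb = hidden + [1.0]
    output = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w_ho]
    return hidden, output


def train(samples, n_hidden=4, lr=0.5, tol=0.01, max_epochs=5000, seed=0):
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    w_ih = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_ho = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    history = []
    for _ in range(max_epochs):
        sse = 0.0
        for x, target in samples:
            hidden, output = forward(x, w_ih, w_ho)       # S204: forward pass
            err = [t - o for t, o in zip(target, output)]
            sse += 0.5 * sum(e * e for e in err)           # squared-error criterion
            # S205: deltas for the output layer first, then the hidden layer.
            d_out = [e * o * (1 - o) for e, o in zip(err, output)]
            d_hid = [
                h * (1 - h) * sum(d_out[k] * w_ho[k][j] for k in range(n_out))
                for j, h in enumerate(hidden)
            ]
            hb, xb = hidden + [1.0], x + [1.0]
            for k in range(n_out):                         # hidden -> output weights
                for j in range(n_hidden + 1):
                    w_ho[k][j] += lr * d_out[k] * hb[j]
            for j in range(n_hidden):                      # input -> hidden weights
                for i in range(n_in + 1):
                    w_ih[j][i] += lr * d_hid[j] * xb[i]
        history.append(sse)
        if sse < tol:                                      # S206: stop at the set value
            break
    return w_ih, w_ho, history


# Hypothetical normalized samples: (vibration, temperature, current stability)
# mapped to a one-hot fault type [no fault, fault].
samples = [
    ([0.1, 0.2, 0.1], [1.0, 0.0]),
    ([0.2, 0.1, 0.2], [1.0, 0.0]),
    ([0.9, 0.8, 0.9], [0.0, 1.0]),
    ([0.8, 0.9, 0.7], [0.0, 1.0]),
]
w_ih, w_ho, history = train(samples)
_, prediction = forward([0.85, 0.85, 0.8], w_ih, w_ho)
```

The hidden-layer deltas are computed before the output weights are changed, so both weight updates use the gradient of the same forward pass.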
And S30, predicting the equipment fault according to the equipment state data.
In the embodiment of the present invention, the S30 predicting the device failure according to the device status data includes the following steps:
s301, generating a state transition probability distribution matrix of each associated component;
and S302, predicting the state of the associated component according to the Markov chain.
S301, the generation of the state transition probability distribution matrix of each associated component comprises the following steps:
s3011, selecting a history period width T, and acquiring the state of a main component and the state of a related component in each unit time period;
s3012, defining all predicted states of the principal component and all states of the associated components; partitioning the history according to the state of the principal component; for each state of the principal component, counting the frequency with which the associated component transitions between states in adjacent time periods; and obtaining the predicted state probability distribution of the principal component and the state transition probability distribution of the associated components.
Illustratively, assume that all predicted states of the principal component a are $E = \{E_1, E_2, E_3, E_4\}$, where $E_1$ represents the no-fault state, $E_2$ a mild fault state, $E_3$ a moderate fault state, and $E_4$ a severe fault state; all states of the associated component b are $Q = \{Q_1, Q_2, \ldots, Q_4\}$, where $Q_1$ represents the no-fault state, $Q_2$ a mild fault state, $Q_3$ a moderate fault state, and $Q_4$ a severe fault state. The history is partitioned by the state of the principal component, and for each state of the principal component the frequency of state transitions of the associated component b between adjacent time periods is counted. In state $E_i$, the transition distribution of the associated component b is:
[Equation image BDA0003945864240000071; not reproduced in the text]
The state transition probability distribution of the associated component is then:
[Equation image BDA0003945864240000081; not reproduced in the text]
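Because the equation images are not available, the exact form of the transition distribution cannot be confirmed. One plausible reading of S3011 and S3012 is sketched below: transitions of the associated component are counted separately for each principal-component state, and each row of counts is divided by its total. The function name, the 0-based state indices, and the choice to condition on the principal state at the start of each transition are assumptions.

```python
def transition_matrices(principal_states, associated_states, n_principal, n_associated):
    """Estimate one transition matrix P_i per principal-component state E_i.

    Both histories are equally long lists of 0-based state indices, one entry
    per unit time period. Rows without observations are left as all zeros.
    """
    counts = [
        [[0] * n_associated for _ in range(n_associated)]
        for _ in range(n_principal)
    ]
    for t in range(len(associated_states) - 1):
        i = principal_states[t]                       # principal state at period t
        j, k = associated_states[t], associated_states[t + 1]
        counts[i][j][k] += 1                          # transition Q_j -> Q_k under E_i
    matrices = []
    for per_state in counts:
        matrix = []
        for row in per_state:
            total = sum(row)
            matrix.append([c / total for c in row] if total else [0.0] * n_associated)
        matrices.append(matrix)
    return matrices


# Hypothetical histories over seven unit time periods.
principal = [0, 0, 0, 1, 1, 1, 1]
associated = [0, 0, 1, 1, 2, 2, 3]
P = transition_matrices(principal, associated, n_principal=4, n_associated=4)
```

In this toy history, `P[0][0]` is `[0.5, 0.5, 0.0, 0.0]`: while the principal component was fault-free, the associated component in state $Q_1$ stayed there once and moved to $Q_2$ once.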
In the embodiment of the present invention, S302, performing state prediction on the associated component according to the markov chain, includes the following steps:
and S3021, acquiring the predicted state probability distribution of the main component a and the current state of the associated component b.
And S3022, calculating, according to the Markov chain, the probability distribution of the next state of the associated component b under each state of the principal component a. Assume that all predicted states of the principal component a are $E = \{E_1, E_2, E_3, E_4\}$ and all states of the associated component b are $Q = \{Q_1, Q_2, \ldots, Q_4\}$.
The initial predicted state distribution of the principal component a is set to $W_0 = [w_{01}, w_{02}, \ldots, w_{0i}, \ldots, w_{0m}]$, where $w_{0i}$ is the probability that the predicted state of the principal component is $E_i$ at time $t = 0$, and where
[Equation image BDA0003945864240000082; not reproduced in the text]
Taking the predicted state $E_i$, $i \in \{1, 2, \ldots, m\}$, of the principal component a as an example, an arbitrary time point is selected as the start and the state at that time point is taken as the initial state. Set $U_0 = \{0, \ldots, 1, \ldots, 0\}$, where $U_0$ is a $1 \times n$ unit row vector: if the $p$-th component is 1 and the other components are 0, the initial state of the system is state $p$. The state probability at the next moment, $U_{i1}$, is then:
$U_{i1} = U_0 P_i = [p_i(1), p_i(2), \ldots, p_i(k), \ldots, p_i(n)]$
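The one-step update $U_{i1} = U_0 P_i$ is an ordinary row-vector times matrix product. The short sketch below works it through with a made-up 4 x 4 matrix; the numeric values of `P_i` and the function name are assumptions for illustration only.

```python
def next_state_distribution(u0, p_i):
    """Multiply the row vector u0 by the transition matrix p_i."""
    n = len(u0)
    return [sum(u0[j] * p_i[j][k] for j in range(n)) for k in range(n)]


# Hypothetical transition matrix of associated component b under state E_i.
P_i = [
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.6, 0.3],
    [0.0, 0.0, 0.2, 0.8],
]
U_0 = [0, 1, 0, 0]  # component b is currently in its second state (p = 2)
U_i1 = next_state_distribution(U_0, P_i)
```

Because $U_0$ is a unit vector, the product simply selects row $p$ of $P_i$, which is why the result can be written as $[p_i(1), \ldots, p_i(n)]$.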
s3023, calculating the probability distribution of the next state of the associated component b according to the predicted probability distribution of the principal component a:
[Equation image BDA0003945864240000083; not reproduced in the text]
and S3024, summarizing to obtain the loss condition prediction of each component such as the principal component, the related component and the like.
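The formula for S3023 is only available as an image, so its exact form is not known here. A natural reading, assumed in the sketch below, is the law of total probability: each conditional next-state distribution $U_{i1}$ is weighted by the probability $w_{0i}$ that the principal component is in state $E_i$. The function name and all numeric values are hypothetical.

```python
def combine_predictions(w0, u_per_state):
    """Weight each conditional distribution U_i1 by w_0i and sum them.

    w0[i] is the predicted probability that the principal component is in
    state E_i; u_per_state[i] is the next-state distribution of the associated
    component given E_i.
    """
    n = len(u_per_state[0])
    return [sum(w * u[k] for w, u in zip(w0, u_per_state)) for k in range(n)]


# Hypothetical inputs: principal component a is equally likely to be in E_1 or E_2.
W_0 = [0.5, 0.5, 0.0, 0.0]
U_next = [
    [1.0, 0.0, 0.0, 0.0],  # given E_1, component b stays fault-free
    [0.0, 1.0, 0.0, 0.0],  # given E_2, component b moves to a mild fault
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
U_1 = combine_predictions(W_0, U_next)
```

If $W_0$ and every $U_{i1}$ each sum to 1, the combined vector also sums to 1, so it can be read directly as the predicted state distribution of component b.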
The invention combines a neural network algorithm with a Markov chain to assess the state of equipment components, so that the system can detect and evaluate the condition of the components regardless of whether component loss has changed the parameters of the equipment.
It should be understood that although the steps are described above in a certain order, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and described and may be performed in other orders. Moreover, some steps of the present embodiment may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, referring to fig. 5, an intelligent cluster server failure prediction system is further provided, where the system includes a device monitoring terminal 100, a device state analysis module 200, and a device state prediction module 300.
The device monitoring terminal 100 is configured to obtain real-time information of a device, where the real-time information of the device includes data such as a vibration condition, a temperature change, and a current stability.
In an embodiment of the present invention, the acquiring the real-time information of the device includes performing noise reduction processing on the acquired real-time information of the device, so as to remove noise data.
The device status analysis module 200 is configured to analyze the loss status data of the device according to the real-time information of the device, and analyze the loss status data of the device to obtain the device status data.
The device state analysis module 200 analyzes the loss state data through a BP neural network prediction model to obtain the device state data.
As shown in fig. 4, the BP neural network prediction model constructing step includes the following steps:
s201, constructing a BP neural network prediction model according to vibration conditions, temperature changes and current stability elements;
s202, establishing a sample data set according to { (vibration condition, temperature change, current stability) and fault type };
s203, normalizing the numerical value of the sample data set by using a normalization formula to enable the range of the numerical value to be 0-1;
s204, inputting the sample data set into the constructed BP neural network prediction model, and outputting fault type data;
s205, judging a calculation error according to the fault type data, and adjusting the weight from a hidden layer to an output layer and the weight from an input layer to the hidden layer of the BP neural network prediction model;
and S206, repeating the steps S204-S205 until the error meets the set value.
S201, constructing a BP neural network prediction model by using elements of vibration condition, temperature change and current stability; the method comprises the following steps:
the variation range of each of the three elements (vibration condition, temperature change and current stability) is divided into segments, and the segments of each element form its variation range space: $V = \{V_1, V_2, \ldots, V_v\}$, $T = \{T_1, T_2, \ldots, T_t\}$, $E = \{E_1, E_2, \ldots, E_e\}$;
constructing the early fault symptom feature space $A = \{A_1, A_2, \ldots, A_m\}$ and the component fault feature space $B = \{B_1, B_2, \ldots, B_n\}$;
each element $A_i$ of the fault symptom feature space is a triplet $A_i = \{V_j, T_k, E_l\}$, $j \in \{1, 2, \ldots, v\}$, $k \in \{1, 2, \ldots, t\}$, $l \in \{1, 2, \ldots, e\}$;
collecting the mapping relation data set $\{A_i \to B_j \mid i \in \{1, 2, \ldots, m\},\ j \in \{1, 2, \ldots, n\}\}$, and normalizing the values in the data set with a normalization formula so that they range from 0 to 1.
Determining the calculation error according to the fault type data and adjusting the hidden-layer-to-output-layer weights and the input-layer-to-hidden-layer weights of the BP neural network prediction model includes: calculating the error of the model by the least square method, and updating the weights layer by layer from back to front (from the output layer toward the input layer) by the gradient descent method.
The device status prediction module 300 is configured to predict a device fault according to the device status data.
In the embodiment of the present invention, the S30 predicting the device failure according to the device status data includes the following steps:
s301, generating a state transition probability distribution matrix of each associated component;
and S302, predicting the state of the associated component according to the Markov chain.
S301, the generation of the state transition probability distribution matrix of each associated component comprises the following steps:
s3011, selecting a history period width T, and acquiring the state of a main component and the state of a related component in each unit time period;
s3012, defining all predicted states of the principal component and all states of the associated components; partitioning the history according to the state of the principal component; for each state of the principal component, counting the frequency with which the associated component transitions between states in adjacent time periods; and obtaining the predicted state probability distribution of the principal component and the state transition probability distribution of the associated components.
Illustratively, assume that all predicted states of the principal component a are $E = \{E_1, E_2, E_3, E_4\}$, where $E_1$ represents the no-fault state, $E_2$ a mild fault state, $E_3$ a moderate fault state, and $E_4$ a severe fault state; all states of the associated component b are $Q = \{Q_1, Q_2, \ldots, Q_4\}$, where $Q_1$ represents the no-fault state, $Q_2$ a mild fault state, $Q_3$ a moderate fault state, and $Q_4$ a severe fault state. The history is partitioned by the state of the principal component, and for each state of the principal component the frequency of state transitions of the associated component b between adjacent time periods is counted. In state $E_i$, the transition distribution of the associated component b is:
[Equation image BDA0003945864240000121; not reproduced in the text]
The state transition probability distribution of the associated component is then:
[Equation image BDA0003945864240000122; not reproduced in the text]
In the embodiment of the present invention, the S302, performing state prediction on the associated component according to the markov chain, includes the following steps:
and S3021, acquiring the predicted state probability distribution of the main component a and the current state of the associated component b.
And S3022, calculating, according to the Markov chain, the probability distribution of the next state of the associated component b under each state of the principal component a. Assume that all predicted states of the principal component a are $E = \{E_1, E_2, E_3, E_4\}$ and all states of the associated component b are $Q = \{Q_1, Q_2, \ldots, Q_4\}$.
The initial predicted state distribution of the principal component a is set to $W_0 = [w_{01}, w_{02}, \ldots, w_{0i}, \ldots, w_{0m}]$, where $w_{0i}$ is the probability that the predicted state of the principal component is $E_i$ at time $t = 0$, and where
[Equation image BDA0003945864240000123; not reproduced in the text]
Taking the predicted state $E_i$, $i \in \{1, 2, \ldots, m\}$, of the principal component a as an example, an arbitrary time point is selected as the start and the state at that time point is taken as the initial state. Set $U_0 = \{0, \ldots, 1, \ldots, 0\}$, where $U_0$ is a $1 \times n$ unit row vector: if the $p$-th component is 1 and the other components are 0, the initial state of the system is state $p$. The state probability at the next moment, $U_{i1}$, is then:
$U_{i1} = U_0 P_i = [p_i(1), p_i(2), \ldots, p_i(k), \ldots, p_i(n)]$
and S3023, calculating the probability distribution of the next state of the associated component b according to the predicted probability distribution of the principal component a:
[Equation image BDA0003945864240000131; not reproduced in the text]
And S3024, summarizing and obtaining the loss condition prediction of each component such as the main component a and the related component b.
In one embodiment, referring to fig. 6, a terminal is further provided, which includes a processor 401, a communication interface 402, a memory 403, and a communication bus 404, where the processor 401, the communication interface 402, and the memory 403 communicate with each other through the communication bus 404.
A memory 403 for storing a computer program;
the processor 401 is configured to implement the steps of the cluster server intelligent failure prediction method in the foregoing method embodiments when executing the computer program stored in the memory 403.
The communication bus mentioned in the above terminal may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the terminal and other equipment.
The Memory may include a Random Access Memory (RAM), and may also include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; the device can also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a discrete Gate or transistor logic device, or a discrete hardware component.
The terminal comprises user equipment and network equipment. The user equipment includes, but is not limited to, computers, smart phones, PDAs, etc.; the network equipment includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud based on cloud computing and consisting of a large number of computers or network servers, where cloud computing is a form of distributed computing in which a collection of loosely coupled computers forms one super virtual computer. The terminal can operate alone to realize the invention, or can access a network and realize the invention through interaction with other terminals in the network. The network where the terminal is located includes, but is not limited to, the internet, a wide area network, a metropolitan area network, a local area network, a VPN network, and the like.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
In an embodiment of the invention, a storage medium is also provided, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware; the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database or another medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The serial numbers of the embodiments of the present invention are merely for description and do not represent the relative merits of the embodiments.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples. Within the idea of the embodiments of the invention, technical features of the above embodiment or of different embodiments may also be combined, and there are many other variations of the different aspects of the embodiments as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.

Claims (10)

1. An intelligent failure prediction method for a cluster server is characterized by comprising the following steps: acquiring real-time information of equipment;
analyzing the loss state data of the equipment according to the real-time information of the equipment, and analyzing the loss state data of the equipment to obtain the equipment state data;
and predicting the equipment fault according to the equipment state data.
2. The intelligent cluster server failure prediction method of claim 1, wherein the real-time information of the device includes vibration conditions, temperature changes, current stability, and other data.
3. The intelligent cluster server fault prediction method of claim 2, wherein analyzing the loss state data of the equipment according to the real-time information of the equipment, and analyzing the loss state data of the equipment to obtain the equipment state data, comprises: analyzing the loss state data of the equipment through a BP neural network prediction model to obtain the equipment state data.
4. The intelligent cluster server failure prediction method of claim 3 wherein the BP neural network prediction model construction step comprises the steps of:
S201, constructing a BP neural network prediction model according to vibration conditions, temperature changes and current stability elements;
S202, establishing a sample data set according to vibration conditions, temperature changes, current stability and fault types;
S203, normalizing the sample data set values by using a normalization formula so that they lie in the range from 0 to 1;
S204, inputting the sample data set into the constructed BP neural network prediction model, and outputting fault type data;
S205, judging a calculation error according to the fault type data, and adjusting the weights from the hidden layer to the output layer and from the input layer to the hidden layer of the BP neural network prediction model;
S206, repeating steps S204-S205 until the error meets the set value.
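Step S203 does not state which normalization formula is used. As a minimal sketch, assuming the common min-max formula x' = (x - min) / (max - min) applied per feature, the scaling to the range 0 to 1 could look as follows. The function name `min_max_normalize` and the sample readings are hypothetical and are not taken from the claims.

```python
import numpy as np

def min_max_normalize(samples):
    """Scale every feature column of `samples` into the range [0, 1]."""
    samples = np.asarray(samples, dtype=float)
    col_min = samples.min(axis=0)
    col_max = samples.max(axis=0)
    span = col_max - col_min
    # A constant column has zero span; divide by 1 instead so it maps to 0.
    span[span == 0] = 1.0
    return (samples - col_min) / span

# Hypothetical readings; columns: vibration, temperature, current.
readings = np.array([
    [0.2, 35.0, 1.10],
    [0.8, 60.0, 1.30],
    [0.5, 47.5, 1.20],
])
normalized = min_max_normalize(readings)
```

Scaling each feature separately keeps a large-valued quantity such as temperature from dominating the smaller-valued vibration and current inputs of the network.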
5. The intelligent failure prediction method of cluster server according to claim 4, wherein the determining the calculation error according to the failure type data and adjusting the weights from the hidden layer to the output layer and from the input layer to the hidden layer of the BP neural network prediction model comprises: calculating the error of the model by a least square method, and updating the weights sequentially from back to front by a gradient descent method.
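The training loop of steps S204 to S206, with the least-squares error and back-to-front gradient descent update of claim 5, can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: one hidden layer, sigmoid activations, full-batch updates, and all names (`train_bp`, `forward`, `n_hidden`, `lr`, `tol`) and sample values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(weights, x):
    """Forward pass (S204): input layer -> hidden layer -> output layer."""
    w1, b1, w2, b2 = weights
    hidden = sigmoid(x @ w1 + b1)
    output = sigmoid(hidden @ w2 + b2)
    return hidden, output

def train_bp(x, y, n_hidden=4, lr=0.5, tol=1e-3, max_epochs=10000, seed=0):
    """Train a one-hidden-layer BP network; returns weights and error history."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, size=(x.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.5, size=(n_hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    history = []
    n = x.shape[0]
    for _ in range(max_epochs):
        hidden, output = forward((w1, b1, w2, b2), x)
        # S205: least-squares error between predicted and labelled fault type.
        error = 0.5 * np.mean(np.sum((output - y) ** 2, axis=1))
        history.append(error)
        if error <= tol:  # S206: stop once the error meets the set value.
            break
        # Back-propagate: output-layer deltas first, then hidden-layer deltas.
        delta_out = (output - y) * output * (1.0 - output)
        delta_hid = (delta_out @ w2.T) * hidden * (1.0 - hidden)
        # Gradient descent, hidden->output weights first, then input->hidden.
        w2 -= lr * (hidden.T @ delta_out) / n
        b2 -= lr * delta_out.mean(axis=0)
        w1 -= lr * (x.T @ delta_hid) / n
        b1 -= lr * delta_hid.mean(axis=0)
    return (w1, b1, w2, b2), history

# Hypothetical normalized samples; columns: vibration, temperature, current.
x = np.array([[0.10, 0.20, 0.15], [0.20, 0.10, 0.10], [0.15, 0.25, 0.20],
              [0.90, 0.80, 0.85], [0.80, 0.95, 0.90], [0.85, 0.90, 0.80]])
# One-hot fault type labels: column 0 = normal, column 1 = fault.
y = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

weights, history = train_bp(x, y)
predicted = forward(weights, x)[1].argmax(axis=1)
```

The hidden-layer deltas are computed before the output weights are changed, so both weight updates use the gradient of the same forward pass.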
6. The intelligent cluster server failure prediction method of claim 1, wherein predicting device failure based on device status data comprises the steps of:
S301, generating a state transition probability distribution matrix of each associated component;
S302, predicting the state of the associated component according to the Markov chain.
7. The intelligent cluster server fault prediction method of claim 6, wherein the step S301 of generating the state transition probability distribution matrix of each associated component comprises the following steps:
S3011, selecting a history period width T, and acquiring the state of a main component and the state of an associated component in each unit time period;
S3012, setting all predicted states of the main component and all states of the associated components, grouping the data according to the state of the main component, calculating, for each state of the main component, the frequency of state transitions of the associated components between adjacent time periods, and obtaining the predicted state probability distribution of the main component and the state transition probability distribution of the associated components.
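Steps S3011 to S302 can be illustrated with a simplified sketch: transitions of an associated component between adjacent time periods are counted separately for each state of the main component, the counts are converted to row-normalized probabilities, and the next state is forecast by a Markov chain step. The state encoding, the uniform fallback for unobserved rows, and the names `transition_matrices` and `predict_assoc_state` are assumptions made for illustration only.

```python
import numpy as np

def transition_matrices(main_states, assoc_states, n_main, n_assoc):
    """Estimate one associated-component transition matrix per main state."""
    counts = np.zeros((n_main, n_assoc, n_assoc))
    for t in range(len(assoc_states) - 1):
        # Transition between adjacent periods, grouped by the main state at t.
        counts[main_states[t], assoc_states[t], assoc_states[t + 1]] += 1
    totals = counts.sum(axis=2, keepdims=True)
    safe_totals = np.where(totals > 0, totals, 1.0)
    # Rows with no observed transitions fall back to a uniform distribution.
    return np.where(totals > 0, counts / safe_totals, 1.0 / n_assoc)

def predict_assoc_state(probs, main_state, assoc_state, steps=1):
    """Markov chain forecast of the associated component's state distribution."""
    start = np.zeros(probs.shape[1])
    start[assoc_state] = 1.0
    return start @ np.linalg.matrix_power(probs[main_state], steps)

# Hypothetical history of width T = 10 unit time periods.
# Main component: 0 = normal, 1 = abnormal.
# Associated component: 0 = normal, 1 = degraded, 2 = fault.
main_history = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1]
assoc_history = [0, 0, 1, 1, 2, 0, 0, 1, 2, 2]

probs = transition_matrices(main_history, assoc_history, n_main=2, n_assoc=3)
forecast = predict_assoc_state(probs, main_state=1, assoc_state=1)
```

In this toy history, a degraded associated component always moved to the fault state while the main component was abnormal, so `forecast` puts all probability on state 2.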
8. An intelligent cluster server failure prediction system, comprising: a device monitoring terminal 100, a device state analysis module 200 and a device state prediction module 300;
the device monitoring terminal 100 is configured to acquire real-time information of a device, where the real-time information of the device includes vibration conditions, temperature changes, and current stability data;
the device state analysis module 200 is configured to analyze the loss state data of the device according to the real-time information of the device, and analyze the loss state data of the device to obtain device state data;
the device status prediction module 300 is configured to predict a device fault according to the device status data.
9. A terminal comprising a memory storing a computer program and a processor implementing the steps of the cluster server intelligent failure prediction method according to any one of claims 1 to 7 when the computer program is loaded and executed.
10. A storage medium storing a computer program which, when loaded and executed by a processor, carries out the steps of the cluster server intelligent failure prediction method according to any one of claims 1-7.
CN202211433292.9A 2022-11-16 2022-11-16 Cluster server intelligent fault prediction method, system, terminal and storage medium Pending CN115878415A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211433292.9A CN115878415A (en) 2022-11-16 2022-11-16 Cluster server intelligent fault prediction method, system, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211433292.9A CN115878415A (en) 2022-11-16 2022-11-16 Cluster server intelligent fault prediction method, system, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN115878415A (en) 2023-03-31

Family

ID=85759995

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211433292.9A Pending CN115878415A (en) 2022-11-16 2022-11-16 Cluster server intelligent fault prediction method, system, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN115878415A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination