CN113835962A - Server fault detection method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN113835962A
Authority
CN
China
Prior art keywords
server
current
fault
operation data
prediction model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111121516.8A
Other languages
Chinese (zh)
Inventor
杨柳
赖一鹏
刘毅枫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chaoyue Technology Co Ltd
Original Assignee
Chaoyue Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chaoyue Technology Co Ltd
Priority to CN202111121516.8A
Publication of CN113835962A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/24 Resetting means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a server fault detection method and device, computer equipment, and a storage medium. The method comprises: acquiring current operation data of a server; preprocessing the current operation data to obtain preprocessed operation data; inputting the preprocessed operation data into a pre-trained prediction model, wherein the pre-trained prediction model is obtained by training based on a random forest algorithm and represents the correspondence between operation data and fault types; and determining, according to the output of the pre-trained prediction model, whether the current server has a fault and, if so, the fault type. The scheme not only determines whether the server has a fault but also diagnoses the fault type when a fault exists, offers good accuracy and fault tolerance, greatly facilitates server operation and maintenance, and improves the stability of server operation.

Description

Server fault detection method and device, computer equipment and storage medium
Technical Field
The present invention relates to the field of server technologies, and in particular, to a server fault detection method and apparatus, a computer device, and a storage medium.
Background
A server is a computer with fast operation, high load capacity, and strong performance. Long-time operation is an important performance index of a server, and monitoring the operation state of the server is an important means of ensuring its long-term reliable operation. Once the server cannot operate normally, operations such as a reset need to be performed on it by means of a server remote controller (such as a Baseboard Management Controller, BMC for short).
At present, conventional server fault monitoring commonly sets an early warning value for each monitored parameter; if an operation parameter of the server is found to exceed its early warning value, the server is deemed to have failed. Alternatively, statistical analysis is performed on the operation data of the server to obtain an operation state evaluation result. However, existing server monitoring methods have the following disadvantages. First, the fault type cannot be accurately determined: only whether a fault has occurred can be judged, while the fault type, the faulty device, and the cause cannot be located, which makes fault repair and maintenance of the server very inconvenient. Second, the fault tolerance is low: when operation data are missing or briefly abnormal, these methods misjudge or cannot determine whether a fault exists. Third, the amount of data to process is large and the processing is time-consuming, which increases the cost of fault monitoring. Therefore, conventional server fault monitoring methods need to be improved.
Disclosure of Invention
In view of the above, it is desirable to provide a server failure detection method, device, computer device and storage medium.
According to a first aspect of the present invention, there is provided a server failure detection method, the method comprising:
acquiring current operation data of a server;
preprocessing the current operation data to obtain preprocessed operation data;
inputting the preprocessed operation data into a pre-trained prediction model, wherein the pre-trained prediction model is obtained based on random forest algorithm training and is used for representing the corresponding relation between the operation data and the fault type;
and determining whether the current server fails or not and the fault type of the current server according to the output of the pre-trained prediction model.
In some embodiments, the method further comprises:
in response to the current server having a fault, generating an alarm record based on the fault type of the fault;
and displaying the alarm record through a Web page.
In some embodiments, the method further comprises:
in response to the current server having a fault, monitoring the faulty server to determine whether it is down;
and in response to the faulty server being down, resetting the faulty server.
In some embodiments, the current operating data includes at least one of: the voltage of at least one component on the mainboard, the current of at least one component on the mainboard, and the memory utilization rate of the central processing unit.
In some embodiments, the step of preprocessing the current operation data to obtain preprocessed operation data includes:
and carrying out normalization processing on the current operation data, and taking the data after the normalization processing as the operation data after the preprocessing.
In some embodiments, the method further comprises:
constructing a sample set from historical operation data of the server that have been labeled in advance with fault type labels;
and configuring the number of decision trees, the depth of each decision tree, the number of features used by each node, the iteration termination conditions, the minimum number of samples on each node, and the minimum information gain on each node.
In some embodiments, the pre-trained predictive model is obtained by a random forest training process and a random forest testing process;
the training process of the random forest comprises: drawing training samples from the constructed sample set with replacement, randomly selecting a root node, and training from the root node with the training sample set until all nodes have been trained, so as to obtain a prediction model with the required parameters;
the testing process of the random forest comprises: inputting test samples into the prediction model with the required parameters, evaluating the output of the model using the Gini index, and adjusting the parameters of the prediction model based on the Gini value to obtain the pre-trained prediction model.
According to a second aspect of the present invention, there is provided a server failure detection apparatus, the apparatus comprising:
the data acquisition module is configured to acquire current operating data of the server;
the preprocessing module is configured to preprocess the current operating data to obtain preprocessed operating data;
the prediction module is configured to input the preprocessed operation data into a pre-trained prediction model, wherein the pre-trained prediction model is obtained based on random forest algorithm training and is used for representing the corresponding relation between the operation data and the fault type;
and the fault determining module is configured to determine whether the current server has a fault and the fault type of the fault according to the output of the pre-trained prediction model.
According to a third aspect of the present invention, there is also provided a computer apparatus comprising:
at least one processor; and
a memory storing a computer program executable on the processor, wherein the processor performs the aforementioned server failure detection method when executing the program.
According to a fourth aspect of the present invention, there is also provided a computer-readable storage medium storing a computer program which, when executed by a processor, performs the aforementioned server failure detection method.
According to the server fault detection method, the operation data of the server are acquired online while services are running, preprocessed, and then input into the pre-trained prediction model obtained by training based on the random forest algorithm; whether the current server has a fault, and the fault type, are determined according to the output of the pre-trained prediction model. The method thus not only determines whether the server has a fault but also diagnoses the fault type when a fault exists, offers good accuracy and fault tolerance, greatly facilitates server operation and maintenance, and improves the operation stability of the server.
In addition, the invention also provides a server fault detection device, a computer device and a computer readable storage medium, which can also achieve the technical effects and are not described herein again.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other embodiments from these drawings without creative effort.
Fig. 1 is a flowchart of a server failure detection method 100 according to an embodiment of the present invention;
Fig. 2 is a flowchart of another server failure monitoring method 200 according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a server failure detection apparatus 300 according to another embodiment of the present invention;
Fig. 4 is an internal structural diagram of a computer device according to another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two different entities or two different parameters that share the same name. "First" and "second" are used merely for convenience of description and should not be construed as limiting the embodiments of the present invention; this is not explained again in the following embodiments.
In one embodiment, referring to fig. 1, the present invention provides a server failure detection method 100, where the method 100 includes the following steps:
S101, acquiring current operation data of the server.
In a specific implementation, the ways of acquiring the operation data of the server include, but are not limited to, acquisition through server management software, sensors, or a baseboard management controller. The operation data may be the memory occupancy rate of the CPU, or the voltage, current, and the like of a chip or functional module on the mainboard. Preferably, the current operation data includes at least one of: the voltage of at least one component on the mainboard, the current of at least one component on the mainboard, and the memory utilization rate of the central processing unit.
S102, preprocessing the current operation data to obtain preprocessed operation data.
S103, inputting the preprocessed operation data into a pre-trained prediction model, wherein the pre-trained prediction model is obtained by training based on a random forest algorithm and represents the correspondence between operation data and fault types.
S104, determining, according to the output of the pre-trained prediction model, whether the current server has a fault and, if so, the fault type.
According to the server fault detection method, the operation data of the server are acquired online while services are running, preprocessed, and then input into the pre-trained prediction model obtained by training based on the random forest algorithm; whether the current server has a fault, and the fault type, are determined according to the output of the pre-trained prediction model. The method thus not only determines whether the server has a fault but also diagnoses the fault type when a fault exists, offers good accuracy and fault tolerance, greatly facilitates server operation and maintenance, and improves the operation stability of the server.
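Steps S101 to S104 can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the patented implementation: the function names, the feature order, the value ranges, the label codes, and the stand-in model are all hypothetical, and a real deployment would plug in a trained random forest in place of `toy_model`.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical label convention: 0 means "no fault"; other codes name a fault type.
FAULT_NAMES: Dict[int, str] = {0: "no fault", 1: "CPU fault", 2: "memory fault"}


def detect_fault(
    read_operation_data: Callable[[], Sequence[float]],      # S101
    preprocess: Callable[[Sequence[float]], List[float]],    # S102
    predict: Callable[[List[float]], int],                   # S103
) -> Tuple[bool, Optional[str]]:
    """Run S101 to S104 once and return (has_fault, fault_type)."""
    raw = read_operation_data()
    features = preprocess(raw)
    label = predict(features)
    if label == 0:                                           # S104
        return False, None
    return True, FAULT_NAMES.get(label, f"unknown fault {label}")


def read_fake_sensors() -> List[float]:
    # Made-up readings: voltage (V), current (A), memory utilization (fraction).
    return [12.1, 1.8, 0.97]


def scale_to_unit(raw: Sequence[float]) -> List[float]:
    # Assumed physical range of each feature, used for min-max scaling.
    ranges = [(0.0, 15.0), (0.0, 5.0), (0.0, 1.0)]
    return [(x - lo) / (hi - lo) for x, (lo, hi) in zip(raw, ranges)]


def toy_model(features: List[float]) -> int:
    # Placeholder for the trained random forest: reports a memory fault
    # when the scaled memory utilization is very high.
    return 2 if features[2] > 0.95 else 0


result = detect_fault(read_fake_sensors, scale_to_unit, toy_model)
print(result)
```

Passing the three stages in as callables keeps the sketch independent of how the data are actually collected and of which model library is used.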
In some embodiments, to facilitate management of the server and allow a user or operation and maintenance personnel to find faults in time, the method further includes:
in response to the current server having a fault, generating an alarm record based on the fault type of the fault; and displaying the alarm record through a Web page.
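A minimal sketch of the alarm step is given below. The record fields, the function names, and the HTML table layout are assumptions made for illustration; the text only states that an alarm record is generated from the fault type and shown on a Web page.

```python
import html
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AlarmRecord:
    server_id: str       # hypothetical identifier of the faulty server
    fault_type: str      # fault type reported by the prediction model
    raised_at: datetime  # time at which the alarm was generated


def make_alarm(server_id: str, fault_type: str,
               now: Optional[datetime] = None) -> AlarmRecord:
    """Create an alarm record, stamping it with the current UTC time by default."""
    return AlarmRecord(server_id, fault_type, now or datetime.now(timezone.utc))


def render_alarm_row(record: AlarmRecord) -> str:
    """Render one alarm as an HTML table row, escaping the text fields."""
    return (
        "<tr>"
        f"<td>{html.escape(record.server_id)}</td>"
        f"<td>{html.escape(record.fault_type)}</td>"
        f"<td>{record.raised_at.isoformat()}</td>"
        "</tr>"
    )


alarm = make_alarm("server-01", "memory fault",
                   datetime(2021, 9, 24, tzinfo=timezone.utc))
row = render_alarm_row(alarm)
print(row)
```

Escaping the text fields matters because server names and fault descriptions may come from outside the Web application.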
In some embodiments, for faults from which the server cannot recover automatically, the server may be restored to normal by restarting it, and the method further includes:
in response to the current server having a fault, monitoring the faulty server to determine whether it is down;
and in response to the faulty server being down, resetting the faulty server.
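The monitor-then-reset logic can be sketched as follows. The polling count, the interval, and the two callables are hypothetical; how downtime is detected and how the reset is issued (for example through a BMC) is left to the caller.

```python
import time
from typing import Callable, List


def reset_if_down(
    is_down: Callable[[], bool],
    reset: Callable[[], None],
    checks: int = 3,
    interval_s: float = 0.0,
) -> bool:
    """Poll a faulty server up to `checks` times and reset it once if it is down.

    Returns True if a reset was issued, False if the server stayed up.
    """
    for attempt in range(checks):
        if is_down():
            reset()
            return True
        if attempt < checks - 1:
            time.sleep(interval_s)  # wait before the next poll
    return False


# Simulated server that is found down on the second poll.
states = iter([False, True, True])
reset_log: List[str] = []
did_reset = reset_if_down(lambda: next(states), lambda: reset_log.append("reset"))
print(did_reset, reset_log)
```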
In some embodiments, the step of preprocessing the current operation data to obtain preprocessed operation data includes:
and carrying out normalization processing on the current operation data, and taking the data after the normalization processing as the operation data after the preprocessing.
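The text does not say which normalization is used. As one common choice, the sketch below applies min-max scaling with bounds learned from historical data; the function names, the clamping of out-of-range values, and the sample numbers are assumptions.

```python
from typing import List, Sequence, Tuple


def fit_min_max(rows: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Record the (min, max) of every feature column in the historical data."""
    return [(min(col), max(col)) for col in zip(*rows)]


def normalize(row: Sequence[float],
              bounds: Sequence[Tuple[float, float]]) -> List[float]:
    """Scale one sample into [0, 1] per feature.

    Values outside the historical range are clamped, and a constant
    column (min == max) maps to 0.0 to avoid division by zero.
    """
    scaled = []
    for x, (lo, hi) in zip(row, bounds):
        if hi == lo:
            scaled.append(0.0)
        else:
            scaled.append(min(1.0, max(0.0, (x - lo) / (hi - lo))))
    return scaled


# Made-up history: voltage (V), current (A), memory utilization (fraction).
history = [[11.0, 1.0, 0.25], [13.0, 3.0, 0.75], [12.0, 2.0, 0.5]]
bounds = fit_min_max(history)
current = normalize([12.0, 2.5, 1.0], bounds)
print(current)
```

Learning the bounds from history and reusing them for current data keeps training and prediction inputs on the same scale.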
In some embodiments, the method further comprises:
constructing a sample set from historical operation data of the server that have been labeled in advance with fault type labels; and configuring the number of decision trees, the depth of each decision tree, the number of features used by each node, the iteration termination conditions, the minimum number of samples on each node, and the minimum information gain on each node.
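The labeled sample set and the configured parameters might be represented as below. The class name, field names, validation rules, and all values are illustrative assumptions rather than settings taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ForestConfig:
    n_trees: int            # number of decision trees
    max_depth: int          # depth limit of each decision tree
    features_per_node: int  # number of features tried at each node
    min_samples: int        # a node with fewer samples becomes a leaf
    min_info_gain: float    # a split with less gain is rejected

    def validate(self, total_features: int) -> None:
        """Raise ValueError if any setting is out of range."""
        if min(self.n_trees, self.max_depth, self.min_samples) < 1:
            raise ValueError("tree count, depth and sample minimum must be >= 1")
        if not 1 <= self.features_per_node <= total_features:
            raise ValueError("features_per_node must be in 1..total_features")
        if self.min_info_gain < 0:
            raise ValueError("min_info_gain must be non-negative")


LabeledSample = Tuple[Sequence[float], int]

# Made-up history: (normalized voltage, current, memory utilization) and a label.
sample_set: List[LabeledSample] = [
    ([0.50, 0.40, 0.30], 0),
    ([0.95, 0.90, 0.35], 1),
    ([0.55, 0.45, 0.98], 2),
]

config = ForestConfig(n_trees=50, max_depth=8, features_per_node=2,
                      min_samples=2, min_info_gain=0.01)
config.validate(total_features=len(sample_set[0][0]))
```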
In some embodiments, the pre-trained predictive model is obtained by a random forest training process and a random forest testing process;
the training process of the random forest comprises: drawing training samples from the constructed sample set with replacement, randomly selecting a root node, and training from the root node with the training sample set until all nodes have been trained, so as to obtain a prediction model with the required parameters;
the testing process of the random forest comprises: inputting test samples into the prediction model with the required parameters, evaluating the output of the model using the Gini index, and adjusting the parameters of the prediction model based on the Gini value to obtain the pre-trained prediction model.
In another embodiment, referring to fig. 2, which shows a flowchart of another server failure monitoring method 200 according to the present invention, the method specifically includes the following steps:
S201, the BMC obtains information on the key components of the server through I2C. A key component may be the memory, the central processing unit, and the like.
S202, the health information data are processed: numerical data are normalized, and part of the data are labeled with the current server state, where the state is encoded as a numerical value that indicates either no fault or one of the fault types. For example, the server may be represented by the value "0" when there is no fault, by the value "1" when the central processing unit has a fault, and by the value "2" when the memory has a fault; the correspondence between values and fault types may be freely set in a specific implementation.
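The example encoding in S202 (0 for no fault, 1 for a central processing unit fault, 2 for a memory fault) can be written as an enumeration. The class and function names are hypothetical, and, as the text notes, the mapping itself can be chosen freely.

```python
from enum import IntEnum
from typing import List, Sequence, Tuple


class ServerState(IntEnum):
    NO_FAULT = 0
    CPU_FAULT = 1
    MEMORY_FAULT = 2


def label_sample(features: Sequence[float],
                 state: ServerState) -> Tuple[List[float], int]:
    """Attach the numeric state code to one normalized feature vector."""
    return list(features), int(state)


def decode_label(label: int) -> ServerState:
    """Map a numeric code back to its state; unknown codes raise ValueError."""
    return ServerState(label)


sample = label_sample([0.55, 0.45, 0.98], ServerState.MEMORY_FAULT)
print(sample, decode_label(sample[1]).name)
```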
S203, predicting from the normalized data, by means of a random forest algorithm, whether a fault exists. The fault detection part based on the random forest algorithm is divided into a training process and a prediction process of the random forest.
Part of the labeled data is selected as a sample set S, where the dimensionality of each data item in the training set, that is, the feature dimensionality, is F. The parameters to be determined include the number t of decision trees, the depth d of each tree, the number f of features used by each node, and the termination conditions: the minimum number s of samples on a node and the minimum information gain m on a node.
A. The training process of the random forest is as follows:
A1, drawing from the sample set S, with replacement, a training set S(i) of the same size as S, randomly selecting a sample as the root node, and starting training from the root node;
A2, if the current node meets a termination condition, setting the current node as a leaf node, where the predicted output of the leaf node is the class c(j) with the most samples in the current node's sample set and the probability p is the proportion of c(j) in that sample set, and then continuing to train the other nodes; if the current node does not meet a termination condition, randomly selecting f features from the F features without replacement, using these f features to find the single feature k with the best classification effect and its threshold th, assigning the samples on the current node whose k-th feature is smaller than th to the left node and the remaining samples to the right node, and then continuing to train the other nodes;
A3, repeating A2 until all nodes in the decision tree have been trained or marked as leaf nodes;
A4, repeating A1 to A3 until all t decision trees in the random forest have been trained.
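Steps A1 to A4 can be sketched in plain Python as below. Several choices are assumptions of this sketch rather than statements of the patent: the root node holds the whole bootstrap set (the usual random forest formulation), the decrease in Gini impurity stands in for the "information gain", candidate thresholds are the observed feature values, and all names, the `cfg` keys, and the toy data are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def bootstrap(samples, rng):
    """A1: draw a training set of the same size, with replacement."""
    return [rng.choice(samples) for _ in samples]


def best_split(samples, feature_ids):
    """A2: return (gain, feature k, threshold th) with the largest gain, or None."""
    parent = gini([y for _, y in samples])
    best = None
    for k in feature_ids:
        for th in sorted({x[k] for x, _ in samples}):
            left = [y for x, y in samples if x[k] < th]
            right = [y for x, y in samples if x[k] >= th]
            if not left or not right:
                continue
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
            gain = parent - weighted
            if best is None or gain > best[0]:
                best = (gain, k, th)
    return best


def build_tree(samples, depth, cfg, rng):
    """A2 and A3: grow one tree recursively until the termination conditions hold."""
    labels = [y for _, y in samples]
    split = None
    if depth < cfg["max_depth"] and len(samples) >= cfg["min_samples"]:
        n_features = len(samples[0][0])
        # f features chosen without replacement from the F available ones.
        feature_ids = rng.sample(range(n_features), cfg["features_per_node"])
        split = best_split(samples, feature_ids)
    if split is None or split[0] < cfg["min_gain"]:
        label, count = Counter(labels).most_common(1)[0]
        return {"leaf": True, "label": label, "p": count / len(labels)}
    _, k, th = split
    left = [s for s in samples if s[0][k] < th]
    right = [s for s in samples if s[0][k] >= th]
    return {"leaf": False, "k": k, "th": th,
            "left": build_tree(left, depth + 1, cfg, rng),
            "right": build_tree(right, depth + 1, cfg, rng)}


def train_forest(samples, cfg, seed=0):
    """A4: train every tree on its own bootstrap sample."""
    rng = random.Random(seed)
    return [build_tree(bootstrap(samples, rng), 0, cfg, rng)
            for _ in range(cfg["n_trees"])]


# Toy labeled data: two normalized features, label 0 (no fault) or 1 (fault).
samples = [([0.1, 0.2], 0), ([0.2, 0.1], 0), ([0.15, 0.3], 0),
           ([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.85, 0.7], 1)]
cfg = {"n_trees": 5, "max_depth": 3, "features_per_node": 1,
       "min_samples": 2, "min_gain": 0.01}
forest = train_forest(samples, cfg)
print(len(forest), "trees trained")
```

Each leaf stores both the majority class and its proportion p, which is what the prediction process later accumulates across trees.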
B. The prediction process of the random forest is as follows:
B1, starting from the root node of the current decision tree, entering the left node if the sample's value for the current node's feature is smaller than the node's threshold th, or the right node if it is greater than or equal to th, and repeating this until a leaf node is reached and its prediction is output;
B2, repeating B1 until all t decision trees have output predictions, the final prediction being the class with the largest sum of predicted probabilities over all trees, that is, the accumulated p for each c(j);
B3, calculating the Gini value as the evaluation criterion using the following formula:
Gini=1-∑(p(i)*p(i))
As above, the predicted output of a leaf node is the class with the most samples in the current sample set, and the probability p is the proportion of c(j) in the current sample set.
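As a worked illustration of the formula, consider a hypothetical sample set in which 60% of the samples are fault-free, 30% show a central processing unit fault, and 10% show a memory fault. The proportions are made-up numbers, and reading p(i) as the proportion of class i is the standard definition of the Gini index, assumed here.

```latex
% Hypothetical class proportions: p(0) = 0.6, p(1) = 0.3, p(2) = 0.1
\mathrm{Gini} = 1 - \sum_i p(i)^2
             = 1 - \left(0.6^2 + 0.3^2 + 0.1^2\right)
             = 1 - 0.46
             = 0.54
```

A set containing a single class gives a Gini value of 0, so lower values indicate purer, more confident predictions.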
Training in the above manner yields the pre-trained prediction model based on the random forest algorithm; the preprocessed data are input to the model as samples, and the predicted fault type is output.
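The prediction steps B1 and B2 can be sketched as follows. The dictionary layout of a tree node (`leaf`, `k`, `th`, `left`, `right`, `label`, `p`) and the hand-written three-tree forest are assumptions made only for this illustration.

```python
from collections import defaultdict


def predict_tree(tree, x):
    """B1: walk from the root to a leaf and return its (class, probability)."""
    node = tree
    while not node["leaf"]:
        # Feature value below the threshold goes left, otherwise right.
        node = node["left"] if x[node["k"]] < node["th"] else node["right"]
    return node["label"], node["p"]


def predict_forest(forest, x):
    """B2: sum the leaf probabilities per class and return the largest total."""
    totals = defaultdict(float)
    for tree in forest:
        label, p = predict_tree(tree, x)
        totals[label] += p
    return max(totals, key=totals.get)


# Hand-written forest; label 0 means no fault, label 2 means a memory fault.
tree_a = {"leaf": False, "k": 0, "th": 0.5,
          "left": {"leaf": True, "label": 0, "p": 1.0},
          "right": {"leaf": True, "label": 2, "p": 0.8}}
tree_b = {"leaf": True, "label": 0, "p": 0.6}
tree_c = {"leaf": False, "k": 1, "th": 0.7,
          "left": {"leaf": True, "label": 0, "p": 0.9},
          "right": {"leaf": True, "label": 2, "p": 0.9}}
forest = [tree_a, tree_b, tree_c]

faulty = predict_forest(forest, [0.9, 0.95])   # class 2 totals 1.7, class 0 totals 0.6
healthy = predict_forest(forest, [0.1, 0.2])   # every tree votes for class 0
print(faulty, healthy)
```

Summing probabilities rather than counting votes lets a confident leaf outweigh an uncertain one, which matches the accumulation of p described in B2.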
S204, if a fault exists, generating an alarm record and feeding the fault type back to a Web page for display.
S205, monitoring the state of the server to determine whether the server is down, and if the server is down, proceeding to step S206.
S206, when the server is down, resetting the down server.
According to the server fault detection method, server information is obtained through the BMC, whether the server has a fault is analyzed and predicted by the random forest algorithm, the fault is fed back to a Web page for display, and the state of the server is monitored, which improves the stability of the server. A random forest can handle a large number of input variables without variable deletion, can evaluate which variables are most important for classification, and includes an effective method for estimating missing data, so that accuracy can be maintained even when a large portion of the data is missing.
In some embodiments, please refer to fig. 3, the present invention provides a server failure detection apparatus 300, which includes:
a data acquisition module 301 configured to acquire current operating data of the server;
a preprocessing module 302 configured to preprocess the current operating data to obtain preprocessed operating data;
the prediction module 303 is configured to input the preprocessed operating data into a pre-trained prediction model, wherein the pre-trained prediction model is obtained by training based on a random forest algorithm and is used for representing a corresponding relationship between the operating data and a fault type;
and the fault determining module 304 is configured to determine whether the current server fails according to the output of the pre-trained prediction model and a fault type to which the fault belongs.
It should be noted that, for specific limitations of the server failure detection apparatus, reference may be made to the above limitations of the server failure detection method, and details are not described herein again. The modules in the server failure detection apparatus can be implemented wholly or partially by software, hardware, or a combination thereof. The modules can be embedded in, or be independent of, a processor in the computer device in hardware form, or can be stored in a memory in the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.
According to another aspect of the present invention, a computer device is provided, and the computer device may be a server, and its internal structure is shown in fig. 4. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements the server failure detection method described above, in particular the method comprising the steps of:
acquiring current operation data of a server; preprocessing the current operation data to obtain preprocessed operation data; inputting the preprocessed operation data into a pre-trained prediction model, wherein the pre-trained prediction model is obtained based on random forest algorithm training and is used for representing the corresponding relation between the operation data and the fault type; and determining whether the current server fails or not and the fault type of the current server according to the output of the pre-trained prediction model.
According to yet another aspect of the present invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the server failure detection method described above, in particular comprising performing the steps of:
acquiring current operation data of a server; preprocessing the current operation data to obtain preprocessed operation data; inputting the preprocessed operation data into a pre-trained prediction model, wherein the pre-trained prediction model is obtained based on random forest algorithm training and is used for representing the corresponding relation between the operation data and the fault type; and determining whether the current server fails or not and the fault type of the current server according to the output of the pre-trained prediction model.
Those skilled in the art will understand that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method for server failure detection, the method comprising:
acquiring current operation data of a server;
preprocessing the current operation data to obtain preprocessed operation data;
inputting the preprocessed operation data into a pre-trained prediction model, wherein the pre-trained prediction model is obtained based on random forest algorithm training and is used for representing the corresponding relation between the operation data and the fault type;
and determining whether the current server fails or not and the fault type of the current server according to the output of the pre-trained prediction model.
2. The server failure detection method according to claim 1, further comprising:
in response to the current server having a fault, generating an alarm record based on the fault type of the fault;
and displaying the alarm record through a Web page.
3. The server failure detection method according to claim 1, further comprising:
in response to the current server having a fault, monitoring the faulty server to determine whether it is down;
and in response to the faulty server being down, resetting the faulty server.
4. The server failure detection method of claim 1, wherein the current operational data comprises at least one of: the voltage of at least one component on the mainboard, the current of at least one component on the mainboard, and the memory utilization rate of the central processing unit.
5. The method according to claim 1, wherein the step of preprocessing the current operation data to obtain preprocessed operation data comprises:
and carrying out normalization processing on the current operation data, and taking the data after the normalization processing as the operation data after the preprocessing.
6. The server failure detection method according to any one of claims 1 to 5, wherein the method further comprises:
constructing a sample set from historical operation data of the server that have been labeled in advance with fault type labels;
and configuring the number of decision trees, the depth of each decision tree, the number of features used by each node, the iteration termination conditions, the minimum number of samples on each node, and the minimum information gain on each node.
7. The server fault detection method according to claim 6, wherein the pre-trained prediction model is obtained through a random forest training process and a random forest testing process;
the training process of the random forest comprises: extracting training samples from the constructed sample set with replacement, randomly selecting a root node, and training from the root node using the training sample set until all nodes are trained, so as to obtain a prediction model with the required parameters;
the testing process of the random forest comprises: inputting test samples into the prediction model with the required parameters, evaluating the output of the model using the Gini parameter, and adjusting the parameters of the prediction model based on the Gini value to obtain the pre-trained prediction model.
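Two building blocks that claim 7 relies on can be sketched in isolation: drawing training samples with replacement (bootstrap sampling) and computing a Gini value. The claim does not define its Gini parameter, so the sketch assumes the standard Gini impurity, G = 1 - Σ p_k², and shows it in its usual role of scoring a candidate split on one feature. All function names are hypothetical.

```python
import random
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def gini_impurity(labels: Sequence[str]) -> float:
    """Standard Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def bootstrap_sample(samples: Sequence, rng: random.Random) -> List:
    """Draw len(samples) items with replacement, as done once per tree."""
    return [rng.choice(samples) for _ in samples]


def best_threshold(values: Sequence[float],
                   labels: Sequence[str]) -> Tuple[Optional[float], float]:
    """Return the threshold on one feature that minimizes the weighted Gini impurity."""
    n = len(labels)
    best: Tuple[Optional[float], float] = (None, gini_impurity(labels))
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:          # a split must leave samples on both sides
            continue
        score = (len(left) * gini_impurity(left)
                 + len(right) * gini_impurity(right)) / n
        if score < best[1]:
            best = (t, score)
    return best
```

A full random forest would repeat the split search recursively on each bootstrap sample, restricted to a random subset of features at each node, and combine the trees by majority vote.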
8. An apparatus for server failure detection, the apparatus comprising:
the data acquisition module is configured to acquire current operating data of the server;
the preprocessing module is configured to preprocess the current operating data to obtain preprocessed operating data;
the prediction module is configured to input the preprocessed operation data into a pre-trained prediction model, wherein the pre-trained prediction model is obtained based on random forest algorithm training and is used for representing the corresponding relation between the operation data and the fault type;
and the fault determining module is configured to determine, according to the output of the pre-trained prediction model, whether the current server has failed and, if so, the fault type of the failure.
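The four modules of claim 8 can be wired together as in the sketch below. Each module is represented by a callable supplied by the caller; the class name `FaultDetector` is hypothetical, and the convention that the model returns the label `"normal"` when there is no fault is an assumption, since the claims do not specify the output encoding.

```python
from typing import Callable, Dict, Optional, Tuple

Sample = Dict[str, float]

NORMAL = "normal"  # assumed model output meaning "no fault"


class FaultDetector:
    """Hypothetical composition of the acquisition, preprocessing,
    prediction, and fault determining modules."""

    def __init__(self,
                 acquire: Callable[[], Sample],
                 preprocess: Callable[[Sample], Sample],
                 predict: Callable[[Sample], str]) -> None:
        self.acquire = acquire        # data acquisition module
        self.preprocess = preprocess  # preprocessing module
        self.predict = predict        # prediction module (pre-trained model)

    def run_once(self) -> Tuple[bool, Optional[str]]:
        """Fault determining step: return (failed, fault_type)."""
        raw = self.acquire()
        features = self.preprocess(raw)
        label = self.predict(features)
        if label == NORMAL:
            return (False, None)
        return (True, label)
```

Passing the stages in as callables keeps the sketch independent of any particular sensor interface or model library.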
9. A computer device, comprising:
at least one processor;
and a memory storing a computer program executable on the processor, wherein the processor, when executing the program, performs the server failure detection method of any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, performs the server failure detection method according to any one of claims 1 to 7.
CN202111121516.8A 2021-09-24 2021-09-24 Server fault detection method and device, computer equipment and storage medium Pending CN113835962A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111121516.8A CN113835962A (en) 2021-09-24 2021-09-24 Server fault detection method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111121516.8A CN113835962A (en) 2021-09-24 2021-09-24 Server fault detection method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113835962A true CN113835962A (en) 2021-12-24

Family

ID=78969794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111121516.8A Pending CN113835962A (en) 2021-09-24 2021-09-24 Server fault detection method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113835962A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114548326A (en) * 2022-04-27 2022-05-27 深圳丰尚智慧农牧科技有限公司 Fault processing method and device for feed production equipment and computer equipment
CN114598588A (en) * 2022-03-14 2022-06-07 阿里巴巴(中国)有限公司 Server fault determination method and device and terminal equipment
CN115131171A (en) * 2022-01-18 2022-09-30 杭州安脉盛智能技术有限公司 Training method of energy storage power station operation monitoring model and monitoring system of energy storage power station
CN115545085A (en) * 2022-11-04 2022-12-30 南方电网数字电网研究院有限公司 Weak fault current fault type identification method, device, equipment and medium
CN117390520A (en) * 2023-12-08 2024-01-12 惠州市宝惠电子科技有限公司 Transformer state monitoring method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109714187A (en) * 2018-08-17 2019-05-03 平安普惠企业管理有限公司 Log analysis method, device, equipment and storage medium based on machine learning
WO2019169743A1 (en) * 2018-03-09 2019-09-12 网宿科技股份有限公司 Server failure detection method and system
CN111105063A (en) * 2018-10-26 2020-05-05 北京国双科技有限公司 Fault prediction method, fault prediction device, model construction method, fault prediction device, processor and readable storage medium
CN111143173A (en) * 2020-01-02 2020-05-12 山东超越数控电子股份有限公司 Server fault monitoring method and system based on neural network
CN111949429A (en) * 2020-08-17 2020-11-17 山东超越数控电子股份有限公司 Server fault monitoring method and system based on density clustering algorithm

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019169743A1 (en) * 2018-03-09 2019-09-12 网宿科技股份有限公司 Server failure detection method and system
CN109714187A (en) * 2018-08-17 2019-05-03 平安普惠企业管理有限公司 Log analysis method, device, equipment and storage medium based on machine learning
CN111105063A (en) * 2018-10-26 2020-05-05 北京国双科技有限公司 Fault prediction method, fault prediction device, model construction method, fault prediction device, processor and readable storage medium
CN111143173A (en) * 2020-01-02 2020-05-12 山东超越数控电子股份有限公司 Server fault monitoring method and system based on neural network
CN111949429A (en) * 2020-08-17 2020-11-17 山东超越数控电子股份有限公司 Server fault monitoring method and system based on density clustering algorithm

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115131171A (en) * 2022-01-18 2022-09-30 杭州安脉盛智能技术有限公司 Training method of energy storage power station operation monitoring model and monitoring system of energy storage power station
CN114598588A (en) * 2022-03-14 2022-06-07 阿里巴巴(中国)有限公司 Server fault determination method and device and terminal equipment
CN114598588B (en) * 2022-03-14 2023-07-25 阿里巴巴(中国)有限公司 Server fault determination method and device and terminal equipment
CN114548326A (en) * 2022-04-27 2022-05-27 深圳丰尚智慧农牧科技有限公司 Fault processing method and device for feed production equipment and computer equipment
CN114548326B (en) * 2022-04-27 2022-09-09 深圳丰尚智慧农牧科技有限公司 Fault processing method and device for feed production equipment and computer equipment
CN115545085A (en) * 2022-11-04 2022-12-30 南方电网数字电网研究院有限公司 Weak fault current fault type identification method, device, equipment and medium
CN117390520A (en) * 2023-12-08 2024-01-12 惠州市宝惠电子科技有限公司 Transformer state monitoring method and system
CN117390520B (en) * 2023-12-08 2024-04-16 惠州市宝惠电子科技有限公司 Transformer state monitoring method and system

Similar Documents

Publication Publication Date Title
CN113835962A (en) Server fault detection method and device, computer equipment and storage medium
CN114065613B (en) Multi-working-condition process industrial fault detection and diagnosis method based on deep migration learning
CN111177714B (en) Abnormal behavior detection method and device, computer equipment and storage medium
AU2018201487B2 (en) Method and system for health monitoring and fault signature identification
US7890813B2 (en) Method and apparatus for identifying a failure mechanism for a component in a computer system
CN113282461B (en) Alarm identification method and device for transmission network
CN111143173A (en) Server fault monitoring method and system based on neural network
CN110825068A (en) Industrial control system anomaly detection method based on PCA-CNN
CN112257755B (en) Method and device for analyzing running state of spacecraft
CN108460397B (en) Method and device for analyzing equipment fault type, storage medium and electronic equipment
CN112308126A (en) Fault recognition model training method, fault recognition device and electronic equipment
CN111325159B (en) Fault diagnosis method, device, computer equipment and storage medium
CN115656673A (en) Transformer data processing device and equipment storage medium
WO2022001125A1 (en) Method, system and device for predicting storage failure in storage system
CN113760670A (en) Cable joint abnormity early warning method and device, electronic equipment and storage medium
CN111881980A (en) Vehicle fault detection method and device, computer equipment and storage medium
CN110570544A (en) method, device, equipment and storage medium for identifying faults of aircraft fuel system
CN111624986A (en) Case base-based fault diagnosis method and system
CN113284002A (en) Power consumption data anomaly detection method and device, computer equipment and storage medium
CN116345690B (en) Power monitoring false alarm identification method and system based on power supply system equipment list
CN113419950A (en) Method and device for generating UI automation script, computer equipment and storage medium
CN113723861A (en) Abnormal electricity consumption behavior detection method and device, computer equipment and storage medium
CN110866682B (en) Underground cable early warning method and device based on historical data
CN115952081A (en) Software testing method, device, storage medium and equipment
CN111143191A (en) Website testing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination