CN112131033A - Server fault repairing method, device, equipment and storage medium

Server fault repairing method, device, equipment and storage medium

Info

Publication number
CN112131033A
Authority
CN
China
Prior art keywords
fault
target
repairing
server
repair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010988093.9A
Other languages
Chinese (zh)
Inventor
韩颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010988093.9A priority Critical patent/CN112131033A/en
Publication of CN112131033A publication Critical patent/CN112131033A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses a server fault repairing method comprising the following steps: acquiring current fault information of a target server; inputting the current fault information into a fault repair recommendation method model trained in advance using a fault repair method library and a deep neural network, and outputting the corresponding target repair method using the fault repair recommendation method model; and performing fault repair on the target server using the target repair method. Because the fault repair recommendation method model determines the target repair method for the current fault information and the repair is carried out according to that method, operation and maintenance personnel no longer need to troubleshoot and repair faults manually. This reduces the human resources consumed in repairing server faults, lowers the technical level required of operation and maintenance personnel, and improves the timeliness and efficiency of fault repair. The application also discloses a server fault repairing apparatus, a device and a computer-readable storage medium, which have the same beneficial effects.

Description

Server fault repairing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of servers, and in particular, to a method, an apparatus, a device, and a computer-readable storage medium for repairing a server failure.
Background
With the rapid development of server technology, schemes for detecting and diagnosing server faults have become more mature, and they are applied in a wider range of scenarios.
In the prior art, fault information of a server is generally diagnosed by analyzing the server's log. When a fault needs to be repaired, on-site operation and maintenance personnel manually troubleshoot it according to the diagnosed fault information and repair it according to their own operation and maintenance experience. In other words, the prior art can determine fault information from the server log, but it still relies on operation and maintenance personnel to troubleshoot and repair the fault manually in order to restore the server to normal. This approach consumes a large amount of human resources and requires operation and maintenance personnel to have a certain technical level. In addition, it takes time for personnel to arrive on site and troubleshoot the fault, which greatly reduces the timeliness and efficiency of server fault repair.
Therefore, how to reduce the human resources consumed in repairing server faults, lower the technical level required of operation and maintenance personnel, and improve the timeliness and efficiency of fault repair is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the present invention aims to provide a server fault repairing method, which can reduce human resources consumed during server fault repairing, reduce technical level requirements on operation and maintenance personnel, and improve timeliness and efficiency of fault repairing; another object of the present invention is to provide a server fault repairing apparatus, device and computer readable storage medium, all having the above beneficial effects.
In order to solve the above technical problem, the present invention provides a server fault repairing method, including:
acquiring current fault information of a target server;
inputting the current fault information into a fault repairing recommendation method model obtained by utilizing a fault repairing method library and deep neural network training in advance, and outputting a corresponding target repairing method by utilizing the fault repairing recommendation method model;
and performing fault repair on the target server by using the target repair method.
Preferably, the process of training the fault repairing recommendation method model specifically includes:
setting sample fault information and a sample repairing method corresponding to the sample fault information in the fault repairing method library as input and output of the deep neural network respectively;
and determining the weight vector and the offset of the deep neural network by using a gradient descent algorithm to obtain the fault repairing recommendation method model.
Preferably, when the target repairing method corresponding to the current fault information cannot be determined by using the fault repairing recommendation method model, the method further includes:
and receiving a target repairing method corresponding to the current fault information and input by operation and maintenance personnel.
Preferably, after the receiving the target repair method corresponding to the current fault information input by the operation and maintenance personnel, the method further includes:
and respectively setting the current fault information and the corresponding target repairing method as the input and the output of the deep neural network for model training, and updating the fault repairing recommendation method model.
Preferably, after the inputting the current fault information into a fault repair recommendation method model trained by using a fault repair method library and a deep neural network in advance and outputting a corresponding target repair method by using the fault repair recommendation method model, the method further includes:
judging whether the target repairing method can be automatically executed or not;
if so, performing fault repairing on the target server by using the target repairing method;
if not, sending out corresponding prompt information.
Preferably, the process of obtaining the current fault information of the target server specifically includes:
and inputting the log of the target server into a fault diagnosis system to obtain the current fault information of the target server.
Preferably, the process of inputting the log of the target server into a fault diagnosis system to obtain the current fault information of the target server specifically includes:
and inputting the log of the target server into the fault diagnosis system according to a preset time period to obtain the current fault information of the target server.
In order to solve the above technical problem, the present invention further provides a server fault repairing apparatus, including:
the acquisition module is used for acquiring the current fault information of the target server;
the determining module is used for inputting the current fault information into a fault restoration recommendation method model obtained by utilizing a fault restoration method library and deep neural network training in advance, and outputting a corresponding target restoration method by utilizing the fault restoration recommendation method model;
and the repair module is used for repairing the fault of the target server by using the target repair method.
In order to solve the above technical problem, the present invention further provides a server fault repairing device, including:
a memory for storing a computer program;
and the processor is used for realizing the steps of any one of the server fault repairing methods when the computer program is executed.
In order to solve the above technical problem, the present invention further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps of any one of the above server fault repairing methods.
According to the server fault repairing method provided by the invention, a fault repair recommendation method model is trained in advance using a fault repair method library and a deep neural network. When current fault information is obtained, the corresponding target repair method is determined simply by inputting the current fault information into the fault repair recommendation method model, and the target server is then repaired according to that target repair method. Because the model and the automated repair take the place of manual troubleshooting and repair by operation and maintenance personnel, the human resources consumed are greatly reduced and the technical level required of operation and maintenance personnel is lowered. Moreover, once the current fault information is determined, the target repair method can be determined and the repair carried out promptly, which improves the timeliness and efficiency of server fault repair.
In order to solve the above technical problem, the invention also provides a server fault repairing apparatus, a device and a computer-readable storage medium, all of which have the above beneficial effects.
Drawings
In order to more clearly illustrate the embodiments or technical solutions of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart of a server fault repairing method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a deep neural network according to an embodiment of the present invention;
Fig. 3 is a structural diagram of a server fault repairing apparatus according to an embodiment of the present invention;
Fig. 4 is a structural diagram of a server fault repairing device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The core of the embodiment of the invention is to provide a server fault repairing method, which can reduce the human resources required to be consumed when repairing the server fault, reduce the technical level requirements on operation and maintenance personnel, and improve the timeliness and efficiency of fault repairing; another core of the present invention is to provide a server fault repairing apparatus, a device and a computer readable storage medium, all having the above beneficial effects.
In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart of a server fault repairing method according to an embodiment of the present invention. As shown in Fig. 1, the server fault repairing method includes:
S10: acquiring current fault information of a target server;
S20: inputting the current fault information into a fault repair recommendation method model trained in advance using a fault repair method library and a deep neural network, and outputting the corresponding target repair method using the fault repair recommendation method model;
S30: performing fault repair on the target server using the target repair method.
Specifically, in this embodiment, the current fault information of the target server is acquired first. The current fault information is information that describes the current fault condition of the target server and is determined from the server's current operation information or log. It may include a fault name, a fault type, a fault level, a fault description, a fault cause and similar information.
Specifically, the fault repair recommendation method model is trained in advance using the fault repair method library and a deep neural network. After the current fault information of the target server is obtained, it is input into the fault repair recommendation method model, which determines the corresponding output and thereby yields the corresponding target repair method. It should be noted that this embodiment does not limit the specific manner of training the fault repair recommendation method model; the deep neural network may, for example, be a BP (back propagation) neural network, but this embodiment is not limited to that choice.
Specifically, after the target repair method corresponding to the current fault information is determined using the fault repair recommendation method model, the corresponding operations are executed according to the target repair method, so that the fault of the target server is repaired.
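As a rough illustration of how steps S10 to S30 fit together, the following Python sketch wires the three steps into one function. All names here (`repair_server`, `demo_acquire`, `demo_recommend`, `demo_execute`) and the example fault and repair strings are hypothetical stand-ins; the patent does not define an interface, and the "model" below is a trivial rule rather than a trained network.

```python
from typing import Callable, Dict, List

FaultInfo = Dict[str, str]


def repair_server(
    acquire_fault: Callable[[], FaultInfo],
    recommend: Callable[[FaultInfo], str],
    execute: Callable[[str], bool],
) -> bool:
    """Run steps S10 to S30 once; return True if the repair was carried out."""
    fault_info = acquire_fault()      # S10: current fault information
    method = recommend(fault_info)    # S20: model output (target repair method)
    return execute(method)            # S30: repair the target server


# Hypothetical stand-ins so the sketch runs without a real server or model.
executed: List[str] = []


def demo_acquire() -> FaultInfo:
    return {"fault_type": "fan", "description": "fan speed is zero", "cause": "fan stuck"}


def demo_recommend(fault: FaultInfo) -> str:
    # ASSUMPTION: a fixed rule stands in for the trained recommendation model.
    return "restart_fan_controller" if fault["fault_type"] == "fan" else "notify_operator"


def demo_execute(method: str) -> bool:
    executed.append(method)
    return True


result = repair_server(demo_acquire, demo_recommend, demo_execute)
```

Passing the three steps in as callables keeps the flow independent of how fault information is diagnosed or how the repair is carried out on a given server.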
According to the server fault repairing method provided by the embodiment of the invention, a fault repair recommendation method model is trained in advance using a fault repair method library and a deep neural network. When current fault information is obtained, the corresponding target repair method is determined simply by inputting the current fault information into the fault repair recommendation method model, and the target server is then repaired according to that target repair method. Because the model and the automated repair take the place of manual troubleshooting and repair by operation and maintenance personnel, the human resources consumed are greatly reduced and the technical level required of operation and maintenance personnel is lowered. Moreover, once the current fault information is determined, the target repair method can be determined and the repair carried out promptly, which improves the timeliness and efficiency of server fault repair.
On the basis of the foregoing embodiment, the present embodiment further describes and optimizes the technical solution, and specifically, in the present embodiment, the process of training the fault repairing recommendation method model specifically includes:
respectively setting the sample fault information and the sample repairing method corresponding to the sample fault information in the fault repairing method library as the input and the output of the deep neural network;
and determining a weight vector and a bias of the deep neural network by using a gradient descent algorithm to obtain a fault repairing recommendation method model.
Specifically, the sample fault information and the fault repair method library are set up first. The sample fault information refers to fault information that has already been determined, and it may likewise include a fault name, a fault type, a fault level, a fault description, a fault cause and similar information. Assume that the sample fault information includes a fault type $x_1$, a fault description $x_2$ and a fault cause $x_3$, and that the total number of sample fault information entries is $m$; the sample fault information used for model training can then be represented as a 3 × m matrix. The fault repair method library is established according to server operation and maintenance experience. Its unit is the repair operation: the library contains a number of different repair operations, each of which includes an operation type, an operation object, operation contents, an operation level and similar information. Assuming that the total number of repair operations in the fault repair method library is $n$, the library can be represented as a 1 × n vector.
Specifically, the vector corresponding to the sample fault information and the vector corresponding to the repair operations in the fault repair method library are used as the input and the output of the deep neural network model respectively, and the model is trained to learn the best mapping from sample fault information to repair operations, which yields the fault repair recommendation method model.
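The data layout described above can be sketched as follows. This is only one possible encoding, under the assumption that each categorical field is mapped to an integer index and each repair operation to a one-hot target; the patent does not say how the fields are turned into numbers, and the sample records and operation names are invented for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical samples: (fault type, fault description, fault cause, repair operation).
samples: List[Tuple[str, str, str, str]] = [
    ("memory", "ECC error count exceeded", "faulty DIMM", "isolate_dimm"),
    ("fan", "fan speed is zero", "fan stuck", "restart_fan_controller"),
    ("memory", "ECC error count exceeded", "loose DIMM", "notify_operator"),
]
# Hypothetical fault repair method library with n = 3 repair operations.
repair_library: List[str] = ["restart_fan_controller", "isolate_dimm", "notify_operator"]


def index_of(values: Iterable[str]) -> Dict[str, int]:
    """Map each distinct value to an integer index in first-seen order."""
    table: Dict[str, int] = {}
    for value in values:
        table.setdefault(value, len(table))
    return table


type_ids = index_of(s[0] for s in samples)
description_ids = index_of(s[1] for s in samples)
cause_ids = index_of(s[2] for s in samples)

# 3 x m input matrix: one row per feature (x1, x2, x3), one column per sample.
X: List[List[float]] = [
    [float(type_ids[s[0]]) for s in samples],
    [float(description_ids[s[1]]) for s in samples],
    [float(cause_ids[s[2]]) for s in samples],
]

# One target per sample: the position of its repair operation in the 1 x n library.
targets: List[int] = [repair_library.index(s[3]) for s in samples]


def one_hot(position: int, size: int) -> List[float]:
    return [1.0 if k == position else 0.0 for k in range(size)]


F: List[List[float]] = [one_hot(t, len(repair_library)) for t in targets]
```

Integer indices are the simplest encoding that produces a 3 × m matrix; a production system would more likely use one-hot or embedded features so that the network does not read an ordering into arbitrary category numbers.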
Specifically, Fig. 2 is a schematic structural diagram of a deep neural network provided in an embodiment of the present invention. Let the input layer be $X = (x_1, x_2, x_3)$ and the hidden layer be $H$; the hidden layer is obtained from the input layer as

$$H = W_1 X + B_1$$

Assuming the hidden layer is 50-dimensional, $W_1$ is a 50 × 3 weight matrix, $W_1 \in \mathbb{R}^{50 \times 3}$, and $B_1$ is the first bias term.
Specifically, the activation function may be the ReLU (Rectified Linear Unit) function, and the expression of the hidden layer with the activation function applied is:

$$H = g(W_1 X + B_1)$$

where $g$ denotes the activation function.
The expression from the hidden layer to the output layer is

$$Y = W_2 H + B_2$$

where $W_2$ is the weight matrix from the 50-dimensional hidden layer to the $n$ outputs, $W_2 \in \mathbb{R}^{n \times 50}$, and $B_2$ is the second bias term.
Similarly, applying the activation function to the output layer gives the expression of $Y$:

$$Y = g(W_2 H + B_2) = g\left(W_2\, g(W_1 X + B_1) + B_2\right)$$
For each output $y_i$ obtained above, $y_i \in \{y_1, \dots, y_n\}$, the outputs are normalized using the following formula so that each $y_i$ is expressed as a percentage and the percentages of all $y_i$ sum to 1:
[Figure BDA0002689922800000072: normalization formula for the outputs $y_i$]
where $y_i, y_j \in \{y_1, \dots, y_n\}$.
The largest normalized $y_i$ identifies the repair operation that most closely matches the input sample fault information.
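The forward pass described above can be sketched in plain Python as follows. The layer sizes follow the text (3 inputs, a 50-dimensional hidden layer, ReLU on both layers), but the normalization formula is not recoverable from the source, so this sketch assumes softmax, which is one common choice whose outputs sum to 1. The weights are random placeholders, and `N_OPS = 4` is an arbitrary example value for $n$.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def affine(W: Matrix, x: Vector, b: Vector) -> Vector:
    """Compute W x + b for a row-major matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(v: Vector) -> Vector:
    """Activation function g: element-wise max(0, value)."""
    return [max(0.0, value) for value in v]


def softmax(y: Vector) -> Vector:
    """ASSUMPTION: softmax stands in for the patent's normalization formula."""
    peak = max(y)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in y]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x: Vector, W1: Matrix, B1: Vector, W2: Matrix, B2: Vector) -> Vector:
    h = relu(affine(W1, x, B1))   # H = g(W1 X + B1)
    y = relu(affine(W2, h, B2))   # Y = g(W2 H + B2)
    return softmax(y)             # percentages that sum to 1


rng = random.Random(0)
HIDDEN, N_OPS = 50, 4  # N_OPS is a hypothetical size for the repair library
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(HIDDEN)]
B1 = [0.0] * HIDDEN
W2 = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for _ in range(N_OPS)]
B2 = [0.0] * N_OPS

probs = forward([1.0, 2.0, 0.0], W1, B1, W2, B2)
best = max(range(N_OPS), key=lambda i: probs[i])  # index of the recommended operation
```

`best` plays the role of the largest normalized $y_i$: it selects the repair operation the network considers closest to the input fault information.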
Specifically, for each set of input sample fault information, the corresponding output of the deep neural network is $y_i$ and the corresponding real (expected) result is $f_i$. A loss function $C$ is used to represent the error between the expected value and the actual value:
[Figure BDA0002689922800000073: definition of the loss function $C$]
A comprehensive weight vector $W_3$ is determined from $W_1$, $W_2$ and the activation function, and a comprehensive bias term $B_3$ is determined from $B_1$, $B_2$ and the activation function; $W_3$ and $B_3$ are then merged into a single matrix, denoted $W$. Next, the partial derivative of the loss function $C$ with respect to $W$ is calculated, an initial value $W^{(0)}$ of $W$ is selected at random, and the loss and gradient vector at that point are computed. (Superscripts denote the iteration, to distinguish the iterates from the layer weights $W_1$ and $W_2$.) The step size $\alpha$ is set randomly, for example $\alpha = 0.5$. According to the gradient vector formula

$$\nabla C(W) = \frac{\partial C}{\partial W}$$

gradient descent is performed on $W^{(0)}$ to obtain $W^{(1)}$, that is,

$$W^{(1)} = W^{(0)} - \alpha \, \nabla C\left(W^{(0)}\right)$$

where $\nabla C\left(W^{(0)}\right)$ is the partial derivative found in the previous step. The deep neural network is then updated with $W^{(1)}$, the corresponding outputs $y_i$ are recomputed with the updated network, and the loss is updated using the loss function $C$. Repeating these steps iteratively yields $W^{(n)}$ together with its loss and gradient vector. During the iteration, if the loss at $W^{(n)}$ is greater than the loss at $W^{(n-1)}$, the randomly set step size $\alpha$ is too large and the update may have crossed the lowest point; a smaller step size should then be set, for example $\alpha = 0.05$, and the above steps repeated. As the descent proceeds, each value in the gradient vector (each partial derivative) gets closer to 0. A stopping distance is preset for the gradient descent algorithm, and the iteration stops when all gradient descent distances are smaller than this preset value; the $W$ obtained at that point is taken as the optimal parameter. The deep neural network is then updated with this $W$ to obtain the fault repair recommendation method model. The fault repair recommendation method model can determine the probability percentage of each repair operation from the input current fault information, and it sets the repair operation with the largest percentage as the target repair method and outputs it.
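The iteration rule described above (start with a step size such as 0.5, shrink it when the loss increases, stop when every descent distance falls below a preset value) can be sketched on a toy problem. The quadratic loss, the starting point and the tenfold shrink factor are assumptions chosen so the example is easy to check by hand; the patent's own loss function is not recoverable from the source. As a simplification, the sketch retries from the current point after shrinking the step rather than restarting from the beginning.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def gradient_descent(
    loss: Callable[[Vector], float],
    grad: Callable[[Vector], Vector],
    w0: Vector,
    alpha: float = 0.5,
    stop_distance: float = 1e-6,
    max_iters: int = 10_000,
) -> Tuple[Vector, float]:
    """Return the final parameters and the step size that was in use."""
    w = list(w0)
    current_loss = loss(w)
    for _ in range(max_iters):
        steps = [alpha * g for g in grad(w)]      # descent distance per parameter
        if all(abs(s) < stop_distance for s in steps):
            break                                 # every distance is below the preset value
        candidate = [wi - si for wi, si in zip(w, steps)]
        candidate_loss = loss(candidate)
        if candidate_loss > current_loss:
            alpha *= 0.1                          # step too large: it crossed the lowest point
            continue
        w, current_loss = candidate, candidate_loss
    return w, alpha


# Toy loss with its minimum at (1, -2); stands in for the network's loss C.
def toy_loss(w: Vector) -> float:
    return 3.0 * (w[0] - 1.0) ** 2 + 3.0 * (w[1] + 2.0) ** 2


def toy_grad(w: Vector) -> Vector:
    return [6.0 * (w[0] - 1.0), 6.0 * (w[1] + 2.0)]


w_opt, final_alpha = gradient_descent(toy_loss, toy_grad, w0=[4.0, 4.0])
```

With this toy loss, the initial step of 0.5 overshoots the minimum and raises the loss, so the step size is reduced to 0.05, after which the iterates contract toward (1, -2).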
Therefore, the fault restoration recommendation method model is obtained by training according to the method of the embodiment, the training process is simple and intuitive, and the obtained fault restoration recommendation method model can accurately output the target restoration method corresponding to the current fault information.
On the basis of the foregoing embodiment, this embodiment further describes and optimizes the technical solution, and specifically, in this embodiment, when the target repair method corresponding to the current fault information cannot be determined by using the fault repair recommendation method model, the method further includes:
and receiving a target repairing method corresponding to the current fault information input by the operation and maintenance personnel.
Specifically, this embodiment further takes into account that the sample fault information is limited, so the trained fault repair recommendation method model cannot cover every fault situation, and there may be cases in which the model cannot determine a target repair method for the current fault information. In such a case, the target repair method corresponding to the current fault information is received from operation and maintenance personnel, who enter it according to their server operation and maintenance experience, and the target server then performs the fault repair operation using that method. This avoids the situation in which the target server cannot be repaired because the model failed to determine a target repair method, and it further ensures the reliability of fault repair for the target server.
As a preferred embodiment, after receiving the target repairing method corresponding to the current fault information input by the operation and maintenance personnel, the method further includes:
and respectively setting the current fault information and the corresponding target repairing method as the input and the output of the deep neural network for model training, and updating the fault repairing recommendation method model.
Specifically, in this embodiment, after receiving a target repair method corresponding to current fault information input by an operation and maintenance worker, that is, after obtaining the target repair method corresponding to the current fault information, model training is further performed by using the group of data as new sample data, the current fault information and the corresponding target repair method are respectively set as input and output of a deep neural network for model training, and a fault repair recommendation method model is updated.
Therefore, the embodiment of the invention can further improve the reliability of the target repair method output by the fault repair recommendation method model by continuously using the current fault information and the target repair method input by the operation and maintenance personnel as sample data to update the fault repair recommendation method model.
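A minimal sketch of this fallback and feedback loop is shown below. It assumes the model signals "cannot determine" by returning `None` (the patent does not specify the signal), and it uses a lookup table rebuilt on each retrain as a hypothetical stand-in for retraining the deep neural network. All function names and example strings are invented for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Sample = Tuple[str, str]  # (fault information, repair method)


def resolve_repair(
    fault: str,
    recommend: Callable[[str], Optional[str]],
    ask_operator: Callable[[str], str],
    samples: List[Sample],
    retrain: Callable[[List[Sample]], None],
) -> str:
    """Use the model when it has an answer; otherwise ask the operator and learn."""
    method = recommend(fault)
    if method is None:                    # ASSUMPTION: None means "cannot determine"
        method = ask_operator(fault)      # operator supplies the target repair method
        samples.append((fault, method))   # keep the pair as new sample data
        retrain(samples)                  # update the recommendation model
    return method


# Hypothetical stand-in "model": a lookup table rebuilt on every retrain.
known: Dict[str, str] = {"fan speed is zero": "restart_fan_controller"}
samples: List[Sample] = list(known.items())


def retrain(data: List[Sample]) -> None:
    known.clear()
    known.update(dict(data))


first = resolve_repair(
    "PSU voltage drop", known.get, lambda f: "switch_to_backup_psu", samples, retrain
)
# The second call no longer needs the operator: the model was updated.
second = resolve_repair(
    "PSU voltage drop", known.get, lambda f: "unused", samples, retrain
)
```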
On the basis of the foregoing embodiment, the present embodiment further describes and optimizes the technical solution, and specifically, after inputting the current fault information into a fault repair recommendation method model trained in advance by using a fault repair method library and a deep neural network, and outputting a corresponding target repair method by using the fault repair recommendation method model, the present embodiment further includes:
judging whether the target repairing method can be automatically executed or not;
if so, performing fault repairing on the target server by using a target repairing method;
if not, sending out corresponding prompt information.
Specifically, in this embodiment, after the corresponding target repair method is output by the fault repair recommendation method model, it is further determined whether the target repair method can be executed automatically. Whether automatic execution is possible may be determined from the operation level or the operation authority of the target repair method. For example, the operation level may take one of three values: common, warning and critical. Operations at the common level may be executed automatically, whereas operations at the warning and critical levels require prompt information to be sent and are executed by operation and maintenance personnel.
It should be noted that, in actual operation, whether manual review is needed can be judged according to the operation authority. For a target repair method that does not need manual review, the target server automatically executes the fault repair operation and checks whether the repair succeeded. For an operation that needs manual review, operation and maintenance personnel execute the fault repair operation manually. If the automatically executed fault repair operation fails, corresponding prompt information can also be sent to prompt operation and maintenance personnel to review the fault and execute the repair operation manually.
It should be noted that, in this embodiment, the prompt device such as a buzzer and/or an indicator light may be triggered to operate to send the prompt information, and the specific manner of sending the prompt information is not limited in this embodiment.
Therefore, according to the embodiment, whether the target repairing method can be automatically executed is further judged, and the corresponding prompt information is sent out when the target repairing method cannot be automatically executed, so that the server fault can be further ensured to be repaired in time, and the efficiency of repairing the server fault is further improved.
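The decision described in this embodiment can be sketched as a small dispatcher keyed on the operation level. The three level names come from the text; the function name, the returned status strings, the prompt wording and the example operations are hypothetical.

```python
from enum import Enum
from typing import Callable, List


class OperationLevel(Enum):
    COMMON = "common"
    WARNING = "warning"
    CRITICAL = "critical"


def dispatch_repair(
    method: str,
    level: OperationLevel,
    execute: Callable[[str], bool],
    notify: Callable[[str], None],
) -> str:
    """Execute common-level repairs automatically; prompt an operator otherwise."""
    if level is OperationLevel.COMMON:
        if execute(method):
            return "repaired automatically"
        # Automatic repair failed: fall back to manual review.
        notify(f"automatic repair failed, manual review needed: {method}")
        return "handed to operator"
    notify(f"{level.value} operation needs manual execution: {method}")
    return "handed to operator"


prompts: List[str] = []
outcome_common = dispatch_repair(
    "clear_temp_files", OperationLevel.COMMON, lambda m: True, prompts.append
)
outcome_critical = dispatch_repair(
    "reflash_bmc", OperationLevel.CRITICAL, lambda m: True, prompts.append
)
```

Here `notify` stands in for whatever prompt device is used (the text mentions a buzzer or an indicator light); collecting the messages in a list simply makes the behavior visible.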
On the basis of the foregoing embodiment, this embodiment further describes and optimizes the technical solution, and specifically, in this embodiment, the process of obtaining the current failure information of the target server specifically includes:
and inputting the log of the target server into a fault diagnosis system to obtain the current fault information of the target server.
Specifically, in this embodiment, the log of the target server is obtained and then input into a fault diagnosis system, which analyzes the log and directly diagnoses the current fault information of the target server. The current fault information includes a fault name, a fault type, a fault level, a fault description, a fault cause and similar information. It should be noted that the fault diagnosis system is used for log analysis when a server fails and can locate a server fault accurately and quickly. The fault diagnosis system may, for example, be ISCDS (Inspur Server Cloud Diagnostic System); this embodiment does not limit the specific type of the fault diagnosis system.
It should be noted that, the log of the target server is input into the fault diagnosis system to obtain the corresponding current fault information, the operation process is convenient, and the current fault information can be determined quickly and accurately.
As a preferred embodiment, the process of inputting the log of the target server into the fault diagnosis system to obtain the current fault information of the target server specifically includes:
and inputting the log of the target server into a fault diagnosis system according to a preset time period to obtain the current fault information of the target server.
Specifically, in this embodiment, a time period is set in advance. Either the log of the target server is obtained once per preset time period and input into the fault diagnosis system, or an already obtained log is input into the fault diagnosis system once per preset time period. In both cases the corresponding current fault information is obtained, so that the current fault information of the target server can be obtained more comprehensively and in a more timely manner.
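Periodic acquisition of this kind might be sketched as a simple polling loop. The function name `poll_fault_info`, its parameters, and the bounded number of rounds are assumptions made for illustration; the description specifies only that the log is input according to a preset time period.

```python
import time
from typing import Callable, Iterator, Optional


def poll_fault_info(
    read_log: Callable[[], str],
    diagnose: Callable[[str], Optional[dict]],
    period_seconds: float,
    max_rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield the fault information found in each polling round."""
    for round_index in range(max_rounds):
        info = diagnose(read_log())  # hypothetical diagnosis call
        if info is not None:
            yield info
        if round_index < max_rounds - 1:
            sleep(period_seconds)  # wait one preset time period
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; a long-running service would more likely use a scheduler than a bounded loop.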
The above detailed description is given for the embodiment of the server fault repairing method provided by the present invention, and the present invention also provides a server fault repairing apparatus, a device and a computer-readable storage medium corresponding to the method.
Fig. 3 is a structural diagram of a server fault repairing apparatus according to an embodiment of the present invention. As shown in fig. 3, the server fault repairing apparatus includes:
an obtaining module 31, configured to obtain current fault information of a target server;
a determining module 32, configured to input the current fault information into a fault repairing recommendation method model trained in advance by using a fault repairing method library and a deep neural network, and to output a corresponding target repairing method by using the fault repairing recommendation method model;
and a repair module 33, configured to perform fault repair on the target server by using the target repair method.
The server fault repairing apparatus provided by the embodiment of the invention has the same beneficial effects as the server fault repairing method.
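The cooperation of the three modules can be sketched as follows. The class name `ServerFaultRepairApparatus` and the callable signatures are assumptions made for illustration. In particular, the determining module is represented by an arbitrary callable rather than the trained deep neural network model of the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ServerFaultRepairApparatus:
    # Obtaining module: returns the current fault information, or None.
    obtain: Callable[[str], Optional[str]]
    # Determining module: maps fault information to a target repairing method.
    determine: Callable[[str], Optional[str]]
    # Repair module: applies the target repairing method to the server.
    repair: Callable[[str, str], None]

    def run(self, server: str) -> Optional[str]:
        """Run obtain -> determine -> repair; return the method applied, if any."""
        fault = self.obtain(server)
        if fault is None:
            return None  # no fault reported for this server
        method = self.determine(fault)
        if method is None:
            return None  # the model could not determine a method
        self.repair(server, method)
        return method
```

Keeping each module behind a plain callable mirrors the module split of fig. 3 and lets each part be replaced independently.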
As a preferred embodiment, the server fault repairing apparatus further includes:
a receiving module, configured to receive a target repairing method that corresponds to the current fault information and is input by operation and maintenance personnel when the target repairing method corresponding to the current fault information cannot be determined by using the fault repairing recommendation method model.
As a preferred embodiment, the server fault repairing apparatus further includes:
an updating module, configured to, after the target repairing method that is input by the operation and maintenance personnel and corresponds to the current fault information is received, set the current fault information and the corresponding target repairing method as the input and the output of the deep neural network respectively, perform model training, and update the fault repairing recommendation method model.
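The interplay of the receiving module and the updating module can be sketched as below. Several simplifications are assumed here and are not taken from the embodiment: a single-layer softmax classifier over word features stands in for the deep neural network, "cannot be determined" is interpreted as the top probability falling below a hypothetical threshold, and the model is retrained from scratch on every update. The weights and biases are fitted by gradient descent on the cross-entropy loss.

```python
import math
from typing import Callable, List, Optional, Tuple

Sample = Tuple[str, str]  # (fault information text, repairing method)


class RepairRecommender:
    """Single-layer softmax classifier; a small stand-in for the deep network."""

    def __init__(self, threshold: float = 0.6) -> None:
        self.threshold = threshold  # hypothetical "cannot determine" cutoff
        self.samples: List[Sample] = []
        self.vocab: List[str] = []
        self.methods: List[str] = []
        self.weights: List[List[float]] = []
        self.biases: List[float] = []

    def _features(self, text: str) -> List[float]:
        words = set(text.lower().split())
        return [1.0 if word in words else 0.0 for word in self.vocab]

    def _probabilities(self, x: List[float]) -> List[float]:
        scores = [sum(w * v for w, v in zip(row, x)) + b
                  for row, b in zip(self.weights, self.biases)]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def train(self, samples: List[Sample], epochs: int = 300, lr: float = 0.5) -> None:
        """Fit weights and biases by per-sample gradient descent."""
        self.samples = list(samples)
        self.vocab = sorted({w for text, _ in self.samples for w in text.lower().split()})
        self.methods = sorted({m for _, m in self.samples})
        self.weights = [[0.0] * len(self.vocab) for _ in self.methods]
        self.biases = [0.0] * len(self.methods)
        for _ in range(epochs):
            for text, method in self.samples:
                x = self._features(text)
                probs = self._probabilities(x)
                for k, name in enumerate(self.methods):
                    # Cross-entropy gradient w.r.t. the score of class k.
                    grad = probs[k] - (1.0 if name == method else 0.0)
                    self.biases[k] -= lr * grad
                    for j, value in enumerate(x):
                        self.weights[k][j] -= lr * grad * value

    def recommend(self, text: str) -> Optional[str]:
        if not self.methods:
            return None
        probs = self._probabilities(self._features(text))
        best = max(range(len(probs)), key=probs.__getitem__)
        return self.methods[best] if probs[best] >= self.threshold else None

    def update(self, text: str, method: str) -> None:
        """Updating module: add the new pair as a sample and retrain."""
        self.train(self.samples + [(text, method)])


def resolve(model: RepairRecommender, fault_text: str,
            ask_operator: Callable[[str], str]) -> str:
    method = model.recommend(fault_text)
    if method is None:
        method = ask_operator(fault_text)  # receiving module
        model.update(fault_text, method)   # updating module
    return method
```

After one operator answer, the same fault information is resolved by the model itself, which is the effect the updating module is intended to achieve.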
As a preferred embodiment, the server fault repairing apparatus further includes:
a judging module, configured to judge whether the target repairing method can be executed automatically, and to call the repair module if so or the prompt module if not;
and a prompt module, configured to send out corresponding prompt information.
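The dispatch performed by the judging module can be sketched as follows. The embodiment does not state how automatic executability is decided, so the registry of automated actions, the function name `handle_repair`, and the wording of the prompt message are assumptions made for illustration.

```python
from typing import Callable, Dict


def handle_repair(
    server: str,
    method: str,
    automated: Dict[str, Callable[[str], None]],
    notify: Callable[[str], None],
) -> bool:
    """Run the repair if it is automated; otherwise prompt the operator.

    Returns True when the repair was executed automatically.
    """
    action = automated.get(method)  # judging module: is there an automated action?
    if action is None:
        # Prompt module: hand the repair over to operation and maintenance personnel.
        notify(f"Manual repair needed on {server}: {method}")
        return False
    action(server)  # repair module
    return True
```

For example, a method such as restarting a service could be registered as automated, while a method that requires replacing hardware would fall through to the prompt.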
Fig. 4 is a structural diagram of a server fault repairing device according to an embodiment of the present invention. As shown in fig. 4, the server fault repairing device includes:
a memory 41 for storing a computer program;
a processor 42 for implementing the steps of the server fault repairing method described above when executing the computer program.
The server fault repairing device provided by the embodiment of the invention has the beneficial effects of the server fault repairing method.
In order to solve the above technical problem, the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the server fault repairing method.
The computer-readable storage medium provided by the embodiment of the invention has the beneficial effects of the server fault repairing method.
The server fault repairing method, device, equipment and computer readable storage medium provided by the invention are described in detail above. The principles and embodiments of the present invention are explained herein using specific examples, which are set forth only to help understand the method and its core ideas of the present invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
The embodiments in the specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant details can be found in the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

Claims (10)

1. A method for repairing a server failure, comprising:
acquiring current fault information of a target server;
inputting the current fault information into a fault repairing recommendation method model obtained by utilizing a fault repairing method library and deep neural network training in advance, and outputting a corresponding target repairing method by utilizing the fault repairing recommendation method model;
and performing fault repair on the target server by using the target repair method.
2. The method according to claim 1, wherein training the fault repairing recommendation method model specifically comprises:
setting sample fault information and a sample repairing method corresponding to the sample fault information in the fault repairing method library as input and output of the deep neural network respectively;
and determining the weight vector and the bias of the deep neural network by using a gradient descent algorithm to obtain the fault repairing recommendation method model.
3. The method according to claim 2, wherein when the target repairing method corresponding to the current fault information cannot be determined by using the fault repairing recommendation method model, the method further comprises:
and receiving a target repairing method corresponding to the current fault information and input by operation and maintenance personnel.
4. The method of claim 3, wherein after receiving the target repair method corresponding to the current fault information input by the operation and maintenance personnel, the method further comprises:
and respectively setting the current fault information and the corresponding target repairing method as the input and the output of the deep neural network for model training, and updating the fault repairing recommendation method model.
5. The method according to claim 1, wherein after the inputting the current fault information into a fault repair recommendation method model trained by a fault repair method library and a deep neural network in advance and outputting a corresponding target repair method by using the fault repair recommendation method model, the method further comprises:
judging whether the target repairing method can be automatically executed or not;
if so, performing fault repairing on the target server by using the target repairing method;
if not, sending out corresponding prompt information.
6. The method according to any one of claims 1 to 5, wherein the process of obtaining the current fault information of the target server specifically includes:
and inputting the log of the target server into a fault diagnosis system to obtain the current fault information of the target server.
7. The method according to claim 6, wherein the step of inputting the log of the target server into a fault diagnosis system to obtain the current fault information of the target server specifically includes:
and inputting the log of the target server into the fault diagnosis system according to a preset time period to obtain the current fault information of the target server.
8. A server fault repairing apparatus, comprising:
an obtaining module, configured to obtain current fault information of a target server;
a determining module, configured to input the current fault information into a fault repairing recommendation method model trained in advance by using a fault repairing method library and a deep neural network, and to output a corresponding target repairing method by using the fault repairing recommendation method model;
and a repair module, configured to perform fault repair on the target server by using the target repairing method.
9. A server fault repairing device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the server fault repairing method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the server fault repairing method according to any one of claims 1 to 7.
CN202010988093.9A 2020-09-18 2020-09-18 Server fault repairing method, device, equipment and storage medium Withdrawn CN112131033A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010988093.9A CN112131033A (en) 2020-09-18 2020-09-18 Server fault repairing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010988093.9A CN112131033A (en) 2020-09-18 2020-09-18 Server fault repairing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112131033A true CN112131033A (en) 2020-12-25

Family

ID=73841520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010988093.9A Withdrawn CN112131033A (en) 2020-09-18 2020-09-18 Server fault repairing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112131033A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113886130A (en) * 2021-10-21 2022-01-04 深信服科技股份有限公司 Method, device and medium for processing database fault
CN113986618A (en) * 2021-11-08 2022-01-28 苏州浪潮智能科技有限公司 Cluster brain split automatic repairing method, system, device and storage medium
CN113986618B (en) * 2021-11-08 2023-11-10 苏州浪潮智能科技有限公司 Cluster brain fracture automatic repair method, system, device and storage medium
CN114647531A (en) * 2022-05-19 2022-06-21 武汉四通信息服务有限公司 Failure solving method, failure solving system, electronic device, and storage medium
CN114647531B (en) * 2022-05-19 2022-07-29 武汉四通信息服务有限公司 Failure solving method, failure solving system, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201225
