CN115643158A - Equipment cluster repairing method, device, equipment and storage medium - Google Patents

Equipment cluster repairing method, device, equipment and storage medium

Info

Publication number
CN115643158A
Authority
CN
China
Prior art keywords
equipment
cluster
fault
repairing
repair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211314277.2A
Other languages
Chinese (zh)
Inventor
张春和
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd
Priority to CN202211314277.2A
Publication of CN115643158A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a method, a device, equipment and a storage medium for repairing an equipment cluster, wherein the method comprises the following steps: responding to a cluster query instruction, querying an equipment database, acquiring equipment information of a target equipment cluster corresponding to the cluster query instruction, and generating an equipment detection instruction corresponding to the equipment information; sending the device detection instruction to each cluster device in the target device cluster, receiving device feedback information of each cluster device, coding the device feedback information, and generating device feedback characteristics of the target device cluster; performing fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster; and repairing the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result. The detection and repair efficiency of the fault equipment in the equipment cluster is improved.

Description

Equipment cluster repairing method, device, equipment and storage medium
Technical Field
The application relates to the technical field of internet of things, in particular to a method, a device, equipment and a storage medium for repairing an equipment cluster.
Background
Currently, with the development of digital technology, many enterprises implement their services by deploying a variety of service systems. An existing financial system contains many subsystems, and each subsystem relies on a server cluster, formed from multiple physical or virtual servers, to carry out its functions. However, in scenarios such as large-scale version releases, applications going online or offline, and machine room shutdowns for maintenance, operation and maintenance personnel find it difficult to quickly identify and repair faulty equipment in an equipment cluster. Latent faulty equipment in the cluster may therefore cause program errors at run time and reduce the service efficiency of the service system.
Disclosure of Invention
The embodiments of the application provide a method, a device, equipment and a storage medium for repairing a device cluster, and aim to solve the technical problem in the prior art that faulty equipment in a device cluster is difficult to locate and repair accurately.
In one aspect, an embodiment of the present application provides a device cluster repair method, where the device cluster repair method includes the following steps:
responding to a cluster query instruction, querying an equipment database, acquiring equipment information of a target equipment cluster corresponding to the cluster query instruction, and generating an equipment detection instruction corresponding to the equipment information;
sending the device detection instruction to each cluster device in the target device cluster, receiving device feedback information of each cluster device, coding the device feedback information, and generating device feedback characteristics of the target device cluster;
performing fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster;
and repairing the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result.
In a possible implementation manner of the present application, the receiving device feedback information of each cluster device, encoding the device feedback information, and generating the device feedback feature of the target device cluster includes:
receiving equipment feedback information of each cluster equipment, and generating a cluster state image of the target equipment cluster according to the equipment feedback information and the equipment type of each cluster equipment;
and inputting the cluster state image into a preset cluster detection model for coding to obtain the equipment feedback characteristics of the cluster state image.
In a possible implementation manner of the present application, the performing fault analysis on the device feedback feature to obtain a faulty device of the target device cluster includes:
inputting the equipment feedback characteristics into a fully connected layer in the cluster detection model for classification processing to obtain equipment state characteristics in the equipment feedback characteristics;
matching the equipment state characteristic with a preset fault characteristic to obtain fault similarity between the equipment state characteristic and the preset fault characteristic;
and marking cluster equipment corresponding to the equipment state characteristics with the fault similarity larger than the preset similarity threshold as fault equipment.
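The matching and marking steps above can be illustrated with a minimal sketch. The application does not name a similarity measure, so cosine similarity is assumed here; the function names, the feature vectors, and the 0.9 threshold are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def mark_faulty_devices(state_features, fault_features, threshold=0.9):
    """Return the IDs of devices whose best fault similarity exceeds the threshold."""
    faulty = []
    for device_id, feature in state_features.items():
        # Compare against every preset fault feature and keep the best match.
        best = max((cosine_similarity(feature, f) for f in fault_features), default=0.0)
        if best > threshold:
            faulty.append(device_id)
    return faulty

# Hypothetical per-device state features and one preset fault feature.
state_features = {
    "srv-01": [0.9, 0.1, 0.0],
    "srv-02": [0.1, 0.9, 0.2],
}
fault_features = [[1.0, 0.0, 0.0]]
faulty = mark_faulty_devices(state_features, fault_features)
```

In this sketch only srv-01 points in nearly the same direction as the preset fault feature, so it is the only device marked as faulty.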
In a possible implementation manner of the present application, the repairing the faulty device according to the device type of the faulty device and the device feedback information of the faulty device to obtain a fault repairing result includes:
accessing a preset equipment database to obtain the equipment type of the fault equipment;
reading the equipment dimension parameter in the equipment feedback information, and determining the equipment fault level of the fault equipment according to the equipment dimension parameter and a preset fault level threshold;
and repairing the fault equipment according to the fault repairing strategy corresponding to the equipment type and the equipment fault level to obtain a fault repairing result.
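As an illustration of deriving a fault level from a dimension parameter and preset thresholds, the sketch below maps a single numeric reading to a level. The parameter (an error rate), the threshold values, and the level names are assumptions made for the example; the application does not specify them.

```python
from bisect import bisect_right

# Hypothetical boundaries between levels for an error-rate reading.
LEVEL_THRESHOLDS = [0.70, 0.85, 0.95]
LEVEL_NAMES = ["normal", "minor", "major", "critical"]

def fault_level(value, thresholds=LEVEL_THRESHOLDS, names=LEVEL_NAMES):
    """Map a reading to a level; a value equal to a threshold counts as reaching that level."""
    return names[bisect_right(thresholds, value)]

levels = [fault_level(v) for v in (0.50, 0.70, 0.90, 0.99)]
```

Keeping the thresholds in a sorted list means a new level can be added by extending both lists, without changing the lookup function.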
In a possible implementation manner of the present application, the repairing the faulty device according to the device type of the faulty device and the device feedback information of the faulty device to obtain a fault repair result includes:
acquiring a repair check code and a feedback time threshold of the fault repair strategy;
receiving a repair check code transmitted back by the fault equipment and repair feedback time related to the repair check code;
and if the repair check code transmitted back by the fault equipment matches the repair check code of the fault repair strategy, and the repair feedback time is less than the feedback time threshold, generating a fault repair result indicating that the fault has been repaired.
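The verification described above reduces to two conditions that must both hold. The following minimal sketch assumes the check codes are strings and the times are in seconds; the function name and the result labels are hypothetical.

```python
def verify_repair(expected_code, returned_code, feedback_seconds, max_feedback_seconds):
    """Report 'repaired' only if the returned code matches and the feedback arrived in time."""
    codes_match = returned_code == expected_code
    in_time = feedback_seconds < max_feedback_seconds
    return "repaired" if codes_match and in_time else "not_repaired"

ok = verify_repair("a1b2c3", "a1b2c3", feedback_seconds=2.5, max_feedback_seconds=10)
late = verify_repair("a1b2c3", "a1b2c3", feedback_seconds=12.0, max_feedback_seconds=10)
wrong = verify_repair("a1b2c3", "zzzzzz", feedback_seconds=2.5, max_feedback_seconds=10)
```

A matching code that arrives after the threshold is treated the same as a mismatched code: in both cases the repair is not confirmed.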
In a possible implementation manner of the present application, the obtaining device information of a target device cluster corresponding to the cluster inquiry instruction and generating a device detection instruction corresponding to the device information includes:
inquiring an equipment database, and acquiring a target equipment cluster corresponding to the cluster inquiry instruction in the equipment database;
acquiring equipment information of each cluster equipment in the target equipment cluster and historical detection data of the cluster equipment;
and reading the dimension detection parameters corresponding to the target fields in the historical detection data, and generating a device detection instruction of the cluster device according to the device information and the dimension detection parameters.
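One way to picture the instruction generation above is as a merge of device information with the detection parameters stored under the target fields of the historical detection data. The field names, the record layout, and the function name in this sketch are hypothetical.

```python
def build_detection_instruction(device_info, history, target_fields):
    """Combine device info with the detection parameters found under the target fields."""
    # Only fields that actually exist in the historical data are included.
    checks = {field: history[field] for field in target_fields if field in history}
    return {
        "device_id": device_info["device_id"],
        "address": device_info["address"],
        "checks": checks,
    }

# Hypothetical device record and historical detection data.
device_info = {"device_id": "srv-01", "address": "10.0.0.11"}
history = {"cpu_limit": 0.85, "disk_limit": 0.90, "last_operator": "ops-team"}
instruction = build_detection_instruction(
    device_info, history, ["cpu_limit", "disk_limit", "mem_limit"]
)
```

Fields outside the target list (here `last_operator`) are ignored, and a target field missing from the history (here `mem_limit`) is simply skipped.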
In a possible implementation manner of the present application, the querying the device database to obtain the target device cluster corresponding to the cluster querying instruction in the device database includes:
reading equipment dimension parameters in the cluster query instruction, wherein the equipment dimension parameters comprise at least one of an equipment identifier, an application identifier, a machine room dimension identifier and an equipment sub-environment identifier;
inputting the equipment dimension parameters into a preset equipment database for query to obtain target cluster equipment corresponding to the equipment dimension parameters;
and summarizing the target cluster equipment, and generating a target equipment cluster corresponding to the cluster inquiry instruction.
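The query by device dimension parameters can be sketched with an in-memory SQLite table. The table layout, the column names, and the sample rows are assumptions for illustration only; the application does not describe the schema of the device database.

```python
import sqlite3

# Hypothetical column names for the four dimension parameters named in the text.
ALLOWED_DIMENSIONS = ("device_id", "app_id", "room_id", "sub_env")

def query_target_cluster(conn, **dimensions):
    """Return the device IDs matching every supplied dimension parameter."""
    if not dimensions or set(dimensions) - set(ALLOWED_DIMENSIONS):
        raise ValueError("at least one known dimension parameter is required")
    # Column names come from the whitelist above; values are bound as parameters.
    clauses = " AND ".join(f"{column} = ?" for column in dimensions)
    sql = f"SELECT device_id FROM devices WHERE {clauses} ORDER BY device_id"
    return [row[0] for row in conn.execute(sql, tuple(dimensions.values()))]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE devices (device_id TEXT PRIMARY KEY, app_id TEXT, room_id TEXT, sub_env TEXT)"
)
conn.executemany(
    "INSERT INTO devices VALUES (?, ?, ?, ?)",
    [
        ("srv-01", "payments", "room-a", "prod"),
        ("srv-02", "payments", "room-b", "prod"),
        ("srv-03", "loans", "room-a", "test"),
    ],
)
cluster = query_target_cluster(conn, app_id="payments", sub_env="prod")
```

The returned list plays the role of the summarized target device cluster; supplying a different combination of dimensions selects a different cluster from the same table.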
In another aspect, the present application provides an apparatus for repairing an equipment cluster, where the apparatus for repairing an equipment cluster includes:
the detection query module is configured to respond to a cluster query instruction, query an equipment database, acquire equipment information of a target equipment cluster corresponding to the cluster query instruction, and generate an equipment detection instruction corresponding to the equipment information;
a feature encoding module, configured to send the device detection instruction to each cluster device in the target device cluster, receive device feedback information of each cluster device, encode the device feedback information, and generate a device feedback feature of the target device cluster;
the fault analysis module is configured to perform fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster;
and the fault repairing module is configured to repair the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result.
In another aspect, the present application also provides an apparatus, comprising:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the device cluster repair method.
In another aspect, the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is loaded by a processor to execute the steps in the device cluster repairing method.
In the method, a device database is queried by responding to a cluster query instruction, device information of a target device cluster corresponding to the cluster query instruction is obtained, and a device detection instruction corresponding to the device information is generated; sending the device detection instruction to each cluster device in the target device cluster, receiving device feedback information of each cluster device, coding the device feedback information, and generating device feedback characteristics of the target device cluster; performing fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster; and repairing the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result. The method and the device realize accurate monitoring of each equipment cluster, can quickly and efficiently position the fault equipment in the equipment cluster and repair the fault equipment in multiple dimensions, and improve the detection and repair efficiency of the fault equipment.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic view of a scene of an apparatus cluster repair method according to an embodiment of the present application;
fig. 2 is a schematic flowchart of an embodiment of a device cluster repairing method in the embodiment of the present application;
fig. 3 is a schematic flowchart of an embodiment of performing fault analysis on device feedback characteristics in the device cluster repair method according to the embodiment of the present application;
fig. 4 is a schematic flowchart of an embodiment of repairing a failed device to obtain a failure repair result in the device cluster repair method provided in the embodiment of the present application;
fig. 5 is a schematic structural diagram of an embodiment of an apparatus cluster repair device provided in the embodiment of the present application;
fig. 6 is a schematic structural diagram of an embodiment of a device cluster repair device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc. indicate orientations or positional relationships based on those shown in the drawings. They are used only for convenience and simplicity of description, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they should therefore not be considered as limiting the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In this application, the word "exemplary" is used to mean "serving as an example, instance, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the invention. In the following description, details are set forth for the purpose of explanation. It will be apparent to one of ordinary skill in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and processes are not shown in detail to avoid obscuring the description of the invention with unnecessary detail. Thus, the present invention is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
Currently, with the development of digital technology, many enterprises implement their services by deploying a variety of service systems. An existing financial system contains many subsystems, and each subsystem relies on a server cluster, formed from multiple physical or virtual servers, to carry out its functions. However, in scenarios such as large-scale version releases, applications going online or offline, and machine room shutdowns for maintenance, operation and maintenance personnel find it difficult to quickly identify and repair faulty equipment in the equipment cluster. Latent faulty equipment in the cluster may therefore cause program errors at run time and reduce the service efficiency of the service system.
Based on this, the application provides a method, an apparatus, a device and a computer readable storage medium for repairing a device cluster, so as to solve the technical problem that it is difficult to accurately position and repair a faulty device in the device cluster in the prior art.
The device cluster repairing method in the embodiment of the invention is applied to a device cluster repairing apparatus, the device cluster repairing apparatus is arranged in a device cluster repairing device, the device cluster repairing device is provided with one or more processors, a memory and one or more application programs, wherein the one or more application programs are stored in the memory and are configured to be executed by the processor to implement the device cluster repairing method; the device cluster repair device can be an intelligent terminal, such as a mobile phone, a tablet computer, an intelligent television, a network device, an intelligent computer and the like; optionally, the device cluster repair device may also be a device or a service cluster formed by multiple devices.
Fig. 1 is a schematic diagram of a scenario for the device cluster repair method in an embodiment of the present application. The scenario includes a device cluster repair device 100 (in which a device cluster repair apparatus is integrated) and a target device cluster 200. The device cluster repair device 100 runs the program stored on the computer-readable storage medium corresponding to the device cluster repair method, thereby executing the steps of the method. The target device cluster 200 may be a server cluster or an intelligent terminal cluster composed of a plurality of servers or devices.
It should be understood that the device cluster repair device shown in the scenario of fig. 1, and the apparatuses it includes, do not limit the embodiments of the present invention. That is, the number and types of devices included in the scenario, and the number and types of apparatuses included in each device, do not affect the overall implementation of the technical solution in the embodiments of the present invention, and may all be regarded as equivalent replacements or derivatives of the technical solution claimed in the embodiments of the present invention.
The device cluster repairing apparatus 100 in the embodiment of the present invention is mainly configured to: responding to a cluster query instruction, querying an equipment database, acquiring equipment information of a target equipment cluster corresponding to the cluster query instruction, and generating an equipment detection instruction corresponding to the equipment information; sending the device detection instruction to each cluster device in the target device cluster, receiving device feedback information of each cluster device, coding the device feedback information, and generating device feedback characteristics of the target device cluster; performing fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster; and repairing the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result.
The device cluster repairing apparatus 100 in the embodiment of the present invention may be an independent device cluster repairing apparatus, for example, an intelligent terminal such as a mobile phone, a tablet computer, an intelligent television, a network device, an apparatus, and an intelligent computer, or may be a device cluster repairing network or a device cluster repairing cluster composed of multiple device cluster repairing apparatuses.
The embodiments of the present application provide a method, an apparatus, a device and a computer-readable storage medium for repairing a device cluster, which are described in detail below.
Those skilled in the art will understand that the application environment shown in fig. 1 is only one application scenario of the present solution and does not limit its application scenarios. Other application environments may include more or fewer device cluster repair devices than shown in fig. 1, or a different network connection relationship among them. For example, fig. 1 shows only one device cluster repair device, but the scenario of the device cluster repair method may include one or more further device cluster repair devices, which is not limited here. The device cluster repair device 100 may further comprise a memory for storing cluster device information and other data.
It should be noted that the scene schematic diagram of the device cluster repairing method shown in fig. 1 is only an example, and the scene of the device cluster repairing method described in the embodiment of the present invention is for more clearly illustrating the technical solution of the embodiment of the present invention, and does not limit the technical solution provided by the embodiment of the present invention.
Based on the above scenario of the device cluster repairing method, various embodiments of the device cluster repairing method disclosed by the present invention are proposed.
Fig. 2 is a schematic flowchart of an embodiment of the device cluster repairing method in the embodiment of the present application. The device cluster repairing method includes the following steps 201 to 204:
201. Responding to a cluster query instruction, querying an equipment database, acquiring equipment information of a target equipment cluster corresponding to the cluster query instruction, and generating an equipment detection instruction corresponding to the equipment information;
The device cluster repairing method in this embodiment is applied to the device cluster repairing device, whose type and number are not specifically limited. That is, the device cluster repairing device may be one or more intelligent terminals or devices; in a specific embodiment, the device cluster repairing device is an intelligent computer.
Specifically, the device cluster repairing device is configured to respond to a cluster query instruction, query a preset device database, obtain the target device cluster corresponding to the cluster query instruction from the device database, locate the faulty devices in the target device cluster, and repair them to obtain a fault repair result. The target device cluster is a virtual or physical server cluster deployed in a service system to support the operation of that service system.
Specifically, in the operation process of the device cluster repairing device, the device cluster repairing device receives a cluster query instruction and obtains a target device cluster corresponding to the cluster query instruction. The triggering manner of the cluster query instruction is not specifically limited herein, that is, the cluster query instruction may be actively triggered by the user, for example, in an embodiment, the user is an operation and maintenance person of the financial service system, and the cluster query instruction is actively triggered by inputting device query information of a specific dimension to the device cluster repair device. Optionally, the cluster query instruction may also be automatically triggered by the device cluster repairing device, for example, the device cluster repairing device sets a timing repairing process in advance, and automatically generates the cluster query instruction according to a preset query policy within a preset time period.
Specifically, after acquiring the cluster query instruction, the device cluster repair device reads each device dimension parameter in the cluster query instruction, where the device dimension parameter is a query parameter for querying a device with a specific dimension, and the device dimension parameter includes at least one of a device identifier, an application identifier, a machine room dimension identifier, and a device sub-environment identifier.
After the device cluster repairing device obtains the device dimension parameters in the cluster query instruction, the device dimension parameters are input into a preset device database for querying, and target cluster devices corresponding to the device dimension parameters in the device database are obtained. And the equipment cluster repairing equipment collects all the target cluster equipment and generates a target equipment cluster corresponding to the cluster inquiry instruction.
After the device cluster repairing device obtains the target device cluster, the device cluster repairing device also obtains device information of each cluster device in the target device cluster and historical detection data of the cluster device. After the historical detection data of the cluster equipment is obtained, a preset target field in the historical detection data is identified, and a dimension detection parameter corresponding to the target field is read. The dimension detection parameter is a multi-dimension detection parameter for health detection of the cluster equipment. And the equipment cluster repairing equipment generates an equipment detection instruction of the cluster equipment according to the equipment information of the cluster equipment and the dimension detection parameter.
202. Sending the device detection instruction to each cluster device in the target device cluster, receiving device feedback information of each cluster device, coding the device feedback information, and generating device feedback characteristics of the target device cluster;
Specifically, after generating a device detection instruction for each cluster device in the target device cluster, the device cluster repair device obtains the protocol address of each cluster device, sends the device detection instruction to each cluster device through its protocol address, and performs health detection on the cluster device through the device detection instruction.
Specifically, after sending the device detection instruction, the device cluster repair device also detects the data transmission interface in real time, receives the device feedback information of each cluster device through the data transmission interface, and encodes the device feedback information, thereby generating the device feedback characteristic of the target device cluster.
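The send-and-collect step can be sketched as a concurrent fan-out over the protocol addresses. The transport is not described in the application, so the sketch takes a hypothetical `send` callable and uses a fake one for demonstration; the addresses, the feedback fields, and the decision to record unreachable devices as feedback are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def collect_feedback(addresses, send, max_workers=8):
    """Send a detection instruction to every address and gather per-device feedback."""
    def probe(item):
        device_id, address = item
        try:
            return device_id, send(address)
        except OSError as exc:
            # An unreachable device is itself useful feedback for fault analysis.
            return device_id, {"reachable": False, "error": str(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(probe, addresses.items()))

def fake_send(address):
    """Stand-in transport: one address is treated as timing out."""
    if address.endswith(".13"):
        raise ConnectionError("timed out")
    return {"reachable": True, "throughput": 120}

addresses = {"srv-01": "10.0.0.11", "srv-02": "10.0.0.13"}
feedback = collect_feedback(addresses, fake_send)
```

Probing in a thread pool keeps one slow or unreachable device from delaying the feedback of the others.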
Specifically, after the device cluster repairing device acquires the device feedback information, a cluster state image of the target device cluster is generated according to the device feedback information and the device type of the cluster device, and the cluster state image is input into a preset cluster detection model to be encoded, so that the device feedback characteristics of the target device cluster are obtained.
Specifically, the device cluster repair device parses the device feedback information, obtains the device type of each cluster device, generates a device node corresponding to the device type, and inputs the device feedback information together with the associated device node into a preset cluster image template, thereby generating a cluster state image. The cluster state image is a node connection image representing the operational connection relationships of the cluster devices in the device cluster. A connection relationship indicates that interactive communication is possible between cluster devices, and between a cluster device and the device cluster repairing device. Optionally, the cluster state image can also be represented in the form of an adjacency matrix.
After generating the cluster state image, the device cluster repairing device inputs it into a preset cluster detection model, which performs feature extraction on the adjacency matrix in the cluster state image to obtain the device feedback features of the cluster state image.
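The adjacency-matrix form of the cluster state image can be sketched as follows. The node names and the list of working links are hypothetical, and the sketch assumes links are symmetric, which the application does not state.

```python
def build_adjacency_matrix(device_ids, links):
    """Return an n x n 0/1 matrix where 1 marks a working link between two nodes."""
    index = {device_id: i for i, device_id in enumerate(device_ids)}
    n = len(device_ids)
    matrix = [[0] * n for _ in range(n)]
    for a, b in links:
        i, j = index[a], index[b]
        # Links are assumed to be bidirectional.
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix

# The repair device can reach both servers, but the servers cannot reach each other.
nodes = ["repair", "srv-01", "srv-02"]
links = [("repair", "srv-01"), ("repair", "srv-02")]
matrix = build_adjacency_matrix(nodes, links)
```

A matrix like this is a natural input for a feature-extraction model, since each row summarizes which peers a given node can currently communicate with.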
203. Performing fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster;
After the device cluster repairing device obtains the device feedback features corresponding to the cluster state image, it inputs them into the fully connected layer of the cluster detection model. The fully connected layer performs fault analysis on the device feedback features to determine the fault state features among them, and the faulty devices are located through these fault state features.
After obtaining the faulty devices in the target device cluster through fault analysis of the device feedback features, the device cluster repairing device repairs them to ensure normal operation of the device cluster.
204. And repairing the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result.
In this embodiment, after acquiring the faulty device in the device cluster, the device cluster repair device further generates a fault repair policy for the faulty device according to the device type of the faulty device and the device feedback information of the faulty device, and repairs the faulty device through the fault repair policy, thereby obtaining the fault repair result.
Specifically, the device cluster repair device accesses a preset device database to obtain the device type of the faulty device. It also reads the device feedback information corresponding to the faulty device, obtains the device dimension parameters in that information, and compares them with preset fault level thresholds to determine the device fault level of the faulty device. The device dimension parameters are parameters in multiple dimensions that characterize the operating state of the device, such as throughput.
The device cluster repair device also generates a plurality of fault repair strategies in advance, and associates the fault repair strategies with the device types and the device fault levels of the cluster devices. After the device cluster repair device obtains the device type and the device fault level of the fault device, a fault repair strategy corresponding to the device type and the device fault level is obtained, and the fault device is repaired through a repair rule in the fault repair strategy to obtain a fault repair result.
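The lookup described above can be sketched as follows. This is a minimal illustration, not the implementation of this application: the threshold values, the level names, the strategy names, and the use of a single "throughput drop" ratio as the device dimension parameter are all assumptions made for the example.

```python
# Hypothetical thresholds: a throughput drop ratio at or above a bound maps
# to that fault level. Bounds are checked from most to least severe.
FAULT_LEVEL_THRESHOLDS = [(0.8, "critical"), (0.5, "major"), (0.2, "minor")]

# Hypothetical strategy table generated in advance, keyed by
# (device type, device fault level).
REPAIR_STRATEGIES = {
    ("server", "critical"): "restart_host",
    ("server", "major"): "restart_service",
    ("server", "minor"): "clear_cache",
    ("switch", "critical"): "failover_link",
}


def fault_level(throughput_drop):
    """Compare one dimension parameter with the preset level thresholds."""
    for bound, level in FAULT_LEVEL_THRESHOLDS:
        if throughput_drop >= bound:
            return level
    return "none"


def select_strategy(device_type, throughput_drop):
    """Return the strategy associated with the type and level, or None."""
    level = fault_level(throughput_drop)
    return REPAIR_STRATEGIES.get((device_type, level))


print(select_strategy("server", 0.6))
```

Keying the table on the pair of device type and fault level keeps the association explicit, and a missing entry returns `None` so the caller can fall back to prompting a user, as in the optional embodiment.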
Optionally, in other embodiments, after determining the faulty device, the device cluster repair device further generates a fault repair prompt on a preset display interface, so as to prompt a user to troubleshoot the faulty device.
In this embodiment, the device cluster repair device queries the device database by responding to a cluster query instruction, acquires device information of a target device cluster corresponding to the cluster query instruction, and generates a device detection instruction corresponding to the device information; sending the device detection instruction to each cluster device in the target device cluster, receiving device feedback information of each cluster device, coding the device feedback information, and generating device feedback characteristics of the target device cluster; performing fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster; and repairing the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result. The method and the device realize accurate monitoring of each equipment cluster, can quickly and efficiently position the fault equipment in the equipment cluster and repair the fault equipment in multiple dimensions, and improve the detection and repair efficiency of the fault equipment.
As shown in fig. 3, fig. 3 is a schematic flowchart of an embodiment of performing fault analysis on device feedback characteristics in a device cluster repair method provided in the embodiment of the present application, and specifically includes steps 301 to 303:
301. inputting the equipment feedback characteristics into a fully connected layer in the cluster detection model for classification processing to obtain equipment state characteristics in the equipment feedback characteristics;
302. matching the equipment state characteristics with preset fault characteristics to obtain fault similarity between the equipment state characteristics and the preset fault characteristics;
303. and marking cluster equipment corresponding to the equipment state characteristics with the fault similarity larger than the preset similarity threshold as fault equipment.
Based on the foregoing embodiment, in this embodiment, after acquiring the device feedback feature corresponding to the cluster state image, the device cluster repair device inputs the device feedback feature into the fully connected layer of the cluster detection model and performs fault analysis on the device feedback feature through the fully connected layer, thereby determining the fault state feature in the device feedback feature and locating the faulty device through the fault state feature.
Specifically, after the device cluster repair device obtains the device state feature, the device state feature is matched with a preset fault feature, and the fault similarity between the device state feature and the preset fault feature is calculated.
Optionally, if the fault similarity is smaller than the preset similarity threshold, it is determined that the device status feature corresponding to the fault similarity is a normal status feature, and the device cluster repairing device determines that the cluster device corresponding to the normal status feature is a normal cluster device.
Optionally, if the fault similarity is greater than the preset similarity threshold, it is determined that the device status feature corresponding to the fault similarity is a fault status feature, and the device cluster repairing device determines that the cluster device corresponding to the fault status feature is a fault device.
After acquiring the faulty equipment in the equipment cluster, the equipment cluster repair equipment also generates a fault repair strategy of the faulty equipment according to the equipment type of the faulty equipment and the equipment feedback information of the faulty equipment, and repairs the faulty equipment through the fault repair strategy, so as to obtain a fault repair result.
In this embodiment, the device cluster repair device obtains the device state characteristics in the device feedback characteristics by inputting the device feedback characteristics into the fully connected layer in the cluster detection model for classification processing; matches the equipment state characteristic with a preset fault characteristic to obtain the fault similarity between the equipment state characteristic and the preset fault characteristic; and marks cluster equipment corresponding to the equipment state characteristics with a fault similarity larger than the preset similarity threshold as fault equipment. This achieves accurate locating of the fault equipment in the target equipment cluster.
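Steps 302 and 303 can be illustrated with a short sketch. The application does not specify how the fault similarity is computed, so cosine similarity is used here purely as an assumption; the feature vectors, device identifiers, and threshold value are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mark_faulty(state_features, fault_features, threshold):
    """Mark devices whose best fault similarity exceeds the threshold."""
    faulty = []
    for device_id, feature in state_features.items():
        best = max(
            (cosine_similarity(feature, f) for f in fault_features),
            default=0.0,
        )
        if best > threshold:  # strictly greater, as in step 303
            faulty.append(device_id)
    return faulty


# Hypothetical device state features and one preset fault feature.
states = {"dev-a": [1.0, 0.0], "dev-b": [0.0, 1.0]}
print(mark_faulty(states, [[1.0, 0.1]], threshold=0.9))
```

Devices at or below the threshold are treated as normal cluster devices, which matches the optional branches described above.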
As shown in fig. 4, fig. 4 is a schematic flow chart of an embodiment of repairing a failed device in the device cluster repairing method provided in the embodiment of the present application to obtain a failure repairing result, and specifically includes steps 401 to 403:
401. acquiring a repair check code and a feedback time threshold of the fault repair strategy;
402. receiving a repair check code returned by the fault equipment and repair feedback time corresponding to the fault equipment;
403. and if the repair check code returned by the fault equipment matches the repair check code of the fault repair strategy and the repair feedback time is less than the feedback time threshold, generating a fault repair result indicating that the fault has been repaired.
Based on the foregoing embodiment, in this embodiment, after the device cluster repair device repairs the faulty device according to the fault repair strategy, the device cluster repair device also verifies the repair of the faulty device, so as to determine the fault repair result of the faulty device.
Specifically, the device cluster repair device obtains a fault repair policy corresponding to the fault device, analyzes the fault repair policy, obtains a repair check code and a feedback time threshold in the fault repair policy, and determines a fault repair result of the fault device according to the repair check code and the feedback time threshold.
Specifically, after the device cluster repair device repairs the faulty device through the fault repair strategy, the device cluster repair device receives the repair check code sent back by the faulty device and the repair feedback time corresponding to the faulty device.
Specifically, after acquiring the repair check code of the fault repair strategy and the repair check code returned by the faulty device, the device cluster repair device matches the two repair check codes to obtain a matching result.
Optionally, if the matching result is a matching failure, that is, the returned repair check code does not match the repair check code of the fault repair strategy, it is determined that the fault repair result of the faulty device is that the fault is not repaired.
Optionally, if the matching result is that the matching is successful, that is, the returned repair check code matches the repair check code of the fault repair strategy, the device cluster repair device further obtains the repair feedback time corresponding to the faulty device, and if the repair feedback time does not exceed the preset feedback time threshold, it is determined that the fault repair result of the faulty device is that the fault is repaired.
In this embodiment, the device cluster repair device obtains the repair check code and the feedback time threshold of the fault repair strategy; receives the repair check code returned by the faulty device and the repair feedback time corresponding to the faulty device; and, if the returned repair check code matches the repair check code of the fault repair strategy and the repair feedback time is less than the feedback time threshold, generates a fault repair result indicating that the fault has been repaired. The fault repair accuracy for the faulty devices in the target device cluster is thereby improved.
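The verification in steps 401 to 403 can be sketched as below. The function name, the result strings, and the example values are hypothetical. The application does not state the outcome when the codes match but the feedback time reaches or exceeds the threshold; treating that case as "not repaired" is an assumption of this sketch.

```python
def verify_repair(expected_code, returned_code, feedback_time_s, time_threshold_s):
    """Decide the fault repair result from the check code and feedback time.

    expected_code:    repair check code taken from the fault repair strategy
    returned_code:    repair check code sent back by the faulty device
    feedback_time_s:  repair feedback time of the faulty device, in seconds
    time_threshold_s: feedback time threshold from the strategy, in seconds
    """
    if returned_code != expected_code:
        # Matching failure: the fault is not repaired.
        return "not_repaired"
    if feedback_time_s < time_threshold_s:
        # Codes match and the device answered in time (step 403).
        return "repaired"
    # Assumption: a late answer is treated as not repaired.
    return "not_repaired"


print(verify_repair("abc123", "abc123", 2.5, 5.0))
```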
In order to better implement the method for repairing the device cluster in the embodiment of the present application, based on the method for repairing the device cluster, an apparatus for repairing the device cluster is further provided in the embodiment of the present application, as shown in fig. 5, fig. 5 is a schematic structural diagram of an embodiment of the apparatus for repairing the device cluster provided in the embodiment of the present application, where the apparatus for repairing the device cluster 500 includes:
a detection query module 501 configured to respond to a cluster query instruction, query an equipment database, obtain equipment information of a target equipment cluster corresponding to the cluster query instruction, and generate an equipment detection instruction corresponding to the equipment information;
a feature encoding module 502 configured to send the device detection instruction to each cluster device in the target device cluster, receive device feedback information of each cluster device, encode the device feedback information, and generate a device feedback feature of the target device cluster;
a fault analysis module 503, configured to perform fault analysis on the device feedback characteristics to obtain a faulty device of the target device cluster;
a fault repairing module 504, configured to repair the faulty device according to the device type of the faulty device and the device feedback information of the faulty device, so as to obtain a fault repairing result.
In some embodiments of the present application, the device cluster repair apparatus receiving device feedback information of each cluster device, encoding the device feedback information, and generating a device feedback feature of the target device cluster includes:
receiving equipment feedback information of each cluster equipment, and generating a cluster state image of the target equipment cluster according to the equipment feedback information and the equipment type of each cluster equipment;
and inputting the cluster state image into a preset cluster detection model for coding to obtain the equipment feedback characteristics of the cluster state image.
In some embodiments of the present application, the apparatus for repairing a device cluster performs fault analysis on the device feedback feature to obtain a faulty device of the target device cluster, including:
inputting the equipment feedback characteristics into a fully connected layer in the cluster detection model for classification processing to obtain equipment state characteristics in the equipment feedback characteristics;
matching the equipment state characteristics with preset fault characteristics to obtain fault similarity between the equipment state characteristics and the preset fault characteristics;
and marking cluster equipment corresponding to the equipment state characteristics with the fault similarity larger than the preset similarity threshold as fault equipment.
In some embodiments of the present application, the repairing, by the device cluster repairing apparatus, the faulty device according to the device type of the faulty device and the device feedback information of the faulty device to obtain a fault repairing result, including:
accessing a preset equipment database to obtain the equipment type of the fault equipment;
reading the equipment dimension parameters in the equipment feedback information, and determining the equipment fault level of the fault equipment according to the equipment dimension parameters and a preset fault level threshold;
and repairing the fault equipment according to the fault repairing strategy corresponding to the equipment type and the equipment fault level to obtain a fault repairing result.
In some embodiments of the present application, the repairing, by the device cluster repairing apparatus, the faulty device according to the device type of the faulty device and the device feedback information of the faulty device to obtain a fault repairing result, including:
acquiring a repair check code and a feedback time threshold of the fault repair strategy;
receiving a repair check code returned by the fault equipment and repair feedback time associated with the repair check code;
and if the repair check code returned by the fault equipment matches the repair check code of the fault repair strategy and the repair feedback time is less than the feedback time threshold, generating a fault repair result indicating that the fault has been repaired.
In some embodiments of the present application, the device cluster repair apparatus acquiring device information of a target device cluster corresponding to the cluster query instruction and generating a device detection instruction corresponding to the device information includes:
inquiring an equipment database, and acquiring a target equipment cluster corresponding to the cluster inquiry instruction in the equipment database;
acquiring equipment information of each cluster equipment in the target equipment cluster and historical detection data of the cluster equipment;
and reading the dimension detection parameters corresponding to the target fields in the historical detection data, and generating a device detection instruction of the cluster device according to the device information and the dimension detection parameters.
In some embodiments of the present application, querying, by an equipment cluster repair apparatus, an equipment database to obtain a target equipment cluster corresponding to the cluster query instruction in the equipment database includes:
reading equipment dimension parameters in the cluster query instruction, wherein the equipment dimension parameters comprise at least one of an equipment identifier, an application identifier, a machine room dimension identifier and an equipment sub-environment identifier;
inputting the equipment dimension parameters into a preset equipment database for query to obtain target cluster equipment corresponding to the equipment dimension parameters;
and summarizing the target cluster equipment, and generating a target equipment cluster corresponding to the cluster inquiry instruction.
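The query by device dimension parameters described in the three steps above can be sketched as follows. The in-memory list standing in for the preset device database, the field names (`device_id`, `app_id`, `room_id`, `env_id`), and the sample records are all hypothetical and chosen only to mirror the four identifiers named in the text.

```python
# Hypothetical stand-in for the preset device database.
DEVICE_DB = [
    {"device_id": "d1", "app_id": "pay", "room_id": "room-01", "env_id": "prod"},
    {"device_id": "d2", "app_id": "pay", "room_id": "room-02", "env_id": "test"},
    {"device_id": "d3", "app_id": "loan", "room_id": "room-01", "env_id": "prod"},
]


def query_target_cluster(dimension_params):
    """Return devices matching every supplied dimension parameter.

    dimension_params holds at least one of the identifiers read from the
    cluster query instruction; the matching devices are collected into a
    list that represents the target device cluster.
    """
    return [
        device
        for device in DEVICE_DB
        if all(device.get(key) == value for key, value in dimension_params.items())
    ]


cluster = query_target_cluster({"app_id": "pay", "env_id": "prod"})
print([device["device_id"] for device in cluster])
```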
In this embodiment, the device cluster repairing apparatus queries the device database by responding to a cluster query instruction, acquires device information of a target device cluster corresponding to the cluster query instruction, and generates a device detection instruction corresponding to the device information; sending the device detection instruction to each cluster device in the target device cluster, receiving device feedback information of each cluster device, coding the device feedback information, and generating device feedback characteristics of the target device cluster; performing fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster; and repairing the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result. The method and the device realize accurate monitoring of each equipment cluster, can quickly and efficiently position the fault equipment in the equipment cluster and repair the fault equipment in multiple dimensions, and improve the detection and repair efficiency of the fault equipment.
An embodiment of the present invention further provides a device cluster repair device, as shown in fig. 6, fig. 6 is a schematic structural diagram of an embodiment of the device cluster repair device provided in the embodiment of the present application.
The device cluster repair device integrates any one of the device cluster repair apparatuses provided by the embodiments of the present invention, and the device cluster repair device includes:
one or more processors;
a memory; and
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the processor to perform the steps of the device cluster repair method in any of the device cluster repair method embodiments described above.
Specifically, the device cluster repair device may include components such as a processor 601 with one or more processing cores, a memory 602 comprising one or more computer-readable storage media, a power supply 603, and an input unit 604. Those skilled in the art will appreciate that the device cluster repair device configuration shown in fig. 6 does not constitute a limitation of the device cluster repair device, which may include more or fewer components than those shown, or combine some components, or use a different arrangement of components. Wherein:
the processor 601 is a control center of the device cluster repair device, connects various parts of the whole device cluster repair device by using various interfaces and lines, and executes various functions of the device cluster repair device and processes data by running or executing software programs and/or modules stored in the memory 602 and calling data stored in the memory 602, thereby performing overall monitoring on the device cluster repair device. Optionally, the processor 601 may include one or more processing cores; preferably, the processor 601 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 601.
The memory 602 may be used to store software programs and modules, and the processor 601 executes various functional applications and data processing by operating the software programs and modules stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created from use of the device cluster repair device, and the like. Further, the memory 602 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 602 may also include a memory controller to provide the processor 601 access to the memory 602.
The device cluster repair device further includes a power supply 603 for supplying power to each component. Preferably, the power supply 603 may be logically connected to the processor 601 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The power supply 603 may also include any component such as one or more DC or AC power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The device cluster repairing device may further comprise an input unit 604, the input unit 604 being operable to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the device cluster repairing apparatus may further include a display unit and the like, which are not described herein again. Specifically, in this embodiment, the processor 601 in the device cluster repair apparatus loads the executable file corresponding to the process of one or more application programs into the memory 602 according to the following instructions, and the processor 601 runs the application program stored in the memory 602, thereby implementing various functions as follows:
responding to a cluster query instruction, querying an equipment database, acquiring equipment information of a target equipment cluster corresponding to the cluster query instruction, and generating an equipment detection instruction corresponding to the equipment information;
sending the device detection instruction to each cluster device in the target device cluster, receiving device feedback information of each cluster device, coding the device feedback information, and generating device feedback characteristics of the target device cluster;
performing fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster;
and repairing the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present invention provides a computer-readable storage medium, which may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like. The storage medium stores a computer program, and the computer program is loaded by a processor to execute the steps of any one of the device cluster repairing methods provided by the embodiments of the invention. For example, the computer program may be loaded by a processor to perform the steps of:
responding to a cluster query instruction, querying an equipment database, acquiring equipment information of a target equipment cluster corresponding to the cluster query instruction, and generating an equipment detection instruction corresponding to the equipment information;
sending the device detection instruction to each cluster device in the target device cluster, receiving device feedback information of each cluster device, coding the device feedback information, and generating device feedback characteristics of the target device cluster;
performing fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster;
and repairing the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and parts that are not described in detail in a certain embodiment may refer to the above detailed descriptions of other embodiments, and are not described herein again.
In specific implementation, each unit or structure may be implemented as an independent entity, or may be combined arbitrarily to be implemented as the same entity or several entities, and specific implementation of each unit or structure may refer to the foregoing method embodiment, which is not described herein again.
The above detailed description is provided for the device cluster repairing method provided in the embodiment of the present application, and the principle and the implementation of the present invention are explained in this document by applying specific embodiments, and the description of the above embodiments is only used to help understanding the method and the core idea of the present invention; meanwhile, for those skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. An apparatus cluster repairing method, characterized in that the apparatus cluster repairing method comprises:
responding to a cluster query instruction, querying an equipment database, acquiring equipment information of a target equipment cluster corresponding to the cluster query instruction, and generating an equipment detection instruction corresponding to the equipment information;
sending the device detection instruction to each cluster device in the target device cluster, receiving device feedback information of each cluster device, coding the device feedback information, and generating device feedback characteristics of the target device cluster;
performing fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster;
and repairing the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result.
2. The device cluster repairing method according to claim 1, wherein the receiving device feedback information of each cluster device, encoding the device feedback information, and generating the device feedback characteristic of the target device cluster comprises:
receiving device feedback information of each cluster device, and generating a cluster state image of the target device cluster according to the device feedback information and the device type of each cluster device;
and inputting the cluster state image into a preset cluster detection model for coding to obtain the equipment feedback characteristics of the cluster state image.
3. The method for repairing a device cluster according to claim 1, wherein the performing fault analysis on the device feedback characteristics to obtain a faulty device of the target device cluster includes:
inputting the equipment feedback characteristics into a fully connected layer in the cluster detection model for classification processing to obtain equipment state characteristics in the equipment feedback characteristics;
matching the equipment state characteristics with preset fault characteristics to obtain fault similarity between the equipment state characteristics and the preset fault characteristics;
and marking cluster equipment corresponding to the equipment state characteristics with the fault similarity larger than the preset similarity threshold as fault equipment.
4. The method for device cluster repairing according to claim 1, wherein the repairing the failed device according to the device type of the failed device and the device feedback information of the failed device to obtain a failure repair result comprises:
accessing a preset equipment database to obtain the equipment type of the fault equipment;
reading the equipment dimension parameter in the equipment feedback information, and determining the equipment fault level of the fault equipment according to the equipment dimension parameter and a preset fault level threshold;
and repairing the fault equipment according to the equipment type and a fault repairing strategy corresponding to the equipment fault grade to obtain a fault repairing result.
5. The apparatus cluster repairing method according to claim 4, wherein the repairing the faulty apparatus according to the apparatus type of the faulty apparatus and the apparatus feedback information of the faulty apparatus to obtain a fault repairing result includes:
acquiring a repair check code and a feedback time threshold of the fault repair strategy;
receiving a repair check code returned by the fault equipment and repair feedback time associated with the repair check code;
and if the repair check code returned by the fault equipment matches the repair check code of the fault repair strategy and the repair feedback time is less than the feedback time threshold, generating a fault repair result indicating that the fault has been repaired.
6. The device cluster repairing method according to any one of claims 1 to 5, wherein the obtaining device information of the target device cluster corresponding to the cluster querying instruction and generating the device detection instruction corresponding to the device information includes:
inquiring an equipment database, and acquiring a target equipment cluster corresponding to the cluster inquiry instruction in the equipment database;
acquiring equipment information of each cluster equipment in the target equipment cluster and historical detection data of the cluster equipment;
and reading the dimension detection parameters corresponding to the target fields in the historical detection data, and generating a device detection instruction of the cluster device according to the device information and the dimension detection parameters.
7. The device cluster repairing method according to claim 6, wherein the querying the device database to obtain the target device cluster corresponding to the cluster querying instruction in the device database includes:
reading equipment dimension parameters in the cluster query instruction, wherein the equipment dimension parameters comprise at least one of an equipment identifier, an application identifier, a machine room dimension identifier and an equipment sub-environment identifier;
inputting the equipment dimension parameters into a preset equipment database for query to obtain target cluster equipment corresponding to the equipment dimension parameters;
and summarizing the target cluster equipment, and generating a target equipment cluster corresponding to the cluster inquiry instruction.
8. An apparatus cluster repair device, comprising:
the detection query module is configured to respond to a cluster query instruction, query an equipment database, acquire equipment information of a target equipment cluster corresponding to the cluster query instruction, and generate an equipment detection instruction corresponding to the equipment information;
a feature encoding module configured to send the device detection instruction to each cluster device in the target device cluster, receive device feedback information of each cluster device, encode the device feedback information, and generate a device feedback feature of the target device cluster;
the fault analysis module is configured to perform fault analysis on the equipment feedback characteristics to obtain fault equipment of the target equipment cluster;
and the fault repairing module is configured to repair the fault equipment according to the equipment type of the fault equipment and the equipment feedback information of the fault equipment to obtain a fault repairing result.
9. A device cluster repairing apparatus, characterized in that the device cluster repairing apparatus comprises:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the steps of the device cluster repair method of any of claims 1 to 7.
10. A computer-readable storage medium, having stored thereon a computer program which is loadable by a processor to perform the steps of the device cluster repair method of any of claims 1 to 7.
CN202211314277.2A 2022-10-25 2022-10-25 Equipment cluster repairing method, device, equipment and storage medium Pending CN115643158A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211314277.2A CN115643158A (en) 2022-10-25 2022-10-25 Equipment cluster repairing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115643158A true CN115643158A (en) 2023-01-24

Family

ID=84946248

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211314277.2A Pending CN115643158A (en) 2022-10-25 2022-10-25 Equipment cluster repairing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115643158A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117349128A (en) * 2023-12-05 2024-01-05 杭州沃趣科技股份有限公司 Fault monitoring method, device and equipment of server cluster and storage medium
CN117349128B (en) * 2023-12-05 2024-03-22 杭州沃趣科技股份有限公司 Fault monitoring method, device and equipment of server cluster and storage medium

Similar Documents

Publication Publication Date Title
CN110704231A (en) Fault processing method and device
CN107547595B (en) Cloud resource scheduling system, method and device
CN109254922B (en) Automatic testing method and device for BMC Redfish function of server
CN110178121A (en) A kind of detection method and its terminal of database
CN110088744A (en) A kind of database maintenance method and its system
CN115643158A (en) Equipment cluster repairing method, device, equipment and storage medium
CN115052041B (en) Channel identifier allocation method, device, equipment and storage medium
CN113220540A (en) Service management method, device, computer equipment and storage medium
CN110291505A (en) Reduce the recovery time of application
CN107679423A (en) Partition integrity inspection method and device
CN115190044B (en) Device connection state checking method, device and storage medium
WO2024008130A1 (en) Faulty hardware processing method, apparatus and system
CN115643163A (en) Fault equipment positioning method, device, equipment and storage medium
CN114243914B (en) Power monitoring system
CN115981713A (en) Business system management method, device, equipment and storage medium
CN115629919A (en) Method and device for fast switching fault system
CN114650211A (en) Fault repairing method, device, electronic equipment and computer readable storage medium
CN115733771B (en) Storage module detection method, device, equipment and storage medium
CN115914016A (en) Cluster fault diagnosis method, device, equipment and storage medium
CN114741324B (en) Block chain stability testing method and device, electronic equipment and storage medium
CN115242685B (en) Playback testing method, device, equipment and storage medium based on incidence matrix
CN113342795B (en) Data checking method and device in application program, electronic equipment and storage medium
WO2023125382A1 (en) Battery management method and apparatus, electronic device, and storage medium
CN118278876A (en) Service request checking method, device, equipment and storage medium
CN115658467A (en) Service data testing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination