CN116112344B - Method, equipment and medium for detecting machine room fault network equipment
- Publication number: CN116112344B
- Application number: CN202310375376.XA
- Authority: CN (China)
- Prior art keywords: fault, communication, link, data, determining
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0631—Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
- H04L41/065—Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving logical or physical relationship, e.g. grouping and hierarchies
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Environmental & Geological Engineering (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Small-Scale Networks (AREA)
Abstract
The embodiments of the present application disclose a method, device and medium for detecting faulty network equipment in a machine room, belonging to the technical field of electric digital data processing. The method addresses the problem that, after network equipment in a machine room fails, the volume of alarm information received is so large that the faulty equipment is difficult to identify quickly by manual means. The method includes: determining a plurality of links based on a computer room network map; determining the current network data corresponding to each link in the current time period, together with a plurality of center points, so as to fill the current network data; determining a corresponding communication fault detection model based on the filled current network data of each link, so as to obtain a fault link; performing communication connectivity detection on the fault link to determine a reference fault device; performing a communication data test on the reference fault device based on a preset communication test task to obtain a reference communication data value; and comparing the reference communication data value with a preset communication data value to determine fault device information.
Description
Technical Field
The present application relates to the technical field of electric digital data processing, and in particular to a method, device and medium for detecting faulty network equipment in a machine room.
Background
As enterprises become increasingly informatized and automated, they rely ever more heavily on information communication. Inspecting the equipment rooms of the information communication network is therefore a core task for network operation and maintenance personnel and an important safeguard for the stable operation of the equipment.
While a network system is running, the failure of one network device may affect other network devices in the system. Consequently, after network equipment in a machine room fails, the network operation and maintenance platform may receive a large amount of alarm information reported by the network devices. Because the volume of alarm information is so large, it is difficult to identify the faulty equipment quickly when the alarms are diagnosed by manual means alone.
Disclosure of Invention
The embodiments of the present application provide a method, device and medium for detecting faulty network equipment in a machine room, which are used to solve the following technical problem: after network equipment in a machine room fails, the volume of alarm information received is very large, and the faulty equipment is difficult to identify quickly when the alarm information is diagnosed by manual means alone.
The embodiment of the application adopts the following technical scheme:
The embodiments of the present application provide a method for detecting faulty network equipment in a machine room. The method includes: determining a plurality of links based on a computer room network map, where the computer room network map comprises a plurality of nodes corresponding to the machine room and the connection relations among the nodes; acquiring the current network data corresponding to each link in the current time period and, when the current network data does not meet a preset condition, determining a plurality of center points in the current network data so as to fill the current network data based on a historical network database and the plurality of center points; determining the communication fault detection model corresponding to each of the links based on the historical network database and the filled current network data of each link, so as to obtain a fault link through the communication fault detection models; determining an intermediate node corresponding to the fault link, grouping the fault link based on the intermediate node, and performing communication connectivity detection on the grouped fault link to determine a reference fault device; performing a communication data test on the reference fault device based on a preset communication test task to obtain a reference communication data value corresponding to the reference fault device; and comparing the reference communication data value with a preset communication data value to determine the fault device information in the current machine room based on the comparison result.
By determining the plurality of links in the machine room and filling them with data, the embodiments of the present application can expand the data of the current link from the historical network database when the current link has little data, so that the model is trained on the expanded data and the resulting communication fault detection model is more accurate. Second, by determining the intermediate node of the fault link and grouping the fault link, the corresponding reference fault device can be determined quickly, which improves the efficiency of fault device detection. In addition, to guard against errors in the fault device detection process, a preset communication test task is provided, through which each reference fault device is further verified, ensuring the accuracy of the detected fault devices.
In one implementation of the present application, acquiring the current network data corresponding to each link in the current time period and, when the current network data does not meet a preset condition, determining a plurality of center points in the current network data so as to fill the current network data based on the historical network database and the plurality of center points specifically includes: acquiring first actual average network data corresponding to each link in the previous time period, acquiring second actual average network data corresponding to each link in the current time period, and determining the difference between the first actual average network data and the second actual average network data, where the first actual average network data is related to the network data corresponding to the time slices in the previous time period and the second actual average network data is related to the network data corresponding to the time slices in the current time period; when the difference is greater than a preset difference threshold, selecting a preset number of network data from the network data corresponding to the current time period as the plurality of center points; determining, in the historical network database, the sample elements whose distance from the current center point is within a preset distance as a sample data set; determining the distance between the current center point and each sample element in the sample data set, and taking the sample elements within the preset distance range as the filling data set corresponding to the current center point; and determining the filling data set corresponding to each of the center points, and filling the network data corresponding to the current time period based on those filling data sets.
In one implementation of the present application, determining the communication fault detection model corresponding to each of the links based on the historical network database and the filled current network data of each link, so as to obtain a fault link through the communication fault detection models, specifically includes: determining the historical network data corresponding to each filled link; dividing the historical network data into a plurality of equal time periods, and determining the average network data corresponding to each equal time period; determining N average network data corresponding to the current link, taking the first N-1 average network data as input and the Nth average network data as output, and training a preset neural network model to obtain the communication fault detection model corresponding to the current link; after the actual average network data of the current link for the current time period is obtained, adding it to the output training set of the communication fault detection model corresponding to the current link, and deleting the earliest average network data from the input training set; and updating the training data set of each link based on the current network data of that link, so as to determine the communication fault detection model of each link from the updated training data set, and detecting the network data of the plurality of links through their respective fault detection models to determine the fault link.
In one implementation manner of the present application, determining an intermediate node corresponding to a failed link, grouping the failed link based on the intermediate node, and performing communication connectivity detection on the grouped failed link to determine a reference failed device, including: determining an intermediate node of a fault link, and dividing the fault link into two sub-links by taking the intermediate node as a center; detecting communication connectivity of the two sub-links respectively, and determining a first fault sub-link; determining an intermediate node of the first fault sub-link, and dividing the first fault sub-link into two sub-links by taking the intermediate node as a center; detecting communication connectivity of two sub-links corresponding to the first fault sub-link, and determining a second fault sub-link; and dividing the fault sub-link for a plurality of times until the reference fault equipment is determined.
In one implementation of the present application, performing communication connectivity detection on the two sub-links respectively and determining the first fault sub-link specifically includes: sending a detection signal from the start node of a sub-link to a downstream node; and determining the sub-link corresponding to the start node as the first fault sub-link when the bit error rate of the detection signal received by the downstream node is greater than a preset bit error rate, and/or the energy intensity of the detection signal received by the downstream node is less than a preset energy intensity, and/or the interval at which the downstream node receives the detection signal is longer than a preset interval threshold.
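The repeated halving of a fault link and the three-way connectivity check can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `ProbeResult`, `Thresholds`, `is_faulty` and `locate_reference_device` are hypothetical, the `probe` callback stands in for whatever mechanism actually sends the detection signal, and the sketch assumes a single fault on the link, so it probes only the first half at each step and infers the state of the second half.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """Measurements taken at the downstream node (hypothetical fields)."""
    bit_error_rate: float
    signal_energy: float
    interval_seconds: float


@dataclass
class Thresholds:
    """Preset limits; the actual values are not given in the source."""
    max_bit_error_rate: float
    min_signal_energy: float
    max_interval_seconds: float


def is_faulty(result, limits):
    """A sub-link is treated as faulty if any one of the three conditions holds."""
    return (result.bit_error_rate > limits.max_bit_error_rate
            or result.signal_energy < limits.min_signal_energy
            or result.interval_seconds > limits.max_interval_seconds)


def locate_reference_device(link, probe, limits):
    """Bisect a fault link (ordered list of nodes) down to one suspect hop.

    probe(start, end) sends a detection signal from start to end and returns
    a ProbeResult. Assumes exactly one faulty hop on the link.
    """
    lo, hi = 0, len(link) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2          # intermediate node of the current sub-link
        if is_faulty(probe(link[lo], link[mid]), limits):
            hi = mid                  # fault lies in the first sub-link
        else:
            lo = mid                  # otherwise assume it lies in the second
    return link[lo], link[hi]
```

The function returns the two adjacent nodes that bound the suspect hop; which of the two is treated as the reference fault device is left to the later communication data test.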
In one implementation manner of the present application, a communication data test is performed on a reference fault device to determine fault device information based on the communication data test, which specifically includes: sequentially testing communication data of a plurality of reference fault devices through a first test task in a preset communication test task list to obtain first reference communication data values corresponding to each reference fault device respectively; sequentially testing communication data of a plurality of reference fault devices through a second test task in a preset communication test task list to obtain second reference communication data values corresponding to each reference fault device respectively; repeating the first test task and the second test task, determining a first total reference communication data value corresponding to the multiple reference fault devices respectively based on the obtained multiple first reference communication data values and multiple second reference communication data values, and determining a second total reference communication data value corresponding to the multiple reference fault devices respectively.
In one implementation of the present application, determining the final fault device based on the first reference communication data values, the second reference communication data values, the first total reference communication data value and the second total reference communication data value specifically includes: performing pairwise median processing on the plurality of first reference communication data values corresponding to each reference fault device to obtain the first communication value corresponding to that device; performing pairwise median processing on the plurality of second reference communication data values corresponding to each reference fault device to obtain the second communication value corresponding to that device; and comparing the first communication value and the second communication value each with a preset first communication value, and comparing the first total reference communication data value and the second total reference communication data value each with a preset second communication value, so as to determine the fault device information from the comparison results.
In one implementation of the present application, comparing the first communication value and the second communication value each with the preset first communication value, and comparing the first total reference communication data value and the second total reference communication data value each with the preset second communication value, so as to determine the fault device information from the comparison results, specifically includes: determining that the reference fault device is a fault device of the serious grade when the first communication value and the second communication value are both smaller than the preset first communication value and the first total reference communication data value and the second total reference communication data value are both smaller than the preset second communication value; determining that the reference fault device is a normal device when the first communication value and the second communication value are both not smaller than the preset first communication value and the first total reference communication data value and the second total reference communication data value are both not smaller than the preset second communication value; and otherwise determining that the reference fault device is a fault device of the common grade.
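The three-way grading rule above can be written as a small decision function. This is only a sketch: the function name `classify_device`, its parameter names and the returned labels are illustrative assumptions, and the computation of the communication values themselves (including the pairwise median processing) is not reproduced here.

```python
def classify_device(first_value, second_value,
                    first_total, second_total,
                    preset_first, preset_second):
    """Grade a reference fault device from its four measured values.

    preset_first is compared with the two per-task communication values;
    preset_second is compared with the two total values.
    """
    values_low = first_value < preset_first and second_value < preset_first
    totals_low = first_total < preset_second and second_total < preset_second
    values_ok = first_value >= preset_first and second_value >= preset_first
    totals_ok = first_total >= preset_second and second_total >= preset_second

    if values_low and totals_low:
        return "serious"   # every value is below its preset
    if values_ok and totals_ok:
        return "normal"    # no value is below its preset
    return "common"        # mixed result


grade = classify_device(1.0, 3.0, 12.0, 12.0, preset_first=2.0, preset_second=10.0)
```

In this usage example only one of the four values falls below its preset, so the device falls into the mixed case and is graded "common".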
The embodiment of the application provides equipment for detecting equipment of a machine room fault network, which comprises the following components: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to: determining a plurality of links based on a computer room network map; the computer room network map comprises a plurality of nodes corresponding to the computer room and connection relations among the nodes; acquiring current network data corresponding to each link in a current time period, and determining a plurality of center points in the current network data under the condition that the current network data does not meet preset conditions so as to fill the current network data based on a historical network database and the plurality of center points; determining communication fault detection models corresponding to the links respectively based on the historical network database and the current network data corresponding to the filled links respectively so as to obtain a fault link through the communication fault detection models; determining an intermediate node corresponding to the fault link, grouping the fault link based on the intermediate node, and detecting communication connectivity of the grouped fault link to determine reference fault equipment; based on a preset communication test task, carrying out communication data test on the reference fault equipment to obtain a reference communication data value corresponding to the reference fault equipment; and comparing the reference communication data value with a preset communication data value to determine fault equipment information in the current machine room based on the comparison result.
The embodiment of the application provides a nonvolatile computer storage medium, which stores computer executable instructions, wherein the computer executable instructions are configured to: determining a plurality of links based on a computer room network map; the computer room network map comprises a plurality of nodes corresponding to the computer room and connection relations among the nodes; acquiring current network data corresponding to each link in a current time period, and determining a plurality of center points in the current network data under the condition that the current network data does not meet preset conditions so as to fill the current network data based on a historical network database and the plurality of center points; determining communication fault detection models corresponding to the links respectively based on the historical network database and the current network data corresponding to the filled links respectively so as to obtain a fault link through the communication fault detection models; determining an intermediate node corresponding to the fault link, grouping the fault link based on the intermediate node, and detecting communication connectivity of the grouped fault link to determine reference fault equipment; based on a preset communication test task, carrying out communication data test on the reference fault equipment to obtain a reference communication data value corresponding to the reference fault equipment; and comparing the reference communication data value with a preset communication data value to determine fault equipment information in the current machine room based on the comparison result.
At least one of the technical solutions adopted in the embodiments of the present application can achieve the following beneficial effects: by determining the plurality of links in the machine room and filling them with data, the data of the current link can be expanded from the historical network database when the current link has little data, so that the model is trained on the expanded data and the resulting communication fault detection model is more accurate. Second, by determining the intermediate node of the fault link and grouping the fault link, the corresponding reference fault device can be determined quickly, which improves the efficiency of fault device detection. In addition, to guard against errors in the fault device detection process, a preset communication test task is provided, through which each reference fault device is further verified, ensuring the accuracy of the detected fault devices.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments of the present application, and a person skilled in the art can obtain other drawings from them without inventive effort. In the drawings:
Fig. 1 is a flowchart of a method for detecting a machine room fault network device according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a machine room fault network device detection device provided in an embodiment of the present application.
Reference numerals:
200 machine room fault network equipment detection equipment, 201 processor, 202 memory.
Detailed Description
The embodiments of the present application provide a method, device and medium for detecting faulty network equipment in a machine room.
In order to better understand the technical solutions in the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.
The following describes in detail the technical solution proposed in the embodiments of the present application through the accompanying drawings.
Fig. 1 is a flowchart of a method for detecting a machine room fault network device according to an embodiment of the present application. As shown in fig. 1, the method for detecting the machine room fault network device includes the following steps:
Step 101, determining a plurality of links based on a computer room network map; the computer room network map comprises a plurality of nodes corresponding to the computer room and connection relations among the nodes.
In one embodiment of the present application, a plurality of network devices of a machine room are determined, each network device acting as a node. According to the connection relation and the network data transmission relation among different network devices, a network map corresponding to the network devices in the current machine room is determined, and a plurality of links corresponding to the network devices in the current machine room are determined based on the network map. The network map comprises a plurality of network equipment nodes in a machine room and connection relations among the nodes.
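As a rough illustration of step 101, the network map can be held as an adjacency structure and links read off from it. The source does not define precisely what constitutes a link, so this sketch assumes a link is a simple path between two chosen endpoint devices; `build_map`, `enumerate_links` and the device names are hypothetical.

```python
from collections import defaultdict


def build_map(connections):
    """Build an undirected adjacency map from (device_a, device_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in connections:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def enumerate_links(adjacency, source, target):
    """Return every simple path from source to target as a list of nodes."""
    links = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            links.append(path)
            continue
        for neighbor in sorted(adjacency[node]):
            if neighbor not in path:      # never revisit a node on this path
                stack.append((neighbor, path + [neighbor]))
    return sorted(links)


# Hypothetical four-device machine room with two redundant paths.
connections = [("core", "agg1"), ("core", "agg2"), ("agg1", "tor"), ("agg2", "tor")]
links = enumerate_links(build_map(connections), "core", "tor")
```

Exhaustive path enumeration grows quickly with the size of the map, so a real deployment would likely restrict links to configured routes; that choice is outside what the source specifies.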
Step 102, obtaining current network data corresponding to each link in the current time period, and determining a plurality of center points in the current network data under the condition that the current network data does not meet preset conditions so as to fill the current network data based on the historical network database and the plurality of center points.
In one embodiment of the present application, first actual average network data corresponding to each link in the previous time period is acquired, second actual average network data corresponding to each link in the current time period is acquired, and the difference between the first actual average network data and the second actual average network data is determined, where the first actual average network data is related to the network data corresponding to the time slices in the previous time period and the second actual average network data is related to the network data corresponding to the time slices in the current time period. When the difference is greater than a preset difference threshold, a preset number of network data are selected from the network data corresponding to the current time period as the plurality of center points. In the historical network database, the sample elements within a preset distance of the current center point are determined as a sample data set. The distance between the current center point and each sample element in the sample data set is determined, and the sample elements within the preset distance range are taken as the filling data set corresponding to the current center point. The filling data set corresponding to each of the center points is determined, and the network data corresponding to the current time period is filled based on those filling data sets.
Specifically, after the plurality of links in the current machine room are determined, the link on which the fault device is located must be identified. To ensure that fault links are detected accurately, the embodiments of the present application first fill the link data, so that there is enough link data to test and the detection is more accurate.
Specifically, the duration over which each link receives network data is divided into a plurality of equal time periods, and each time period comprises a plurality of small time slices. For example, the duration may be divided into half-hour time periods, each of which is further divided into one-minute time slices.
Further, when fault detection is performed on the current link, the first actual average network data corresponding to the previous time period of the current link is determined, and the second actual average network data corresponding to the current link in the current time period is acquired. The first actual average network data is the ratio of the total network data acquired by the current link in the previous time period to the number of small time slices in that time period. The second actual average network data is the ratio of the total network data acquired by the current link in the current time period to the number of small time slices in that time period.
Further, the difference between the first actual average network data and the second actual average network data is calculated. If the difference is greater than a preset difference threshold, the data volume of the current link in the current time period is small, and the data of the current link needs to be expanded to ensure that the communication fault detection model is trained accurately. Specifically, a plurality of network data are selected from the current time period and used as the center points of the current link in the current time period. The historical network database is then compared with each center point in turn: for the currently selected center point, the sample elements in the historical network database whose distance from that center point is within a preset distance are determined as the sample data set corresponding to that center point. The sample elements in the sample data set are sorted by their distance from the current center point, from nearest to farthest; after sorting, the sample elements whose distance is greater than the preset distance are deleted, and the remaining sample elements within the preset distance range are taken as the filling data set corresponding to the current center point.
Further, a sample data set is determined for each center point of the current link in the current time period, so that the filling data corresponding to each center point is determined from the resulting sample data sets. The network data of the current link in the current time period is then filled with the filling data corresponding to each center point.
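The filling procedure of step 102 can be sketched as below. Several details are assumptions rather than statements of the source: network data is modelled as one scalar per time slice, the distance between two values is their absolute difference, the difference between the two period averages is taken as an absolute value, and the center points are passed in by the caller because the source does not say how they are selected. All function names are hypothetical.

```python
def average_per_slice(slices):
    """Total network data in a period divided by its number of time slices."""
    return sum(slices) / len(slices)


def needs_filling(previous_slices, current_slices, diff_threshold):
    """True when the two period averages differ by more than the threshold."""
    diff = abs(average_per_slice(previous_slices) - average_per_slice(current_slices))
    return diff > diff_threshold


def filling_set(center, history, preset_distance):
    """Historical samples within preset_distance of center, nearest first."""
    in_range = [x for x in history if abs(x - center) <= preset_distance]
    return sorted(in_range, key=lambda x: abs(x - center))


def fill_current(current_slices, history, centers, preset_distance):
    """Append the filling set of every center point to the current data."""
    filled = list(current_slices)
    for center in centers:
        filled.extend(filling_set(center, history, preset_distance))
    return filled


# Hypothetical values: the current period has far less data than the previous one.
previous, current = [100, 120, 110, 90], [40, 50]
history = [30, 42, 49, 55, 90]
if needs_filling(previous, current, diff_threshold=30):
    current = fill_current(current, history, centers=[40, 50], preset_distance=10)
```

Note that a historical sample lying close to two center points is added twice in this sketch; whether such duplicates should be removed is not addressed in the source.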
And step 103, determining communication fault detection models corresponding to the links respectively based on the historical network database and the current network data corresponding to the filled links respectively, so as to obtain the fault link through the communication fault detection models.
In one embodiment of the present application, the historical network data corresponding to each link after filling is determined. The historical network data is divided into a plurality of equal time periods, and the average network data corresponding to each equal time period is determined. N average network data corresponding to the current link are determined; the first N-1 average network data are used as input and the Nth average network data is used as output to train a preset neural network model, yielding the communication fault detection model corresponding to the current link. After the actual average network data of the current link for the current time period is obtained, it is added to the output training set of the communication fault detection model corresponding to the current link, and the earliest average network data in the input training set is deleted. The training data set of each link is updated based on the current network data of that link, the communication fault detection model of each link is determined from the updated training data set, and the network data of the plurality of links is detected by the respective fault detection models to determine the fault link.
Specifically, after the link data is filled, the historical network data corresponding to each link is determined. The historical network data is likewise divided into a plurality of equal time periods, for example half an hour per time period, with each time period comprising a plurality of small time slices, for example one slice per minute. The plurality of equal time periods corresponding to each link is determined, and the ratio of the network data received in each equal time period to the number of time slices in that period is taken to obtain the average network data corresponding to each equal time period.
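As an illustration of this averaging step, the following Python sketch splits a sequence of per-slice data volumes into equal periods and averages each period. It assumes that the network data of a time slice is a single number and that an incomplete trailing period is discarded; the function name `period_averages` is hypothetical.

```python
def period_averages(slice_values, slices_per_period):
    """Average per-slice data volumes over consecutive equal time periods."""
    averages = []
    last_start = len(slice_values) - slices_per_period
    for start in range(0, last_start + 1, slices_per_period):
        period = slice_values[start:start + slices_per_period]
        # Total data received in the period divided by the number of slices.
        averages.append(sum(period) / slices_per_period)
    return averages


# Six one-minute slices grouped into two three-minute periods.
print(period_averages([2, 4, 6, 8, 10, 12], 3))
```

With half-hour periods and one-minute slices, as in the example given in the text, `slices_per_period` would be 30.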
Further, the plurality of equal time periods corresponding to the current link is determined, together with the average network data of each equal time period, so as to obtain N average network data corresponding to the current link. In chronological order, the first N-1 average network data of the current link are used as input and the Nth average network data is used as output to train the preset neural network model, yielding the communication fault detection model corresponding to the current link. Whether the current link is a fault link is determined based on the abnormal data, such as erroneous data, missing data, and repeated data, in the Nth average network data output by the preset neural network model; that is, if the proportion of abnormal data is greater than a preset proportion, the current link is a fault link.
Further, to ensure that the communication fault detection model is trained accurately, the training data needs to be updated in real time. Therefore, after the latest average network data for the current time period, namely the (N+1)th average network data, is received, the (N+1)th average network data is added to the output training set, the Nth average network data is added to the input training set, and the earliest average network data in the input training set is deleted. In this way, the latest average network data is added to the training set in time and the communication fault detection model is trained accurately, so that the fault link detection result is more accurate.
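The sliding update of the training set can be sketched in Python as follows. This is only an illustration under the assumption that the input training set holds a fixed number of N-1 averages and the output training set holds a single target; the class name `SlidingTrainingSet` is hypothetical, and the neural network training itself is omitted.

```python
from collections import deque


class SlidingTrainingSet:
    """Keeps the N-1 most recent averages as input and the newest as target."""

    def __init__(self, averages):
        if len(averages) < 2:
            raise ValueError("at least two averages are required")
        # First N-1 averages form the input window, the Nth is the target.
        self.inputs = deque(averages[:-1], maxlen=len(averages) - 1)
        self.target = averages[-1]

    def update(self, new_average):
        # The old target joins the inputs; a deque with maxlen drops the
        # earliest average automatically when the window is full.
        self.inputs.append(self.target)
        self.target = new_average


training_set = SlidingTrainingSet([1.0, 2.0, 3.0, 4.0])
training_set.update(5.0)
print(list(training_set.inputs), training_set.target)
```

Using a bounded `deque` keeps the window size constant, which matches the description of deleting the earliest average whenever a new one is added.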
Further, the communication fault detection model corresponding to each link is determined in the above manner, with each model trained on the plurality of average network data of its link. The network data of each link is then detected by the communication fault detection model corresponding to that link to determine the fault link.
Step 104, determining an intermediate node corresponding to the fault link, grouping the fault link based on the intermediate node, and detecting communication connectivity of the grouped fault link to determine the reference fault equipment.
In one embodiment of the present application, an intermediate node of the fault link is determined, and the fault link is divided into two sub-links with the intermediate node as the center. Communication connectivity of the two sub-links is detected respectively to determine a first fault sub-link. An intermediate node of the first fault sub-link is determined, and the first fault sub-link is divided into two sub-links with that intermediate node as the center. Communication connectivity of the two sub-links of the first fault sub-link is detected to determine a second fault sub-link. The fault sub-link is divided repeatedly in this way until the reference fault device is determined.
In one embodiment of the present application, a detection signal is sent to a downstream node by the start node of the sub-link. When the error rate of the detection signal received by the downstream node is greater than a preset error rate, and/or the energy intensity of the detection signal received by the downstream node is less than a preset energy intensity, and/or the interval duration of the detection signal received by the downstream node is greater than a preset interval duration threshold, the sub-link corresponding to the start node is determined to be the first fault sub-link.
Specifically, after the fault links are determined, in order to improve the efficiency of detecting the fault device, this embodiment of the present application determines an intermediate node of each fault link and divides each fault link into two parts with the intermediate node as the center. Communication connectivity is detected separately for the two parts of each link; that is, the first node of each part is taken as the starting point and sends a detection signal to the last node of that part. If the last node cannot receive the detection signal, or the detection signal received by the last node contains errors, the link in the current part is faulty, and that faulty part is taken as a first fault sub-link. Specifically, when the error rate of the detection signal received by the last node is greater than the preset error rate, and/or the energy intensity of the detection signal received by the last node is less than the preset energy intensity, and/or the interval duration of the detection signal received by the last node is greater than the preset interval duration threshold, the current sub-link is determined to be the first fault sub-link.
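The connectivity criteria above amount to a simple predicate over the received detection signal. The Python sketch below is illustrative only: the field names, the units, and the use of `None` for a detection signal that never arrives are assumptions, and the thresholds shown are arbitrary example values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    bit_error_rate: float      # fraction of erroneous bits in the received signal
    signal_strength: float     # energy intensity of the received signal
    arrival_interval: float    # interval duration between received signals


@dataclass
class ProbeLimits:
    max_bit_error_rate: float
    min_signal_strength: float
    max_arrival_interval: float


def sub_link_is_faulty(result: Optional[ProbeResult], limits: ProbeLimits) -> bool:
    """Return True if any of the preset thresholds is violated."""
    if result is None:
        # Assumption: None means the last node received no detection signal.
        return True
    return (
        result.bit_error_rate > limits.max_bit_error_rate
        or result.signal_strength < limits.min_signal_strength
        or result.arrival_interval > limits.max_arrival_interval
    )


limits = ProbeLimits(max_bit_error_rate=0.01, min_signal_strength=-70.0,
                     max_arrival_interval=2.0)
print(sub_link_is_faulty(ProbeResult(0.001, -50.0, 1.0), limits))
```

Because the conditions are joined by "and/or" in the text, a single violated threshold is treated here as sufficient to mark the sub-link as faulty.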
Further, an intermediate node of each first fault sub-link is determined, and each first fault sub-link is divided into two parts with the intermediate node as the center. Likewise, the first node of each part is taken as the starting point and sends network data to the last node of that part. If the last node cannot receive the network data, or the network data received by the last node contains errors, the link in the current part is faulty, and that faulty part is taken as a second fault sub-link.
Further, an intermediate node of each second fault sub-link is determined so that fault detection can be performed on the second fault sub-link, and the process is repeated until the reference fault device is determined.
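The repeated halving described above is essentially a bisection over the ordered nodes of a fault link. The following Python sketch shows one possible reading, assuming a link is an ordered list of node names and that connectivity of a segment can be checked by a caller-supplied function; `locate_reference_faults` and `segment_ok` are hypothetical names, and treating both endpoints of the smallest faulty segment as reference fault devices is an assumption.

```python
def locate_reference_faults(nodes, segment_ok):
    """Bisect an ordered node list and return the smallest faulty segments."""
    if segment_ok(nodes):
        return []
    if len(nodes) <= 2:
        # A single hop cannot be split further; its nodes are the candidates.
        return [nodes]
    mid = len(nodes) // 2
    # The intermediate node belongs to both halves so that no hop is skipped.
    left, right = nodes[:mid + 1], nodes[mid:]
    return (locate_reference_faults(left, segment_ok)
            + locate_reference_faults(right, segment_ok))


# Example: the hop between C and D is broken (simulated connectivity check).
broken_hop = ("C", "D")


def segment_ok(segment):
    hops = zip(segment, segment[1:])
    return all(hop != broken_hop for hop in hops)


link = ["A", "B", "C", "D", "E"]
print(locate_reference_faults(link, segment_ok))
```

In this example only the segment containing the broken hop is returned, after two rounds of division instead of a hop-by-hop check of all four hops.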
Step 105, based on a preset communication test task, performing a communication data test on the reference fault equipment to obtain a reference communication data value corresponding to the reference fault equipment.
In one embodiment of the present application, communication data tests are performed on a plurality of reference fault devices in sequence through a first test task in a preset communication test task list, to obtain a first reference communication data value corresponding to each reference fault device. Communication data tests are then performed on the plurality of reference fault devices in sequence through a second test task in the preset communication test task list, to obtain a second reference communication data value corresponding to each reference fault device. The first test task and the second test task are repeated, and based on the resulting plurality of first reference communication data values and plurality of second reference communication data values, a first total reference communication data value and a second total reference communication data value are determined for each of the plurality of reference fault devices.
Specifically, in order to further confirm the reference fault devices, the embodiment of the present application also tests each reference fault device individually. A preset communication test task list is provided, and communication data detection is performed on each reference fault device through a first test task in the list to obtain a first reference communication data value corresponding to each reference fault device. The first test task is repeated a plurality of times to obtain a plurality of first reference communication data values for each reference fault device, and these values are added together to obtain the first total reference communication data value corresponding to each reference fault device.
Similarly, communication data detection is performed on each reference fault device through a second test task in the preset communication test task list to obtain a second reference communication data value corresponding to each reference fault device. The second test task is repeated a plurality of times to obtain a plurality of second reference communication data values for each reference fault device, and these values are added together to obtain the second total reference communication data value corresponding to each reference fault device.
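The repeated test tasks and their totals can be sketched in Python as follows. The sketch assumes that a test task is a callable that takes a device identifier and returns one numeric communication data value; the function name `collect_reference_values` and the stand-in tasks in the example are hypothetical and do not represent real measurements.

```python
def collect_reference_values(devices, tasks, repetitions):
    """Run every task on every device repeatedly; return values and totals."""
    values = {device: [[] for _ in tasks] for device in devices}
    for _ in range(repetitions):
        for index, task in enumerate(tasks):
            # Each task is applied to the devices in sequence, as in the text.
            for device in devices:
                values[device][index].append(task(device))
    totals = {
        device: [sum(task_values) for task_values in per_task]
        for device, per_task in values.items()
    }
    return values, totals


# Stand-in tasks for illustration only; real tasks would measure the device.
def first_task(device):
    return len(device)


def second_task(device):
    return 2 * len(device)


values, totals = collect_reference_values(
    ["sw1", "router2"], [first_task, second_task], repetitions=3)
print(totals)
```

Here `values[device][0]` holds the first reference communication data values and `totals[device][0]` the first total reference communication data value, with index 1 holding the corresponding results of the second test task.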
Further, the faulty device information may be determined based on the first reference communication data value, the second reference communication data value, the first total reference communication data value, and the second total reference communication data value.
Step 106, comparing the reference communication data value with a preset communication data value to determine fault equipment information in the current machine room based on the comparison result.
In one embodiment of the present application, pairwise median processing is performed on the plurality of first reference communication data values corresponding to each reference fault device to obtain a first communication value for each reference fault device, and pairwise median processing is performed on the plurality of second reference communication data values corresponding to each reference fault device to obtain a second communication value for each reference fault device. The first communication value and the second communication value are each compared with a preset first communication value, and the first total reference communication data value and the second total reference communication data value are each compared with a preset second communication value, so that the fault device information is determined from the comparison results.
In one embodiment of the present application, when the first communication value and the second communication value are both less than the preset first communication value, and the first total reference communication data value and the second total reference communication data value are both less than the preset second communication value, the reference fault device is determined to be a final fault device of the serious level. When neither the first communication value nor the second communication value is less than the preset first communication value, and neither the first total reference communication data value nor the second total reference communication data value is less than the preset second communication value, the reference fault device is determined to be a normal device. Otherwise, the reference fault device is determined to be a final fault device of the common level.
Specifically, after the plurality of first reference communication data values corresponding to each reference fault device is obtained, median processing is performed on each pair of adjacent first reference communication data values, and after multiple rounds of median calculation the first communication value corresponding to each reference fault device is obtained. Similarly, after the plurality of second reference communication data values corresponding to each reference fault device is obtained, median processing is performed on each pair of adjacent second reference communication data values, and after multiple rounds of median calculation the second communication value corresponding to each reference fault device is obtained.
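The text does not spell out the pairwise median procedure, so the following Python sketch shows only one plausible interpretation: in each round every pair of adjacent values is replaced by its midpoint (the median of two numbers), and the rounds continue until a single value remains. The function name `pairwise_median_reduce` is hypothetical.

```python
def pairwise_median_reduce(values):
    """Repeatedly replace adjacent pairs by their midpoint until one value is left."""
    if not values:
        raise ValueError("at least one value is required")
    current = list(values)
    while len(current) > 1:
        # The median of two numbers is their midpoint.
        current = [(a + b) / 2 for a, b in zip(current, current[1:])]
    return current[0]


print(pairwise_median_reduce([2.0, 4.0, 8.0, 16.0]))
```

Under this reading the values in the middle of the sequence carry more weight than those at the ends, which differs from a plain average; other readings of the procedure are possible.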
The first communication value and the second communication value corresponding to each reference fault device are each compared with the preset first communication value, and the first total reference communication data value and the second total reference communication data value are each compared with the preset second communication value.
Further, if the first communication value and the second communication value are both less than the preset first communication value, and the first total reference communication data value and the second total reference communication data value are both less than the preset second communication value, the current reference fault device is a fault device of the serious level and requires emergency treatment.
Further, if neither the first communication value nor the second communication value is less than the preset first communication value, and neither the first total reference communication data value nor the second total reference communication data value is less than the preset second communication value, the current reference fault device is a normal device.
Further, if any one, any two, or any three of the first communication value, the second communication value, the first total reference communication data value, and the second total reference communication data value corresponding to the current reference fault device fail to meet the corresponding preset communication value, the current reference fault device is determined to be a fault device of the common level.
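The three-way grading described above can be summarized in a short Python sketch. The function and parameter names are hypothetical, and the level labels are plain strings chosen for illustration.

```python
def classify_device(first_value, second_value, first_total, second_total,
                    value_limit, total_limit):
    """Grade a reference fault device as 'serious', 'common' or 'normal'."""
    below = [
        first_value < value_limit,    # first communication value
        second_value < value_limit,   # second communication value
        first_total < total_limit,    # first total reference communication data value
        second_total < total_limit,   # second total reference communication data value
    ]
    if all(below):
        return "serious"   # all four values fall short
    if not any(below):
        return "normal"    # none of the values falls short
    return "common"        # one, two, or three values fall short


print(classify_device(4.0, 6.0, 40.0, 60.0, value_limit=5.0, total_limit=50.0))
```

In the example call only the first communication value and the first total fall below their limits, so the device is graded as a common-level fault device.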
Fig. 2 is a schematic structural diagram of a machine room fault network device detection device provided in an embodiment of the present application. As shown in Fig. 2, the machine room fault network device detection device 200 includes: at least one processor 201; and a memory 202 communicatively coupled to the at least one processor 201; wherein the memory 202 stores instructions executable by the at least one processor 201, the instructions being executable by the at least one processor 201 to enable the at least one processor 201 to: determining a plurality of links based on a computer room network map; the computer room network map comprises a plurality of nodes corresponding to the computer room and a plurality of connection relations among the nodes; acquiring current network data corresponding to each link in a current time period, and determining a plurality of center points in the current network data under the condition that the current network data does not meet preset conditions so as to fill the current network data based on a historical network database and the plurality of center points; determining communication fault detection models respectively corresponding to the links based on the historical network database and the current network data respectively corresponding to the filled links so as to obtain a fault link through the communication fault detection models; determining an intermediate node corresponding to the fault link, grouping the fault link based on the intermediate node, and detecting communication connectivity of the grouped fault link to determine reference fault equipment; based on a preset communication test task, carrying out communication data test on the reference fault equipment to obtain a reference communication data value corresponding to the reference fault equipment; and comparing the reference communication data value with a preset communication data value to determine fault equipment information in the current machine room based on a comparison result.
The embodiments also provide a non-volatile computer storage medium storing computer executable instructions configured to: determining a plurality of links based on a computer room network map; the computer room network map comprises a plurality of nodes corresponding to the computer room and a plurality of connection relations among the nodes; acquiring current network data corresponding to each link in a current time period, and determining a plurality of center points in the current network data under the condition that the current network data does not meet preset conditions so as to fill the current network data based on a historical network database and the plurality of center points; determining communication fault detection models respectively corresponding to the links based on the historical network database and the current network data respectively corresponding to the filled links so as to obtain a fault link through the communication fault detection models; determining an intermediate node corresponding to the fault link, grouping the fault link based on the intermediate node, and detecting communication connectivity of the grouped fault link to determine reference fault equipment; based on a preset communication test task, carrying out communication data test on the reference fault equipment to obtain a reference communication data value corresponding to the reference fault equipment; and comparing the reference communication data value with a preset communication data value to determine fault equipment information in the current machine room based on a comparison result.
All embodiments in the present application are described in a progressive manner; for parts that are identical or similar across embodiments, reference may be made to one another, and each embodiment focuses on its differences from the others. In particular, the device and non-volatile computer storage medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding parts of the method embodiments.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and alterations to the embodiments herein will be apparent to those skilled in the art without departing from the spirit and scope of the various embodiments of the invention.
Claims (9)
1. A method for detecting machine room fault network equipment, characterized by comprising the following steps:
determining a plurality of links based on a computer room network map; the computer room network map comprises a plurality of nodes corresponding to the computer room and a plurality of connection relations among the nodes;
acquiring current network data corresponding to each link in a current time period, and determining a plurality of center points in the current network data under the condition that the current network data does not meet preset conditions so as to fill the current network data based on a historical network database and the plurality of center points;
determining a communication fault detection model corresponding to each of a plurality of links based on the historical network database and the current network data corresponding to each of the filled links respectively, so as to obtain a fault link through the communication fault detection model;
determining an intermediate node corresponding to the fault link, grouping the fault link based on the intermediate node, and detecting communication connectivity of the grouped fault link to determine reference fault equipment;
based on a preset communication test task, carrying out communication data test on the reference fault equipment to obtain a reference communication data value corresponding to the reference fault equipment;
comparing the reference communication data value with a preset communication data value to determine fault equipment information in the current machine room based on a comparison result;
the obtaining current network data corresponding to each link in the current time period, determining a plurality of center points in the current network data under the condition that the current network data does not meet preset conditions, so as to fill the current network data based on a historical network database and the plurality of center points, including:
acquiring first actual average network data corresponding to each link in a previous time period, acquiring second actual average network data corresponding to each link in a current time period, and determining a difference value between the first actual average network data and the second actual average network data; wherein, the first actual average network data is related to network data corresponding to a plurality of time slices in a previous time period respectively; the second actual average network data is related to network data corresponding to a plurality of time slices in the current time period respectively;
selecting a preset number of network data from the network data corresponding to the current time period to serve as a plurality of center points under the condition that the difference value is larger than a preset difference value threshold value;
determining, in a historical network database, a plurality of sample elements whose distance from the current center point is within a preset distance, as a sample data set;
determining the distance between the current center point and each sample element in the sample data set, and taking a plurality of sample elements in a preset distance range as a filling data set corresponding to the current center point;
and determining filling data sets corresponding to the plurality of center points respectively, and filling network data corresponding to the current time period based on the filling data sets corresponding to the plurality of center points respectively.
2. The method for detecting a machine room fault network device according to claim 1, wherein the determining a communication fault detection model corresponding to each of the plurality of links based on the historical network database and the current network data corresponding to each of the filled links, so as to obtain a fault link through the communication fault detection model specifically includes:
determining the history network data corresponding to each filled link respectively;
dividing the historical network data into a plurality of equal time periods respectively, and determining average network data corresponding to the equal time periods respectively;
determining N average network data corresponding to a current link, taking the first N-1 average network data as input, taking the Nth average network data as output, and training a preset neural network model to obtain a communication fault detection model corresponding to the current link;
after obtaining the actual average network data of the current link corresponding to the current time period, adding the actual average network data into an output training set of a communication fault detection model corresponding to the current link, and deleting the average network data with earliest time in the input training set;
based on the current network data respectively corresponding to each link, the training data set respectively corresponding to each link is updated, so that the communication fault detection model respectively corresponding to each link is determined through the updated training data set, and the network data of the links are detected through the fault detection models respectively corresponding to the links, so that the fault link is determined.
3. The method for detecting a machine room fault network device according to claim 1, wherein the determining the intermediate node corresponding to the fault link, grouping the fault link based on the intermediate node, and detecting communication connectivity of the grouped fault link to determine a reference fault device, specifically includes:
determining an intermediate node of the fault link, and dividing the fault link into two sub-links by taking the intermediate node as a center;
detecting communication connectivity of the two sub-links respectively, and determining a first fault sub-link;
determining an intermediate node of the first fault sub-link, and dividing the first fault sub-link into two sub-links by taking the intermediate node as a center;
detecting communication connectivity of two sub-links corresponding to the first fault sub-link, and determining a second fault sub-link;
and dividing the fault sub-link for a plurality of times until the reference fault equipment is determined.
4. The method for detecting a machine room fault network device according to claim 3, wherein the detecting communication connectivity of the two sub-links respectively, and determining the first fault sub-link specifically includes:
transmitting a detection signal to a downstream node through a start node in the sub-link;
determining the sub-link corresponding to the start node to be a first fault sub-link under the condition that the error rate of the detection signal received by the downstream node is larger than a preset error rate, and/or the energy intensity of the detection signal received by the downstream node is smaller than a preset energy intensity, and/or the interval time length of the detection signal received by the downstream node is greater than a preset interval time length threshold.
5. The method for detecting a machine room fault network device according to claim 1, wherein the communication data testing is performed on the reference fault device based on a preset communication test task to obtain a reference communication data value corresponding to the reference fault device, specifically including:
sequentially performing communication data test on a plurality of reference fault devices through a first test task in a preset communication test task list to obtain first reference communication data values corresponding to each reference fault device respectively;
sequentially performing communication data test on a plurality of reference fault devices through a second test task in a preset communication test task list to obtain second reference communication data values corresponding to each reference fault device respectively;
repeating the first test task and the second test task, determining a first total reference communication data value corresponding to a plurality of reference fault devices respectively based on the obtained plurality of first reference communication data values and the obtained plurality of second reference communication data values, and determining a second total reference communication data value corresponding to the plurality of reference fault devices respectively.
6. The method for detecting a fault network device in a machine room according to claim 5, wherein the comparing the reference communication data value with a preset communication data value to determine the fault device information in the current machine room based on the comparison result specifically includes:
performing pairwise median processing on a plurality of first reference communication data values corresponding to each reference fault device respectively to obtain first communication values corresponding to each reference fault device respectively; performing pairwise median processing on a plurality of second reference communication data values corresponding to each reference fault device respectively to obtain second communication values corresponding to each reference fault device respectively;
and comparing the first communication value with the second communication value respectively with a preset first communication value, and comparing the first total reference communication data value with the second total reference communication data value respectively with a preset second communication value to determine the fault equipment information through a comparison result.
7. The method for detecting a machine room fault network device according to claim 6, wherein comparing the first communication value and the second communication value with preset first communication values respectively, and comparing the first total reference communication data value and the second total reference communication data value with preset second communication values respectively, so as to determine the fault device information through a comparison result, specifically includes:
determining that the reference fault device is a fault device of a serious grade under the condition that the first communication value and the second communication value are smaller than the preset first communication value and the first total reference communication data value and the second total reference communication data value are smaller than the preset second communication value;
determining that the reference fault device is a normal device when the first communication value and the second communication value are not smaller than the preset first communication value and the first total reference communication data value and the second total reference communication data value are not smaller than the preset second communication value;
otherwise, determining that the reference fault equipment is common-grade fault equipment.
8. A machine room failure network device detection apparatus, characterized in that the apparatus comprises a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform a machine room failure network device detection method according to any of claims 1-7.
9. A non-transitory computer storage medium storing computer executable instructions, wherein the computer executable instructions are capable of performing a machine room failure network device detection method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310375376.XA CN116112344B (en) | 2023-04-11 | 2023-04-11 | Method, equipment and medium for detecting machine room fault network equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310375376.XA CN116112344B (en) | 2023-04-11 | 2023-04-11 | Method, equipment and medium for detecting machine room fault network equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116112344A CN116112344A (en) | 2023-05-12 |
CN116112344B CN116112344B (en) | 2023-06-20 |
Family ID: 86267603
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310375376.XA Active CN116112344B (en) | Method, equipment and medium for detecting machine room fault network equipment | 2023-04-11 | 2023-04-11 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116112344B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102761475A (en) * | 2012-03-27 | 2012-10-31 | 西安交通大学 | Internetwork-on-chip fault-tolerance routing method based on channel dependency graphs |
CN108491305A (en) * | 2018-03-09 | 2018-09-04 | 网宿科技股份有限公司 | A kind of detection method and system of server failure |
CN114089118A (en) * | 2021-11-24 | 2022-02-25 | 重庆大学 | Intelligent substation fault positioning method based on gated cyclic unit network |
CN114662249A (en) * | 2020-12-22 | 2022-06-24 | 中国石油化工股份有限公司 | Pipe network model establishing method and pipe network model establishing device |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105591775B (en) * | 2014-10-23 | 2019-10-25 | 华为技术有限公司 | A kind of operation management maintainance OAM methods, devices and systems of network |
CN111126603A (en) * | 2019-12-25 | 2020-05-08 | 江苏远望仪器集团有限公司 | Equipment fault prediction method, device and equipment based on neural network model |
AU2020102478A4 (en) * | 2020-09-29 | 2020-11-12 | Liu, Yilong Mr | A Method of A Power Utility Failure Detection Model Based on State Space Model |
CN112560349A (en) * | 2020-12-21 | 2021-03-26 | 上海云瀚科技股份有限公司 | Missing value filling method of water service partition metering fault flow instrument based on GCNs |
CN112911625B (en) * | 2021-02-04 | 2022-06-03 | 重庆邮电大学 | Fault diagnosis method for deterministic time slot communication sensing node |
CN114266294A (en) * | 2021-12-08 | 2022-04-01 | 中国联合网络通信集团有限公司 | Training method of classification model, and fault analysis method and device of target link |
CN114861880B (en) * | 2022-05-06 | 2024-04-12 | 清华大学 | Industrial equipment fault prediction method and device based on cavity convolutional neural network |
CN115037599A (en) * | 2022-06-13 | 2022-09-09 | 中国电信股份有限公司 | Communication network fault early warning method, device, equipment and medium |
- 2023-04-11: CN application CN202310375376.XA, patent CN116112344B (en), status Active
Also Published As
Publication number | Publication date |
---|---|
CN116112344A (en) | 2023-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104462757A (en) | Sequential verification test method of Weibull distribution reliability based on monitoring data | |
CN104376033A (en) | Fault diagnosis method based on fault tree and database technology | |
RU2336566C2 (en) | Method of modeling of processes of provision of technical readiness of communication networks in technical operation and system for its implementation | |
CN105515893B (en) | Method for determining position of sampling point | |
CN108268023B (en) | Remote fault diagnosis method and system for rail transit platform door | |
CN114943321A (en) | Fault prediction method, device and equipment for hard disk | |
CN114495497B (en) | Method and system for judging and interpolating traffic abnormal data | |
CN106506226A (en) | A kind of startup method and device of fault detect | |
CN116593883A (en) | Breaker body fault diagnosis method, device and equipment of intelligent high-voltage switch and storage medium | |
CN114779747A (en) | Vehicle fault cause determination system and method | |
CN109815124B (en) | MBSE-based interlocking function defect analysis method and device and interlocking system | |
CN116112344B (en) | Method, equipment and medium for detecting machine room fault network equipment | |
CN110609761B (en) | Method and device for determining fault source, storage medium and electronic equipment | |
CN117075578A (en) | Vehicle fault analysis method and device, electronic equipment and storage medium | |
CN111209180B (en) | Regression testing method and device based on fuzzy matching | |
CN116541728A (en) | Fault diagnosis method and device based on density clustering | |
CN113507397B (en) | Method for collecting terminal equipment state automatic inspection based on cloud operation and maintenance | |
US20240289209A1 (en) | Method and apparatus for detecting and explaining anomalies | |
CN114021744A (en) | Method and device for determining residual service life of equipment and electronic equipment | |
CN112199247B (en) | Method and device for checking Docker container process activity in non-service state | |
CN112799911A (en) | Node health state detection method, device, equipment and storage medium | |
Vedeshenkov | On the route-oriented method of system diagnostics in digital systems structured as a symmetric bipartite graph | |
CN117793764B (en) | 5G private network soft probe dial testing data integrity checksum completion method and system | |
CN111176916B (en) | Data storage fault diagnosis method and system | |
Qiu et al. | Multi-sensor system simulation based on RESTART algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: A method, equipment, and medium for detecting network equipment faults in computer rooms; Granted publication date: 20230620; Pledgee: Ji'nan rural commercial bank Limited by Share Ltd. high tech branch; Pledgor: Shandong Jinyu Information Technology Group Co.,Ltd.; Registration number: Y2024980000280 |