WO2018001326A1

WO2018001326A1 - Method and device for acquiring fault information

Info

Publication number: WO2018001326A1
Application number: PCT/CN2017/090871
Authority: WO
Inventors: 刘庆明
Original assignee: 中兴通讯股份有限公司
Priority date: 2016-06-29
Filing date: 2017-06-29
Publication date: 2018-01-04
Also published as: CN107547127A

Abstract

A method and a device for acquiring fault information. The method is applicable to a server, and comprises: when monitoring that an optical transport network (OTN) service issues an alarm, acquiring a first alarm code of a first alarm and determining a first OTN single board corresponding to the first alarm; acquiring the monitored information and sample data ranges of a service node, the service node being the service node of a preset service stream model of the first OTN single board; the monitored information comprising the data and performance parameters of the service node when the first alarm is being issued, the sample data ranges being the range of the data and the range of the performance parameters of the service node, set in advance; and uploading the first alarm code, the monitored information and the sample data ranges to a cloud server.

Description

Fault information acquisition method and device

Technical field

The present disclosure relates to the field of communications technologies, and, for example, to a fault information acquisition method and apparatus.

Background

With the rapid development of communication technology, the networks that carry Optical Transport Network (OTN) services keep expanding, and the service rate carried by a single OTN service board keeps increasing. As a result, the failure of a single service board affects a wide range of services. When a service board fails, services are therefore first switched quickly to the protection network or, in an unprotected environment, restored quickly by other means. After recovery, the fault is located, and a fault analysis, workarounds, and solutions are produced. Currently, there are two main methods for locating faults in the field:

In the first method, when the fault environment can be retained, personnel access the environment remotely, or maintenance personnel go directly to the engineering site, and use the positioning interfaces provided by the service board to perform loopback tests, advancing through the loopback points one by one to isolate the fault point. They read the monitoring registers reserved by the service board and compare them with those of a normal service, check whether the monitoring variables reserved by the board match the expected values, and analyze the alarm status of the board and its associated boards. However, these operations are performed in the actual OTN network: maintenance or R&D personnel operate directly on the live network, which may affect the normal operation of services, and interpreting the scattered monitoring registers and software variables requires personnel with considerable professional expertise.

In the second method, when the fault environment cannot be retained, the relevant environment information is collected and an attempt is made to reproduce the fault in the laboratory by simulating the engineering operations and automating repetitive operations. Experience with fault location shows that occasional on-site faults may not be reproducible, because the environment and the operating methods differ. Remote location of OTN device faults thus depends heavily on the actual fault environment of the live network, the analysis of alarm data is cumbersome, and faults are not located in a timely manner.

Therefore, the related technology has certain defects for the fault location of the OTN device.

Summary of the invention

The present disclosure provides a method and a device for acquiring fault information, which can address the defects in the related art that locating faults of OTN devices tends to affect the running stability of live network services and that fault location is not timely.

The embodiment provides a method for acquiring fault information, which is applied to the server, and may include:

When it is detected that an optical transport network (OTN) service issues an alarm, obtaining the first alarm code of the first alarm and determining the first OTN board corresponding to the first alarm;

Obtaining the monitoring information and the sample data range of a service node, where the service node is a service node of the preset service flow model of the first OTN board; the monitoring information includes the data and the performance parameters of the service node at the moment when the first alarm occurs, and the sample data range is the preset data range and performance parameter range of the service node;

Uploading the first alarm code, the monitoring information, and the sample data range to the cloud server.

Optionally, obtaining the first alarm code of the first alarm and determining the first OTN board corresponding to the first alarm includes:

When an alarm monitoring point in the service topology of the OTN service issues an alarm, obtaining the alarm code of the alarm, where the alarm code carries the time information, the location information, and the alarm name of the alarm;

Determining, according to the alarm codes, the first alarm, namely the alarm with the earliest occurrence time, and the first alarm code of the first alarm; and

Determining the first OTN board according to the first alarm code.

Optionally, the service node includes: a service split node, a service encapsulation node, and a hardware node.

Optionally, the performance parameters include: clock frequency, peripheral chip state, optical power, optical module bias voltage, and bias current.

Optionally, the method further includes:

When the bit error rate (BER) of the OTN service exceeds a preset value, determining a second OTN board corresponding to the bit error rate;

Obtaining data packet information of the service node of the preset service flow model of the second OTN board, where the data packet information includes: the number of received data packets and the number of sent data packets;

Uploading the bit error rate and the data packet information to the cloud server.

Optionally, uploading the first alarm code, the monitoring information, and the sample data range to the cloud server, including:

The first alarm code, the monitoring information, and the sample data range are encrypted according to the first preset encryption algorithm, and then uploaded to the cloud server.

Optionally, uploading the error rate and the packet information to the cloud server, including:

The error rate and the packet information are encrypted according to the second preset encryption algorithm, and then uploaded to the cloud server.

The embodiment further provides a method for acquiring fault information, which is applied to the client, and may include:

When the server detects that the OTN service issues an alarm, the client obtains from the cloud server the first alarm code of the first alarm, together with the monitoring information and the sample data range corresponding to the first alarm code; the monitoring information is the data and the performance parameters, at the moment when the first alarm occurs, of the service node of the preset service flow model of the first OTN board corresponding to the first alarm, and the sample data range is the preset data range and performance parameter range of the service node;

The first alarm code, the monitoring information, the sample data range, and the preset service flow model are displayed through the visual view.

Optionally, displaying the first alarm code, the monitoring information, and the sample data range and the preset service flow model through the visual view, including:

The first alarm code, the monitoring information, and the sample data range are decrypted into plaintext according to the first preset encryption algorithm and displayed, together with the preset service flow model, through the visual view.

Optionally, the method further includes:

When the server detects that the bit error rate (BER) of the OTN service exceeds a preset value, the client obtains from the cloud server the second OTN board corresponding to the bit error rate and the data packet information of the service node of the preset service flow model of the second OTN board, where the data packet information includes: the number of received data packets and the number of sent data packets;

The bit error rate and packet information are displayed in a visual view.

Optionally, the bit error rate and the packet information are displayed through the visual view, including:

The bit error rate and the data packet information are decrypted into plaintext according to the second preset encryption algorithm and displayed through the visual view.

The embodiment further provides a fault information obtaining device, which is applied to the server, and may include:

The alarm monitoring module is configured to: when it is detected that an optical transport network (OTN) service issues an alarm, obtain the first alarm code of the first alarm, and determine the first OTN board corresponding to the first alarm;

An information acquisition module, configured to acquire the monitoring information and the sample data range of a service node, where the service node is a service node of the preset service flow model of the first OTN board; the monitoring information includes the data and the performance parameters of the service node at the moment when the first alarm occurs, and the sample data range is the preset data range and performance parameter range of the service node;

The data uploading module is configured to upload the first alarm code, the monitoring information, and the sample data range to the cloud server.

Optionally, the alarm monitoring module includes:

The alarm code acquisition sub-module is configured to acquire an alarm code of the alarm when the alarm monitoring point in the service topology of the OTN service of the optical transport network sends an alarm, and the alarm code carries the time information, the location information, and the alarm name of the alarm;

The first alarm determining submodule is configured to determine, according to the alarm codes, the first alarm, namely the alarm with the earliest occurrence time, and the first alarm code of the first alarm;

The board determining submodule is configured to determine, according to the first alarm code, the first OTN board corresponding to the first alarm.

Optionally, the device further includes:

The error monitoring module is configured to obtain a second OTN board corresponding to the error rate when the error rate of the OTN service exceeds a preset value;

a packet information obtaining module, configured to acquire data packet information of a service node of a preset service flow model of the second OTN board, where the data packet information includes: a quantity of the received data packet and a quantity of the sent data packet;

The error information uploading module is configured to upload the error rate and the data packet information to the cloud server.

Optionally, the data upload module is used to:

The embodiment further provides a fault information obtaining device, which is applied to the client, and may include:

The data acquisition module is configured to: when the server detects that the OTN service issues an alarm, obtain from the cloud server the first alarm code of the first alarm, together with the monitoring information and the sample data range corresponding to the first alarm code; the monitoring information is the data and the performance parameters, at the moment when the first alarm occurs, of the service node of the preset service flow model of the first OTN board corresponding to the first alarm, and the sample data range is the preset data range and performance parameter range of the service node;

The view display module is configured to display the first alarm code, the monitoring information, and the sample data range and the preset service flow model through the visual view.

Optionally, the view display module is set to:

Optionally, the device further includes:

The erroneous display module is configured to: when the bit error rate (BER) of the OTN service exceeds a preset value, obtain from the cloud server the second OTN board corresponding to the bit error rate and the data packet information of the service node of the preset service flow model of the second OTN board, where the data packet information includes: the number of received data packets and the number of sent data packets;

The bit error rate and packet information are displayed in a visual view.

The embodiment further provides a computer readable storage medium storing computer executable instructions for performing any of the above methods.

The embodiment further provides a server device, the server device comprising one or more processors, a memory, and one or more programs, the one or more programs being stored in the memory and, when executed by the one or more processors, performing the corresponding fault information acquisition method described above.

The embodiment also provides a client device including one or more processors, a memory, and one or more programs, the one or more programs being stored in the memory and, when executed by the one or more processors, performing the corresponding fault information acquisition method described above.

The embodiment further provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform any of the methods described above.

With the method and device for acquiring fault information provided by the present disclosure, when it is detected that the OTN service issues an alarm, the server determines, according to the alarm codes, the first OTN board on which the earliest alarm occurred, obtains the monitoring information and the sample data range of each service node of the first OTN board, and uploads them to the cloud server. The client obtains from the cloud server the monitoring information and the sample data range of the first OTN board on which the earliest alarm occurred, matches the acquired data to the preset service flow model, and displays the service node data and the performance parameters in the monitoring information, alongside the corresponding sample data range, in a visual model, so that maintenance personnel can analyze the fault data according to the monitoring information and the sample data range. The disclosure obtains the fault information corresponding to the fault starting point directly through the server, which avoids checking a large amount of historical data item by item and speeds up fault location. It also does not require staff to operate in the on-site environment, which reduces the risk of affecting the running stability of live network services during fault location.

DRAWINGS

FIG. 1 is a schematic flowchart of a method for acquiring fault information provided by the first embodiment;

FIG. 2 is a schematic structural diagram of a fault information acquiring apparatus according to the second embodiment;

FIG. 3 is a schematic flowchart of a method for acquiring fault information provided by the third embodiment;

FIG. 4 is a schematic structural diagram of a fault information acquiring apparatus according to the fourth embodiment;

FIG. 5 is a schematic diagram of an application scenario provided by the fifth embodiment;

FIG. 6 is a schematic diagram of a system level model provided by the fifth embodiment;

FIG. 7 is a schematic diagram of a single board level model provided by the fifth embodiment;

FIG. 8 is a schematic structural diagram of hardware of a server device according to the fifth embodiment;

FIG. 9 is a schematic structural diagram of hardware of a client device according to the fifth embodiment.

Detailed description

First embodiment

Referring to FIG. 1 , this embodiment provides a method for acquiring fault information, which is applied to a server, and may include steps 110-130.

In step 110, when it is detected that an optical transport network (OTN) service issues an alarm, the first alarm code of the first alarm is obtained, and the first OTN board corresponding to the first alarm is determined.

Generally, the OTN service has multiple alarm monitoring points in the service topology. The alarm monitoring point is usually set on the service node. The alarm monitoring point generates an alarm code after the alarm is generated. The corresponding OTN board can be determined according to the alarm code.

In step 120, the monitoring information and the sample data range of a service node are obtained, where the service node is a service node of the preset service flow model of the first OTN board; the monitoring information includes the data and the performance parameters of the service node at the moment when the first alarm occurs, and the sample data range is the preset data range and performance parameter range of the service node.

The monitoring information of the preset service node of the first OTN board is obtained, where the monitoring information includes data and performance parameters transmitted by the service node at the moment when the first alarm occurs.

Optionally, the performance parameters include: a clock frequency, a peripheral chip state, an optical power, an optical module bias voltage, and a bias current. The sample data range consists of the data range corresponding to the data of the service node in the monitoring information, that is, the normal value range of the data, and the performance parameter range, that is, the normal value range of the performance parameters.

Optionally, a monitoring point is set in the service node of the sub-service flow model to obtain monitoring information of the node.
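The comparison between the monitoring information and the sample data range can be illustrated with a minimal sketch. The parameter names, units, and numeric ranges below are hypothetical and chosen only for illustration; the disclosure does not specify concrete values or a data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A preset normal value range (one entry of the sample data range)."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def out_of_range(monitored: dict[str, float],
                 sample_ranges: dict[str, Range]) -> dict[str, float]:
    """Return the monitored parameters whose values fall outside their preset range."""
    return {
        name: value
        for name, value in monitored.items()
        if name in sample_ranges and not sample_ranges[name].contains(value)
    }


# Hypothetical sample data range for one service node.
sample_ranges = {
    "clock_frequency_mhz": Range(155.0, 156.0),
    "optical_power_dbm": Range(-8.0, 2.0),
    "bias_current_ma": Range(20.0, 80.0),
}

# Hypothetical monitoring information captured when the first alarm occurred.
monitored = {
    "clock_frequency_mhz": 155.52,
    "optical_power_dbm": -14.3,
    "bias_current_ma": 35.0,
}

abnormal = out_of_range(monitored, sample_ranges)
print(abnormal)
```

In this sketch only the optical power lies outside its assumed normal range, so it is the only parameter reported as abnormal.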

In step 130, the first alarm code, the monitoring information, and the sample data range are uploaded to the cloud server.

In this embodiment, the first alarm code, the monitoring information, and the sample data range are uploaded to the cloud server, so that maintenance personnel do not need to collect the fault information manually, and maintenance or R&D personnel do not need to operate directly in the live network environment. This avoids affecting the operation of normal services, reduces the risk to the running stability of live network services during fault location, removes the need for cumbersome on-site operations, and speeds up fault location.
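As a rough illustration of what step 130 uploads, the three items can be bundled into one message. The disclosure does not define a wire format or a cloud server interface, so the JSON layout, the field names, and the sample values below are assumptions; encryption and the transfer itself are left out.

```python
import json


def build_upload_payload(first_alarm_code: dict,
                         monitoring_information: dict,
                         sample_data_range: dict) -> str:
    """Bundle the three uploaded items into one JSON document (assumed format)."""
    payload = {
        "first_alarm_code": first_alarm_code,
        "monitoring_information": monitoring_information,
        "sample_data_range": sample_data_range,
    }
    # sort_keys makes the output deterministic, which simplifies comparison.
    return json.dumps(payload, sort_keys=True)


# Hypothetical example values.
body = build_upload_payload(
    first_alarm_code={
        "time": "2017-06-29T10:00:02",
        "location": "shelf1/slot2",
        "name": "LOS",
    },
    monitoring_information={"hardware": {"optical_power_dbm": -14.3}},
    sample_data_range={"hardware": {"optical_power_dbm": [-8.0, 2.0]}},
)
decoded = json.loads(body)
print(sorted(decoded))
```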

Optionally, the foregoing step 110 includes the following steps:

In the first step, when the alarm monitoring point in the service topology of the OTN service of the optical transport network sends an alarm, the alarm code of the alarm is obtained.

The alarm code carries the time information, the location information, and the alarm name of the alarm. The time information is used to determine the order in which the alarms of the alarm monitoring points occurred. The location information is used to determine the OTN board corresponding to the alarm. The alarm name helps maintenance personnel check the cause of the alarm in a targeted manner.

In the second step, the first alarm, namely the alarm with the earliest occurrence time, and the first alarm code of the first alarm are determined according to the alarm codes.

The first OTN board corresponding to the first alarm code of the first alarm is the fault starting point. Obtaining the fault starting point directly avoids checking a large amount of historical data item by item. The first alarm, that is, the alarm that occurred earliest, is determined according to the time information of each alarm code, and the first alarm code corresponding to the first alarm is obtained, so that the relevant information of the first alarm can be obtained from the first alarm code.

In the third step, the first OTN board corresponding to the first alarm is determined according to the first alarm code.

The first OTN board, on which the earliest alarm occurred, is determined according to the location information in the first alarm code.
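The three steps above can be sketched as follows. The `AlarmCode` structure, the `shelf/slot` location format, and the alarm names are hypothetical; the disclosure only states that an alarm code carries time information, location information, and an alarm name.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AlarmCode:
    time: datetime   # when the alarm occurred
    location: str    # identifies the OTN board (hypothetical "shelf/slot" format)
    name: str        # alarm name


def first_alarm(codes: list[AlarmCode]) -> AlarmCode:
    """Step 2: pick the alarm code with the earliest occurrence time."""
    return min(codes, key=lambda code: code.time)


def board_of(code: AlarmCode) -> str:
    """Step 3: derive the OTN board from the location information."""
    return code.location


# Step 1: alarm codes collected from the alarm monitoring points (sample data).
codes = [
    AlarmCode(datetime(2017, 6, 29, 10, 0, 5), "shelf1/slot4", "ODU_AIS"),
    AlarmCode(datetime(2017, 6, 29, 10, 0, 2), "shelf1/slot2", "LOS"),
    AlarmCode(datetime(2017, 6, 29, 10, 0, 7), "shelf1/slot6", "OTU_BDI"),
]

first = first_alarm(codes)
first_board = board_of(first)
print(first.name, first_board)
```

Note that `min` raises `ValueError` on an empty list, so a caller would invoke `first_alarm` only after at least one alarm code has been collected.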

In order to improve data security, the monitoring information and the sample data range need to be encrypted and uploaded to the cloud server.

Optionally, the method further includes:

When the bit error rate (BER) of the OTN service exceeds a preset value, the OTN board corresponding to the bit error rate, namely the second OTN board, is determined.

Upload the error rate and packet information to the cloud server.

Normally, an alarm is generated when the bit error rate exceeds a certain value that is greater than the above preset value. When the bit error rate is not high enough to trigger an alarm but exceeds the above preset value, the service nodes of the preset service flow model of the second OTN board need to be checked for packet loss; if packet loss has occurred, the corresponding service node is examined, so that a system alarm caused by packet loss can be avoided.
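A minimal sketch of this pre-alarm check is shown below. The two thresholds, the node names, and the rule that a node has lost packets when it sent fewer packets than it received are assumptions made for illustration; the disclosure only states that the numbers of received and sent data packets are collected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketCounters:
    received: int
    sent: int


def needs_packet_check(ber: float, preset_value: float, alarm_threshold: float) -> bool:
    """True when the BER exceeds the preset value but is not yet high enough to raise an alarm."""
    return preset_value < ber <= alarm_threshold


def lossy_nodes(counters: dict[str, PacketCounters]) -> list[str]:
    """Assumed rule: a node lost packets if it sent fewer packets than it received."""
    return [node for node, c in counters.items() if c.sent < c.received]


# Hypothetical thresholds and measurements.
PRESET_VALUE = 1e-9
ALARM_THRESHOLD = 1e-6
measured_ber = 5e-8

counters = {
    "service_split": PacketCounters(received=1000, sent=1000),
    "service_encapsulation": PacketCounters(received=1000, sent=940),
    "hardware": PacketCounters(received=940, sent=940),
}

suspects = []
if needs_packet_check(measured_ber, PRESET_VALUE, ALARM_THRESHOLD):
    suspects = lossy_nodes(counters)
print(suspects)
```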

In order to improve data security, the bit error rate and the data packet information corresponding to the second OTN board are encrypted with the second preset encryption algorithm and then uploaded to the cloud server.

In this embodiment, when it is detected that the OTN service issues an alarm, the first OTN board, on which the earliest alarm occurred, is determined according to the alarm codes, and the monitoring information and the sample data range of each service node of the first OTN board are uploaded to the cloud server, so that maintenance personnel can analyze the fault data according to the monitoring information and the sample data range. The fault information corresponding to the fault starting point is obtained directly through the server, which avoids checking a large amount of historical data item by item and speeds up fault location. Staff also do not need to go to the on-site environment, which reduces the risk of affecting the running stability of the live network during fault location.

Second embodiment

Referring to FIG. 2, the embodiment provides a fault information acquiring apparatus, which is applied to a server, and may include:

The alarm monitoring module 201 is configured to obtain the first alarm code of the first alarm and determine the first OTN board corresponding to the first alarm when it is detected that an optical transport network (OTN) service issues an alarm.

Generally, the OTN service has multiple alarm monitoring points in the service topology, and an alarm monitoring point can be set on a service node. After an alarm monitoring point issues an alarm, an alarm code is generated, and the corresponding OTN board can be determined according to the alarm code.

The information obtaining module 202 is configured to acquire the monitoring information and the sample data range of a service node, where the service node is a service node of the preset service flow model of the first OTN board; the monitoring information includes the data and the performance parameters of the service node at the moment when the first alarm occurs, and the sample data range is the preset data range and performance parameter range of the service node.

The monitoring information of the preset service node of the first OTN board is obtained. The monitoring information includes the data transmitted by the service node at the moment when the first alarm occurs and the performance parameters. Optionally, the performance parameters include: clock frequency, peripheral chip state, optical power, optical module bias voltage, and bias current; the performance parameters may also include temperature, incoming and outgoing optical power, and bias current. The sample data range consists of the data range corresponding to the service node data in the monitoring information, that is, the normal value range, and the range of the performance parameters.

The data uploading module 203 is configured to upload the first alarm code, the monitoring information, and the sample data range to the cloud server.

The first alarm code, the monitoring information, and the sample data range are uploaded to the cloud server, so that maintenance personnel do not need to collect the fault information manually, and maintenance or R&D personnel do not need to operate directly in the live network environment. This avoids affecting the operation of normal services, reduces the risk to the running stability of the live network during fault location, and speeds up fault location.

Optionally, the alarm monitoring module 201 includes:

The alarm code acquisition sub-module is configured to obtain the alarm code of an alarm when an alarm monitoring point in the service topology of the OTN service issues the alarm, where the alarm code carries the time information, the location information, and the alarm name of the alarm;

Optionally, the device further includes:

Optionally, the data uploading module 203 is configured to:

Optionally, the error information uploading module is set to:

In this embodiment, when it is detected that the OTN service issues an alarm, the first OTN board, on which the earliest alarm occurred, is determined according to the alarm codes, and the monitoring information and the sample data range of each service node of the first OTN board are uploaded to the cloud server, so that maintenance personnel can remotely analyze the fault data according to the monitoring information and the sample data range. The fault information corresponding to the fault starting point is obtained directly through the server, which avoids checking a large amount of historical data item by item and speeds up fault location. Staff also do not need to go to the on-site environment, which reduces the risk of affecting the running stability of the live network during fault location.

Third embodiment

Referring to FIG. 3, the embodiment provides a method for acquiring fault information, which is applied to the client, and may include steps 310-320.

In step 310, when the server detects that the OTN service issues an alarm, the client obtains, from the cloud server that has received the first alarm code of the first alarm, the first alarm code together with the monitoring information and the sample data range corresponding to the first alarm code; the monitoring information is the data and the performance parameters, at the moment when the first alarm occurs, of the service node of the preset service flow model of the first OTN board corresponding to the first alarm, and the sample data range is the preset data range and performance parameter range of the service node.

The first alarm is the alarm that occurred earliest. When the server detects that the OTN service issues an alarm, the first alarm code, the monitoring information, and the sample data range of that earliest alarm are obtained from the cloud server.

For example, when the server detects that the OTN service issues an alarm, the server uploads the service link data at the time of the fault to the cloud server and notifies the client. The client obtains from the cloud server the first alarm code, the monitoring information, and the sample data range of the earliest alarm, and then displays them in a visual view according to the data obtained from the cloud server. Generally, the OTN service has multiple alarm monitoring points in the service topology, and an alarm monitoring point can be set on a service node. An alarm monitoring point generates an alarm code after issuing an alarm. The OTN board on which the earliest alarm occurred is usually taken as the starting point of the fault, that is, the first OTN board corresponding to the first alarm code of the first alarm is the fault starting point. Obtaining the fault starting point directly avoids checking a large amount of historical data item by item, reduces the workload of the relevant staff, and speeds up fault location.

In step 320, the first alarm code, the monitoring information, the sample data range, and the preset service flow model are displayed through the visual view.

The preset service flow model is matched according to the first alarm code, the monitoring information, and the sample data range obtained from the cloud server, and the service node data and the performance parameters in the monitoring information are displayed, alongside the corresponding sample data range, in the visual model. Maintenance personnel can compare the data in the visual model and quickly pick out the faulty service node. OTN device faults can thus be located remotely without relying on the actual scene of the live network fault, the cumbersome analysis of alarm data is avoided, and fault location is faster.
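The matching and side-by-side display can be pictured with a small text-based sketch. A real client would presumably render a graphical view; the node names, parameter names, values, and the plain-text layout here are all hypothetical.

```python
def render_view(flow_model: list[str],
                monitoring: dict[str, dict[str, float]],
                sample_ranges: dict[str, dict[str, tuple[float, float]]]) -> list[str]:
    """Walk the service flow model in order and show each value next to its normal range."""
    lines = []
    for node in flow_model:
        for param, value in monitoring.get(node, {}).items():
            low, high = sample_ranges[node][param]
            status = "OK" if low <= value <= high else "OUT OF RANGE"
            lines.append(f"{node:<22} {param:<18} {value:>8.2f}  [{low}, {high}]  {status}")
    return lines


# Hypothetical preset service flow model of the first OTN board.
flow_model = ["service_split", "service_encapsulation", "hardware"]

# Hypothetical data obtained from the cloud server.
monitoring = {
    "service_split": {"clock_mhz": 155.52},
    "service_encapsulation": {"clock_mhz": 155.52},
    "hardware": {"optical_power_dbm": -14.3, "bias_current_ma": 35.0},
}
sample_ranges = {
    "service_split": {"clock_mhz": (155.0, 156.0)},
    "service_encapsulation": {"clock_mhz": (155.0, 156.0)},
    "hardware": {"optical_power_dbm": (-8.0, 2.0), "bias_current_ma": (20.0, 80.0)},
}

view = render_view(flow_model, monitoring, sample_ranges)
for line in view:
    print(line)
```

Listing the nodes in flow order keeps the abnormal entry in the context of its neighbors, which is the comparison the visual model is meant to support. The sketch assumes every monitored parameter has a matching range entry and raises `KeyError` otherwise.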

Optionally, step 310 may include:

The data obtained from the cloud server is the ciphertext encrypted according to the first preset encryption algorithm. Therefore, the ciphertext needs to be decrypted to obtain the plaintext, and displayed on the service flow model.

Optionally, the method further includes:

The bit error rate and packet information are displayed in a visual view.

When the second OTN board corresponding to the bit error rate and the data packet information are obtained from the cloud server, the bit error rate and the data packet information are displayed visually on the service flow model, so that maintenance personnel can accurately locate the service node where packet loss occurs.

The error rate and the packet information obtained from the cloud server are ciphertexts encrypted according to the second preset encryption algorithm. Therefore, the ciphertext needs to be decrypted to obtain plaintext, and displayed on the service flow model.

In this embodiment, when it is detected that the OTN service issues an alarm, the monitoring information and the sample data range of the first OTN board, on which the earliest alarm occurred, are obtained from the cloud server, the preset service flow model is matched according to the data obtained from the cloud server, and the service node data and the performance parameters in the monitoring information are displayed, alongside the corresponding sample data range, in a visual model. Maintenance personnel can compare the data in the visual model and quickly pick out the faulty service node, so that OTN device faults are located remotely without relying on the actual scene of the live network fault and without the cumbersome analysis of alarm data. The client obtains the fault information corresponding to the fault starting point directly from the cloud server, which avoids checking a large amount of historical data item by item and speeds up fault location. Staff also do not need to go to the on-site environment, which reduces the risk of affecting the running stability of the live network during fault location.

Fourth embodiment

Referring to FIG. 4, the embodiment provides a fault information obtaining apparatus, which is applied to a client, and may include:

The data acquisition module 401 is configured to: when the server detects that the OTN service raises an alarm, obtain, from the cloud server that has obtained the first alarm code of the first alarm, the first alarm code together with the monitoring information and the sample data range corresponding to the first alarm code. The monitoring information is the data and performance parameters, at the moment the first alarm occurs, of the service nodes of the preset service flow model of the first OTN board corresponding to the first alarm. The sample data range is the preset data range and performance parameter range of the service node data.

The first alarm is the alarm that occurs earliest. When the OTN service raises an alarm, the first alarm code, the monitoring information, and the sample data range of the first alarm are obtained from the cloud server.

Generally, an OTN service has multiple alarm monitoring points in its service topology, and an alarm monitoring point can be set on a service node. After an alarm is raised, the alarm monitoring point generates an alarm code. The OTN board whose alarm occurs earliest is usually taken as the fault starting point, so the first OTN board corresponding to the first alarm code of the first alarm is the fault starting point. Obtaining the fault starting point directly avoids checking a large amount of historical data item by item and reduces the workload of the relevant staff.
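The selection of the fault starting point can be illustrated with a minimal Python sketch. The claims state that an alarm code carries time information, location information, and an alarm name; the `AlarmCode` layout, the board identifiers, and the alarm names below are hypothetical and serve only as an illustration, not as the format used by the disclosed device.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AlarmCode:
    """Hypothetical alarm code: time, location (reduced to a board id), and alarm name."""
    occurred_at: datetime
    board: str
    name: str


def find_first_alarm(alarms):
    """Return the alarm with the earliest occurrence time, or None if there are none."""
    return min(alarms, key=lambda a: a.occurred_at, default=None)


# Illustrative alarms reported by monitoring points on three boards.
alarms = [
    AlarmCode(datetime(2017, 6, 29, 10, 0, 5), "line-board-2", "ODU2-AIS"),
    AlarmCode(datetime(2017, 6, 29, 10, 0, 1), "client-board-1", "LOS"),
    AlarmCode(datetime(2017, 6, 29, 10, 0, 3), "cross-board-1", "ODU2-BDI"),
]

first_alarm = find_first_alarm(alarms)
# The board of the earliest alarm is treated as the fault starting point.
fault_start_board = first_alarm.board
```

In this sketch the earliest alarm is found in a single pass over the reported alarm codes, so no historical data has to be searched.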

The view display module 402 is configured to display the first alarm code, the monitoring information, and the sample data range together with the preset service flow model through a visual view.

The data obtained from the cloud server is matched with the preset service flow model, and the service node data and performance parameters in the monitoring information are displayed in the visual model alongside the corresponding sample data range. Maintenance personnel can compare the data in the visual model and quickly identify the faulty service node, so that the OTN device fault is located remotely. Fault location no longer depends on the actual fault scenario on the live network, the cumbersome analysis of alarm data is avoided, and faults are located faster.
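The comparison of monitoring information against the sample data range can be sketched as follows. This is only an illustration under stated assumptions: the node names, parameter names, units, and numeric ranges are hypothetical, and the disclosure does not prescribe any particular data layout.

```python
def out_of_range(monitoring, sample_ranges):
    """Return {node: {parameter: (value, (low, high))}} for values outside their preset range."""
    suspects = {}
    for node, params in monitoring.items():
        for name, value in params.items():
            bounds = sample_ranges.get(node, {}).get(name)
            if bounds is None:
                continue  # no preset range for this parameter, so nothing to compare
            low, high = bounds
            if not (low <= value <= high):
                suspects.setdefault(node, {})[name] = (value, bounds)
    return suspects


# Hypothetical values captured at the moment of the first alarm.
monitoring = {
    "optical_module": {"input_power_dbm": -31.0, "bias_current_ma": 35.0},
    "clock": {"frequency_mhz": 155.52},
}
# Hypothetical preset ranges for the same service nodes.
sample_ranges = {
    "optical_module": {"input_power_dbm": (-18.0, -3.0), "bias_current_ma": (20.0, 60.0)},
    "clock": {"frequency_mhz": (155.51, 155.53)},
}

suspects = out_of_range(monitoring, sample_ranges)
```

A visual view could highlight the nodes listed in `suspects`, which is the comparison that maintenance personnel would otherwise perform by eye.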

Optionally, the view display module 402 is configured to:

Decrypt the first alarm code, the monitoring information, and the sample data range into plaintext according to the first preset encryption algorithm, and display them together with the preset service flow model through the visual view.

Optionally, the device further includes:

The error display module is configured to: when the server detects that the bit error rate of the OTN service exceeds a preset value, obtain from the cloud server the second OTN board corresponding to the bit error rate and the data packet information of the service nodes of the preset service flow model of the second OTN board, where the data packet information includes the number of received data packets and the number of transmitted data packets; and

display the bit error rate and the data packet information through a visual view.
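One way the received and transmitted packet counts might be used is sketched below. The sketch assumes, purely for illustration, that a healthy service node forwards as many packets as it receives; the disclosure does not state this rule, and the node names and counts are hypothetical.

```python
def first_mismatched_node(packet_info):
    """packet_info: list of (node, received, sent) ordered along the service flow.

    Returns the first node whose received and sent counts differ, or None.
    """
    for node, received, sent in packet_info:
        if received != sent:
            return node
    return None


# Hypothetical counters for the service nodes of the second OTN board.
packet_info = [
    ("10GE-LAN port", 1000, 1000),
    ("GFP-F", 1000, 990),
    ("ODU2", 990, 990),
]

suspect_node = first_mismatched_node(packet_info)
```

Displaying the counters per node in flow order makes such a mismatch visible at a glance.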

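Optionally, the error display module is configured to:

Decrypt the bit error rate and the data packet information into plaintext according to the second preset encryption algorithm, and display them through the visual view.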

In this embodiment, when the server detects that the OTN service raises an alarm, the client obtains from the cloud server the monitoring information and the sample data range of the first OTN board, that is, the board whose alarm occurred earliest. The data obtained from the cloud server is matched with the preset service flow model, and the service node data and performance parameters in the monitoring information are displayed in the visual model alongside the corresponding sample data range. Maintenance personnel can then compare the data in the visual model, quickly identify the faulty service node, and locate the OTN device fault remotely. Fault location no longer depends on the actual fault scenario on the live network, and the cumbersome analysis of alarm data is avoided. The fault information corresponding to the fault starting point is obtained directly from the cloud server through the client, which avoids troubleshooting against a large amount of historical data, speeds up fault location, and removes the need for staff to operate on site, thereby reducing the risk of affecting the operational stability of live network services during fault location.

Fifth embodiment

Referring to FIG. 5, this embodiment describes a fault information acquisition method in an application scenario. The application scenario shown in FIG. 5 mainly includes an OTN device, a server, a cloud server, and a client.

A visual data model of the OTN board can be established on the server side. The model is abstracted into a visual view based on the hardware layout of the OTN board, and the service flow direction and the monitoring point data on the service flow are displayed in the visual view. Referring to FIG. 6 and FIG. 7, the hardware at the level of boards includes a client-side service board, a line-side service board, and a cross-board group. The hardware within a board includes the OTN service board hardware, an optical module, a field programmable gate array (FPGA), a framing chip, and a clock module.

Optionally, a system-level service processing model is abstracted. The model takes each board as a basic unit and visualizes the alarms at a given time point. That is, it displays the alarms that affect the service, such as alarms of the optical input port, the optical output port, the scheduling receive port, the scheduling send port, the internal receive port, the internal send port, and the backplane-related ports, together with the bit error rate.

By combining the service topology of the project site with the above service processing model, the relevant staff can quickly determine at which node on the service link the service first failed, in which time period alarms occurred, and so on, without having to sift through historical alarm data according to the topology.

Optionally, a service flow model is abstracted for the processing performed by the service board. The service flow can be divided into an A-direction service flow and a B-direction service flow, where the A direction is the direction from the optical port to the cross matrix and the B direction is the direction from the cross matrix to the optical port. According to the actual service mapping and configuration, the service flows processed by the board are abstracted into different models. For example, a 10GE service is accessed at the optical port and then enters the cross system as an optical channel data unit (ODU2). The processing model of the A-direction service flow in the board is: optical port (10GE-LAN)→GFP-F→ODU2→cross system. The processing model of the B-direction service flow is: cross system→ODU2→GFP-F→10GE-LAN.

The 10GE-LAN→GFP-F mapping is implemented in the framing chip (framer), and the GFP-F→ODU2 mapping is implemented in the FPGA. The abstracted service flow model presents the 10GE-related alarms on the optical port.
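The 10GE example can be written down as a small data structure. The following Python sketch is illustrative only: the stage labels follow the text, while the variable names, the helper functions, and the choice to derive the B direction by reversing the A-direction path are assumptions made for the example.

```python
# A-direction stages for the 10GE example, from the optical port to the cross system.
A_DIRECTION = ["10GE-LAN optical port", "GFP-F", "ODU2", "cross system"]
# Assumption: the B direction traverses the same stages in reverse order.
B_DIRECTION = list(reversed(A_DIRECTION))

# Component that implements each A-direction mapping step, as stated in the text.
STEP_HARDWARE = {
    ("10GE-LAN optical port", "GFP-F"): "framer",
    ("GFP-F", "ODU2"): "FPGA",
}


def steps(path):
    """Return consecutive (source, target) stage pairs of a service flow path."""
    return list(zip(path, path[1:]))


def hardware_for(step):
    """Return the component implementing a step, or None if the text does not name one."""
    return STEP_HARDWARE.get(step)


a_steps = steps(A_DIRECTION)
```

Attaching alarms and performance values to each stage of such a path yields the per-board model that the visual view renders.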

Referring to FIG. 7, the framing chip presents alarms of the Generic Framing Procedure (GFP) layer, and the FPGA presents alarms of the Optical Channel Data Unit (ODU) layer. The optical module part displays performance such as input optical power and bias current, and the clock part shows the clock frequency measurement and its normal range. The protection part shows all alarm states that currently trigger switching, such as signal failure and signal degradation, and the AOSD part shows the alarm states related to switching the laser. Different abstract models are designed for visual display according to the functions of the service board and the services it carries.

In this embodiment, the service data is uploaded to the cloud server so that R&D personnel or maintenance personnel can obtain the corresponding service data through the client and perform service fault analysis. This stage may include data collection and data uploading. During data collection, the boards traversed by the faulty link to be analyzed are confirmed, the data collection device is started, and the service data is stored and uploaded in the agreed format. To keep the service data secure, it can be encrypted before being uploaded to the cloud server. The data collection, data encryption, and upload functions can be enabled at the same time.
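The collect, encrypt, and upload sequence can be sketched as follows. The disclosure does not specify the agreed format, the encryption algorithm, or the upload interface, so the sketch assumes a JSON record and takes the encryption and upload steps as caller-supplied functions. All names are hypothetical, and Base64 is used in the demonstration only as a stand-in transform that shows where encryption would be applied; it provides no security.

```python
import base64
import json


def build_record(first_alarm_code, monitoring, sample_ranges):
    """Assemble the data to upload in a hypothetical agreed format."""
    return {
        "first_alarm_code": first_alarm_code,
        "monitoring": monitoring,
        "sample_ranges": sample_ranges,
    }


def upload_record(record, encrypt, upload):
    """Serialize the record, apply the supplied encryption, then hand it to the uploader."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    upload(encrypt(payload))


# Demonstration with stand-ins: a list plays the role of the cloud server.
uploaded = []
record = build_record(
    "LOS@client-board-1",
    {"clock": {"frequency_mhz": 155.52}},
    {"clock": {"frequency_mhz": [155.51, 155.53]}},
)
upload_record(record, encrypt=base64.b64encode, upload=uploaded.append)
```

Passing the encryption and upload steps in as functions keeps the collection logic independent of whichever preset algorithm and transport a deployment actually uses.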

After R&D personnel obtain the service data, they can build a system-level OTN service flow model according to the topology of the service. The system-level service flow model may include the boards on the faulty service link and the alarms of the ports on those boards. By selecting the time period in which the fault occurred, they can view how alarms propagate along the entire link and thereby determine the fault occurrence point. A board-level service flow model is then obtained from the data of the board corresponding to the fault occurrence point. The board-level service flow model can include information such as alarms, clock frequencies, and the values of important monitoring node registers, and shows whether each monitored register value is within the normal range.

As shown in FIG. 8, this embodiment provides a hardware structure of a server device. The server device includes a processor 510, a memory 520, a communication interface 530, and a bus 540.

The processor 510, the memory 520, and the communication interface 530 can communicate with each other through the bus 540. The communication interface 530 can be used for information transfer. The processor 510 can call the logic instructions in the memory 520 to perform the corresponding fault information acquisition method in the above embodiments.

The memory 520 may include a program storage area and a data storage area. The program storage area may store an operating system and an application required for at least one function. The data storage area may store data created according to the use of the server device, and the like. Further, the memory may include volatile memory, such as random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device.

Moreover, when the logic instructions in the memory 520 are implemented in the form of software functional units and sold or used as separate products, they can be stored in a computer readable storage medium. The technical solution of the present disclosure may be embodied in the form of a computer software product, which is stored in a storage medium and includes a plurality of instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method described in this embodiment.

The storage medium may be a non-transitory storage medium or a transitory storage medium. The non-transitory storage medium may include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium that can store program code.

FIG. 9 is a schematic diagram of a hardware structure of a client device according to this embodiment. As shown in FIG. 9, the client device includes one or more processors 610 and a memory 620. One processor 610 is taken as an example in FIG. 9.

The client device may further include: an input device 630 and an output device 640.

The processor 610, the memory 620, the input device 630, and the output device 640 in the client device may be connected by a bus or by other means. A bus connection is taken as an example in FIG. 9.

The input device 630 can receive input numeric or character information, and the output device 640 can include a display device such as a display screen.

The memory 620 is a computer readable storage medium that can be used to store software programs, computer executable programs, and modules. The processor 610 executes a plurality of functional applications and data processing by executing software programs, instructions, and modules stored in the memory 620 to implement corresponding fault information acquisition methods in the above embodiments.

The memory 620 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application required for at least one function, and the data storage area may store data created according to the use of the client device, and the like. In addition, the memory may include volatile memory, such as random access memory (RAM), and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device.

The memory 620 can be a non-transitory computer storage medium or a transitory computer storage medium. The non-transitory computer storage medium may be, for example, at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 620 can optionally include memory located remotely from the processor 610, which can be connected to the client device over a network. Examples of such networks include the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

Input device 630 can be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the client device. The output device 640 can include a display device such as a display screen.

The client device of this embodiment may also include a communication device 650 for transmitting and/or receiving information over a communication network.

A person skilled in the art can understand that all or part of the processes of the above method embodiments can be completed by a computer program instructing related hardware, and the program can be stored in a non-transitory computer readable storage medium. When executed, the program may include the flows of the method embodiments described above. The non-transitory computer readable storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.

Industrial applicability

The present disclosure provides a fault information acquisition method and device that can directly obtain the fault information corresponding to the fault starting point when a service fault occurs. This avoids troubleshooting against a large amount of historical data, speeds up fault location, removes the need for staff to operate on site, and reduces the risk of affecting the operational stability of the live network during fault location.

Claims

A fault information acquisition method, applied to a server, comprising:

When it is detected that an optical transport network (OTN) service raises an alarm, acquiring a first alarm code of a first alarm and determining a first OTN board corresponding to the first alarm;

Acquiring monitoring information and a sample data range of a service node, where the service node is a service node of a preset service flow model of the first OTN board; the monitoring information includes the data and performance parameters of the service node at the moment the first alarm occurs, and the sample data range is a preset data range and performance parameter range of the service node data;

Uploading the first alarm code, the monitoring information, and the sample data range to a cloud server.
The method of claim 1, wherein the acquiring the first alarm code of the first alarm and determining the first OTN board corresponding to the first alarm comprises:

Acquiring an alarm code of an alarm when an alarm monitoring point in the service topology of the OTN service raises the alarm, where the alarm code carries time information, location information, and an alarm name of the alarm;

Determining, according to the alarm code, the first alarm, whose alarm time is the earliest, and the first alarm code;

Determining the first OTN board according to the first alarm code.
The method of claim 1, wherein the service node comprises: a service split node, a service encapsulation node, and a hardware node.
The method of claim 1 wherein the performance parameters comprise: a clock frequency, a peripheral chip state, an optical power, an optical module bias voltage, and a bias current.
The method of claim 1, further comprising:

Acquiring a second OTN board corresponding to the bit error rate when the bit error rate of the OTN service exceeds a preset value;

Acquiring data packet information of the service node of the preset service flow model of the second OTN board, where the data packet information includes the number of received data packets and the number of sent data packets;

Uploading the bit error rate and the data packet information to the cloud server.
The method of claim 1, wherein the uploading the first alarm code, the monitoring information, and the sample data range to the cloud server comprises:

Encrypting the first alarm code, the monitoring information, and the sample data range according to a first preset encryption algorithm, and then uploading them to the cloud server.
The method of claim 5, wherein the uploading the bit error rate and the data packet information to the cloud server comprises:

Encrypting the bit error rate and the data packet information according to a second preset encryption algorithm, and then uploading them to the cloud server.
A fault information acquisition method, applied to a client, comprising:

When a server detects that an optical transport network (OTN) service raises an alarm, obtaining, from a cloud server that has obtained a first alarm code of a first alarm, the first alarm code and the monitoring information and sample data range corresponding to the first alarm code; the monitoring information is the data and performance parameters, at the moment the first alarm occurs, of the service node of the preset service flow model of the first OTN board corresponding to the first alarm, and the sample data range is a preset data range and performance parameter range of the service node data;

Displaying the first alarm code, the monitoring information, and the sample data range together with the preset service flow model through a visual view.
The method of claim 8, wherein the displaying the first alarm code, the monitoring information, and the sample data range together with the preset service flow model through a visual view comprises:

Decrypting the first alarm code, the monitoring information, and the sample data range into plaintext according to a first preset encryption algorithm, and displaying them together with the preset service flow model through the visual view.
The method of claim 8, further comprising:

When the server detects that the bit error rate of the OTN service exceeds a preset value, obtaining, from the cloud server, the second OTN board corresponding to the bit error rate and the data packet information of the service node of the preset service flow model of the second OTN board, where the data packet information includes the number of received data packets and the number of sent data packets;

Displaying the bit error rate and the data packet information through a visual view.
The method of claim 10, wherein the displaying the bit error rate and the data packet information through a visual view comprises:

Decrypting the bit error rate and the data packet information into plaintext according to a second preset encryption algorithm, and displaying the bit error rate and the data packet information through the visual view.
A fault information acquisition device, applied to a server, comprising:

An alarm monitoring module, configured to: when it is detected that an optical transport network (OTN) service raises an alarm, acquire a first alarm code of a first alarm and determine a first OTN board corresponding to the first alarm;

An information obtaining module, configured to acquire monitoring information and a sample data range of a service node, where the service node is a service node of a preset service flow model of the first OTN board; the monitoring information includes the data and performance parameters of the service node at the moment the first alarm occurs, and the sample data range is a preset data range and performance parameter range of the service node data;

A data uploading module, configured to upload the first alarm code, the monitoring information, and the sample data range to a cloud server.
The device of claim 12, wherein the alarm monitoring module comprises:

An alarm code acquisition submodule, configured to acquire an alarm code of an alarm when an alarm monitoring point in the service topology of the OTN service raises the alarm, where the alarm code carries time information, location information, and an alarm name of the alarm;

A first alarm determining submodule, configured to determine, according to the alarm code, the first alarm, whose alarm time is the earliest, and the first alarm code of the first alarm;

A board determining submodule, configured to determine, according to the first alarm code, the first OTN board corresponding to the first alarm.
The apparatus of claim 12, wherein the service node comprises: a service split node, a service encapsulation node, and a hardware node.
The apparatus of claim 12, wherein the performance parameters comprise: a clock frequency, a peripheral chip state, an optical power, an optical module bias voltage, and a bias current.
The apparatus of claim 12, further comprising:

An error monitoring module, configured to acquire a second OTN board corresponding to the bit error rate when the bit error rate of the OTN service exceeds a preset value;

A packet information obtaining module, configured to acquire data packet information of the service node of the preset service flow model of the second OTN board, where the data packet information includes the number of received data packets and the number of sent data packets; and

An error information uploading module, configured to upload the bit error rate and the data packet information to the cloud server.
The apparatus of claim 12, wherein the data uploading module is configured to:

Encrypt the first alarm code, the monitoring information, and the sample data range according to a first preset encryption algorithm, and then upload them to the cloud server.
A fault information acquisition device, applied to a client, comprising:

A data acquisition module, configured to: when a server detects that an optical transport network (OTN) service raises an alarm, obtain, from a cloud server that has obtained a first alarm code of a first alarm, the first alarm code and the monitoring information and sample data range corresponding to the first alarm code; the monitoring information is the data and performance parameters, at the moment the first alarm occurs, of the service node of the preset service flow model of the first OTN board corresponding to the first alarm, and the sample data range is a preset data range and performance parameter range of the service node data;

A view display module, configured to display the first alarm code, the monitoring information, and the sample data range together with the preset service flow model through a visual view.
The apparatus of claim 18, wherein the view display module is configured to:

Decrypt the first alarm code, the monitoring information, and the sample data range into plaintext according to a first preset encryption algorithm, and display them together with the preset service flow model through the visual view.
The apparatus of claim 18, further comprising:

An error display module, configured to: when the server detects that the bit error rate of the OTN service exceeds a preset value, obtain, from the cloud server, the second OTN board corresponding to the bit error rate and the data packet information of the service node of the preset service flow model of the second OTN board, where the data packet information includes the number of received data packets and the number of sent data packets; and

display the bit error rate and the data packet information through a visual view.
A computer readable storage medium storing computer executable instructions for performing the method of any of claims 1-7 and 8-11.