CN115314358A - Method and device for monitoring dummy network element fault of home wide network - Google Patents


Publication number
CN115314358A
Authority
CN
China
Prior art keywords
network
log
offline
online
line
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110499907.7A
Other languages
Chinese (zh)
Other versions
CN115314358B (en)
Inventor
杨慰民
陈志安
陈晞
罗卫鸿
郑银云
陈文�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Fujian Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Fujian Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Fujian Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202110499907.7A priority Critical patent/CN115314358B/en
Publication of CN115314358A publication Critical patent/CN115314358A/en
Application granted granted Critical
Publication of CN115314358B publication Critical patent/CN115314358B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0631Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/069Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a method and a device for monitoring faults of dumb network elements in a home broadband network, which address the poor timeliness and accuracy of existing monitoring of such faults. The scheme includes: acquiring, according to a resource state tree, an online/offline log of at least one network user in communication connection with a dumb network element of the home broadband network; determining the online or offline state of the at least one network user according to the online time points and offline time points recorded in that log; determining the online or offline state of the dumb network element according to the number of network users in the online state and/or the offline state; and determining a dumb network element that remains in the offline state for longer than a preset duration to be a faulty dumb network element. Because the state of a dumb network element is inferred from the states of the network users connected to it, faults of the dumb network element can be monitored, and the timeliness and accuracy of that monitoring are improved.

Description

Method and device for monitoring dummy network element fault of home wide network
Technical Field
The invention relates to the field of communication, in particular to a method and a device for monitoring a dummy network element fault of a home wide network.
Background
In the field of communication technology, a home broadband network is also called a home wide network. Network devices in a home broadband network include optical fibers, optical connectors, optical splitters, optical cross-connect boxes, splice closures, and the like. Some of these devices are passive and cannot readily report fault information in time, so they are also called dumb resources. Dumb resources are an important component of a telecom operator's existing wired access network. They carry systems such as the Passive Optical Network (PON) and the Packet Transport Network (PTN), and they determine the access capability for services such as group customers, Wireless Local Area Network (WLAN), and home customers.
However, a conventional network management system usually cannot monitor the operating state of dumb network elements. Operation and maintenance personnel therefore find it difficult to discover a dumb network element fault promptly and to determine in time how many users it affects. As a result, the fault is hard to handle in time, which may allow it to spread and cause a large-area network failure.
How to improve the timeliness and accuracy of fault monitoring for dumb network elements of the home broadband network is the technical problem to be solved by the present application.
Disclosure of Invention
The embodiments of the present application aim to provide a method and a device for monitoring faults of dumb network elements of a home broadband network, so as to solve the problem of poor timeliness and accuracy in monitoring such faults.
In a first aspect, a method for monitoring a dummy network element fault of a home-wide network is provided, including:
acquiring, according to a resource state tree, an online/offline log of at least one network user in communication connection with a dumb network element of a home broadband network, wherein the resource state tree represents the communication connection relationship between the dumb network element of the home broadband network and the at least one network user, and the online/offline log comprises online time points and offline time points of the network user;
determining the online or offline state of the at least one network user according to the online time points and offline time points in the online/offline log of the at least one network user;
determining the online or offline state of the dumb network element of the home broadband network according to the number of network users in the online state and/or the offline state;
and determining a dumb network element that has been in the offline state for longer than a preset duration to be a faulty dumb network element.
In a second aspect, a device for monitoring a failure of a dummy network element in a home-wide network is provided, including:
an acquisition module, configured to acquire, according to a resource state tree, an online/offline log of at least one network user in communication connection with a dumb network element of a home broadband network, wherein the resource state tree represents the communication connection relationship between the dumb network element of the home broadband network and the at least one network user, and the online/offline log comprises online time points and offline time points of the network user;
a first determining module, configured to determine the online or offline state of the at least one network user according to the online time points and offline time points in the online/offline log of the at least one network user;
a second determining module, configured to determine the online or offline state of the dumb network element of the home broadband network according to the number of network users in the online state and/or the offline state;
and a third determining module, configured to determine a dumb network element that has been in the offline state for longer than a preset duration to be a faulty dumb network element.
In a third aspect, an electronic device is provided, the electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the method according to the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, realizes the steps of the method as in the first aspect.
In the embodiments of the present application, an online/offline log of at least one network user in communication connection with a dumb network element of the home broadband network is acquired according to a resource state tree, where the resource state tree represents the communication connection relationship between the dumb network element and the at least one network user, and the online/offline log comprises online time points and offline time points of the network user. The online or offline state of the at least one network user is determined according to those time points. The online or offline state of the dumb network element is determined according to the number of network users in the online state and/or the offline state. A dumb network element that has been in the offline state for longer than a preset duration is determined to be a faulty dumb network element. Because the state of a dumb network element is inferred from the states of the network users connected to it, faults of the dumb network element can be monitored, and the timeliness and accuracy of that monitoring are improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
Fig. 1 is a schematic flowchart of a method for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Fig. 2a is a second schematic flowchart of a method for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Fig. 2b is a third schematic flowchart of a method for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Fig. 3a is a fourth schematic flowchart of a method for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Fig. 3b is a fifth schematic flowchart of a method for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Fig. 4 is a sixth schematic flowchart of a method for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Fig. 5a is a seventh schematic flowchart of a method for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Fig. 5b is an eighth schematic flowchart of a method for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Fig. 6a is a ninth schematic flowchart of a method for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Fig. 6b is a tenth schematic flowchart of a method for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Fig. 7a is an eleventh schematic flowchart of a dummy network element fault monitoring method of the home-wide network according to an embodiment of the present invention.
Fig. 7b is a schematic repair timing diagram of a method for monitoring a dummy network element fault in a home-wide network according to an embodiment of the present invention.
Fig. 7c is a schematic structural diagram of a dummy network element fault monitoring system of the home wide network according to an embodiment of the present invention.
Fig. 8 is a schematic structural diagram of a device for monitoring a failure of a dummy network element in a home-wide network according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. The reference numbers in the present application are only used for distinguishing the steps in the scheme and are not used for limiting the execution sequence of the steps, and the specific execution sequence is described in the specification.
In the field of communication technology, dumb network elements of the home broadband access network cannot actively report fault information, so faults of primary and secondary optical splitters are difficult to identify in time. Furthermore, the fixed broadband network has a complicated architecture. For example, an optical access network in a metropolitan area network may include several levels of devices such as a BRAS (Broadband Remote Access Server), a switch, an OLT (Optical Line Terminal), a PON, an OBD1 and an OBD2 (OBD: Optical Branching Device), and an ONU (Optical Network Unit). In the prior art, the operating state of a dumb network element is difficult to monitor accurately in real time. Only when a user complains, or when a large-area fault triggers a PON port alarm, can network operation and maintenance personnel identify the specific faulty device through manual investigation. Dumb network element faults are therefore hard to detect and locate in time, which hinders the repair of network faults.
In order to solve the problems existing in the prior art, an embodiment of the present application provides a method for monitoring a fault of a dumb network element of a home-wide network, as shown in fig. 1, including:
S11: Acquiring, according to a resource state tree, an online/offline log of at least one network user in communication connection with a dumb network element of the home broadband network, wherein the resource state tree represents the communication connection relationship between the dumb network element of the home broadband network and the at least one network user, and the online/offline log comprises online time points and offline time points of the network user.
The resource state tree may be constructed in advance according to the communication connection relationship between the dumb network elements and the network users. It may include nodes representing network elements and network users, and inter-node connection lines representing the communication connection relationships. The network elements represented by nodes in the resource state tree may be dumb or non-dumb network elements. Specifically, a resource file containing the resources of all network users and an online file containing information on all users who were online during the most recent preset time period may be obtained in advance. The resource file includes the network users and the topology information of the PON network to which each user belongs. The resource file can be used to determine the communication connection relationship between each network element and the network users, and the online file can be used to determine the initial online or offline state of each network user. The resource state tree is then constructed from the resource file and the online file.
In this step, at least one network user in communication connection with a dumb network element of the home broadband network can be determined according to the resource state tree, and the online/offline log of that at least one network user is then acquired. The log may specifically include user authentication logs received from an authentication log server. The online/offline log includes the online time points and offline time points of the network user.
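As an illustration only, the construction of the resource state tree might be sketched as follows. The source does not specify the layout of the resource file or the online file, so the row format (a path of element names ending in a user account), the `Node` class, and the function name `build_resource_state_tree` are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass
class Node:
    """One node of the resource state tree (a network element or a user)."""
    name: str
    is_user: bool = False
    online: Optional[bool] = None  # only meaningful for user (leaf) nodes
    children: Dict[str, "Node"] = field(default_factory=dict)


def build_resource_state_tree(
    resource_rows: Iterable[Tuple[str, ...]],
    online_users: Set[str],
) -> Node:
    """Build the tree from hypothetical resource rows and an online-user set.

    Each row is a path from the top-level element down to the user account,
    e.g. ("OLT-1", "PON-1/1", "OBD1-A", "OBD2-A1", "user001").
    """
    root = Node("root")
    for row in resource_rows:
        *elements, user = row
        node = root
        for name in elements:
            # Reuse the element node if an earlier row already created it.
            node = node.children.setdefault(name, Node(name))
        # The online file supplies the initial state of each user leaf.
        node.children[user] = Node(user, is_user=True, online=user in online_users)
    return root


rows = [
    ("OLT-1", "PON-1/1", "OBD1-A", "OBD2-A1", "user001"),
    ("OLT-1", "PON-1/1", "OBD1-A", "OBD2-A1", "user002"),
    ("OLT-1", "PON-1/1", "OBD1-A", "OBD2-A2", "user003"),
]
tree = build_resource_state_tree(rows, online_users={"user001", "user003"})
```

Here the element names and user accounts are invented sample data. Users are leaves and every network element is a non-leaf node, which matches the leaf/non-leaf distinction used later in the description.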
S12: Determining the online or offline state of the at least one network user according to the online time points and offline time points in the online/offline log of the at least one network user.
In this step, the online or offline state of the network user is determined from the online time points and offline time points in the online/offline log. Specifically, the time point closest to the current time may be identified among the time points in the log, and the state of the network user may be determined from the log record corresponding to that time point. Alternatively, the online and offline time points may be sorted in chronological order to find the latest time point, and the state of the network user is then determined from the log record corresponding to that latest time point.
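The rule described above, in which the newest record decides the user's state, can be sketched as follows. The record layout `(user, event, timestamp)` and the event names `"online"` and `"offline"` are assumptions for illustration; the source does not define the log format.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (user account, "online" or "offline", timestamp)
LogRecord = Tuple[str, str, datetime]


def latest_user_states(records: Iterable[LogRecord]) -> Dict[str, bool]:
    """Return {user: True if online, False if offline} from the newest record."""
    newest: Dict[str, Tuple[datetime, bool]] = {}
    for user, event, ts in records:
        # Keep only the record with the latest time point for each user.
        if user not in newest or ts >= newest[user][0]:
            newest[user] = (ts, event == "online")
    return {user: state for user, (_, state) in newest.items()}
```

A single pass that tracks the maximum timestamp per user gives the same result as sorting all time points first, so the records do not need to arrive in order.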
S13: Determining the online or offline state of the dumb network element of the home broadband network according to the number of network users in the online state and/or the offline state.
In this step, the online or offline state of the dumb network element may be determined solely from the number of network users in the online state. For example, when the number of network users in the online state is greater than 0, the dumb network element is determined to be in the online state; otherwise it is in the offline state. Alternatively, the state may be determined solely from the number of network users in the offline state. For example, when the number of offline network users is greater than a preset number, the dumb network element is determined to be in the offline state; otherwise it is in the online state. The preset number may be determined according to the number of network users that the resource state tree shows to be in communication connection with the dumb network element. The state may also be determined by jointly considering the numbers of network users in the online state and in the offline state.
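The two count-based rules in this step might look like the sketch below. The function name `element_is_online` and the `offline_threshold` parameter are hypothetical; how the preset number is chosen is left to the caller, as in the text.

```python
from typing import Iterable, Optional


def element_is_online(
    user_states: Iterable[bool],
    offline_threshold: Optional[int] = None,
) -> bool:
    """Decide a dumb network element's state from its users' states.

    user_states: True for an online user, False for an offline one.
    offline_threshold: if None, apply the first rule (online when at least
    one user is online); otherwise apply the second rule (offline when the
    number of offline users exceeds the threshold).
    """
    states = list(user_states)
    online_count = sum(states)
    offline_count = len(states) - online_count
    if offline_threshold is None:
        return online_count > 0
    return offline_count <= offline_threshold
```

For three users of which one is online, the first rule reports the element as online, while the second rule with `offline_threshold=1` reports it as offline because two users are offline.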
S14: Determining a dumb network element that has been in the offline state for longer than a preset duration to be a faulty dumb network element.
If step S13 determines that the dumb network element is in the offline state, a network element offline alarm may be generated. In practice, many households stay online continuously for convenience. A dumb network element is usually in communication connection with a plurality of network users, and it is rare for all of them to be offline at the same time. Therefore, if the dumb network element remains in the offline state for longer than the preset duration, it can be determined that the connected network users are all offline because the dumb network element has failed.
In addition, the preset duration can be preset manually or generated automatically based on time periods, geographic positions, the number of network users and other factors. For example, when the network is idle in the night period, there are often fewer network users in the on-line state, and the preset time period may be, for example, one hour. When the network is busy during the daytime period, there are more network users who are on-line, and the preset time period may be 10 minutes, for example. For another example, the number of network users in the online state in the daytime of the working day in the residential area is often small, and the preset time may be long. And the number of network users who are on-line in the working area including the office building during the daytime period of the working day is often large, and the preset time can be short. In practical application, the length of the preset duration can be adjusted according to actual requirements.
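A minimal sketch of step S14 with a time-dependent preset duration is given below. The one-hour and ten-minute values come from the example above, but the 23:00 to 07:00 boundary for the night period, the `FaultDetector` class, and its method names are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def preset_duration(now: datetime) -> timedelta:
    """Hypothetical schedule: 1 hour at night, 10 minutes during the day."""
    is_night = now.hour < 7 or now.hour >= 23  # assumed night boundary
    return timedelta(hours=1) if is_night else timedelta(minutes=10)


class FaultDetector:
    """Tracks when each element went offline and flags long outages."""

    def __init__(self) -> None:
        self._offline_since: Dict[str, datetime] = {}

    def update(self, element: str, online: bool, now: datetime) -> None:
        if online:
            # Coming back online clears any running offline timer.
            self._offline_since.pop(element, None)
        else:
            # Remember only the first moment the element was seen offline.
            self._offline_since.setdefault(element, now)

    def faulty(self, now: datetime) -> List[str]:
        """Elements whose offline time exceeds the preset duration."""
        limit = preset_duration(now)
        return sorted(
            element
            for element, since in self._offline_since.items()
            if now - since > limit
        )
```

Calling `update` after each analysis cycle and `faulty` afterwards keeps the alarm logic separate from the state judgment, so the schedule in `preset_duration` can be swapped for one based on location or user count.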
According to the scheme provided by the embodiment of the present application, an online/offline log of at least one network user in communication connection with a dumb network element of the home broadband network is acquired according to the resource state tree. The online or offline state of the at least one network user is determined according to the online time points and offline time points in that log. The online or offline state of the dumb network element is determined according to the number of network users in the online state and/or the offline state. A dumb network element that has been in the offline state for longer than the preset duration is determined to be a faulty dumb network element. Because the state of a dumb network element is inferred from the states of the network users connected to it, faults of the dumb network element can be monitored, and the timeliness and accuracy of that monitoring are improved.
Based on the solution provided by the foregoing embodiment, optionally, as shown in fig. 2a, the foregoing step S12 includes:
S21: Sorting the online time points and offline time points in the online/offline log of the at least one network user in chronological order to obtain a sorted online/offline log.
In practical applications, the solution provided by this embodiment can be implemented by a plurality of functional modules. For example, the scheduling module may instruct the acquisition module to acquire the online/offline logs by issuing an acquisition task control instruction. The scheduling module also starts a log analysis task, by instruction or otherwise, and the online/offline logs may be analyzed by multiple tasks in parallel.
Specifically, as shown in Fig. 2b, the scheduling module, the acquisition module, the log server, the log rearranging and merging module, and the log analysis module may cooperate to execute the scheme provided in the embodiment of the present application. First, the scheduling module instructs the acquisition module to start an acquisition task and instructs the log analysis module to start a log analysis task. After receiving the acquisition task control instruction, the acquisition module starts a log download thread, which sends a log download request to the log server over a data transmission protocol such as FTP or SFTP. The log server finds the log file matching the request parameters and returns it to the acquisition module. The acquisition module writes the received log data to local storage in real time and, each time a log file has been completely stored, sends the log rearranging and merging module a notification that the download is complete, instructing it to rearrange and merge the log data in time order. The notification may use a subscribe-push model, which makes the architecture easier to extend and simplifies the development of later services. The log rearranging and merging module ensures the time order of the log files through a log rearrangement mechanism, merges the log files belonging to the same preset time period into the same directory in time order according to configured merging rules, and finally generates a task from the sorted online/offline logs and inserts it into the analysis queue.
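The rearranging and merging step can be illustrated in memory as follows. This is only a sketch: the source merges files into per-period directories, whereas this example groups already-loaded records into one-minute buckets. The record layout and the function name `rearrange_and_merge` are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (user account, "online" or "offline", timestamp)
LogRecord = Tuple[str, str, datetime]


def rearrange_and_merge(
    files: Iterable[List[LogRecord]],
) -> List[Tuple[datetime, List[LogRecord]]]:
    """Merge records from several log files into time-ordered 1-minute tasks."""
    buckets: Dict[datetime, List[LogRecord]] = defaultdict(list)
    for records in files:
        for rec in records:
            # Floor the timestamp to the start of its one-minute period.
            period_start = rec[2].replace(second=0, microsecond=0)
            buckets[period_start].append(rec)
    queue = []
    for period_start in sorted(buckets):
        # Records inside each period are sorted by time point as well.
        ordered = sorted(buckets[period_start], key=lambda r: r[2])
        queue.append((period_start, ordered))
    return queue
```

Each element of the returned list corresponds to one task for the analysis queue: the period it covers and the sorted records that fall inside it.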
S22: Analyzing the online/offline logs within a first preset time period to obtain an analysis result of the online/offline logs within the first preset time period.
In this step, a task queue may be used to analyze the online/offline logs. A task in the task queue may include the path of the file directory in which the logs are located, and that directory may contain a plurality of user authentication log files. The main log analysis thread assigns each log file to a thread in a thread pool, which performs the analysis. A log file may contain multiple rows of records, where each row records information about a network user going online or offline.
S23: After the analysis of the online/offline logs within the first preset time period is finished, determining the online or offline state of the at least one network user according to the analysis result of the online/offline logs within the first preset time period.
After all online/offline logs within the first preset time period have been analyzed, the online or offline state of the at least one network user is determined from the complete set of analysis results. This avoids misjudging a user's state because some logs have not yet been analyzed or have been only partially analyzed, and so improves the accuracy of the determined states.
Based on the solution provided by the above embodiment, after the step S21, as shown in fig. 3a, the method further includes:
S31: Inserting a to-be-analyzed task carrying the sorted online/offline log into a to-be-analyzed task queue.
The to-be-analyzed task may contain the sorted online/offline log itself, or it may contain the directory path where the sorted online/offline log is located, so that the log can be obtained from that path.
Wherein, step S22 includes:
S32: Acquiring a first to-be-analyzed task from the to-be-analyzed task queue, wherein the first to-be-analyzed task carries the sorted online/offline logs within a first preset time period;
S33: Allocating the first to-be-analyzed task to threads in a thread pool for analysis, so as to obtain an analysis result of the online/offline logs within the first preset time period, wherein the thread pool comprises a plurality of analysis threads that execute in parallel.
Assuming that the first preset time period starts at 12:00 and lasts 1 minute, the specific implementation steps of the scheme provided by this embodiment may be as shown in Fig. 3b. First, the directory containing the online/offline log files for that 1-minute period is obtained; the directory may include one or more online/offline log files. All online/offline log files under the directory are then traversed. If any log file has not yet been analyzed, a log analysis thread is created for it, and one or more analysis threads run in parallel until every online/offline log file in the 1-minute period has been analyzed. The online or offline state of the at least one network user in communication connection with the dumb network element can then be determined, and from it the online or offline state of the dumb network element.
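The following sketch shows one way to analyze all log files of a period in parallel and to decide user states only after every file is done. It uses Python's `concurrent.futures.ThreadPoolExecutor` as a stand-in for the thread pool in the text. The comma-separated `user,event,timestamp` line format and both function names are assumptions; log contents are passed as strings so that the example is self-contained.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def parse_log_text(text: str) -> List[Tuple[str, str, str]]:
    """Parse lines of a hypothetical 'user,event,timestamp' log format."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            user, event, ts = line.split(",")
            rows.append((user, event, ts))
    return rows


def parse_period(log_texts: List[str], max_workers: int = 4) -> Dict[str, str]:
    """Parse all files of one period in parallel, then derive user states."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list(pool.map(...)) only completes once every file is parsed, so
        # the state decision below always sees the complete period.
        parsed = list(pool.map(parse_log_text, log_texts))
    latest: Dict[str, Tuple[str, str]] = {}
    for rows in parsed:
        for user, event, ts in rows:
            # ISO-8601 timestamps in one format compare correctly as strings.
            if user not in latest or ts >= latest[user][0]:
                latest[user] = (ts, event)
    return {user: event for user, (_, event) in latest.items()}
```

Deferring the state decision until the whole period has been parsed mirrors the point made for step S23: a user whose final offline record sits in a not-yet-parsed file would otherwise be reported as online.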
Based on the solution provided in the foregoing embodiment, optionally, the resource state tree includes non-leaf nodes representing dumb network elements of the home broadband network and leaf nodes representing the online or offline state of the at least one network user, and the connection relationship between the leaf nodes and the non-leaf nodes in the resource state tree corresponds to the communication connection relationship between the dumb network elements of the home broadband network and the at least one network user;
as shown in fig. 4, the step S13 includes:
S41: Updating the leaf nodes of the resource state tree according to the online or offline state of the at least one network user, so that the leaf nodes carry the online or offline state information of the network users.
S42: Determining the online or offline state of the dumb network element of the home broadband network according to the state information carried by the leaf nodes in the resource state tree and the connection relationship between the leaf nodes and the non-leaf nodes in the resource state tree.
In the scheme provided by the embodiment of the present application, the resource state tree is updated according to the online or offline state of each network user, so that each leaf node reflects its user's state in real time. In subsequent steps, the state of a dumb network element can therefore be determined directly from the states represented by the leaf nodes, which effectively improves the efficiency of determining the state of the dumb network element.
Based on the solution provided by the foregoing embodiment, optionally, as shown in fig. 5a, step S42 includes:
S51: when the leaf nodes connected to a target non-leaf node include at least one leaf node representing a network user in the online state, determining that the dumb network element of the home broadband network represented by the target non-leaf node is in the online state.
S52: when the network users represented by the leaf nodes connected to the target non-leaf node are all in the offline state, determining that the dumb network element of the home broadband network represented by the target non-leaf node is in the offline state.
Because each leaf node in the resource state tree represents the online/offline state of a network user in real time, the solution provided by this embodiment of the application can use the connection relationships among the nodes in the resource state tree to determine the online/offline state of the network element represented by a non-leaf node from the online/offline states represented by the leaf nodes connected to it, which improves the efficiency of determining the online/offline state of the dumb network element.
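The rule in steps S41, S51 and S52 can be illustrated with a minimal Python sketch. The class names `UserLeaf` and `ElementNode` and their methods are hypothetical names chosen for illustration; the application does not specify a concrete data structure, and only a single level of non-leaf nodes is modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class UserLeaf:
    """Leaf node: one network user and its current online/offline state."""
    user_id: str
    online: bool = False


@dataclass
class ElementNode:
    """Non-leaf node: one dumb network element and the users attached to it."""
    element_id: str
    users: dict = field(default_factory=dict)  # user_id -> UserLeaf

    def attach(self, user_id: str, online: bool = False) -> None:
        # Mirror the communication connection between element and user.
        self.users[user_id] = UserLeaf(user_id, online)

    def update_user(self, user_id: str, online: bool) -> None:
        # S41: the leaf carries the latest online/offline state of the user.
        self.users[user_id].online = online

    def is_online(self) -> bool:
        # S51: online if at least one attached user is online.
        # S52: offline if every attached user is offline.
        return any(leaf.online for leaf in self.users.values())
```

In this sketch an element with no attached users is reported as offline, because `any` over an empty collection is false; whether that case should be treated differently is left open by the text.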
The solution provided by this embodiment of the application may be implemented by the state management module. While the log analysis module parses the online/offline logs of network users, the resource state tree can be updated in real time. In addition, once all files within the same preset time period have been parsed, the state determination logic for all network elements needs to be triggered. The solution therefore triggers the state determination logic with a thread barrier rather than with continuous polling, which improves CPU utilization.
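As an illustration of triggering state determination with a thread barrier instead of polling, the following Python sketch uses the standard `threading.Barrier`. The function name `parse_round` and the callbacks `parse_file` and `evaluate_states` are hypothetical; the sketch assumes one worker thread per non-empty batch of files and is not the implementation described in the application.

```python
import threading


def parse_round(batches, parse_file, evaluate_states):
    """Parse one time window's log files in parallel, then evaluate once.

    batches: non-empty list of lists of file names, one list per worker.
    parse_file, evaluate_states: hypothetical callbacks supplied by the caller.
    """
    # The barrier action runs exactly once, in one of the worker threads,
    # after every worker has reached wait(); no thread has to poll.
    barrier = threading.Barrier(len(batches), action=evaluate_states)

    def worker(files):
        for name in files:
            parse_file(name)
        barrier.wait()

    threads = [threading.Thread(target=worker, args=(b,)) for b in batches]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Because the workers block in `wait()` until the last one arrives, no CPU time is spent checking whether parsing has finished.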
Optionally, referring to fig. 5b, the state manager may determine the online/offline state of a network element directly from the number of online network users communicatively connected under the non-leaf node corresponding to that network element. The online/offline state of the network element is updated each time log parsing completes, with the preset time interval as the cycle. A network element offline alarm may be generated when the network element is in the offline state.
Based on the solution provided by the foregoing embodiment, as shown in fig. 6a, before the foregoing step S22, optionally, the method further includes:
S61: verifying whether the online/offline log within the first preset time period is complete, to obtain a first verification result;
S62: when the first verification result indicates that the online/offline log within the first preset time period is incomplete, acquiring and parsing the online/offline log within a second preset time period to obtain a parsing result of the online/offline log within the second preset time period, wherein the second preset time period is a time period after the first preset time period.
In the solution provided by this embodiment of the application, the accuracy of the online/offline state of each node in the resource state tree is key to determining whether a dumb network element is faulty. The resource state tree is constructed from the resource file and the online file, and its state changes as the log files are parsed in real time. The accuracy of the resource state tree therefore depends on the accuracy and stability of the resource file, the online file and the log files.
Optionally, the three types of files may be generated by the RADIUS platform, which ensures their data accuracy. In the live network, the resource files and online files have a backup mechanism, so these two types of files are highly stable. The log files, however, are generated by the RADIUS platform as multiple files per minute; their transmission is occasionally delayed, or files are even lost, which may leave the in-memory resource state tree in an abnormal state.
In order to further improve the accuracy of determining faults of dumb network elements, referring to fig. 6b, the solution provided in this embodiment of the application verifies the integrity of the online/offline logs. Assuming that the preset time period is 1 minute, the method specifically includes the following steps:
First, the log analysis engine records the number of files parsed while parsing the log files. If the number of parsed files does not reach the specified target number (specified as a parameter when the analysis is started), the engine waits for a specified time T (60 seconds by default).
If the files are still not ready after the waiting time T, the file loss time T_1 is recorded, the data parsed so far is saved to a snapshot SNAPSHOT_1, parsing of the current minute's log files is abandoned, and parsing continues with the next minute's log files.
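The integrity check above can be sketched in Python as follows. The names `LostWindow` and `check_window`, the polling interval, and the injectable `clock` and `sleep` parameters are assumptions made for illustration and testability; only the target file count, the waiting time T (60 seconds by default), T_1 and SNAPSHOT_1 come from the text.

```python
import time
from dataclasses import dataclass


@dataclass
class LostWindow:
    lost_at: float   # T_1: time at which the window was given up
    snapshot: dict   # SNAPSHOT_1: data parsed so far for this window


def check_window(count_ready, target, parsed_so_far, timeout=60.0,
                 poll=1.0, clock=time.time, sleep=time.sleep):
    """Wait up to `timeout` seconds for `target` files to be available.

    count_ready: hypothetical callable returning the number of files seen.
    Returns None if the window is complete, otherwise a LostWindow record,
    in which case the caller moves on to the next window.
    """
    deadline = clock() + timeout
    while count_ready() < target:
        if clock() >= deadline:
            return LostWindow(lost_at=clock(), snapshot=dict(parsed_so_far))
        sleep(poll)
    return None
```

A caller that receives a `LostWindow` would keep it so that the lost files can be re-checked before later windows are parsed.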
Optionally, based on the solution provided in the foregoing embodiment, when the online/offline log within the first preset time period is incomplete, as shown in fig. 7a, the method further includes:
S71: determining a first verification time at which the first verification result was obtained;
S72: taking the first verification time as the starting time, verifying again, after a preset repair duration, whether the online/offline log within the first preset time period is complete, to obtain a second verification result;
S73: when the second verification result indicates that the online/offline log within the first preset time period is incomplete, acquiring an online user file within the repair time period;
S74: updating the resource state tree according to the online user file within the repair time period.
Referring to fig. 7b, before the next minute's log files are parsed, it is checked whether the lost files are ready. If they are ready, the log files for time T_1 are parsed directly. If they are not ready, it is further determined whether the current time differs from the file loss time T_1 by more than 15 minutes; if the difference exceeds 15 minutes, the state repair operation of step 4 below is executed.
Then, the online file with 15-minute granularity that is closest to the current time (generated at time T_2) is obtained, the resource state tree is reset based on that online file, all log files with 1-minute granularity from time T_2 to the current time are downloaded again from the authentication log server, and finally the main log analysis thread is restarted. In this way the lost log files can be skipped while the correctness of the resource state tree is ensured.
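The repair decision and the choice of files to re-download can be sketched as follows. This is a minimal sketch under stated assumptions: times are integer epoch seconds, online files are assumed to be produced on 15-minute boundaries, and the function names and return values are hypothetical.

```python
REPAIR_AFTER = 15 * 60      # the 15-minute threshold from the text, in seconds
ONLINE_FILE_STEP = 15 * 60  # granularity of online files, in seconds
LOG_FILE_STEP = 60          # granularity of log files, in seconds


def next_action(now, lost_at, lost_ready):
    """Decide what to do before parsing the next minute's log files."""
    if lost_at is None:
        return "continue"       # nothing was lost
    if lost_ready:
        return "parse_lost"     # lost files arrived: parse the T_1 window
    if now - lost_at > REPAIR_AFTER:
        return "repair"         # more than 15 minutes late: repair state
    return "continue"           # keep waiting, parse the next window


def repair_plan(now):
    """Return (T_2, start times of the 1-minute log windows to re-download).

    T_2 is taken as the most recent 15-minute boundary at or before `now`
    (an assumption about when online files are produced).
    """
    t2 = now - now % ONLINE_FILE_STEP
    current_minute = now - now % LOG_FILE_STEP
    return t2, list(range(t2, current_minute, LOG_FILE_STEP))
```

After `repair_plan`, the caller would reset the resource state tree from the online file at T_2, replay the listed log windows, and restart the main log analysis thread.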
Optionally, the solution provided by this embodiment of the application may be implemented through the cooperation of the functional modules of the system shown in fig. 7c. By function, the system can be divided into the following modules: scheduling (scheduler), log collection (collector), log compression (merging and storing logs by minute), state management (decider), log analysis (log analyzer) and data persistence. In the figure, solid arrows indicate the direction of data flow and dashed arrows indicate the direction of control commands.
Compared with the prior art, the solution has the following advantages:
1. Strong real-time performance: using the Actor model's principle of strong isolation and weak consistency, an efficient message management module was developed. This avoids the limit on computing efficiency caused by shared data in the traditional multi-threaded concurrent programming model and enables rapid processing of large-scale RADIUS authentication logs, which meets telecom operators' requirements for real-time monitoring and for rapid fault location and handling in home broadband access networks.
2. High accuracy: RADIUS broadband users' online/offline behavior logs are collected in real time, and the log stream is processed and analyzed from the perspective of user-perceived experience. This allows 6.25 million RADIUS broadband users, 250,000 primary optical splitters, 1.27 million secondary optical splitters and 70,000 ONU devices across the whole network to be associated rapidly, and group faults of home broadband access-side equipment across the whole network to be output in real time. The efficiency and accuracy of fault discovery for home broadband dumb resources are improved, and the dumb network elements on the home broadband access side become manageable and controllable.
3. High availability: the analysis device can automatically restore itself to a correct state after a network element state anomaly, which effectively ensures the continuity of home broadband network operation and maintenance services.
In order to solve the problems existing in the prior art, an embodiment of the present application provides a device 80 for monitoring faults of dumb network elements of a home broadband network, as shown in fig. 8, including:
an acquiring module 81, configured to acquire, according to a resource state tree, an online/offline log of at least one network user in communication connection with a dumb network element of the home broadband network, where the resource state tree represents the communication connection relationship between the dumb network element of the home broadband network and the at least one network user, and the online/offline log includes an online time point and an offline time point of the network user;
a first determining module 82, configured to determine the online/offline state of the at least one network user according to the online time point and the offline time point of the network user in the online/offline log of the at least one network user;
a second determining module 83, configured to determine the online/offline state of the dumb network element of the home broadband network according to the number of network users in the online state and/or the offline state;
and a third determining module 84, configured to determine a dumb network element that has been in the offline state for longer than a preset duration as a faulty dumb network element.
With the device provided by this embodiment of the application, the online/offline log of at least one network user in communication connection with a dumb network element of the home broadband network is acquired according to the resource state tree; the online/offline state of the at least one network user is determined according to the online time point and the offline time point of the network user in that log; the online/offline state of the dumb network element of the home broadband network is determined according to the number of network users in the online state and/or the offline state; and a dumb network element that has been in the offline state for longer than the preset duration is determined as a faulty dumb network element. Because the online/offline state of a dumb network element is derived from the online/offline states of the network users connected to it, faults of the dumb network element can be monitored, and both the real-time performance and the accuracy of fault monitoring are improved.
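The last step, treating an element as faulty once it has stayed offline longer than a preset duration, can be sketched as follows. The class name `OfflineTracker`, its `observe` method and the use of plain numeric timestamps are assumptions for illustration; the application does not prescribe how the offline duration is tracked.

```python
class OfflineTracker:
    """Tracks how long each dumb network element has been offline."""

    def __init__(self, fault_after):
        self.fault_after = fault_after   # preset duration, in seconds
        self.offline_since = {}          # element_id -> first offline time

    def observe(self, element_id, online, now):
        """Record one state evaluation; return True if the element is faulty."""
        if online:
            # Back online: forget any earlier offline period.
            self.offline_since.pop(element_id, None)
            return False
        # Remember when the current offline period started.
        since = self.offline_since.setdefault(element_id, now)
        return now - since > self.fault_after
```

Calling `observe` once per evaluation cycle for each non-leaf node yields the faulty elements for that cycle.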
Preferably, an embodiment of the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements each process of the above method for monitoring faults of dumb network elements of a home broadband network and achieves the same technical effect; to avoid repetition, details are not described here again.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements each process of the above method for monitoring faults of dumb network elements of a home broadband network and achieves the same technical effect; to avoid repetition, details are not described here again. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the particular illustrative embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but is intended to cover various modifications, equivalent arrangements, and equivalents thereof, which may be made by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method for monitoring faults of dumb network elements of a home broadband network, characterized by comprising the following steps:
acquiring, according to a resource state tree, an online/offline log of at least one network user in communication connection with a dumb network element of the home broadband network, wherein the resource state tree represents the communication connection relationship between the dumb network element of the home broadband network and the at least one network user, and the online/offline log comprises an online time point and an offline time point of the network user;
determining the online/offline state of the at least one network user according to the online time point and the offline time point of the network user in the online/offline log of the at least one network user;
determining the online/offline state of the dumb network element of the home broadband network according to the number of network users in the online state and/or the offline state;
and determining a dumb network element that has been in the offline state for longer than a preset duration as a faulty dumb network element.
2. The method of claim 1, wherein determining the online/offline state of the at least one network user according to the online time point and the offline time point of the network user in the online/offline log of the at least one network user comprises:
sorting the online time points and offline time points of the at least one network user in the online/offline log in chronological order to obtain a sorted online/offline log;
parsing the online/offline log within a first preset time period to obtain a parsing result of the online/offline log within the first preset time period;
and after the parsing of the online/offline log within the first preset time period is finished, determining the online/offline state of the at least one network user according to the parsing result of the online/offline log within the first preset time period.
3. The method of claim 2, wherein after sorting the online time points and offline time points of the at least one network user in the online/offline log in chronological order to obtain the sorted online/offline log, the method further comprises:
inserting a to-be-parsed task carrying the sorted online/offline log into a to-be-parsed task queue;
wherein parsing the online/offline log within the first preset time period to obtain the parsing result of the online/offline log within the first preset time period comprises:
acquiring a first to-be-parsed task from the to-be-parsed task queue, wherein the first to-be-parsed task carries the sorted online/offline log within the first preset time period;
and allocating the first to-be-parsed task to a thread in a thread pool for parsing, so as to obtain the parsing result of the online/offline log within the first preset time period, wherein the thread pool comprises a plurality of threads for parsing, and the plurality of threads in the thread pool perform parsing in parallel.
4. The method of claim 3, wherein the resource state tree comprises non-leaf nodes representing dumb network elements of the home broadband network and leaf nodes representing the online/offline state of the at least one network user, and wherein the connection relationship between the leaf nodes and the non-leaf nodes in the resource state tree corresponds to the communication connection relationship between the dumb network elements of the home broadband network and the at least one network user;
wherein determining the online/offline state of the dumb network element of the home broadband network according to the number of network users in the online state and/or the offline state comprises:
updating the leaf nodes of the resource state tree according to the online/offline state of the at least one network user, so that the leaf nodes carry the online/offline state information of the network user;
and determining the online/offline state of the dumb network element of the home broadband network according to the online/offline state information of the network users carried by the leaf nodes in the resource state tree and the connection relationship between the leaf nodes and the non-leaf nodes in the resource state tree.
5. The method of claim 4, wherein determining the online/offline state of the dumb network element of the home broadband network according to the online/offline state information of the network users carried by the leaf nodes in the resource state tree and the connection relationship between the leaf nodes and the non-leaf nodes in the resource state tree comprises:
when the leaf nodes connected to a target non-leaf node comprise at least one leaf node representing a network user in the online state, determining that the dumb network element of the home broadband network represented by the target non-leaf node is in the online state;
and when the network users represented by the leaf nodes connected to the target non-leaf node are all in the offline state, determining that the dumb network element of the home broadband network represented by the target non-leaf node is in the offline state.
6. The method according to any one of claims 2-5, wherein before parsing the online/offline log within the first preset time period to obtain the parsing result of the online/offline log within the first preset time period, the method further comprises:
verifying whether the online/offline log within the first preset time period is complete, to obtain a first verification result;
when the first verification result indicates that the online/offline log within the first preset time period is incomplete, acquiring and parsing the online/offline log within a second preset time period to obtain a parsing result of the online/offline log within the second preset time period, wherein the second preset time period is a time period after the first preset time period.
7. The method of claim 6, wherein when the online/offline log within the first preset time period is incomplete, the method further comprises:
determining a first verification time at which the first verification result was obtained;
taking the first verification time as the starting time, verifying again, after a preset repair duration, whether the online/offline log within the first preset time period is complete, to obtain a second verification result;
when the second verification result indicates that the online/offline log within the first preset time period is incomplete, acquiring an online user file within the repair time period;
and updating the resource state tree according to the online user file within the repair time period.
8. A device for monitoring faults of dumb network elements of a home broadband network, comprising:
an acquisition module, configured to acquire, according to a resource state tree, an online/offline log of at least one network user in communication connection with a dumb network element of the home broadband network, wherein the resource state tree represents the communication connection relationship between the dumb network element of the home broadband network and the at least one network user, and the online/offline log comprises an online time point and an offline time point of the network user;
a first determining module, configured to determine the online/offline state of the at least one network user according to the online time point and the offline time point of the network user in the online/offline log of the at least one network user;
a second determining module, configured to determine the online/offline state of the dumb network element of the home broadband network according to the number of network users in the online state and/or the offline state;
and a third determining module, configured to determine a dumb network element that has been in the offline state for longer than a preset duration as a faulty dumb network element.
9. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, carries out the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202110499907.7A 2021-05-08 2021-05-08 Method and device for monitoring faults of dummy network elements of home wide network Active CN115314358B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110499907.7A CN115314358B (en) 2021-05-08 2021-05-08 Method and device for monitoring faults of dummy network elements of home wide network

Publications (2)

Publication Number Publication Date
CN115314358A true CN115314358A (en) 2022-11-08
CN115314358B CN115314358B (en) 2024-04-09

Family

ID=83853760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110499907.7A Active CN115314358B (en) 2021-05-08 2021-05-08 Method and device for monitoring faults of dummy network elements of home wide network

Country Status (1)

Country Link
CN (1) CN115314358B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115776454A (en) * 2022-11-16 2023-03-10 浪潮通信信息系统有限公司 Method for delimiting unavailable network element facing home-wide internet

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7197558B1 (en) * 2001-12-17 2007-03-27 Bellsouth Intellectual Property Corp. Methods and systems for network element fault information processing
CN105207835A (en) * 2014-06-30 2015-12-30 中国移动通信集团浙江有限公司 Determination method of network element working state of wireless local area network and apparatus thereof
CN106330297A (en) * 2015-06-18 2017-01-11 中兴通讯股份有限公司 Method and device for detecting optical fiber fault point
US20180351978A1 (en) * 2017-06-05 2018-12-06 Microsoft Technology Licensing, Llc Correlating user information to a tracked event
CN111865628A (en) * 2019-04-25 2020-10-30 中国移动通信集团河北有限公司 Statistical system, method, server and storage medium for influencing user by home wide fault
CN112020087A (en) * 2019-05-30 2020-12-01 中国移动通信集团浙江有限公司 Tunnel fault monitoring method and device and computing equipment
CN112291075A (en) * 2019-07-23 2021-01-29 中国移动通信集团浙江有限公司 Network fault positioning method and device, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
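Li; Wu Ping; Huang Junhui: "Research on analysis and monitoring of abnormal transmission connection status at remote sites", Marine Information (海洋信息), no. 01, 15 February 2016 (2016-02-15) *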

Similar Documents

Publication Publication Date Title
CN110928774B (en) Automatic test system based on node type
CN111831569A (en) Test method and device based on fault injection, computer equipment and storage medium
CN102075384A (en) Performance test system and method
US12074767B2 (en) Data processing method and device for cooperating with an artificial intelligence management plane, electronic device, and storage medium
CN106685676B (en) Node switching method and device
CN101631053B (en) EAPS ring-network topology monitoring method and system
CN108900374B (en) Data processing method and device applied to DPI equipment
CN111866624B (en) ONU (optical network Unit) service migration method, device and equipment and readable storage medium
CN112350854B (en) Flow fault positioning method, device, equipment and storage medium
CN101808351A (en) Method and system for business impact analysis
CN106911510B (en) Usability monitoring system and method for network access system
CN103064353B (en) Flat machine long-range control method
CN113422696B (en) Monitoring data updating method, system, equipment and readable storage medium
CN115314358A (en) Method and device for monitoring dummy network element fault of home wide network
CN106506226B (en) A kind of starting method and device of fault detection
CN112671586B (en) Automatic migration and guarantee method and device for service configuration
CN114422386B (en) Monitoring method and device for micro-service gateway
CN115174350B (en) Operation and maintenance alarm method, device, equipment and medium
CN101917699B (en) Random reported signaling tracking method and device based on users
KR20190004970A (en) System and Method for Real-Time Trouble Cause Analysis based on Network Quality Data
CN115705259A (en) Fault processing method, related device and storage medium
CN107370612B (en) Network quality management system detection task scheduling method and device
CN116225944B (en) Software testing system and method for presetting networking environment
CN115396290B (en) Automatic fault recovery method, device and service system
CN104185093B (en) The up home gateway on-position automatic positioning methods of PON and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant