CN109347688B - Method and device for positioning fault in wireless local area network - Google Patents

Method and device for positioning fault in wireless local area network

Info

Publication number
CN109347688B
Authority
CN
China
Prior art keywords
sta
identification
time
grouping
failure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811418151.3A
Other languages
Chinese (zh)
Other versions
CN109347688A (en
Inventor
何宗海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ruijie Networks Co Ltd
Original Assignee
Ruijie Networks Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ruijie Networks Co Ltd
Priority to CN201811418151.3A
Publication of CN109347688A
Application granted
Publication of CN109347688B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0677: Localisation of faults
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/069: Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/147: Network analysis or design for predicting network behaviour
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/16: Threshold monitoring
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W84/00: Network topologies
    • H04W84/02: Hierarchically pre-organised networks, e.g. paging networks, cellular networks, WLAN [Wireless Local Area Network] or WLL [Wireless Local Loop]
    • H04W84/10: Small scale networks; Flat hierarchical networks
    • H04W84/12: WLAN [Wireless Local Area Networks]

Abstract

The embodiments of the invention provide a method and a device for locating faults in a wireless local area network. The method comprises the following steps: acquiring a first running log of each of n wireless access points (APs) in a first period; summarizing all the first running logs into a second running log; generating records, each taking the identifier of a wireless terminal (STA) in the second running log as a primary key and taking, as its content, the identifier of the AP corresponding to the primary key, the content of the running log corresponding to the primary key, and the generation timestamp of that running log; grouping the set of records according to a keyword of the records; sorting the records contained in each group by state change time; counting, according to the sorted records in each group, the frequency with which the current state in the operation process of the STA is failure within a second period; and determining the type of the network fault in each group according to the counted frequency. The accuracy of the fault prediction result is thereby improved.

Description

Method and device for positioning fault in wireless local area network
Technical Field
The present invention relates to the field of data communication, and in particular, to a method and an apparatus for locating a fault in a wireless local area network.
Background
A Wireless Local Area Network (WLAN) is a computer local area network that uses wireless channels as its transmission medium. It is an important supplement to and extension of wired networking, and is widely used where mobile data processing is required or where physical transmission media are difficult to lay.
A WLAN mainly includes a station (STA, also called a wireless terminal), a wireless access point (AP), a wireless medium (WM), and a distribution system (DS).
The STA is generally a client in the WLAN, such as a computer equipped with a wireless network card or a smartphone with a Wi-Fi module. The STA may be mobile or fixed, and is the most basic component of the WLAN.
During operation of the WLAN, a wireless access point may fail, network parameters may be configured incorrectly, or the network may come under attack. Any of these can disrupt network communication, directly degrade the user's Internet experience, and even cause direct economic loss to the user.
During operation, the AP outputs system logs (syslog) and logs of STAs going online.
Automatically collecting the device syslog and the STA online logs in real time, diagnosing possible wireless network faults from this data in time, notifying network administrators, and finally resolving the faults and restoring normal network operation is a problem that network operation and maintenance urgently needs to solve.
At present, WLAN faults can only be monitored with the traditional manual drive test method, which checks the performance and coverage of the WLAN. This method requires an engineer to carry various monitoring instruments to each WLAN access point, collect network quality parameters and run data service tests on site, and then perform statistics and localization analysis on the collected parameters and test results to evaluate the network quality of each access point. Because manual testing captures only a moment in time and is subjective, this method cannot provide comprehensive and continuous network monitoring. It is therefore difficult to prevent network faults in advance, and monitoring network anomalies, including locating WLAN faults, is quite cumbersome. In addition, the monitoring instruments are expensive and the method consumes a large amount of manpower, material resources and capital, so the detection cost is high, which makes WLAN construction planning and optimization very difficult for enterprise WLAN users.
Disclosure of Invention
In order to solve the technical problem, the embodiment of the invention adopts the following technical scheme:
a method of locating a fault in a wireless local area network, comprising:
acquiring a first running log of each of n wireless access points AP in a first period, wherein n is a positive integer greater than or equal to 1;
summarizing all the first running logs into a second running log;
generating records, each taking the identifier of a wireless terminal STA in the second running log as a primary key and taking, as its content, the identifier of the AP corresponding to the primary key, the content of the running log corresponding to the primary key, and the generation timestamp of the running log corresponding to the primary key;
sorting the records with the same STA identifier and the same AP identifier by generation timestamp;
determining the message type matching each record according to the content of the running log corresponding to the primary key;
determining the operation process of each STA on each AP according to the message type and the timestamp sorting result, wherein the operation process at least comprises the identifier of the STA, the identifier of the AP, the state change time of the STA and the current state of the STA;
grouping the set of records according to a keyword of the records;
for each group, sorting the records contained in the group by state change time;
counting, according to the sorted records in each group, the frequency with which the current state in the operation process of the STA is failure within a second period;
and determining the type of the network fault in each group according to the counted frequency.
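The first four steps above (collecting per-AP logs, merging them, building STA-keyed records and sorting them) can be sketched as follows. This is only an illustrative sketch: the dictionary field names `sta`, `ap`, `msg` and `ts`, the `Record` class, and the sample log entries are assumptions made for the example and are not specified by the source.

```python
from dataclasses import dataclass
from itertools import chain


@dataclass(frozen=True)
class Record:
    sta_id: str       # primary key: identifier of the STA
    ap_id: str        # identifier of the AP that logged the event
    content: str      # raw content of the running-log entry
    timestamp: float  # generation timestamp of the entry


def merge_logs(per_ap_logs):
    """Summarize the first running logs of all APs into one second running log."""
    return list(chain.from_iterable(per_ap_logs.values()))


def build_records(second_log):
    """Turn each log entry into a record keyed by the STA identifier."""
    return [Record(e["sta"], e["ap"], e["msg"], e["ts"]) for e in second_log]


def sort_records(records):
    """Keep entries of one STA on one AP adjacent and in chronological order."""
    return sorted(records, key=lambda r: (r.sta_id, r.ap_id, r.timestamp))


# Hypothetical sample input: one list of log entries per AP.
per_ap_logs = {
    "ap-1": [
        {"sta": "sta-b", "ap": "ap-1", "msg": "association succeeded", "ts": 12.0},
        {"sta": "sta-a", "ap": "ap-1", "msg": "association succeeded", "ts": 11.0},
        {"sta": "sta-a", "ap": "ap-1", "msg": "authentication succeeded", "ts": 10.0},
    ],
    "ap-2": [
        {"sta": "sta-a", "ap": "ap-2", "msg": "authentication succeeded", "ts": 5.0},
    ],
}

ordered = sort_records(build_records(merge_logs(per_ap_logs)))
for record in ordered:
    print(record)
```

Sorting on the tuple (STA identifier, AP identifier, timestamp) is one simple way to obtain the per-STA, per-AP chronological order that the later steps rely on.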
Optionally, the state change time is an authentication time, or an online time or an offline time.
Optionally, the message type is one of an STA authentication message, an STA association message, an IP address assignment message, an IP address switching message, or an STA offline message.
Optionally, the operation process further includes: one or more of STA authentication time, STA association time, IP address allocation time, STA online time and STA offline time.
Optionally, the keyword is the identifier of the STA,
and the step of determining the type of the network fault in each group according to the counted frequency specifically includes:
when the groups are divided according to the identifier of the STA, and the current states in the operation processes of all STAs within m consecutive time units are failure, determining that the network fault is a fault of the whole network;
when the groups are divided according to the identifier of the STA, and the number of STAs whose current state in the operation process is failure within m consecutive time units is smaller than a first threshold, determining the identifier of each failed STA according to its operation process, and determining that the network fault in the group is a fault of the STA so identified.
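The decision rule for groups divided by STA identifier might be sketched as below. The sketch assumes that the states observed for each STA within the m consecutive time units have already been collected, and it treats an STA as failed only if every observed state is failure; that interpretation, the function name, and the label strings are assumptions for illustration only.

```python
def classify_sta_groups(groups, first_threshold):
    """Classify the fault for groups keyed by STA identifier.

    groups maps an STA identifier to the list of current states
    ("success" or "failure") observed within m consecutive time units.
    """
    # Assumption: an STA counts as failed when all of its observed states are failures.
    failed = sorted(
        sta for sta, states in groups.items()
        if states and all(state == "failure" for state in states)
    )
    if groups and len(failed) == len(groups):
        # Every STA failed: treat it as a fault of the whole network.
        return "whole_network", failed
    if 0 < len(failed) < first_threshold:
        # Only a few STAs failed: attribute the fault to those STAs.
        return "sta", failed
    return "undetermined", failed


print(classify_sta_groups({"sta-a": ["failure"], "sta-b": ["success"]}, 2))
```

The `undetermined` result covers cases the text does not describe, such as a failure count at or above the threshold without all STAs failing.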
Optionally, the keyword is the identifier of the AP,
and the step of determining the type of the network fault in each group according to the counted frequency specifically includes:
when the groups are divided according to the identifier of the AP, and the number of STAs whose current state in the operation process is failure within m consecutive time units is smaller than a first threshold, determining, according to the operation process of each failed STA, the identifier of the AP corresponding to that STA, and determining that the network fault in the group is a fault of the AP so identified.
Optionally, the method further includes:
acquiring a third running log of each STA in the wireless local area network in a first period;
summarizing all the third running logs into a fourth running log;
and matching the identifier of the AP and the state change time of the STA in the fourth running log against the identifier of the AP and the state change time of the STA in the operation process of the STA, and if they match, adding the SSID in the fourth running log to the operation process of the STA.
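The SSID enrichment step can be illustrated with a small sketch. It assumes that each operation process and each STA-side log entry is a dictionary with the hypothetical fields `sta`, `ap`, `change_time` and (for the STA log) `ssid`, and that timestamps are considered matching when they differ by at most a tolerance; none of these details come from the source.

```python
def add_ssid(processes, sta_log, tolerance=1.0):
    """Attach the SSID from the STA-side log to each matching operation process."""
    enriched = []
    for proc in processes:
        ssid = None
        for entry in sta_log:
            same_link = entry["sta"] == proc["sta"] and entry["ap"] == proc["ap"]
            close_in_time = abs(entry["change_time"] - proc["change_time"]) <= tolerance
            if same_link and close_in_time:
                ssid = entry["ssid"]
                break
        # Processes without a match keep ssid=None rather than a guessed value.
        enriched.append({**proc, "ssid": ssid})
    return enriched


processes = [
    {"sta": "sta-a", "ap": "ap-1", "change_time": 10.0, "state": "failure"},
    {"sta": "sta-b", "ap": "ap-2", "change_time": 20.0, "state": "success"},
]
sta_log = [
    {"sta": "sta-a", "ap": "ap-1", "change_time": 10.4, "ssid": "office"},
]
print(add_ssid(processes, sta_log))
```

Once the SSID is part of the operation process, the same grouping and counting steps can be run with the SSID as the keyword.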
Optionally, the keyword is the SSID;
and the step of determining the type of the network fault in each group according to the counted frequency specifically includes:
when the groups are divided according to the SSID, and the number of STAs whose current state in the operation process is failure within m consecutive time units is smaller than a first threshold, determining, according to the operation process of each failed STA, the SSID corresponding to that STA, and determining that the network fault in the group is a fault of the SSID so identified.
Optionally, the current state of the STA is determined by the following steps:
when all message types required for the STA to go online successfully once appear, in the order specified by the standard, in the timestamp sorting result corresponding to that online attempt, determining that the current state of the STA is success;
and when at least one of the message types required for the STA to go online successfully once does not appear, in the order specified by the standard, in the timestamp sorting result corresponding to that online attempt, determining that the current state of the STA is failure.
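The success/failure rule above amounts to checking that the required message types occur as an ordered subsequence of the timestamp-sorted messages of one online attempt. A minimal sketch follows; the required sequence `REQUIRED_ORDER` and the message-type labels are assumptions, since the text only refers to "the order specified by the standard", and the sketch further assumes that unrelated messages may appear in between.

```python
# Assumed required sequence for one successful online attempt (illustrative only).
REQUIRED_ORDER = ("authentication", "association", "ip_assignment")


def current_state(message_types, required=REQUIRED_ORDER):
    """Return "success" if `required` is an ordered subsequence of `message_types`."""
    remaining = iter(message_types)
    # `needed in remaining` consumes the iterator up to the first match,
    # so each later type must appear after the previous one.
    in_order = all(needed in remaining for needed in required)
    return "success" if in_order else "failure"


print(current_state(["authentication", "association", "ip_assignment"]))
print(current_state(["association", "authentication", "ip_assignment"]))
```

The second call reports a failure because the association message precedes the authentication message, so the required order is not met even though all three types are present.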
Another aspect of the embodiments of the present invention is to provide an apparatus for locating a fault in a wireless local area network, including:
a first obtaining module, configured to obtain a first running log of each of n wireless access points AP in a first period, where n is a positive integer greater than or equal to 1;
the first summarizing module is used for summarizing all the first running logs into a second running log;
a first generating module, configured to generate records, each taking the identifier of a wireless terminal STA in the second running log as a primary key and taking, as its content, the identifier of the AP corresponding to the primary key, the content of the running log corresponding to the primary key, and the generation timestamp of the running log corresponding to the primary key;
a first sorting module, configured to sort the records with the same STA identifier and the same AP identifier by generation timestamp;
a first determining module, configured to determine the message type matching each record according to the content of the running log corresponding to the primary key;
a second determining module, configured to determine, according to the message type and the timestamp sorting result, the operation process of each STA on each AP, where the operation process at least includes the identifier of the STA, the identifier of the AP, the state change time of the STA, and the current state of the STA;
a grouping module, configured to group the set of records according to a keyword of the records;
a second sorting module, configured to sort the records contained in each group by state change time;
a counting module, configured to count, according to the sorted records in each group, the frequency with which the current state in the operation process of the STA is failure within a second period;
and a third determining module, configured to determine the type of the network fault in each group according to the counted frequency.
Optionally, the state change time is an authentication time, or an online time or an offline time.
Optionally, the message type is one of an STA authentication message, an STA association message, an IP address assignment message, an IP address switching message, or an STA offline message.
Optionally, the operation process further includes: one or more of STA authentication time, STA association time, IP address allocation time, STA online time and STA offline time.
Optionally, the keyword is the identifier of the STA,
and the third determining module is specifically configured to:
when the groups are divided according to the identifier of the STA, and the current states in the operation processes of all STAs within m consecutive time units are failure, determine that the network fault is a fault of the whole network;
when the groups are divided according to the identifier of the STA, and the number of STAs whose current state in the operation process is failure within m consecutive time units is smaller than a first threshold, determine the identifier of each failed STA according to its operation process, and determine that the network fault in the group is a fault of the STA so identified.
Optionally, the keyword is the identifier of the AP,
and the third determining module is specifically configured to:
when the groups are divided according to the identifier of the AP, and the number of STAs whose current state in the operation process is failure within m consecutive time units is smaller than a first threshold, determine, according to the operation process of each failed STA, the identifier of the AP corresponding to that STA, and determine that the network fault in the group is a fault of the AP so identified.
Optionally, the apparatus further comprises:
a second obtaining module, configured to obtain a third running log of each STA in the wireless local area network in the first period;
a second summarizing module, configured to summarize all the third running logs into a fourth running log;
a matching module, configured to match the identifier of the AP and the state change time of the STA in the fourth running log against the identifier of the AP and the state change time of the STA in the operation process of the STA;
and an adding module, configured to add the SSID in the fourth running log to the operation process of the STA if they match.
Optionally, the keyword is the SSID;
and the third determining module is specifically configured to:
when the groups are divided according to the SSID, and the number of STAs whose current state in the operation process is failure within m consecutive time units is smaller than a first threshold, determine, according to the operation process of each failed STA, the SSID corresponding to that STA, and determine that the network fault in the group is a fault of the SSID so identified.
Optionally, the apparatus further comprises: an STA current state determining module, configured to determine that the current state of the STA is success when all message types required for the STA to go online successfully once appear, in the order specified by the standard, in the timestamp sorting result corresponding to that online attempt;
and to determine that the current state of the STA is failure when at least one of the message types required for the STA to go online successfully once does not appear, in the order specified by the standard, in the timestamp sorting result corresponding to that online attempt.
An advantage of the embodiments of the invention is that the syslog logs of all AP devices and the STA logs in the WLAN are collected automatically around the clock, network faults in the WLAN are discovered automatically without human attendance, and alarm information is sent automatically to operation and maintenance personnel, so that faults in the WLAN are discovered and located automatically. In addition, by applying big-data analysis and statistics, complex data such as syslog logs and STA logs are associated and aggregated according to the actual business process and combined with the application scenario, which effectively improves the accuracy of the fault prediction result and also improves usability.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method according to an embodiment of the present invention;
FIG. 3 is a flow chart of a method according to an embodiment of the present invention;
FIG. 4 is a diagram of an apparatus according to an embodiment of the present invention;
FIG. 5 is a diagram of an apparatus according to an embodiment of the present invention;
FIG. 6 is a system configuration diagram according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
First, it should be noted that the STA access procedure refers to the procedure by which the STA accesses the network, and includes two steps: authentication and association.
The STA process refers to the process from the STA accessing the network to the STA going offline. Viewed from the syslog messages, this process includes authentication, association, IP allocation, IP switching caused by frequency band switching, re-association and IP allocation caused by roaming, and going offline.
A first embodiment of the present invention provides a method for locating a fault in a wireless local area network, as shown in fig. 1, including:
S101, acquiring a first running log of each of n wireless access points (APs) in a first period;
wherein n is a positive integer greater than or equal to 1.
S103, summarizing all the first running logs into a second running log;
S105, generating records, each taking the identifier of a wireless terminal STA in the second running log as a primary key and taking, as its content, the identifier of the AP corresponding to the primary key, the content of the running log corresponding to the primary key, and the generation timestamp of the running log corresponding to the primary key;
S107, sorting the records with the same STA identifier and the same AP identifier by generation timestamp;
S109, determining the message type matching each record according to the content of the running log corresponding to the primary key;
S111, determining the operation process of each STA on each AP according to the message type and the timestamp sorting result, wherein the operation process at least comprises the identifier of the STA, the identifier of the AP, the state change time of the STA and the current state of the STA;
S113, grouping the set of records according to a keyword of the records;
S115, sorting the records contained in each group by state change time;
S117, counting, according to the sorted records in each group, the frequency with which the current state in the operation process of the STA is failure within a second period;
S119, determining the type of the network fault in each group according to the counted frequency.
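Steps S113 to S117 (grouping by a keyword, sorting by state change time, and counting failures within the second period) can be sketched as follows. The dictionary fields `sta`, `ap`, `change_time` and `state`, and the half-open interval used for the second period, are assumptions chosen for the example.

```python
from collections import defaultdict


def failure_frequency(processes, key, period_start, period_end):
    """Count failures per group within the period [period_start, period_end)."""
    groups = defaultdict(list)
    for proc in processes:
        groups[proc[key]].append(proc)  # S113: group by the chosen keyword

    frequency = {}
    for group_key, members in groups.items():
        members.sort(key=lambda p: p["change_time"])  # S115: sort by state change time
        frequency[group_key] = sum(                   # S117: count failures in the period
            1 for p in members
            if period_start <= p["change_time"] < period_end and p["state"] == "failure"
        )
    return frequency


# Hypothetical operation processes produced by step S111.
processes = [
    {"sta": "sta-a", "ap": "ap-1", "change_time": 30.0, "state": "failure"},
    {"sta": "sta-a", "ap": "ap-1", "change_time": 10.0, "state": "failure"},
    {"sta": "sta-b", "ap": "ap-1", "change_time": 20.0, "state": "success"},
    {"sta": "sta-b", "ap": "ap-2", "change_time": 70.0, "state": "failure"},
]

print(failure_frequency(processes, "ap", 0.0, 60.0))
print(failure_frequency(processes, "sta", 0.0, 100.0))
```

Passing the keyword as a parameter lets the same routine serve the STA, AP and SSID groupings described in the optional features.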
Optionally, the state change time is an authentication time, or an online time or an offline time.
Optionally, the message type is one of an STA authentication message, an STA association message, an IP address assignment message, an IP address switching message, or an STA offline message.
Optionally, the operation process further includes: one or more of STA authentication time, STA association time, IP address allocation time, STA online time and STA offline time.
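Step S109 maps the content of each log entry to one of the message types listed above. The source does not give the log format, so the following sketch uses invented keyword patterns and invented sample log lines purely for illustration; a real deployment would use patterns that match the actual syslog output of the AP.

```python
import re

# Hypothetical keyword patterns; order matters because the first match wins.
PATTERNS = [
    ("sta_offline", re.compile(r"offline|disassoc|deauth", re.IGNORECASE)),
    ("sta_authentication", re.compile(r"\bauth", re.IGNORECASE)),
    ("sta_association", re.compile(r"\bassoc", re.IGNORECASE)),
    ("ip_switch", re.compile(r"\bip\b.*(switch|chang)", re.IGNORECASE)),
    ("ip_assignment", re.compile(r"\bip\b.*(assign|alloc)|dhcp", re.IGNORECASE)),
]


def classify_message(content):
    """Return the message type matching a log entry, or None if nothing matches."""
    for message_type, pattern in PATTERNS:
        if pattern.search(content):
            return message_type
    return None


print(classify_message("STA 00:11:22:33:44:55 authentication success"))
print(classify_message("DHCP: IP address assigned 10.0.0.5"))
```

The offline pattern is checked first so that a de-authentication line is not misread as an authentication message.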
Optionally, the keyword in step S113 is the identifier of the STA,
and step S119 then specifically includes:
when the groups are divided according to the identifier of the STA, and the current states in the operation processes of all STAs within m consecutive time units are failure, determining that the network fault is a fault of the whole network;
when the groups are divided according to the identifier of the STA, and the number of STAs whose current state in the operation process is failure within m consecutive time units is smaller than a first threshold, determining the identifier of each failed STA according to its operation process, and determining that the network fault in the group is a fault of the STA so identified.
Here m is a positive integer, and the time unit may be an hour, a minute, a second, a millisecond, a microsecond, and the like.
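The condition "within m consecutive time units" can be checked by bucketing failure times into time units and looking for a run of m adjacent buckets. The sketch below assumes timestamps in seconds and interprets the condition as at least one failure in each of m consecutive units; both the interpretation and the function name are assumptions.

```python
def has_consecutive_failures(failure_times, unit_seconds, m):
    """Return True if failures occur in at least m consecutive time units."""
    # Map each failure time to the index of the time unit it falls in.
    units = sorted({int(t // unit_seconds) for t in failure_times})
    run_length = 0
    previous = None
    for unit in units:
        if previous is not None and unit == previous + 1:
            run_length += 1
        else:
            run_length = 1  # a gap starts a new run
        if run_length >= m:
            return True
        previous = unit
    return False


# Failures in minutes 0, 1 and 2 form three consecutive one-minute units.
print(has_consecutive_failures([5.0, 65.0, 130.0], unit_seconds=60, m=3))
```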
Optionally, the keyword in step S113 is the identifier of the AP,
and step S119 then specifically includes:
when the groups are divided according to the identifier of the AP, and the number of STAs whose current state in the operation process is failure within m consecutive time units is smaller than a first threshold, determining, according to the operation process of each failed STA, the identifier of the AP corresponding to that STA, and determining that the network fault in the group is a fault of the AP so identified.
Optionally, as shown in fig. 2, in a second embodiment of the present invention, the method according to the first embodiment further includes:
S201, acquiring a third running log of each STA in the wireless local area network in the first period;
S203, summarizing all the third running logs into a fourth running log;
S205, matching the identifier of the AP and the state change time of the STA in the fourth running log against the identifier of the AP and the state change time of the STA in the operation process of the STA;
S207, if they match, adding the SSID in the fourth running log to the operation process of the STA.
Optionally, in the second embodiment of the present invention, the keyword in step S113 is the SSID (Service Set Identifier);
and in the second embodiment, step S119 then specifically includes:
when the groups are divided according to the SSID, and the number of STAs whose current state in the operation process is failure within m consecutive time units is smaller than a first threshold, determining, according to the operation process of each failed STA, the SSID corresponding to that STA, and determining that the network fault in the group is a fault of the SSID so identified. Here m is a positive integer, and the time unit may be an hour, a minute, a second, a millisecond, a microsecond, and the like.
Optionally, as shown in fig. 3, the current state of the STA in the first embodiment of the present invention is determined by the following steps:
S301, when all message types required for the STA to go online successfully once appear, in the order specified by the standard, in the timestamp sorting result corresponding to that online attempt, determining that the current state of the STA is success;
S303, when at least one of the message types required for the STA to go online successfully once does not appear, in the order specified by the standard, in the timestamp sorting result corresponding to that online attempt, determining that the current state of the STA is failure.
An advantage of the embodiments of the invention is that the syslog logs of all AP devices and the STA logs in the WLAN are collected automatically around the clock, network faults in the WLAN are discovered automatically without human attendance, and alarm information is sent automatically to operation and maintenance personnel, so that faults in the WLAN are discovered and located automatically. In addition, by applying big-data analysis and statistics, complex data such as syslog logs and STA logs are associated and aggregated according to the actual business process and combined with the application scenario, which effectively improves the accuracy of the fault prediction result and also improves usability.
A third embodiment of the present invention provides an apparatus for locating a fault in a wireless local area network, as shown in fig. 4, including:
a first obtaining module 401, configured to obtain a first running log of each of n wireless access points AP in a first period, where n is a positive integer greater than or equal to 1;
a first summarizing module 403, configured to summarize all the first operation logs into a second operation log;
a first generating module 405, configured to generate records, each taking the identifier of a wireless terminal STA in the second running log as a primary key and taking, as its content, the identifier of the AP corresponding to the primary key, the content of the running log corresponding to the primary key, and the generation timestamp of the running log corresponding to the primary key;
a first sorting module 407, configured to sort the records with the same STA identifier and the same AP identifier by generation timestamp;
a first determining module 409, configured to determine the message type matching each record according to the content of the running log corresponding to the primary key;
a second determining module 411, configured to determine, according to the message type and the timestamp sorting result, the operation process of each STA on each AP, where the operation process at least includes the identifier of the STA, the identifier of the AP, the state change time of the STA, and the current state of the STA;
a grouping module 413, configured to group the set of records according to a keyword of the records;
a second sorting module 415, configured to sort the records contained in each group by state change time;
a counting module 417, configured to count, according to the sorted records in each group, the frequency with which the current state in the operation process of the STA is failure within a second period;
a third determining module 419, configured to determine the type of the network fault in each group according to the counted frequency.
Optionally, the state change time is an authentication time, an online time, or an offline time.
Optionally, the message type is one of an STA authentication message, an STA association message, an IP address assignment message, an IP address switching message, or an STA offline message.
Optionally, the operation process further includes one or more of: STA authentication time, STA association time, IP address allocation time, STA online time, and STA offline time.
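As an illustration of the fields listed above, an operation process could be held in a small record type such as the one below. This is a sketch only: the class name, field names, and the rule for picking the state change time (first available of authentication, online, offline time) are assumptions, not taken from the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StaProcess:
    """One operation process of an STA on an AP (hypothetical layout)."""
    sta_id: str                  # identifier of the STA, e.g. its MAC address
    ap_id: str                   # identifier of the AP
    state: str                   # "success" or "failure"
    auth_time: Optional[float] = None
    assoc_time: Optional[float] = None
    ip_assign_time: Optional[float] = None
    online_time: Optional[float] = None
    offline_time: Optional[float] = None

    def state_change_time(self) -> Optional[float]:
        """Return the first available of authentication, online, offline time."""
        for t in (self.auth_time, self.online_time, self.offline_time):
            if t is not None:
                return t
        return None
```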
Optionally, the keyword is the identifier of the STA,
then the third determining module 419 is specifically configured to:
when the groups are divided according to the identifier of the STA and the current states corresponding to the operation processes of all STAs are failure for m consecutive time units, determine that the network fault in the groups divided according to the identifier of the STA is a fault of the whole network;
when the groups are divided according to the identifier of the STA and the number of STAs whose operation processes have a current state of failure for m consecutive time units is smaller than a first threshold, determine, from the operation processes, the identifiers of the STAs whose state is failure, and determine that the network fault in the groups divided according to the identifier of the STA is a fault of the STAs so identified.
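The two rules for the STA grouping can be sketched as a small decision function. The sketch assumes one reading of the text: an STA counts as failing when its state is failure in m consecutive time units. The function name, the input layout, and the `"undetermined"` and `"none"` outcomes are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def classify_sta_groups(failed_by_sta: Dict[str, List[bool]],
                        m: int,
                        first_threshold: int) -> Tuple[str, List[str]]:
    """Classify the fault for groups keyed by STA identifier.

    `failed_by_sta` maps an STA identifier to time-ordered flags,
    one per time unit, where True means the state was failure.
    """
    def failed_m_consecutive(flags: List[bool]) -> bool:
        run = 0
        for failed in flags:
            run = run + 1 if failed else 0
            if run >= m:
                return True
        return False

    failing = sorted(sta for sta, flags in failed_by_sta.items()
                     if failed_m_consecutive(flags))
    if not failing:
        return ("none", [])
    if len(failing) == len(failed_by_sta):
        # Every STA failed for m consecutive units: whole-network fault.
        return ("whole_network", failing)
    if len(failing) < first_threshold:
        # Only a few STAs failed: attribute the fault to those STAs.
        return ("sta", failing)
    # Case not covered by the two rules in the text.
    return ("undetermined", failing)
```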
Optionally, the keyword is the identifier of the AP,
then the third determining module 419 is specifically configured to:
when the groups are divided according to the identifier of the AP and the number of STAs whose operation processes have a current state of failure for m consecutive time units is smaller than a first threshold, determine, from the operation processes, the identifier of the AP corresponding to each STA whose state is failure, and determine that the network fault in the groups divided according to the identifier of the AP is a fault of the AP so identified.
Optionally, in a fourth embodiment of the present invention, as shown in fig. 5, the apparatus according to the third embodiment of the present invention further includes:
a second obtaining module 501, configured to obtain a third running log of each STA in the wireless local area network in the first period;
a second summarizing module 503, configured to summarize all the third running logs into a fourth running log;
a matching module 505, configured to match the identifier of the AP and the state change time of the STA in the fourth running log against the identifier of the AP and the state change time of the STA in the operation process of the STA;
an adding module 507, configured to add the SSID in the fourth running log to the operation process of the STA if the match succeeds.
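The matching and adding modules amount to a join between STA log entries and operation processes on the pair (AP identifier, state change time). A minimal sketch, assuming dictionary records with hypothetical field names and exact timestamp equality as the match condition:

```python
from typing import Dict, List


def attach_ssid(processes: List[Dict], sta_logs: List[Dict]) -> List[Dict]:
    """Return copies of the operation processes with the SSID added on a match.

    A process matches an STA log entry when both the AP identifier and the
    state change time are equal (exact equality is an assumption here).
    """
    index = {(entry["ap_id"], entry["state_change_time"]): entry["ssid"]
             for entry in sta_logs}
    enriched = []
    for process in processes:
        copy = dict(process)  # leave the input records untouched
        ssid = index.get((process["ap_id"], process["state_change_time"]))
        if ssid is not None:
            copy["ssid"] = ssid
        enriched.append(copy)
    return enriched
```

If two STAs change state on the same AP at the same instant, this key is ambiguous; adding the STA identifier to the key would be one way to resolve it.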
Optionally, the keyword is the identifier of the SSID;
then, in the fourth embodiment of the present invention, the third determining module 419 is specifically configured to:
when the groups are divided according to the identifier of the SSID and the number of STAs whose operation processes have a current state of failure for m consecutive time units is smaller than a first threshold, determine, from the operation processes, the SSID corresponding to each STA whose state is failure, and determine that the network fault in the groups divided according to the SSID is a fault of the SSID so identified.
Optionally, in the third embodiment of the present invention, the apparatus further includes an STA current state determining module, configured to determine that the current state of the STA is success when all the message types required for the STA to come online successfully appear, in the order specified by the standard, in the timestamp sorting result corresponding to that online attempt;
and to determine that the current state of the STA is failure when at least one of the message types required for the STA to come online successfully does not appear, in the order specified by the standard, in the timestamp sorting result corresponding to that online attempt.
The embodiment of the invention has the following advantages: the Syslog logs and STA logs of all AP devices in the WLAN are collected automatically around the clock, network faults in the WLAN are discovered automatically without human attendance, and alarm information is sent automatically to operation and maintenance personnel, so that faults in the WLAN are discovered and located automatically. In addition, by applying big-data analysis and statistics, complex data such as Syslog logs and STA logs are associated and aggregated according to the actual business process and combined with the application scenario, which effectively improves the accuracy of the fault prediction result and also improves usability.
The following further explains the embodiment of the present invention with reference to a specific application scenario, and fig. 6 is a system structure diagram of the embodiment of the present invention, as shown in fig. 6, including:
Operation and maintenance management system: configured to receive logs sent by the APs (e.g., AP1 to APn in fig. 6) through the Syslog protocol over UDP (User Datagram Protocol) and through a REST interface, and to store the logs in an Elasticsearch database in the database cluster. The content of the Syslog data is presented at the management end; newly added data in the Elasticsearch database is automatically exported every hour to the HDFS file system in the database cluster (the interval can be adjusted to the scenario, e.g. every two hours), and a diagnosis request is sent to the diagnosis service unit. The system also displays the diagnosis result of the diagnosis service unit or sends a fault alarm.
Database cluster: comprises an Elasticsearch database, the HDFS distributed file system, and a MySQL relational database. The Elasticsearch database stores the Syslog data of the most recent days (e.g. 7 days), and new data is exported to the HDFS file system at a fixed interval (e.g. every hour). The HDFS file system stores the Syslog data permanently. The MySQL database stores structured data, mainly the data of the operation and maintenance management system and the diagnosis results of the diagnosis service unit.
Diagnosis service unit: configured to fetch a Syslog data file from the HDFS file system, perform MapReduce computation to obtain STA process data, apply rule judgment and statistical calculation to the process data to find the objects in the WLAN for which STA failures occur on a large scale, locate the fault according to those objects, and store the related information in the MySQL database in the database cluster so that the operation and maintenance management system can display it and raise a fault alarm.
In the MapReduce computing framework, the map and reduce functions are the tasks to be executed, and a master entity assigns the tasks to worker entities for execution. The map function reads its assigned input data shard and outputs a set of intermediate key/value pairs; the reduce function collects the values that share the same intermediate key and merges them into a smaller set of values.
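The map and reduce roles described above can be illustrated with a single-process toy driver. It is a sketch only: there is no master or worker scheduling, and the function name `run_mapreduce` and the sample log lines are made up for illustration.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Tuple


def run_mapreduce(inputs: Iterable[Any],
                  map_fn: Callable[[Any], Iterable[Tuple[Any, Any]]],
                  reduce_fn: Callable[[Any, List[Any]], Any]) -> Dict[Any, Any]:
    """Apply map_fn to each input, group values by key, then reduce each group."""
    groups: Dict[Any, List[Any]] = defaultdict(list)
    for item in inputs:
        for key, value in map_fn(item):   # intermediate key/value pairs
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Usage: count log lines per AP (the first token of each made-up line).
lines = ["AP1 auth ok", "AP2 auth fail", "AP1 assoc ok"]
counts = run_mapreduce(
    lines,
    map_fn=lambda line: [(line.split()[0], 1)],
    reduce_fn=lambda key, values: sum(values),
)
```

In a real framework the grouping step is a distributed shuffle between map and reduce workers; here it is a plain in-memory dictionary.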
In the embodiment of the invention, the data collection step and the fault diagnosis positioning step can be independent from each other. The data collection steps are as follows:
Step A, the operation and maintenance management system obtains a first running log of each of the n wireless access points (APs) in a first period, and obtains a third running log of each STA in the wireless local area network in the first period. Specifically, the AP sends the first running log to the operation and maintenance management system through the standard Syslog protocol over UDP, or sends the STA log (that is, the third running log, which may include the STA's online and offline times, the SSID accessed, the traffic used, and so on) to the operation and maintenance management system by calling the REST interface.
Step B, all the first running logs are summarized into a second running log, all the third running logs are summarized into a fourth running log, and the data collected in step A is stored in the Elasticsearch database. Ordinarily, the operation and maintenance management system can look up the Syslog data in the Elasticsearch database and produce some basic statistics. The operation and maintenance management system automatically saves the incremental data in Elasticsearch to the HDFS file system every hour.
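The hourly incremental save can be pictured as a watermark-based export: each run copies only the records newer than the previous export time. The sketch below uses an in-memory list as a stand-in for the log store and makes no calls to Elasticsearch or HDFS; the function name and record layout are assumptions.

```python
from typing import Dict, List, Tuple


def export_increment(store: List[Dict],
                     last_export_ts: float,
                     now_ts: float) -> Tuple[List[Dict], float]:
    """Return records with last_export_ts < ts <= now_ts and the new watermark.

    `store` stands in for the log database; timestamps are in seconds.
    """
    batch = [record for record in store
             if last_export_ts < record["ts"] <= now_ts]
    batch.sort(key=lambda record: record["ts"])
    return batch, now_ts
```

Passing the returned watermark into the next call ensures that no record is exported twice and none is skipped, provided records are never inserted with timestamps older than the watermark.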
On the basis of data acquisition, the steps of fault diagnosis and positioning are as follows:
Step one, periodically acquire the Syslog incremental data file of the most recent hour from the database cluster.
Step two, generate records, each taking the identifier of a wireless terminal (STA) in the second running log as the primary key and taking, as its content, the identifier of the AP corresponding to the primary key, the content of the running log corresponding to the primary key, and the generation timestamp of that running log.
Specifically, the diagnosis service unit may perform MapReduce operations on the Syslog data file:
In the Mapper stage, the messages that contain the data field staMac (the MAC address of the STA, one kind of STA identifier) are selected, and a Mapper record is generated with staMac as the primary key and with the AP identifier (e.g. the device serial number, or SN code), the content of the Syslog message, and the timestamp at which the Syslog message was generated as the value.
In the Reducer stage, the Mapper data is classified using staMac as the dictionary key; under each staMac key, the Syslog content is classified using the SN code of the AP device as the dictionary key; and the messages with the same staMac and the same SN code are sorted by Syslog timestamp.
All Syslog entries under each SN code dictionary are matched to determine the message type, that is, whether each is an authentication message, an association message, an IP assignment message, an IP switching message, or an STA offline message; on this basis, all STA processes of the STA under the AP are assembled, where each STA process includes: staMac, SN code (the identifier of the AP to which the STA is connected), authentication time, association time, IP assignment time, STA online time, STA offline time, and STA current state (indicating success or failure).
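The Mapper stage, the Reducer stage, and the message-type matching described above might look roughly as follows. This is a hedged sketch: the entry layout (`staMac`, `sn`, `content`, `ts`) and especially the keywords in `TYPE_KEYWORDS` are hypothetical, because the source does not give the Syslog wording.

```python
from collections import defaultdict

# Hypothetical keywords; real Syslog wording depends on the AP firmware.
TYPE_KEYWORDS = {
    "authentication": "auth",
    "association": "assoc",
    "ip_assignment": "ip-assign",
    "ip_switch": "ip-switch",
    "offline": "offline",
}


def mapper(entry):
    """Emit (staMac, (sn, content, ts)) for entries that carry a staMac field."""
    if entry.get("staMac"):
        yield entry["staMac"], (entry["sn"], entry["content"], entry["ts"])


def reducer(pairs):
    """Group by staMac, then by SN code, and sort each bucket by timestamp."""
    grouped = defaultdict(lambda: defaultdict(list))
    for sta_mac, (sn, content, ts) in pairs:
        grouped[sta_mac][sn].append((ts, content))
    return {mac: {sn: sorted(records) for sn, records in by_sn.items()}
            for mac, by_sn in grouped.items()}


def message_type(content):
    """Return the first message type whose keyword occurs in the content."""
    for name, keyword in TYPE_KEYWORDS.items():
        if keyword in content:
            return name
    return None


# Usage with made-up entries; the one without staMac is dropped by the mapper.
entries = [
    {"staMac": "aa", "sn": "SN1", "content": "assoc ok", "ts": 2},
    {"staMac": "aa", "sn": "SN1", "content": "auth ok", "ts": 1},
    {"sn": "SN1", "content": "radio reset", "ts": 3},
]
result = reducer(pair for entry in entries for pair in mapper(entry))
```

The sorted, typed sequence under each (staMac, SN code) pair is the input from which the times and the current state of an STA process can then be derived.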
Step three, associate the STA process with the traffic information in the STA log. The matching condition is that the SN code, online time, and offline time in the STA process are consistent with the SN code, online time, and offline time in the STA log; when they match, the SSID name and traffic information from the corresponding STA log are added to the STA process dictionary.
Step four, make three copies of the STA process data associated with the STA log, group the copies by staMac, by SN code, and by SSID respectively, and sort each group by the authentication time of the STA process.
Step five, process the staMac grouping from step four: count all STA processes whose STA state is failure consecutively within 10 minutes. If failure points occur for individual staMacs, the access terminal is considered faulty; if all staMacs have problems at the same time, the whole network is considered faulty.
Step six, process the SN code grouping from step four: count all STA processes whose STA state is failure consecutively within 10 minutes. If failure points occur for individual SN codes, the corresponding AP is considered faulty; the specific cause can be presented by retrieving the Syslog failure content.
Step seven, process the SSID grouping from step four: count all STA processes whose STA state is failure consecutively within 10 minutes. If failure points occur for individual SSIDs, the corresponding SSID is considered faulty; the specific cause can be presented by retrieving the Syslog failure content.
Step eight, write the device log diagnosis result into the MySQL relational database.
Step nine, the operation and maintenance management system periodically checks the MySQL database for new network faults and, when one is found, sends an alarm to the operation and maintenance personnel.
Step ten, the operation and maintenance personnel can view the fault points of the network in the operation and maintenance management system and resolve them in a targeted way using the Syslog content.
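Steps four to seven above (grouping by staMac, SN code, or SSID, sorting by authentication time, and counting consecutive failures within 10 minutes) can be sketched with one generic function. The record layout and the `min_failures` parameter are assumptions; the source does not say how many consecutive failures constitute a failure point.

```python
from collections import defaultdict

WINDOW_SECONDS = 600  # the 10-minute window from steps five to seven


def group_sorted(processes, key):
    """Group STA processes by `key` and sort each group by authentication time."""
    groups = defaultdict(list)
    for process in processes:
        groups[process[key]].append(process)
    for members in groups.values():
        members.sort(key=lambda p: p["auth_time"])
    return dict(groups)


def failing_keys(processes, key, min_failures=2):
    """Return keys with >= min_failures consecutive failures inside one window."""
    result = []
    for value, members in group_sorted(processes, key).items():
        streak = []  # auth times of the current run of consecutive failures
        for process in members:
            if process["state"] != "failure":
                streak = []          # a success breaks the run
                continue
            streak.append(process["auth_time"])
            while streak[-1] - streak[0] > WINDOW_SECONDS:
                streak.pop(0)        # drop failures that left the window
            if len(streak) >= min_failures:
                result.append(value)
                break
    return sorted(result)
```

Calling `failing_keys(processes, "sta_mac")`, `failing_keys(processes, "sn")`, and `failing_keys(processes, "ssid")` gives the candidate fault points per terminal, per AP, and per SSID; comparing the failing terminals with the full set of terminals distinguishes an individual terminal fault from a whole-network fault.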
The embodiment of the invention has the following advantages: the Syslog logs and STA logs of all AP devices in the WLAN are collected automatically around the clock, network faults in the WLAN are discovered automatically without human attendance, and alarm information is sent automatically to operation and maintenance personnel, so that faults in the WLAN are discovered and located automatically. In addition, by applying big-data analysis and statistics, complex data such as Syslog logs and STA logs are associated and aggregated according to the actual business process and combined with the application scenario, which effectively improves the accuracy of the fault prediction result and also improves usability.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (16)

1. A method for locating a fault in a wireless local area network, comprising:
acquiring a first running log of each of n wireless access points AP in a first period, wherein n is a positive integer greater than or equal to 1;
summarizing all the first running logs into a second running log;
generating a record with the identifier of the wireless terminal STA in the second operation log as a main key and the identifier of the AP corresponding to the main key, the content of the operation log corresponding to the main key and the generation timestamp of the operation log corresponding to the main key as contents;
sequencing the records with the same STA identification and the same AP identification according to the generated time stamp;
determining the message type matched with the record according to the content of the running log corresponding to the primary key;
determining the operation process of the STA corresponding to each STA on each AP according to the message type and the sequencing result of the generated timestamp, wherein the operation process at least comprises the identification of the STA, the identification of the AP, the state change time of the STA and the current state of the STA;
grouping the set of records according to the keywords of the records;
sequencing the records contained in each group according to the state change time for each group;
counting the occurrence frequency of failure in the current state corresponding to the operation process of the STA in the second period according to the result of sequencing the records contained in each group;
determining the type of the network fault in each group according to the statistical frequency;
the current state of the STA is determined by the following steps:
when the message types required by the STA to be successfully online once appear in the time stamp sequencing result corresponding to the online once according to the standard specified sequence, determining the current state of the STA as successful;
and when at least one message type in the message types required by the STA for one-time online success does not appear in the time stamp sequencing result corresponding to one-time online according to the sequence specified by the standard, determining that the current state of the STA is failure.
2. The method of claim 1, wherein the state change time is an authentication time, or an online time or an offline time.
3. The method of claim 1, wherein the message type is one of an STA authentication message, an STA association message, an IP address assignment message, an IP address switching message, or an STA offline message.
4. The method of claim 1, wherein the operating process further comprises: one or more of STA authentication time, STA association time, IP address allocation time, STA online time and STA offline time.
5. The method of claim 1, wherein the keyword is an identification of a STA,
the step of determining the type of the network fault in each group according to the statistical frequency specifically includes:
when the grouping is the grouping divided according to the identification of the STA, and the current states corresponding to the operation processes of all the STAs in the continuous m time units are all failed, determining that the network fault in the grouping divided according to the identification of the STA is the fault of the whole network;
when the grouping is the grouping divided according to the identification of the STA, and the number of the STAs with the current state of failure corresponding to the operation process of the STA in the continuous m time units is smaller than a first threshold value, the identification of the STA with the state of failure is determined according to the operation process of the STA, and the network failure in the grouping divided according to the identification of the STA is determined as the failure of the STA with the state of failure according to the identification of the STA with the state of failure.
6. The method of claim 1, wherein the keyword is an identification of an AP,
the step of determining the type of the network fault in each group according to the statistical frequency specifically includes:
when the grouping is the grouping divided according to the identification of the AP, and the number of the STAs which fail in the current state and correspond to the operation process of the STAs in the continuous m time units is smaller than a first threshold value, the identification of the AP corresponding to the STA with the failed state is determined according to the operation process of the STA, and the network fault in the grouping divided according to the identification of the AP is determined to be the fault of the AP corresponding to the STA with the failed state according to the identification of the AP.
7. The method of claim 1, further comprising:
acquiring a third running log of each STA in the wireless local area network in a first period; summarizing all the third running logs into a fourth running log;
and matching the identifier of the AP and the state change time of the STA in the fourth running log against the identifier of the AP and the state change time of the STA in the operation process of the STA, and, if they match, adding the SSID in the fourth running log to the operation process of the STA.
8. The method of claim 7, wherein the keyword is an identification of an SSID;
the step of determining the type of the network fault in each group according to the statistical frequency specifically includes:
when the groups are divided according to the identifier of the SSID and the number of STAs whose operation processes have a current state of failure for m consecutive time units is smaller than a first threshold, determining, from the operation processes of the STAs, the SSID corresponding to each STA whose state is failure, and determining that the network fault in the groups divided according to the SSID is a fault of the SSID so identified.
9. An apparatus for locating a fault in a wireless local area network, comprising:
a first obtaining module, configured to obtain a first running log of each of n wireless access points AP in a first period, where n is a positive integer greater than or equal to 1;
the first summarizing module is used for summarizing all the first running logs into a second running log;
a first generating module, configured to generate a record with an identifier of a wireless terminal STA in the second operation log as a primary key, and with an identifier of an AP corresponding to the primary key, content of an operation log corresponding to the primary key, and a generation timestamp of the operation log corresponding to the primary key as content;
the first sequencing module is used for sequencing the records with the same STA identification and the same AP identification according to the generated timestamp;
the first determining module is used for determining the message type matched with the record according to the content of the running log corresponding to the main key;
a second determining module, configured to determine, according to the message type and the result of sorting by generation timestamp, the operation process of each STA on each AP, where the operation process at least includes an STA identifier, an AP identifier, an STA state change time, and a current STA state;
the grouping module is used for grouping the record set according to the recorded keywords;
the second sequencing module is used for sequencing the records contained in each group according to the state change time;
a counting module, configured to count, according to a result of sorting the records included in each group, an occurrence frequency that a current state corresponding to an operation process of the STA in the second period is a failure;
a third determining module, configured to determine the type of the network fault in each group according to the statistical frequency; further comprising: an STA current state determining module, configured to determine that the current state of the STA is success when all the message types required for the STA to come online successfully appear, in the order specified by the standard, in the timestamp sorting result corresponding to that online attempt;
and when at least one message type in the message types required by the STA for one-time online success does not appear in the time stamp sequencing result corresponding to one-time online according to the sequence specified by the standard, determining that the current state of the STA is failure.
10. The apparatus of claim 9, wherein the state change time is an authentication time, or an online time or an offline time.
11. The apparatus of claim 9, wherein the message type is one of an STA authentication message, an STA association message, an IP address assignment message, an IP address switching message, or an STA offline message.
12. The apparatus of claim 9, wherein the operating process further comprises: one or more of STA authentication time, STA association time, IP address allocation time, STA online time and STA offline time.
13. The apparatus of claim 9, wherein the keyword is an identification of a STA,
the third determining module is specifically configured to:
when the grouping is the grouping divided according to the identification of the STA, and the current states corresponding to the operation processes of all the STAs in the continuous m time units are all failed, determining that the network fault in the grouping divided according to the identification of the STA is the fault of the whole network;
when the grouping is the grouping divided according to the identification of the STA, and the number of the STAs with the current state of failure corresponding to the operation process of the STA in the continuous m time units is smaller than a first threshold value, the identification of the STA with the state of failure is determined according to the operation process of the STA, and the network failure in the grouping divided according to the identification of the STA is determined as the failure of the STA with the state of failure according to the identification of the STA with the state of failure.
14. The apparatus of claim 9, wherein the keyword is an identification of an AP,
the third determining module is specifically configured to:
when the grouping is the grouping divided according to the identification of the AP, and the number of the STAs which fail in the current state and correspond to the operation process of the STAs in the continuous m time units is smaller than a first threshold value, the identification of the AP corresponding to the STA with the failed state is determined according to the operation process of the STA, and the network fault in the grouping divided according to the identification of the AP is determined to be the fault of the AP corresponding to the STA with the failed state according to the identification of the AP.
15. The apparatus of claim 9, further comprising:
the second acquisition module is used for acquiring a third running log of each STA in the wireless local area network in the first period;
the second summarizing module is used for summarizing all the third running logs into fourth running logs;
a matching module, configured to match the identifier of the AP and the state change time of the STA in the fourth running log with the identifier of the AP and the state change time of the STA in the running process of the STA,
and an adding module, configured to add the SSID in the fourth running log to the operation process of the STA if the match succeeds.
16. The apparatus of claim 9, wherein the keyword is an identification of an SSID;
the third determining module is specifically configured to:
when the groups are divided according to the identifier of the SSID and the number of STAs whose operation processes have a current state of failure for m consecutive time units is smaller than a first threshold, determining, from the operation processes of the STAs, the SSID corresponding to each STA whose state is failure, and determining that the network fault in the groups divided according to the SSID is a fault of the SSID so identified.
Application CN201811418151.3A: Method and device for positioning fault in wireless local area network (status: Active)
Priority date: 2018-11-26
Filing date: 2018-11-26
Publications: CN109347688A (published 2019-02-15); CN109347688B (published 2022-04-26)
Family ID: 65318088
Country status: CN (1), CN109347688B (en)

Also Published As

Publication number Publication date
CN109347688A (en) 2019-02-15

Similar Documents

Publication Publication Date Title
CN109347688B (en) Method and device for positioning fault in wireless local area network
WO2017041406A1 (en) Failure positioning method and device
US10326640B2 (en) Knowledge base radio and core network prescriptive root cause analysis
CN109996284A (en) Mobile communication Trouble call worksheet method, apparatus, equipment and medium
CN109271793B (en) Internet of things cloud platform equipment category identification method and system
CN107947968B (en) Method and device for processing network quality complaint information
CN111400127B (en) Service log monitoring method and device, storage medium and computer equipment
CN103370904A (en) Method for determining a severity of a network incident
WO2014040633A1 (en) Identifying fault category patterns in a communication network
WO2022061900A1 (en) Method for determining fault autonomy capability and related device
US10708155B2 (en) Systems and methods for managing network operations
CN108282355B (en) Equipment inspection device in cloud desktop system
WO2021143483A1 (en) System maintenance method and apparatus, device, and storage medium
CN114363151A (en) Fault detection method and device, electronic equipment and storage medium
CN112100020A (en) Data reporting method and device for base station, electronic device and storage medium
CN109963292B (en) Complaint prediction method, complaint prediction device, electronic apparatus, and storage medium
CN106998563B (en) Indoor distribution system early warning method and device based on network performance
CN107820270B (en) GPRS interface monitoring system based on GSM-R network
CN110609761B (en) Method and device for determining fault source, storage medium and electronic equipment
CN101784060A (en) Parameter processing method, network diagnosis method, terminal, server and system
CN110972185A (en) Data transmission method and device
CN110825466B (en) Program jamming processing method and jamming processing device
WO2016206241A1 (en) Data analysis method and apparatus
CN108989137B (en) Time delay measuring method and device for end-to-end communication and computer readable storage medium
CN116166499A (en) Data monitoring method and device, electronic equipment and nonvolatile storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant