CN114095800A - Large-scale wireless local area network data acquisition and processing method based on multiple data sources - Google Patents


Publication number
CN114095800A
Authority
CN
China
Prior art keywords
data
log
wireless
processing
data processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111033400.9A
Other languages
Chinese (zh)
Other versions
CN114095800B (en)
Inventor
向望
徐竟祎
沈敏虎
沈佳杰
万俨慧
李炜莹
赵泽宇
王新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN202111033400.9A
Publication of CN114095800A
Application granted
Publication of CN114095800B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04Q SELECTING
    • H04Q9/00 Arrangements in telecontrol or telemetry systems for selectively calling a substation from a main station, in which substation desired apparatus is selected for applying a control signal thereto or for obtaining measured values therefrom
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing
    • G PHYSICS
    • G08 SIGNALLING
    • G08C TRANSMISSION SYSTEMS FOR MEASURED VALUES, CONTROL OR SIMILAR SIGNALS
    • G08C17/00 Arrangements for transmitting signals characterised by the use of a wireless electrical link
    • G08C17/02 Arrangements for transmitting signals characterised by the use of a wireless electrical link using a radio link
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/02 Standardisation; Integration
    • H04L41/0213 Standardised network management protocols, e.g. simple network management protocol [SNMP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0892 Network architectures or network communication protocols for network security for authentication of entities by using authentication-authorization-accounting [AAA] servers or protocols
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W84/00 Network topologies
    • H04W84/18 Self-organising networks, e.g. ad-hoc networks or sensor networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention discloses a large-scale wireless local area network data acquisition and processing method based on multiple data sources, comprising the following steps: step 1, using DHCP data processing to collect and store the IP address lease data of the DHCP system and obtain terminal type data; step 2, using AAA authentication data processing to collect and store the user authentication data of the AAA authentication system and obtain AAA structured data; step 3, using SNMP data processing to collect the operation data of the network equipment, the wireless terminals and the wireless interference sources, and processing it together with the terminal type data and the user authentication data to obtain SNMP structured data; step 4, using log data processing to collect and store the address translation logs of the NAT system, the domain name resolution logs of the DNS system and the logs of the AC to obtain log structured data; step 5, using SFLOW traffic data processing to collect and store SFLOW traffic data and obtain SFLOW traffic structured data; step 6, storing the structured data in a wireless network core database; and step 7, inputting the core database into the intelligent operation and maintenance system.

Description

Large-scale wireless local area network data acquisition and processing method based on multiple data sources
Technical Field
The invention relates to the field of intelligent operation and maintenance, in particular to a large-scale wireless local area network data acquisition and processing method based on multiple data sources.
Background
With the continuous improvement of wireless local area network (WLAN) technology, mobile terminal devices and mobile applications, the WLAN is no longer a supplement to the wired network but has become a main mode of network access. In home networks, WiFi based on WLAN technology has largely replaced wired access: traditional terminals such as personal computers, mobile phones and tablet computers, Internet-of-Things devices such as household appliances and sensors, and new energy devices such as electric vehicles can all reach the Internet through home WiFi. For network coverage in campus areas, the WLAN is generally deployed at large scale with independent networking, comprises hundreds to tens of thousands of devices, and has become the main access mode for user terminals. Once a large-scale wireless network system has been built, the network must be managed, monitored and measured in order to understand its operating condition and monitor its performance, so that the system can be optimized and improved and can provide fast, stable service to user terminals.
Intelligent operation and maintenance (AIOps) is a concept proposed in 2016 that has gradually become practical and effective thanks to major breakthroughs in deep learning research. The core technologies of AIOps are big data and machine learning. For a network system, the various data generated during operation are collected automatically, cleaned and organized into an integrated big data set, and analyzed intelligently with machine learning methods. The analysis results can be used for network optimization and by various application systems; for example, problems and faults in the network can be found through analysis and corrected automatically according to a given policy. In a large-scale wireless local area network, a large amount of data is generated at every moment. Different types of data are produced along the whole path from the access layer to the egress link, by devices and systems including APs, ACs, access switches, aggregation switches, core switches, firewalls (NAT), the DNS system and the DHCP system. The multi-source data is also diverse: traffic, NAT address translation, security protection logs, wireless controller logs, wireless AP operating status, and the authentication and accounting information of the AAA system. A wireless network has one more layer, the AP layer, than a wired network, and the wireless transmission medium is less predictable. Compared with a wired network, the stability of a wireless network is therefore harder to guarantee, its operation and maintenance is more complex and the workload is larger, so an intelligent operation and maintenance system is needed all the more.
The invention collects the various data generated by the operation of a large-scale wireless local area network, and cleans, combines and stores the collected data to form a big data set for use by the intelligent operation and maintenance system.
Disclosure of Invention
The present invention is made to solve the above problems, and its object is to provide a large-scale wireless local area network data acquisition and processing method based on multiple data sources.
The invention provides a large-scale wireless local area network data acquisition and processing method based on multiple data sources, characterized by comprising the following steps:
Step 1, using a DHCP data processing flow to collect and process, through database access, the Internet Protocol (IP) address lease data of the Dynamic Host Configuration Protocol (DHCP) system to obtain terminal type data.
Step 2, using an AAA authentication data processing flow to collect and process, through database access, the user authentication data of the AAA authentication system to obtain user authentication data and AAA structured data.
Step 3, using a Simple Network Management Protocol (SNMP) data processing flow to collect, through the SNMP protocol, the operation data of the network equipment, the wireless terminals and the wireless interference sources, and processing and storing it together with the terminal type data input by the DHCP data processing flow and the user authentication data input by the AAA authentication data processing flow to obtain SNMP structured data.
Step 4, using a log data processing flow to collect and process, through the system log protocol (SYSLOG), the address translation log data of the Network Address Translation (NAT) system, the domain name resolution log data of the Domain Name System (DNS) and the log data of the wireless controller (AC) to obtain log structured data.
Step 5, using an SFLOW traffic data processing flow to collect and process SFLOW traffic data through the SFLOW protocol to obtain SFLOW traffic structured data.
Step 6, storing the AAA structured data obtained by the AAA authentication data processing flow, the SNMP structured data obtained by the SNMP data processing flow, the log structured data obtained by the log data processing flow and the SFLOW traffic structured data obtained by the SFLOW traffic data processing flow in a wireless network core database.
Step 7, inputting the wireless network core database into the intelligent operation and maintenance system.
The large-scale wireless local area network data acquisition and processing method based on multiple data sources provided by the invention may also have the following feature: in step 1, the DHCP data processing flow includes three sub-processes, namely DHCP, Organizationally Unique Identifier (OUI) and Fingerbank, and specifically includes the following sub-steps:
Step 1-1, the DHCP data processing flow collects the IP address lease data of the DHCP system through database access, including the IP address, media access control (MAC) address, lease time, option55, option60, host name (hostname), etc.
Step 1-2, each terminal is matched in turn by local matching, by matching against the IEEE OUI data, and by calling the cloud API provided by Fingerbank.
Step 1-3, the terminal type data is identified and input into the SNMP data processing flow.
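Sub-step 1-1 can be illustrated with a minimal sketch, assuming the lease rows have already been read from the DHCP database. The `Lease` record, its field representation (for example a Unix timestamp for the lease time) and the helper names are hypothetical and are not specified by the source; the sketch only shows how the MAC address can serve as the unique key so that each terminal keeps its most recent lease.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    ip: str
    mac: str
    lease_time: int   # lease start as a Unix timestamp (assumed representation)
    option55: str
    option60: str
    hostname: str


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and strip separators so it can be used as a key."""
    return "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")


def latest_lease_per_mac(leases):
    """Keep only the most recent lease for each MAC address."""
    latest = {}
    for lease in leases:
        key = normalize_mac(lease.mac)
        if key not in latest or lease.lease_time > latest[key].lease_time:
            latest[key] = lease
    return latest


# Hypothetical rows, as they might come back from a periodic database query.
rows = [
    Lease("10.0.0.5", "AA:BB:CC:00:11:22", 100, "1,3,6,15", "MSFT 5.0", "laptop-1"),
    Lease("10.0.0.9", "aa-bb-cc-00-11-22", 200, "1,3,6,15", "MSFT 5.0", "laptop-1"),
    Lease("10.0.0.7", "DE:AD:BE:EF:00:01", 150, "1,121,3,6", "android-dhcp-11", "phone-1"),
]
by_mac = latest_lease_per_mac(rows)
```

Normalizing the MAC before using it as a key avoids treating `AA:BB:...` and `aa-bb-...` as two different terminals.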
The large-scale wireless local area network data acquisition and processing method based on multiple data sources provided by the invention may also have the following feature: step 2 specifically includes the following sub-steps:
Step 2-1, the AAA authentication data processing flow collects the real-time Internet access detail data of the AAA authentication system through database access.
Step 2-2, the data is cleaned and new fields are generated to obtain user authentication data and AAA structured data.
Step 2-3, the user authentication data is input into the SNMP data processing flow, and the AAA structured data is stored in the wireless network core database.
The large-scale wireless local area network data acquisition and processing method based on multiple data sources provided by the invention may also have the following feature: in step 3, the SNMP data processing flow includes two sub-processes, namely operation data processing and archive data processing, and specifically includes the following sub-steps:
Step 3-1, the operation data of the network equipment, the wireless terminals and the wireless interference sources is collected by an SNMP collection process. The collected objects include devices such as wireless access points, wireless controllers, radios, switches, wireless terminals and wireless probes, as well as interference sources detected at the radio frequency (RF) level, such as rogue wireless access points (APs) and rogue terminals.
Step 3-2, the terminal type data input by the DHCP data processing flow and the user authentication data input by the AAA authentication data processing flow are received.
Step 3-3, all collected and received data is processed and stored together.
Step 3-4, a processing process and a monthly archiving process are designed independently for the operation data and the archive data of each type of device. The operation data processing handles device operation data generated as a time series, the archive data processing handles the archive data of all elements, and the processing finally yields the SNMP structured data.
Step 3-5, the SNMP structured data obtained by processing is stored in the wireless network core database.
The large-scale wireless local area network data acquisition and processing method based on multiple data sources provided by the invention may also have the following feature: in step 4, the log data processing flow includes four sub-processes, namely Network Address Translation (NAT) log data processing, Domain Name System (DNS) log data processing, reverse DNS (rDNS) query and wireless controller (AC) log data processing, and specifically includes the following sub-steps:
Step 4-1, the address translation log of the NAT system and the domain name resolution log data of the DNS system are collected through the SYSLOG protocol.
Step 4-2, NAT log data processing, DNS log data processing and rDNS query are combined to match each IP address to its corresponding domain name, obtaining log structured data.
Step 4-3, the log data of the AC is collected through the SYSLOG protocol, processed and stored to obtain log structured data.
Step 4-4, the log structured data obtained in step 4-2 and step 4-3 is stored in the wireless network core database.
The large-scale wireless local area network data acquisition and processing method based on multiple data sources provided by the invention may also have the following feature: step 5 specifically includes the following sub-steps:
Step 5-1, the SFLOW traffic data processing flow collects SFLOW traffic data through the SFLOW protocol.
Step 5-2, the fields in the SFLOW traffic data that are directly related to the access behavior of the wireless terminals are extracted. In this process, SFLOW is configured on the switch ports, the SFLOW sampling information for the outgoing direction is pushed to an SFLOW agent, and the sampling ratio is configured according to actual requirements, obtaining the SFLOW traffic structured data.
Step 5-3, the SFLOW traffic structured data is stored in the wireless network core database.
Action and Effect of the Invention
According to the large-scale wireless local area network data acquisition and processing method based on multiple data sources of the invention, the DHCP data processing flow collects and processes the IP address lease data of the DHCP system through database access to obtain terminal type data. The AAA authentication data processing flow collects and processes the real-time Internet access detail data of the AAA authentication system through database access to obtain user authentication data and AAA structured data. The SNMP data processing flow collects the operation data of the network equipment, the wireless terminals and the wireless interference sources through the SNMP protocol, and processes and stores it together with the terminal type data input by the DHCP data processing flow and the user authentication data input by the AAA authentication data processing flow to obtain SNMP structured data. The log data processing flow collects and processes, through the SYSLOG protocol, the address translation log data of the NAT system, the domain name resolution log data of the DNS system and the log data of the AC to obtain log structured data. The SFLOW traffic data processing flow collects and processes SFLOW traffic data through the SFLOW protocol to obtain SFLOW traffic structured data. All of the structured data obtained is stored in a wireless network core database, and the wireless network core database is input into the intelligent operation and maintenance system. Compared with other methods, this process collects and processes more, and more comprehensive, data sources, covers the whole course of a wireless terminal's use of the wireless network, integrates the collected data with external standard data such as OUI, Fingerbank and rDNS, and stores the unified, structured data in the wireless network core database.
The standard big data set generated by the method can be used to analyze the running state of the wireless network, user behavior and the like, and the structured data can be customized and pushed to application systems according to the requirements of applications such as an intelligent operation and maintenance system.
Drawings
Fig. 1 is a flowchart of a large-scale wireless local area network data acquisition and processing method based on multiple data sources in embodiment 1 of the present invention;
Fig. 2 is a general architecture diagram of a large-scale wireless local area network data acquisition and processing method based on multiple data sources in embodiment 2 of the present invention;
Fig. 3 is a use case diagram of the DHCP data processing flow in embodiment 2 of the present invention;
Fig. 4 is a flowchart of DHCP data processing in embodiment 2 of the present invention;
Fig. 5 is a use case diagram of AAA authentication data processing in embodiment 2 of the present invention;
Fig. 6 is a flowchart of AAA authentication data processing in embodiment 2 of the present invention;
Fig. 7 is a use case diagram of SNMP data processing in embodiment 2 of the present invention;
Fig. 8 is a use case diagram of log data processing in embodiment 2 of the present invention;
Fig. 9 is a diagram of SFLOW traffic data processing in embodiment 2 of the present invention.
Detailed Description
To make the technical means, inventive features, objectives and effects of the present invention easy to understand, the following embodiments describe the large-scale wireless local area network data acquisition and processing method based on multiple data sources in detail with reference to the accompanying drawings.
<Example 1>
Embodiment 1 provides a large-scale wireless local area network data acquisition and processing method based on multiple data sources.
Fig. 1 is a flowchart of the large-scale wireless local area network data acquisition and processing method based on multiple data sources in this embodiment.
As shown in Fig. 1, the large-scale wireless local area network data acquisition and processing method based on multiple data sources of this embodiment includes the following steps:
Step S1, the DHCP data processing flow collects and processes the IP address lease data of the DHCP system through database access to obtain terminal type data.
Step S1 is implemented as follows:
Step S1-1, the DHCP data processing flow collects the IP address lease data of the DHCP system through database access, including the IP address, MAC address, lease time, option55, option60, hostname, etc. Specifically, the DHCP data processing flow periodically queries the DHCP database to obtain the latest IP address lease data and uses the MAC address as the unique identifier.
Step S1-2, each terminal is matched in turn by local matching, by matching against the IEEE OUI data, and by calling the cloud API provided by Fingerbank. Specifically:
Step S1-2-1, check whether the collected data has been updated; if not, end the flow.
Step S1-2-2, check whether there is a next piece of data; if not, update the local matching database and end the flow.
Step S1-2-3, if there is a next piece of data, perform local matching on that piece of data.
Step S1-2-4, check whether the local matching succeeded; if it did, the local database already holds the terminal type information of the device, and processing continues with the next piece of data.
Step S1-2-5, if the local matching did not succeed, perform OUI matching.
Step S1-2-6, then perform the Fingerbank query, and continue with the next piece of data until all data has been processed.
Step S1-3, the terminal type data is identified and input into the SNMP data processing flow.
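The matching loop of step S1-2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the local database and the OUI table are plain dictionaries, and `fingerbank_query` is a hypothetical callable standing in for the Fingerbank cloud API, whose real request format is not described here.

```python
def normalize_mac(mac):
    """Lower-case a MAC address and strip separators."""
    return "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")


def identify_batch(records, local_db, oui_table, fingerbank_query):
    """Local match first; on a miss, OUI match and then the Fingerbank query.

    New matches are written back to the local database only after the whole
    batch has been processed, mirroring the end-of-flow update in the steps.
    """
    new_matches = {}
    for record in records:
        mac = normalize_mac(record["mac"])
        if mac in local_db or mac in new_matches:
            continue  # terminal type already known; move to the next record
        vendor = oui_table.get(mac[:6].upper())   # first 24 bits identify the vendor
        device = fingerbank_query(record)         # hypothetical stand-in for the cloud API
        new_matches[mac] = {"vendor": vendor, "device": device}
    local_db.update(new_matches)
    return new_matches


# Hypothetical data for illustration only.
local_db = {"aabbcc001122": {"vendor": "ExampleCorp", "device": "Laptop"}}
oui_table = {"DEADBE": "Hypothetical Devices Inc."}
calls = []


def fake_fingerbank(record):
    calls.append(record["mac"])
    return "Phone"


records = [
    {"mac": "AA:BB:CC:00:11:22", "option55": "1,3,6,15", "option60": "MSFT 5.0", "hostname": "laptop-1"},
    {"mac": "DE:AD:BE:EF:00:01", "option55": "1,121,3,6", "option60": "android-dhcp-11", "hostname": "phone-1"},
]
new = identify_batch(records, local_db, oui_table, fake_fingerbank)
```

Because locally known terminals are skipped, only unseen MAC addresses cost an external query.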
Step S2, the AAA authentication data processing flow collects and processes the user authentication data of the AAA authentication system through database access to obtain user authentication data and AAA structured data.
Step S2 is implemented as follows:
Step S2-1, the AAA authentication data processing flow collects the real-time Internet access detail data of the AAA authentication system through database access.
Step S2-2, the data is cleaned and new fields are generated to obtain user authentication data and AAA structured data. Specifically:
Step S2-2-1, check whether the collected data has been updated; if not, end the flow.
Step S2-2-2, check whether there is a next piece of data; if not, end the flow.
Step S2-2-3, if there is a next piece of data, delete the unneeded fields of that piece of data.
Step S2-2-4, format the IPv4 and IPv6 fields.
Step S2-2-5, format the terminal MAC address in the MAC field.
Step S2-2-6, delete problematic records from the data.
Step S2-2-7, query the AP archive data to generate the new apname and ssid fields, then continue with the next piece of data until all data has been processed.
Step S2-3, the user authentication data is input into the SNMP data processing flow, and the AAA structured data is stored in the wireless network core database.
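The cleaning in step S2-2 can be sketched with the Python standard library. The field names (`ipv4`, `ipv6`, `mac`, `ap_mac`), the list of dropped fields and the rule for what counts as a problematic record are assumptions made for illustration; the source does not specify them.

```python
import ipaddress
import re

DROP_FIELDS = ("nas_port",)  # hypothetical fields that are not needed downstream


def format_ip(raw):
    """Return the canonical text form of an IPv4/IPv6 address, or None if invalid."""
    try:
        return str(ipaddress.ip_address(raw.strip()))
    except (AttributeError, ValueError):
        return None


def format_mac(raw):
    """Return a MAC address as aa:bb:cc:dd:ee:ff, or None if it is malformed."""
    digits = re.sub(r"[^0-9a-fA-F]", "", raw or "").lower()
    if len(digits) != 12:
        return None
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def clean_record(record, ap_archive):
    """Drop fields, format addresses, reject bad records, add apname and ssid."""
    row = {k: v for k, v in record.items() if k not in DROP_FIELDS}
    row["ipv4"] = format_ip(row.get("ipv4"))
    row["ipv6"] = format_ip(row.get("ipv6"))
    row["mac"] = format_mac(row.get("mac"))
    # Assumed rule: a record with no valid MAC or no valid IP is problematic.
    if row["mac"] is None or (row["ipv4"] is None and row["ipv6"] is None):
        return None
    ap = ap_archive.get(format_mac(row.get("ap_mac")), {})
    row["apname"] = ap.get("apname")
    row["ssid"] = ap.get("ssid")
    return row


# Hypothetical AP archive and raw AAA records.
ap_archive = {"00:11:22:33:44:55": {"apname": "lib-3f-ap01", "ssid": "campus-wifi"}}
raw = {
    "username": "alice",
    "mac": "AABB.CC00.1122",
    "ipv4": " 10.1.2.3 ",
    "ipv6": "2001:0DB8:0000:0000:0000:0000:0000:0001",
    "ap_mac": "00-11-22-33-44-55",
    "nas_port": 7,
}
cleaned = clean_record(raw, ap_archive)
bad = clean_record({"username": "bob", "mac": "zz", "ipv4": "10.1.2.4"}, ap_archive)
```

Formatting every address into one canonical form makes the MAC and IP fields usable as join keys in the later flows.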
Step S3, the SNMP data processing flow collects the operation data of the network equipment, the wireless terminals and the wireless interference sources through the SNMP protocol, and processes and stores it together with the terminal type data input by the DHCP data processing flow and the user authentication data input by the AAA authentication data processing flow to obtain SNMP structured data.
Step S3 is implemented as follows:
Step S3-1, the operation data of the network equipment, the wireless terminals and the wireless interference sources is collected by the SNMP collection process. The collected objects include devices such as wireless access points, wireless controllers, radios, switches, wireless terminals and wireless probes, as well as interference sources detected at the RF level, such as rogue APs and rogue terminals.
Step S3-2, the terminal type data input by the DHCP data processing flow and the user authentication data input by the AAA authentication data processing flow are received.
Step S3-3, all collected and received data is processed and stored together.
Step S3-4, a processing process and a monthly archiving process are designed independently for the operation data and the archive data of each type of device. The operation data processing handles device operation data generated as a time series, the archive data processing handles the archive data of all elements, and the processing finally yields the SNMP structured data.
Step S3-5, the SNMP structured data obtained by processing is stored in the wireless network core database.
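One way to picture steps S3-2 to S3-4 is a join on the terminal MAC address followed by a split into monthly partitions. The sketch below assumes the SNMP samples have already been collected; the field names (`ts`, `mac`, `rssi`) and the choice of the MAC address as the join key are assumptions, since the source does not state how the three inputs are combined.

```python
from collections import defaultdict
from datetime import datetime, timezone


def enrich(samples, terminal_types, auth_users):
    """Attach terminal type (from DHCP) and username (from AAA) to each sample."""
    enriched = []
    for sample in samples:
        row = dict(sample)
        row["terminal_type"] = terminal_types.get(sample["mac"])
        row["username"] = auth_users.get(sample["mac"])
        enriched.append(row)
    return enriched


def partition_by_month(rows):
    """Group time-series rows by calendar month (UTC) for monthly archiving."""
    parts = defaultdict(list)
    for row in rows:
        month = datetime.fromtimestamp(row["ts"], tz=timezone.utc).strftime("%Y-%m")
        parts[month].append(row)
    return dict(parts)


# Hypothetical inputs: two SNMP samples for wireless terminals.
samples = [
    {"ts": 1630454400, "mac": "aa:bb:cc:00:11:22", "rssi": -55},   # 2021-09-01 UTC
    {"ts": 1633046400, "mac": "de:ad:be:ef:00:01", "rssi": -70},   # 2021-10-01 UTC
]
terminal_types = {"aa:bb:cc:00:11:22": "Laptop"}
auth_users = {"aa:bb:cc:00:11:22": "alice"}

rows = enrich(samples, terminal_types, auth_users)
parts = partition_by_month(rows)
```

Terminals with no DHCP or AAA match keep `None` in the added fields rather than being dropped, so the operation data itself is never lost.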
Step S4, the log data processing flow collects and processes, through SYSLOG, the address translation log data of the NAT system, the domain name resolution log data of the DNS system and the log data of the AC to obtain log structured data.
Step S4 is implemented as follows:
Step S4-1, the address translation log of the NAT system and the domain name resolution log data of the DNS system are collected through the SYSLOG protocol.
Step S4-2, NAT log data processing, DNS log data processing and rDNS query are combined to match each IP address to its corresponding domain name, obtaining log structured data.
Step S4-3, the log data of the AC is collected through the SYSLOG protocol, processed and stored to obtain log structured data.
Step S4-4, the log structured data obtained in step S4-2 and step S4-3 is stored in the wireless network core database.
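The matching in step S4-2 can be sketched as a lookup against an index built from the DNS log, with a reverse DNS query as the fallback. The log field names (`qname`, `answers`, `dst_ip`) are assumptions, and `rdns_lookup` is a hypothetical callable; in a real deployment it could wrap something like `socket.gethostbyaddr`, which is replaced here by a stub so the example runs without network access.

```python
def build_dns_index(dns_log):
    """Map each resolved IP to the domain name most recently seen resolving to it.

    Assumes the DNS log entries are ordered by time.
    """
    index = {}
    for entry in dns_log:
        for ip in entry["answers"]:
            index[ip] = entry["qname"]
    return index


def annotate_nat_log(nat_log, dns_index, rdns_lookup):
    """Add a domain to each NAT entry: DNS log first, reverse DNS as fallback."""
    rdns_cache = {}
    annotated = []
    for entry in nat_log:
        dst = entry["dst_ip"]
        domain = dns_index.get(dst)
        source = "dns" if domain else None
        if domain is None:
            if dst not in rdns_cache:
                rdns_cache[dst] = rdns_lookup(dst)   # one query per distinct IP
            domain = rdns_cache[dst]
            source = "rdns" if domain else None
        annotated.append({**entry, "domain": domain, "domain_source": source})
    return annotated


# Hypothetical log entries using documentation address ranges.
dns_log = [{"qname": "example.org", "answers": ["203.0.113.10"]}]
nat_log = [
    {"src_ip": "10.1.2.3", "dst_ip": "203.0.113.10"},
    {"src_ip": "10.1.2.3", "dst_ip": "198.51.100.7"},
    {"src_ip": "10.1.2.4", "dst_ip": "198.51.100.7"},
]
rdns_calls = []


def fake_rdns(ip):
    rdns_calls.append(ip)
    return "host7.example.net"


result = annotate_nat_log(nat_log, dns_index=build_dns_index(dns_log), rdns_lookup=fake_rdns)
```

Caching the reverse lookups keeps the number of rDNS queries proportional to the number of distinct destination addresses rather than the number of log lines.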
Step S5, the SFLOW traffic data processing flow collects and processes SFLOW traffic data through the SFLOW protocol to obtain SFLOW traffic structured data.
Step S5 is implemented as follows:
Step S5-1, the SFLOW traffic data processing flow collects SFLOW traffic data through the SFLOW protocol.
Step S5-2, the fields in the SFLOW traffic data that are directly related to the access behavior of the wireless terminals are extracted. In this process, SFLOW is configured on the switch ports, the SFLOW sampling information for the outgoing direction is pushed to an SFLOW agent, and the sampling ratio is configured according to actual requirements, obtaining the SFLOW traffic structured data.
Step S5-3, the SFLOW traffic structured data is stored in the wireless network core database.
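The field extraction in step S5-2, and the effect of the configured sampling ratio, can be sketched as follows. The sample layout and the set of kept fields are assumptions. The scaling shown is the usual way of estimating totals from 1-in-N packet sampling, where each sampled packet stands for roughly N packets; the source itself only states that a sampling ratio is configured.

```python
KEPT_FIELDS = ("ts", "src_ip", "dst_ip", "src_port", "dst_port", "protocol")  # assumed


def structure_samples(samples, sampling_rate):
    """Keep terminal-related fields and estimate traffic from 1-in-N samples."""
    rows = []
    for sample in samples:
        row = {field: sample.get(field) for field in KEPT_FIELDS}
        row["est_packets"] = sampling_rate
        row["est_bytes"] = sample["frame_length"] * sampling_rate
        rows.append(row)
    return rows


# Hypothetical decoded flow sample, taken with a 1-in-1000 sampling ratio.
samples = [
    {
        "ts": 1630454400,
        "src_ip": "10.1.2.3",
        "dst_ip": "203.0.113.10",
        "src_port": 51514,
        "dst_port": 443,
        "protocol": "tcp",
        "frame_length": 1500,
        "input_port": 12,   # dropped: not in KEPT_FIELDS
    }
]
flows = structure_samples(samples, sampling_rate=1000)
```

A higher sampling ratio lowers the load on the switch and the collector at the cost of coarser estimates, which is why it is left to be tuned per deployment.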
Step S6, the AAA structured data obtained by the AAA authentication data processing flow, the SNMP structured data obtained by the SNMP data processing flow, the log structured data obtained by the log data processing flow and the SFLOW traffic structured data obtained by the SFLOW traffic data processing flow are stored in the wireless network core database.
Step S7, the wireless network core database is input into the intelligent operation and maintenance system.
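Step S6 can be pictured as four structured tables in one database. The sketch below uses an in-memory SQLite database purely for illustration; the source does not name a database engine, and the table and column names are hypothetical.

```python
import sqlite3

# Hypothetical table layout: one table per processing flow.
TABLES = {
    "aaa": ("mac", "username", "apname", "ssid"),
    "snmp": ("ts", "mac", "rssi", "terminal_type"),
    "logs": ("ts", "src_ip", "dst_ip", "domain"),
    "sflow": ("ts", "src_ip", "dst_ip", "est_bytes"),
}


def open_core_db(path=":memory:"):
    """Create the core database with one table per structured data type."""
    conn = sqlite3.connect(path)
    for table, columns in TABLES.items():
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(columns)})")
    return conn


def store(conn, table, rows):
    """Insert dict rows into a known table; missing keys are stored as NULL."""
    if table not in TABLES:
        raise ValueError(f"unknown table: {table}")
    columns = TABLES[table]
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO {table} VALUES ({placeholders})",
        [tuple(row.get(col) for col in columns) for row in rows],
    )
    conn.commit()
    return len(rows)


conn = open_core_db()
store(conn, "aaa", [{"mac": "aa:bb:cc:00:11:22", "username": "alice",
                     "apname": "lib-3f-ap01", "ssid": "campus-wifi"}])
store(conn, "sflow", [{"ts": 1630454400, "src_ip": "10.1.2.3",
                       "dst_ip": "203.0.113.10", "est_bytes": 1500000}])
aaa_count = conn.execute("SELECT COUNT(*) FROM aaa").fetchone()[0]
user = conn.execute("SELECT username FROM aaa WHERE mac = ?",
                    ("aa:bb:cc:00:11:22",)).fetchone()[0]
```

Table names are checked against a fixed whitelist and values are bound as parameters, so no untrusted text is interpolated into the SQL.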
< example 2>
In example 2, a specific application of example 1 is provided.
The specific implementation manner of this embodiment is:
fig. 2 is a general architecture diagram of a large-scale wireless local area network data acquisition and processing method based on multiple data sources in this embodiment.
As shown in fig. 2, the method for acquiring and processing data of a large-scale wireless local area network based on multiple data sources in this embodiment mainly includes five types of processing flows, which are DHCP data processing, SNMP data processing, log data processing, AAA authentication data processing, and SFLOW traffic data processing, respectively. The DHCP data processing comprises three subprocesses of DHCP, OUI and Fingerbank; the SNMP data processing comprises two sub-processes of operation data processing and archive data processing; the log data processing comprises four sub-processes of NAT, DNS, rDNS and AC.
The specific implementation steps of this embodiment are as follows:
and step S1, acquiring and processing the IP address lease data of the DHCP system by accessing the database through the DHCP data processing flow to obtain the terminal type data.
Fig. 3 is an exemplary diagram of a DHCP data processing flow in this embodiment.
As shown in Fig. 3, the DHCP system stores IP address lease data in a database. The DHCP data acquisition use case obtains the lease data by querying that database, and the terminal identification use case identifies the terminal type through local matching, OUI matching, and Fingerbank query. In addition, the OUI data update use case periodically fetches the OUI data published by IEEE, preprocesses it, and stores it as a JSON file for OUI matching. The Fingerbank query use case matches terminal types by calling the cloud API provided by Fingerbank. For terminals that remain unidentified, an administrator can fill in the terminal information manually.
The OUI (organizationally unique identifier) registry is the global MAC address allocation table: the first 24 bits of a MAC address identify the manufacturer. IEEE publishes the registry on its website as the file oui.txt. The DHCP data processing flow periodically downloads http://standards-oui.ieee.org/OUI/oui.txt, preprocesses the data, and stores it as a JSON file to simplify OUI matching. Fingerbank is an open source project that provides a free cloud API accepting up to 250 query requests per minute. Because identification accuracy improves with the amount of data submitted, the DHCP data processing flow of the invention submits four parameters in each Fingerbank query: MAC, option55, option60, and hostname.
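The OUI preprocessing described above can be sketched as follows. This is a minimal illustration, not the implementation of the invention: it assumes the "(hex)" line layout used by oui.txt, the function names parse_oui_text and oui_lookup are hypothetical, and the vendor entries in SAMPLE are made up for the example.

```python
import json
import re

# Matches the "(hex)" lines of oui.txt, e.g. "AA-BB-CC   (hex)\t\tVendor".
OUI_LINE = re.compile(
    r"^([0-9A-Fa-f]{2})-([0-9A-Fa-f]{2})-([0-9A-Fa-f]{2})\s+\(hex\)\s+(.+?)\s*$"
)


def parse_oui_text(text):
    """Build a {6-hex-digit prefix: vendor} table from oui.txt content."""
    table = {}
    for line in text.splitlines():
        m = OUI_LINE.match(line)
        if m:
            prefix = "".join(m.group(1, 2, 3)).lower()
            table[prefix] = m.group(4)
    return table


def oui_lookup(table, mac):
    """Return the vendor for a MAC address in any common notation, or None."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        return None
    return table.get(digits[:6])  # first 24 bits identify the manufacturer


# Made-up sample in the oui.txt layout (not real registry entries).
SAMPLE = (
    "AA-BB-CC   (hex)\t\tExample Vendor A\n"
    "AABBCC     (base 16)\t\tExample Vendor A\n"
    "\t\t\t\t1 Example Street\n"
    "\n"
    "11-22-33   (hex)\t\tExample Vendor B\n"
)

table = parse_oui_text(SAMPLE)
oui_json = json.dumps(table, indent=2)  # what would be written to the JSON file
```

Keying the table by the lowercase six-digit prefix lets the lookup accept MAC addresses in any vendor notation, since separators and letter case are stripped before matching.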
Fig. 4 is a flow chart of DHCP data processing in an embodiment of the present invention.
As shown in Fig. 4, the DHCP data processing flow in this embodiment is implemented as follows:
Step S1-1: the DHCP data processing process periodically queries the DHCP database to obtain the latest IP address lease data, using the MAC address as the unique identifier.
Step S1-2: check whether the data has been updated. If not, end the flow.
Step S1-3: check whether a next record exists. If not, update the local matching database and end the flow.
Step S1-4: if a next record exists, perform local matching on that record.
Step S1-5: check whether local matching succeeded. If it did, the local database already holds the terminal type information for the device, and processing continues with the next record.
Step S1-6: if local matching did not succeed, perform OUI matching.
Step S1-7: then perform the Fingerbank query, and continue with the next record.
Step S1-8: after all records are processed, write the newly matched terminal information to the local matching database.
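The matching order in steps S1-3 to S1-8 can be sketched as below. This is an illustrative sketch under stated assumptions: the record layout (a dict with a "mac" key in xxxx-xxxx-xxxx form), the function name identify_terminals, and the injected fingerbank_query callable are all hypothetical, and the real Fingerbank cloud API call is deliberately replaced by a caller-supplied function.

```python
def identify_terminals(leases, local_db, oui_table, fingerbank_query):
    """Identify terminal types for lease records.

    leases           -- iterable of dicts with at least a "mac" key
    local_db         -- dict {mac: terminal info}, the local matching database
    oui_table        -- dict {6-hex-digit prefix: vendor}
    fingerbank_query -- callable(lease) -> device name or None (stand-in
                        for the cloud API query; hypothetical signature)
    """
    newly_matched = {}
    for lease in leases:
        mac = lease["mac"]
        # S1-4 / S1-5: local matching; a hit means nothing more to do.
        if mac in local_db:
            continue
        info = {}
        # S1-6: OUI matching on the first 24 bits of the MAC address.
        vendor = oui_table.get(mac.replace("-", "")[:6])
        if vendor is not None:
            info["vendor"] = vendor
        # S1-7: Fingerbank query (MAC, option55, option60, hostname).
        device = fingerbank_query(lease)
        if device is not None:
            info["device"] = device
        if info:
            newly_matched[mac] = info
    # S1-8: write the newly matched terminals back to the local database.
    local_db.update(newly_matched)
    return newly_matched
```

Passing the query function in as a parameter keeps the sketch runnable offline and makes it easy to add rate limiting around the cloud query, which matters given the per-minute request limit mentioned above.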
Step S2: use the AAA authentication data processing flow to collect and process, through database access, the real-time internet access detail data of the AAA authentication system, obtaining user authentication data and AAA structured data.
Fig. 5 is an exemplary diagram of the AAA authentication data processing flow in this embodiment.
As shown in Fig. 5, the AAA data processing use case periodically queries the real-time internet access detail data from the AAA authentication system database. After data cleaning and generation of new fields, it stores the structured data classified by NAS IP and date, and sends the user authentication data to the SNMP data processing process.
Fig. 6 is a flowchart of AAA authentication data processing in this embodiment.
As shown in Fig. 6, the AAA authentication data processing flow in this embodiment is implemented as follows:
Step S2-1: the AAA data processing process periodically obtains data through database queries.
Step S2-2: check whether the data has been updated. If not, end the flow.
Step S2-3: check whether a next record exists. If not, end the flow.
Step S2-4: if a next record exists, perform field deletion on it. The raw data structure is generally the AAA authentication system's extension of the standard RADIUS Accounting data structure and contains many fields. The AAA authentication system collected in this embodiment has 106 fields, of which the system needs only 21; the field deletion flow removes the rest.
Step S2-5: format the IPv4 and IPv6 fields, replacing the empty IPv4 address '0.0.0.0' and the empty IPv6 address with the null value nan.
Step S2-6: format the terminal MAC address in the MAC field. Depending on the manufacturer, terminal MAC addresses are segmented in several different ways, and some use uppercase letters while others use lowercase. The MAC address is a core field for network equipment, and almost all network equipment uses it as a unique identifier, so the MAC address formatting flow unifies all MAC addresses to the form 'xxxx-xxxx-xxxx' with lowercase letters.
Step S2-7: delete problematic records, such as invalid records in which 'rad_online_id', the unique identification field of the RADIUS Accounting information, is empty.
Step S2-8: query the AP archive data to generate the new fields apname and ssid. The raw AAA data contains a field 'station_id' that identifies the AP and SSID to which the terminal is connected. It may take one of three formats: [AP MAC]:SSID, [AP Name]:SSID, or [AP Serial]:SSID. In this embodiment, the raw value is split into an AP identifier and an SSID, the AP archive is queried to map the AP identifier to the AP name, and the apname and ssid fields are generated. Processing then continues with the next record.
Step S2-9: end the flow after all records are processed.
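The per-record cleaning in steps S2-4 to S2-8 can be sketched as follows. This is a minimal sketch under stated assumptions, not the embodiment's code: apart from rad_online_id and station_id, the field names ("mac", "ipv4", "ipv6") are hypothetical; treating "" and "::" as the empty IPv6 value is an assumption; and splitting station_id at its last colon assumes the SSID itself contains no colon.

```python
import math
import re

NAN = math.nan
EMPTY_IPV4 = {"", "0.0.0.0"}
EMPTY_IPV6 = {"", "::"}  # assumption: what counts as an empty IPv6 value


def normalize_mac(raw):
    """Unify any MAC notation to lowercase 'xxxx-xxxx-xxxx' (step S2-6)."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", raw).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {raw!r}")
    return "-".join(digits[i:i + 4] for i in range(0, 12, 4))


def split_station_id(station_id):
    """Split '[AP identifier]:SSID' at the last colon (step S2-8)."""
    ap_ident, _, ssid = station_id.rpartition(":")
    return ap_ident, ssid


def clean_record(record, keep_fields, ap_archive):
    """Return the cleaned record, or None if the record is invalid."""
    # S2-4: keep only the fields the system needs.
    row = {k: v for k, v in record.items() if k in keep_fields}
    # S2-5: replace empty IP addresses with nan.
    if row.get("ipv4", "") in EMPTY_IPV4:
        row["ipv4"] = NAN
    if row.get("ipv6", "") in EMPTY_IPV6:
        row["ipv6"] = NAN
    # S2-6: unify the terminal MAC address format.
    row["mac"] = normalize_mac(row["mac"])
    # S2-7: drop records whose unique identifier is empty.
    if not row.get("rad_online_id"):
        return None
    # S2-8: derive apname and ssid via the AP archive
    # (here a dict {AP identifier: AP name}; falls back to the identifier).
    ap_ident, ssid = split_station_id(row["station_id"])
    row["apname"] = ap_archive.get(ap_ident, ap_ident)
    row["ssid"] = ssid
    return row
```

For example, a record with MAC "AA:BB:CC:DD:EE:FF" and station_id "AP-SERIAL-001:campus-wifi" would come out with mac "aabb-ccdd-eeff", ssid "campus-wifi", and whatever AP name the archive holds for "AP-SERIAL-001".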
Step S3: use the SNMP data processing flow to collect, through the SNMP protocol, the operation data of network devices, wireless terminals, and wireless interference sources, and process and store it together with the terminal type data input by the DHCP data processing flow and the user authentication data input by the AAA authentication data processing flow, obtaining SNMP structured data.
Fig. 7 is an exemplary diagram illustrating the SNMP data processing flow in this embodiment.
As shown in Fig. 7, the SNMP collection process periodically collects network operation data from the network devices. Because the MIBs of different manufacturers differ considerably, each manufacturer requires its own collection and processing procedure. This embodiment supports SNMP collection for the devices of five manufacturers common in large-scale wireless local area networks, denoted VD1-VD5. The data preprocessing use case preprocesses the data input by the DHCP data processing flow and the AAA authentication data processing flow, and the SNMP data processing use case then cleans all data and stores it in structured form. In particular, some manufacturers' devices index their data by AP serial number and others by AP MAC address; the SNMP data processing process standardizes this so that the AP MAC address is used uniformly as the unique identifier of an AP.
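The index standardization described above might look like the following sketch. It is illustrative only: the row layout, the key names "ap_mac" and "ap_serial", the serial-to-MAC mapping, and the function names are assumptions made for the example rather than details given in this embodiment.

```python
import re


def normalize_mac(raw):
    """Unify any MAC notation to lowercase 'xxxx-xxxx-xxxx'."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", raw).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {raw!r}")
    return "-".join(digits[i:i + 4] for i in range(0, 12, 4))


def index_by_ap_mac(rows, serial_to_mac):
    """Re-key vendor rows so every AP is identified by its MAC address.

    rows          -- dicts carrying either "ap_mac" or "ap_serial"
    serial_to_mac -- dict {AP serial number: AP MAC}, e.g. from the AP archive
    """
    indexed = {}
    for row in rows:
        if "ap_mac" in row:
            key = normalize_mac(row["ap_mac"])
        else:
            # Vendors that index by serial number are mapped to the AP MAC.
            key = normalize_mac(serial_to_mac[row["ap_serial"]])
        indexed[key] = {
            k: v for k, v in row.items() if k not in ("ap_mac", "ap_serial")
        }
    return indexed
```

With a single key format, rows collected from different vendors can be merged or joined with AAA and DHCP data without per-vendor special cases.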
The operation data processing process manages the operation database. Operation data is divided into basic data and sequence data. The basic data is a real-time state snapshot of each AC and of the APs and RFs it manages. The sequence data is network operation data generated in time order and covers AC, AP, RF, STA, RAP, RSTA, and SS.
The archive data processing process manages the lifetime archives of all equipment. Based on the basic data and sequence data from operation data processing, it establishes and updates a lifetime archive database of all elements, including AC, AP, RF, STA, RAP, RSTA, SS, and User. The AP archive integrates the RF archive and the AP location data. The User archive records only the correspondence between user names and terminals and contains no private data such as user identity. An administrator can query the basic data and sequence data through a data query function and can look up the corresponding equipment, users, and related data by unique identifier.
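One way to picture how sequence data feeds a lifetime archive is the sketch below. The archive contents are not specified in this embodiment, so the entry fields (first_seen, last_seen, samples), the sample layout ("id", "ts"), and the function name update_archive are purely illustrative assumptions.

```python
def update_archive(archive, samples):
    """Fold time-ordered or unordered samples into a lifetime archive.

    archive -- dict {unique identifier: archive entry}, updated in place
    samples -- iterable of dicts with "id" (unique identifier, e.g. a MAC)
               and "ts" (numeric timestamp)
    """
    for sample in samples:
        entry = archive.setdefault(
            sample["id"],
            {"first_seen": sample["ts"], "last_seen": sample["ts"], "samples": 0},
        )
        # Widen the observed lifetime and count the observation.
        entry["first_seen"] = min(entry["first_seen"], sample["ts"])
        entry["last_seen"] = max(entry["last_seen"], sample["ts"])
        entry["samples"] += 1
    return archive
```

Because entries are keyed by the unique identifier, an administrator-facing query by MAC address reduces to a single dictionary lookup in this sketch.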
Step S4: use the log data processing flow to collect and process, through SYSLOG, the address translation log data of the NAT system, the domain name resolution log data of the DNS system, and the log data of the AC, obtaining log structured data.
Fig. 8 is an exemplary diagram of the log data processing flow in this embodiment.
As shown in Fig. 8, the NAT sub-process structures and stores the NAT log data, and matches each external destination IP address to its domain name through the DNS sub-process and the rDNS sub-process. The DNS sub-process structures and stores the DNS log data, receives the IP address data input by the NAT sub-process, and uses the domain name matching use case to find the domain name corresponding to each IP address in the DNS log library. If no domain name can be matched in the local DNS log library, the IP address is sent to the rDNS sub-process, which queries it through the POST method of MxToolbox.
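The two-stage domain matching described above can be sketched as follows. This is a minimal sketch, assuming a hypothetical DNS log record layout ("qname", "answers") and a caller-supplied rdns_lookup function; the actual MxToolbox request is not reproduced here, and the addresses used with it should be treated as examples only.

```python
def build_dns_index(dns_records):
    """Index structured DNS log records as {resolved IP: queried domain}."""
    index = {}
    for rec in dns_records:
        for ip in rec["answers"]:
            index[ip] = rec["qname"]  # later records overwrite earlier ones
    return index


def match_domain(ip, dns_index, rdns_lookup):
    """Match an external destination IP to a domain name.

    Tries the local DNS log library first and falls back to the reverse
    DNS sub-process. rdns_lookup is a callable(ip) -> domain or None
    (a stand-in for the external reverse query; hypothetical signature).
    Returns (domain or None, source label).
    """
    domain = dns_index.get(ip)
    if domain is not None:
        return domain, "dns_log"
    domain = rdns_lookup(ip)
    if domain is not None:
        return domain, "rdns"
    return None, "unmatched"
```

Checking the local DNS log first means the external reverse query is only issued for addresses that no local client resolved, which keeps the number of outbound requests small.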
The AC sub-process is relatively independent. It processes and stores the log data of the AC, which covers AC, AP, RF, 802.1X authentication, and so on. Because the AC log structures of the five manufacturers differ, five separate processes, from the VD1 log to the VD5 log, are likewise needed. The AC log collected through SYSLOG and the AC data collected through SNMP partly overlap and complement each other: the SNMP data is more comprehensive, while the AC log is closer to real time. Applications with strict real-time requirements can therefore use the AC log data and fill in missing fields from the SNMP data, and applications without such requirements can rely entirely on the SNMP data. An administrator can query the NAT, DNS, and AC logs and statistical data through the operation interface of the log query function.
SYSLOG is a standard log structure stored as text, so each log entry can be processed and structured easily. The NAT log and DNS log structures are fixed, whereas AC logs differ widely between manufacturers; the implementation is described below using the AC log of VD1 as an example. The body of a SYSLOG entry comprises PRI, Header, and Message. PRI is composed of the Facility and Severity defined by SYSLOG, Header is composed of the time and the host name or IP address, and Message is the log content. Logs from different manufacturers differ mainly in the Message part, while the preceding SYSLOG structure is the same. First, the SYSLOG structure of the VD1 log is processed and decomposed into time, hostname, type, and information. The type is further subdivided into major classes and subclasses, and the 12 major classes and 41 subclasses are counted along with their share of the total. The 20 most frequent subclasses are the log data collected by the invention, and together they account for more than 99% of all entries. For these 20 log types, the structure definition is looked up in the official VD1 documentation, a matching pattern is written for each type, and the matched data is structured. The AC logs of the other VDs are processed in the same way according to their official documentation.
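The decomposition of a SYSLOG entry into PRI, Header, and Message, followed by per-type pattern matching, can be sketched as below. The sketch assumes the classic BSD syslog layout, in which PRI equals Facility times 8 plus Severity. The STA_ASSOC message format and its pattern are invented for illustration and are not VD1's real log format; the names parse_syslog, structure_message, and TYPE_PATTERNS are likewise hypothetical.

```python
import re

# "<PRI>Mmm dd hh:mm:ss HOST MESSAGE" (classic BSD syslog layout).
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<time>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<message>.*)$"
)

# One pattern per log type to collect. The entry below is a made-up
# format for illustration only, not taken from any vendor document.
TYPE_PATTERNS = {
    "STA_ASSOC": re.compile(r"^STA_ASSOC: mac=(?P<mac>\S+) ap=(?P<apname>\S+)$"),
}


def parse_syslog(line):
    """Split a syslog line into facility, severity, time, host, message."""
    m = SYSLOG_RE.match(line)
    if m is None:
        return None
    pri = int(m.group("pri"))
    return {
        "facility": pri >> 3,  # PRI = facility * 8 + severity
        "severity": pri & 7,
        "time": m.group("time"),
        "host": m.group("host"),
        "message": m.group("message"),
    }


def structure_message(message):
    """Match the Message part against the known types; None if no match."""
    for log_type, pattern in TYPE_PATTERNS.items():
        m = pattern.match(message)
        if m:
            return {"type": log_type, **m.groupdict()}
    return None
```

Keeping the vendor-specific knowledge in the TYPE_PATTERNS table means supporting another manufacturer only requires a new set of patterns, while the shared PRI and Header parsing stays unchanged.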
Step S5: use the SFLOW traffic data processing flow to collect and process SFLOW traffic data through the SFLOW protocol, obtaining SFLOW traffic structured data.
Fig. 9 is an exemplary diagram of the SFLOW traffic data processing flow in this embodiment.
As shown in Fig. 9, SFLOW must be configured on the uplink port of the core switch so that the SFLOW sampling information in the outgoing direction is pushed to the SFLOW agent, with the sampling ratio configured according to actual requirements. The SFLOW agent is implemented with the official tool sflowtool, which is instructed by command to receive the required SFLOW fields on UDP port 6343. The fields are unixSecondsUTC, agent, sampleType, srcIP, dstIP, srcMAC, dstMAC, inputPort, outputPort, headerProtocol, sampledPacketSize, sampleSequenceNo, samplePool, IPProtocol, TCPSrcPort, TCPDstPort, TCPFlags, extendedType, and nextHop. The SFLOW data processing process reads data from the SFLOW agent and renames some fields to the globally uniform field names, for example srcIP to ipaddress and srcMAC to macaddress, so that the data can be integrated easily with the data generated by the other data processing processes.
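The field renaming step can be sketched as follows. This is a small illustration under stated assumptions: the input is taken to be "key value" text lines, one field per line, which is an assumed layout for the agent output; only the two renames named in the text (srcIP and srcMAC) are included, and the function names are hypothetical.

```python
# Renames given in the description; other fields keep their names.
FIELD_RENAMES = {"srcIP": "ipaddress", "srcMAC": "macaddress"}


def parse_sample(lines):
    """Collect 'key value' lines of one sample into a dict (assumed layout)."""
    record = {}
    for line in lines:
        key, sep, value = line.strip().partition(" ")
        if sep:  # skip lines that carry no value
            record[key] = value
    return record


def unify_field_names(record, renames=FIELD_RENAMES):
    """Rename selected fields to the globally uniform field names."""
    return {renames.get(key, key): value for key, value in record.items()}
```

After renaming, an SFLOW record shares the ipaddress and macaddress keys with records from the other flows, so joining them on those keys needs no per-source translation.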
Step S6: store the AAA structured data obtained by the AAA authentication data processing flow, the SNMP structured data obtained by the SNMP data processing flow, the log structured data obtained by the log data processing flow, and the SFLOW traffic structured data obtained by the SFLOW traffic data processing flow in a wireless network core database.
Step S7: input the wireless network core database into the intelligent operation and maintenance system.
Functions and effects of the embodiments
According to the large-scale wireless local area network data acquisition and processing method based on multiple data sources in Embodiments 1 and 2: the DHCP data processing flow collects and processes the IP address lease data of the DHCP system through database access to obtain terminal type data; the AAA authentication data processing flow collects and processes the real-time internet access detail data of the AAA authentication system through database access to obtain user authentication data and AAA structured data; the SNMP data processing flow collects the operation data of network devices, wireless terminals, and wireless interference sources through the SNMP protocol, and processes and stores it together with the terminal type data from the DHCP data processing flow and the user authentication data from the AAA authentication data processing flow to obtain SNMP structured data; the log data processing flow collects and processes, through the SYSLOG protocol, the address translation log data of the NAT system, the domain name resolution log data of the DNS system, and the log data of the AC to obtain log structured data; and the SFLOW traffic data processing flow collects and processes SFLOW traffic data through the SFLOW protocol to obtain SFLOW traffic structured data. All the structured data obtained is stored in a wireless network core database, which is input into the intelligent operation and maintenance system. Compared with other methods, this process collects and processes more data sources with broader coverage, spans the entire course of a wireless terminal's use of the wireless network, integrates the collected data with external standard data such as OUI, Fingerbank, and rDNS, and stores the unified, structured data in the wireless network core database.
The standard big data set generated by the embodiment can be used for analyzing the running state of the wireless network, the user behavior and the like, and the structured data can be customized and pushed to an application system according to the requirements of various applications such as an intelligent operation and maintenance system.
The above embodiments are preferred examples of the present invention, and are not intended to limit the scope of the present invention.

Claims (6)

1. The large-scale wireless local area network data acquisition and processing method based on multiple data sources is characterized by comprising the following steps of:
step 1, acquiring and processing internet protocol address lease data of a dynamic host configuration protocol system by adopting a dynamic host configuration protocol data processing flow through database access to obtain terminal type data;
step 2, collecting and processing user authentication data of the AAA authentication system by adopting an AAA authentication data processing flow through database access to obtain user authentication data and AAA structured data;
step 3, adopting a simple network management protocol data processing flow to acquire running data of network equipment, running data of a wireless terminal and running data of a wireless interference source through a simple network management protocol, and processing and storing the terminal type data input by the dynamic host configuration protocol data processing flow and the user authentication data input by the AAA authentication data processing flow together to obtain simple network management protocol structured data;
step 4, acquiring and processing address translation log data of a network address translation system, domain name resolution log data of a domain name system and log data of a wireless controller by adopting a log data processing flow through a system log protocol to obtain log structured data;
step 5, collecting and processing SFLOW flow data by adopting an SFLOW flow data processing flow through an SFLOW protocol to obtain SFLOW flow structured data;
step 6, storing the AAA structured data obtained by the AAA authentication data processing flow, the SNMP structured data obtained by the SNMP data processing flow, the log structured data obtained by the log data processing flow and the SFLOW traffic structured data obtained by the SFLOW traffic data processing flow into a wireless network core database;
and step 7, inputting the wireless network core database into an intelligent operation and maintenance system.
2. The large-scale wireless local area network data collection and processing method based on multiple data sources as claimed in claim 1, wherein:
in step 1, the data processing flow of the dynamic host configuration protocol includes three sub-processes of a dynamic host configuration protocol, an organization unique identifier and Fingerbank, and specifically includes the following sub-steps:
step 1-1, the dynamic host configuration protocol data processing flow acquires, through database access, internet protocol address lease data of the dynamic host configuration protocol system, the lease data including internet protocol data, media access control layer data, lease time, option55, option60, host name, etc.;
step 1-2, respectively carrying out local matching on a terminal, matching with the organization unique identifier of IEEE, and matching by calling the cloud API provided by the Fingerbank;
and step 1-3, identifying terminal type data and inputting the terminal type data into the simple network management protocol data processing flow.
3. The large-scale wireless local area network data collection and processing method based on multiple data sources as claimed in claim 1, wherein:
wherein, in the step 2, the following substeps are specifically included:
step 2-1, the AAA authentication data processing flow accesses and collects real-time internet surfing detail data of the AAA authentication system through a database;
step 2-2, cleaning data and generating new fields to obtain the user authentication data and the AAA structured data;
and step 2-3, inputting the user authentication data into the simple network management protocol data processing process, and storing the AAA structured data into the wireless network core database.
4. The large-scale wireless local area network data collection and processing method based on multiple data sources as claimed in claim 1, wherein:
in step 3, the simple network management protocol data processing flow includes two sub-processes, namely an operation data processing process and an archive data processing process, and specifically includes the following sub-steps:
step 3-1, collecting the operation data of the network equipment, the wireless terminal and the wireless interference source through the simple network management protocol collection process, wherein the collected objects specifically comprise: equipment such as wireless access points, wireless controllers, wireless radio frequencies, switches, wireless terminals and wireless probes, and interference sources detected by radio frequency such as rogue wireless access points and rogue terminals;
step 3-2, receiving the terminal type data input by the dynamic host configuration protocol data processing flow and the user authentication data input by the AAA authentication data processing flow;
step 3-3, all the collected and received data are processed and stored together;
step 3-4, independently designing a processing process and a monthly archiving process for the operation data and the archive data of each type of equipment, wherein the operation data processing process is used for processing the equipment operation data generated according to a time sequence, the archive data processing process is used for processing the archive data of all elements, and finally, the simple network management protocol structured data is obtained through processing;
and step 3-5, storing the simple network management protocol structured data obtained by processing into the wireless network core database.
5. The large-scale wireless local area network data collection and processing method based on multiple data sources as claimed in claim 1, wherein:
in step 4, the log data processing flow includes four sub-processes of network address translation log data processing, domain name system log data processing, domain name system reverse query and wireless controller log data processing, and specifically includes the following sub-steps:
step 4-1, collecting the address translation log data of the network address translation system and the domain name resolution log data of the domain name system through the system log protocol;
step 4-2, combining the network address translation log data processing, the domain name system log data processing and the domain name system reverse query, and matching the domain name corresponding to the internet protocol address to obtain log structured data;
step 4-3, acquiring log data of the wireless controller through the system log protocol, processing and storing to obtain log structured data;
and step 4-4, storing the log structured data obtained by processing in the step 4-2 and the step 4-3 into the wireless network core database.
6. The large-scale wireless local area network data collection and processing method based on multiple data sources as claimed in claim 1, wherein:
wherein, in the step 5, the following substeps are specifically included:
step 5-1, collecting the SFLOW flow data through an SFLOW protocol by adopting the SFLOW flow data processing flow;
step 5-2, extracting the fields in the SFLOW flow data that are directly related to the access behavior of the wireless terminal, wherein SFLOW related configuration is performed on a port of a switch, the SFLOW sampling information in the outgoing direction is pushed to an SFLOW agent, and the sampling ratio is configured according to actual requirements, to obtain the SFLOW flow structured data;
and step 5-3, storing the SFLOW traffic structured data to the wireless network core database.
CN202111033400.9A 2021-09-03 2021-09-03 Large-scale wireless local area network data acquisition and processing method based on multiple data sources Active CN114095800B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111033400.9A CN114095800B (en) 2021-09-03 2021-09-03 Large-scale wireless local area network data acquisition and processing method based on multiple data sources

Publications (2)

Publication Number Publication Date
CN114095800A true CN114095800A (en) 2022-02-25
CN114095800B CN114095800B (en) 2023-08-25

Family

ID=80296352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111033400.9A Active CN114095800B (en) 2021-09-03 2021-09-03 Large-scale wireless local area network data acquisition and processing method based on multiple data sources

Country Status (1)

Country Link
CN (1) CN114095800B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101150594A (en) * 2007-10-18 2008-03-26 中国联合通信有限公司 An integrated access method and system for mobile cellular network and WLAN
CN103200030A (en) * 2013-03-12 2013-07-10 福建星网锐捷网络有限公司 Network management device and method
CN103532752A (en) * 2013-10-10 2014-01-22 北京首信科技股份有限公司 Management device and method for realizing integration of surfing logs of mobile internet users
CN103905440A (en) * 2014-03-28 2014-07-02 哈尔滨工程大学 Network security situation awareness analysis method based on log and SNMP information fusion
CN106878092A (en) * 2017-03-28 2017-06-20 上海以弈信息技术有限公司 A kind of network O&M monitor in real time of multi-source heterogeneous data fusion is presented platform with analysis
CN107577588A (en) * 2017-09-26 2018-01-12 北京中安智达科技有限公司 A kind of massive logs data intelligence operational system
CN108183809A (en) * 2016-12-08 2018-06-19 国家电网公司 Improve the log equipment lifecycle management platform implementation method of O&M efficiency
CN110719194A (en) * 2019-09-12 2020-01-21 中国联合网络通信集团有限公司 Network data analysis method and device
CN112671592A (en) * 2021-01-16 2021-04-16 鸣飞伟业技术有限公司 Network equipment operation and maintenance management system
CN112884452A (en) * 2021-03-17 2021-06-01 北京幂数科技有限公司 Intelligent operation and maintenance multi-source data acquisition visualization analysis system
CN113157994A (en) * 2021-03-02 2021-07-23 昆山九华电子设备厂 Multi-source heterogeneous platform data processing method
CN113312340A (en) * 2021-04-09 2021-08-27 国网陕西省电力公司电力科学研究院 Integrated method and system for processing, fusing and displaying multi-source data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
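Yang Zhihong; Li Li; Cai Shigui; Li Wanpeng: "WLAN network intelligent management system", 电信快报 (Telecommunications Information), no. 08 *
Lin Hongwei: "Intranet user monitoring system model based on the SNMP protocol", 贵阳学院学报(自然科学版) (Journal of Guiyang University, Natural Science Edition), no. 03 *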

Also Published As

Publication number Publication date
CN114095800B (en) 2023-08-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant