CN116112407A - Network flow data acquisition system - Google Patents

Publication number
CN116112407A
Authority
CN
China
Prior art keywords
flow
data
module
network
unit
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211694321.7A
Other languages
Chinese (zh)
Inventor
杨学濬
范铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Xuedeng Information Technology Co ltd
Original Assignee
Shanghai Xuedeng Information Technology Co ltd
Application filed by Shanghai Xuedeng Information Technology Co ltd
Priority to CN202211694321.7A
Publication of CN116112407A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/12 Network monitoring probes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/028 Capturing of monitoring data by filtering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0281 Proxies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to the technical field of traffic acquisition systems and discloses a network traffic data acquisition system comprising a plurality of proxy servers and a total server. Each proxy server collects network traffic, partitions, stores and analyzes it, and then transmits the analysis data together with a compressed package of the full traffic data to the total server for management. Each proxy server comprises a data acquisition module, a partition module, a storage module and an analysis module. Because traffic is collected by several proxy servers and is scanned by class, processed, filtered, stored and analyzed during collection, the direction of data collection is optimized and repeated mirror collection is reduced, as is exposure to anonymous traffic, network intrusion and malicious traffic. This improves network security, lowers the probability of faults, and promotes reasonable and effective collection of network traffic.

Description

Network flow data acquisition system
Technical Field
The invention relates to the technical field of flow acquisition systems, in particular to a network flow data acquisition system.
Background
With the rapid development of computer science and technology, the volume of data transmitted over networks keeps increasing, and network traffic statistics have become a matter of concern. Among the many methods of measuring network traffic, a proxy server accesses and collects network traffic through a proxy IP. Compared with collection at a router, this not only increases speed but also avoids being blocked by a website's defense mechanisms, so the collection results are better. The proxy server thus acts as the data collector, and analyzing the traffic IPs it obtains can be expected to benefit network research, network management and traffic data collection.
As networks and services converge ever more deeply, network attacks and intrusions by malicious traffic occur in large numbers, and network security has become a problem that cannot be ignored. Existing Internet content is highly repetitive, so the collected data becomes excessively large, and some systems collect data merely by sending a mirror of the traffic to an analysis server. As a result, traffic cannot be collected reasonably and effectively, and faults are likely to occur.
Disclosure of Invention
The invention provides a network traffic data acquisition system that scans by class, processes, filters, stores and analyzes traffic during collection and thereby optimizes the direction of data collection. It addresses the problems described in the Background: network attacks and intrusions by malicious traffic occur in large numbers as networks and services converge, network security can no longer be ignored, existing Internet content is highly repetitive so the collected data becomes excessively large, and collecting data merely by sending a mirror of the traffic to an analysis server makes faults likely.
The invention provides the following technical scheme: a network traffic data acquisition system comprising a plurality of proxy servers and a total server, wherein each proxy server collects network traffic, partitions, stores and analyzes it, and then transmits the analysis data together with a compressed package of the full traffic data to the total server for management;
the proxy server comprises a data acquisition module, a partition module, a storage module and an analysis module; the data acquisition module collects, in real time, data such as the total network usage, the current total network speed and the network usage speed of each IP address in software and web pages, converts the data into file form and stores it in a traffic file;
the partition module compiles statistics on the traffic files, partitions the files by traffic type, traffic format and time-interval distribution, and labels them with the initial letters F/J/G/S/T;
the storage module stores, processes and filters the traffic data after the partition module has counted and partitioned it;
the analysis module compares the information filtered by the storage module with previous network traffic data, analyzes it, obtains statistical data, packages the statistical data and sends it to the total server.
In an optional embodiment of the network traffic data acquisition system of the invention: the data acquisition module comprises at least one acquisition port used for collecting network traffic data, and further comprises a judging module and an identification module;
the judging module evaluates the traffic data during collection to obtain an identifier and, after the evaluation, writes the identifier to the identification module;
the criteria applied by the judging module comprise the data volume, measured in bytes, and the data type; the data volume is labeled with a size identifier after the collected volume is converted to the appropriate unit: data larger than 1 GB is identified as class A, data smaller than 1 GB as class B, and data smaller than 1 MB as class K.
In an optional embodiment of the network traffic data acquisition system of the invention: the partition module scans the data traffic in the data acquisition module, scanning the specified directory and the files in it; during the scan it determines whether anonymous traffic is present and, if so, classifies the anonymous-traffic data as risk data;
normal byte traffic is then scanned and classified; the partition module comprises a traffic type unit, a traffic format unit and an interval distribution category unit;
the traffic type unit distinguishes flooding traffic, accurate traffic, public-domain traffic, private-domain traffic and push traffic;
flooding traffic is traffic generated by browsing information on microblogs and news channels;
accurate traffic is the specific traffic generated when a user searches by index or keyword;
public-domain traffic is the traffic generated while browsing fixed platforms, for example e-commerce and content platforms such as Pinduoduo, JD.com, Taobao, Zhihu and Douban;
private-domain traffic is traffic generated spontaneously within the user's software, such as traffic stored in groups, comments and private files;
push traffic is traffic generated when a fixed platform, web page, app or the like automatically pushes information to the user.
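The five traffic types and their F/J/G/S/T identifiers can be sketched as a small lookup. This is only an illustration under stated assumptions: the letter assigned to each type follows the mapping given in the beneficial effects (F flooding, J accurate, G public-domain, S private-domain, T push), while the source labels such as `news_feed` are hypothetical names that do not appear in the patent.

```python
from enum import Enum


class TrafficType(Enum):
    """Traffic types and the identifier letter the patent assigns to each."""
    FLOODING = "F"
    ACCURATE = "J"
    PUBLIC_DOMAIN = "G"
    PRIVATE_DOMAIN = "S"
    PUSH = "T"


# Hypothetical source labels; the patent does not define how a flow's
# origin is detected, only which kind of activity belongs to which type.
SOURCE_TO_TYPE = {
    "news_feed": TrafficType.FLOODING,
    "keyword_search": TrafficType.ACCURATE,
    "ecommerce_browse": TrafficType.PUBLIC_DOMAIN,
    "group_chat": TrafficType.PRIVATE_DOMAIN,
    "app_notification": TrafficType.PUSH,
}


def type_identifier(source_label: str) -> str:
    """Return the single-letter identifier for a known source label."""
    return SOURCE_TO_TYPE[source_label].value
```

An unknown label raises `KeyError`, which keeps unclassified traffic from silently receiving a type.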
In an optional embodiment of the network traffic data acquisition system of the invention: the traffic format unit classifies traffic into voice files, picture files and video files.
In an optional embodiment of the network traffic data acquisition system of the invention: the storage module comprises a storage unit, a processing unit and a filtering unit;
the storage unit classifies and stores the traffic data produced by the partition module and generates a type identifier, one of F/J/G/S/T;
the processing unit performs statistical processing on the flooding, accurate, public-domain, private-domain and push traffic;
the filtering unit filters the flooding, accurate, public-domain, private-domain and push traffic, removing individual inactive data items to obtain the total number of active items, which is compared with the active-connection-count threshold of the acquisition port.
In an optional embodiment of the network traffic data acquisition system of the invention: the analysis module comprises a comparison unit, an analysis unit and a decision unit;
the comparison unit compares the activity level of the processed and filtered traffic information in the storage module with the active-connection-count threshold of the acquisition port;
the analysis unit analyzes and rearranges the data in the storage module: setting aside the original program order, it analyzes and reorders the instructions to optimize the execution sequence, reads each decoded software instruction to determine whether it can be processed directly or must be processed together with other instructions, and optimizes the execution instructions for the data in the storage module so that they are processed more efficiently in a single pass, and then sends the processed instructions to the decision unit;
the decision unit makes decisions on the analyzed and rearranged instructions, provides a decision scheme, generates statistical data, and packages the statistical data and sends it to the total server.
In an optional embodiment of the network traffic data acquisition system of the invention: the proxy server obtains IP proxies for collecting data from single-network and multi-network traffic.
In an optional embodiment of the network traffic data acquisition system of the invention: anonymous traffic is defined as files with the suffix .tmp.
In an optional embodiment of the network traffic data acquisition system of the invention: the proxy server further comprises a monitoring module and an alarm module; the monitoring module monitors the network traffic of its area in real time, and when the acquisition state of the data acquisition module is abnormal, the alarm module sends alarm information.
In an optional embodiment of the network traffic data acquisition system of the invention: the total server monitors the information collected by each proxy server and, if it detects an abnormality in a proxy server, adjusts that server's proxy IP.
The invention has the following beneficial effects:
1. In the network traffic data acquisition system, network traffic is collected by a plurality of proxy servers and is scanned by class, processed, filtered, stored and analyzed during collection. This optimizes the direction of data collection, reduces repeated mirror collection and reduces exposure to anonymous traffic, network intrusion and malicious traffic, which improves network security, lowers the probability of faults and promotes reasonable and effective collection of network traffic.
2. The network traffic data acquisition system processes the classified network traffic data in two stages: the first stage is the preliminary scan performed during collection, and the second stage is filtering and processing to obtain the total number of active items, which is later compared in the analysis module. This gives a better understanding of the regularity and uniqueness of the collected network data and lowers the probability of faults.
3. In the network traffic data acquisition system, the partition module scans the traffic data labeled by the data acquisition module, separates out anonymous traffic, and classifies normal byte traffic by traffic type, traffic format and interval distribution category so that the classified data is easy to analyze. Traffic is labeled F for flooding traffic, J for accurate traffic, G for public-domain traffic, S for private-domain traffic and T for push traffic, and is further divided into the time periods A-D. This facilitates later analysis of traffic changes and positioning, allows users' needs and value to be inferred from network traffic and matched accordingly, and promotes reasonable and effective traffic collection.
Drawings
FIG. 1 is a schematic flow chart of the system of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
As networks and services converge ever more deeply, network attacks and intrusions by malicious traffic occur in large numbers, and network security has become a problem that cannot be ignored. Existing Internet content is highly repetitive, so the collected data becomes excessively large, and some systems collect data merely by sending a mirror of the traffic to an analysis server. As a result, traffic cannot be collected reasonably and effectively, and faults are likely to occur.
The invention provides the following technical scheme: a network traffic data acquisition system comprising a plurality of proxy servers and a total server, wherein the proxy servers are communicatively connected to the total server, and each proxy server collects network traffic, partitions, stores and analyzes it, and then transmits the analysis data together with a compressed package of the full traffic data to the total server for management;
the proxy server comprises a data acquisition module, a partition module, a storage module and an analysis module; the data acquisition module collects, in real time, data such as the total network usage, the current total network speed and the network usage speed of each IP address in software and web pages, converts the data into file form and stores it in a traffic file;
the partition module compiles statistics on the traffic files, partitions the files by traffic type, traffic format and time-interval distribution, and labels them with the initial letters F/J/G/S/T;
the storage module stores, processes and filters the traffic data after the partition module has counted and partitioned it;
the analysis module compares the information filtered by the storage module with previous network traffic data, analyzes it, obtains statistical data, packages the statistical data and sends it to the total server.
In this embodiment, network traffic is collected by a plurality of proxy servers and is scanned by class, processed, filtered, stored and analyzed during collection. This optimizes the direction of data collection, reduces repeated mirror collection and reduces exposure to anonymous traffic, network intrusion and malicious traffic, which improves network security, lowers the probability of faults and promotes reasonable and effective collection of network traffic.
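The final step of Example 1, in which a proxy server sends its analysis data and a compressed package of traffic files to the total server, could look roughly like the following. It is a minimal sketch that assumes a gzip-compressed tar archive and a JSON statistics file; the patent names neither format, and `build_package`, `statistics.json` and the member paths are hypothetical. Transport to the total server is left out.

```python
import io
import json
import tarfile
from typing import Dict


def build_package(stats: dict, flow_files: Dict[str, bytes]) -> bytes:
    """Bundle statistics and raw flow files into one in-memory .tar.gz.

    The archive layout is an assumption made for illustration only.
    """
    members = {"statistics.json": json.dumps(stats, sort_keys=True).encode("utf-8")}
    members.update(flow_files)

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, payload in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)  # size must be set before adding
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()
```

The returned bytes can be written to disk or handed to whatever transport connects the proxy server to the total server.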
Example 2
This embodiment builds on embodiment 1. Specifically, referring to FIG. 1: the data acquisition module comprises at least one acquisition port used for collecting network traffic data, and further comprises a judging module and an identification module;
the judging module evaluates the traffic data during collection to obtain an identifier and, after the evaluation, writes the identifier to the identification module;
the criteria applied by the judging module comprise the data volume, measured in bytes, and the data type; the data volume is labeled with a size identifier after the collected volume is converted to the appropriate unit: data larger than 1 GB is identified as class A, data smaller than 1 GB as class B, and data smaller than 1 MB as class K.
In this embodiment, a simple evaluation during data collection yields an identifier, which is then passed to the partition module for detailed classification. The network traffic data is thus preprocessed during collection and its data size is labeled, so that the partition module can later classify and process the labeled data. This reduces the risk of overload and crashes during storage and classification caused by excessively large data.
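The size labeling performed by the judging module can be illustrated with a short function. Two points are assumptions, since the text leaves them open: units are taken as binary (1 MB = 1024² bytes, 1 GB = 1024³ bytes), and a volume of exactly 1 GB is placed in class A. Class B is read as covering everything from 1 MB up to 1 GB.

```python
MB = 1024 ** 2  # assumed binary megabyte
GB = 1024 ** 3  # assumed binary gigabyte


def size_class(num_bytes: int) -> str:
    """Return the size identifier (A, B or K) for a collected data volume."""
    if num_bytes < 0:
        raise ValueError("data volume cannot be negative")
    if num_bytes < MB:
        return "K"  # smaller than 1 MB
    if num_bytes < GB:
        return "B"  # at least 1 MB but smaller than 1 GB
    return "A"      # 1 GB or larger (boundary handling is an assumption)
```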
Example 3
This embodiment builds on embodiment 1. Specifically, referring to FIG. 1: the partition module scans the data traffic in the data acquisition module, scanning the specified directory and the files in it; during the scan it determines whether anonymous traffic is present and, if so, classifies the anonymous-traffic data as risk data;
normal byte traffic is then scanned and classified; the partition module comprises a traffic type unit, a traffic format unit and an interval distribution category unit;
the traffic format unit classifies traffic into voice files, picture files and video files.
The interval distribution category unit classifies network traffic by time period, for example: 8:00-12:00 is period A, 12:00-16:00 is period B, 16:00-20:00 is period C, 20:00-24:00 is period D, and 0:00-8:00 is period E;
the traffic type unit distinguishes flooding traffic, accurate traffic, public-domain traffic, private-domain traffic and push traffic;
flooding traffic is traffic generated by browsing information on microblogs and news channels;
accurate traffic is the specific traffic generated when a user searches by index or keyword;
public-domain traffic is the traffic generated while browsing fixed platforms, for example e-commerce and content platforms such as Pinduoduo, JD.com, Taobao, Zhihu and Douban;
private-domain traffic is traffic generated spontaneously within the user's software, such as traffic stored in groups, comments and private files;
push traffic is traffic generated when a fixed platform, web page, app or the like automatically pushes information to the user.
In this embodiment, the partition module scans the traffic data labeled by the data acquisition module, separates out anonymous traffic, and classifies normal byte traffic by traffic type, traffic format and interval distribution category so that the classified data is easy to analyze. Traffic is labeled F for flooding traffic, J for accurate traffic, G for public-domain traffic, S for private-domain traffic and T for push traffic, and is further divided into the time periods A-D. This facilitates later analysis of traffic changes and positioning, and allows users' needs and value to be inferred from network traffic and matched accordingly, which supports later analysis.
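The time-period classification of Example 3 maps directly onto the hour of day. The sketch below follows the example periods given for the interval distribution category unit (A to D in four-hour blocks from 8:00, E for 0:00-8:00). The combined `partition_key` and its `type-format-period` layout are hypothetical; the patent does not specify how the three classifications are joined.

```python
def time_period(hour: int) -> str:
    """Map an hour of day (0-23) to the example periods A-E."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in the range 0-23")
    if hour < 8:
        return "E"  # 0:00-8:00
    # 8-11 -> A, 12-15 -> B, 16-19 -> C, 20-23 -> D
    return "ABCD"[(hour - 8) // 4]


def partition_key(type_letter: str, file_format: str, hour: int) -> str:
    """Join traffic type, format and period into one key (layout assumed)."""
    return f"{type_letter}-{file_format}-{time_period(hour)}"
```

For instance, video push traffic seen at 21:00 would fall into the partition `T-video-D` under this layout.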
Example 4
This embodiment builds on embodiment 1. Specifically, referring to FIG. 1: the storage module comprises a storage unit, a processing unit and a filtering unit;
the storage unit classifies and stores the traffic data produced by the partition module and generates a type identifier, one of F/J/G/S/T;
the processing unit performs statistical processing on the flooding, accurate, public-domain, private-domain and push traffic;
the filtering unit filters the flooding, accurate, public-domain, private-domain and push traffic, removing individual inactive data items to obtain the total number of active items, which is compared with the active-connection-count threshold of the acquisition port.
In this embodiment, the classified network traffic data is processed in two stages: the first stage is the preliminary scan performed during collection, and the second stage is filtering and processing to obtain the total number of active items, which is later compared in the analysis module. This gives a better understanding of the regularity and uniqueness of the collected network data.
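The filtering step can be sketched as follows. The patent does not define what makes a data item inactive, so this example assumes, purely for illustration, that an item is inactive when it carried no bytes in the sampling window. `FlowRecord` and its fields are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FlowRecord:
    type_id: str     # one of F/J/G/S/T, as assigned by the partition module
    bytes_seen: int  # hypothetical activity measure for the sampling window


def filter_active(records: Iterable[FlowRecord]) -> List[FlowRecord]:
    """Drop inactive records (assumed here: no bytes seen in the window)."""
    return [r for r in records if r.bytes_seen > 0]


def active_total_by_type(records: Iterable[FlowRecord]) -> Counter:
    """Count the active records per traffic type."""
    return Counter(r.type_id for r in filter_active(records))


def exceeds_port_threshold(records: Iterable[FlowRecord], threshold: int) -> bool:
    """Compare the total number of active records with the port threshold."""
    return len(filter_active(records)) > threshold
```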
Example 5
This embodiment builds on embodiment 1. Specifically, referring to FIG. 1: the analysis module comprises a comparison unit, an analysis unit and a decision unit;
the comparison unit compares the activity level of the processed and filtered traffic information in the storage module with the active-connection-count threshold of the acquisition port;
the analysis unit analyzes and rearranges the data in the storage module: setting aside the original program order, it analyzes and reorders the instructions to optimize the execution sequence, reads each decoded software instruction to determine whether it can be processed directly or must be processed together with other instructions, and optimizes the execution instructions for the data in the storage module so that they are processed more efficiently in a single pass, and then sends the processed instructions to the decision unit;
the decision unit makes decisions on the analyzed and rearranged instructions, provides a decision scheme, generates statistical data, and packages the statistical data and sends it to the total server.
This embodiment compares and analyzes the filtered and stored network traffic information to reach an optimized decision, so that the total server can later optimize the collected network traffic data more accurately and in a more targeted way.
Example 6
This embodiment builds on embodiment 1. Specifically, referring to FIG. 1: the proxy server obtains IP proxies for collecting data from single-network and multi-network traffic.
The total server monitors the information collected by each proxy server and, if it detects an abnormality in a proxy server, adjusts that server's proxy IP.
When capturing data, the target website may block the requesting IP, so that requests no longer return correct data. Crawling through a proxy IP solves this problem because the proxy acts as an intermediate layer: the target website takes the proxy IP to be the requester's real IP, so even if the website blocks it, only the proxy IP in use is blocked and the collector's own IP can continue to access web pages normally.
Most websites today are run by dedicated operators and administrators, who generally set up defense mechanisms and anti-crawling measures out of security considerations. Because of these defenses, common fixed IPs are easily identified as low-quality IPs and are blacklisted and blocked. When a high-anonymity IP is used, however, the target website judges the request to come from a real IP and a real user, so a high-anonymity proxy IP is not only not blocked but also gives faster access, and the collection results are correspondingly better. In summary, data collection must use proxy IPs, which both increase speed and avoid blocking; the IP sea IP proxy provides a large number of high-speed anonymous proxies with lower latency, faster speed and higher efficiency.
In this embodiment, using IP proxy servers makes collection efficient, and monitoring by the total server reduces cases in which a proxy IP is blocked and cannot be accessed during collection, since the proxy IP can be adjusted in time.
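The proxy adjustment described in this example can be illustrated with a simple rotation scheme. The patent does not say how the total server chooses a replacement IP, so round-robin rotation is an assumption, and `ProxyPool` is a hypothetical name. The addresses come from the reserved documentation range 192.0.2.0/24.

```python
from collections import deque
from typing import Iterable


class ProxyPool:
    """Round-robin pool of proxy addresses (rotation policy is assumed)."""

    def __init__(self, addresses: Iterable[str]) -> None:
        self._pool = deque(addresses)
        if not self._pool:
            raise ValueError("at least one proxy address is required")

    @property
    def current(self) -> str:
        """The proxy address currently in use."""
        return self._pool[0]

    def report_failure(self) -> str:
        """Move the failing proxy to the back and return the next one."""
        self._pool.rotate(-1)
        return self._pool[0]


# Usage: the total server detects an abnormal proxy and switches to the next.
pool = ProxyPool(["192.0.2.10:8080", "192.0.2.11:8080"])
replacement = pool.report_failure()
```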
Example 7
This embodiment builds on embodiment 1. Specifically, referring to FIG. 1: anonymous traffic is defined as files with the suffix .tmp.
TMP and TEMP files are temporary files generated by various software or by the system, commonly called junk files. A temporary file generated by Windows is essentially no different from virtual memory, except that a temporary file is more targeted and serves a single program. This leads to two problems: first, temporary files occupy space; second, deleting them carries the risk that the program can no longer run.
In this embodiment, the temporary files in the anonymous traffic are removed, which helps optimize storage, reduces the space occupied by temporary, anonymous and virtual files, and reduces the likelihood of a server crash caused by excessively large data.
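Since anonymous traffic is identified by the .tmp suffix, the scan of a specified directory can be sketched with the standard library alone. The function name `split_risk_files` is hypothetical, and treating the suffix case-insensitively is an assumption. The sketch only separates the files; whether and when to delete them is left to the caller, given the deletion risk noted above.

```python
from pathlib import Path
from typing import List, Tuple


def split_risk_files(directory: str) -> Tuple[List[Path], List[Path]]:
    """Scan a directory tree and separate .tmp files from all other files.

    Returns (risk_files, normal_files), each sorted by path.
    """
    risk: List[Path] = []
    normal: List[Path] = []
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue  # skip subdirectories themselves
        if path.suffix.lower() == ".tmp":
            risk.append(path)
        else:
            normal.append(path)
    return risk, normal
```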
Example 8
This embodiment builds on embodiment 1. Specifically, referring to FIG. 1: the proxy server further comprises a monitoring module and an alarm module; the monitoring module monitors the network traffic of its area in real time, and when the acquisition state of the data acquisition module is abnormal, the alarm module sends alarm information.
In this embodiment, real-time monitoring of the collected network traffic combined with alarms strengthens network traffic security.
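The interaction between the monitoring module and the alarm module can be sketched as below. The patent does not define an abnormal acquisition state, so this example assumes one possible criterion: collection is considered stalled when no bytes arrive for several consecutive sampling windows. The function name, the parameters and the alarm message are hypothetical.

```python
from typing import Callable, Sequence


def monitor(byte_counts: Sequence[int],
            max_idle_windows: int,
            alarm: Callable[[str], None]) -> bool:
    """Raise an alarm if collection stays idle for too many windows.

    byte_counts holds the bytes collected per sampling window, in order.
    Returns True if an alarm was sent, otherwise False.
    """
    idle = 0
    for index, count in enumerate(byte_counts):
        idle = idle + 1 if count == 0 else 0
        if idle >= max_idle_windows:
            alarm(f"collection stalled for {idle} windows (window {index})")
            return True
    return False
```

Passing the alarm as a callable keeps the monitoring logic independent of how the alarm information is actually delivered.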
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as falling within the scope of protection of the invention.

Claims (10)

1. A network flow data acquisition system comprising a plurality of proxy servers and a total server, characterized in that: the proxy servers are used for collecting, storing and analyzing network traffic by partition, and then transmitting the analysis data and a compressed package file of the total flow data to the total server for management;
the proxy server comprises a data acquisition module, a partition module, a storage module and an analysis module; the data acquisition module is used for acquiring in real time data information such as the total network usage, the current total network speed and the network usage speed of each IP address in software and web pages, converting the data information into file form and storing it in a flow file;
the partition module is used for compiling statistics on the flow files, dividing the files into partitions according to flow type, flow format and time interval distribution, and identifying each partition by its initial letter, F/J/G/S/T;
the storage module is used for storing, processing and filtering the flow data after the partition module has counted and divided it;
the analysis module is used for comparing and analyzing the information filtered by the storage module against previous network flow data, obtaining statistical data, and packaging the statistical data and sending it to the total server.
2. The network traffic data acquisition system of claim 1, wherein: the data acquisition module comprises one or more acquisition ports, the acquisition ports are used for acquiring the network traffic data, and the data acquisition module comprises a judging module and an identification module;
the judging module is used for judging the flow data during acquisition to obtain an identifier, and writing the identifier into the identification module after the judgment;
the criteria applied by the judging module comprise the data quantity, measured in bytes, and the data type; the data quantity is marked with a flow unit identifier after the currently collected flow is converted to the corresponding unit, wherein data larger than 1G is identified as A-type data, data smaller than 1G is identified as B-type data, and data smaller than 1MB is identified as K-type data.
3. The network traffic data acquisition system of claim 1, wherein: the partition module is used for scanning the data traffic in the data acquisition module, scanning a specified directory and the files in it, judging during the scan whether anonymous traffic exists, and, if anonymous traffic is identified, classifying it as risk data;
normal byte traffic is scanned and classified, wherein the partition module comprises a traffic type unit, a traffic format unit and an interval distribution type unit;
the flow type unit comprises a flooding flow, an accurate flow, a public domain flow, a private domain flow and a pushing flow;
the flooding flow comprises flow generated by information browsed by a microblog and a news channel;
the accurate flow is a specific flow generated by searching through an index or a keyword by a user;
the public domain flow is the shared flow generated while browsing e-commerce and other fixed platforms, such as Pinduoduo, JD.com, Taobao, Zhihu and Douban;
the private domain flow is flow generated spontaneously within the user's software, such as flow stored in groups, comments and private files;
the push flow is generated by automatically pushing data information of a user through a fixed platform, a webpage, an APP and the like.
4. The network traffic data acquisition system of claim 1, wherein: the flow format unit classifies the flow into voice files, picture files and video files.
5. A network traffic data acquisition system according to claim 3, wherein: the storage module comprises a storage unit, a processing unit and a filtering unit;
the storage unit is used for classifying and storing the flow data generated by the partition module, generating a type identifier and identifying according to F/J/G/S/T;
the processing unit is used for carrying out statistical processing on the flooding flow, the accurate flow, the public domain flow, the private domain flow and the pushing flow;
the filtering unit is used for filtering the flooding flow, the accurate flow, the public domain flow, the private domain flow and the pushing flow, filtering out individual inactive data to obtain the total active count, and comparing the total active count with an active connection number threshold of the acquisition port.
6. The network traffic data acquisition system of claim 5, wherein: the analysis module comprises a comparison unit, an analysis unit and a decision unit;
the comparison unit is used for comparing the activity degree of the processed and filtered data flow information in the storage module with the threshold value of the active connection number of the acquisition port;
the analysis unit is used for analyzing and rearranging the data in the storage module: setting aside the original program order, it analyzes and rearranges the instructions to optimize the execution order, determines by reading the decoded software instructions whether an instruction is processed on its own or together with other instructions, optimizes the execution instructions for the data in the storage module so that they are processed and executed more efficiently in one pass, and sends the processed and executed instructions to the decision unit;
the decision unit is used for making decisions on the analyzed and rearranged instructions, providing a decision scheme, generating statistical data, and packaging the statistical data and sending it to the total server.
7. The network traffic data acquisition system of claim 1, wherein: the proxy server is an IP proxy used for collecting data on single-network and multi-network traffic.
8. A network traffic data acquisition system according to claim 3, wherein: the anonymous traffic is set to a file with a suffix tmp.
9. The network traffic data acquisition system of claim 1, wherein: the proxy server also comprises a monitoring module and an alarm module, wherein the monitoring module is used for monitoring the network flow of the area in real time, and when the acquisition state of the data acquisition module is abnormal, the alarm module sends alarm information.
10. The network traffic data acquisition system of claim 1, wherein: the total server is used for monitoring the information collected by each proxy server, and if an abnormality of a proxy server is detected, the total server adjusts the IP proxy.
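As a non-limiting illustration, the marking rules recited in claims 2 and 5 can be sketched in Python as follows. Several points are assumptions, because the claims leave them open: 1G and 1MB are taken as binary units (1024^3 and 1024^2 bytes); K-type takes precedence over B-type for data below 1MB; data of exactly 1G is treated as A-type; the letters F/J/G/S/T are assigned to the five flow types in the order the claims list them; and the combined mark produced by `mark` is a hypothetical format.

```python
from enum import Enum

MB = 1024 ** 2  # assumption: binary megabyte
GB = 1024 ** 3  # assumption: binary gigabyte


def size_class(num_bytes: int) -> str:
    """Return the size identifier: K below 1MB, B below 1G, A otherwise."""
    if num_bytes < 0:
        raise ValueError("byte count must be non-negative")
    if num_bytes < MB:
        return "K"
    if num_bytes < GB:
        return "B"
    return "A"  # assumption: exactly 1G is treated as A-type


class FlowType(Enum):
    """Assumed letter assignment, following the order used in the claims."""
    FLOODING = "F"
    ACCURATE = "J"
    PUBLIC_DOMAIN = "G"
    PRIVATE_DOMAIN = "S"
    PUSH = "T"


def mark(flow_type: FlowType, num_bytes: int) -> str:
    """Combine the type identifier and the size identifier (hypothetical format)."""
    return f"{flow_type.value}-{size_class(num_bytes)}"
```

Under these assumptions, a pushing flow of 5G would be marked by calling `mark(FlowType.PUSH, 5 * GB)`.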
CN202211694321.7A 2022-12-28 2022-12-28 Network flow data acquisition system Pending CN116112407A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211694321.7A CN116112407A (en) 2022-12-28 2022-12-28 Network flow data acquisition system

Publications (1)

Publication Number Publication Date
CN116112407A true CN116112407A (en) 2023-05-12

Family

ID=86263079

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211694321.7A Pending CN116112407A (en) 2022-12-28 2022-12-28 Network flow data acquisition system

Country Status (1)

Country Link
CN (1) CN116112407A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination