CN111064637A - NetFlow data duplicate removal method and device - Google Patents

NetFlow data duplicate removal method and device

Info

Publication number
CN111064637A
Authority
CN
China
Prior art keywords
netflow
data
information
netflow data
repeated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911280899.6A
Other languages
Chinese (zh)
Other versions
CN111064637B (en
Inventor
窦鹏辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongying Youchuang Information Technology Co Ltd
Original Assignee
Zhongying Youchuang Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongying Youchuang Information Technology Co Ltd
Priority to CN201911280899.6A
Publication of CN111064637A
Application granted
Publication of CN111064637B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/028 Capturing of monitoring data by filtering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/50 Address allocation
    • H04L61/5007 Internet protocol [IP] addresses

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a method and a device for deduplicating NetFlow data. The method comprises the following steps: acquiring NetFlow data information and IP address information of the NetFlow data export devices; identifying, according to the NetFlow data information and the IP address information of the NetFlow data export devices, keyword information that reflects duplicate NetFlow data; identifying the duplicate NetFlow data according to the keyword information; and deduplicating the duplicate NetFlow data according to the identification result. The invention deduplicates duplicate NetFlow data and thereby prevents a single piece of NetFlow data from being collected, calculated, or forwarded by the acquisition device multiple times, which would consume a large amount of resources and make subsequent statistical results inaccurate.

Description

NetFlow data duplicate removal method and device
Technical Field
The invention relates to the technical field of communication, in particular to a NetFlow data duplicate removal method and device.
Background
NetFlow is a network monitoring function that provides a session-level view of network traffic by recording information about each TCP/IP (Transmission Control Protocol/Internet Protocol) session. However, network administrators often lack in-depth knowledge of how NetFlow itself is implemented or of the topology of the network being monitored. As a result, the same TCP/IP session may be exported by the NetFlow data export devices as multiple duplicate pieces of NetFlow data. If the duplicate NetFlow data cannot be identified and eliminated, a single piece of NetFlow data is collected, calculated, or forwarded by the acquisition device multiple times, which both consumes considerable resources and makes subsequent statistical results inaccurate. At present, no method exists for deduplicating NetFlow data.
Disclosure of Invention
An embodiment of the invention provides a NetFlow data deduplication method for deduplicating duplicate NetFlow data, so that a single piece of NetFlow data is not collected, calculated, or forwarded by the acquisition device multiple times, which would consume a large amount of resources and make subsequent statistical results inaccurate. The method comprises the following steps:
acquiring NetFlow data information and IP address information of the NetFlow data export devices;
identifying, according to the NetFlow data information and the IP address information of the NetFlow data export devices, keyword information that reflects duplicate NetFlow data;
identifying the duplicate NetFlow data according to the keyword information;
and deduplicating the duplicate NetFlow data according to the identification result.
Optionally, the NetFlow data information and the NetFlow data export device IP address information used for identifying the keyword information include:
the source address information, the destination address information, the source port information, the destination port information, the protocol type information, and the next-hop address information in the NetFlow data, together with the NetFlow data export device IP address information.
Optionally, identifying, according to the NetFlow data information and the NetFlow data export device IP address information, the keyword information that reflects duplicate NetFlow data includes:
comparing, within a preset time period, the acquired IP address information of a plurality of NetFlow data export devices and the groups of NetFlow data information corresponding to the IP address information of each NetFlow data export device;
determining, according to the IP address information of the plurality of NetFlow data export devices, whether groups containing the same NetFlow data information came from the same NetFlow data export device;
and identifying, according to the determination result, the keyword information that reflects duplicate NetFlow data.
Optionally, identifying, according to the NetFlow data information and the NetFlow data export device IP address information, the keyword information that reflects duplicate NetFlow data further includes:
if, in the NetFlow data information, the next-hop address information is the same as the IP address of an export device, confirming that duplicate NetFlow data exists and identifying the keyword information that reflects the duplicate NetFlow data.
Optionally, identifying, according to the NetFlow data information and the NetFlow data export device IP address information, the keyword information that reflects duplicate NetFlow data further includes:
if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data export devices, confirming that duplicate NetFlow data exists;
and if the same NetFlow data information exists in multiple groups of NetFlow data information sent by the same NetFlow data export device, confirming that the NetFlow data export device is misconfigured.
Optionally, identifying the duplicate NetFlow data according to the keyword information includes:
if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data export devices, recording the IP address information of those NetFlow data export devices, and identifying as duplicates all NetFlow data with that keyword information except the first piece.
Optionally, the method further includes:
alerting the staff after duplicate NetFlow data is identified.
Optionally, deduplicating the duplicate NetFlow data according to the identification result includes:
deleting the NetFlow data that has been identified as duplicate.
An embodiment of the invention also provides a NetFlow data deduplication device for deduplicating duplicate NetFlow data, so that a single piece of NetFlow data is not collected, calculated, or forwarded by the acquisition device multiple times, which would consume a large amount of resources and make subsequent statistical results inaccurate. The device comprises:
an information acquisition module, used for acquiring NetFlow data information and IP address information of the NetFlow data export devices;
a data identification module, used for identifying, according to the NetFlow data information and the IP address information of the NetFlow data export devices, keyword information that reflects duplicate NetFlow data;
a data identification module, used for identifying the duplicate NetFlow data according to the keyword information;
and a data deduplication module, used for deduplicating the duplicate NetFlow data according to the identification result.
Optionally, the NetFlow data information and the NetFlow data export device IP address information used for identifying the keyword information include:
the source address information, the destination address information, the source port information, the destination port information, the protocol type information, and the next-hop address information in the NetFlow data, together with the NetFlow data export device IP address information.
Optionally, the data identification module is further configured to:
compare, within a preset time period, the acquired IP address information of a plurality of NetFlow data export devices and the groups of NetFlow data information corresponding to the IP address information of each NetFlow data export device;
determine, according to the IP address information of the plurality of NetFlow data export devices, whether groups containing the same NetFlow data information came from the same NetFlow data export device;
and identify, according to the determination result, the keyword information that reflects duplicate NetFlow data.
Optionally, the data identification module is further configured to:
if, in the NetFlow data information, the next-hop address information is the same as the IP address of an export device, confirm that duplicate NetFlow data exists and identify the keyword information that reflects the duplicate NetFlow data.
Optionally, the data identification module is further configured to:
if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data export devices, confirm that duplicate NetFlow data exists;
and if the same NetFlow data information exists in multiple groups of NetFlow data information sent by the same NetFlow data export device, confirm that the NetFlow data export device is misconfigured.
Optionally, the data identification module is further configured to:
if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data export devices, record the IP address information of those NetFlow data export devices, and identify as duplicates all NetFlow data with that keyword information except the first piece.
Optionally, the apparatus further comprises:
an alarm module, used for alerting the staff after duplicate NetFlow data is identified.
Optionally, the data deduplication module is further configured to:
delete the NetFlow data that has been identified as duplicate.
The embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the method when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program for executing the above method is stored.
In the embodiment of the invention, NetFlow data information and IP address information of the NetFlow data export devices are acquired, and keyword information that reflects duplicate NetFlow data is identified from them. Duplicate NetFlow data can then be identified from the keyword information alone, and the duplicate NetFlow data is deduplicated according to the identification result. In summary, duplicate NetFlow data can be deduplicated, which prevents a single piece of NetFlow data from being collected, calculated, or forwarded by the acquisition device multiple times and thereby avoids both heavy resource consumption and inaccurate subsequent statistical results.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 is a flowchart of a NetFlow data deduplication method in an embodiment of the present invention;
Fig. 2 is another flowchart of the NetFlow data deduplication method in the embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a NetFlow data deduplication device in the embodiment of the present invention;
Fig. 4 is another schematic structural diagram of the NetFlow data deduplication device in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
Fig. 1 is a flowchart of a NetFlow data deduplication method according to an embodiment of the present invention, and as shown in fig. 1, the method includes:
Step 101: acquiring NetFlow data information and IP address information of the NetFlow data export devices.
In this embodiment, acquiring NetFlow data information includes receiving and parsing the NetFlow data information (the content of received NetFlow data generally cannot be read directly and must first be parsed).
The NetFlow data information and the NetFlow data export device IP address information used for identifying the keyword information include: the source address information, the destination address information, the source port information, the destination port information, the protocol type information, and the next-hop address information in the NetFlow data, together with the NetFlow data export device IP address information. That is, the keyword information is identified from the combination of these seven fields.
The NetFlow data export device may be a router, a switch, or the like.
In a specific implementation, the original NetFlow data is parsed, and the IP address of the NetFlow data export device is added to the parsed data to form one piece of data, which represents the information of one TCP/IP session.
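The piece of data formed in step 101 can be pictured as a small immutable structure. The following Python sketch is illustrative only: the class name `FlowRecord`, its field names, and the dictionary keys read by `make_record` are assumptions, not names taken from the patent or from any NetFlow library.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Tuple


@dataclass(frozen=True)
class FlowRecord:
    """One parsed NetFlow record plus the IP of the device that exported it."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int      # IP protocol number, e.g. 6 for TCP
    next_hop: str
    exporter_ip: str   # added by the collector after parsing (assumption)

    def six_tuple(self) -> Tuple[str, str, int, int, int, str]:
        """Return the session fields without the exporter IP."""
        return (self.src_addr, self.dst_addr, self.src_port,
                self.dst_port, self.protocol, self.next_hop)


def make_record(parsed: Mapping[str, Any], exporter_ip: str) -> FlowRecord:
    """Combine parsed flow fields (hypothetical key names) with the exporter IP."""
    return FlowRecord(
        src_addr=str(parsed["src_addr"]),
        dst_addr=str(parsed["dst_addr"]),
        src_port=int(parsed["src_port"]),
        dst_port=int(parsed["dst_port"]),
        protocol=int(parsed["protocol"]),
        next_hop=str(parsed["next_hop"]),
        exporter_ip=exporter_ip,
    )
```

Because the dataclass is frozen, instances are hashable, so a record or its six-tuple can serve directly as a dictionary key in later comparison steps.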
Step 102: identifying, according to the NetFlow data information and the IP address information of the NetFlow data export devices, keyword information that reflects duplicate NetFlow data.
In this embodiment, step 102 includes: comparing, within a preset time period, the acquired IP address information of a plurality of NetFlow data export devices and the groups of NetFlow data information corresponding to the IP address information of each NetFlow data export device. The "preset time period" may be 50 s to 70 s (e.g., 55 s, 60 s, or 65 s);
determining, according to the IP address information of the plurality of NetFlow data export devices, whether groups containing the same NetFlow data information came from the same NetFlow data export device;
and identifying, according to the determination result, the keyword information that reflects duplicate NetFlow data.
Specifically, if the NetFlow data information contains next-hop address information that is the same as the IP address of an export device, it is confirmed that duplicate NetFlow data exists, and the keyword information that reflects the duplicate NetFlow data is identified.
If the next-hop address information in the NetFlow data is the same as the IP address of an export device, the same network session has passed through a plurality of consecutive export devices, and the NetFlow data exported for that session is therefore duplicated. Here the export devices are routers, and "consecutive" means that the routers are connected in sequence.
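One possible reading of the next-hop check is sketched below in Python. It assumes that the address an export device uses when sending NetFlow data is the same address that a neighbouring device reports as its next hop; on real routers with several interface addresses this need not hold. The function name `find_chained_exporters` is hypothetical.

```python
from typing import Iterable, List, Tuple


def find_chained_exporters(
    records: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return sorted (upstream, downstream) exporter pairs.

    Each input item is (exporter_ip, next_hop). A pair is reported when
    one exporter's next hop is the IP address of another exporter, which
    the text treats as a sign that the session crosses consecutive
    export devices and is therefore exported more than once.
    """
    records = list(records)
    exporters = {exporter for exporter, _ in records}
    pairs = {
        (exporter, next_hop)
        for exporter, next_hop in records
        if next_hop in exporters and next_hop != exporter
    }
    return sorted(pairs)
```

For three routers connected in sequence, the first two report a next hop that is itself an export device, so two pairs are returned; the last router's next hop lies outside the set of export devices and produces no pair.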
If the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data export devices, it is confirmed that duplicate NetFlow data exists. In this case the same network session has passed through a plurality of non-consecutive export devices, and the NetFlow data exported for that session is therefore duplicated. Here the export devices are routers, and "non-consecutive" means that the routers are not connected to one another in sequence; a given router may have no connection to the other routers at all.
If the same NetFlow data information exists in multiple groups of NetFlow data information sent by the same NetFlow data export device, it is confirmed that the NetFlow data export device is misconfigured.
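The two cases above, together with the preset time period, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `classify`, the label strings, and the `(timestamp, key, exporter_ip)` input shape are hypothetical, and when a key qualifies for both cases the sketch reports it as a duplicate.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

Observation = Tuple[float, Hashable, str]  # (timestamp in seconds, key, exporter_ip)


def classify(
    observations: Iterable[Observation],
    now: float,
    window_seconds: float = 60.0,
) -> Dict[Hashable, str]:
    """Label each key seen within the window ending at `now`.

    "duplicate"              : same key from different export devices
    "exporter-misconfigured" : same key repeated by a single export device
    "unique"                 : key seen exactly once
    """
    exporters_by_key: Dict[Hashable, List[str]] = defaultdict(list)
    for timestamp, key, exporter in observations:
        # Keep only observations inside the preset time period.
        if 0 <= now - timestamp <= window_seconds:
            exporters_by_key[key].append(exporter)

    labels: Dict[Hashable, str] = {}
    for key, exporters in exporters_by_key.items():
        if len(set(exporters)) > 1:
            labels[key] = "duplicate"
        elif len(exporters) > 1:
            labels[key] = "exporter-misconfigured"
        else:
            labels[key] = "unique"
    return labels
```

The default of 60 seconds sits inside the 50 s to 70 s range given for the preset time period.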
Step 103: identifying the duplicate NetFlow data according to the keyword information.
In this embodiment, if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data export devices, the IP address information of those NetFlow data export devices is recorded, and all NetFlow data with that keyword information except the first piece is identified as duplicate.
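Step 103 can be sketched as a two-pass function in Python. The sketch assumes the records are already ordered by arrival and that each record has been reduced to a `(key, exporter_ip)` pair; the function name `flag_duplicates` is hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple


def flag_duplicates(
    records: Sequence[Tuple[Hashable, str]],
) -> Tuple[List[bool], Dict[Hashable, List[str]]]:
    """Flag every record except the first for keys sent by several exporters.

    Returns (flags, duplicated), where flags[i] is True if records[i] is
    identified as duplicate, and duplicated maps each affected key to the
    export device IPs that sent it, in order of first appearance.
    """
    # Pass 1: record which export devices sent each key.
    exporters_by_key: Dict[Hashable, List[str]] = {}
    for key, exporter in records:
        seen_exporters = exporters_by_key.setdefault(key, [])
        if exporter not in seen_exporters:
            seen_exporters.append(exporter)
    duplicated = {k: v for k, v in exporters_by_key.items() if len(v) > 1}

    # Pass 2: keep the first record of each duplicated key, flag the rest.
    seen: Set[Hashable] = set()
    flags: List[bool] = []
    for key, _ in records:
        flags.append(key in duplicated and key in seen)
        seen.add(key)
    return flags, duplicated
```

Keys that only ever arrive from a single export device are never flagged here, since the text treats that situation as a configuration error rather than as cross-device duplication.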
To help the staff find errors in time, as shown in Fig. 2, the method further includes:
Step 201: alerting the staff after duplicate NetFlow data is identified.
In a specific implementation, once the acquisition device recognizes that the NetFlow data it has acquired is duplicated, it first alerts the staff (who can adjust the configuration of the NetFlow export devices according to the alarm information), then automatically records the IP addresses of the NetFlow data export devices that generated the duplicate data, and identifies as duplicate all of the identical NetFlow data except the first piece.
The identified duplicate NetFlow data information and the NetFlow data export device IP address information can also be used to calculate a data transfer path: the sequence of NetFlow data export devices that a session passes through is the transfer path of that session.
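The text does not say how the transfer path is calculated. One way it could be done for the consecutive case is sketched below in Python, assuming each duplicate record of a session supplies an `(exporter_ip, next_hop)` pair and that the export devices form a single chain; the function name `session_path` is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def session_path(hops: Sequence[Tuple[str, str]]) -> List[str]:
    """Order export devices by following next-hop links.

    `hops` holds (exporter_ip, next_hop) for the duplicate records of one
    session. Returns the export devices in traversal order, or an empty
    list when no single starting device can be determined.
    """
    next_of: Dict[str, str] = dict(hops)
    exporters = set(next_of)
    # Export devices that some other export device points at.
    pointed_at = {nh for nh in next_of.values() if nh in exporters}
    starts = [e for e in next_of if e not in pointed_at]
    if len(starts) != 1:
        return []  # disconnected devices or a cycle: no unique chain

    path = [starts[0]]
    while True:
        nxt = next_of[path[-1]]
        if nxt not in exporters or nxt in path:
            break
        path.append(nxt)
    return path
```

For non-consecutive export devices the next-hop links do not connect the devices, so this sketch returns an empty list; ordering them would need some other signal that the text does not describe.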
Step 104: deduplicating the duplicate NetFlow data according to the identification result.
In this embodiment, step 104 includes: deleting the NetFlow data that has been identified as duplicate.
In a specific implementation, operations such as ignoring or discarding the duplicate NetFlow data can be performed according to the identifier.
According to the NetFlow data deduplication method provided by the embodiment of the invention, NetFlow data information and IP address information of the NetFlow data export devices are acquired, and keyword information that reflects duplicate NetFlow data is identified from them. Duplicate NetFlow data can then be identified from the keyword information alone, and the duplicate NetFlow data is deduplicated according to the identification result. In summary, duplicate NetFlow data can be deduplicated, which prevents a single piece of NetFlow data from being collected, calculated, or forwarded by the acquisition device multiple times and thereby avoids both heavy resource consumption and inaccurate subsequent statistical results.
The invention is illustrated below with a specific example:
S1: the acquisition device receives and parses the NetFlow data.
S2: the source address, destination address, source port, destination port, protocol type, and next-hop IP (hereinafter the six-tuple) are extracted from the parsed NetFlow data, together with the IP address of the NetFlow data export device, and these seven fields are combined into a new piece of data.
S3: the six-tuple information generated in step S2 is cached for a certain period of time (typically 60 seconds).
S4: the information cached in step S3 is compared in real time; if the same six-tuple appears but was not sent by the same NetFlow data export device, the NetFlow data is duplicated.
S5: a warning of NetFlow data duplication is issued, and the six-tuple identified in step S4 and the export device IP information (seven fields in total) are cached.
S6: all NetFlow data is identified in real time against the information cached in step S5: among NetFlow data whose source address, destination address, source port, destination port, and protocol type (five fields) match the cached information, only the first piece is retained, and all the other matching pieces are identified as duplicate data.
S7: after the NetFlow data has been identified, the acquisition device can ignore or discard the data according to the identifier, which completes the deduplication of the duplicate NetFlow data.
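Steps S3 to S7 can be sketched as a small streaming class in Python. This is one interpretation, not the patented implementation: the names `Flow` and `Deduplicator` are hypothetical, "retaining the first piece" is interpreted as retaining records from the export device seen first, and the S5 cache is never expired because the text gives no lifetime for it.

```python
import logging
from typing import Dict, NamedTuple, Tuple

log = logging.getLogger(__name__)


class Flow(NamedTuple):
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int
    next_hop: str
    exporter_ip: str


class Deduplicator:
    def __init__(self, window_seconds: float = 60.0) -> None:
        self.window = window_seconds
        # S3 cache: six-tuple -> (time first seen, exporter IP)
        self._recent: Dict[tuple, Tuple[float, str]] = {}
        # S5 cache: five-tuple -> exporter whose records are retained
        self._retained: Dict[tuple, str] = {}

    def is_duplicate(self, flow: Flow, now: float) -> bool:
        """Return True if `flow` should be ignored or discarded (S7)."""
        # S3: drop six-tuples older than the caching period.
        cutoff = now - self.window
        self._recent = {k: v for k, v in self._recent.items() if v[0] >= cutoff}

        six, five = flow[:6], flow[:5]

        # S6: session already known to be duplicated.
        retained = self._retained.get(five)
        if retained is not None:
            return flow.exporter_ip != retained

        first = self._recent.get(six)
        if first is None:
            self._recent[six] = (now, flow.exporter_ip)
            return False
        if first[1] == flow.exporter_ip:
            return False  # same exporter: not a cross-device duplicate

        # S4 and S5: same six-tuple from a different exporter.
        log.warning("duplicate NetFlow data: %s from %s and %s",
                    six, first[1], flow.exporter_ip)
        self._retained[five] = first[1]
        return True
```

A caller would invoke `is_duplicate` once per parsed record and drop the record when it returns `True`. In a long-running collector the S5 cache would also need some eviction policy, which is left out here.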
Based on the same inventive concept, an embodiment of the present invention further provides a NetFlow data deduplication device, as described in the following embodiments. Because the device solves the problem on a principle similar to that of the NetFlow data deduplication method, its implementation can refer to the implementation of the method, and repeated parts are not described again. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the device described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 3 is a schematic structural diagram of a NetFlow data deduplication device provided in an embodiment of the present invention, and as shown in fig. 3, the device includes:
an information acquisition module 301, configured to acquire NetFlow data information and IP address information of the NetFlow data export devices;
a data identification module 302, configured to identify, according to the NetFlow data information and the IP address information of the NetFlow data export devices, keyword information that reflects duplicate NetFlow data;
a data identification module 303, configured to identify the duplicate NetFlow data according to the keyword information;
and a data deduplication module 304, configured to deduplicate the duplicate NetFlow data according to the identification result.
In the embodiment of the present invention, the NetFlow data information and the NetFlow data export device IP address information used for identifying the keyword information include:
the source address information, the destination address information, the source port information, the destination port information, the protocol type information, and the next-hop address information in the NetFlow data, together with the NetFlow data export device IP address information.
In an embodiment of the present invention, the data identification module 302 is further configured to:
compare, within a preset time period, the acquired IP address information of a plurality of NetFlow data export devices and the groups of NetFlow data information corresponding to the IP address information of each NetFlow data export device;
determine, according to the IP address information of the plurality of NetFlow data export devices, whether groups containing the same NetFlow data information came from the same NetFlow data export device;
and identify, according to the determination result, the keyword information that reflects duplicate NetFlow data.
In an embodiment of the present invention, the data identification module 302 is further configured to:
if, in the NetFlow data information, the next-hop address information is the same as the IP address of an export device, confirm that duplicate NetFlow data exists and identify the keyword information that reflects the duplicate NetFlow data.
In an embodiment of the present invention, the data identification module 302 is further configured to:
if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data export devices, confirm that duplicate NetFlow data exists;
and if the same NetFlow data information exists in multiple groups of NetFlow data information sent by the same NetFlow data export device, confirm that the NetFlow data export device is misconfigured.
In this embodiment of the present invention, the data identification module 303 is further configured to:
if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data export devices, record the IP address information of those NetFlow data export devices, and identify as duplicates all NetFlow data with that keyword information except the first piece.
In the embodiment of the present invention, as shown in Fig. 4, the apparatus further includes:
an alarm module 401, configured to alert the staff after duplicate NetFlow data is identified.
In an embodiment of the present invention, the data deduplication module 304 is further configured to:
delete the NetFlow data that has been identified as duplicate.
The embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the above method when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program for executing the above method is stored.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (18)

1. A NetFlow data deduplication method, characterized by comprising the following steps:
acquiring NetFlow data information and IP address information of a NetFlow data export device;
identifying, according to the NetFlow data information and the IP address information of the NetFlow data export device, keyword information that reflects duplicate NetFlow data;
marking the duplicate NetFlow data according to the keyword information;
and performing deduplication processing on the duplicate NetFlow data according to the marking result.
2. The method of claim 1, wherein the NetFlow data information and the IP address information of the NetFlow data export device used to identify the keyword information comprise:
source address information, destination address information, source port information, destination port information, protocol type information, and next hop address information in the NetFlow data, and the IP address information of the NetFlow data export device.
3. The method of claim 2, wherein identifying the keyword information that reflects duplicate NetFlow data according to the NetFlow data information and the IP address information of the NetFlow data export device comprises:
comparing the acquired IP address information of a plurality of NetFlow data export devices with the plurality of groups of NetFlow data information that correspond, within a preset time period, to the IP address information of each NetFlow data export device;
judging, according to the IP address information of the plurality of NetFlow data export devices, whether the NetFlow data export devices that sent identical NetFlow data information among the plurality of groups are the same NetFlow data export device;
and identifying the keyword information that reflects duplicate NetFlow data according to the judgment result.
4. The method of claim 3, wherein identifying the keyword information that reflects duplicate NetFlow data according to the NetFlow data information and the IP address information of the NetFlow data export device further comprises:
if the next hop address information is the same as the IP address of the NetFlow data export device in the NetFlow data information, confirming that duplicate NetFlow data exists, and identifying the keyword information that reflects the duplicate NetFlow data.
5. The method of claim 4, wherein identifying the keyword information that reflects duplicate NetFlow data according to the NetFlow data information and the IP address information of the NetFlow data export device further comprises:
if the same keyword information exists in a plurality of groups of NetFlow data information sent by different NetFlow data export devices, confirming that duplicate NetFlow data exists;
and if the same NetFlow data information exists in a plurality of groups of NetFlow data information sent by the same NetFlow data export device, confirming that the configuration of that NetFlow data export device is erroneous.
6. The method of claim 5, wherein marking the duplicate NetFlow data according to the keyword information comprises:
if the same keyword information exists in a plurality of groups of NetFlow data information sent by different NetFlow data export devices, recording the IP address information of the different NetFlow data export devices, and marking all NetFlow data other than the first among the NetFlow data information having the same keyword information.
7. The method of claim 1, further comprising:
after duplicate NetFlow data is identified, raising an alarm to the staff.
8. The method of claim 1, wherein performing deduplication processing on the duplicate NetFlow data according to the marking result comprises:
deleting the duplicate NetFlow data that carries the mark.
9. A NetFlow data deduplication apparatus, comprising:
an information acquisition module, configured to acquire NetFlow data information and IP address information of a NetFlow data export device;
a data identification module, configured to identify, according to the NetFlow data information and the IP address information of the NetFlow data export device, keyword information that reflects duplicate NetFlow data;
the data identification module being further configured to mark the duplicate NetFlow data according to the keyword information;
and a data deduplication module, configured to perform deduplication processing on the duplicate NetFlow data according to the marking result.
10. The apparatus of claim 9, wherein the NetFlow data information and the IP address information of the NetFlow data export device used to identify the keyword information comprise:
source address information, destination address information, source port information, destination port information, protocol type information, and next hop address information in the NetFlow data, and the IP address information of the NetFlow data export device.
11. The apparatus of claim 9, wherein the data identification module is further configured to:
compare the acquired IP address information of a plurality of NetFlow data export devices with the plurality of groups of NetFlow data information that correspond, within a preset time period, to the IP address information of each NetFlow data export device;
judge, according to the IP address information of the plurality of NetFlow data export devices, whether the NetFlow data export devices that sent identical NetFlow data information among the plurality of groups are the same NetFlow data export device;
and identify the keyword information that reflects duplicate NetFlow data according to the judgment result.
12. The apparatus of claim 11, wherein the data identification module is further configured to:
if the next hop address information is the same as the IP address of the NetFlow data export device in the NetFlow data information, confirm that duplicate NetFlow data exists, and identify the keyword information that reflects the duplicate NetFlow data.
13. The apparatus of claim 12, wherein the data identification module is further configured to:
if the same keyword information exists in a plurality of groups of NetFlow data information sent by different NetFlow data export devices, confirm that duplicate NetFlow data exists;
and if the same NetFlow data information exists in a plurality of groups of NetFlow data information sent by the same NetFlow data export device, confirm that the configuration of that NetFlow data export device is erroneous.
14. The apparatus of claim 13, wherein the data identification module is further configured to:
if the same keyword information exists in a plurality of groups of NetFlow data information sent by different NetFlow data export devices, record the IP address information of the different NetFlow data export devices, and mark all NetFlow data other than the first among the NetFlow data information having the same keyword information.
15. The apparatus of claim 9, further comprising:
an alarm module, configured to raise an alarm to the staff after duplicate NetFlow data is identified.
16. The apparatus of claim 9, wherein the data deduplication module is further configured to:
delete the duplicate NetFlow data that carries the mark.
17. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 8 when executing the computer program.
18. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the method of any one of claims 1 to 8.
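The method of claims 1 to 8 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several points are assumptions made only for illustration: the keyword information is taken to be the flow five-tuple, the "preset time period" is modelled as fixed 60-second buckets, and the names `FlowRecord`, `flow_key`, `next_hop_is_exporter`, `deduplicate`, and the `timestamp` field are hypothetical. Records sharing a key but sent by a different export device than the first record are marked as duplicates and dropped; a repeat from the same export device is reported as a possible configuration error and kept.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class FlowRecord:
    exporter_ip: str  # IP address of the NetFlow data export device
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int
    next_hop: str
    timestamp: float  # hypothetical field: collector receive time, seconds


FlowKey = Tuple[str, str, int, int, int]


def flow_key(record: FlowRecord) -> FlowKey:
    """Assumed keyword information: the flow five-tuple."""
    return (record.src_addr, record.dst_addr,
            record.src_port, record.dst_port, record.protocol)


def next_hop_is_exporter(record: FlowRecord, exporter_ips: Set[str]) -> bool:
    """One reading of claim 4: the next hop is itself a known export device."""
    return record.next_hop in exporter_ips


def deduplicate(
    records: Iterable[FlowRecord], window: float = 60.0
) -> Tuple[List[FlowRecord], List[FlowRecord], Set[str]]:
    """Return (kept records, marked duplicates, possibly misconfigured exporters)."""
    first_seen: Dict[Tuple[FlowKey, int], FlowRecord] = {}
    kept: List[FlowRecord] = []
    duplicates: List[FlowRecord] = []
    misconfigured: Set[str] = set()
    for record in sorted(records, key=lambda r: r.timestamp):
        # Assumed time-window model: fixed buckets of `window` seconds.
        bucket = (flow_key(record), int(record.timestamp // window))
        first = first_seen.get(bucket)
        if first is None:
            # First record with this key in the window is always kept.
            first_seen[bucket] = record
            kept.append(record)
        elif record.exporter_ip != first.exporter_ip:
            # Same key from a different export device: mark as duplicate.
            duplicates.append(record)
        else:
            # Same key from the same export device: flag the device, keep the data.
            misconfigured.add(record.exporter_ip)
            kept.append(record)
    return kept, duplicates, misconfigured
```

Calling `deduplicate(records)` returns the three collections; the `duplicates` list corresponds to the marked data that claim 8 deletes, and a non-empty list could trigger the alarm of claim 7.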
CN201911280899.6A 2019-12-13 2019-12-13 NetFlow data duplicate removal method and device Active CN111064637B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911280899.6A CN111064637B (en) 2019-12-13 2019-12-13 NetFlow data duplicate removal method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911280899.6A CN111064637B (en) 2019-12-13 2019-12-13 NetFlow data duplicate removal method and device

Publications (2)

Publication Number Publication Date
CN111064637A (en) 2020-04-24
CN111064637B (en) 2021-10-01

Family

ID=70300981

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911280899.6A Active CN111064637B (en) 2019-12-13 2019-12-13 NetFlow data duplicate removal method and device

Country Status (1)

Country Link
CN (1) CN111064637B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112468486A (en) * 2020-11-24 2021-03-09 北京天融信网络安全技术有限公司 Netflow data duplicate removal method and device, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090161578A1 (en) * 2007-12-21 2009-06-25 Hong Kong Applied Science And Technology Research Institute Co. Ltd. Data routing method and device thereof
CN101483491A (en) * 2008-01-11 2009-07-15 华为技术有限公司 Shared guard ring, multicast source route protection method and node thereof
CN102158401A (en) * 2011-03-03 2011-08-17 江苏方天电力技术有限公司 Flow monitoring model based on electric automation system
US20130332596A1 (en) * 2012-06-11 2013-12-12 James O. Jones Network traffic tracking
CN205336305U (en) * 2015-12-07 2016-06-22 贵州电网公司信息通信分公司 Hardware framework that NS3 parallel simulation simulation system used
CN106027406A (en) * 2016-05-23 2016-10-12 电子科技大学 NS3 simulation system flow importing method based on Netflow
CN106209840A (en) * 2016-07-12 2016-12-07 中国银联股份有限公司 A kind of network packet De-weight method and device
CN110557302A (en) * 2019-08-30 2019-12-10 西南交通大学 Network equipment message observation data acquisition method

Also Published As

Publication number Publication date
CN111064637B (en) 2021-10-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: Room 702-2, No. 4811, Cao'an Highway, Jiading District, Shanghai

Patentee after: CHINA UNITECHS

Address before: 100872 5th floor, Renmin culture building, 59 Zhongguancun Street, Haidian District, Beijing

Patentee before: CHINA UNITECHS
