CN111064637A - NetFlow data duplicate removal method and device
- Publication number: CN111064637A
- Application number: CN201911280899.6A
- Authority: CN (China)
- Prior art keywords: netflow, data, information, netflow data, repeated
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- H: ELECTRICITY
- H04: ELECTRIC COMMUNICATION TECHNIQUE
- H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00: Arrangements for monitoring or testing data switching networks
- H04L43/02: Capturing of monitoring data
- H04L43/028: Capturing of monitoring data by filtering
- H04L61/00: Network arrangements, protocols or services for addressing or naming
- H04L61/50: Address allocation
- H04L61/5007: Internet protocol [IP] addresses
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a NetFlow data deduplication method and device. The method comprises: acquiring NetFlow data information and the IP address information of the NetFlow data export device; identifying, according to the NetFlow data information and the IP address information of the NetFlow data export device, keyword information that reflects duplicate NetFlow data; marking duplicate NetFlow data according to the keyword information; and deduplicating the duplicate NetFlow data according to the marking result. The invention deduplicates duplicate NetFlow data and prevents a single piece of NetFlow data from being collected, computed, or forwarded by the collection device multiple times, which would consume substantial resources and make subsequent statistics inaccurate.
Description
Technical Field
The invention relates to the technical field of communication, in particular to a NetFlow data duplicate removal method and device.
Background
NetFlow is a network monitoring function that provides a session-level view of network traffic and records information about each TCP/IP (Transmission Control Protocol/Internet Protocol) session. However, network administrators often lack in-depth knowledge of how NetFlow itself is implemented or of the topology of the network being monitored, so the information of a single TCP/IP session may be exported by NetFlow data export devices as multiple duplicate pieces of NetFlow data. If the duplicate NetFlow data cannot be identified and eliminated, a single piece of NetFlow data is collected, computed, or forwarded by the collection device multiple times, which both consumes substantial resources and makes subsequent statistics inaccurate. At present, there is no method for deduplicating NetFlow data.
Disclosure of Invention
An embodiment of the invention provides a NetFlow data deduplication method for deduplicating duplicate NetFlow data, so that a single piece of NetFlow data is not collected, computed, or forwarded by the collection device multiple times, which would consume substantial resources and make subsequent statistics inaccurate. The method comprises:
acquiring NetFlow data information and the IP address information of the NetFlow data export device;
identifying, according to the NetFlow data information and the IP address information of the NetFlow data export device, keyword information that reflects duplicate NetFlow data;
marking duplicate NetFlow data according to the keyword information;
and deduplicating the duplicate NetFlow data according to the marking result.
Optionally, the NetFlow data information and NetFlow data export device IP address information used to identify the keyword information comprise:
the source address information, destination address information, source port information, destination port information, protocol type information, and next-hop address information in the NetFlow data, and the IP address information of the NetFlow data export device.
Optionally, identifying, according to the NetFlow data information and the IP address information of the NetFlow data export device, keyword information that reflects duplicate NetFlow data comprises:
comparing, within a preset time period, the acquired IP address information of a plurality of NetFlow data export devices and the groups of NetFlow data information corresponding to the IP address of each export device;
judging, according to the IP address information of the plurality of NetFlow data export devices, whether groups containing the same NetFlow data information come from the same NetFlow data export device;
and identifying the keyword information that reflects duplicate NetFlow data according to the judgment result.
Optionally, identifying, according to the NetFlow data information and the IP address information of the NetFlow data export device, keyword information that reflects duplicate NetFlow data further comprises:
if the next-hop address information in the NetFlow data information is the same as the IP address of an export device, confirming that duplicate NetFlow data exists and identifying the keyword information that reflects it.
Optionally, identifying, according to the NetFlow data information and the IP address information of the NetFlow data export device, keyword information that reflects duplicate NetFlow data further comprises:
if the same keyword information exists in groups of NetFlow data information sent by different NetFlow data export devices, confirming that duplicate NetFlow data exists;
and if the same NetFlow data information exists in groups of NetFlow data information sent by the same NetFlow data export device, confirming that the configuration of that export device is in error.
Optionally, marking duplicate NetFlow data according to the keyword information comprises:
if the same keyword information exists in groups of NetFlow data information sent by different NetFlow data export devices, recording the IP address information of those export devices and marking, among the NetFlow data with the same keyword information, all NetFlow data other than the first piece.
Optionally, the method further comprises:
alerting staff after duplicate NetFlow data is identified.
Optionally, deduplicating the duplicate NetFlow data according to the marking result comprises:
deleting the marked duplicate NetFlow data.
An embodiment of the invention further provides a NetFlow data deduplication device for deduplicating duplicate NetFlow data, so that a single piece of NetFlow data is not collected, computed, or forwarded by the collection device multiple times, which would consume substantial resources and make subsequent statistics inaccurate. The device comprises:
an information acquisition module, configured to acquire NetFlow data information and the IP address information of the NetFlow data export device;
a data identification module, configured to identify, according to the NetFlow data information and the IP address information of the NetFlow data export device, keyword information that reflects duplicate NetFlow data;
a data marking module, configured to mark duplicate NetFlow data according to the keyword information;
and a data deduplication module, configured to deduplicate the duplicate NetFlow data according to the marking result.
Optionally, the NetFlow data information and NetFlow data export device IP address information used to identify the keyword information comprise:
the source address information, destination address information, source port information, destination port information, protocol type information, and next-hop address information in the NetFlow data, and the IP address information of the NetFlow data export device.
Optionally, the data identification module is further configured to:
compare, within a preset time period, the acquired IP address information of a plurality of NetFlow data export devices and the groups of NetFlow data information corresponding to the IP address of each export device;
judge, according to the IP address information of the plurality of NetFlow data export devices, whether groups containing the same NetFlow data information come from the same NetFlow data export device;
and identify the keyword information that reflects duplicate NetFlow data according to the judgment result.
Optionally, the data identification module is further configured to:
if the next-hop address information in the NetFlow data information is the same as the IP address of an export device, confirm that duplicate NetFlow data exists and identify the keyword information that reflects it.
Optionally, the data identification module is further configured to:
if the same keyword information exists in groups of NetFlow data information sent by different NetFlow data export devices, confirm that duplicate NetFlow data exists;
and if the same NetFlow data information exists in groups of NetFlow data information sent by the same NetFlow data export device, confirm that the configuration of that export device is in error.
Optionally, the data marking module is further configured to:
if the same keyword information exists in groups of NetFlow data information sent by different NetFlow data export devices, record the IP address information of those export devices and mark, among the NetFlow data with the same keyword information, all NetFlow data other than the first piece.
Optionally, the device further comprises:
an alarm module, configured to alert staff after duplicate NetFlow data is identified.
Optionally, the data deduplication module is further configured to:
delete the marked duplicate NetFlow data.
The embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the method when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program for executing the above method is stored.
In the embodiment of the invention, NetFlow data information and the IP address information of the NetFlow data export device are acquired, and keyword information that reflects duplicate NetFlow data is identified from them. Duplicate NetFlow data then only needs to be marked according to the keyword information and deduplicated according to the marking result. In summary, duplicate NetFlow data can be deduplicated, which prevents a single piece of NetFlow data from being collected, computed, or forwarded by the collection device multiple times, consuming substantial resources and making subsequent statistics inaccurate.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 is a flowchart of a NetFlow data deduplication method in an embodiment of the present invention;
Fig. 2 is another flowchart of the NetFlow data deduplication method in an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a NetFlow data deduplication device in an embodiment of the present invention;
Fig. 4 is another schematic structural diagram of the NetFlow data deduplication device in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
Fig. 1 is a flowchart of a NetFlow data deduplication method according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
Step 101: acquiring NetFlow data information and the IP address information of the NetFlow data export device.
In this embodiment, acquiring NetFlow data information includes receiving and parsing the NetFlow data information (the content of received NetFlow data information generally cannot be read directly and must first be parsed).
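The text does not say which NetFlow version or wire format is parsed. As a minimal sketch, assuming the fixed-layout NetFlow version 5 export packet (a 24-byte header followed by 48-byte flow records), the parsing step might look like the following. The function name `parse_v5` and the dictionary keys are hypothetical and chosen only for illustration.

```python
import socket
import struct

# NetFlow v5 layout (an assumption; the text does not name a version).
V5_HEADER = struct.Struct("!HHIIIIBBH")                # 24 bytes
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")  # 48 bytes


def parse_v5(packet: bytes) -> list:
    """Decode one NetFlow v5 export packet into a list of flow dicts."""
    version, count = V5_HEADER.unpack_from(packet, 0)[:2]
    if version != 5:
        raise ValueError(f"unsupported NetFlow version: {version}")
    flows = []
    for i in range(count):
        offset = V5_HEADER.size + i * V5_RECORD.size
        f = V5_RECORD.unpack_from(packet, offset)
        # Keep only the six fields the method later uses as its key.
        flows.append({
            "src_addr": socket.inet_ntoa(f[0]),
            "dst_addr": socket.inet_ntoa(f[1]),
            "next_hop": socket.inet_ntoa(f[2]),
            "src_port": f[9],
            "dst_port": f[10],
            "protocol": f[13],
        })
    return flows
```

Template-based formats such as NetFlow v9 would need a different parser; the sketch only illustrates why raw export packets have to be decoded before their fields can be compared.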
The NetFlow data information and NetFlow data export device IP address information used to identify the keyword information include: the source address information, destination address information, source port information, destination port information, protocol type information, and next-hop address information in the NetFlow data, and the IP address information of the NetFlow data export device. That is, the keyword information is identified from the combination of these seven fields.
The NetFlow data export device may be a router, a switch, or the like.
In a specific implementation, the original NetFlow data is parsed, and the IP address of the NetFlow data export device is appended to the parsed data to form a single record that represents the information of one TCP/IP session.
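One way to picture the resulting seven-field record is a small immutable type. This is only a sketch: the names `FlowRecord`, `six_tuple`, and `make_record` are hypothetical, and the parsed flow is assumed to arrive as a dictionary with the six field names shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int
    next_hop: str
    exporter_ip: str  # appended by the collector after parsing

    def six_tuple(self) -> tuple:
        """The six flow fields, without the export device address."""
        return (self.src_addr, self.dst_addr, self.src_port,
                self.dst_port, self.protocol, self.next_hop)


def make_record(parsed: dict, exporter_ip: str) -> FlowRecord:
    """Attach the export device's IP address to one parsed flow."""
    return FlowRecord(exporter_ip=exporter_ip, **parsed)
```

Because the type is frozen it is hashable, so records and their six-tuples can be used directly as dictionary keys or set members when comparing data from different export devices.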
Step 102: identifying, according to the NetFlow data information and the IP address information of the NetFlow data export device, keyword information that reflects duplicate NetFlow data.
In this embodiment, step 102 includes: comparing, within a preset time period, the acquired IP address information of a plurality of NetFlow data export devices and the groups of NetFlow data information corresponding to the IP address of each export device, where the preset time period may be 50 s to 70 s (e.g., 55 s, 60 s, or 65 s);
judging, according to the IP address information of the plurality of NetFlow data export devices, whether groups containing the same NetFlow data information come from the same NetFlow data export device;
and identifying the keyword information that reflects duplicate NetFlow data according to the judgment result.
Specifically, if the next-hop address information in the NetFlow data information is the same as the IP address of an export device, it is confirmed that duplicate NetFlow data exists, and the keyword information that reflects it is identified.
If the next-hop address information in the NetFlow data is the same as the IP address of an export device, the same network session has passed through a plurality of consecutive export devices, and the NetFlow data reflecting that session is therefore duplicated when exported. Here the export devices are routers, and "consecutive" means that the routers are connected in sequence.
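The next-hop check can be sketched as follows. The sketch assumes that the address an export device reports as its own is the same address its upstream neighbour uses as the next hop, which need not hold on every network; the function name and record layout are hypothetical.

```python
def consecutive_exporter_pairs(records):
    """Return sorted (exporter_ip, next_hop) pairs whose next hop is another export device."""
    exporters = {r["exporter_ip"] for r in records}
    return sorted({
        (r["exporter_ip"], r["next_hop"])
        for r in records
        if r["next_hop"] in exporters and r["next_hop"] != r["exporter_ip"]
    })


records = [
    {"exporter_ip": "10.0.0.1", "next_hop": "10.0.0.2"},     # router 1 hands off to router 2
    {"exporter_ip": "10.0.0.2", "next_hop": "203.0.113.9"},  # router 2 hands off to a non-exporter
]
pairs = consecutive_exporter_pairs(records)
```

Here `pairs` contains the single pair of router 1 and router 2, indicating that the session crossed two export devices connected in sequence and was therefore exported twice.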
If the same keyword information exists in groups of NetFlow data information sent by different NetFlow data export devices, it is confirmed that duplicate NetFlow data exists. In this case the same network session has passed through a plurality of non-consecutive export devices, and the NetFlow data reflecting that session is therefore duplicated when exported. Here the export devices are routers, and "non-consecutive" means that the routers are not directly connected to one another; a given router may have no connection to the others.
If the same NetFlow data information exists in groups of NetFlow data information sent by the same NetFlow data export device, it is confirmed that the configuration of that export device is in error.
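The two outcomes described above can be sketched as a single pass over the records collected in one comparison window. The function name `classify` and the input shape, pairs of (six-tuple, export device IP), are assumptions made for illustration, and the records are assumed to have already been restricted to the preset time period.

```python
from collections import defaultdict


def classify(records):
    """Classify one window of (six_tuple, exporter_ip) pairs.

    Returns (duplicates, misconfigured):
      duplicates    - six-tuples reported by more than one export device
      misconfigured - export devices that sent the same six-tuple more than once
    """
    counts = defaultdict(lambda: defaultdict(int))
    for six_tuple, exporter_ip in records:
        counts[six_tuple][exporter_ip] += 1

    duplicates, misconfigured = set(), set()
    for six_tuple, per_exporter in counts.items():
        if len(per_exporter) > 1:
            duplicates.add(six_tuple)
        for exporter_ip, n in per_exporter.items():
            if n > 1:
                misconfigured.add(exporter_ip)
    return duplicates, misconfigured
```

Keeping the per-device counts separate from the per-key device set is what lets the same pass distinguish duplication across devices from repetition by a single device.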
Step 103: marking duplicate NetFlow data according to the keyword information.
In this embodiment, if the same keyword information exists in groups of NetFlow data information sent by different NetFlow data export devices, the IP address information of those export devices is recorded, and, among the NetFlow data with the same keyword information, all NetFlow data other than the first piece is marked.
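A minimal sketch of this marking step is shown below. It assumes records arrive in order as (key, export device IP) pairs, and it treats "the first piece" as the record from the export device seen first for a given key; both the function name and that reading are assumptions.

```python
def mark_duplicates(records):
    """Mark every record whose key was first reported by a different export device.

    records: iterable of (key, exporter_ip) in arrival order.
    Returns (marked, exporters_by_key), where marked is a list of
    (key, exporter_ip, is_duplicate) and exporters_by_key records, per key,
    the export device IPs in the order they were first seen.
    """
    exporters_by_key = {}
    marked = []
    for key, exporter_ip in records:
        exporters = exporters_by_key.setdefault(key, [])
        is_duplicate = bool(exporters) and exporter_ip != exporters[0]
        if exporter_ip not in exporters:
            exporters.append(exporter_ip)
        marked.append((key, exporter_ip, is_duplicate))
    return marked, exporters_by_key
```

Repeats from the same export device are left unmarked here, since the text treats that case as a configuration error rather than as duplication across devices.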
To help staff discover errors in time, as shown in Fig. 2, the method further includes: alerting staff after duplicate NetFlow data is identified.
In a specific implementation, once the collection device recognizes that collected NetFlow data is duplicated, it first alerts staff (who can adjust the configuration of the NetFlow export device according to the alert information), then automatically records the IP address of the NetFlow data export device that generated the duplicate data, and marks all identical NetFlow data other than the first piece.
The marked duplicate NetFlow data information and the recorded NetFlow data export device IP address information can also be used to calculate the data transfer path: the sequence of NetFlow data export devices that a session passes through is the transfer path of that session.
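The text does not describe how the path is calculated. One possible sketch, valid only for the consecutive case where each export device's next hop is the address of the following export device, orders the devices by following the next-hop chain. The function name `transfer_path` and the input mapping are hypothetical.

```python
def transfer_path(hops):
    """Order export devices along one session's path.

    hops maps exporter_ip -> next_hop reported by that device for the session.
    """
    targets = set(hops.values())
    starts = [ip for ip in hops if ip not in targets]
    if len(starts) != 1:
        raise ValueError("cannot find a unique first export device")
    path, current = [], starts[0]
    # Follow next hops until leaving the set of export devices (or looping).
    while current in hops and current not in path:
        path.append(current)
        current = hops[current]
    return path


path = transfer_path({
    "10.0.0.2": "10.0.0.3",
    "10.0.0.1": "10.0.0.2",
    "10.0.0.3": "203.0.113.9",
})
```

In this example `path` lists the three export devices in the order the session crossed them, starting from the one that no other device names as its next hop.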
Step 104: deduplicating the duplicate NetFlow data according to the marking result.
In this embodiment, step 104 includes deleting the marked duplicate NetFlow data.
In a specific implementation, duplicate NetFlow data can be ignored, discarded, or otherwise handled according to the mark.
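The effect of this step on downstream statistics can be shown with a small worked example. The `octets` field is an assumption added for illustration (the text lists no byte counter among its fields), and the input is assumed to be a list of (record, is_duplicate) pairs produced by an earlier marking step.

```python
def deduplicate(marked):
    """Keep only the records that were not marked as duplicates."""
    return [record for record, is_duplicate in marked if not is_duplicate]


marked = [
    ({"exporter_ip": "10.0.0.1", "octets": 1500}, False),
    ({"exporter_ip": "10.0.0.2", "octets": 1500}, True),   # same session, second export device
    ({"exporter_ip": "10.0.0.1", "octets": 400}, False),
]

raw_total = sum(record["octets"] for record, _ in marked)
clean_total = sum(record["octets"] for record in deduplicate(marked))
```

Without deduplication the session exported by both devices is counted twice, so `raw_total` is 3400 while `clean_total` is 1900.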
In the NetFlow data deduplication method provided by the embodiment of the invention, NetFlow data information and the IP address information of the NetFlow data export device are acquired, and keyword information that reflects duplicate NetFlow data is identified from them. Duplicate NetFlow data then only needs to be marked according to the keyword information and deduplicated according to the marking result. In summary, duplicate NetFlow data can be deduplicated, which prevents a single piece of NetFlow data from being collected, computed, or forwarded by the collection device multiple times, consuming substantial resources and making subsequent statistics inaccurate.
The invention is illustrated below with a specific example:
S1: The collection device receives and parses the NetFlow data.
S2: From the parsed NetFlow data, extract the source address, destination address, source port, destination port, protocol type, and next-hop IP (hereinafter the six-tuple), together with the IP address information of the NetFlow data export device, and combine these seven fields into a new data record.
S3: Cache the six-tuple information generated in step S2 for a certain period of time (typically 60 seconds).
S4: Compare the information cached in step S3 in real time; if the same six-tuple information appears but was not sent by the same NetFlow data export device, the NetFlow data is duplicated.
S5: Issue a NetFlow data duplication alert, and cache the six-tuple identified in step S4 together with the export device IP information (seven fields in total).
S6: Mark all NetFlow data in real time according to the information cached in step S5: among NetFlow data whose source address, destination address, source port, destination port, and protocol type (five fields) match the cached information, only the first piece is retained, and all other matching NetFlow data is marked as duplicate.
S7: After the NetFlow data is marked, the collection device can ignore, discard, or otherwise handle the data according to the mark, completing the deduplication of duplicate NetFlow data.
Based on the same inventive concept, an embodiment of the present invention further provides a NetFlow data deduplication device, as described in the following embodiments. Because the device solves the problem on a principle similar to that of the NetFlow data deduplication method, its implementation can refer to the implementation of the method, and repeated details are not described again. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the devices described in the embodiments below are preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 3 is a schematic structural diagram of a NetFlow data deduplication device provided in an embodiment of the present invention. As shown in Fig. 3, the device includes:
an information acquisition module 301, configured to acquire NetFlow data information and the IP address information of the NetFlow data export device;
a data identification module 302, configured to identify, according to the NetFlow data information and the IP address information of the NetFlow data export device, keyword information that reflects duplicate NetFlow data;
a data marking module 303, configured to mark duplicate NetFlow data according to the keyword information;
and a data deduplication module 304, configured to deduplicate the duplicate NetFlow data according to the marking result.
In the embodiment of the present invention, the NetFlow data information and NetFlow data export device IP address information used to identify the keyword information include:
the source address information, destination address information, source port information, destination port information, protocol type information, and next-hop address information in the NetFlow data, and the IP address information of the NetFlow data export device.
In an embodiment of the present invention, the data identification module 302 is further configured to:
compare, within a preset time period, the acquired IP address information of a plurality of NetFlow data export devices and the groups of NetFlow data information corresponding to the IP address of each export device;
judge, according to the IP address information of the plurality of NetFlow data export devices, whether groups containing the same NetFlow data information come from the same NetFlow data export device;
and identify the keyword information that reflects duplicate NetFlow data according to the judgment result.
In an embodiment of the present invention, the data identification module 302 is further configured to:
if the next-hop address information in the NetFlow data information is the same as the IP address of an export device, confirm that duplicate NetFlow data exists and identify the keyword information that reflects it.
In an embodiment of the present invention, the data identification module 302 is further configured to:
if the same keyword information exists in groups of NetFlow data information sent by different NetFlow data export devices, confirm that duplicate NetFlow data exists;
and if the same NetFlow data information exists in groups of NetFlow data information sent by the same NetFlow data export device, confirm that the configuration of that export device is in error.
In an embodiment of the present invention, the data marking module 303 is further configured to:
if the same keyword information exists in groups of NetFlow data information sent by different NetFlow data export devices, record the IP address information of those export devices and mark, among the NetFlow data with the same keyword information, all NetFlow data other than the first piece.
In an embodiment of the present invention, as shown in Fig. 4, the device further includes:
an alarm module 401, configured to alert staff after duplicate NetFlow data is identified.
In an embodiment of the present invention, the data deduplication module 304 is further configured to:
delete the marked duplicate NetFlow data.
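As a rough software sketch of how modules 301 to 304 and the alarm module 401 might fit together, the class below gives each module one method. The class name `DedupDevice`, the method names, and the batch-style `run` helper are hypothetical; records are plain tuples of the six flow fields followed by the export device IP.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DedupDevice:
    alert: Callable[[str], None] = print   # stands in for alarm module 401

    def acquire(self, parsed_flows, exporter_ip):
        """Module 301: attach the export device IP to each parsed six-field flow."""
        return [tuple(flow) + (exporter_ip,) for flow in parsed_flows]

    def identify_keys(self, records):
        """Module 302: six-tuples reported by more than one export device."""
        exporters = {}
        for rec in records:
            exporters.setdefault(rec[:6], set()).add(rec[6])
        return {key for key, ips in exporters.items() if len(ips) > 1}

    def mark(self, records, keys):
        """Module 303: mark every record after the first for each duplicated key."""
        seen, marked = set(), []
        for rec in records:
            is_duplicate = rec[:6] in keys and rec[:6] in seen
            seen.add(rec[:6])
            marked.append((rec, is_duplicate))
        if any(flag for _, flag in marked):
            self.alert("duplicate NetFlow data detected")
        return marked

    def deduplicate(self, marked):
        """Module 304: drop the marked records."""
        return [rec for rec, is_duplicate in marked if not is_duplicate]

    def run(self, batches):
        """batches: iterable of (parsed_flows, exporter_ip) pairs."""
        records = [r for flows, ip in batches for r in self.acquire(flows, ip)]
        keys = self.identify_keys(records)
        return self.deduplicate(self.mark(records, keys))
```

Passing a different callable as `alert` (for example a logger method) changes how staff are notified without touching the other modules, which mirrors the separation the device description draws between marking and alarming.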
The embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the above method when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program for executing the above method is stored.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (18)
1. A NetFlow data deduplication method is characterized by comprising the following steps:
acquiring NetFlow data information and IP address information of NetFlow data exporting equipment;
identifying key word information for reflecting NetFlow repeated data according to the NetFlow data information and the IP address information of the NetFlow data derivation equipment;
identifying NetFlow repeated data according to the keyword information;
and carrying out deduplication processing on the repeated NetFlow data according to the identification result.
2. The method of claim 1, wherein the NetFlow data information and NetFlow data-derived device IP address information for identifying key information, comprises:
the source address information, the destination address information, the source port information, the destination port information, the protocol type information, the next hop address information and the NetFlow data derivation device IP address information in the NetFlow data.
3. The method of claim 2, wherein identifying the keyword information that indicates duplicate NetFlow data according to the NetFlow data information and the IP address information of the NetFlow data exporting devices comprises:
comparing, within a preset time period, the acquired IP address information of a plurality of NetFlow data exporting devices and the multiple groups of NetFlow data information corresponding to the IP address information of each NetFlow data exporting device;
judging, according to the IP address information of the plurality of NetFlow data exporting devices, whether the groups that contain the same NetFlow data information were sent by the same NetFlow data exporting device;
and identifying the keyword information that indicates duplicate NetFlow data according to the judgment result.
4. The method of claim 3, wherein identifying the keyword information that indicates duplicate NetFlow data according to the NetFlow data information and the IP address information of the NetFlow data exporting devices further comprises:
if the next-hop address information in the NetFlow data information is the same as the IP address of the NetFlow data exporting device, determining that duplicate NetFlow data exists, and identifying the keyword information that indicates the duplicate NetFlow data.
5. The method of claim 4, wherein identifying the keyword information that indicates duplicate NetFlow data according to the NetFlow data information and the IP address information of the NetFlow data exporting devices further comprises:
if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data exporting devices, determining that duplicate NetFlow data exists;
and if the same NetFlow data information exists in multiple groups of NetFlow data information sent by the same NetFlow data exporting device, determining that the configuration of that NetFlow data exporting device is erroneous.
6. The method of claim 5, wherein identifying the duplicate NetFlow data according to the keyword information comprises:
if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data exporting devices, recording the IP address information of the different NetFlow data exporting devices, and marking, among the NetFlow data information having the same keyword information, every piece of NetFlow data other than the first.
7. The method of claim 1, further comprising:
after the duplicate NetFlow data is found, alerting the staff.
8. The method of claim 1, wherein deduplicating the duplicate NetFlow data according to the identification result comprises:
deleting the duplicate NetFlow data that carries the identification mark.
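The method claims above can be illustrated with a minimal Python sketch. It is not the patentee's implementation: the `FlowRecord` type, the `flow_key` and `deduplicate` names, the fixed 60-second window and the choice of the 5-tuple as the matching keyword are all assumptions made for illustration. The exporter IP is deliberately left out of the matching key here, since records from different exporters could otherwise never match; it is used only to decide whether a repeat comes from a different device.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical flattened NetFlow record as seen by the collector."""
    exporter_ip: str   # IP address of the exporting device
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int
    next_hop: str
    timestamp: float   # assumed: collector receive time, in seconds


def flow_key(rec):
    # Assumed matching keyword: source/destination address and port plus protocol.
    return (rec.src_addr, rec.dst_addr, rec.src_port, rec.dst_port, rec.protocol)


def deduplicate(records, window_seconds=60.0):
    """Return (kept, marked, exporters).

    Records are grouped by fixed time window and keyword. Within a group the
    earliest record is kept; later records from a *different* exporter are
    marked as duplicates and removed from the kept list. The exporter IPs
    involved in each duplicate are recorded.
    """
    groups = defaultdict(list)
    for rec in sorted(records, key=lambda r: r.timestamp):
        bucket = int(rec.timestamp // window_seconds)  # simple fixed window
        groups[(bucket, flow_key(rec))].append(rec)

    kept, marked, exporters = [], [], set()
    for group in groups.values():
        first = group[0]
        kept.append(first)
        for rec in group[1:]:
            if rec.exporter_ip != first.exporter_ip:
                marked.append(rec)  # duplicate seen via another exporter
                exporters.update((first.exporter_ip, rec.exporter_ip))
            else:
                kept.append(rec)    # same exporter: not treated as a duplicate here
    return kept, marked, exporters
```

Fixed windows are a simplification: two copies of a flow that straddle a window boundary would not be matched, so a sliding window may be preferable in practice.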
9. A NetFlow data deduplication apparatus, comprising:
an information acquisition module, configured to acquire NetFlow data information and IP address information of the NetFlow data exporting devices;
a data identification module, configured to identify, according to the NetFlow data information and the IP address information of the NetFlow data exporting devices, keyword information that indicates duplicate NetFlow data;
a data identification module, configured to identify the duplicate NetFlow data according to the keyword information;
and a data deduplication module, configured to deduplicate the duplicate NetFlow data according to the identification result.
10. The apparatus of claim 9, wherein the NetFlow data information and the IP address information of the NetFlow data exporting device used for identifying the keyword information comprise:
the source address information, destination address information, source port information, destination port information, protocol type information and next-hop address information in the NetFlow data, and the IP address information of the NetFlow data exporting device.
11. The apparatus of claim 9, wherein the data identification module is further configured to:
compare, within a preset time period, the acquired IP address information of a plurality of NetFlow data exporting devices and the multiple groups of NetFlow data information corresponding to the IP address information of each NetFlow data exporting device;
judge, according to the IP address information of the plurality of NetFlow data exporting devices, whether the groups that contain the same NetFlow data information were sent by the same NetFlow data exporting device;
and identify the keyword information that indicates duplicate NetFlow data according to the judgment result.
12. The apparatus of claim 11, wherein the data identification module is further configured to:
if the next-hop address information in the NetFlow data information is the same as the IP address of the NetFlow data exporting device, determine that duplicate NetFlow data exists, and identify the keyword information that indicates the duplicate NetFlow data.
13. The apparatus of claim 12, wherein the data identification module is further configured to:
if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data exporting devices, determine that duplicate NetFlow data exists;
and if the same NetFlow data information exists in multiple groups of NetFlow data information sent by the same NetFlow data exporting device, determine that the configuration of that NetFlow data exporting device is erroneous.
14. The apparatus of claim 13, wherein the data identification module is further configured to:
if the same keyword information exists in multiple groups of NetFlow data information sent by different NetFlow data exporting devices, record the IP address information of the different NetFlow data exporting devices, and mark, among the NetFlow data information having the same keyword information, every piece of NetFlow data other than the first.
15. The apparatus of claim 9, further comprising:
an alarm module, configured to alert the staff after the duplicate NetFlow data is identified.
16. The apparatus of claim 9, wherein the data deduplication module is further configured to:
delete the duplicate NetFlow data that carries the identification mark.
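The three checks assigned to the data identification module in claims 11 to 13 (mirroring method claims 3 to 5) can be sketched as a small classifier. This is an illustration under stated assumptions, not the claimed apparatus: the `Flow` and `Finding` types and the `classify` function are hypothetical names, and the next-hop check reflects one reading of claim 12, namely that a record whose next hop is another known exporter will be reported again by that exporter.

```python
from enum import Enum
from typing import NamedTuple


class Flow(NamedTuple):
    """Hypothetical record: exporter address, matching keyword, next hop."""
    exporter_ip: str
    key: tuple
    next_hop: str


class Finding(Enum):
    CROSS_EXPORTER_DUPLICATE = "same keyword reported by different exporters"
    EXPORTER_MISCONFIGURED = "same keyword reported more than once by one exporter"
    NEXT_HOP_IS_EXPORTER = "next hop is itself a known exporter"


def classify(flows, known_exporters):
    """Classify records that share one keyword within one time window.

    `known_exporters` is the set of exporter IPs the collector receives
    data from (an assumed input, not named in the claims).
    """
    findings = set()
    senders = [f.exporter_ip for f in flows]
    distinct = set(senders)

    if len(distinct) > 1:
        # Different exporters reported the same keyword: duplicate data.
        findings.add(Finding.CROSS_EXPORTER_DUPLICATE)
    if len(senders) != len(distinct):
        # One exporter reported the same keyword repeatedly: likely misconfigured.
        findings.add(Finding.EXPORTER_MISCONFIGURED)
    if any(f.next_hop in known_exporters and f.next_hop != f.exporter_ip
           for f in flows):
        # The flow is forwarded to another exporter, which will report it again.
        findings.add(Finding.NEXT_HOP_IS_EXPORTER)
    return findings
```

Returning a set of findings rather than a single verdict lets an alarm module, such as the one in claim 15, report a misconfigured exporter separately from ordinary cross-device duplicates.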
17. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 8 when executing the computer program.
18. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911280899.6A CN111064637B (en) | 2019-12-13 | 2019-12-13 | NetFlow data duplicate removal method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111064637A (en) | 2020-04-24 |
CN111064637B (en) | 2021-10-01 |
Family
ID=70300981
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201911280899.6A | Active | CN111064637B (en) | 2019-12-13 | 2019-12-13 | NetFlow data duplicate removal method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111064637B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112468486A (en) * | 2020-11-24 | 2021-03-09 | 北京天融信网络安全技术有限公司 | Netflow data duplicate removal method and device, electronic equipment and storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090161578A1 (en) * | 2007-12-21 | 2009-06-25 | Hong Kong Applied Science And Technology Research Institute Co. Ltd. | Data routing method and device thereof |
CN101483491A (en) * | 2008-01-11 | 2009-07-15 | 华为技术有限公司 | Shared guard ring, multicast source route protection method and node thereof |
CN102158401A (en) * | 2011-03-03 | 2011-08-17 | 江苏方天电力技术有限公司 | Flow monitoring model based on electric automation system |
US20130332596A1 (en) * | 2012-06-11 | 2013-12-12 | James O. Jones | Network traffic tracking |
CN205336305U (en) * | 2015-12-07 | 2016-06-22 | 贵州电网公司信息通信分公司 | Hardware architecture for an NS3 parallel simulation system |
CN106027406A (en) * | 2016-05-23 | 2016-10-12 | 电子科技大学 | NS3 simulation system flow importing method based on Netflow |
CN106209840A (en) * | 2016-07-12 | 2016-12-07 | 中国银联股份有限公司 | Network packet deduplication method and device |
CN110557302A (en) * | 2019-08-30 | 2019-12-10 | 西南交通大学 | Network equipment message observation data acquisition method |
Also Published As
Publication number | Publication date |
---|---|
CN111064637B (en) | 2021-10-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5961354B2 (en) | Method and apparatus for efficient netflow data analysis | |
CN107508722B (en) | Service monitoring method and device | |
JP5933463B2 (en) | Log occurrence abnormality detection device and method | |
CN105608517B (en) | Business transaction performance management and visualization method and device based on flow | |
CN111935063B (en) | Abnormal network access behavior monitoring system and method for terminal equipment | |
CN112636942B (en) | Method and device for monitoring service host node | |
US10884805B2 (en) | Dynamically configurable operation information collection | |
JP6190539B2 (en) | Log analysis apparatus, log analysis system, log analysis method, and computer program | |
CN106844170A (en) | A kind of troubleshooting, the influence face method and apparatus of analysis failure | |
CN113055335A (en) | Method, apparatus, network system and storage medium for detecting communication abnormality | |
CN106649344B (en) | Weblog compression method and device | |
CN112600719A (en) | Alarm clustering method, device and storage medium | |
CN111064637B (en) | NetFlow data duplicate removal method and device | |
CN109033188A (en) | A kind of metadata acquisition method, apparatus, server and computer-readable medium | |
CN108255659A (en) | A kind of application program capacity monitoring method and its system | |
CN109714214A (en) | A kind of processing method and management equipment of server exception | |
US8838774B2 (en) | Method, system, and computer program product for identifying common factors associated with network activity with reduced resource utilization | |
CN116975938A (en) | Sensor data processing method in product manufacturing process | |
CN112565232A (en) | Log analysis method and system based on template and flow state | |
CN112579833B (en) | Service association relation acquisition method and device based on user operation data | |
CN111917660B (en) | Optimization method and device for gateway equipment policy | |
CN112804190B (en) | Security event detection method and system based on boundary firewall flow | |
CN111130921B (en) | Method and device for processing performance index of core network element | |
CN113285824A (en) | Method and device for monitoring security of network configuration command | |
CN112866044B (en) | Network equipment state information acquisition method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP02 | Change in the address of a patent holder | Address after: Room 702-2, No. 4811, Cao'an Highway, Jiading District, Shanghai; Patentee after: CHINA UNITECHS. Address before: 100872 5th floor, Renmin culture building, 59 Zhongguancun Street, Haidian District, Beijing; Patentee before: CHINA UNITECHS |