CN103546307B - Network flow storage method - Google Patents

Info

Publication number
CN103546307B
Authority
CN
China
Prior art keywords
client
network flow
server
packet
hash value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210246855.3A
Other languages
Chinese (zh)
Other versions
CN103546307A (en)
Inventor
薛波
薛一波
王大伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201210246855.3A
Publication of CN103546307A
Application granted
Publication of CN103546307B
Active
Anticipated expiration

Abstract

The invention provides a network flow storage method comprising the steps of: S1. initializing a Client table and a Server table; S2. looking up, in the flow table, the network flow corresponding to a newly captured packet; S3. updating the Client table and the Server table. By quickly aggregating the network flows that belong to the same node, the method helps a traffic classification system mine the relationships between network flows in depth and meet the challenges posed by new application-layer protocols. It provides technical support for the design and implementation of high-performance traffic classification systems and content monitoring systems in high-speed networks.

Description

Network flow storage method
Technical field
The invention belongs to the field of traffic classification in network technology, and in particular relates to a new network flow storage method.
Background
With the rapid development of network technology and network bandwidth, data traffic in networks has multiplied. On high-speed backbone networks, traffic has reached gigabits per second and even exceeds 10 Gbit/s. This ever-growing traffic poses a new challenge for traffic classification: the efficiency of traditional packet-based traffic classification systems can hardly meet the needs of high-speed backbone monitoring. In a high-speed broadband environment, network data arrives at high speed, without bound and without interruption; it has the character of massive data and cannot be stored locally. Consequently, traditional traffic classification systems that rely on packet capture, packet reassembly, and pattern matching are not efficient enough. In addition, as network environments grow more complex, more and more application-layer protocols use encryption protocols to encrypt packet payloads. Finding keywords in packet payloads therefore becomes increasingly difficult, which ultimately causes packet-based traffic classification to fail badly.
Unlike packet-based traffic classification, flow-based traffic classification focuses on network flows. A network flow is traditionally defined as the set of packets that share the same five-tuple (<source address, destination address, source port, destination port, protocol>). As a form of data exchange, a network flow reflects, at the micro level, host behavior and the details of communication between hosts.
Flow-based traffic classification rests on the assumption that each protocol has its own distinctive flow-level statistical characteristics, which can be used to classify the traffic that different protocols generate. Because the technique relies on a large amount of statistical information as its basic reference, machine learning methods were naturally brought into the identification process in the hope of better classification performance. Machine learning was introduced into traffic classification in 2004 to classify traffic according to its statistical properties. For example, the distributions of flow duration, flow idle time, inter-packet gap, and packet length are discriminative for traffic classification; they can serve as features that a machine learning model uses to classify traffic.
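As an illustration of the statistical features named above, the following minimal Python sketch computes duration, inter-packet gaps, and packet length statistics for one flow. The function name `flow_features`, the input layout (a list of timestamp and length pairs), and the use of the largest gap as a stand-in for idle time are assumptions made for this example; the source does not specify how the features are computed.

```python
from statistics import mean


def flow_features(packets):
    """Compute simple per-flow statistics.

    packets: list of (timestamp_seconds, length_bytes) tuples in arrival
    order. This layout is a hypothetical choice for the sketch.
    """
    times = [t for t, _ in packets]
    sizes = [n for _, n in packets]
    # Inter-packet gaps between consecutive packets of the same flow.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packets": len(packets),
        "duration": times[-1] - times[0],
        "mean_gap": mean(gaps) if gaps else 0.0,
        # Largest gap, used here as a rough proxy for flow idle time.
        "max_gap": max(gaps) if gaps else 0.0,
        "mean_len": mean(sizes),
    }


features = flow_features([(0.0, 60), (0.5, 1500), (2.0, 1500)])
```

A feature dictionary of this kind could then be turned into a vector and passed to whatever machine learning model the classification system uses.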
To extract flow statistics, a data structure is needed to extract and store network flows, and flow information must be extracted from background traffic and stored according to the flow definition. At present, almost all flow-based traffic classification systems use a flow table for this purpose. A flow table uses a hash table with chained lists to determine which network flow each packet in the background traffic belongs to, and to store that flow. After a packet is captured, the classification system computes a hash value from the packet's five-tuple and uses it to check whether the hash table already holds information about the packet's flow. If not, the packet is treated as the first packet of its flow, and a flow record is created for it. When flows are stored in a hash table, collisions are unavoidable. When a collision occurs, the system builds a linked list for the colliding flows and attaches it to the corresponding hash table entry. With such a flow table, a flow-based traffic classification system can quickly map each packet to its flow and efficiently extract the statistical features of a single flow.
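The conventional flow table described above can be sketched as a hash table whose entries are chains of flow records. This is a minimal illustration, not the structure used by any particular system: the class names `FlowRecord` and `FlowTable`, the per-flow counters, and the use of Python's built-in `hash` on the five-tuple are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class FlowRecord:
    key: tuple          # five-tuple identifying the flow
    packets: int = 0
    byte_count: int = 0


class FlowTable:
    def __init__(self, size=4096):
        self.size = size
        # Each bucket is a chain (list) of records that share a hash index.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % self.size

    def lookup(self, key):
        # Walk the chain and compare full five-tuples to resolve collisions.
        for record in self.buckets[self._index(key)]:
            if record.key == key:
                return record
        return None

    def update(self, key, length):
        """Account for one packet; returns (record, is_first_packet)."""
        record = self.lookup(key)
        is_first = record is None
        if is_first:
            record = FlowRecord(key)
            self.buckets[self._index(key)].append(record)
        record.packets += 1
        record.byte_count += length
        return record, is_first
```

With `size=1` every flow collides, which makes the chaining behavior easy to observe: two different five-tuples end up as two records in the same bucket.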
As network technology develops, new application-layer protocols keep emerging. To increase network utilization, and to evade traffic classification systems, many emerging application-layer protocols open multiple network flows at once to complete a single communication task, with each flow responsible for only part of the task. P2P protocols are a typical example: to share files better and faster, many P2P protocols split a file into multiple blocks and use multiple network flows to share the file simultaneously. Another typical example is interactive protocols, which need to interact with servers while running. To improve efficiency, most interactive protocols store different interactive content on different servers, and the client uses multiple network flows to exchange information in parallel. Such protocols pose new challenges for flow-based traffic classification systems. First, because the protocol uses multiple flows for the same communication task, less knowledge can be extracted from and exploited in any single flow, which degrades the recognition performance of the classification system. Second, current flow-based classification systems focus on single flows and can hardly classify all of the flows that such a protocol produces.
To solve these problems and meet the challenges of new application-layer protocols, more and more flow-based traffic classification techniques are starting to use multi-flow features. Such features take a multi-flow view and look for relationship features among multiple network flows, in order to classify P2P and interactive-protocol traffic accurately and completely. However, the current flow table structure makes multi-flow relationship features hard to extract: the flow table stores flows in a flat structure, with flows spread evenly across the hash table. Flows with the same hash value may be entirely unrelated, while flows belonging to the same protocol may have different hash values, so the relationship between flows is hard to determine.
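The limitation of the flat structure can be made concrete with a small sketch. The bucket function below (CRC32 over the five-tuple's text form) is only a stand-in for whatever hash a real flow table uses, and the addresses and ports are made up. The point is that the bucket index of a flow says nothing about which node it belongs to, so collecting the flows of one client requires scanning every flow key.

```python
import zlib


def bucket(five_tuple, size=4096):
    # Stand-in hash: CRC32 of the tuple's text form (an assumption).
    return zlib.crc32(repr(five_tuple).encode()) % size


# Four flows opened by the same client toward the same server port.
flows = [("10.0.0.1", "192.0.2.10", 50000 + i, 6881, "TCP") for i in range(4)]
# The indices are computed independently per five-tuple; nothing ties
# the flows of one client to the same or to neighboring buckets.
buckets = [bucket(f) for f in flows]


def flows_of_client(flow_keys, client_ip):
    # With a flat flow table, only a full scan finds one node's flows.
    return [key for key in flow_keys if key[0] == client_ip]
```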
In short, taking a multi-flow view, finding the relationships among multiple network flows, and extracting the relationship features between them can help flow-based traffic classification systems classify the traffic of new protocols such as P2P and interactive protocols accurately and completely. The current flow table structure, however, focuses on single flows and stores them in a flat structure, which makes the relationship features among multiple flows hard to extract.
Summary of the invention
(1) Technical problem to be solved
The technical problem to be solved by the invention is how to provide a new method for extracting and storing network flows that can extract the relationship features among multiple flows quickly and effectively, so as to help traffic classification systems better meet the challenges posed by new application-layer protocols.
(2) Technical solution
To solve the above problem, the invention provides a network flow storage method comprising the steps of: S1. initializing a Client table and a Server table; S2. looking up, in the flow table, the network flow corresponding to a newly captured packet; S3. updating the Client table and the Server table.
Preferably, step S1 includes: S1.1 initializing for Client a hash table of size nc; S1.2 initializing for Server a hash table of size ns.
Preferably, the hash table initialized for Client is a sequential list of size nc, and each entry stores the multiple network flows initiated by one client.
Preferably, the hash table initialized for Server is a sequential list of size ns, and each entry stores the multiple network flows received by one server.
Preferably, step S2 includes: S2.1 capturing a newly arrived packet; S2.2 computing the hash value h1 of the packet's forward five-tuple (<source address, destination address, source port, destination port, protocol>); S2.3 using h1 to look up whether a corresponding network flow exists in the flow table; if it exists, marking the direction of the current packet as client to server and performing step S3; if not, performing step S2.4; S2.4 computing the hash value h2 of the packet's reverse five-tuple (<destination address, source address, destination port, source port, protocol>); S2.5 using h2 to look up whether a corresponding network flow exists in the flow table; if it exists, marking the direction of the current packet as server to client and performing step S3; if not, performing step S2.6; S2.6 using h1 to create a network flow record for the packet, marking the current packet as the first packet of the flow with direction client to server, and performing step S3.
Preferably, step S3 includes: S3.1 computing the hash value h3 of the client IP address and the hash value h4 of the server IP address; S3.2 judging whether the packet is the first packet of its network flow; if so, performing step S3.3; if not, performing step S3.4; S3.3 adding the new network flow information to the Client table and the Server table; S3.4 using the packet to update the corresponding entries in the Client table and the Server table; S3.5 returning to step S2.
Preferably, step S3.3 includes: S3.31 using h3 to look up whether information about the client exists in the Client table and, if not, creating a corresponding record for the client in the Client table; S3.32 using h4 to look up whether information about the server exists in the Server table and, if not, creating a corresponding record for the server in the Server table.
Preferably, when a client record is created, if the entry corresponding to h3 is already occupied by another client, a linked list is used to attach the client information after the entry corresponding to h3.
Preferably, when a server record is created, if the entry corresponding to h4 is already occupied by another server, a linked list is used to attach the server information after the entry corresponding to h4.
Preferably, step S3.4 includes: S3.41 judging whether the packet is the first packet of its network flow; if so, performing step S3.42; if not, performing step S3.43; S3.42 adding network flow information to the corresponding entries of the Client table and the Server table; S3.43 using the packet to update the network flow information in the corresponding entries of the Client table and the Server table.
(3) Beneficial effects
On top of the existing flow table, the method of the invention adds two hash tables that store the network flows sent and received by client and server nodes. After a packet is captured, its network flow is first looked up in the flow table; the information provided by the flow and the packet is then used to update the two hash tables. By quickly aggregating the network flows that belong to the same node, the method helps traffic classification systems mine the relationships between network flows in depth and meet the challenges posed by new application-layer protocols. It provides technical support for the design and implementation of high-performance traffic classification systems and content monitoring systems in high-speed networks.
Brief description of the drawings
The invention is further described below with reference to the accompanying drawings and examples, in which:
Fig. 1 is a flow chart of the main steps of the network flow storage method according to an embodiment of the invention.
Fig. 2 is a flow chart of the detailed implementation steps of the network flow storage method according to an embodiment of the invention.
Fig. 3 is a schematic diagram of updating the Client table and the Server table according to an embodiment of the invention.
Detailed description of the invention
Specific embodiments of the invention are described in further detail below with reference to the drawings and examples. The following examples illustrate the invention but are not intended to limit its scope.
The current flow table structure focuses on single network flows and stores them in a flat structure. This makes it hard to extract relationship features among multiple flows, so flow-based traffic classification systems struggle with the challenges posed by new application-layer protocols. To address this problem, the invention proposes a new network flow storage method. On top of the existing flow table, the method adds two hash tables that store the network flows sent and received by client and server nodes. After a packet is captured, its network flow is first looked up in the flow table; the information provided by the flow and the packet is then used to update the two hash tables. By quickly aggregating the network flows that belong to the same node, the method helps traffic classification systems mine the relationships between network flows in depth and meet the challenges posed by new application-layer protocols.
As shown in Figs. 1 to 3, the new network flow storage method provided by the invention comprises the following steps:
S1 Initialize the Client table and the Server table;
Wherein, a step of setting up the flow table used to store network flow information is also included before step S1.
Wherein, step S1 further includes:
S1.1 Initialize for Client a hash table of size nc (for example, 4096);
Wherein, in step S1.1,
the hash table initialized for Client is a sequential list of size nc, and each entry stores the multiple network flows initiated by one client;
S1.2 Initialize for Server a hash table of size ns (for example, 4096);
Wherein, in step S1.2,
the hash table initialized for Server is a sequential list of size ns, and each entry stores the multiple network flows received by one server;
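Step S1 can be sketched in Python as follows. The names `NC`, `NS`, `init_node_table`, `client_table`, `server_table`, and `flow_table` are hypothetical, and representing each entry as an initially empty list (so that colliding nodes can be chained later) is an assumption of this sketch rather than a detail given in the description.

```python
NC = 4096  # size nc of the Client table (example value from the description)
NS = 4096  # size ns of the Server table (example value from the description)


def init_node_table(size):
    # A sequential list of `size` entries. Each entry starts as an empty
    # chain that will later hold node records and their network flows.
    return [[] for _ in range(size)]


# The flow table is set up before step S1; a dict stands in for it here.
flow_table = {}

client_table = init_node_table(NC)  # S1.1
server_table = init_node_table(NS)  # S1.2
```

Building the entries with a list comprehension gives every entry its own chain; `[[]] * size` would instead make all entries share one list.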
S2 Look up, in the flow table, the network flow corresponding to a newly captured packet;
Wherein, step S2 further includes:
S2.1 Capture a newly arrived packet;
Wherein, in step S2.1,
the captured packets include TCP packets and UDP packets.
S2.2 Compute the hash value h1 of the packet's forward five-tuple (<source address, destination address, source port, destination port, protocol>);
S2.3 Use h1 to look up whether a corresponding network flow exists in the flow table; if it exists, mark the direction of the current packet as client to server and perform step S3; if not, perform step S2.4;
S2.4 Compute the hash value h2 of the packet's reverse five-tuple (<destination address, source address, destination port, source port, protocol>);
S2.5 Use h2 to look up whether a corresponding network flow exists in the flow table; if it exists, mark the direction of the current packet as server to client and perform step S3; if not, perform step S2.6;
S2.6 Use h1 to create a network flow record for the packet, mark the current packet as the first packet of the flow with direction client to server, and perform step S3;
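Steps S2.2 to S2.6 amount to a bidirectional lookup: try the forward five-tuple, then the reverse one, and create a new flow only if both miss. The sketch below is one possible reading under stated assumptions: a Python dict keyed by the five-tuple stands in for the flow table (so the dict's internal hashing plays the role of h1 and h2), and the function names, direction labels, and record fields are hypothetical.

```python
def reverse(key):
    """Swap endpoints to build the reverse five-tuple (S2.4)."""
    src, dst, sport, dport, proto = key
    return (dst, src, dport, sport, proto)


def lookup_flow(flow_table, key):
    """Return (flow, direction, is_first_packet) for a captured packet."""
    if key in flow_table:                 # S2.2-S2.3: forward lookup
        return flow_table[key], "client->server", False
    rkey = reverse(key)
    if rkey in flow_table:                # S2.4-S2.5: reverse lookup
        return flow_table[rkey], "server->client", False
    # S2.6: no match either way, so this packet opens a new flow and
    # its sender is treated as the client.
    flow = {"key": key, "client": key[0], "server": key[1]}
    flow_table[key] = flow
    return flow, "client->server", True
```

A consequence of this scheme is that both directions of a connection share one flow record, stored under the five-tuple of the first packet seen.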
S3 Update the Client table and the Server table;
As shown in Fig. 2, a captured packet is first used to update the flow table; the updated network flow information in the flow table is then used to update the Client table and the Server table;
Wherein, step S3 further includes:
S3.1 Compute the hash value h3 of the client IP address and the hash value h4 of the server IP address;
S3.2 Judge whether the packet is the first packet of its network flow; if so, perform step S3.3; if not, perform step S3.4;
S3.3 Add the new network flow information to the Client table and the Server table;
Wherein, step S3.3 further includes:
S3.31 Use h3 to look up whether information about the client exists in the Client table; if not, create a corresponding record for the client in the Client table;
Wherein, in step S3.31,
when a client record is created, if the entry corresponding to h3 is already occupied by another client, a linked list is used to attach the client information after the entry corresponding to h3;
S3.32 Use h4 to look up whether information about the server exists in the Server table; if not, create a corresponding record for the server in the Server table;
Wherein, in step S3.32,
when a server record is created, if the entry corresponding to h4 is already occupied by another server, a linked list is used to attach the server information after the entry corresponding to h4;
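Steps S3.31 and S3.32 are the same find-or-create operation applied to two tables. A minimal sketch follows, assuming the hash of an IP address is its integer value modulo the table size (the description does not name a hash function) and that each table entry is a Python list acting as the chain. The names `ip_hash` and `find_or_create` are hypothetical.

```python
import ipaddress


def ip_hash(ip, size):
    # Assumed hash for h3/h4: integer value of the address modulo size.
    return int(ipaddress.ip_address(ip)) % size


def find_or_create(table, ip):
    """Return the node record for `ip`, creating it if it is missing."""
    chain = table[ip_hash(ip, len(table))]
    for node in chain:
        if node["ip"] == ip:
            return node
    # The entry may already hold other nodes; appending to the chain
    # attaches the new record after them instead of overwriting.
    node = {"ip": ip, "flows": {}}
    chain.append(node)
    return node
```

With a table of size 4, the addresses 10.0.0.1 and 10.0.0.5 map to the same entry under this hash, so the second record is chained after the first.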
S3.4 Use the packet to update the corresponding entries in the Client table and the Server table;
Wherein, step S3.4 further includes:
S3.41 Judge whether the packet is the first packet of its network flow; if so, perform step S3.42; if not, perform step S3.43;
S3.42 Add network flow information to the corresponding entries of the Client table and the Server table;
S3.43 Use the packet to update the network flow information in the corresponding entries of the Client table and the Server table;
S3.5 Return to step S2.
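Putting step S3 together, the sketch below updates both node tables for one packet. It is a simplified reading, not the patented implementation: the hash function, the record layout, the per-flow counters, and the choice to count the first packet in the same call that adds the flow are all assumptions, and every name is hypothetical.

```python
import ipaddress


def ip_hash(ip, size):
    # Assumed hash for h3/h4: integer value of the address modulo size.
    return int(ipaddress.ip_address(ip)) % size


def node_record(table, ip):
    chain = table[ip_hash(ip, len(table))]
    for node in chain:
        if node["ip"] == ip:
            return node
    node = {"ip": ip, "flows": {}}
    chain.append(node)  # chained after any other node in this entry
    return node


def update_node_tables(client_table, server_table, flow, length, is_first):
    """Apply one packet of `flow` to the Client and Server tables."""
    # S3.1 and S3.31/S3.32: locate (or create) both node records.
    client = node_record(client_table, flow["client"])
    server = node_record(server_table, flow["server"])
    for node in (client, server):
        if is_first:
            # First packet: add the new flow under this node.
            node["flows"][flow["key"]] = {"packets": 0, "bytes": 0}
        # Update the flow's counters. A non-first packet presupposes
        # that the flow was added when its first packet arrived.
        stats = node["flows"][flow["key"]]
        stats["packets"] += 1
        stats["bytes"] += length
    return client, server
```

After a few packets, all flows opened by one client sit under a single record, so a multi-flow feature such as the number of concurrent flows per client is just `len(client["flows"])`, with no scan of the flow table.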
The description of the invention is given for the purposes of illustration and description, and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and design various embodiments, with various modifications, suited to particular uses.

Claims (8)

1. A network flow storage method, characterized in that it includes the steps of:
S1. Initializing a Client table and a Server table; said step S1 includes:
S1.1 initializing for Client a hash table of size nc; the hash table initialized for Client is a sequential list of size nc, and each entry stores the multiple network flows initiated by one client;
S1.2 initializing for Server a hash table of size ns;
S2. Looking up, in the flow table, the network flow corresponding to a newly captured packet;
S3. Updating the Client table and the Server table;
Wherein, a step of setting up said flow table for storing network flow information is also included before step S1.
2. The method of claim 1, characterized in that:
the hash table initialized for Server is a sequential list of size ns, and each entry stores the multiple network flows received by one server.
3. The method of claim 1, characterized in that step S2 includes:
S2.1 capturing a newly arrived packet;
S2.2 computing the hash value h1 of the packet's forward five-tuple, said forward five-tuple being <source address, destination address, source port, destination port, protocol>;
S2.3 using h1 to look up whether a corresponding network flow exists in the flow table; if it exists, marking the direction of the current packet as client to server and performing step S3; if not, performing step S2.4;
S2.4 computing the hash value h2 of the packet's reverse five-tuple, said reverse five-tuple being <destination address, source address, destination port, source port, protocol>;
S2.5 using h2 to look up whether a corresponding network flow exists in the flow table; if it exists, marking the direction of the current packet as server to client and performing step S3; if not, performing step S2.6;
S2.6 using h1 to create a network flow record for the packet, marking the current packet as the first packet of the network flow with direction client to server, and performing step S3.
4. The method of claim 1, characterized in that step S3 includes:
S3.1 computing the hash value h3 of the client IP address and the hash value h4 of the server IP address;
S3.2 judging whether the packet is the first packet of its network flow; if so, performing step S3.3; if not, performing step S3.4;
S3.3 adding the new network flow information to the Client table and the Server table;
S3.4 using the packet to update the corresponding entries in the Client table and the Server table;
S3.5 returning to step S2.
5. The method of claim 4, characterized in that step S3.3 includes:
S3.31 using h3 to look up whether information about the client exists in the Client table and, if not, creating a corresponding record for the client in the Client table;
S3.32 using h4 to look up whether information about the server exists in the Server table and, if not, creating a corresponding record for the server in the Server table.
6. The method of claim 4 or 5, characterized in that:
when a client record is created, if the entry corresponding to h3 is already occupied by another client, a linked list is used to attach the client information after the entry corresponding to h3.
7. The method of claim 4 or 5, characterized in that:
when a server record is created, if the entry corresponding to h4 is already occupied by another server, a linked list is used to attach the server information after the entry corresponding to h4.
8. The method of claim 4, characterized in that step S3.4 includes:
S3.41 judging whether the packet is the first packet of its network flow; if so, performing step S3.42; if not, performing step S3.43;
S3.42 adding network flow information to the corresponding entries of the Client table and the Server table;
S3.43 using the packet to update the network flow information in the corresponding entries of the Client table and the Server table.
CN201210246855.3A 2012-07-16 2012-07-16 Network flow storage method Active CN103546307B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210246855.3A CN103546307B (en) 2012-07-16 2012-07-16 Network flow storage method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210246855.3A CN103546307B (en) 2012-07-16 2012-07-16 Network flow storage method

Publications (2)

Publication Number Publication Date
CN103546307A CN103546307A (en) 2014-01-29
CN103546307B true CN103546307B (en) 2016-12-21

Family

ID=49969383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210246855.3A Active CN103546307B (en) 2012-07-16 2012-07-16 Network flow storage method

Country Status (1)

Country Link
CN (1) CN103546307B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107248939B (en) * 2017-05-26 2020-07-31 中国人民解放军理工大学 Network flow high-speed correlation method based on hash memory

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101729413A (en) * 2009-11-06 2010-06-09 清华大学 Multi-service processing system and method based on ATCA
CN102468987A (en) * 2010-11-08 2012-05-23 清华大学 NetFlow characteristic vector extraction method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101729413A (en) * 2009-11-06 2010-06-09 清华大学 Multi-service processing system and method based on ATCA
CN102468987A (en) * 2010-11-08 2012-05-23 清华大学 NetFlow characteristic vector extraction method

Also Published As

Publication number Publication date
CN103546307A (en) 2014-01-29

Similar Documents

Publication Publication Date Title
US7644150B1 (en) System and method for network traffic management
Chen et al. A first look at inter-data center traffic characteristics via yahoo! datasets
US9509614B2 (en) Hierarchical load balancing in a network environment
CN102724317B (en) A kind of network traffic data sorting technique and device
US9172756B2 (en) Optimizing application performance in a network environment
US20080209053A1 (en) HTTP-Based Peer-to-Peer Framework
CN102148854B (en) Method and device for identifying peer-to-peer (P2P) shared flows
CN105357142A (en) Method for designing network load balancer system based on ForCES
CN103281211A (en) Large-scale network node grouping management system and management method
CN101022371A (en) Automatic discovering and managing method for extendable interconnection network measurement server
CN103546307B (en) Network flow storage method
Gu et al. Online wireless mesh network traffic classification using machine learning
Liu et al. Efficient FIB caching using minimal non-overlapping prefixes
Bashir et al. Classifying P2P activity in Netflow records: A case study on BitTorrent
Nagaraj et al. Hierarchy-aware distributed overlays in data centers using DC2
US20230188496A1 (en) Microservice visibility and control
CN101668035A (en) Method for recognizing various P2P-TV application video flows in real time
Stevens et al. Analysis of an anycast based overlay system for scalable service discovery and execution
KR20190064066A (en) Traffic load management apparatus and method based on coordinated application protocol for internet of things local networks
Ekambaram et al. Interest flooding reduction in content centric networks
Turner et al. Can overlay hosting services make ip ossification irrelevant
Yamanaka et al. Openflow networks with limited l2 functionality
Flores-De La Cruz et al. OpenFlow compatible key-based routing protocol: adapting SDN networks to content/service-centric paradigm
Guohao et al. A data center load balancing algorithm based on artificial bee colony algorithm
Król et al. DISC-NG: Robust Service Discovery in the Ethereum Global Network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant