CN106059957B - Fast flow table lookup method and system in a high-concurrency network environment - Google Patents

Fast flow table lookup method and system in a high-concurrency network environment

Info

Publication number
CN106059957B
CN106059957B (application CN201610330417.3A)
Authority
CN
China
Prior art keywords
packet
grouping
module
index
pkt
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610330417.3A
Other languages
Chinese (zh)
Other versions
CN106059957A (en)
Inventor
刘庆云
王鹏
周舟
李佳
杨威
方滨兴
郭莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN201610330417.3A priority Critical patent/CN106059957B/en
Publication of CN106059957A publication Critical patent/CN106059957A/en
Application granted granted Critical
Publication of CN106059957B publication Critical patent/CN106059957B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 47/00: Traffic control in data switching networks
    • H04L 47/50: Queue scheduling
    • H04L 47/62: Queue scheduling characterised by scheduling criteria
    • H04L 47/625: Queue scheduling characterised by scheduling criteria for service slots or service orders
    • H04L 47/6275: Queue scheduling characterised by scheduling criteria for service slots or service orders based on priority

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a fast flow table lookup method and system for high-concurrency network environments. The method comprises: 1) counting the traffic entering the network interface and setting the buffer window of the buffer according to the current traffic condition obtained from the statistics; 2) according to the size of the set buffer window, performing a grouping operation on arriving data packets using their five-tuple information; 3) scheduling each cached group according to a preset scheduling strategy and sending each group in turn to the connection management module; 4) the connection management module extracting the five-tuple information of each group, performing a flow table lookup to find the corresponding flow table entry, and updating the information of the flow table entry with the data packets in the group. The invention is mainly applicable to high-speed network traffic processing systems on backbone links, where it reduces the access overhead of the connection management module and improves flow table access efficiency.

Description

A fast flow table lookup method and system in a high-concurrency network environment

Technical Field

The invention belongs to the technical field of network security, and in particular relates to a fast flow table lookup method and system for high-concurrency network environments.

Background

In high-speed network environments, efficient connection management has become a key module of existing network traffic processing systems (such as intrusion detection and traffic accounting systems). A traffic processing system is typically divided into three main modules: traffic acquisition, connection management, and business processing. Connection management provides flow traceability for business processing, comprising three operations: lookup, update, and delete. To record every connection accurately, the connection management module must maintain a connection table (or session table) in which each entry traces one connection in the network and records that connection's identifier, state, and other related information. The connection identifier is globally unique and is generally constructed from the five-tuple in the TCP/IP header.

Existing traffic processing systems adopt a per-packet scheduling strategy: data packets are first buffered in the network card buffer and then sent to the connection management module in arrival order, where connection state update and maintenance operations are performed. In a high-speed network environment, per-packet processing not only incurs substantial function callback overhead but also creates a performance bottleneck in flow table access. As the number of concurrent connections grows, so does the size of the connection table. Constrained by hardware resources and by the hash table structure itself, the number of hash table slots must be preset and is very difficult to adjust dynamically, and growing collision chains degrade per-packet flow table lookup efficiency. In existing 10 Gbps high-speed networks, the packet rate reaches 10 Mpps or higher, and the vast majority of packets require a flow table lookup; the lookup frequency thus matches the packet arrival rate, and lookup efficiency has become one of the key performance factors of a stream processing system. It is therefore necessary to design a scalable, efficient flow table lookup method for the high-speed, highly concurrent environment of backbone networks.

Current flow table lookup implementations fall into three categories: hash tables, Bloom filters, and content-addressable memory. For a flow table built on a hash table, one lookup consists of two steps: computing the hash value and comparing along the collision chain. The worst-case performance of chaining is very poor: all N keys may be inserted into the same slot, producing a linked list of length N, in which case the worst-case lookup length is O(N).
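As a concrete illustration (not part of the patent), a separate-chaining flow table and the lookup cost described above can be sketched as follows; all class and field names are illustrative:

```python
class ChainedFlowTable:
    """Separate-chaining hash table: a lookup hashes the five-tuple, then
    walks the collision chain, so it degenerates to O(N) when all keys
    land in one slot."""

    def __init__(self, slots=1024):
        self.slots = slots
        self.table = [[] for _ in range(slots)]  # each slot holds a collision chain

    def lookup(self, fid):
        """Return (entry, comparisons) for flow id fid, or (None, comparisons)."""
        chain = self.table[hash(fid) % self.slots]
        for i, (key, entry) in enumerate(chain, start=1):
            if key == fid:
                return entry, i
        return None, len(chain)

    def insert(self, fid, entry):
        self.table[hash(fid) % self.slots].append((fid, entry))
```

With `slots=1` every key collides, reproducing the worst case where the lookup length equals the number of stored flows.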

Consequently, much work has focused on balancing the collision chain lengths across slots, so that the average lookup length stays close to the best case O(1+α), where α is the load factor. Achieving this requires a good hash function; although complex cryptographic hashes (MD5, SHA-1) can balance the collision chain length distribution across the table, good hash functions are usually CPU-intensive.

Compared with the single hashing mentioned above, multiple hashing performs better. Multiple hashing computes several hash values and inserts the item into whichever of several sub-tables currently has the shortest chain, but this forces every lookup to probe multiple collision chains, which incurs a large lookup cost in packet-dense backbone networks.
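A minimal sketch (not from the patent) of the multiple-hashing scheme described above, assuming d chained sub-tables; the function names and hash salts are illustrative. Insertion picks the shortest candidate chain, while a lookup must probe one chain per sub-table:

```python
def make_tables(d, slots):
    """d sub-tables, each a list of `slots` collision chains."""
    return [[[] for _ in range(slots)] for _ in range(d)]

def mh_insert(tables, hashes, key, value):
    """Insert into whichever sub-table offers the shortest chain for key."""
    chains = [t[h(key) % len(t)] for t, h in zip(tables, hashes)]
    min(chains, key=len).append((key, value))

def mh_lookup(tables, hashes, key):
    """Probe every sub-table's chain: this is the extra lookup cost."""
    for t, h in zip(tables, hashes):
        for k, v in t[h(key) % len(t)]:
            if k == key:
                return v
    return None
```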

As for exploiting network locality to optimize lookups, implementing a high-speed flow table cache with FPGA and SRAM can accelerate access, but the circuit complexity of FPGAs and the capacity limits of SRAM constrain the size of the flow table. In a high-concurrency network environment, under traffic fluctuation and bursts, many active connections are forcibly evicted, causing the system to miss detections.

Summary of the Invention

To reduce the access overhead of the connection management module in high-speed network environments, and building on the backbone-link traffic characteristics of high concurrency, slow update, and a certain degree of locality, the present invention provides a fast flow table lookup method and system, mainly applicable to high-speed network traffic processing systems on backbone links.

The main contents of the invention include: (1) an efficient network traffic grouping algorithm that groups the data packets in the network card buffer by flow label; (2) a threshold scheduling strategy that schedules the grouped data packets; (3) flow table lookup.

The core of the fast flow table lookup method is to send the data packets in the network to the connection management module in groups, reducing the number of comparisons and the callback overhead of flow table lookups. The more packets each connection accumulates per group, the greater the benefit of the fast lookup method. An efficient grouping algorithm is therefore the foundation of the method. The design of the grouping algorithm covers the following aspects:

1) Grouping is based on the five-tuple in the TCP/IP header. A connection in the connection management module is uniquely determined by the five-tuple of source IP, destination IP, source port, destination port, and transport-layer protocol type.

2) Efficiency and flexibility of the grouping algorithm. Grouping introduces some time overhead; a good data structure can greatly reduce this overhead, so that the fast lookup method yields more benefit.

3) After grouping, the packets within each group share the same five-tuple and belong to the same connection; they must be indexed efficiently so that the scheduling strategy can schedule and maintain the groups efficiently.

4) The grouping algorithm operates on the packets in the network card buffer, and the size of the buffer window requires a trade-off. Too large a window not only consumes memory but also increases the delay between packet capture and processing; too small a window leaves too few packets cached per connection, limiting the benefit.
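To make item 1) above concrete, a hypothetical sketch of the five-tuple grouping key as a hashable value; the field names and the parsed-packet dict layout are assumptions for illustration, not from the patent:

```python
from collections import namedtuple

# The grouping key (flow label): the TCP/IP five-tuple.
FiveTuple = namedtuple("FiveTuple",
                       ["src_ip", "dst_ip", "src_port", "dst_port", "proto"])

def flow_id(pkt):
    """Extract the hashable grouping key from a parsed-packet dict."""
    return FiveTuple(pkt["src_ip"], pkt["dst_ip"],
                     pkt["src_port"], pkt["dst_port"], pkt["proto"])
```

Because the key is hashable and compares by value, two packets of the same connection map to the same group regardless of payload.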

For packets that have been grouped, a scheduling strategy is needed to send the groups to the connection management module. A good strategy not only gives the packets in every group a fair scheduling opportunity but also increases the benefit of the fast lookup method. The design of the scheduling strategy covers the following:

1) Among cached groups, groups with many packets should be scheduled first, while groups with few packets should be deferred to wait for more packets to accumulate, which further reduces flow table access overhead.

2) Deferred groups must not starve, i.e. go without a scheduling opportunity for a long time, which would cause the system to miss detections.

The fast flow table lookup scheme provided by the invention sends the data packets in the network to the connection management module in groups, reducing the number of comparisons and the callback overhead of flow table lookups, and uses a reasonable scheduling strategy to give the packets in every group a fair scheduling opportunity. The method is mainly applicable to high-speed network traffic processing systems on backbone links, where it reduces the access overhead of the connection management module and improves flow table access efficiency.

Brief Description of the Drawings

Figure 1 is a schematic diagram of the system structure of the invention.

Figure 2 is a schematic diagram of the data stream grouping structure.

Figure 3 is a schematic diagram of queue migration between Q1 and Q2.

Figure 4 compares flow table access times in scenario A.

Figure 5 compares flow table lookup lengths in scenario A.

Figure 6 compares flow table access times in scenario B.

Figure 7 compares flow table lookup lengths in scenario B.

Detailed Description

The invention is further described below through specific embodiments and the accompanying drawings.

The overall framework of the invention is shown in Figure 1. It consists of six parts: the network interface, the buffer window management module, the data stream grouping module, the starvation avoidance module, the packet scheduler, and the connection management module. The operating steps are as follows:

1) As traffic enters the network interface, traffic statistics are collected and sent to the buffer window management module; the buffer window management module selects one of the preset window sizes according to the current traffic condition;

2) According to the set window size, the data stream grouping module performs a grouping operation on arriving data packets and triggers the packet scheduler when a scheduling opportunity arrives;

3) After receiving the trigger instruction, the packet scheduler schedules each cached group according to the scheduling strategy and sends each group in turn to the connection management module;

4) The starvation avoidance module collects scheduling information from the packet scheduler and triggers the packet scheduler in due course to schedule groups that have not yet been scheduled;

5) The connection management module extracts the five-tuple information of each group sent by the packet scheduler, performs one real flow table lookup to find the corresponding flow table entry, and then uses the packets in the group to update the entry's state and other information in turn.

The operating steps are discussed in detail below.

Buffer window management module: it collects the current packet arrival rate and inter-packet gap information and, in combination with the system's delay tolerance, selects a suitable window size K. For example, three values of 64, 256 and 512 are preset, with a default of 256; the unit is packets, i.e. how many data packets the window can hold. Delay tolerance refers to the packet-processing delay the system or project can bear. If the system's delay tolerance is within 100 µs, K=64 is chosen; if it is 100-500 µs, K=256; otherwise K=512.
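The threshold policy above can be sketched directly; the thresholds (100 µs, 500 µs) and the sizes (64/256/512) come from the text, while the function name is illustrative:

```python
def choose_window(delay_tolerance_us):
    """Pick the buffer window size K (in packets) from the system's
    delay tolerance, per the preset 64/256/512 policy."""
    if delay_tolerance_us <= 100:
        return 64
    elif delay_tolerance_us <= 500:
        return 256
    return 512
```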

Data stream grouping module: as shown in Figure 2, to achieve efficient data stream grouping, the algorithm uses a hash table (PT in Figure 2) as the main structure for grouping packets, introduces index queues (Q1 and Q2 in Figure 2) to index the grouped packets, and establishes a bidirectional index relationship between the hash table and the index queues. The specific steps are as follows:

1) Whenever a data packet x arrives, the total packet count cached_num of the current buffer is increased by 1, the five-tuple information of x is extracted as its flow label x.fid, and the position j of x in PT is computed from x.fid. Go to 2).

2) If PT[j] is empty, x does not belong to any currently cached group; execute 3) to create a new group and establish the bidirectional index between PT and Q1. If PT[j] is not empty, execute 4).

3) Store x and its five-tuple information at position PT[j] and increase PT[j]'s cached packet count PT[j].pkt_count by 1; add an entry j at the tail t of Q1, establishing the one-way index from Q1 to PT[j], increasing Q1's element count by one and its total packet count Q1.pkt_count by 1; at the same time, record the Q1 and t information at position PT[j]: PT[j].Q=Q1, PT[j].idx=t, establishing the one-way index from PT to Q1. The bidirectional index between PT[j] and Q1 is now established. Go to 6).

4) PT[j] is not empty, meaning that position j of PT already maintains a group and its index information. Compare PT[j].fid with x.fid: if they are equal, x belongs to the group maintained by PT[j]; execute 5). If they are not equal, a hash collision has occurred during grouping; set the collision flag submitflag to 1 and go to 6).

5) Store x into PT[j], increase PT[j]'s cached packet count PT[j].pkt_count by 1, and increase the total packet count Q.pkt_count of the queue Q (Q1 or Q2) corresponding to j by 1. Go to 6).

6) If submitflag is 1, set submitflag to 0 and trigger the packet scheduler; after scheduling, position PT[j] is empty, so execute 4). Alternatively, if the total packet count of the current buffer reaches cached_num=K, the buffer is full and the packet scheduler should also be triggered. Afterwards, return to 1).
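Steps 1) through 6) can be condensed into the following sketch. It keeps PT, Q1, cached_num and the collision/full-buffer triggers, but as a simplification omits Q2 and the real scheduler: on a hash collision or a full buffer it simply flushes every cached group. Class and callback names are illustrative, not from the patent:

```python
class Group:
    """One cached flow group: its flow label, packets, and Q1 back-index."""
    def __init__(self, fid, idx):
        self.fid = fid      # flow label (five-tuple)
        self.pkts = []      # cached packets of this flow
        self.idx = idx      # back-index: position of this slot in Q1

class PacketGrouper:
    def __init__(self, slots, window_k, submit):
        self.pt = [None] * slots   # PT: slot -> Group
        self.q1 = []               # Q1: index queue of occupied slots
        self.slots = slots
        self.k = window_k          # buffer window size K
        self.cached_num = 0
        self.submit = submit       # callback receiving (fid, packets)

    def flush(self):
        """Submit every cached group to the connection management side."""
        for j in self.q1:
            g = self.pt[j]
            self.submit(g.fid, g.pkts)
            self.pt[j] = None
        self.q1 = []
        self.cached_num = 0

    def add(self, fid, pkt):
        j = hash(fid) % self.slots
        g = self.pt[j]
        if g is not None and g.fid != fid:   # step 4): hash collision
            self.flush()                      # step 6): collision triggers scheduling
            g = None
        if g is None:                         # step 3): new group + index into Q1
            g = Group(fid, idx=len(self.q1))
            self.pt[j] = g
            self.q1.append(j)
        g.pkts.append(pkt)                    # step 5): cache the packet
        self.cached_num += 1
        if self.cached_num == self.k:         # buffer full -> schedule
            self.flush()
```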

Packet scheduler:

1) Regarding steps 4) and 5) of the data stream grouping module: if a hash collision occurred during grouping, or if PT[j].pkt_count exceeds the historical average, execute process 2) to transfer PT[j] from Q1 into Q2;

2) As shown in Figure 3. Panel (a) shows PT[2] in Q1 being migrated to Q2; the dotted line indicates the index to be added in Q2, and PT[2] will be removed from Q1. Panel (b) shows the indexes after the migration is complete: the one-way index shown dotted has been updated to the two-way index shown solid, and the tail element PT[1] of Q1 has filled the slot vacated by PT[2], its index entry being updated from Q1:3 to Q1:2. The migration proceeds as follows: add an element j at the tail t of Q2 and increase Q2's element count by 1, establishing the one-way index from Q2 to PT[j], and add PT[j].pkt_count to Q2.pkt_count. Through the one-way index from PT[j] to Q1, find its position PT[j].idx in Q1; through the element i stored at the tail of Q1, find PT[i]; set PT[i].idx to the same value as PT[j].idx, set the element at position PT[j].idx in Q1 to i, decrease Q1's element count by 1, and subtract PT[j].pkt_count from Q1's total packet count Q1.pkt_count. This completes the removal of PT[j] from Q1. Finally, set PT[j].idx to t and PT[j].Q to Q2, establishing the index from PT[j] to Q2.

3) If scheduling was triggered by the data stream grouping module, the groups indexed in Q2 are submitted group by group to the connection management module; if it was triggered by the starvation avoidance module, the groups indexed in Q1 are submitted group by group to the connection management module.

Starvation avoidance module:

1) By collecting the trigger information of step 6) of the data stream grouping module, it triggers the packet scheduler in due course to submit the groups indexed in Q1.

2) Two counters are maintained: a grouping-module collision counter C1 and a Q2 scheduling counter C2. C1 records the number of hash collisions in the grouping module: it increases by 1 on each collision and decreases by 1 whenever the buffer fills (if C1 drops below 0 it is set to 0). C2 records the cumulative number of packets scheduled via Q2: each time Q2 is scheduled, Q2.pkt_count is added to C2; when Q1 is scheduled, C2 is reset to 0.

3) Whenever Q1.pkt_count=K, or C1 exceeds 3, or the value of C2 exceeds 10 times Q1.pkt_count, the scheduler is triggered to schedule Q1, and the counters are reset to 0.
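The counters and trigger condition of steps 2) and 3) can be sketched as follows; the class and method names are illustrative, while the thresholds (3 collisions, 10x Q1.pkt_count) come from the text:

```python
class StarvationGuard:
    """C1 counts grouping-stage hash collisions; C2 accumulates packets
    scheduled via Q2. Q1 is force-scheduled when Q1.pkt_count == K,
    C1 > 3, or C2 > 10 * Q1.pkt_count."""

    def __init__(self, k):
        self.k = k
        self.c1 = 0   # collision counter
        self.c2 = 0   # packets scheduled from Q2 since last Q1 schedule

    def on_collision(self):
        self.c1 += 1

    def on_buffer_full(self):
        self.c1 = max(0, self.c1 - 1)   # decrement, floored at 0

    def on_q2_scheduled(self, q2_pkt_count):
        self.c2 += q2_pkt_count

    def should_schedule_q1(self, q1_pkt_count):
        return (q1_pkt_count == self.k or self.c1 > 3
                or self.c2 > 10 * q1_pkt_count)

    def on_q1_scheduled(self):
        self.c1 = 0
        self.c2 = 0
```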

Connection management module:

1) For the queue Q (Q1 or Q2) sent by the scheduler, for each entry i in Q, find the corresponding PT[i] and use the first packet in PT[i] to perform the real flow table lookup.

2) If the corresponding flow table entry is found, use each packet in PT[i] to update it in turn. If it is not found, create a new flow table entry and update the flow state in turn.
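A sketch of the per-group flow table access described above, using a plain dict as the flow table and an assumed entry layout (`pkt_count`, `bytes`): a group of n packets then costs one lookup instead of n.

```python
def process_group(flow_table, fid, packets):
    """One real lookup per group (keyed by the group's five-tuple fid),
    then every cached packet updates the entry's state in turn."""
    entry = flow_table.get(fid)            # the single real lookup
    if entry is None:                      # no entry yet: create a new one
        entry = {"fid": fid, "pkt_count": 0, "bytes": 0}
        flow_table[fid] = entry
    for pkt in packets:                    # per-packet state update
        entry["pkt_count"] += 1
        entry["bytes"] += pkt["len"]
    return entry
```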

The invention was evaluated on two datasets; their basic information is shown in Table 1.

Table 1. Basic information of the datasets

The effect of the invention is evaluated by comparing runs with and without the fast flow table lookup method at different time scales. Two metrics are used: average flow table lookup length and average flow table access time. The results are shown in Figures 4-7: Figure 4 compares flow table access times in scenario A, Figure 5 compares flow table lookup lengths in scenario A, Figure 6 compares flow table access times in scenario B, and Figure 7 compares flow table lookup lengths in scenario B. The experimental results show that under a variety of traffic environments, the fast flow table lookup method proposed by the invention delivers a clear performance improvement and increases flow table access efficiency.

The specific steps of the invention are implemented with hash tables and queues, but the invention is not limited to these two data structures: other linear structures (such as stacks) may replace the queues, and other key-value-mapping data structures (such as red-black trees) may replace the hash table.

The above embodiments are only intended to illustrate the technical solution of the invention, not to limit it. Those of ordinary skill in the art may modify the technical solution or replace it with equivalents without departing from the spirit and scope of the invention; the scope of protection shall be as defined by the claims.

Claims (10)

1. A fast flow table lookup method in a high-concurrency network environment, characterized by comprising the following steps:
1) counting the traffic entering a network interface, and setting the buffer window of a buffer according to the current traffic condition obtained from the statistics;
2) according to the size of the set buffer window, performing a grouping operation on arriving data packets using their five-tuple information;
3) scheduling each cached group according to a preset scheduling strategy, and sending each group in turn to a connection management module;
4) the connection management module extracting the five-tuple information of each group, performing a flow table lookup to find the corresponding flow table entry, and updating the information of the flow table entry using the data packets in the group.
2. The method of claim 1, characterized in that step 1) collects the current packet arrival rate and inter-packet gap information and selects a suitable buffer window size in combination with the delay tolerance of the system.
3. The method of claim 1, characterized in that step 2) uses a hash table as the main structure to group the data packets, introduces index queues to index the grouped data packets, and establishes a bidirectional index relationship between the hash table and the index queues.
4. The method of claim 1, characterized in that the scheduling strategy of step 3) comprises:
A) for cached groups, groups with many packets are scheduled first, while groups with few packets are scheduled later to wait for more data packets to be cached, so as to reduce flow table access overhead;
B) groups whose scheduling is deferred must not starve, i.e. go without a scheduling opportunity for a long time, so as to avoid missed detections by the system.
5. A fast flow table lookup system in a high-concurrency network environment using the method of claim 1, characterized by comprising a buffer window management module, a data stream grouping module, a packet scheduler, a starvation avoidance module and a connection management module;
the buffer window management module is used for caching traffic statistics and sets the buffer window of the buffer according to the current traffic condition;
the data stream grouping module performs a grouping operation on arriving data packets according to the set buffer window size, and triggers the packet scheduler when a scheduling opportunity arrives;
after receiving the trigger instruction, the packet scheduler schedules each cached group according to a preset scheduling strategy and sends each group in turn to the connection management module;
the starvation avoidance module is responsible for collecting scheduling information from the packet scheduler and triggering the packet scheduler in due course to schedule unscheduled groups;
the connection management module extracts the five-tuple information of each group sent by the packet scheduler, performs a flow table lookup to find the corresponding flow table entry, and updates the information of the flow table entry using the data packets in the group.
6. The system of claim 5, characterized in that the buffer window management module collects the current packet arrival rate and inter-packet gap information and selects a suitable window size in combination with the delay tolerance of the system.
7. The system of claim 5 or 6, characterized in that the data stream grouping module groups the data packets using a hash-table-based structure, introduces index queues to index the grouped data packets, and establishes a bidirectional index relationship between the hash table and the index queues.
8. The system of claim 7, wherein the data stream grouping module performs grouping as follows:
1) when a data packet x arrives, the total cached packet count cached_num of the current buffer is increased by 1; the five-tuple information of x is extracted, its flow number is denoted x.fid, and the position j of x in the hash table PT is computed from x.fid; go to 2);
2) if PT[j] is empty, x does not belong to any currently cached group; execute 3) to create a new group and establish a bidirectional index between PT and the index queue Q1; if PT[j] is not empty, execute 4);
3) store x and its five-tuple information at position PT[j], and increase the cached packet count PT[j].pkt_count by 1; append j at the tail position t of Q1, establishing the one-way index from Q1 to PT[j]; increase the total packet count Q1.pkt_count by 1, and record Q1 and t at position PT[j]: PT[j].Q = Q1, PT[j].idx = t, establishing the one-way index from PT[j] to Q1; the bidirectional index between PT[j] and Q1 is now established; go to 6);
4) PT[j] is not empty, so position j of PT already maintains a group and its index information; compare PT[j].fid with x.fid: if they are equal, x belongs to the group maintained at PT[j], and 5) is executed; if they are unequal, a hash collision has occurred during grouping, so the collision flag submitflag is set to 1, and 6) is executed; if a hash collision has occurred during grouping, or if PT[j].pkt_count exceeds the historical average, the packet scheduling module transfers PT[j] from index queue Q1 to index queue Q2;
5) store x at PT[j], and increase the cached packet count PT[j].pkt_count by 1; increase the total packet count Q.pkt_count of the queue Q corresponding to j by 1, where Q is Q1 or Q2; go to 6);
6) if submitflag is 1, set submitflag to 0 and trigger the packet scheduling module to perform scheduling; after scheduling, position PT[j] is empty, so execute 4); alternatively, if the total cached packet count of the current buffer satisfies cached_num = K, the buffer is full, and the packet scheduling module is likewise triggered to perform scheduling; afterwards, return to 1).
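The grouping steps of claim 8 can be sketched for a single arriving packet as below. This is a simplified model under stated assumptions: the `State` container, the injectable `hash_fn`, and the representation of a packet by its flow id alone are illustrative; the Q2 queue, historical-average check, and the actual scheduling action of step 6) are omitted for brevity.

```python
K = 8                                  # illustrative buffer capacity

class Slot:
    """A cached group for one flow at a hash-table position."""
    def __init__(self, fid):
        self.fid = fid
        self.pkts = []
        self.pkt_count = 0

class State:
    """Grouping-module state: hash table PT plus index queue Q1."""
    def __init__(self, size=4):
        self.PT = [None] * size
        self.Q1 = []                   # slot indices (Q1 -> PT direction)
        self.q1_pkt_count = 0          # Q1.pkt_count in the claims
        self.cached_num = 0
        self.collisions = 0            # collisions that would set submitflag

def arrive(s, x_fid, hash_fn=hash):
    s.cached_num += 1                  # step 1: count the arriving packet
    j = hash_fn(x_fid) % len(s.PT)     # step 1: position in PT from x.fid
    if s.PT[j] is None:                # steps 2/3: create a new group
        s.PT[j] = Slot(x_fid)
        s.Q1.append(j)                 # one-way index Q1 -> PT[j]
    elif s.PT[j].fid != x_fid:         # step 4: hash collision
        s.collisions += 1              # the claims set submitflag here
        return j, "collision"
    s.PT[j].pkts.append(x_fid)         # steps 3/5: cache x in the group
    s.PT[j].pkt_count += 1
    s.q1_pkt_count += 1
    full = (s.cached_num == K)         # step 6: a full buffer also triggers
    return j, "full" if full else "ok"
```

A collision or a full buffer is the point at which the real system would trigger the packet scheduling module.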
9. The system of claim 8, wherein the packet scheduling module performs scheduling as follows:
1] for steps 4) and 5) of the data stream grouping module: if a hash collision has occurred during grouping, or if PT[j].pkt_count exceeds the historical average, execute procedure 2] to transfer PT[j] from index queue Q1 to index queue Q2;
2] append element j at the tail position t of Q2 and increase the element count of Q2 by 1, establishing the one-way index from Q2 to PT[j]; add PT[j].pkt_count to Q2.pkt_count; via the one-way index from PT[j] to Q1, find its position PT[j].idx in Q1; via the element i stored at the tail t of Q1, find PT[i]; set PT[i].idx to the same value as PT[j].idx, and set the element at position PT[j].idx in Q1 to i; decrease the element count of Q1 by 1 and subtract PT[j].pkt_count from the total packet count Q1.pkt_count, thereby completing the deletion of PT[j] from Q1; set PT[j].idx to t and PT[j].Q to Q2, establishing the index from PT[j] to Q2;
3] if scheduling was triggered by the data stream grouping module, the groups indexed by Q2 are submitted group by group to the connection management module; if scheduling was triggered by the starvation avoidance module, the groups indexed in Q1 are submitted group by group to the connection management module.
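Step 2] of claim 9 is a swap-with-tail deletion made possible by the bidirectional index, so removing a group from Q1 costs O(1). The sketch below is an illustrative reading of that step; the `Slot` class, the `counts` dictionary standing in for Q1.pkt_count/Q2.pkt_count, and the dict-based PT are assumptions.

```python
class Slot:
    """Minimal group record with the claim-9 index fields."""
    def __init__(self, pkt_count=0):
        self.pkt_count = pkt_count
        self.Q = None          # queue currently indexing this slot
        self.idx = None        # position of this slot in that queue

def move_to_q2(PT, Q1, Q2, j, counts):
    """Transfer PT[j]'s index from Q1 to Q2 per claim 9, step 2]."""
    # Append j at the tail of Q2 and account for its packets.
    Q2.append(j)
    counts['Q2'] += PT[j].pkt_count
    # Delete j from Q1: overwrite its position with the tail element i,
    # fix i's back-pointer, then drop the tail slot.
    pos = PT[j].idx
    i = Q1[-1]
    Q1[pos] = i
    PT[i].idx = pos
    Q1.pop()
    counts['Q1'] -= PT[j].pkt_count
    # Re-establish PT[j]'s one-way index, now pointing into Q2.
    PT[j].Q = Q2
    PT[j].idx = len(Q2) - 1
```

Note that the swap also works when j is itself the tail of Q1, since the overwrite and back-pointer fix are then no-ops before the pop.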
10. The system of claim 9, wherein the starvation avoidance module operates as follows:
1] collect the trigger information of step 6) of the data stream grouping module, and when appropriate trigger the packet scheduling module to submit the groups indexed in Q1;
2] a grouping-module collision counter C1 and a Q2 scheduling counter C2 are maintained; C1 records the number of hash collisions in the grouping module, increasing by 1 on each collision and decreasing by 1 whenever the buffer is full, and is reset to 0 if it would fall below 0; C2 records the cumulative number of packets scheduled from Q2: each time Q2 is scheduled, Q2.pkt_count is added to C2, and C2 is reset to 0 when Q1 is scheduled;
3] when Q1.pkt_count = K, or the value of C1 exceeds 3, or C2 exceeds 10 times Q1.pkt_count, the scheduler is triggered to schedule Q1, and the counters are reset to 0.
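The counter logic of claim 10 can be sketched as a small guard object. The thresholds (C1 > 3, C2 > 10 × Q1.pkt_count, Q1.pkt_count = K) come directly from the claim text; the class name, the callback wiring, and the method boundaries are illustrative assumptions.

```python
class StarvationGuard:
    """Claim-10 starvation avoidance: force-schedule Q1 when it risks
    being starved by repeated Q2 scheduling."""
    def __init__(self, K, schedule_q1):
        self.K = K
        self.C1 = 0                    # grouping-module collision counter
        self.C2 = 0                    # cumulative packets scheduled from Q2
        self.schedule_q1 = schedule_q1 # callback that schedules Q1

    def on_collision(self):
        self.C1 += 1                   # each hash collision increments C1

    def on_buffer_full(self):
        self.C1 = max(0, self.C1 - 1)  # decay C1, clamped at zero

    def on_q2_scheduled(self, q2_pkt_count):
        self.C2 += q2_pkt_count        # accumulate Q2.pkt_count per schedule

    def maybe_trigger(self, q1_pkt_count):
        """Force-schedule Q1 when any starvation condition holds."""
        if (q1_pkt_count == self.K or self.C1 > 3
                or self.C2 > 10 * q1_pkt_count):
            self.schedule_q1()
            self.C1 = self.C2 = 0      # reset counters after scheduling Q1
            return True
        return False
```

Intuitively, C1 detects a collision-heavy workload and C2 detects Q2 monopolizing the scheduler; either condition forces the long-waiting Q1 groups through to the connection management module.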
CN201610330417.3A 2016-05-18 2016-05-18 Quickly flow stream searching method and system under a kind of high concurrent network environment Active CN106059957B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610330417.3A CN106059957B (en) 2016-05-18 2016-05-18 Quickly flow stream searching method and system under a kind of high concurrent network environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610330417.3A CN106059957B (en) 2016-05-18 2016-05-18 Quickly flow stream searching method and system under a kind of high concurrent network environment

Publications (2)

Publication Number Publication Date
CN106059957A CN106059957A (en) 2016-10-26
CN106059957B true CN106059957B (en) 2019-09-10

Family

ID=57177171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610330417.3A Active CN106059957B (en) 2016-05-18 2016-05-18 Quickly flow stream searching method and system under a kind of high concurrent network environment

Country Status (1)

Country Link
CN (1) CN106059957B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106789733B (en) * 2016-12-01 2019-12-20 北京锐安科技有限公司 Device and method for improving large-scale network flow table searching efficiency
CN107707479B (en) * 2017-10-31 2021-08-31 北京锐安科技有限公司 Five-tuple rule searching method and device
CN109921996B (en) * 2018-12-29 2021-11-09 长沙理工大学 High-performance OpenFlow virtual flow table searching method
CN111163104B (en) * 2020-01-02 2021-03-16 深圳市高德信通信股份有限公司 Network security protection system for enterprise
CN111538694B (en) * 2020-07-09 2020-11-10 常州楠菲微电子有限公司 Data caching method for network interface to support multiple links and retransmission
CN112131223B (en) * 2020-09-24 2024-02-02 曙光网络科技有限公司 Traffic classification statistical method, device, computer equipment and storage medium
CN113098911B (en) * 2021-05-18 2022-10-04 神州灵云(北京)科技有限公司 Real-time analysis method of multi-segment link network and bypass packet capturing system
CN115665051B (en) * 2022-12-29 2023-03-28 北京浩瀚深度信息技术股份有限公司 Method for realizing high-speed flow table based on FPGA + RLDRAM3

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103248573A (en) * 2013-04-08 2013-08-14 北京天地互连信息技术有限公司 Centralization management switch for OpenFlow and data processing method of centralization management switch

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103248573A (en) * 2013-04-08 2013-08-14 北京天地互连信息技术有限公司 Centralization management switch for OpenFlow and data processing method of centralization management switch

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fast Hash Table Lookup Using Extended Bloom Filter: An Aid to Network Processing; Haoyu Song et al.; ACM SIGCOMM Computer Communication Review; 31 Oct. 2005; full text *

Also Published As

Publication number Publication date
CN106059957A (en) 2016-10-26

Similar Documents

Publication Publication Date Title
CN106059957B (en) Quickly flow stream searching method and system under a kind of high concurrent network environment
US11818037B2 (en) Switch device for facilitating switching in data-driven intelligent network
US11855901B1 (en) Visibility sampling
US10673770B1 (en) Intelligent packet queues with delay-based actions
US11968129B1 (en) Delay-based tagging in a network switch
US9380007B2 (en) Method, apparatus and system for packet reassembly and reordering
CN116671081B (en) Delay-based automatic queue management and tail drop
US10313255B1 (en) Intelligent packet queues with enqueue drop visibility and forensics
US9755947B2 (en) Hierarchical self-organizing classification processing in a network switch
US20170286157A1 (en) Work Conserving, Load Balancing, and Scheduling
CN101834790B (en) Multicore processor based flow control method and multicore processor
US9094219B2 (en) Network processor having multicasting protocol
US8588242B1 (en) Deficit round robin scheduling using multiplication factors
CN101136854A (en) A method and device for realizing line-speed processing of data packets
CN115632925B (en) Time-certainty fusion network architecture and data communication method
CN101150490B (en) A queue management method and system for unicast and multicast service data packet
CN115086238B (en) TSN network port output scheduling device
US9374303B1 (en) Method and apparatus for processing multicast packets
US20230022037A1 (en) Flow-based management of shared buffer resources
CN118509382A (en) Real-time, time-aware, dynamic, context-aware, and reconfigurable Ethernet packet classification
US20240056385A1 (en) Switch device for facilitating switching in data-driven intelligent network
CN108173784B (en) Aging method and device for data packet cache of switch
CN115835402A (en) Time-sensitive network flow scheduling method and device based on dynamic priority of data frames
US12360924B2 (en) Method and system for facilitating lossy dropping and ECN marking
CN118381771B (en) RDMA data stream on-network reordering method based on switch

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant