CN103944783A - Flow identification method based on posterior features - Google Patents


Publication number
CN103944783A
Authority
CN
China
Prior art keywords
stream
packet
posteriority
strategy
derivation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410165425.8A
Other languages
Chinese (zh)
Other versions
CN103944783B (en)
Inventor
王雨
张风雨
赵靓
申娟
李玉峰
姜鲲鹏
朱圣平
周锟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PLA Information Engineering University
Original Assignee
PLA Information Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PLA Information Engineering University
Priority to CN201410165425.8A
Publication of CN103944783A
Application granted
Publication of CN103944783B
Legal status: Active
Anticipated expiration legal-status Critical


Abstract

The invention relates to a flow identification method based on posterior features. The method comprises the following steps: (1) setting posterior strategies; (2) setting derivation strategies and an ageing time Tr; (3) establishing a derivation flow table; (4) establishing a backtracking data pool; (5) matching packets against the posterior strategies; (6) extracting the flow feature information contained in packets that hit the posterior strategies, creating entries of the derivation flow table, and storing in each entry the complete flow feature information and the timestamp Tm at which the match occurred; (7) writing incoming packets into the backtracking data pool to delay them, reading the delayed packets and extracting their flow feature information, looking up the derivation flow table by the hash value of the flow feature information, recording the current time as Tn, and marking the current packet as a hit packet if the flow feature information in the derivation flow table matches that of the delayed packet and the condition Tn - Tm < Tr is met. The flow identification method based on posterior features is easy to implement and highly reliable.

Description

Flow identification method based on posterior features
(1) Technical field: the present invention relates to a flow identification method, and in particular to a flow identification method based on posterior features.
(2) Background: service flow identification and classification, which allows different service flows to be handled by class, is widely used in conventional network equipment. In current network equipment, real-time flow identification and classification is always based on specific a priori strategies; that is, the flow features are extracted after a strategy match occurs, and only the traffic that arrives afterwards is processed.
With this a priori processing logic, when the complete flow data must be extracted, the packets of the flow that arrived before the strategy match occurred cannot be effectively identified or classified. As a result, the complete flow data cannot be obtained and no action can be taken on it.
(3) Summary of the invention:
The technical problem to be solved by the invention is to provide a flow identification method based on posterior features that can backtrack over all the packets of a data flow, is simple to implement, and is highly reliable.
Technical solution of the invention:
A flow identification method based on posterior features, comprising the following steps:
Step 1: set the posterior strategies;
Step 2: set the derivation strategies and the ageing time Tr, where Tr is the period during which a derivation strategy remains in effect;
Step 3: build the derivation flow table;
Step 4: build the backtracking data pool;
Step 5: match the packets entering the identification system against the posterior strategies; if a packet P hits, the data flow satisfies the posterior strategy condition at that moment, and all packets of that data flow are marked and output;
Step 6: extract the flow feature information contained in the packet that hit the posterior strategy, hash the flow feature information, use the hash value as the lookup key to create an entry of the derivation flow table, store in the entry the complete flow feature information and the timestamp Tm at which the match occurred, and write the entry into the derivation flow table;
Step 7: write the packets entering the identification system into the backtracking data pool and delay them in the pool's memory for a duration Td; then read out the delayed packet Pd and extract its flow feature information, look up the derivation flow table by the hash value of this flow feature information, and record the current time as Tn; if the flow feature information in the derivation flow table matches that of the delayed packet, compare against the timestamp Tm, and if Tn - Tm < Tr, mark the current packet as a hit packet.
A posterior strategy has the following characteristic: at some moment during the lifetime of a data flow, a packet matches the features of the posterior strategy, and the processing action required after the successful match is to backtrack over the packets of that data flow that arrived earlier;
A derivation strategy is extracted and derived from the packet that hit the posterior strategy; a derivation strategy corresponds to a unique data flow and to any packet of that data flow;
The derivation flow table contains N derivation strategies, where N is a natural number greater than or equal to 1; the index of the derivation flow table is the hash value of the derived flow's feature information, and the table holds the complete feature information of the derived flow and the generation time of the derivation strategy;
The backtracking data pool uses two memories in a ping-pong arrangement to store and read packets respectively.
"All packets" in step 5 includes the packets that entered the identification system before the match hit.
The flow feature information in step 6 includes the five-tuple.
The duration Td in step 7 is either determined dynamically from indicators such as the design capacity of the identification system, the average duration of data flows, and the data input rate, or set to a fixed value that is less than the duration the design capacity of the identification system can sustain. The delay is the key means of achieving posterior identification: it ensures that the derivation strategy has already been generated before the delayed packet arrives.
To resolve hash collisions, the hash bucket depth can be set greater than 2; when a collision occurs, the timestamps Tm of the entries are compared and the earlier entry is overwritten.
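The steps above can be sketched in software as follows. This is only a toy model under stated assumptions, not the patented hardware design: the class name `PosteriorFlowIdentifier`, its methods, the use of byte-string keywords as posterior strategies, and the plain dictionary standing in for the derivation flow table are all hypothetical choices made for illustration.

```python
from collections import deque


class PosteriorFlowIdentifier:
    """Toy software model of steps 5 to 7 (all names here are illustrative)."""

    def __init__(self, keywords, delay_td, ageing_tr):
        self.keywords = keywords    # posterior strategies, assumed to be byte strings
        self.delay_td = delay_td    # Td: how long each packet waits in the pool
        self.ageing_tr = ageing_tr  # Tr: lifetime of a derivation strategy
        self.derived = {}           # derivation flow table: hash -> (flow key, Tm)
        self.pool = deque()         # backtracking data pool: (arrival, flow key, payload)

    def ingest(self, now, flow_key, payload):
        """Feed one packet; return the packets released from the pool at `now`."""
        # Steps 5 and 6: on a posterior hit, record the flow and the timestamp Tm.
        if any(k in payload for k in self.keywords):
            self.derived[hash(flow_key)] = (flow_key, now)
        # Step 7: every packet enters the pool and is delayed by Td.
        self.pool.append((now, flow_key, payload))
        return self.release(now)

    def release(self, now):
        """Return (flow key, payload, hit) for every packet delayed by at least Td."""
        out = []
        while self.pool and now - self.pool[0][0] >= self.delay_td:
            _, flow_key, payload = self.pool.popleft()
            entry = self.derived.get(hash(flow_key))
            # A hit needs an exact flow-key match and Tn - Tm < Tr (Tn is `now`).
            hit = (entry is not None and entry[0] == flow_key
                   and now - entry[1] < self.ageing_tr)
            out.append((flow_key, payload, hit))
        return out
```

Because the posterior match runs on the undelayed packet while the table lookup runs on the delayed copy, a packet that arrived before the matching packet can still be marked as a hit, provided the match happens within Td of its arrival.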
A flow identification system based on posterior features, implemented on FPGA/CAM/SRAM/DDR-II, where CAM stands for content-addressable memory. The system comprises:
Prescreening engine: this module uses CAM-based table lookup logic to prescreen specific flows according to known conditions, reducing the traffic that enters the posterior strategy flow identification system and thereby allowing the system a longer delay duration Td;
Posterior strategy matching engine: this module uses CAM-based table lookup logic to perform posterior strategy matching;
Derivation flow table maintenance module: this module maintains the derivation flow entries generated after a posterior strategy match and writes the flow entries into SRAM for storage;
Backtracking data pool module: this module delays packets by switching two DDR-II memories between storing and reading, giving the system its backtracking capability;
Derivation flow table lookup engine: this module matches the delayed packets against the flow entries and marks the packets according to the match result.
Beneficial effects of the invention:
1. The invention can perform posterior strategy matching on any packet of a data flow. After a match occurs, the feature information of the flow is extracted and applied to the packets of that flow that arrived within a preceding period of time, so that all packets of the data flow can be backtracked. This backtracking lets the posterior strategy identify and classify packets that arrived earlier, which largely preserves the completeness of the hit flow.
2. The invention is simple to implement and needs no large-scale external storage; all functions can be implemented on a single circuit board, so reliability is high.
3. The invention is flexible: by dynamically adjusting parameters such as the prescreening strategy and the strategy ageing time, the backtracking time supported for flows can be adjusted dynamically.
(4) Brief description of the drawings:
Fig. 1 is a structural diagram of the flow identification system based on posterior features;
Fig. 2 is a schematic diagram of lookup key extraction for posterior strategy matching in the flow identification system based on posterior features;
Fig. 3 is a schematic diagram of the table contents for posterior strategy keywords in the flow identification system based on posterior features;
Fig. 4 is a schematic diagram of the flow table contents generated by the posterior strategy matching engine in the flow identification system based on posterior features;
Fig. 5 is a schematic diagram of the backtracking data pool, which uses a dual-memory ping-pong delay structure, in the flow identification system based on posterior features.
(5) Detailed description of the embodiments:
For a better understanding of the invention, the technical solution is described below with reference to the proposed flow identification system based on posterior features.
As shown in Fig. 1, packets entering the system are first filtered by the prescreening engine. The purpose of prescreening is to reduce the volume of traffic entering the system, so that a longer backtracking time can be provided with limited memory. Prescreening can be performed using flow feature keywords stored in the CAM, or by directly designating a particular interface or branch of the raw data, or data that meets a particular feature.
The data output by the prescreening module is sent both to the packet delay module, where it is delayed, and to the posterior feature matching engine, where it is matched.
The posterior feature matching engine performs the posterior strategy matching of the data flow.
A posterior strategy is usually a sensitive word in the text of the data flow, and mostly takes the form of a character string. The system uses a CAM chip to search for sensitive words in packets. As shown in Fig. 2, let the supported sensitive-word width be CL bytes and the entry width be PL bytes; starting from the beginning of the packet body, a lookup key is extracted every PL-CL+1 bytes and sent to the CAM chip for lookup. For each sensitive word, PL-CL masked entries should be derived in the CAM chip according to the offsets at which the word may appear within a lookup key, as shown in Fig. 3.
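The windowing scheme can be illustrated with a short software emulation. This is a sketch under assumptions, not the CAM design itself: the function names are hypothetical, the ternary CAM is emulated with (value, mask) byte pairs, a short final window is padded with zero bytes (so the sketch assumes sensitive words contain no zero bytes), and the sketch enumerates all PL-CL+1 possible offsets, which reads the text's PL-CL derived entries as being in addition to the unshifted one.

```python
def lookup_keys(payload: bytes, pl: int, cl: int):
    """Yield (offset, key): PL-byte windows taken every PL-CL+1 bytes."""
    stride = pl - cl + 1
    # The last window must start no later than the last possible word start.
    for start in range(0, max(len(payload) - cl, 0) + 1, stride):
        yield start, payload[start:start + pl]


def masked_entries(word: bytes, pl: int):
    """Build one (value, mask) pair per offset the word can occupy in a key."""
    cl = len(word)
    entries = []
    for off in range(pl - cl + 1):
        pad_after = pl - cl - off
        value = bytes(off) + word + bytes(pad_after)          # zeros outside the word
        mask = bytes(off) + b"\xff" * cl + bytes(pad_after)   # compare only word bytes
        entries.append((value, mask))
    return entries


def cam_match(key: bytes, entries, pl: int) -> bool:
    """Emulate a ternary CAM lookup: masked comparison against every entry."""
    key = key.ljust(pl, b"\x00")  # pad a short final window (assumption)
    return any(
        all((k & m) == v for k, v, m in zip(key, value, mask))
        for value, mask in entries
    )


def contains_word(payload: bytes, word: bytes, pl: int) -> bool:
    """Return True if any extracted lookup key hits one of the word's entries."""
    entries = masked_entries(word, pl)
    return any(cam_match(key, entries, pl)
               for _, key in lookup_keys(payload, pl, len(word)))
```

A stride of PL-CL+1 is the largest step at which every CL-byte word still falls wholly inside at least one PL-byte window, at an offset between 0 and PL-CL; this is why one masked entry per offset is enough.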
The posterior feature matching engine hashes the flow feature information of the packet currently being looked up. When a sensitive word of a posterior strategy is detected in the packet, the CAM chip returns a hit indication and the ID of the strategy that was hit. On the hit indication, the posterior feature matching engine reads the flow table using the hash value as the index; the structure of the flow table is shown in Fig. 4. The flow table holds the complete flow feature information and the timestamp of the most recent match hit. With a hash bucket depth of 2, the engine looks for an empty entry among the two entries of the bucket and writes the flow feature information and hit time of the current packet into it; if there is no empty entry, it overwrites whichever of the two entries was generated earlier. The flow table is then updated.
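The bucket handling described above can be sketched as follows. The class `DerivationFlowTable` and its methods are hypothetical names, Python's built-in `hash` stands in for the unspecified hardware hash, and refreshing the timestamp of a flow that is already present is an assumption suggested by the "most recent match hit" wording rather than something the text spells out.

```python
class DerivationFlowTable:
    """Hash table with fixed-depth buckets that overwrites the oldest entry."""

    def __init__(self, n_buckets=1024, depth=2):
        self.n_buckets = n_buckets
        # Each slot is None (empty) or a (flow key, Tm) pair.
        self.buckets = [[None] * depth for _ in range(n_buckets)]

    def _bucket(self, flow_key):
        return self.buckets[hash(flow_key) % self.n_buckets]

    def record_hit(self, flow_key, tm):
        """Store the flow and its hit time Tm after a posterior strategy hit."""
        bucket = self._bucket(flow_key)
        # Assumption: a flow already in the table just has its timestamp refreshed.
        for i, entry in enumerate(bucket):
            if entry is not None and entry[0] == flow_key:
                bucket[i] = (flow_key, tm)
                return
        # Otherwise use an empty slot if one exists.
        for i, entry in enumerate(bucket):
            if entry is None:
                bucket[i] = (flow_key, tm)
                return
        # Bucket full: overwrite the entry with the earliest timestamp.
        oldest = min(range(len(bucket)), key=lambda i: bucket[i][1])
        bucket[oldest] = (flow_key, tm)

    def lookup(self, flow_key):
        """Return Tm for an exact flow-key match, or None if absent."""
        for entry in self._bucket(flow_key):
            if entry is not None and entry[0] == flow_key:
                return entry[1]
        return None
```

Storing the complete flow key in each slot is what allows the exact comparison after the hash lookup, so two flows that share a hash value are never confused.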
The packet delay module delays the packets. The system uses two memories in a ping-pong arrangement to store and read packets respectively. As shown in Fig. 5, let t1 be the time it takes incoming packets to fill a single memory and t2 the maximum packet output delay the system can accept; when the timer reaches Min(t1, t2), the two memories swap their read and write roles. Suppose that, with data injected at full rate, a single memory can hold at most T0 worth of traffic, and that t2 > T0. Since filling a memory can never take less than T0, t1 ≥ T0 and therefore Min(t1, t2) ≥ T0; that is, the system can provide a delay of at least T0. In practice, because of the prescreening engine at the front end, the delay the system can provide is much greater than T0. The delay the system can provide determines how far back it can backtrack over packets.
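A minimal software analogue of the ping-pong switching is given below. It is a simplification made for illustration: the class `PingPongDelay` and its methods are hypothetical, time is supplied by the caller, and the filled buffer is handed over in one go at the swap, whereas the hardware would stream it out while the other memory is being written.

```python
class PingPongDelay:
    """Two buffers that swap write and read roles every Min(t1, t2)."""

    def __init__(self, t1, t2):
        self.period = min(t1, t2)  # switching interval Min(t1, t2)
        self.write_buf = []        # buffer currently being filled
        self.read_buf = []         # buffer currently being read out
        self.last_swap = 0.0

    def tick(self, now):
        """Swap roles if the timer has expired; return the packets now readable."""
        if now - self.last_swap < self.period:
            return []
        # The filled buffer becomes the read side; writing restarts on an empty one.
        self.write_buf, self.read_buf = [], self.write_buf
        self.last_swap = now
        return list(self.read_buf)

    def push(self, now, packet):
        """Store one packet, swapping first if the interval has elapsed."""
        released = self.tick(now)
        self.write_buf.append(packet)
        return released
```

For example, with `t1=4` and `t2=10` the buffers swap every 4 time units, so packets pushed at times 0 and 1 become readable when the next packet arrives at time 4.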
The flow table lookup engine matches the delayed packets against the flow table on a per-flow basis. The delayed packet is hashed with the same HASH algorithm as described above; the flow table is then read by the hash value and the packet is compared exactly against the flow feature information stored in the entry. If they agree, the generation time Tm of the entry is taken; with the current time Tn and the derivation strategy ageing time Tr, if Tn - Tm < Tr the packet is judged to hit the flow table and is output after being marked accordingly.

Claims (5)

1. A flow identification method based on posterior features, characterized in that it comprises the following steps:
Step 1: set the posterior strategies;
Step 2: set the derivation strategies and the ageing time Tr, where Tr is the period during which a derivation strategy remains in effect;
Step 3: build the derivation flow table;
Step 4: build the backtracking data pool;
Step 5: match the incoming packets against the posterior strategies; if a packet hits, the data flow satisfies the posterior strategy condition at that moment, and all packets of that data flow are marked and output;
Step 6: extract the flow feature information contained in the packet that hit the posterior strategy, hash the flow feature information, use the hash value as the lookup key to create an entry of the derivation flow table, store in the entry the complete flow feature information and the timestamp Tm at which the match occurred, and write the entry into the derivation flow table;
Step 7: write the incoming packets into the backtracking data pool and delay them in the pool's memory for a duration Td; then read out the delayed packet and extract its flow feature information, look up the derivation flow table by the hash value of this flow feature information, and record the current time as Tn; if the flow feature information in the derivation flow table matches that of the delayed packet, compare against the timestamp Tm, and if Tn - Tm < Tr, mark the current packet as a hit packet.
2. The flow identification method based on posterior features according to claim 1, characterized in that: the posterior strategy has the following characteristic: at some moment during the lifetime of a data flow, a packet matches the features of the posterior strategy, and the processing action required after the successful match is to backtrack over the packets of that data flow that arrived earlier;
the derivation strategy is extracted and derived from the packet that hit the posterior strategy, and corresponds to a unique data flow and to any packet of that data flow;
the derivation flow table contains N derivation strategies, where N is a natural number greater than or equal to 1; the index of the derivation flow table is the hash value of the derived flow's feature information, and the table holds the complete feature information of the derived flow and the generation time of the derivation strategy;
the backtracking data pool uses two memories in a ping-pong arrangement to store and read packets respectively.
3. The flow identification method based on posterior features according to claim 1, characterized in that: all packets in step 5 include the packets that entered before the match hit.
4. The flow identification method based on posterior features according to claim 1, characterized in that: the flow feature information in step 6 includes the five-tuple.
5. The flow identification method based on posterior features according to claim 1, characterized in that: the duration Td in step 7 is either determined dynamically from indicators such as the design capacity of the identification, the average duration of data flows, and the data input rate, or set to a fixed value that is less than the duration the design capacity of the identification can sustain.
CN201410165425.8A 2014-04-23 2014-04-23 Stream recognition method based on posteriority feature Active CN103944783B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410165425.8A CN103944783B (en) 2014-04-23 2014-04-23 Stream recognition method based on posteriority feature

Publications (2)

Publication Number Publication Date
CN103944783A true CN103944783A (en) 2014-07-23
CN103944783B CN103944783B (en) 2017-06-09

Family

ID=51192276

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410165425.8A Active CN103944783B (en) 2014-04-23 2014-04-23 Stream recognition method based on posteriority feature

Country Status (1)

Country Link
CN (1) CN103944783B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114301812A (en) * 2021-12-29 2022-04-08 北京物芯科技有限责任公司 Method, device, equipment and storage medium for monitoring message processing result

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080147402A1 (en) * 2006-01-27 2008-06-19 Woojay Jeon Automatic pattern recognition using category dependent feature selection
CN101459554A (en) * 2008-12-30 2009-06-17 成都市华为赛门铁克科技有限公司 Method and apparatus for data stream detection
US20110213742A1 (en) * 2010-02-26 2011-09-01 Lemmond Tracy D Information extraction system
CN102959543A (en) * 2010-05-04 2013-03-06 沙扎姆娱乐有限公司 Methods and systems for processing sample of media stream

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘三民等: "一种基于SVM后验概率的网络流量识别方法", 《计算机工程》 *
江军等: "IPTV媒体流识别技术的研究与实现", 《通信市场》 *

Also Published As

Publication number Publication date
CN103944783B (en) 2017-06-09

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant