CN105871726A - Mode matching method for dynamically adding tree node and unit based on common prefix

Publication number
CN105871726A
Authority
CN
China
Prior art keywords
address, unit, tree, node, network
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610161030.XA
Other languages
Chinese (zh)
Inventor
王巍
杨武
苘大鹏
玄世昌
朱秋阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-03-21
Filing date
2016-03-21
Publication date
2016-08-17
Application filed by Harbin Engineering University
Priority to CN201610161030.XA
Publication of CN105871726A
Legal status: Pending

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 - Routing or path finding of packets in data switching networks
    • H04L45/74 - Address processing for routing
    • H04L45/745 - Address table lookup; Address filtering
    • H04L45/7453 - Address table lookup; Address filtering using hashing
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 - Network arrangements, protocols or services for addressing or naming
    • H04L61/09 - Mapping addresses
    • H04L61/25 - Mapping addresses of the same type
    • H04L61/2503 - Translation of Internet protocol [IP] addresses
    • H04L61/251 - Translation of Internet protocol [IP] addresses between different IP versions
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00 - Indexing scheme associated with group H04L61/00
    • H04L2101/60 - Types of network addresses
    • H04L2101/618 - Details of network addresses
    • H04L2101/659 - Internet protocol version 6 [IPv6] addresses

Abstract

The invention belongs to the technical field of network information processing and specifically relates to a pattern matching method, based on common prefixes, that dynamically adds tree nodes and units. The method comprises: (1) a pattern loading operation, in which the search starts from the root node when a new IP address is added to the dictionary tree (trie); and (2) a network address lookup operation, in which a network IP address is looked up in the trie structure and the fan-out width is used during unit search to select the lateral search method. Compared with existing methods, the invention proposes a method for dynamically adding tree nodes and units based on common substrings, designs a lookup scheme for the lateral units, and, building on the longitudinal compression of the tree, reduces the number of memory accesses during lookup.

Description

A pattern matching method, based on common prefixes, that dynamically adds tree nodes and units
Technical field
The invention belongs to the technical field of network information processing and specifically relates to a pattern matching method, based on common prefixes, that dynamically adds tree nodes and units.
Background art
Network security is currently a much-discussed topic. With the widespread adoption of network applications, the number of Internet users has grown sharply, and a network address is used to uniquely identify each user on the network. Among existing Internet filtering techniques, filtering by network address is widely used, and the blacklist is one practical application of this kind of filter. A blacklist lists the IP addresses that are blocked. Each time a connection is being established, the accessed network address is checked against the blacklist; if it is listed, the session is filtered out, which improves the real-time performance of Internet filtering.
Neil Barrett once likened the growth in the number of Internet users to a bullet in flight. IPv4 addresses are no longer sufficient, and CIDR (Classless Inter-Domain Routing) and IPv6 addressing were proposed to meet the challenge of address exhaustion and to increase the number of users the Internet can carry. As network addressing develops, multi-pattern matching algorithms based on network addresses should improve as well: they must handle not only IPv4 lookups but also high-speed lookups of IPv6 addresses. The growing number of users also increases the number of illegitimate users, so a pattern matching algorithm for address filtering must also account for large-scale pattern sets. In view of this, the main purpose of the invention is to build an IP address matching method that handles large-scale pattern sets and meets the real-time requirement of lookup.
In network address filtering, the commonly used data structures are the hash table and the dictionary tree (trie), each with advantages and disadvantages. A hash table must account for hash collisions: it needs both a good hash algorithm and a good collision-resolution scheme. XOR hashing is a scheme that currently performs fairly well, but for a massive set of network addresses over a limited character set, collisions become more frequent as the pattern set grows, so the hash table is not an ideal base structure. For a trie, the usual scheme of storing a single character per tree unit has a relatively large memory overhead but no collision problem. However, searching the tree nodes requires many memory accesses: a complete IPv6 address requires 128 memory accesses, which seriously affects the lookup time of a network address.
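As an illustration only, the access count mentioned above can be reproduced with a plain trie. The sketch assumes one bit is stored per tree level, which is one reading of the figure of 128 accesses; the class name `BitTrie` and the helper `bits_of` are hypothetical and do not come from the patent.

```python
import ipaddress


def bits_of(addr: str) -> str:
    """Return the address as '0'/'1' characters (32 for IPv4, 128 for IPv6)."""
    ip = ipaddress.ip_address(addr)
    return format(int(ip), f"0{ip.max_prefixlen}b")


class BitTrie:
    """Plain trie storing one bit per level (assumed baseline, not the invention)."""

    def __init__(self):
        self.root = {}

    def insert(self, addr: str) -> None:
        node = self.root
        for b in bits_of(addr):
            node = node.setdefault(b, {})
        node["end"] = True

    def lookup(self, addr: str):
        """Return (found, node_accesses); each level descended counts as one access."""
        node, accesses = self.root, 0
        for b in bits_of(addr):
            if b not in node:
                return False, accesses
            node = node[b]
            accesses += 1
        return node.get("end", False), accesses
```

Under this assumption every successful IPv6 lookup descends 128 levels, which is the cost that storing substrings in units is meant to reduce.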
Summary of the invention
The purpose of the invention is to provide a pattern matching method, based on common prefixes, that dynamically adds tree nodes and units and reduces the number of memory accesses during network IP address lookup.
The purpose of the invention is achieved as follows:
A pattern matching method, based on common prefixes, that dynamically adds tree nodes and units:
(1) Pattern loading operation: when a new IP address is added to the trie, the search starts from the root node and first checks whether the IP address already exists in the trie. If it exists, it is not added; otherwise the IP address is added to the trie.
(2) Network address lookup operation:
When a network IP address is looked up in the trie structure, the fan-out width is used during unit search to select the lateral search method; that is, sequential search, binary search, and direct positioning are used in combination. Each node has a fan-out width attribute, and threshold boundaries are defined for the three lookup modes to decide which unit search method applies.
When a network address is looked up in the trie, the path information is searched starting from the root node, in line with the pattern loading process. The unit search combines sequential search, binary search, and direct positioning, and the fan-out width of the node determines which search method is selected.
The beneficial effects of the invention are as follows:
Compared with existing methods, the invention proposes a method for dynamically adding tree nodes and units based on common substrings, designs a lookup scheme for the lateral units, and, building on the longitudinal compression of the tree, reduces the number of memory accesses during lookup.
Brief description of the drawings
Fig. 1 is the flowchart of pattern loading in the invention;
Fig. 2 illustrates how the stored content of tree nodes and units changes during pattern loading in the invention;
Fig. 3 is the flowchart of adding a tree unit in the invention;
Fig. 4 is the flowchart of tree unit lookup in the invention;
Fig. 5 is the flowchart of network address lookup in the invention;
Fig. 6 compares the matching efficiency of the invention, the XOR hash algorithm, and the traditional trie on IPv4 addresses;
Fig. 7 compares the matching efficiency of the invention and the traditional trie on IPv4 addresses;
Fig. 8 compares the memory usage of the invention and the traditional trie on IPv6 addresses;
Fig. 9 compares the matching efficiency of the invention and the traditional trie on IPv6 addresses.
Detailed description
The invention is further described below with reference to the drawings.
The invention describes a common-prefix-based pattern matching method for massive sets of IP addresses; its basic data structure is the trie. During IP address pattern loading, patterns are added dynamically: the stored content of tree nodes and units is modified dynamically on the basis of common prefixes, so that a unit no longer stores a single character but a substring of the pattern, which achieves longitudinal compression of the trie. The invention also takes the lateral fan-out width of nodes into account: the first character stored in each unit determines the number of units, so the fan-out width does not exceed 256, which achieves lateral compression of the trie, and unit lookup combines sequential search, binary search, and direct positioning. The technical solution not only reduces the space occupied by trie nodes and units but also effectively reduces the number of memory accesses. The method supports efficient matching of both IPv4 and IPv6 addresses.
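The structure described above might be modelled as follows. This is a minimal sketch under stated assumptions: the names `Node`, `Cell`, and the Tail flag appear later in the description, but the concrete field names (`text`, `child`, `cells`, `tail`, `fan_out`) and the use of byte strings for addresses are hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cell:
    """A unit: stores a substring of a pattern instead of a single character."""
    text: bytes                      # one or more bytes of the pattern
    child: Optional["Node"] = None   # node reached after matching `text`


@dataclass
class Node:
    """A tree node: holds units whose first bytes are pairwise distinct."""
    cells: List[Cell] = field(default_factory=list)
    tail: bool = False               # True if a pattern ends at this node

    @property
    def fan_out(self) -> int:
        """Fan-out width: number of units under this node (bounded by the alphabet size)."""
        return len(self.cells)
```

Because the first bytes of the units under a node are distinct, the fan-out width of a byte-oriented node is bounded by the number of possible byte values.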
The invention is based on the trie structure. Taking into account the number of memory accesses and the space occupied by tree nodes and units, the trie is compressed in both the longitudinal and lateral directions, and the lateral unit lookup scheme is designed to improve the lookup efficiency of network IP addresses.
The purpose of the invention is to provide a common-prefix-based pattern matching algorithm that dynamically adds tree nodes and units. On the one hand, units store common substrings, which compresses the trie longitudinally and reduces the number of memory accesses during network IP address lookup. On the other hand, the first character of the substring stored in each unit bounds the fan-out width, which compresses the trie laterally; combined with several search methods, this gives fast unit lookup, and the overall reduction in tree size lowers the memory used by tree nodes and units.
The purpose of the invention is achieved through the following technical solution:
Pattern loading operation (step 110);
Network address lookup operation (step 120);
The pattern loading method of step 110 is as follows:
When a new IP address is added to the trie, the search starts from the root node and first checks whether the IP address already exists in the trie. If it exists, it is not added; otherwise the IP address is added to the trie, as shown in the flowchart in Fig. 1.
The IP address lookup method of step 120 is as follows:
For looking up a network IP address in the trie structure, the specific optimization of the invention lies in how units are looked up laterally within a trie node. During unit search, the fan-out width is used to select the lateral search method; that is, sequential search, binary search, and direct positioning are used in combination. Each node has a fan-out width attribute, and threshold boundaries are defined for the three lookup modes to decide which unit search method applies. To improve lateral matching efficiency, the ordering of the stored units is used to find the corresponding unit quickly and compare it.
The invention provides a pattern matching algorithm based on common prefixes. It is described in detail below with reference to the drawings.
The invention is implemented in the following steps:
1. Pattern loading operation (step 110);
2. Network address lookup operation (step 120);
Step 110 is implemented in the following steps:
(1) When a new IP address is added to the trie, the search starts from the root node and first checks whether the IP address already exists in the trie. If it exists, it is not added; otherwise the IP address is added to the trie, as shown in the flowchart in Fig. 1.
(2) During insertion, the first characters of the strings stored in the lateral units of a node are kept distinct at all times, which limits how far the tree can extend laterally; the maximum number of units is 255. To determine quickly whether a node has a unit that can match, space may be set aside in the node to store the first characters of the units under it, so that the position of a unit can be located quickly. If no corresponding unit exists, a new unit is added directly under the node. Otherwise the corresponding unit is found, the substring stored in it is compared with the substring to be inserted, their common prefix is determined, the substring stored in the unit is modified dynamically, and the subsequent nodes are added. This implements the idea of seeking common ground while preserving differences.
The dynamic change of tree nodes and units is shown in Fig. 2. Character strings are used there to show how the strings stored in each node and unit change; network addresses are not used, so that the change can be shown more intuitively. During pattern loading, suppose the tree node to operate on has been found and the string to be loaded is "texnex". Following the idea of seeking common ground while preserving differences, the unit Cell with the same first character is found under node Node, and the common prefix of the string to be loaded and the substring stored in Cell is determined. The substring stored in Cell is split into new node and unit information, as shown by the path split in Fig. 2. Finally, the suffix of the string to be inserted is placed in a new unit under the new node, which completes the dynamic change.
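The split on a common prefix can be sketched as below. This is an illustrative reading of the loading step, not the patented implementation: the function names, the dictionary keyed by first character, and the previously stored string "textbook" are assumptions made for the example (the content of Fig. 2 is not reproduced in the text); only the string "texnex" comes from the description.

```python
class Node:
    def __init__(self):
        self.cells = {}      # first character -> Cell (first characters are unique)
        self.tail = False    # True if a pattern ends exactly at this node


class Cell:
    def __init__(self, text, child):
        self.text = text     # substring stored in the unit
        self.child = child   # node below the unit


def common_prefix_len(a, b):
    """Length of the longest common prefix of a and b."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def insert(root, pattern):
    """Load a pattern; return False if it was already present."""
    node, rest = root, pattern
    while rest:
        cell = node.cells.get(rest[0])
        if cell is None:
            # No unit starts with this character: add the whole remainder as a new unit.
            leaf = Node()
            leaf.tail = True
            node.cells[rest[0]] = Cell(rest, leaf)
            return True
        k = common_prefix_len(cell.text, rest)
        if k < len(cell.text):
            # Split the unit: keep the common prefix, push the old suffix one level down.
            mid = Node()
            mid.cells[cell.text[k]] = Cell(cell.text[k:], cell.child)
            cell.text, cell.child = cell.text[:k], mid
        node, rest = cell.child, rest[k:]
    if node.tail:
        return False
    node.tail = True
    return True
```

With "textbook" loaded first, loading "texnex" shortens the existing unit to "tex" and creates a new node holding the two suffix units "tbook" and "nex".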
(3) The above describes pattern insertion. It follows that, relative to an ordinary trie, this storage method achieves some compression in the longitudinal direction. In the lateral direction, because first characters are guaranteed to be unique, the fan-out width is no different from that of an ordinary trie; compared with a trie whose stride is greater than 1, however, compression is also achieved laterally.
In the lateral direction, besides storage compression, the lookup itself can also be optimized. The experimental scheme used here combines sequential search, binary search, and direct positioning. Each node has a fan-out width attribute, and threshold boundaries are defined for the three lookup modes to decide which unit search method applies, so as to improve lateral matching efficiency. Fig. 3 gives the specific procedure for adding a tree unit.
In the unit adding procedure, when Code information is added under a node Node, the units are kept sorted as they are added, to make later unit lookup easier, and the fan-out width is updated at the same time for use in selecting the lookup method.
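A sorted unit list with an updated fan-out width might look like the following sketch. It assumes units are ordered by their first character and uses Python's standard `bisect` module to find the insertion point; the parallel `keys` and `cells` lists and the function name `add_cell` are hypothetical.

```python
import bisect


class Node:
    def __init__(self):
        self.keys = []       # first characters of the units, kept sorted
        self.cells = []      # unit substrings, parallel to `keys`
        self.fan_out = 0     # fan-out width: number of units under this node


def add_cell(node, text):
    """Insert a unit at its sorted position; return False if its first character is taken."""
    i = bisect.bisect_left(node.keys, text[0])
    if i < len(node.keys) and node.keys[i] == text[0]:
        return False         # first characters must stay unique under a node
    node.keys.insert(i, text[0])
    node.cells.insert(i, text)
    node.fan_out += 1        # kept current so a lookup can pick its search method
    return True
```

Keeping the units ordered at insertion time is what makes a binary search over them possible during lookup.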
Step 120 is implemented in the following steps:
(1) For looking up a network address in the trie structure, the specific optimization lies in the lookup of lateral units; Fig. 4 gives the unit lookup procedure. During unit search, the fan-out width is used to select the lateral search method, and the ordering of the stored units is used to find the corresponding unit quickly and compare it.
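Selecting the search method from the fan-out width could be sketched as follows. The description does not state the threshold values, so `SEQ_MAX` and `BIN_MAX` are hypothetical, as are the 256-entry table used for direct positioning and the representation of units as byte strings.

```python
import bisect

SEQ_MAX = 4     # hypothetical threshold: at or below this, scan sequentially
BIN_MAX = 64    # hypothetical threshold: at or below this, use binary search


class Node:
    def __init__(self, cells):
        # Units are byte strings, sorted by their (unique) first byte.
        self.cells = sorted(cells, key=lambda c: c[0])
        self.keys = [c[0] for c in self.cells]
        self.fan_out = len(self.cells)
        self.table = None
        if self.fan_out > BIN_MAX:
            # Wide node: index units directly by first byte.
            self.table = [None] * 256
            for c in self.cells:
                self.table[c[0]] = c


def find_cell(node, first_byte):
    """Return (unit or None, name of the search method that was used)."""
    if node.fan_out <= SEQ_MAX:
        for c in node.cells:
            if c[0] == first_byte:
                return c, "sequential"
        return None, "sequential"
    if node.fan_out <= BIN_MAX:
        i = bisect.bisect_left(node.keys, first_byte)
        if i < node.fan_out and node.keys[i] == first_byte:
            return node.cells[i], "binary"
        return None, "binary"
    return node.table[first_byte], "direct"
```

The trade-off behind such thresholds is that a direct table costs memory on every node, so it is only worth building where the fan-out is large.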
(2) Fig. 5 gives the complete flowchart of network address lookup. Starting from the Root node, the Cell unit corresponding to each substring is looked up in turn. Once a substring matches, Node is set to the child tree node of that Cell, and the Cell for the next substring is looked up in the same way until Size is exhausted. The Tail flag of the corresponding Node is then checked to see whether it is 1, which shows whether Text was matched exactly or only as a prefix.
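The overall lookup can be sketched as below. This is one possible reading of the flow: the three return values, the treatment of a text that ends inside a unit as a prefix match, and the hand-built example tree are assumptions for illustration; `Node`, `Cell` (here a tuple), and the Tail flag follow the names used in the description.

```python
class Node:
    def __init__(self, tail=False):
        self.cells = {}      # first character -> (substring, child Node)
        self.tail = tail     # corresponds to the Tail flag: a pattern ends here


def lookup(root, text):
    """Return 'exact', 'prefix', or 'none' for text against the loaded patterns."""
    node, pos, size = root, 0, len(text)
    while pos < size:
        entry = node.cells.get(text[pos])
        if entry is None:
            return "none"                      # no unit starts with this character
        sub, child = entry
        rest = text[pos:pos + len(sub)]
        if rest != sub:
            # Either the text ended inside the unit (prefix) or it diverged (no match).
            return "prefix" if sub.startswith(rest) else "none"
        node, pos = child, pos + len(sub)      # descend to the child node of the unit
    return "exact" if node.tail else "prefix"


# Hand-built example tree holding the patterns "textbook" and "texnex".
root = Node()
mid = Node()
root.cells["t"] = ("tex", mid)
mid.cells["t"] = ("tbook", Node(tail=True))
mid.cells["n"] = ("nex", Node(tail=True))
```

In this tree, matching "texnex" visits two units rather than six single-character levels.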
The method is implemented in C under Linux. The above study of matching methods built on the two data structures shows that each method has its own application environment, and that the efficiency of a method can differ considerably from one environment to another. The environment considered here is the filtering of a large set of sensitive addresses in a network. The choice of method must take the real-time requirement into account, so matching efficiency must be high; and because the sensitive information is large in scale, the algorithm must handle pattern sets on the order of millions.
Method implementation environment:
IPv4 address pattern sets of different sizes were generated randomly, and three methods were compared: the XOR-based hash method (XOR_Hash), the traditional Trie method, and D_Trie, the improved method that dynamically splits the Trie on common prefixes. Fig. 6 gives the matching efficiency curves of the three methods. The figure shows that, with a fixed hash table size, the probability of collisions rises as the pattern set grows, so the lookup time increases; for small pattern sets, however, the XOR hash method is quite efficient. Relative to the traditional Trie method, the performance of the improved D_Trie method is better. To show the performance gap between the two directly, Fig. 7 compares the matching performance of Trie and D_Trie only.
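The collision effect described for the hash baseline can be illustrated with a simple XOR-folding hash. The patent does not give the definition of XOR_Hash, so the fold of a 32-bit IPv4 address into 16 bits below is purely an assumption, as are the chained buckets and the class name `XorHashTable`.

```python
import ipaddress


def xor_hash(addr: str) -> int:
    """Fold a 32-bit IPv4 address into 16 bits by XOR-ing its two halves (assumed scheme)."""
    v = int(ipaddress.IPv4Address(addr))
    return (v >> 16) ^ (v & 0xFFFF)


class XorHashTable:
    """Fixed range of 2**16 possible buckets; collisions are resolved by chaining."""

    def __init__(self):
        self.buckets = {}

    def add(self, addr: str) -> None:
        chain = self.buckets.setdefault(xor_hash(addr), [])
        if addr not in chain:
            chain.append(addr)

    def contains(self, addr: str):
        """Return (found, number_of_comparisons_in_the_chain)."""
        chain = self.buckets.get(xor_hash(addr), [])
        for i, stored in enumerate(chain, 1):
            if stored == addr:
                return True, i
        return False, len(chain)
```

With a fixed number of buckets, the chains can only get longer as more addresses are loaded, so the number of comparisons per lookup grows with the pattern set.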
As Fig. 7 shows, the efficiency of the Trie method changes as the pattern set grows, with the lookup time becoming longer and longer, whereas the lookup time of the improved D_Trie method fluctuates little as the pattern set grows; its average lookup time stays at about 0.9 µs.
In developing the method, one consideration is that the improved method must support storage and lookup of large-scale IPv6 address sets. IPv6 address sets of given sizes were therefore generated randomly, and the advantage of the improved method is illustrated from two angles: memory consumption and matching efficiency.
IPv6 address pattern sets of different sizes were generated randomly. Fig. 8 compares the methods on these IPv6 sets in terms of memory usage: when the pattern set reaches the order of ten million, the Trie method can no longer load the patterns, and at the order of millions it consumes about 3 times the memory of the improved method. Fig. 9 compares the matching performance of the two methods; the matching efficiency does not fluctuate much as the pattern set size changes and stays at about 1.1 µs.

Claims (2)

1. A pattern matching method, based on common prefixes, that dynamically adds tree nodes and units, characterized in that:
(1) Pattern loading operation: when a new IP address is added to the trie, the search starts from the root node and first checks whether the IP address already exists in the trie. If it exists, it is not added; otherwise the IP address is added to the trie;
(2) Network address lookup operation:
When a network IP address is looked up in the trie structure, the fan-out width is used during unit search to select the lateral search method; that is, sequential search, binary search, and direct positioning are used in combination. Each node has a fan-out width attribute, and threshold boundaries are defined for the three lookup modes to decide which unit search method applies.
2. A pattern matching method, based on common prefixes, that dynamically adds tree nodes and units, characterized in that: when a network address is looked up in the trie, the path information is searched starting from the root node, in line with the pattern loading process; the unit search combines sequential search, binary search, and direct positioning, and the fan-out width of the node determines which search method is selected.
CN201610161030.XA 2016-03-21 2016-03-21 Mode matching method for dynamically adding tree node and unit based on common prefix Pending CN105871726A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610161030.XA CN105871726A (en) 2016-03-21 2016-03-21 Mode matching method for dynamically adding tree node and unit based on common prefix

Publications (1)

Publication Number Publication Date
CN105871726A (en) 2016-08-17

Family

ID=56625611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610161030.XA Pending CN105871726A (en) 2016-03-21 2016-03-21 Mode matching method for dynamically adding tree node and unit based on common prefix

Country Status (1)

Country Link
CN (1) CN105871726A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109905413A (en) * 2019-04-30 2019-06-18 新华三信息安全技术有限公司 A kind of matching process and device of IP address

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040236720A1 (en) * 2000-04-06 2004-11-25 International Business Machines Corporation Longest prefix match lookup using hash function
CN101055574A (en) * 2006-04-13 2007-10-17 华为技术有限公司 Domain name information storage and inquiring method and system
CN102404225A (en) * 2011-11-30 2012-04-04 华南理工大学 Method for rapid enqueue of packet for differential queue service system
CN105260354A (en) * 2015-08-20 2016-01-20 及时标讯网络信息技术(北京)有限公司 Chinese AC (Aho-Corasick) automaton working method based on keyword dictionary tree structure

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
黄健青等: "Web日志分析中数据预处理的设计与实现", 《河南科技大学学报》 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160817