CN110324204B - High-speed regular expression matching engine and method implemented in FPGA (field programmable Gate array) - Google Patents

High-speed regular expression matching engine and method implemented in FPGA (field programmable Gate array)

Info

Publication number
CN110324204B
CN110324204B (application CN201910583091.9A)
Authority
CN
China
Prior art keywords
matching
module
parallel
data
parallel matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910583091.9A
Other languages
Chinese (zh)
Other versions
CN110324204A (en)
Inventor
孙明乾
乔庐峰
陈庆华
陆雅丽
李礼思洋
赵彤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Army Engineering University of PLA
Original Assignee
Army Engineering University of PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Army Engineering University of PLA filed Critical Army Engineering University of PLA
Priority to CN201910583091.9A priority Critical patent/CN110324204B/en
Publication of CN110324204A publication Critical patent/CN110324204A/en
Application granted granted Critical
Publication of CN110324204B publication Critical patent/CN110324204B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/028 Capturing of monitoring data by filtering
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Logic Circuits (AREA)

Abstract

The invention discloses a high-speed regular expression matching engine and a high-speed regular expression matching method implemented in an FPGA (field programmable gate array). The matching engine comprises a DFA (deterministic finite automaton) table entry distribution module, a data packet preprocessing module, a FIFO (first in first out) module, a parallel matching module, a memory module and a controller, wherein the DFA table entry distribution module is located in the software layer, and the data packet preprocessing module, the FIFO module, the parallel matching module, the memory module and the controller are located in the hardware layer. In the method, the matching process for a whole data packet is divided into a parallel matching part and a serial matching part, forming a combined serial-parallel matching mode: parallel matching is adopted for most field positions in the data packet, and serial matching is adopted for the small remaining part. Meanwhile, the DFA table entries are stored in a three-level storage structure consisting of registers, on-chip RAM and off-chip DDR3. The invention improves the matching speed of data packets and reduces resource consumption inside the FPGA chip.

Description

High-speed regular expression matching engine and method implemented in FPGA (field programmable Gate array)
Technical Field
The invention relates to the technical field of electronic circuits, and in particular to a high-speed regular expression matching engine and method implemented in an FPGA (field programmable gate array).
Background
High-performance deep packet inspection (DPI) systems require a regular expression matching engine to perform packet inspection. Before deep packet inspection can be performed with a regular expression, the expression must be converted into a deterministic finite automaton (DFA) to generate a DFA state transition table, and the table is then written into the storage space inside the matching engine; the matching engine completes the inspection of the whole data packet using the DFA state information stored inside it.
Traditional regular-expression-based packet inspection is a serial, iterative matching process: the next-hop state can only be computed once the current state and the current character are both known, completing one hop at a time. The advantage of this approach is its simplicity; the matching of a whole data packet can be completed merely by controlling the read operations on the DFA table entries. However, it processes only one character per time unit, which limits the achievable matching speed. Researchers have therefore proposed the multi-step DFA, whose basic idea is to convert the serial matching process into a parallel one: the original DFA is iterated N times to generate an N-step DFA that processes N characters in parallel within one time unit. The storage consumption of the N-step DFA is N times that of the original DFA; when the algorithm is implemented on a hardware platform such as an FPGA, this storage consumption can quickly exhaust the on-chip resources, and if a large-capacity off-chip memory such as DDR is used to store the N-step DFA table entries, the longer access latency of the off-chip memory limits the throughput of the whole system.
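For reference, the conventional serial matching process described above reduces to one table lookup per input character. The following minimal C sketch models that behavior; the table layout, data widths and function names are illustrative assumptions rather than part of the patent.

```c
#include <stdint.h>
#include <stddef.h>

/* Serial DFA matching: one character consumed per step.
 * next_state[s][c] holds the next-hop state for state s on input byte c;
 * accepting[s] flags accepting (rule-hit) states. */
static int dfa_match_serial(const uint16_t next_state[][256],
                            const uint8_t *accepting,
                            const uint8_t *pkt, size_t len)
{
    uint16_t s = 0;                     /* initial state */
    for (size_t i = 0; i < len; i++) {
        s = next_state[s][pkt[i]];      /* one table-entry read per character */
        if (accepting[s])
            return 1;                   /* a regular expression matched */
    }
    return 0;                           /* no match in this packet */
}
```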
Disclosure of Invention
The invention aims to provide a high-speed regular expression matching engine and method, implemented in an FPGA (field programmable gate array), that offer high packet matching speed, low resource consumption inside the FPGA chip, reliability, high efficiency and ease of expansion.
The technical solution of the invention is as follows: a high-speed regular expression matching engine implemented in an FPGA comprises a DFA (deterministic finite automaton) table entry distribution module, a data packet preprocessing module, a FIFO (first in first out) module, a parallel matching module, a memory module and a controller, wherein the DFA table entry distribution module is located in the software layer, and the data packet preprocessing module, the FIFO module, the parallel matching module, the memory module and the controller are located in the hardware layer;
the DFA table entry distribution module stores different table entries into different memories according to the access frequency of each entry, the memories comprising on-chip registers, on-chip RAM and off-chip DDR3, as sketched after this module description;
the data packet preprocessing module divides the data packets input to the system into groups of N×8 bits, where N is the number of characters that can subsequently be matched in parallel, and stores the groups into the FIFO module;
the FIFO module is used for caching the divided data packets for the controller to read;
the parallel matching module receives a parallel matching request from the controller and matches the N characters in parallel according to the state table entry information stored in the register file; the circuit is implemented with combinational logic, and the generated parallel matching result is sent to the controller for further analysis;
the memory module is used for storing DFA table items;
and the controller is used for controlling the operation of the whole matching engine.
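The table-entry distribution mentioned above can be pictured as a simple tiering decision made by the software-layer module. The following C sketch is only an illustration: the tier thresholds, the profiling of access frequency and the function names are assumptions, not values or interfaces defined by the patent.

```c
#include <stdint.h>

/* The three storage tiers named in the description. */
enum dfa_tier { TIER_ONCHIP_REGISTER, TIER_ONCHIP_RAM, TIER_OFFCHIP_DDR3 };

/* Assign a DFA state's table entries to a tier by estimated access frequency
 * (e.g. profiled over representative traffic).  Thresholds are illustrative. */
static enum dfa_tier pick_tier(uint64_t access_count, uint64_t total_accesses)
{
    double share = (double)access_count / (double)total_accesses;
    if (share > 0.05)             /* hottest states, such as the key state KS */
        return TIER_ONCHIP_REGISTER;
    if (share > 0.001)            /* warm states */
        return TIER_ONCHIP_RAM;
    return TIER_OFFCHIP_DDR3;     /* rarely visited states */
}
```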
Furthermore, the DFA table entry distribution module is located in the software layer, and the DFA state table entries are written into memory in sequence under the control of the CPU.
Furthermore, the on-chip RAM included in the FIFO module and the memory module is designed and implemented using an IP core provided in the FPGA.
A high-speed regular expression matching method implemented in an FPGA comprises the following steps:
step 1, reading the data to be detected from the FIFO, and determining the operation to be taken on the data according to the current state recorded in an internal register;
step 2, if the current state meets the parallel matching requirement, sending the N×8-bit data to the combinational logic circuit corresponding to the parallel matching module, performing serial matching on the characters for which parallel matching fails, and finally updating the internal register and reading the next group of data;
step 3, if the current state does not meet the parallel matching requirement, performing serial matching character by character starting from the first character, which corresponds to the lowest byte of the data, judging after each serial match whether the parallel matching requirement is met, and if it is met, sending the remaining data to the parallel matching module for parallel matching; otherwise, continuing the serial matching.
Further, the remaining data in step 3 is sent to a parallel matching module for parallel matching, which specifically includes:
step 3.1, extracting the information of a key state KS from the register file, the key state KS having two characteristics: first, the most frequent default transition of KS is to KS itself; second, the default transitions of a large number of states near KS also point to KS;
step 3.2, sending the n characters input in parallel to n decoders operating in parallel for character analysis, and querying the KS register file (which stores the next-hop states of all transitions out of KS, addressed by character) according to the analysis results, so as to obtain n query results; each result corresponds to one of the n characters to be matched and represents the next-hop state reached when that character is input while the system is in the key state;
step 3.3, sending the n results generated in step 3.2 to comparators for equality comparison with the state number of KS, and concatenating the n comparison results into n bits of data;
and step 3.4, performing secondary matching on the n-bit result of step 3.3 to generate a 64-bit parallel matching result, which is sent to the controller for further analysis.
Compared with the prior art, the invention has the following remarkable advantages: (1) the storage locations of the DFA table entries are divided reasonably according to access frequency, which effectively reduces on-chip storage consumption; (2) parallel matching is adopted for certain field positions during packet inspection, which improves the throughput of the whole system; (3) the designed regular expression matching engine is general-purpose and can be embedded as a lightweight module in a high-performance router or other network equipment, giving the equipment a basic security protection capability based on application-layer filtering; (4) the DFA table entries do not undergo secondary conversion, and the parallel matching module incurs no extra storage consumption; the whole matching engine consumes few hardware logic resources and can be instantiated multiple times in the FPGA, so that performance and throughput can be multiplied.
Drawings
FIG. 1 is a block diagram of a circuit structure of a high-speed regular expression matching engine implemented in an FPGA according to the present invention.
FIG. 2 is a flow chart of the parallel matching module executing the parallel matching operation according to the present invention.
Detailed Description
The invention will be further explained with reference to the drawings.
With reference to FIG. 1, the high-speed regular expression matching engine implemented in an FPGA of the present invention comprises a DFA table entry distribution module, a data packet preprocessing module, a FIFO module, a parallel matching module, a memory module and a controller, wherein the DFA table entry distribution module is located in the software layer, and the data packet preprocessing module, the FIFO module, the parallel matching module, the memory module and the controller are located in the hardware layer;
the DFA table entry distribution module stores different table entries into different memories according to the access frequency of each entry, the memories comprising on-chip registers, on-chip RAM and off-chip DDR3;
the data packet preprocessing module divides the data packets input to the system into groups of N×8 bits, where N is the number of characters that can subsequently be matched in parallel, and stores the groups into the FIFO module;
the FIFO module is used for caching the divided data packets for the controller to read;
the parallel matching module receives a parallel matching request from the controller and matches the N characters in parallel according to the state table entry information stored in the register file; the circuit is implemented with combinational logic, and the generated parallel matching result is sent to the controller for further analysis;
the memory module is used for storing DFA table items;
and the controller is used for controlling the operation of the whole matching engine.
A high-speed regular expression matching method implemented in an FPGA comprises the following steps:
step 1, reading the data to be detected from the FIFO, and determining the operation to be taken on the data according to the current state recorded in an internal register;
step 2, if the current state meets the parallel matching requirement, sending the N×8-bit data to the combinational logic circuit corresponding to the parallel matching module, performing serial matching on the characters for which parallel matching fails, and finally updating the internal register and reading the next group of data;
step 3, if the current state does not meet the parallel matching requirement, performing serial matching character by character starting from the first character, which corresponds to the lowest byte of the data, judging after each serial match whether the parallel matching requirement is met, and if it is met, sending the remaining data to the parallel matching module for parallel matching; otherwise, continuing the serial matching. A behavioral sketch of this serial-parallel control flow is given below.
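The following C sketch models the serial-parallel decision just described. It is a simplified illustration under stated assumptions: the value of N, and the interfaces in_key_state(), parallel_match() and serial_step(), are hypothetical stand-ins for the hardware blocks, and the treatment of characters whose parallel match fails is simplified relative to the controller's full analysis.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

#define N 8   /* number of characters matched in parallel per group (assumed) */

/* Hypothetical stand-ins for the hardware blocks described above:
 *  - in_key_state(): does the current state satisfy the parallel-matching requirement?
 *  - parallel_match(): combinational parallel match; bit i of the result is 1
 *    if character i keeps the automaton in (or returns it to) the key state.
 *  - serial_step(): one conventional DFA transition on a single character. */
extern bool     in_key_state(uint16_t state);
extern uint32_t parallel_match(const uint8_t *chars, size_t count);
extern uint16_t serial_step(uint16_t state, uint8_t ch);

/* Process one N*8-bit group read from the FIFO and return the updated state. */
static uint16_t process_group(uint16_t state, const uint8_t group[N])
{
    size_t i = 0;
    while (i < N) {
        if (in_key_state(state)) {
            /* Parallel path: match all remaining characters at once, then
             * serially re-match only the characters whose parallel bit is 0. */
            uint32_t hit = parallel_match(&group[i], N - i);
            for (size_t j = i; j < N; j++)
                if (!(hit & (1u << (j - i))))
                    state = serial_step(state, group[j]);
            return state;
        }
        /* Serial path: consume one character, then re-check the condition. */
        state = serial_step(state, group[i++]);
    }
    return state;
}
```

In the actual engine the parallel result is returned to the controller for further analysis; the loop above only illustrates the serial-parallel hand-off.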
Further, the remaining data in step 3 is sent to a parallel matching module for parallel matching, which specifically includes:
the parallel matching module is implemented based on combinational logic, as shown in fig. 2, specifically as follows:
step 3.1, extracting the information of a key state KS from the register file, the key state KS having two characteristics: first, the most frequent default transition of KS is to KS itself; second, the default transitions of a large number of states near KS also point to KS;
step 3.2, sending the n characters input in parallel to n decoders operating in parallel for character analysis, and querying the KS register file (which stores the next-hop states of all transitions out of KS, addressed by character) according to the analysis results, so as to obtain n query results; each result corresponds to one of the n characters to be matched and represents the next-hop state reached when that character is input while the system is in the key state;
step 3.3, sending the n results generated in step 3.2 to comparators for equality comparison with the state number of KS; since most transitions lead back to KS, each equality comparator outputs 1 with high probability, meaning that the query result equals the state number of KS; the n comparison results are then concatenated into n bits of data;
step 3.4, performing secondary matching on the n-bit result of step 3.3; its aim is to determine whether the system, having left the key state by reading a certain character, can jump back to the key state on the next character; secondary matching corrects some 0 bits to 1, meaning that when the system reads the character corresponding to such a bit, the state jumps to a state directly connected to KS, and after the next character (the adjacent higher-order character) is read in, the state jumps back to KS; secondary matching effectively reduces the workload of the subsequent serial matching; after secondary matching, a 64-bit parallel matching result is generated and sent to the controller for further analysis, as sketched below.
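The datapath of steps 3.1 to 3.4 can be summarized in the following behavioral C sketch. The lane count N_PAR, the layout of the KS register file (ks_next) and the helper returns_to_ks() used for secondary matching are assumptions introduced for illustration only.

```c
#include <stdint.h>
#include <stdbool.h>

#define N_PAR 8             /* number of parallel lanes n (assumed value) */

extern uint16_t ks_next[256];   /* KS register file: next-hop state reached from
                                 * KS on each possible input byte (step 3.2) */
extern uint16_t KS;             /* state number of the key state */

/* Assumed helper modeling the extra query used by secondary matching:
 * does reading ch from via_state (a state directly connected to KS)
 * bring the automaton back to KS? */
extern bool returns_to_ks(uint16_t via_state, uint8_t ch);

/* Returns an N_PAR-bit mask in which bit i = 1 means that character chars[i],
 * read while the automaton is in KS, leaves it in (or returns it to) KS. */
static uint32_t ks_parallel_match(const uint8_t chars[N_PAR])
{
    uint32_t mask = 0;

    /* Steps 3.2-3.3: per-lane lookup in the KS register file followed by an
     * equality comparison against the state number of KS. */
    for (int i = 0; i < N_PAR; i++)
        if (ks_next[chars[i]] == KS)
            mask |= 1u << i;

    /* Step 3.4 (secondary matching): a 0 bit is corrected to 1 when its
     * character jumps to a state directly connected to KS and the adjacent
     * higher-order character jumps back to KS. */
    for (int i = 0; i + 1 < N_PAR; i++)
        if (!(mask & (1u << i)) &&
            returns_to_ks(ks_next[chars[i]], chars[i + 1]))
            mask |= 1u << i;

    return mask;
}
```

The mask produced here corresponds to the n-bit (64-bit in the described configuration) parallel matching result that the controller analyzes further.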
In conclusion, the storage locations of the DFA table entries are divided reasonably according to access frequency, which effectively reduces on-chip storage consumption; parallel matching is adopted for certain field positions during packet inspection, which improves the throughput of the whole system; the designed regular expression matching engine is general-purpose and can be embedded as a lightweight module in a high-performance router or other network equipment, giving the equipment a basic security protection capability based on application-layer filtering; the DFA table entries do not undergo secondary conversion, and the parallel matching module incurs no extra storage consumption; the whole matching engine consumes few hardware logic resources and can be instantiated multiple times in the FPGA, so that performance and throughput can be multiplied.

Claims (5)

1. A high-speed regular expression matching engine implemented in an FPGA, characterized by comprising a DFA (deterministic finite automaton) table entry distribution module, a data packet preprocessing module, a FIFO (first in first out) module, a parallel matching module, a memory module and a controller, wherein the DFA table entry distribution module is located in the software layer, and the data packet preprocessing module, the FIFO module, the parallel matching module, the memory module and the controller are located in the hardware layer;
the DFA table entry distribution module stores different table entries into different memories according to the access frequency of each entry, the memories comprising on-chip registers, on-chip RAM and off-chip DDR3;
the data packet preprocessing module divides the data packets input to the system into groups of N×8 bits, where N is the number of characters that can subsequently be matched in parallel, and stores the groups into the FIFO module;
the FIFO module is used for caching the divided data packets for the controller to read;
the parallel matching module receives a parallel matching request from the controller and matches the N characters in parallel according to the state table entry information stored in the register file; the circuit is implemented with combinational logic, and the generated parallel matching result is sent to the controller for further analysis;
the memory module is used for storing DFA table items;
the controller is used for controlling the operation of the whole matching engine, and specifically comprises the following steps:
reading the data to be detected from the FIFO module, and determining the operation to be taken on the data according to the current state recorded in the internal register;
if the current state meets the parallel matching requirement, sending the N×8-bit data to the combinational logic circuit corresponding to the parallel matching module, performing serial matching on the characters for which parallel matching fails, and finally updating the internal register and reading the next group of data;
if the current state does not meet the parallel matching requirement, performing serial matching character by character starting from the first character, which corresponds to the lowest byte of the data, judging after each serial match whether the parallel matching requirement is met, and if it is met, sending the remaining data to the parallel matching module for parallel matching; otherwise, continuing the serial matching.
2. The high-speed regular expression matching engine implemented in FPGA of claim 1, wherein said DFA table entry distribution module is located in software layer, and DFA state table entries are written into memory in sequence under control of CPU.
3. The high-speed regular expression matching engine implemented in an FPGA of claim 1, wherein on-chip RAMs included in said FIFO module and said memory module are designed and implemented using an IP core built into the FPGA.
4. A high-speed regular expression matching method implemented in an FPGA, characterized in that, based on the high-speed regular expression matching engine implemented in an FPGA according to any one of claims 1 to 3, the method comprises the following steps:
step 1, reading the data to be detected from the FIFO, and determining the operation to be taken on the data according to the current state recorded in an internal register;
step 2, if the current state meets the parallel matching requirement, sending the N×8-bit data to the combinational logic circuit corresponding to the parallel matching module, performing serial matching on the characters for which parallel matching fails, and finally updating the internal register and reading the next group of data;
step 3, if the current state does not meet the parallel matching requirement, performing serial matching character by character starting from the first character, which corresponds to the lowest byte of the data, judging after each serial match whether the parallel matching requirement is met, and if it is met, sending the remaining data to the parallel matching module for parallel matching; otherwise, continuing the serial matching.
5. The method for matching high-speed regular expressions implemented in an FPGA according to claim 4, wherein the remaining data in step 3 is sent to a parallel matching module for parallel matching, specifically as follows:
step 3.1, extracting the information of a key state KS from the register file, the key state KS having two characteristics: first, the most frequent default transition of KS is to KS itself; second, the default transitions of a large number of states near KS also point to KS;
step 3.2, sending the n characters input in parallel to n decoders operating in parallel for character analysis, and querying the KS register file (which stores the next-hop states of all transitions out of KS, addressed by character) according to the analysis results, so as to obtain n query results; each result corresponds to one of the n characters to be matched and represents the next-hop state reached when that character is input while the system is in the key state;
step 3.3, sending the n results generated in step 3.2 to comparators for equality comparison with the state number of KS, and concatenating the n comparison results into n bits of data;
and step 3.4, performing secondary matching on the n-bit result of step 3.3 to generate a 64-bit parallel matching result, which is sent to the controller for further analysis.
CN201910583091.9A 2019-07-01 2019-07-01 High-speed regular expression matching engine and method implemented in FPGA (field programmable Gate array) Active CN110324204B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910583091.9A CN110324204B (en) 2019-07-01 2019-07-01 High-speed regular expression matching engine and method implemented in FPGA (field programmable Gate array)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910583091.9A CN110324204B (en) 2019-07-01 2019-07-01 High-speed regular expression matching engine and method implemented in FPGA (field programmable Gate array)

Publications (2)

Publication Number Publication Date
CN110324204A CN110324204A (en) 2019-10-11
CN110324204B true CN110324204B (en) 2020-09-11

Family

ID=68121671

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910583091.9A Active CN110324204B (en) 2019-07-01 2019-07-01 High-speed regular expression matching engine and method implemented in FPGA (field programmable Gate array)

Country Status (1)

Country Link
CN (1) CN110324204B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010749A (en) * 2019-12-19 2021-06-22 上海复旦微电子集团股份有限公司 Regular expression matching system
CN111130946B (en) * 2019-12-30 2022-03-25 联想(北京)有限公司 Acceleration method and device for deep packet identification and storage medium
CN113703715B (en) * 2021-08-31 2024-02-23 深信服科技股份有限公司 Regular expression matching method and device, FPGA and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101841546A (en) * 2010-05-17 2010-09-22 华为技术有限公司 Rule matching method, device and system
CN103312627A (en) * 2013-05-30 2013-09-18 中国人民解放军国防科学技术大学 Regular expression matching method based on two-level storage
CN107193776A (en) * 2017-05-24 2017-09-22 南京大学 A kind of new transfer algorithm for matching regular expressions

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7702629B2 (en) * 2005-12-02 2010-04-20 Exegy Incorporated Method and device for high performance regular expression pattern matching
CN102521356B (en) * 2011-12-13 2015-04-01 曙光信息产业(北京)有限公司 Regular expression matching equipment and method on basis of deterministic finite automaton
CN106776456B (en) * 2017-01-18 2019-06-18 中国人民解放军国防科学技术大学 High speed regular expression matching hybrid system and method based on FPGA+NPU

Also Published As

Publication number Publication date
CN110324204A (en) 2019-10-11


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant