CN113703715B - Regular expression matching method and device, FPGA and medium - Google Patents

Regular expression matching method and device, FPGA and medium

Info

Publication number
CN113703715B
Authority
CN
China
Prior art keywords
regular expression
matching
matched
character strings
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111017189.1A
Other languages
Chinese (zh)
Other versions
CN113703715A (en)
Inventor
李建权
文曦畅
徐敬蘅
闫凡
郜振锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sangfor Technologies Co Ltd filed Critical Sangfor Technologies Co Ltd
Priority to CN202111017189.1A priority Critical patent/CN113703715B/en
Publication of CN113703715A publication Critical patent/CN113703715A/en
Application granted granted Critical
Publication of CN113703715B publication Critical patent/CN113703715B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F5/00Methods or arrangements for data conversion without changing the order or content of the data handled
    • G06F5/06Methods or arrangements for data conversion without changing the order or content of the data handled for changing the speed of data flow, i.e. speed regularising or timing, e.g. delay lines, FIFO buffers; over- or underrun control therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/42Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4282Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • G06F16/90344Query processing by using string matching techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0002Serial port, e.g. RS232C
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The application discloses a regular expression matching method, a device, an FPGA and a medium. The method comprises: receiving a character string to be matched sent by computer equipment connected with the FPGA; matching the character string in parallel against each regular expression in a regular expression matching module that is in an idle state to obtain a matching result; and sending the matching result to the computer equipment. Regular expression matching that was originally executed by the CPU is thereby offloaded to the FPGA, whose high degree of parallelism is used to match the character string against all regular expressions at once. Compared with the prior art, in which the CPU can only match the regular expressions serially, one by one, this reduces matching delay, improves matching efficiency, and improves the performance of the whole system.

Description

Regular expression matching method and device, FPGA and medium
Technical Field
The application relates to the technical field of computers, in particular to a regular expression matching method, a device, an FPGA and a medium.
Background
In actual computer business processing, many services require regular expression matching. For example, rule-based services such as intrusion prevention systems and deep packet inspection contain a large number of regular expressions that are used to judge whether the content of message data satisfies a specific rule, and the message data is then processed according to the rule matching result. A CPU (central processing unit), as a general-purpose processor, has to match the regular expressions one by one, which results in long matching delay and low matching efficiency. Related data show that rule-based services such as intrusion prevention systems and deep packet inspection consume more than 30% of CPU computing resources and have become a bottleneck for improving system performance.
Disclosure of Invention
In view of this, the purpose of the present application is to provide a regular expression matching method, device, FPGA, and medium, which can reduce matching delay, improve matching efficiency, and help to improve system performance. The specific scheme is as follows:
in a first aspect, the present application discloses a regular expression matching method, applied to an FPGA, including:
receiving a character string to be matched sent by computer equipment connected with the FPGA;
carrying out parallel matching on the character strings to be matched through each regular expression in the regular expression matching module in an idle state to obtain a matching result;
and sending the matching result to the computer equipment.
Optionally, the receiving the character string to be matched sent by the computer device connected with the FPGA includes:
and receiving a character string to be matched sent by computer equipment connected with the FPGA by utilizing an AXI-Stream protocol.
Optionally, before the matching the character strings to be matched in parallel by each regular expression in the regular expression matching module in the idle state, the method further includes:
determining a regular expression matching module which is in an idle state and has the smallest module identifier as a target regular expression matching module;
transmitting the character strings to be matched to the target regular expression matching module;
correspondingly, the parallel matching of the character strings to be matched through each regular expression in the regular expression matching module in the idle state comprises the following steps:
and carrying out parallel matching on the character strings to be matched through each regular expression in the target regular expression matching module.
Optionally, the sending the matching result to the computer device includes:
when the regular expression matching module is in a calculation completion state, reading the matching result from the regular expression matching module;
and sending the matching result to the computer equipment.
Optionally, the performing parallel matching on the character strings to be matched through each regular expression in the regular expression matching module in the idle state to obtain a matching result includes:
preprocessing the character strings to be matched through the regular expression matching module;
and carrying out parallel matching on the preprocessed character strings to be matched by the regular expression matching module to obtain a matching result.
Optionally, the preprocessing the character string to be matched through the regular expression matching module includes:
caching the character strings to be matched through a FIFO (first in first out) buffer in the regular expression matching module;
performing data bit width conversion on the character strings to be matched through a shift register in the regular expression matching module;
and copying the character strings to be matched after the data bit width conversion through a plurality of clock registers in the regular expression matching module.
Optionally, the performing parallel matching on the character strings to be matched after preprocessing by the regular expression matching module to obtain a matching result includes:
carrying out parallel matching on the copied, bit-width-converted character strings to be matched through each regular expression in the regular expression matching module, so as to obtain a regular expression output result;
and carrying out data bit width conversion on the regular expression output result to obtain the matching result.
In a second aspect, the present application discloses a regular expression matching apparatus, applied to an FPGA, including:
the data distribution module is used for receiving character strings to be matched sent by computer equipment connected with the FPGA;
the regular expression matching module is used for carrying out parallel matching on the character strings to be matched through each regular expression to obtain a matching result;
and the data selection module is used for sending the matching result to the computer equipment.
In a third aspect, the present application discloses an FPGA comprising:
a storage unit and a processing unit;
wherein the storage unit is used for storing a computer program;
the processing unit is configured to execute the computer program to implement the foregoing disclosed regular expression matching method.
In a fourth aspect, the present application discloses a computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the foregoing disclosed regular expression matching method.
As can be seen, the regular expression matching method of the present application is applied to an FPGA. The FPGA first receives a character string to be matched sent by a computer device connected to it, then matches the character string in parallel against each regular expression in a regular expression matching module that is in the idle state to obtain a matching result, and finally sends the matching result to the computer device. Regular expression matching that was originally executed by the CPU is thereby offloaded to the FPGA, whose high degree of parallelism is used to match the character string against all regular expressions at once. Compared with the prior art, in which the CPU matches the regular expressions serially, one by one, this reduces matching delay, improves matching efficiency, and improves the performance of the whole system.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings may be obtained according to the provided drawings without inventive effort to a person skilled in the art.
FIG. 1 is a flow chart of a regular expression matching method disclosed in the present application;
FIG. 2 is a flowchart of a specific regular expression matching method disclosed herein;
FIG. 3 is a diagram of a regular expression matching overall architecture disclosed herein;
FIG. 4 is a block diagram of a regular expression matching module disclosed herein;
FIG. 5 is a schematic structural diagram of a regular expression matching device disclosed in the present application;
FIG. 6 is a schematic structural diagram of an FPGA disclosed in the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
Currently, in actual computer business processing, many services require regular expression matching. For example, rule-based services such as intrusion prevention systems and deep packet inspection contain a large number of regular expressions that are used to judge whether the content of message data satisfies a specific rule, and the message data is then processed according to the rule matching result. A CPU, as a general-purpose processor, has to match the regular expressions one by one, which results in long matching delay and low matching efficiency. Related data show that rule-based services such as intrusion prevention systems and deep packet inspection consume more than 30% of CPU computing resources and have become a bottleneck for improving system performance. In view of this, the present application provides a regular expression matching method that can reduce matching delay, improve matching efficiency, and help improve system performance.
Referring to fig. 1, an embodiment of the present application discloses a regular expression matching method, applied to an FPGA, including:
Step S11: receiving a character string to be matched sent by computer equipment connected with the FPGA.
In an actual implementation, the FPGA (Field Programmable Gate Array) is connected with a computer device so as to receive matching instructions and the like issued by the computer device. The specific connection mode between the FPGA and the computer device may be determined according to the actual situation and is not limited herein.
When regular expression matching is required, the computer equipment issues corresponding character strings to be matched to the FPGA, so that the FPGA is required to receive the character strings to be matched.
Step S12: and carrying out parallel matching on the character strings to be matched through each regular expression in the regular expression matching module in the idle state to obtain a matching result.
It can be understood that after the character strings to be matched are obtained, the character strings to be matched need to be matched in parallel through each regular expression in the regular expression matching module in an idle state, so that a matching result is obtained.
That is, a plurality of regular expression matching modules may be provided on the FPGA, each containing preset regular expressions. Different modules can therefore process different character strings to be matched in parallel, and within one module the character string to be matched is matched against all of that module's regular expressions in parallel. After the character string to be matched is acquired, it is matched in parallel against each regular expression in a regular expression matching module that is in the idle state.
Step S13: and sending the matching result to the computer equipment.
It can be understood that after the character strings to be matched are matched in parallel by each regular expression in the regular expression matching module in the idle state, a corresponding matching result is obtained, so that the matching result needs to be sent to the computer equipment, so that the computer equipment performs corresponding operations such as alarming, neglecting and the like according to the matching result.
As can be seen, the regular expression matching method of the present application is applied to an FPGA. The FPGA first receives a character string to be matched sent by a computer device connected to it, then matches the character string in parallel against each regular expression in a regular expression matching module that is in the idle state to obtain a matching result, and finally sends the matching result to the computer device. Regular expression matching that was originally executed by the CPU is thereby offloaded to the FPGA, whose high degree of parallelism is used to match the character string against all regular expressions at once. Compared with the prior art, in which the CPU matches the regular expressions serially, one by one, this reduces matching delay, improves matching efficiency, and improves the performance of the whole system.
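The three steps S11 to S13 can be modelled in software as a rough sketch. The rule set `RULES` and the function name `match_all` are hypothetical and do not come from the patent. Python threads only mirror the structure of evaluating every expression concurrently; they do not reproduce the hardware parallelism of an FPGA.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rule set; the patent does not list concrete expressions.
RULES = [rb"GET /admin", rb"\.\./", rb"select\s.+\sfrom"]
COMPILED = [re.compile(rule, re.IGNORECASE) for rule in RULES]


def match_all(payload: bytes) -> list[int]:
    """Return one 0/1 flag per rule; every rule sees the same payload."""
    with ThreadPoolExecutor(max_workers=len(COMPILED)) as pool:
        hits = pool.map(lambda rx: rx.search(payload) is not None, COMPILED)
        return [int(hit) for hit in hits]


if __name__ == "__main__":
    # S11: receive the string, S12: match in parallel, S13: return the result.
    print(match_all(b"GET /admin HTTP/1.1"))
```

The result is a list with one entry per rule, which corresponds to the 0/1 representation of the matching result that the description mentions for FIG. 4.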
Referring to fig. 2, an embodiment of the present application discloses a specific regular expression matching method, applied to an FPGA, where the method includes:
Step S21: receiving a character string to be matched sent by computer equipment connected with the FPGA.
The FPGA needs to be connected with computer equipment so that the computer equipment can send relevant data to the FPGA, wherein AXI (Advanced eXtensible Interface) -Stream protocol can be adopted for communication between the computer equipment and the FPGA. Correspondingly, the FPGA needs to receive the character strings to be matched sent by computer equipment connected with the FPGA by utilizing an AXI-Stream protocol.
When the data transmission interface between the computer equipment and the FPGA uses the AXI-Stream protocol format, it contains the tdata, tkeep, tvalid, tlast and tready signals. tdata is the data content transmitted to the FPGA, tkeep flags the validity of each byte in tdata, tvalid flags the overall validity of tdata, tlast marks the end of a single data transmission, and tready indicates that a matching module is currently in the idle state.
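To make the role of these signals concrete, the following minimal sketch reassembles one character string from a sequence of bus beats. The `Beat` class and the `reassemble` function are illustrative names, not part of the patent or of any AXI library, and the sketch assumes that the receiver is always ready (tready held high), so the handshake itself is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Beat:
    tdata: bytes   # one bus word, byte 0 first
    tkeep: int     # bit i set means byte i of tdata is valid
    tvalid: bool   # the whole beat carries data
    tlast: bool    # final beat of one transfer


def reassemble(beats):
    """Collect the valid bytes of one transfer, stopping at the tlast beat."""
    payload = bytearray()
    for beat in beats:
        if not beat.tvalid:
            continue  # idle cycle on the bus, nothing to sample
        for i, byte in enumerate(beat.tdata):
            if (beat.tkeep >> i) & 1:
                payload.append(byte)
        if beat.tlast:
            break
    return bytes(payload)
```

In this model a final beat that carries fewer valid bytes than the bus width simply has some tkeep bits cleared, which is how a string whose length is not a multiple of the bus width can be transferred.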
Step S22: and preprocessing the character strings to be matched through the regular expression matching module.
After the character strings to be matched are received, the character strings to be matched are required to be matched in parallel through each regular expression in the regular expression matching module in an idle state.
In a specific implementation, several regular expression matching modules on the FPGA may be in the idle state at the same time. The regular expression matching module that is in the idle state and has the smallest module identifier can therefore be determined as the target regular expression matching module, and the character string to be matched is transmitted to the target regular expression matching module. Correspondingly, matching the character string in parallel through each regular expression in the regular expression matching module in the idle state comprises: matching the character string to be matched in parallel through each regular expression in the target regular expression matching module. For example, if regular expression matching modules 1 to 10 are all in the idle state, regular expression matching module 1 is taken as the target regular expression matching module.
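The selection rule can be written down as a short sketch. The function name `pick_target` and the dictionary representation of module states are assumptions made for illustration; the patent only specifies that the idle module with the smallest identifier is chosen.

```python
def pick_target(idle_flags):
    """Return the smallest identifier among idle modules, or None if all are busy.

    idle_flags maps a module identifier to True when that module is idle.
    """
    idle_ids = [module_id for module_id, is_idle in idle_flags.items() if is_idle]
    return min(idle_ids) if idle_ids else None


def tready(idle_flags):
    """tready is asserted when at least one matching module is idle."""
    return pick_target(idle_flags) is not None
```

With modules 1 to 10 all idle, `pick_target` returns 1, which matches the example in the description. When no module is idle it returns `None`, and `tready` is deasserted.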
The parallel matching of the character strings to be matched through each regular expression in the regular expression matching module in the idle state can specifically include: and preprocessing the character strings to be matched through the regular expression matching module.
Specifically, the method comprises the following steps: caching the character strings to be matched through a FIFO (First Input First Output) cache in the regular expression matching module; performing data bit width conversion on the character strings to be matched through a shift register in the regular expression matching module; and copying the character strings to be matched after the data bit width conversion through a plurality of clock registers in the regular expression matching module.
That is, the character string to be matched is first cached in the FIFO buffer of the regular expression matching module and then read from the FIFO buffer. A shift register converts its data bit width so that the data bit width is consistent with the data processing bit width of the FPGA. The converted character string is then copied by a plurality of clock registers so that each regular expression in the regular expression matching module has its own copy of the converted character string, and all regular expressions can match the character string in parallel.
Step S23: carrying out parallel matching on the preprocessed character strings to be matched by the regular expression matching module to obtain a matching result.
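The preprocessing chain (FIFO buffering, bit width conversion, replication) can be sketched as a behavioural model. Several details here are assumptions rather than statements from the patent: the bus words are modelled as integers, the target width is one byte, the least significant byte is shifted out first, and the name `preprocess` is hypothetical.

```python
from collections import deque


def preprocess(words, word_bytes, n_rules):
    """Model of FIFO -> shift register -> register copies.

    words      : iterable of integers, each one bus word
    word_bytes : width of a bus word in bytes
    n_rules    : number of regular expressions that need their own copy
    """
    fifo = deque(words)                       # FIFO buffering of incoming words
    copies = [bytearray() for _ in range(n_rules)]
    while fifo:
        word = fifo.popleft()
        for i in range(word_bytes):           # shift out one byte per step
            byte = (word >> (8 * i)) & 0xFF   # assumed order: low byte first
            for copy in copies:               # replicate for every expression
                copy.append(byte)
    return [bytes(copy) for copy in copies]
```

For example, `preprocess([0x64636261, 0x68676665], 4, 3)` yields three identical copies of `b"abcdefgh"`, one per regular expression.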
After the character strings to be matched are preprocessed, the character strings to be matched after preprocessing can be matched in parallel through the regular expression matching module, and a matching result is obtained.
Specifically, the copied, bit-width-converted character strings to be matched are matched in parallel through each regular expression in the regular expression matching module to obtain a regular expression output result, and data bit width conversion is then performed on the regular expression output result to obtain the matching result.
That is, the raw output of the regular expressions is not returned directly: its data bit width is first converted, and the converted output is the matching result.
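The output-side bit width conversion can be illustrated as packing one 0/1 flag per regular expression into words of the return link's width. The function name `pack_results` and the bit ordering (flag 0 in bit 0 of the first word) are assumptions; the patent does not specify the layout.

```python
def pack_results(flags, out_width_bits):
    """Pack 0/1 match flags into integers that are out_width_bits wide."""
    words = []
    for start in range(0, len(flags), out_width_bits):
        word = 0
        chunk = flags[start:start + out_width_bits]
        for offset, flag in enumerate(chunk):
            word |= (flag & 1) << offset      # flag k lands in bit k of its word
        words.append(word)
    return words
```

With nine rules and an 8-bit output width, the flags `[1, 0, 1, 1, 0, 0, 0, 0, 1]` pack into two words: the first holds rules 0 to 7 and the second holds rule 8 in its lowest bit.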
Step S24: and when the regular expression matching module is in a calculation completion state, reading the matching result from the regular expression matching module.
It can be appreciated that when the regular expression matching module is in the calculation completion state, every regular expression in the module has finished matching the character string to be matched, so the matching result can be read from the regular expression matching module.
Step S25: and sending the matching result to the computer equipment.
After the matching result is read from the regular expression matching module, the matching result can be returned to the computer device.
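Steps S24 and S25 amount to polling the modules for the calculation completion state and forwarding their results. The sketch below models this with a hypothetical `MatchModule` record and `collect_results` function. Returning a module to the idle state after its result is read is an assumption; the patent does not describe that transition.

```python
from dataclasses import dataclass, field

IDLE, BUSY, DONE = "idle", "busy", "done"


@dataclass
class MatchModule:
    module_id: int
    state: str = IDLE
    result: list = field(default_factory=list)


def collect_results(modules):
    """Read results from every module in the calculation completion state."""
    collected = []
    for module in modules:
        if module.state == DONE:
            collected.append((module.module_id, module.result))
            module.state = IDLE   # assumed: the module is released after the read
            module.result = []
    return collected
```

Modules that are still busy are skipped and keep their state, so a later call can pick up their results once they finish.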
Referring to FIG. 3, the overall architecture for regular expression matching is shown. The transmitted data comprises the tdata, tkeep, tvalid, tlast and tready signals. Data sent to the FPGA by the computer equipment is first received by a data distribution sub-module and forwarded to an idle regular expression matching module. After the data has been matched in parallel by each regular expression in that module, the matching result is output to a data selection sub-module, which transmits it back to the computer equipment.
Referring to FIG. 4, the architecture of a regular expression matching module is shown. The character string to be matched sent by the computer equipment is first cached in the FIFO buffer. The data reading and bit width conversion sub-module then reads data from the FIFO buffer at intervals of a certain period and performs the data bit width conversion with a shift register. The data reading period depends on the input data bit width, since only one byte of data is read in each clock cycle. The register copying sub-module increases data fan-out capability through multi-clock register copying: it makes multiple copies of the bit-width-converted data so that all regular expressions can be matched in parallel while timing violations in the system are avoided. The data then enters the highly parallel regular expression matching module, and the matching result is represented in 0/1 form. To adapt to the bit width of the Card-Host data link, a data bit width conversion sub-module converts the data bit width of the matching result before it is transmitted.
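The relationship between the FIFO read period and the input data bit width can be made explicit with a small calculation. It rests on one reading of the description, namely that one byte is consumed per clock cycle, so a word of W bits keeps the shift register busy for W/8 cycles before the next FIFO read. The function name `fifo_read_period` and the example widths are illustrative and are not taken from the patent.

```python
def fifo_read_period(input_width_bits, bits_per_cycle=8):
    """Clock cycles between FIFO reads when bits_per_cycle bits are consumed per cycle."""
    if input_width_bits <= 0 or input_width_bits % bits_per_cycle:
        raise ValueError("input width must be a positive multiple of bits_per_cycle")
    return input_width_bits // bits_per_cycle


if __name__ == "__main__":
    for width in (32, 64, 512):
        print(width, "bit word -> read every", fifo_read_period(width), "cycles")
```

Under this assumption a 64-bit input word is read from the FIFO once every 8 cycles, and a 512-bit word once every 64 cycles.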
Referring to fig. 5, an embodiment of the present application discloses a regular expression matching device, which is applied to an FPGA, and includes:
the data distribution module 11 is used for receiving the character strings to be matched sent by the computer equipment connected with the FPGA;
the regular expression matching module 12 is configured to match the character strings to be matched in parallel through each regular expression, so as to obtain a matching result;
and the data selection module 13 is used for sending the matching result to the computer equipment.
As can be seen, in the present application the FPGA first receives a character string to be matched sent by a computer device connected to it, then matches the character string in parallel against each regular expression in a regular expression matching module that is in the idle state to obtain a matching result, and finally sends the matching result to the computer device. Regular expression matching that was originally executed by the CPU is thereby offloaded to the FPGA, whose high degree of parallelism is used to match the character string against all regular expressions at once. Compared with the prior art, in which the CPU matches the regular expressions serially, one by one, this reduces matching delay, improves matching efficiency, and improves the performance of the whole system.
In a specific implementation process, the data distribution module 11 is configured to:
and receiving a character string to be matched sent by computer equipment connected with the FPGA by utilizing an AXI-Stream protocol.
In a specific implementation, the data distribution module 11 is further configured to:
determining a regular expression matching module which is in an idle state and has the smallest module identifier as a target regular expression matching module;
transmitting the character strings to be matched to the target regular expression matching module;
accordingly, the regular expression matching module 12 is configured to:
and carrying out parallel matching on the character strings to be matched through each regular expression.
In a specific implementation, the data selection module 13 is configured to:
when the regular expression matching module is in a calculation completion state, reading the matching result from the regular expression matching module;
and sending the matching result to the computer equipment.
In a specific implementation, the regular expression matching module 12 is configured to:
preprocessing the character strings to be matched;
and carrying out parallel matching on the preprocessed character strings to be matched to obtain a matching result.
In a specific implementation, the regular expression matching module 12 includes:
the FIFO buffer is used for buffering the character strings to be matched;
the shift register is used for carrying out data bit width conversion on the character strings to be matched;
and the multi-clock register is used for copying the character strings to be matched after the data bit width conversion.
In a specific implementation, the regular expression matching module 12 is configured to:
carrying out parallel matching on the copied character strings to be matched after the data bit width conversion by using each regular expression to obtain a regular expression output result;
and carrying out data bit width conversion on the regular expression output result to obtain the matching result.
The data distribution module 11 is the data distribution sub-module in fig. 3, the regular expression matching module 12 is the regular expression matching module in fig. 3, and the data selection module 13 is the data selection sub-module in fig. 3.
Further, referring to fig. 6, the embodiment of the present application further discloses an FPGA, which includes: a processing unit 21 and a storage unit 22.
Wherein the storage unit 22 is configured to store a computer program; the processing unit 21 is configured to execute the computer program to implement the regular expression matching method disclosed in the foregoing embodiment.
For the specific process of the regular expression matching method, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
Further, an embodiment of the present application further discloses a computer readable storage medium for storing a computer program, where the computer program when executed by a processor implements the steps disclosed in any of the foregoing embodiments:
receiving a character string to be matched sent by computer equipment connected with the FPGA;
carrying out parallel matching on the character strings to be matched through each regular expression in the regular expression matching module in an idle state to obtain a matching result;
and sending the matching result to the computer equipment.
As can be seen, in the present application the FPGA first receives a character string to be matched sent by a computer device connected to it, then matches the character string in parallel against each regular expression in a regular expression matching module that is in the idle state to obtain a matching result, and finally sends the matching result to the computer device. Regular expression matching that was originally executed by the CPU is thereby offloaded to the FPGA, whose high degree of parallelism is used to match the character string against all regular expressions at once. Compared with the prior art, in which the CPU matches the regular expressions serially, one by one, this reduces matching delay, improves matching efficiency, and improves the performance of the whole system.
In this embodiment, when the computer program stored in the computer readable storage medium is executed by the processor, the following steps may be specifically implemented:
receiving, by using the AXI-Stream protocol, the character string to be matched sent by the computer device connected to the FPGA.
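To illustrate what receiving a string over AXI-Stream involves, the sketch below models the standard TVALID/TREADY handshake and the TLAST end-of-packet marker in Python. The `Beat` class, the `receive_packet` function, and the idea that one packet carries one character string are assumptions made for illustration; the patent text does not specify the data width or packet framing.

```python
from dataclasses import dataclass

@dataclass
class Beat:
    tdata: bytes   # payload carried in this clock cycle
    tvalid: bool   # sender is presenting valid data
    tlast: bool    # marks the final beat of a packet

def receive_packet(beats, tready=True):
    """Assemble one packet; a beat transfers only when tvalid and tready are both high."""
    buf = bytearray()
    for beat in beats:
        if not (beat.tvalid and tready):
            continue              # no handshake in this cycle, nothing is transferred
        buf += beat.tdata
        if beat.tlast:
            return bytes(buf)     # packet complete: this is the string to be matched
    return None                   # stream ended before tlast was seen
```

In this model a cycle with `tvalid` low simply contributes nothing, which mirrors how the receiver ignores the data lines when no handshake occurs.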
In this embodiment, when the computer subroutine stored in the computer readable storage medium is executed by the processor, the following steps may be specifically implemented:
determining, as a target regular expression matching module, the regular expression matching module that is in an idle state and has the smallest module identifier;
sending the character string to be matched to the target regular expression matching module;
correspondingly, matching the character string to be matched in parallel against each regular expression in the regular expression matching module in the idle state comprises:
matching the character string to be matched in parallel against each regular expression in the target regular expression matching module.
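The selection rule above (idle state, smallest module identifier) can be sketched as follows. The state names other than idle, the dictionary layout, and the `dispatch` helper are hypothetical and serve only to show the rule in action.

```python
IDLE, BUSY = "idle", "busy"   # "busy" is an assumed name for a non-idle state

def select_target_module(states):
    """states maps module identifier -> state; return the smallest idle identifier, or None."""
    idle_ids = [module_id for module_id, state in states.items() if state == IDLE]
    return min(idle_ids) if idle_ids else None

def dispatch(states, inboxes, text):
    """Send the string to the target module and mark that module as no longer idle."""
    target = select_target_module(states)
    if target is None:
        return None               # no idle module: the caller has to wait and retry
    inboxes[target] = text
    states[target] = BUSY
    return target
```

Picking the smallest identifier makes the choice deterministic, which is simple to realize in hardware as a priority encoder over the idle flags.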
In this embodiment, when the computer subroutine stored in the computer readable storage medium is executed by the processor, the following steps may be specifically implemented:
when the regular expression matching module is in a calculation-complete state, reading the matching result from the regular expression matching module;
and sending the matching result to the computer device.
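A minimal sketch of this read-back step is given below. The dictionary fields, the `send` callback, and the return of a module to the idle state after its result has been read are assumptions for illustration; the text only states that the result is read once the module reaches the calculation-complete state.

```python
IDLE, DONE = "idle", "done"   # "done" stands for the calculation-complete state

def collect_results(modules, send):
    """Read the result of every module in the completion state and forward it."""
    sent = 0
    for module in modules:
        if module["state"] != DONE:
            continue              # idle or still computing: nothing to read yet
        send(module["result"])    # forward the matching result to the computer device
        module["result"] = None
        module["state"] = IDLE    # assumption: a module becomes reusable once read
        sent += 1
    return sent
```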
In this embodiment, when the computer subroutine stored in the computer readable storage medium is executed by the processor, the following steps may be specifically implemented:
preprocessing the character string to be matched through the regular expression matching module;
and matching the preprocessed character string to be matched in parallel through the regular expression matching module, to obtain a matching result.
In this embodiment, when the computer subroutine stored in the computer readable storage medium is executed by the processor, the following steps may be specifically implemented:
caching the character strings to be matched through a FIFO (first in first out) buffer in the regular expression matching module;
performing data bit width conversion on the character strings to be matched through a shift register in the regular expression matching module;
and copying the character strings to be matched after the data bit width conversion through a plurality of clock registers in the regular expression matching module.
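The three preprocessing steps can be modeled in software roughly as follows. The sketch assumes that the input arrives as 32-bit words, that bit width conversion means splitting each word into 8-bit characters with the lowest byte first, and that copying means giving each regular expression its own copy of the byte stream; none of these widths or orderings is stated in the text.

```python
from collections import deque

def preprocess(words, in_bytes=4, copies=3):
    """Buffer wide words, split them into bytes, and fan the bytes out to `copies` lanes."""
    fifo = deque(words)                    # FIFO buffer: words leave in arrival order
    lanes = [[] for _ in range(copies)]    # one output lane per regular expression
    while fifo:
        shift_reg = fifo.popleft()
        for _ in range(in_bytes):
            byte = shift_reg & 0xFF        # emit the lowest byte (assumed byte order)
            shift_reg >>= 8                # shift the next byte into position
            for lane in lanes:
                lane.append(byte)          # register copy: every lane sees the same byte
    return lanes
```

For example, the word `0x64636261` yields the bytes of `"abcd"` on every lane. In hardware the replication step mainly serves to limit fan-out, so that one register does not have to drive every matching circuit.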
In this embodiment, when the computer subroutine stored in the computer readable storage medium is executed by the processor, the following steps may be specifically implemented:
matching the copied character strings to be matched, after the data bit width conversion, in parallel against each regular expression in the regular expression matching module, to obtain a regular expression output result;
and performing data bit width conversion on the regular expression output result to obtain the matching result.
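One possible reading of these two steps is sketched below: each expression produces a single match bit, and the bits are then packed into fixed-width words for the return path. This interpretation of the output bit width conversion, the `out_width` parameter, and the sample patterns are all assumptions; the text does not define the output format.

```python
import re

def match_and_pack(lanes, patterns, out_width=8):
    """Match lane i against pattern i, then pack the match bits into out_width-bit words."""
    bits = [1 if re.search(p, lane) else 0 for p, lane in zip(patterns, lanes)]
    words = []
    for start in range(0, len(bits), out_width):
        word = 0
        for offset, bit in enumerate(bits[start:start + out_width]):
            word |= bit << offset          # bit i of the word reports expression start + i
        words.append(word)
    return words
```

With three expressions of which the first and third match, the packed result is the single word `0b101`.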
In this specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant points, refer to the description of the method.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The regular expression matching method, device, FPGA, and medium provided in the present application have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the description of the above embodiments is intended only to help in understanding the method and core ideas of the present application. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (8)

1. A regular expression matching method, characterized in that it is applied to an FPGA and comprises:
receiving a character string to be matched sent by a computer device connected to the FPGA;
matching the character string to be matched in parallel against each regular expression in a regular expression matching module that is in an idle state, to obtain a matching result;
sending the matching result to the computer device;
wherein matching the character string to be matched in parallel against each regular expression in the regular expression matching module in the idle state, to obtain the matching result, comprises:
caching the character string to be matched through a FIFO (first in, first out) buffer in the regular expression matching module;
performing data bit width conversion on the character string to be matched through a shift register in the regular expression matching module;
copying the character string to be matched after the data bit width conversion through a plurality of clock registers in the regular expression matching module, so as to complete preprocessing;
and matching the preprocessed character string to be matched in parallel through the regular expression matching module, to obtain the matching result.
2. The regular expression matching method according to claim 1, wherein receiving the character string to be matched sent by the computer device connected to the FPGA comprises:
receiving, by using the AXI-Stream protocol, the character string to be matched sent by the computer device connected to the FPGA.
3. The regular expression matching method according to claim 1, wherein, before matching the character string to be matched in parallel against each regular expression in the regular expression matching module in the idle state, the method further comprises:
determining, as a target regular expression matching module, the regular expression matching module that is in an idle state and has the smallest module identifier;
sending the character string to be matched to the target regular expression matching module;
correspondingly, matching the character string to be matched in parallel against each regular expression in the regular expression matching module in the idle state comprises:
matching the character string to be matched in parallel against each regular expression in the target regular expression matching module.
4. The regular expression matching method of claim 1, wherein the sending the matching result to the computer device comprises:
when the regular expression matching module is in a calculation completion state, reading the matching result from the regular expression matching module;
and sending the matching result to the computer equipment.
5. The regular expression matching method according to claim 1, wherein matching the preprocessed character string to be matched in parallel through the regular expression matching module, to obtain the matching result, comprises:
matching the copied character strings to be matched, after the data bit width conversion, in parallel against each regular expression in the regular expression matching module, to obtain a regular expression output result;
and performing data bit width conversion on the regular expression output result to obtain the matching result.
6. A regular expression matching device, applied to an FPGA, comprising:
a data distribution module, configured to receive a character string to be matched sent by a computer device connected to the FPGA;
a regular expression matching module, configured to, when the module is in an idle state, match the character string to be matched in parallel against each regular expression, to obtain a matching result;
a data selection module, configured to send the matching result to the computer device;
wherein the regular expression matching module comprises:
a FIFO buffer, configured to cache the character string to be matched;
a shift register, configured to perform data bit width conversion on the character string to be matched;
a plurality of clock registers, configured to copy the character string to be matched after the data bit width conversion, so as to complete preprocessing;
and the regular expression matching module is further configured to match the preprocessed character string to be matched in parallel, to obtain the matching result.
7. An FPGA, comprising:
a storage unit and a processing unit;
wherein the storage unit is used for storing a computer program;
the processing unit is configured to execute the computer program to implement the regular expression matching method of any of claims 1 to 5.
8. A computer readable storage medium for storing a computer program, wherein the computer program when executed by a processor implements the regular expression matching method of any of claims 1 to 5.
CN202111017189.1A 2021-08-31 2021-08-31 Regular expression matching method and device, FPGA and medium Active CN113703715B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111017189.1A CN113703715B (en) 2021-08-31 2021-08-31 Regular expression matching method and device, FPGA and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111017189.1A CN113703715B (en) 2021-08-31 2021-08-31 Regular expression matching method and device, FPGA and medium

Publications (2)

Publication Number Publication Date
CN113703715A CN113703715A (en) 2021-11-26
CN113703715B true CN113703715B (en) 2024-02-23

Family

ID=78658402

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111017189.1A Active CN113703715B (en) 2021-08-31 2021-08-31 Regular expression matching method and device, FPGA and medium

Country Status (1)

Country Link
CN (1) CN113703715B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116881517A (en) * 2023-07-25 2023-10-13 中科驭数(北京)科技有限公司 Database data processing method and system
CN117574178B (en) * 2024-01-15 2024-04-26 国网湖北省电力有限公司信息通信公司 Automatic network flow character string matching method and device based on FPGA

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8990232B2 (en) * 2012-05-15 2015-03-24 Telefonaktiebolaget L M Ericsson (Publ) Apparatus and method for parallel regular expression matching
US20150324457A1 (en) * 2014-05-09 2015-11-12 Dell Products, Lp Ordering a Set of Regular Expressions for Matching Against a String

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101360088A (en) * 2007-07-30 2009-02-04 华为技术有限公司 Regular expression compiling, matching system and compiling, matching method
WO2009015603A1 (en) * 2007-07-30 2009-02-05 Huawei Technologies Co., Ltd. Regular expression compiling system, matching system, compiling method and matching method
EP2390797A1 (en) * 2010-05-25 2011-11-30 Huawei Technologies Co., Ltd. Regular expression matching method and system
CN102521356A (en) * 2011-12-13 2012-06-27 曙光信息产业(北京)有限公司 Regular expression matching equipment and method on basis of deterministic finite automaton
GB201218305D0 (en) * 2012-10-12 2012-11-28 Ibm Processor instruction based data prefetching
KR20150026979A (en) * 2013-08-30 2015-03-11 캐비엄, 인코포레이티드 GENERATING A NFA (Non-Deterministic finite automata) GRAPH FOR REGULAR EXPRESSION PATTERNS WITH ADVANCED FEATURES
CN104753931A (en) * 2015-03-18 2015-07-01 中国人民解放军信息工程大学 DPI (deep packet inspection) method based on regular expression
CN106776456A (en) * 2017-01-18 2017-05-31 中国人民解放军国防科学技术大学 High speed matching regular expressions hybrid system and method based on FPGA+NPU
US10691856B1 (en) * 2018-04-02 2020-06-23 Xilinx, Inc. System design flow with runtime customizable circuits
CN109408682A (en) * 2018-10-30 2019-03-01 杭州安恒信息技术股份有限公司 A kind of method of regular expression matching, system and equipment
CN110324204A (en) * 2019-07-01 2019-10-11 中国人民解放军陆军工程大学 A kind of high speed regular expression matching engine realized in FPGA and method
CN111177491A (en) * 2019-12-31 2020-05-19 奇安信科技集团股份有限公司 Regular expression matching method and device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于FPGA的正则表达式匹配引擎设计;王奇敏;李训根;赵海斌;;电子世界(01);全文 *

Also Published As

Publication number Publication date
CN113703715A (en) 2021-11-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant