WO2022134942A1 - Method and apparatus for packet identification under massive traffic - Google Patents

Method and apparatus for packet identification under massive traffic

Info

Publication number
WO2022134942A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
quintuple
packet
service
core
Prior art date
Application number
PCT/CN2021/130891
Other languages
English (en)
French (fr)
Inventor
王赟
曾伟
Original Assignee
武汉绿色网络信息服务有限责任公司
Priority date
Filing date
Publication date
Application filed by 武汉绿色网络信息服务有限责任公司
Publication of WO2022134942A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/30Creation or generation of source code
    • G06F8/37Compiler construction; Parser generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2255Hash tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • G06F16/24578Query processing with adaptation to user needs using ranking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441Countermeasures against malicious traffic

Definitions

  • the present invention relates to the field of network security, and in particular, to a method and device for packet identification under massive traffic.
  • the data plane development kit (dpdk) removes the bottleneck in collecting and forwarding large volumes of data, but the feature matching algorithms currently paired with it consume substantial resources; the AC algorithm, for example, has the major drawback that its automaton consumes a great deal of memory, and when searching Chinese strings the choice of the smallest character unit directly affects query speed and memory usage.
  • the present invention addresses the problem that, in current network monitoring systems, network traffic is huge while access ports and access traffic are limited and existing monitoring methods are inefficient, so that malicious packets cannot be monitored comprehensively.
  • call dpdk to allocate the regular service processing cores and special service processing cores of the CPU; load the monitoring policy rule file and the service feature file, and convert them into a rule database and a service feature database in Hyperscan format; call dpdk to complete data access, distinguish regular services from special services according to the Hyperscan service feature database and the key of each data quintuple's hash value, and put each data quintuple into the data queue of the corresponding CPU core; take the data quintuples out of each CPU core's data queue one by one and decode them, generating a data packet corresponding to each quintuple; according to the Hyperscan rule database and service database, use the corresponding CPU cores to scan and match the data packets in the corresponding queues, obtain malicious packets, and process the malicious packets.
  • each accessed data quintuple is put into the data queue of the corresponding CPU core according to the key of its hash value, which specifically includes: generating a hash table from the service feature database, the keys of the hash table being the hash-value keys corresponding to special services; judging whether the hash-value key of each data quintuple exists in the hash table; if it exists, putting the data quintuple into the corresponding position of the hash table and into the data queue of a special service processing core; if it does not exist, putting the data quintuple into the processing queue of a regular service processing core.
  • using the corresponding CPU cores to scan and match the data packets in the corresponding queues further comprises: judging whether the application ID of each service packet on a regular service core is a special service ID; if it is, repackaging the quintuple information of the data packet, the application ID and a preset special service identifier, putting the repackaged data into the data queue of a special service processing core, which performs the scanning and matching; if it is not, performing the scanning and matching on the regular service processing core.
  • using the corresponding CPU cores to scan and match the data packets in the corresponding queues further comprises: if one rule of the rule database or the service feature database contains several parallel quintuple features and string features at the same time, scanning the quintuple features and string features of the data packet in parallel and then integrating the scan results; or, if a rule contains multiple qualifications, scanning the quintuple features and string features of the data packet serially, one after another.
  • after allocating the regular service processing cores and special service processing cores of the CPU, the method further comprises: configuring the packet receiving ports and sending ports, configuring the CPU's packet receiving cores, regular service processing cores, special service processing cores and data forwarding cores, and configuring the number of memory channels; applying, for each CPU core, for a ring used to receive packet data and send packets to the service processing cores, and for one ring used to send data packets, and initializing each ring; establishing the mapping relationship among the memory pool, the rings and DMA; and starting the configured ports and rings.
  • the data quintuples are taken out of each CPU core's data queue one by one and decoded, which specifically includes: constructing a tree structure by pre-order traversal according to the protocol levels in the data quintuple, the levels of the tree matching the protocol levels and each node of the tree being a protocol node; and decoding each node of the tree while the tree structure is being built.
  • the method further comprises: judging whether the data quintuple has a corresponding session; if there is no corresponding session, creating the corresponding session, adding it to the session hash table and submitting a timeout; if there is a corresponding session, updating the corresponding session hash node and judging whether the session has ended.
  • using the corresponding CPU cores to scan and match the data packets in the corresponding queues further comprises: using block mode if the packet is a single-packet scan of the UDP protocol; using stream mode if the packet is a whole-stream scan of TCP data.
  • processing the malicious packets specifically includes: blocking the malicious packets or reporting netflow logs.
  • the present invention provides an apparatus for packet identification under massive traffic, specifically comprising at least one processor and a memory connected through a data bus, the memory storing instructions executable by the at least one processor; after being executed by the processor, the instructions are used to complete the method for packet identification under massive traffic of the first aspect.
  • the beneficial effects of the embodiments of the present invention are: the identification scanning of Hyperscan steers the data distribution of dpdk, while the distribution of special data by dpdk reduces the performance cost of Hyperscan application identification, effectively improving the efficiency with which the dpdk and Hyperscan technologies are combined.
  • Hyperscan is used for classified compilation and identification, so that while dpdk completes large-traffic packet reception, the efficiency of rule matching is improved and the dpdk and Hyperscan technologies are combined more efficiently.
  • FIG. 1 is a flowchart of a method for packet identification under massive traffic provided by an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of the top-level framework of the data access layer used by a method for packet identification under massive traffic provided by an embodiment of the present invention;
  • FIG. 3 is a flowchart of another method for packet identification under massive traffic provided by an embodiment of the present invention;
  • FIG. 4 is a flowchart of another method for packet identification under massive traffic provided by an embodiment of the present invention;
  • FIG. 5 is a schematic flowchart of the Hyperscan compile phase used in a method for packet identification under massive traffic provided by an embodiment of the present invention;
  • FIG. 6 is a schematic flowchart of the Hyperscan runtime used in a method for packet identification under massive traffic provided by an embodiment of the present invention;
  • FIG. 7 is a schematic diagram of the hierarchical relationship of WTP packet decoding used in a method for packet identification under massive traffic provided by an embodiment of the present invention;
  • FIG. 8 is a flowchart of another method for packet identification under massive traffic provided by an embodiment of the present invention;
  • FIG. 9 is a flowchart of another method for packet identification under massive traffic provided by an embodiment of the present invention;
  • FIG. 10 is a schematic structural diagram of an apparatus for packet identification under massive traffic provided by an embodiment of the present invention.
  • the present invention is an architecture of a specific functional system. Therefore, the functional logic relationship of each structural module is mainly described in the specific embodiments, and the specific software and hardware implementations are not limited.
  • dpdk: abbreviation of the Intel Data Plane Development Kit. It is a set of data plane development tools provided by Intel, including a collection of lib libraries and corresponding toolkits, which provide library functions and driver support for efficient user-space packet processing on the Intel architecture (IA) processor architecture.
  • dpdk focuses on the high-performance processing of data packets in network applications: it runs in user space and uses its own data plane libraries to send and receive packets, bypassing the Linux kernel protocol stack's packet processing.
  • Traditional packet processing tasks involve switching between kernel mode and user mode, as well as multiple memory copies, which increases system consumption.
  • CPU-centric systems have a large processing bottleneck.
  • dpdk technology improves data packet processing in general-purpose servers.
  • the dpdk application program runs in the User Space of the operating system, and uses the data plane library provided by itself to send and receive packets, bypassing the Linux kernel mode protocol stack to improve the efficiency of packet processing.
  • the processing performance is further improved through mechanisms such as polling and interrupts, multi-threaded programming, CPU affinity, large page tables, lock-free mechanism, cache pre-reading, and UIO (Userspace I/O).
  • Hyperscan is a high-performance regular expression matching library from Intel that uses a specific syntax and working modes to ensure its practicality in real network scenarios.
  • Hyperscan uses the SIMD instructions that Intel processors have in the engine for acceleration.
  • the user can customize the behavior after matching through the callback function. Since the generated database is read-only, users can share the database in multiple CPU cores or multi-threaded scenarios to improve matching scalability. It has the characteristics of diverse functions, support for large-scale matching, support for streaming mode, high performance and high scalability.
  • Hyperscan can meet different usage scenarios through different matching modes (stream mode and block mode), supports matching tens of thousands to hundreds of thousands of rules, and achieves single-core performance of 3.6 Gbps to 23.9 Gbps; as the number of cores increases, matching performance grows roughly linearly.
  • Snort is a network sniffer that can capture and analyze packets on the network and respond to and process them according to defined rules. After analyzing captured packets against the rules, five response mechanisms can be applied according to the rule chain: Activation (alert and activate another dynamic rule chain), Dynamic (invoked by other rule packages), Alert, Pass (ignore) and Log (record network traffic without alerting).
  • Snort provides packet sniffing, packet analysis, packet detection, response processing and other functions; each module implements a different function and is combined with Snort as a plug-in, which makes function extension convenient.
  • preprocessing plug-ins run before rule matching and misuse detection and complete functions such as IP fragment reassembly, http decoding and telnet decoding; processing plug-ins complete functions such as checking each protocol field, closing connections and responding to attacks.
  • output plug-ins output the processed results in the form of logs or warnings.
  • PCRE is a Perl library, including a perl-compatible regular expression library.
  • Quintuple: a communication term, usually referring to the source IP address, source port, destination IP address, destination port and transport layer protocol.
  • RSS is a load distribution method proposed by Microsoft. It calculates the network layer & transport layer two/three/quadruple hash value in the network data message, and takes the least significant bit of the hash value to index the indirect addressing table.
  • Intel's x86 processor performs access control through the ring level.
  • the level is divided into 4 layers, ring0, ring1, ring2 and ring3.
  • the ring0 layer has the highest authority, and the ring3 layer has the lowest authority.
  • Memory Pool abbreviated as mempool. It is a memory allocation method, also known as fixed-size-blocks allocation. When you directly use APIs such as new and malloc to apply for memory allocation, due to the variable size of the applied memory block, a large number of memory fragments will be caused and performance will be reduced when used frequently.
  • the memory pool is to apply for the allocation of a certain number of memory blocks of equal size as spares before actually using the memory. When there is a new memory demand, a part of the memory block is allocated from the memory pool. Continue to apply for new memory to improve memory allocation efficiency.
  • DMA Direct Memory Access
  • This mechanism allows hardware devices of different speeds to communicate without relying on the CPU's heavy interrupt load.
  • DMA transfers copy data from one address space to another.
  • the transfer itself is performed and completed by the DMA controller. For example, when a block of external memory is moved to a faster memory area inside the chip, the processor can handle other tasks at the same time.
  • DMA transfers play an important role in high-performance embedded system algorithms and networks.
  • dpdk is combined with hyperscan, which satisfies both large-traffic data collection and efficient feature string matching, and improves the processing efficiency of large-traffic.
  • Step 101 Invoke dpdk to allocate the normal service processing core and special service processing core of the CPU.
  • the method for packet identification introduces a data access layer between the network interface layer (network card driver) and the monitoring application to complete packet distribution, packet sharing, device isolation, and data isolation.
  • the data distribution function allows the system to use data concurrency more conveniently to process data packets in parallel and improves the user's monitoring application.
  • packet sharing allows multiple processes to share data packets and supports debugging and comparison tools such as TCPDump, through which the packets transmitted in the network can be fully captured for analysis, with filtering by network layer, protocol, host, network or port, and logical statements such as and, or and not to remove useless information.
  • the use of independent data access layers can also support other independent business programs like monitoring applications.
  • Device isolation enables the system to ignore the impact of network card hardware differences and thus has the ability to extend to non-X86 platforms. Data isolation can effectively protect the system kernel to prevent equipment system failures caused by abnormal operation of business programs.
  • the top-level framework of the data access layer used in this embodiment needs to use modules such as a receiving interface, a sending interface, a receiving queue pool, and a sending queue pool.
  • the receive queue pool manages all receive queue rings
  • the send queue pool manages all send queue rings.
  • each functional module in the data access layer is matched with the actual hardware device, as shown in Figure 3, the initialization configuration can be performed through the following steps.
  • the service processing cores are further divided into groups.
  • the cores are divided into two groups, regular service processing cores and special service processing cores: the regular service processing cores handle regular services, and the special service processing cores handle special services.
  • Step 201 Configure a packet receiving port and a sending port of the message, configure a packet receiving core, a normal service processing core, a special service processing core and a data forwarding core of the CPU, and configure the number of memory channels.
  • Step 202 Apply for a ring for receiving message data packets and sending messages to the service processing core, and a ring for sending data packets for each CPU core, and initialize each ring.
  • Step 203 Establish a mapping relationship among the memory pool, ring and DMA.
  • Step 204 Start the configured ports and rings.
  • specific initialization configuration may be performed according to the following steps.
  • the following specific configuration is only an example of a certain scenario.
  • the actual parameter configuration depends on the size of the data stream, and the serial numbers and number of packet receiving cores, processing cores, and forwarding cores.
  • Step 301 Configure port1 and port2 as packet receiving ports and port3 and port4 as sending ports; use CPUs 1-2 as packet receiving cores, CPUs 3-4 as data forwarding cores, CPUs 5 and 7 as regular service processing cores and CPUs 6 and 8 as special service processing cores; set the number of memory channels to 4.
  • Step 302 Each core applies for 1 receiving queue ring for data packet reception, and applies for 1 forwarding ring for sending packets to the service processing core.
  • Step 303 Initialize and configure the packet receiving ports port1 and port2, bind the service forwarding core, and initialize the receiving queue ring.
  • Step 304 Establish a mapping relationship among the mempool, the receive queue ring, the forwarding ring and the DMA, and start port1 and port2.
  • Step 305 Apply for a sending queue ring.
  • Step 306 Configure port3 and port4, bind the data forwarding core, and initialize the sending queue ring.
  • Step 307 Establish a mapping relationship between mempool, send queue ring and DMA, and start port3 and port4.
  • the receiving queue ring, the forwarding ring and the sending queue ring used in steps 301 to 307 are all lock-free queues.
  • when multiple CPUs operate on the same queue with both an enqueueing thread and a dequeueing thread, the two threads can run concurrently without any locking, keeping the threads safe under high concurrency and improving CPU processing efficiency.
  • Step 102 Load the monitoring policy rule file and the business feature file, and convert the monitoring policy rule file and the business feature file into a Hyperscan mode rule database and business feature database.
  • the service feature may be the application ID feature of the service, The port number for sending and receiving services, etc.
  • the monitoring policy rule file and service feature file used in this embodiment are written in the snort syntax format, and the rule database uses hs_database_t.
  • the rule database and the service feature database can choose to use the flow pattern recognition rule base or the block pattern recognition rule base, or use both databases at the same time;
  • Each regular expression rule and the corresponding event are stored in the library, where the event is the event id after the identification hit, and the rule database and the service feature database are shared by all service processing threads in this embodiment.
  • Hyperscan is based on automata theory, and its workflow is mainly divided into two parts: compile time and run-time.
  • Hyperscan comes with a regular expression compiler written in C++. As shown in Figure 5, it takes regular expressions as input, and generates corresponding databases through complex graph analysis and optimization processes for different CPU core architecture platforms, user-defined patterns and special syntaxes. In addition, the resulting database can be serialized and kept in memory for runtime fetching.
  • the runtime of Hyperscan is developed in C language.
  • Figure 6 shows the main flow of Hyperscan during runtime. Users need to pre-allocate a section of memory to store temporary matching status information, and then use the compiled database to call the matching engine (such as NFA, DFA, etc.) inside Hyperscan to perform pattern matching on the input.
  • Hyperscan uses the SIMD instructions that Intel processors have in the engine for acceleration. At the same time, the user can customize the behavior after the match occurs through the callback function. Since the generated database is read-only, users can share the database in multiple CPU cores or multi-threaded scenarios to improve matching scalability.
  • hs_compile is called for the rule compilation of a single regular expression, and the prototype is:
  • expression is the regular expression string; flags controls the regular expression behavior, for example ignoring case or making the "." symbol match newlines; mode determines the format of the generated database, mainly BLOCK, STREAM and VECTOR, and the database of each mode can only be used by the corresponding scan interface; platform specifies the CPU characteristics of the target platform of this database, a NULL value meaning that the target platform is the current platform; db receives the compiled database; error receives error information.
  • for compiling rules from multiple regular expressions, hs_compile_multi is called, and the prototype is:
  • expressions are multiple regular expression strings; flags and ids are the flag and id arrays corresponding to expressions respectively; elements are the number of expression strings; the rest of the parameter definitions are the same as those of hs_compile.
  • Step 103 Invoke dpdk to complete data access, distinguish regular services and special services according to the Hyperscan service feature database and the key of the data quintuple hash value, and put each data quintuple into the data queue of the corresponding CPU core.
  • the packet receiving cores 1-2 complete the access of large-traffic data, and enter the received data into the queue (inqueue).
  • to ensure that data of the same quintuple is delivered to the same receiving core and to avoid the loss of processing efficiency caused by switching cores for different parts of a quintuple during matching, when the input is placed into the data queues, data belonging to the same quintuple is put into the receive queue ring of the same receiving CPU core.
  • the soft hash algorithm (RSS) of dpdk is used to achieve the same source and the same destination, so as to ensure the singleness and integrity of the data flow on each core.
  • the service packets are classified according to the service feature database, and the regular service packets and special service packets are distinguished.
  • a hash table may be used for classification.
  • the key of the hash table is the key of the hash value corresponding to the special business; determine whether the key of the hash value of each data quintuple exists in the hash table; if the key of the data quintuple hash value exists In the hash table, it indicates that the data quintuple is a special service, put the data quintuple into the corresponding position in the hash table, and put it into the receiving queue ring of the special service processing core, waiting for the special service processing core to process; If it does not exist in the hash table, it means that the data quintuple is a regular service, and the data quintuple is put into the receiving queue ring of the regular service processing core for processing by the regular service processing core.
  • Step 104 Extract data quintuples one by one from the data queue of each CPU core to decode, and generate a data packet corresponding to each data quintuple.
  • the regular service processing core and the special service processing core respectively cyclically read the data in the respective receive queue rings for parallel processing.
  • a tree structure is constructed in a pre-order traversal manner, wherein the level of the tree is consistent with the protocol level, and each node of the tree is a protocol node; While building the tree structure, each node of the tree is decoded.
  • Step 401 Construct a tree composed of protocol nodes, namely a tree structure, by each layer of protocol nodes.
  • Step 402 Expand the tree layer by layer according to the protocol node.
  • Step 403 While constructing the tree, use the pre-order traversal method to complete the traversal of each node in the tree.
  • Step 404 Complete the decoding work of each layer protocol by traversing.
  • step 403 and step 404 are completed synchronously, that is, every time a node is constructed, the decoding of the node is completed at the same time, and the traversal of the node is completed, and there is no need to wait for the traversal after the tree construction is completed, which improves the decoding efficiency.
  • the fields to be decoded include: protocol variable data of protocols such as ETHER, ICMP, IP, TCP, UDP, DNS, HTTP, SMTP, POP3, IMAP, and WTP. As shown in Figure 7, it is the hierarchical relationship of WTP packet decoding. The number of decoded fields is not less than 300.
  • the implementation manner provided in this embodiment can support the decoding of IPv4 and IPv6 protocols.
  • session management includes: session creation, update, aging and closing.
  • hardware timers can be used to maintain the life cycle of quintuple-based sessions.
  • the session management based on quintuple is maintained through a hash table, where the key of the hash table is quintuple information.
  • determine whether the data quintuple has a corresponding session; if there is no corresponding session, create the corresponding session, add it to the session hash table, and submit a timeout; if there is a corresponding session, update the corresponding session hash node and determine whether the session has ended.
  • session management can be performed through the following steps:
  • Step 501 Decode and output the data in the receiving queue ring, and check whether there is the same node in the hash table.
  • Step 502 When the search result is empty, create a new hash node and submit a timeout at the same time.
  • Step 503 When the same node is found, update the existing node, and determine whether the current session ends.
  • Step 504 If the current session has ended, delete the timeout and close the session.
  • Step 505 If the current session has not ended, only the session node information is updated.
  • Step 506 When the timeout returns, check whether the current state of the session is updated compared to the time of submission.
  • Step 507 If the session is not updated, delete the timeout and close the session.
  • Step 508 If the session state is updated, update the session node information, and resubmit the timeout until the next timeout arrives.
  • in step 503, the basis for judging whether a TCP session has ended is whether the current session contains a FIN or RST packet.
  • Step 105 According to the rule database and the business database of Hyperscan, use the corresponding CPU core to scan and match the data packets in the corresponding queues, obtain malicious packets, and process the malicious packets.
  • in order to identify different types of data packets, Hyperscan also needs to be used for feature scanning.
  • different scanning matching methods can be selected to improve the efficiency of feature scanning.
  • if a rule in the rule database or service feature database contains multiple parallel quintuple features and string features at the same time, the quintuple features and string features of the data packets in the rule are scanned in parallel, and the scan results are then integrated.
  • when the application ID of a service packet is a special service in the service database, the service packet is repackaged into a data packet containing the special service identifier, the application ID and the quintuple information of the original service packet.
  • the data packet is sent back to the receive queue corresponding to the dpdk receiving cores (cores 1 and 2); when a data packet carrying the special identifier is retrieved from the dpdk receive queue, the quintuple information in it is extracted, its quintuple hash value is computed through dpdk's RSS algorithm and updated into the special-service tuple hash table; after the data receiving cores (cores 1 and 2) receive data, they look up the quintuple information in the hash table, and when the lookup returns true the data is distributed to the queues of the special service processing cores (cores 6 and 8) for separate processing, with the application ID assigned directly, which removes the subsequent hyperscan application identification step.
  • the block mode is used, and hs_scan is called for matching.
  • the prototype is:
  • db is the database obtained by compiling in step 103; data and length are the data to be matched and the data length respectively; flags is reserved to control function behavior in future versions and is not currently used; scratch is the temporary data used during matching, already allocated; onEvent is the callback function called on a match, through which the user can customize the behavior taken after matching; context is a user-defined pointer.
  • the stream mode is used for matching. Since the stream mode is required, the stream must be opened before matching.
  • db is the pre-compiled schema database; flags: the flag for modifying the stream behavior, this parameter is provided for future use, but is not used at present; stream: when it succeeds, it will return a pointer to the generated hs_stream_t, and when it fails, it is NULL.
  • id is the hs_stream_t pointer corresponding to the stream to which the data belongs; other parameters are the same as hs_scan.
  • the flow mode of Hyperscan can realize the cross-packet scanning of TCP data, and can complete the cross-packet identification of the protocol without reorganizing the TCP data flow, but it needs to cache the context of the TCP data flow and maintain it through the hash table.
  • after scanning the payload of the current data packet and determining the hit rule, the corresponding eventID is looked up in the identification library according to the hit identification rule and saved in the hash node of the session.
  • the hit event ID corresponds to the malicious packet in the data stream that hits the monitoring policy rule.
  • in order to improve matching efficiency, whether in block mode or stream mode, the parameter flag must be set to HS_FLAG_DOALL during the matching search to avoid degrading hyperscan performance.
  • the malicious packets that hit the monitoring policy rules also need to be blocked or reported to the netflow log.
  • the blocking scheme and the reporting scheme can be selected, and other processing schemes for malicious packets can also be selected.
  • Blocking scheme: after the current data packet hits the monitoring policy, RST packets are immediately built according to the quintuple information of the original data stream, two RST packets in total, sent to the client and the server respectively, and the packets are enqueued (inqueue) to the dpdk send queue at the same time; after the data forwarding cores (cores 5-6) read the packets, they are sent to the protocol stack through the configured sending ports; after the client and the server receive the RST packets, they automatically close the session, the stream ends, and the blocking effect is achieved.
  • Netflow log reporting scheme: on the basis of session management, after a session ends normally or times out, a session-level view of network traffic is provided in the pre-delivered netflow custom format, the information of each TCP/IP transaction is recorded, the reported field information is organized according to the template information, and the flow log information is reported in TLV-fragmented format; the packets are enqueued (inqueue) to the dpdk send queue, and after reading them the data forwarding core reports them to the monitoring duty station via socket communication.
  • the advantage of the netflow log reporting solution is that it can output statistical data based on a template, which is convenient for adding data fields that need to be output, and supports a variety of new netflow functions, with flexible use and strong scalability. Specifically, the netflow V9 custom format can be used.
  • through the combination of DPDK and Hyperscan, the method for identifying and blocking malicious packets in massive traffic makes full use of DPDK's packet processing performance and Hyperscan's efficient and flexible regular expression matching capability, provides a fast and accurate rule matching method that can be used in massive traffic scenarios, realizes comprehensive and rapid monitoring of massive data packets, effectively monitors various network attack behaviors, and meets the goal of comprehensively monitoring and protecting network security and achieving secure and stable network operation.
  • the present invention also provides a device for identifying packets under massive traffic that can be used to implement the above method.
  • FIG. 10 A schematic diagram of the device architecture of the embodiment.
  • the apparatus for identifying packets under massive traffic in this embodiment includes one or more processors 21 and a memory 22 . Among them, one processor 21 is taken as an example in FIG. 10 .
  • the processor 21 and the memory 22 may be connected through a bus or in other ways, and the connection through a bus is taken as an example in FIG. 10 .
  • as a non-volatile computer-readable storage medium for the method of packet identification under massive traffic, the memory 22 can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the method for packet identification under massive traffic of Embodiment 1.
  • by running the non-volatile software programs, instructions and modules stored in the memory 22, the processor 21 executes the various functional applications and data processing of the apparatus for packet identification under massive traffic, that is, implements the method for packet identification under massive traffic of Embodiment 1.
  • Memory 22 may include high speed random access memory, and may also include nonvolatile memory, such as at least one magnetic disk storage device, flash memory device, or other nonvolatile solid state storage device.
  • the memory 22 may optionally include memory located remotely from the processor 21, and these remote memories may be connected to the processor 21 through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
  • the program instructions/modules are stored in the memory 22, and when executed by one or more processors 21, execute the method for packet identification under massive traffic in the above-mentioned embodiment 1, for example, execute the above-described FIG. 1, FIG. 3, The various steps shown in FIG. 4 , FIG. 8 and FIG. 9 .

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Computational Linguistics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A method and apparatus for packet identification under massive traffic, relating to the field of network security. The method comprises: calling dpdk to allocate regular service processing cores and special service processing cores of the CPU; loading a monitoring policy rule file and a service feature file and converting them into a rule database and a service feature database in Hyperscan format; calling dpdk to complete data access, distinguishing regular services from special services according to the service feature database and the key of each data quintuple's hash value, and putting each data quintuple into the data queue of the corresponding CPU core; decoding the data quintuples to generate the data packet corresponding to each quintuple; and, according to the Hyperscan rule database and service database, using the corresponding CPU cores to scan and match the data packets in the corresponding queues, obtaining malicious packets and processing them. The method reduces the performance cost of Hyperscan application identification and improves the efficiency with which the dpdk and Hyperscan technologies are combined.

Description

Method and apparatus for packet identification under massive traffic [Technical Field]
The present invention relates to the field of network security, and in particular to a method and apparatus for packet identification under massive traffic.
[Background Art]
With the rapid development of Internet services, Internet gateway bandwidth keeps increasing, network traffic grows rapidly, and network attacks keep growing in number and complexity. To guarantee that the Internet system provides users with a secure and stable environment, the operating state of the network service platform must be watched closely, malicious packets must be identified among the data packets of massive traffic, and the malicious packets must be blocked, reported or otherwise handled.
However, in the current network environment international gateway traffic increases day by day and the system cannot achieve full-traffic access; because operators constantly adjust network topology and expand capacity, most inter-province gateways have very few access lines and some access lines carry a very low proportion of the traffic, so monitoring points and monitored traffic are insufficient and effective monitoring cannot be achieved. Meanwhile, network attacks have grown considerably in number in recent years, attack signatures have changed substantially and monitoring complexity keeps rising, so existing malicious-behavior monitoring engines need to be upgraded to meet monitoring requirements. To monitor massive data, the data plane development kit dpdk removes the bottleneck in collecting and forwarding large volumes of data, but the feature-matching algorithms currently paired with it consume enormous resources; the AC algorithm, for example, has the major drawback that its automaton consumes a great deal of memory, and when searching Chinese strings the choice of the smallest character unit directly affects query speed and memory usage.
In view of this, how to overcome the defects of the prior art and solve the problem that existing monitoring methods cannot monitor malicious packets comprehensively and efficiently is an open issue in this technical field.
[Summary of the Invention]
In view of the above defects of, or improvement needs in, the prior art, the present invention solves the problem that, in current network monitoring systems, network traffic is huge while access ports and access traffic are limited and existing monitoring methods are inefficient, so malicious packets cannot be comprehensively monitored.
The embodiments of the present invention adopt the following technical solutions:
In a first aspect: call dpdk to allocate regular service processing cores and special service processing cores of the CPU; load a monitoring policy rule file and a service feature file, and convert them into a rule database and a service feature database in Hyperscan format; call dpdk to complete data access, distinguish regular services from special services according to the Hyperscan service feature database and the key of each data quintuple's hash value, and put each data quintuple into the data queue of the corresponding CPU core; take the data quintuples out of each CPU core's data queue one by one and decode them, generating the data packet corresponding to each quintuple; according to the Hyperscan rule database and service database, use the corresponding CPU cores to scan and match the data packets in the corresponding queues, obtain malicious packets, and process the malicious packets.
Preferably, putting each accessed data quintuple into the data queue of the corresponding CPU core according to the key of its hash value specifically includes: generating a hash table from the service feature database, the keys of the hash table being the hash-value keys corresponding to special services; judging whether the hash-value key of each data quintuple exists in the hash table; if it exists, putting the data quintuple into the corresponding position of the hash table and into the data queue of a special service processing core; if it does not exist, putting the data quintuple into the processing queue of a regular service processing core.
Preferably, using the corresponding CPU cores to scan and match the data packets in the corresponding queues further includes: judging whether the application ID of each service packet on a regular service core is a special service ID; if it is, repackaging the packet's quintuple information, application ID and a preset special service identifier, putting the repackaged data into the data queue of a special service processing core, which performs the scanning and matching; if it is not, performing the scanning and matching on the regular service processing core.
Preferably, using the corresponding CPU cores to scan and match the data packets in the corresponding queues further includes: if one rule of the rule database or service feature database contains several parallel quintuple features and string features at the same time, scanning the quintuple features and string features of the data packet in parallel and then integrating the scan results; or, if a rule of the rule database or service feature database contains multiple qualifications, scanning the quintuple features and string features of the data packet serially, one after another.
Preferably, after allocating the regular service processing cores and special service processing cores of the CPU, the method further includes: configuring the packet receiving ports and sending ports, configuring the CPU's packet receiving cores, regular service processing cores, special service processing cores and data forwarding cores, and configuring the number of memory channels; applying, for each CPU core, for a ring used to receive packet data and send packets to the service processing cores, and for one ring used to send data packets, and initializing each ring; establishing the mapping relationship among the memory pool, the rings and DMA; and starting the configured ports and rings.
Preferably, taking the data quintuples out of each CPU core's data queue one by one and decoding them specifically includes: constructing a tree structure by pre-order traversal according to the protocol levels in the data quintuple, the levels of the tree matching the protocol levels and each node of the tree being a protocol node; and decoding each node of the tree while the tree structure is being built.
Preferably, after taking the data quintuples out of each CPU core's data queue one by one and decoding them, the method further includes: judging whether the data quintuple has a corresponding session; if not, creating the corresponding session, adding it to the session hash table and submitting a timeout; if it does, updating the corresponding session hash node and judging whether the session has ended.
Preferably, using the corresponding CPU cores to scan and match the data packets in the corresponding queues further includes: using block mode if the packet is a single-packet scan of the UDP protocol; using stream mode if the packet is a whole-stream scan of TCP data.
Preferably, processing the malicious packets specifically includes: blocking the malicious packets or reporting netflow logs.
In another aspect, the present invention provides an apparatus for packet identification under massive traffic, specifically comprising at least one processor and a memory connected through a data bus, the memory storing instructions executable by the at least one processor; after being executed by the processor, the instructions are used to complete the method for packet identification under massive traffic of the first aspect.
Compared with the prior art, the beneficial effects of the embodiments of the present invention are: the identification scanning of Hyperscan steers the data distribution of dpdk, while the distribution of special data by dpdk reduces the performance cost of Hyperscan application identification, effectively improving the efficiency with which the dpdk and Hyperscan technologies are combined. In the preferred solutions, Hyperscan is used for classified compilation and identification, so that while dpdk completes large-traffic packet reception, the efficiency of rule matching is improved and the dpdk and Hyperscan technologies are combined more efficiently.
[Brief Description of the Drawings]
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a method for packet identification under massive traffic provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the top-level framework of the data access layer used by a method for packet identification under massive traffic provided by an embodiment of the present invention;
FIG. 3 is a flowchart of another method for packet identification under massive traffic provided by an embodiment of the present invention;
FIG. 4 is a flowchart of another method for packet identification under massive traffic provided by an embodiment of the present invention;
FIG. 5 is a schematic flowchart of the Hyperscan compile phase used in a method for packet identification under massive traffic provided by an embodiment of the present invention;
FIG. 6 is a schematic flowchart of the Hyperscan runtime used in a method for packet identification under massive traffic provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram of the hierarchical relationship of WTP packet decoding used in a method for packet identification under massive traffic provided by an embodiment of the present invention;
FIG. 8 is a flowchart of another method for packet identification under massive traffic provided by an embodiment of the present invention;
FIG. 9 is a flowchart of another method for packet identification under massive traffic provided by an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an apparatus for packet identification under massive traffic provided by an embodiment of the present invention.
[Detailed Description]
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention and are not intended to limit it.
The present invention is an architecture of a system with specific functions, so the specific embodiments mainly describe the functional logic relationships of each structural module and do not limit the specific software and hardware implementations.
In addition, the technical features involved in the embodiments of the present invention described below can be combined with each other as long as they do not conflict. The present invention is described in detail below with reference to the drawings and embodiments.
Some terms used in the embodiments of the present invention are explained as follows:
(1) Intel Data Plane Development Kit
Abbreviated as dpdk. It is a set of data plane development tools provided by Intel, including a collection of lib libraries and corresponding toolkits, which provide library functions and driver support for efficient user-space packet processing on the Intel architecture (IA) processor architecture. dpdk focuses on high-performance processing of data packets in network applications; it runs in user space and uses its own data plane libraries to send and receive packets, bypassing the Linux kernel protocol stack's packet processing. Traditional packet processing involves switching between kernel mode and user mode as well as multiple memory copies, which increases system overhead, and CPU-centric systems have a large processing bottleneck. dpdk technology improves packet processing performance on general-purpose servers: dpdk applications run in the operating system's user space and use the provided data plane libraries for packet I/O, bypassing the Linux kernel protocol stack to improve packet processing efficiency. Processing performance is further improved through mechanisms such as polling and interrupts, multi-threaded programming, CPU affinity, huge pages, lock-free mechanisms, cache prefetching and UIO (Userspace I/O).
(2) Hyperscan
A high-performance regular expression matching library from Intel that uses a specific syntax and working modes to ensure its practicality in real network scenarios. Hyperscan uses the SIMD instructions of Intel processors in its engines for acceleration, and users can customize the behavior taken after a match through a callback function. Since the generated database is read-only, users can share it across multiple CPU cores or threads to improve matching scalability. It is feature-rich, supports large-scale matching and streaming mode, and offers high performance and high scalability. Hyperscan can meet different usage scenarios through different matching modes (stream mode and block mode), supports matching tens of thousands to hundreds of thousands of rules, and achieves single-core performance of 3.6 Gbps to 23.9 Gbps; as the number of cores increases, matching performance grows roughly linearly.
(3) Snort
A network sniffer that can capture and analyze packets on the network and respond to and process them according to defined rules. After analyzing captured packets against the rules, five response mechanisms can be applied according to the rule chain: Activation (alert and activate another dynamic rule chain), Dynamic (invoked by other rule packages), Alert, Pass (ignore) and Log (record network traffic without alerting). Snort provides packet sniffing, packet analysis, packet detection, response processing and other functions; each module implements a different function and is combined with Snort as a plug-in, which makes function extension convenient. For example, preprocessing plug-ins run before rule matching and misuse detection and complete IP fragment reassembly, http decoding, telnet decoding and so on; processing plug-ins complete functions such as checking each protocol field, closing connections and responding to attacks; output plug-ins output the processed results as logs or warnings.
(4) Perl Compatible Regular Expressions
Abbreviated as PCRE, a Perl-compatible regular expression library.
(5) Quintuple
A communication term, usually referring to the source IP address, source port, destination IP address, destination port and transport layer protocol.
(6) receive side scaling
Abbreviated as RSS. A load distribution method proposed by Microsoft: the 2/3/4-tuple hash value of the network and transport layers of a packet is computed, and the least significant bits of the hash value are used to index an indirection table.
(7) ring
Intel x86 processors perform access control through ring levels, which are divided into four layers: ring0, ring1, ring2 and ring3. Ring0 has the highest privilege and ring3 the lowest.
(8) Memory pool
Memory Pool, abbreviated as mempool. A memory allocation method, also known as fixed-size-blocks allocation. When memory is requested directly through APIs such as new and malloc, the variable size of the requested blocks causes a large amount of memory fragmentation and reduced performance under frequent use. A memory pool instead pre-allocates a certain number of equal-sized memory blocks before the memory is actually used; when there is a new memory demand, blocks are taken from the pool, and new memory is requested only if the pool runs out, which improves memory allocation efficiency.
(9) Direct Memory Access
Abbreviated as DMA. This mechanism allows hardware devices of different speeds to communicate without relying on a heavy CPU interrupt load. A DMA transfer copies data from one address space to another; the CPU initiates the transfer, but the transfer itself is performed and completed by the DMA controller. For example, when a block of external memory is moved to faster memory inside the chip, the processor can handle other tasks at the same time. DMA transfers play an important role in high-performance embedded system algorithms and networks.
Embodiment 1:
At present, Internet gateway bandwidth keeps increasing, network traffic grows rapidly, and network attacks keep growing in number and complexity. Packets under massive traffic therefore need to be matched quickly and in large volumes to find possible malicious packets among them. In this embodiment, dpdk is combined with hyperscan, which satisfies both large-traffic data collection and efficient feature string matching and improves the processing efficiency of large traffic.
As shown in FIG. 1, the specific steps of the method for identifying and blocking different types of packets in massive traffic provided by the embodiment of the present invention are as follows.
Step 101: call dpdk and allocate the regular service processing cores and special service processing cores of the CPU.
In the packet identification method provided by this embodiment, a data access layer is introduced between the network interface layer (network card driver) and the monitoring application to complete packet distribution, packet sharing, device isolation and data isolation. The data distribution function allows the system to use data concurrency more conveniently to process packets in parallel and improves the user's monitoring application. Packet sharing allows multiple processes to share data packets and supports debugging and comparison tools such as TCPDump, through which the packets transmitted in the network can be fully captured for analysis, with filtering by network layer, protocol, host, network or port and logical statements such as and, or and not to remove useless information. Further, an independent data access layer can also support other independent service programs similar to the monitoring application. Device isolation allows the system to ignore differences in network card hardware and thus to be extended to non-X86 platforms. Data isolation effectively protects the system kernel and prevents device failures caused by abnormal operations of service programs.
To implement the functions of the data access layer, as shown in FIG. 2, the top-level framework of the data access layer used in this embodiment needs modules such as a receiving interface, a sending interface, a receive queue pool and a send queue pool, where the receive queue pool manages all receive queue rings and the send queue pool manages all send queue rings.
To establish the data access layer and match each functional module of the data access layer to the actual hardware devices, as shown in FIG. 3, the initialization configuration can be performed through the following steps.
In specific service processing scenarios, depending on the service type, besides regular services there are also service packets that need special handling, for example packets that match the features of the monitoring policy rules but belong to normal services, packets that occupy the CPU for a long time, or packets whose processing requires special CPU settings. In this embodiment, in order to process regular services and special services separately, when the CPU is initialized the CPU cores are not only grouped into the usual packet receiving cores, data forwarding cores and service processing cores, but the service processing cores are further divided into two groups, regular service processing cores and special service processing cores; the regular service processing cores handle regular services and the special service processing cores handle special services.
Step 201: configure the packet receiving ports and sending ports, configure the CPU's packet receiving cores, regular service processing cores, special service processing cores and data forwarding cores, and configure the number of memory channels.
Step 202: apply, for each CPU core, for a ring used to receive packet data and send packets to the service processing cores, and for one ring used to send data packets, and initialize each ring.
Step 203: establish the mapping relationship among the memory pool, the rings and DMA.
Step 204: start the configured ports and rings.
In the specific implementation scenario of this embodiment, as shown in FIG. 4, the specific initialization configuration can be performed according to the following steps. The following configuration is only an example of one scenario; in different scenarios, the actual parameters depend on the size of the data stream and on the serial numbers and numbers of packet receiving cores, processing cores and forwarding cores.
Step 301: configure port1 and port2 as packet receiving ports and port3 and port4 as sending ports; use CPUs 1-2 as packet receiving cores, CPUs 3-4 as data forwarding cores, CPUs 5 and 7 as regular service processing cores and CPUs 6 and 8 as special service processing cores; set the number of memory channels to 4.
Step 302: each core applies for one receive queue ring for packet reception and one forwarding ring for sending packets to the service processing cores.
Step 303: initialize and configure the packet receiving ports port1 and port2, bind the service forwarding cores, and initialize the receive queue rings.
Step 304: establish the mapping relationship among the mempool, the receive queue rings, the forwarding rings and DMA, and start port1 and port2.
Step 305: apply for a send queue ring.
Step 306: configure port3 and port4, bind the data forwarding cores, and initialize the send queue ring.
Step 307: establish the mapping relationship among the mempool, the send queue ring and DMA, and start port3 and port4.
Further, to improve CPU processing efficiency, the receive queue rings, forwarding rings and send queue rings used in steps 301-307 are all lock-free queues: when multiple CPUs operate on the same queue, an enqueueing thread and a dequeueing thread can run concurrently without any locking, keeping the threads safe under high concurrency and improving CPU processing efficiency.
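As a concrete illustration of steps 201-204 and 301-307, the following is a minimal C sketch of how such an initialization could look with the standard dpdk APIs (rte_pktmbuf_pool_create, rte_ring_create, rte_eth_dev_configure); the pool size, ring size, names and port numbering are illustrative assumptions, not values fixed by this embodiment.

    /* Minimal sketch of the initialization in steps 301-307; sizes, names and
     * port ids are illustrative assumptions. */
    #include <stdio.h>
    #include <rte_eal.h>
    #include <rte_lcore.h>
    #include <rte_ethdev.h>
    #include <rte_mempool.h>
    #include <rte_mbuf.h>
    #include <rte_ring.h>

    #define NB_MBUF   (1 << 16)
    #define RING_SIZE 4096

    static struct rte_mempool *mbuf_pool;
    static struct rte_ring *rx_ring[RTE_MAX_LCORE];   /* receive queue ring per core */
    static struct rte_ring *fwd_ring[RTE_MAX_LCORE];  /* forwarding ring per core    */

    static int init_dataplane(int argc, char **argv)
    {
        if (rte_eal_init(argc, argv) < 0)             /* e.g. "-l 1-8 -n 4" => 4 memory channels */
            return -1;

        /* One mbuf pool shared by all ports, backed by hugepages and usable for DMA. */
        mbuf_pool = rte_pktmbuf_pool_create("pkt_pool", NB_MBUF, 256, 0,
                                            RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
        if (mbuf_pool == NULL)
            return -1;

        /* Per-core single-producer/single-consumer rings: the lock-free queues of step 302. */
        unsigned int lcore;
        RTE_LCORE_FOREACH(lcore) {
            char name[32];
            snprintf(name, sizeof(name), "rx_ring_%u", lcore);
            rx_ring[lcore] = rte_ring_create(name, RING_SIZE, rte_socket_id(),
                                             RING_F_SP_ENQ | RING_F_SC_DEQ);
            snprintf(name, sizeof(name), "fwd_ring_%u", lcore);
            fwd_ring[lcore] = rte_ring_create(name, RING_SIZE, rte_socket_id(),
                                              RING_F_SP_ENQ | RING_F_SC_DEQ);
        }

        /* port1/port2 as receiving ports, port3/port4 as sending ports (steps 303-307). */
        struct rte_eth_conf port_conf = {0};
        for (uint16_t port = 1; port <= 4; port++) {
            if (rte_eth_dev_configure(port, 1, 1, &port_conf) < 0)
                return -1;
            rte_eth_rx_queue_setup(port, 0, 1024, rte_eth_dev_socket_id(port), NULL, mbuf_pool);
            rte_eth_tx_queue_setup(port, 0, 1024, rte_eth_dev_socket_id(port), NULL);
            rte_eth_dev_start(port);
        }
        return 0;
    }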
Step 102: load the monitoring policy rule file and the service feature file, and convert them into a rule database and a service feature database in Hyperscan format.
To identify malicious packets that may exist among the service packets, the monitoring policy rules for malicious packets need to be obtained; packets matching the features of the monitoring policy rules are identified as malicious packets. On the other hand, to classify regular service packets and special service packets, this embodiment also identifies service packets according to the service feature rules in the service feature file; a service feature may be the application ID feature of a service, the sending/receiving port numbers of a service, and so on.
To facilitate matching with Hyperscan, the monitoring policy rule file and service feature file used in this embodiment are written in snort syntax, and the rule database uses hs_database_t. For the two data types, single UDP packets and whole TCP data streams, the rule database and service feature database may use a stream-mode identification rule base or a block-mode identification rule base, or both at the same time. The identification rule base stores each regular expression rule and its corresponding event, where the event is the event id reported when an identification hit occurs; the rule database and the service feature database are shared by all service processing threads in this embodiment.
Hyperscan is based on automata theory and its workflow is mainly divided into two parts: compile time and run-time.
(1) Compile time
Hyperscan comes with a regular expression compiler written in C++. As shown in FIG. 5, it takes regular expressions as input and, for different CPU core architecture platforms, user-defined modes and special syntaxes, generates the corresponding database through a complex graph analysis and optimization process. The generated database can also be serialized and kept in memory for the runtime to load.
(2) Run-time
The Hyperscan runtime is developed in C. FIG. 6 shows the main runtime flow: the user pre-allocates a block of memory to store temporary matching state, and then uses the compiled database to call Hyperscan's internal matching engines (NFA, DFA, etc.) to pattern-match the input. Hyperscan uses the SIMD instructions of Intel processors in its engines for acceleration, and users can customize the behavior taken after a match occurs through a callback function. Since the generated database is read-only, users can share it across multiple CPU cores or threads to improve matching scalability.
Specifically, hs_compile is called to compile a single regular expression rule; its prototype is:
hs_error_t hs_compile(const char *expression, unsigned int flags,
                      unsigned int mode, const hs_platform_info_t *platform,
                      hs_database_t **db, hs_compile_error_t **error);
The parameters are defined as follows: expression is the regular expression string; flags controls the regular expression behavior, for example ignoring case or making the "." symbol match newlines; mode determines the format of the generated database, mainly BLOCK, STREAM and VECTOR, and the database of each mode can only be used by the corresponding scan interface; platform specifies the CPU characteristics of the database's target platform, a NULL value meaning that the target platform is the current platform; db receives the compiled database; error receives error information.
hs_compile_multi is called to compile rules from multiple regular expressions; its prototype is:
hs_error_t hs_compile_multi(const char *const *expressions, const unsigned int *flags,
                            const unsigned int *ids, unsigned int elements,
                            unsigned int mode, const hs_platform_info_t *platform,
                            hs_database_t **db, hs_compile_error_t **error);
The parameters are defined as follows: expressions is an array of regular expression strings; flags and ids are the flag and id arrays corresponding to the expressions; elements is the number of expression strings; the remaining parameters are defined as for hs_compile.
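The following is a minimal sketch of how a small rule set could be compiled with hs_compile_multi into the databases used above; the patterns, flags and event ids are invented for illustration and are not the rules of this embodiment.

    /* Sketch: compiling a few snort-style patterns into one Hyperscan database.
     * Patterns, flags and event ids are illustrative only. */
    #include <stdio.h>
    #include <hs/hs.h>

    static hs_database_t *build_db(unsigned int mode)        /* HS_MODE_BLOCK or HS_MODE_STREAM */
    {
        const char *const exprs[]  = { "\\x24\\x26\\x22", "cmd\\.exe", "union\\s+select" };
        const unsigned int flags[] = { HS_FLAG_DOTALL, HS_FLAG_CASELESS, HS_FLAG_CASELESS };
        const unsigned int ids[]   = { 1001, 1002, 1003 };   /* event ids reported on a hit */
        hs_database_t *db = NULL;
        hs_compile_error_t *err = NULL;

        if (hs_compile_multi(exprs, flags, ids, 3, mode, NULL, &db, &err) != HS_SUCCESS) {
            fprintf(stderr, "compile failed: %s\n", err->message);
            hs_free_compile_error(err);
            return NULL;
        }
        return db;                  /* read-only, so it can be shared by all service threads */
    }

Typically one block-mode database (build_db(HS_MODE_BLOCK)) and one stream-mode database (build_db(HS_MODE_STREAM)) would be built and then shared read-only across the service processing cores.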
Step 103: call dpdk to complete data access, distinguish regular services from special services according to the Hyperscan service feature database and the key of each data quintuple's hash value, and put each data quintuple into the data queue of the corresponding CPU core.
By calling the dpdk development kit, packet receiving cores 1-2 complete the access of large-traffic data and enqueue the received data (inqueue). To ensure that data of the same quintuple is delivered to the same receiving core and to avoid the loss of processing efficiency caused by switching cores for different parts of a quintuple during matching, when the input is placed into the data queues, data belonging to the same quintuple must be put into the receive queue ring of the same receiving CPU core. In the specific implementation scenario of this embodiment, dpdk's soft hash algorithm (RSS) is used to achieve "same source, same destination", guaranteeing the singleness and integrity of the data flow on each core.
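The "same source, same destination" distribution can be illustrated with dpdk's software Toeplitz hash (rte_softrss); the symmetric key and the modulo core selection below are assumptions made for illustration, not necessarily the exact key used by the embodiment.

    /* Sketch: software RSS over the quintuple so that both directions of a flow
     * hash to the same receive queue ring. Key and core selection are assumptions. */
    #include <rte_thash.h>

    static unsigned int pick_rx_core(uint32_t sip, uint32_t dip,
                                     uint16_t sport, uint16_t dport,
                                     unsigned int nb_rx_cores)
    {
        /* Repeating 0x6d5a yields a symmetric Toeplitz hash: A->B and B->A collide. */
        static const uint8_t rss_key[40] = {
            0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
            0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
            0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
            0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
        };
        struct rte_ipv4_tuple t = { .src_addr = sip, .dst_addr = dip };
        t.sport = sport;
        t.dport = dport;

        uint32_t hash = rte_softrss((uint32_t *)&t, RTE_THASH_V4_L4_LEN, rss_key);
        return hash % nb_rx_cores;   /* index of the receive queue ring / core to use */
    }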
After data access, service packets are classified according to the service feature database to distinguish regular service packets from special service packets. In a specific implementation, to increase classification efficiency and reduce its computational complexity, a hash table can be used: a hash table is generated from the service feature database, its keys being the hash-value keys corresponding to special services; the hash-value key of each data quintuple is checked against the table; if the key exists in the table, the data quintuple is a special service, so it is put into the corresponding position of the hash table and into the receive queue ring of a special service processing core to await processing there; if it does not exist, the data quintuple is a regular service and is put into the receive queue ring of a regular service processing core for processing there. Once the data quintuples of regular and special services have been stored into the receive queue rings of different CPU cores, the distribution of the different types of service packets is complete; distinguishing regular services from special services reduces the performance cost of hyperscan application identification and effectively improves the efficiency with which the dpdk and hyperscan technologies are combined.
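A minimal sketch of this hash-table classification using dpdk's rte_hash follows; the table size and the two enqueue helpers are illustrative assumptions.

    /* Sketch: deciding whether a quintuple belongs to a special service by looking
     * it up in a hash table built from the service feature database. */
    #include <rte_hash.h>
    #include <rte_jhash.h>
    #include <rte_mbuf.h>

    struct quintuple { uint32_t sip, dip; uint16_t sport, dport; uint8_t proto; };

    /* Hypothetical helpers: enqueue onto the ring of a special / regular service core. */
    void enqueue_to_special_core(struct rte_mbuf *pkt);
    void enqueue_to_regular_core(struct rte_mbuf *pkt);

    static struct rte_hash *special_tbl;        /* keys come from the service feature database */

    static int init_special_table(void)
    {
        struct rte_hash_parameters p = {
            .name = "special_services",
            .entries = 1 << 16,
            .key_len = sizeof(struct quintuple),
            .hash_func = rte_jhash,
            .hash_func_init_val = 0,
            .socket_id = 0,
        };
        special_tbl = rte_hash_create(&p);
        return special_tbl ? 0 : -1;
    }

    /* Called by a receiving core for every dequeued quintuple. */
    static void dispatch(const struct quintuple *q, struct rte_mbuf *pkt)
    {
        if (rte_hash_lookup(special_tbl, q) >= 0)   /* key present => special service */
            enqueue_to_special_core(pkt);
        else                                        /* otherwise regular service      */
            enqueue_to_regular_core(pkt);
    }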
Step 104: take the data quintuples out of each CPU core's data queue one by one and decode them, generating the data packet corresponding to each quintuple.
After the data quintuples of regular and special services have been stored into the receive queue rings of different CPU cores, the regular service processing cores and the special service processing cores each cyclically read the data in their own receive queue rings and process it in parallel.
The data dequeued from each CPU core's receive queue ring also needs fast decoding of the four protocol layers (link layer, network layer, transport layer and application layer); the decoded information is mainly used for the subsequent matching of monitoring rules. In the specific implementation scenario of this embodiment, a tree structure is built by pre-order traversal according to the protocol levels in the data quintuple, the levels of the tree matching the protocol levels and each node of the tree being a protocol node; while the tree structure is being built, each node of the tree is decoded.
As shown in FIG. 8, the specific decoding steps are:
Step 401: build a tree composed of protocol nodes, that is, a tree structure, from the protocol nodes of each layer.
Step 402: expand the tree layer by layer according to the protocol nodes.
Step 403: while building the tree, traverse each node of the tree in pre-order.
Step 404: complete the decoding of each protocol layer through the traversal.
In the above steps, step 403 and step 404 are completed synchronously; that is, each time a node is constructed, its decoding and its traversal are completed at the same time, without waiting for the tree to be fully built before traversing, which improves decoding efficiency.
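A small sketch of this "decode while building" pre-order traversal (steps 401-404) follows; the node structure and the per-protocol decoder are illustrative assumptions.

    /* Sketch: the protocol tree is decoded as it is built, so no second traversal
     * pass is needed. decode_one() stands for the per-protocol decoders. */
    #include <stddef.h>

    struct proto_node {
        int proto_id;                        /* ETHER, IP, TCP, HTTP, ...        */
        const unsigned char *data;           /* start of this protocol's header  */
        size_t len;
        struct proto_node *child[4];
        int nb_child;
    };

    /* Hypothetical per-protocol decoder: fills the node's fields and returns the
     * encapsulated child protocol nodes it discovered (step 402). */
    int decode_one(struct proto_node *n, struct proto_node *children[], int max_children);

    static void build_and_decode(struct proto_node *n)        /* steps 401, 403, 404 */
    {
        struct proto_node *children[4];

        n->nb_child = decode_one(n, children, 4);             /* decode on creation  */

        for (int i = 0; i < n->nb_child; i++) {               /* pre-order: children next */
            n->child[i] = children[i];
            build_and_decode(children[i]);
        }
    }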
Specifically, the fields to be decoded include the protocol variable data of protocols such as ETHER, ICMP, IP, TCP, UDP, DNS, HTTP, SMTP, POP3, IMAP and WTP. FIG. 7 shows the hierarchical relationship of WTP packet decoding. No fewer than 300 fields are decoded. The implementation provided by this embodiment can support decoding of the IPv4 and IPv6 protocols.
Further, to maintain the data carried by TCP and UDP, quintuple-based session management is also performed on the received data while it is decoded, so that the life cycle of the data carried by the packets can be maintained. Specifically, session management includes session creation, update, aging and closing. To maintain the life cycle of quintuple-based sessions more precisely and conveniently, hardware timers can be used.
In the specific implementation scenario of this embodiment, quintuple-based session management is maintained through a hash table whose key is the quintuple information: judge whether the data quintuple has a corresponding session; if not, create the corresponding session, add it to the session hash table and submit a timeout; if it does, update the corresponding session hash node and judge whether the session has ended.
As shown in FIG. 9, session management can be performed through the following steps:
Step 501: decode and output the data in the receive queue ring, and look up whether the same node exists in the hash table.
Step 502: when the lookup result is empty, create a new hash node and submit a timer at the same time.
Step 503: when the same node is found, update the existing node and judge whether the current session has ended.
Step 504: if the current session has ended, delete the timer and close the session.
Step 505: if the current session has not ended, only update the session node information.
Step 506: when the timer returns, check whether the current state of the session has been updated since it was submitted.
Step 507: if the session has not been updated, delete the timer and close the session.
Step 508: if the session state has been updated, update the session node information and resubmit the timer until the next timeout arrives.
Specifically, in step 503 the basis for judging whether a TCP session has ended is whether the current session contains a FIN or RST packet.
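The session life cycle of steps 501-508 can be sketched with a quintuple-keyed rte_hash plus a dpdk timer per session; the 60-second timeout, the helper names and the reuse of the quintuple struct from the earlier sketch are assumptions, and rte_timer_manage() would also have to be called periodically on the processing cores for the callbacks to fire.

    /* Sketch of quintuple-based session management (steps 501-508).
     * Timeout length and helpers are illustrative assumptions. */
    #include <stdlib.h>
    #include <rte_hash.h>
    #include <rte_timer.h>
    #include <rte_cycles.h>
    #include <rte_lcore.h>

    struct session {
        struct quintuple key;                 /* quintuple struct from the earlier sketch */
        uint32_t hit_event_id;                /* eventID of a matched monitoring rule     */
        struct rte_timer timeout;
        int updated_since_submit;
    };

    extern struct rte_hash *session_tbl;      /* hash table keyed by the quintuple */
    #define SESSION_TIMEOUT (rte_get_timer_hz() * 60)

    static void on_timeout(struct rte_timer *t, void *arg)
    {
        struct session *s = arg;
        if (!s->updated_since_submit) {       /* steps 506-507: idle since submit => close */
            rte_hash_del_key(session_tbl, &s->key);
            free(s);
        } else {                              /* step 508: still active => resubmit timer  */
            s->updated_since_submit = 0;
            rte_timer_reset(t, SESSION_TIMEOUT, SINGLE, rte_lcore_id(), on_timeout, s);
        }
    }

    static struct session *lookup_or_create(const struct quintuple *q)
    {
        struct session *s = NULL;
        if (rte_hash_lookup_data(session_tbl, q, (void **)&s) < 0) {    /* step 502 */
            s = calloc(1, sizeof(*s));
            s->key = *q;
            rte_timer_init(&s->timeout);
            rte_hash_add_key_data(session_tbl, q, s);
            rte_timer_reset(&s->timeout, SESSION_TIMEOUT, SINGLE,
                            rte_lcore_id(), on_timeout, s);
        }
        s->updated_since_submit = 1;          /* steps 503/505: update node information */
        return s;
    }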
Step 105: according to the Hyperscan rule database and service database, use the corresponding CPU cores to scan and match the data packets in the corresponding queues, obtain malicious packets, and process the malicious packets.
To identify different types of data packets, Hyperscan also needs to be used for feature scanning. In specific implementations, different scanning and matching methods can be selected according to the characteristics of the rules to improve the efficiency of feature scanning.
(1) If one rule of the rule database or service feature database contains several parallel quintuple features and string features at the same time, the quintuple features and string features of the data packet in the rule are scanned in parallel and the scan results are then integrated. For example, the monitoring policy udp.dstport<=1025&&udp.dstport>=1023&&udp.payload.len>991 contains three parallel simple rule statements; when the data stream is scanned against the rule database, the three statements of this policy are scanned simultaneously, the matching result of each statement is obtained, and the final result is then computed as (result of statement 1) && (result of statement 2) && (result of statement 3), which reduces rule scanning time.
(2) If a rule of the rule database or service feature database contains multiple qualifications, the quintuple features and string features of the data packet are scanned serially, looking up the statements of the rule one by one from left to right. For example, for udp.payload=="\xb9\x21\x01"&&tcp.payload=="\x24\x26\x22", when the data stream is scanned against the measurement rule database and the service rule database, udp.payload and tcp.payload are scanned in turn and the final result is computed by postfix expression.
Further, on the basis of quintuple session management, hyperscan application identification scanning also needs to be performed on the regular service processing cores: judge whether the application ID of each service packet on a regular service core is a special service ID; if it is, repackage the packet's quintuple information, application ID and a preset special service identifier, put the repackaged data into the data queue of a special service processing core, and let the special service processing core perform the scanning and matching; if it is not, let the regular service processing core perform the scanning and matching. When the application ID of a service packet is a special service in the service database, the service packet is re-encapsulated into a data packet containing the special service identifier, the application ID and the quintuple information of the original service packet, and this data packet is sent back to the receive queue corresponding to the dpdk receiving cores (cores 1 and 2); when a packet carrying the special identifier is retrieved from the dpdk receive queue, the quintuple information in it is extracted, its quintuple hash value is computed through dpdk's RSS algorithm and updated into the special-service tuple hash table; after the data receiving cores (cores 1 and 2) receive data, they look up the quintuple information in the hash table, and when the lookup returns true the data is distributed to the queues of the special service processing cores (cores 6 and 8) for separate processing, with the application ID assigned directly, which removes the subsequent hyperscan application identification step. In this processing method, the identification scanning of hyperscan steers the data distribution of dpdk, while the distribution of special data by dpdk reduces the performance cost of hyperscan application identification.
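A short sketch of this re-packaging step follows; the marker value, the structure layout and the ring argument are illustrative assumptions.

    /* Sketch: a regular service core reports a flow whose application ID turned out
     * to be a special service, so that later packets of this flow are dispatched to
     * the special service cores without re-running hyperscan identification. */
    #include <stdlib.h>
    #include <rte_ring.h>

    #define SPECIAL_SVC_MAGIC 0x53504543u      /* illustrative preset special service identifier */

    struct special_notify {
        uint32_t magic;                        /* preset special service identifier        */
        uint32_t app_id;                       /* application ID identified by hyperscan   */
        struct quintuple tuple;                /* quintuple of the original service packet */
    };

    static int report_special_flow(struct rte_ring *rx_core_ring,   /* ring of receive core 1/2 */
                                   uint32_t app_id, const struct quintuple *q)
    {
        struct special_notify *n = malloc(sizeof(*n));
        if (n == NULL)
            return -1;
        n->magic  = SPECIAL_SVC_MAGIC;
        n->app_id = app_id;
        n->tuple  = *q;
        /* The receive core recognizes the marker, computes the RSS hash of the tuple,
         * inserts it into the special-service tuple hash table and frees the notification. */
        return rte_ring_enqueue(rx_core_ring, n);
    }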
After the received packets are decoded, on the basis of session management and after confirming that the latest monitoring policy rule base has been loaded, the monitoring policy is scanned and matched and the matching result is output. In specific usage scenarios, different Hyperscan matching modes are chosen according to the protocol: for example, block mode is used for a single-packet scan of the UDP protocol, and stream mode is used for a whole-stream scan of TCP data.
In the actual scenario of this embodiment, before matching, the temporary data (scratch) needed for each match is allocated first by calling hs_alloc_scratch(database, &scratch) to allocate temporary space for each database.
For single packet data carried by the UDP protocol, block mode is used and hs_scan is called for matching; its prototype is:
hs_error_t hs_scan(const hs_database_t *db, const char *data, unsigned int length,
                   unsigned int flags, hs_scratch_t *scratch,
                   match_event_handler onEvent, void *context);
The parameters are defined as follows: db is the database compiled in step 103; data and length are the data to be matched and its length; flags is reserved to control function behavior in future versions and is not currently used; scratch is the temporary data used during matching, already allocated; onEvent is the callback function invoked on a match, through which the user can customize the behavior taken after matching; context is a user-defined pointer.
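A minimal usage sketch of the block-mode scan for a single UDP payload follows; the callback and the way the hit event id is returned are illustrative.

    /* Sketch: block-mode scan of one UDP payload against the compiled database. */
    #include <hs/hs.h>

    static int on_match(unsigned int id, unsigned long long from,
                        unsigned long long to, unsigned int flags, void *ctx)
    {
        *(unsigned int *)ctx = id;       /* remember the hit event id */
        return 0;                        /* 0 = continue scanning     */
    }

    /* scratch is pre-allocated once per thread with hs_alloc_scratch(block_db, &scratch). */
    static unsigned int scan_udp_payload(const hs_database_t *block_db, hs_scratch_t *scratch,
                                         const char *payload, unsigned int len)
    {
        unsigned int hit_id = 0;
        if (hs_scan(block_db, payload, len, 0, scratch, on_match, &hit_id) != HS_SUCCESS)
            return 0;
        return hit_id;                   /* 0 means no monitoring rule was hit */
    }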
For data carried by the TCP protocol, stream mode matching is used; since stream mode is required, the stream must be opened before matching:
hs_error_t hs_open_stream(const hs_database_t *db,
                          unsigned int flags,
                          hs_stream_t **stream);
The parameters are defined as follows: db is the pre-compiled pattern database; flags is a flag modifying the stream behavior, provided for future use and currently unused; stream returns, on success, a pointer to the generated hs_stream_t, and is NULL on failure.
After the stream has been opened, hs_scan_stream is called for matching; its prototype is:
hs_error_t hs_scan_stream(hs_stream_t *id, const char *data, unsigned int length,
                          unsigned int flags, hs_scratch_t *scratch,
                          match_event_handler onEvent, void *context);
The parameters are defined as follows: id is the hs_stream_t pointer corresponding to the stream to which the data belongs; the remaining parameters are the same as for hs_scan.
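A matching sketch for the stream-mode case: one hs_stream_t is kept per TCP session (for example inside its session hash node) and fed with each payload as it arrives; the context structure and the reuse of the block-mode callback are assumptions.

    /* Sketch: cross-packet scanning of a TCP flow in stream mode. */
    #include <hs/hs.h>

    struct tcp_scan_ctx {                    /* kept in the session's hash node */
        hs_stream_t *stream;
        unsigned int hit_id;
    };

    /* on_match() is the same callback used in the block-mode sketch above. */
    int on_match(unsigned int id, unsigned long long from,
                 unsigned long long to, unsigned int flags, void *ctx);

    static int tcp_flow_open(const hs_database_t *stream_db, struct tcp_scan_ctx *c)
    {
        c->hit_id = 0;
        return hs_open_stream(stream_db, 0, &c->stream);
    }

    static int tcp_flow_data(struct tcp_scan_ctx *c, hs_scratch_t *scratch,
                             const char *payload, unsigned int len)
    {
        return hs_scan_stream(c->stream, payload, len, 0, scratch, on_match, &c->hit_id);
    }

    static int tcp_flow_close(struct tcp_scan_ctx *c, hs_scratch_t *scratch)
    {
        /* Flushes matches pending at end of stream and releases the stream state. */
        return hs_close_stream(c->stream, scratch, on_match, &c->hit_id);
    }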
Hyperscan's stream mode enables cross-packet scanning of TCP data and can complete cross-packet protocol identification without reassembling the TCP data stream, but it needs to cache the context of the TCP data stream and maintain it through the hash table. After the payload of the current data packet has been scanned and a hit rule determined, the corresponding eventID is looked up in the identification library according to the hit identification rule and saved in the session's hash node. The hit event ID corresponds to the malicious packet in the data stream that hit the monitoring policy rule.
Further, to improve matching efficiency, in both block mode and stream mode the parameter flag must be set to HS_FLAG_DOALL during the matching search to avoid degrading hyperscan performance.
After the packets in the data queues have been scanned and matched in step 105, that is, after the corresponding eventID has been looked up in the identification library according to the hit identification rule, the malicious packets that hit the monitoring policy rules also need to be blocked or reported to the netflow log.
According to the actual needs of the implementation scenario, the blocking scheme or the reporting scheme can be selected, and other processing schemes for malicious packets can also be chosen.
(1) Blocking scheme: after the current data packet hits the monitoring policy, RST packets are immediately built according to the quintuple information of the original data stream, two RST packets in total, sent to the client and the server respectively, and the packets are enqueued (inqueue) to the dpdk send queue at the same time; after reading the packets, the data forwarding cores (cores 5-6) send them to the protocol stack through the configured sending ports; after the client and the server receive the RST packets they automatically close the session, the stream ends, and the blocking effect is achieved.
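The following is a sketch of how one of the two RST packets in the blocking scheme could be built from the flow's quintuple with dpdk's packet headers; MAC addressing, sequence numbers and TCP checksum handling are simplified assumptions and would be filled in from the original flow before the packet is enqueued for transmission.

    /* Sketch: building a minimal IPv4/TCP RST from the quintuple of the hit flow. */
    #include <string.h>
    #include <netinet/in.h>
    #include <rte_mbuf.h>
    #include <rte_ip.h>
    #include <rte_tcp.h>

    static struct rte_mbuf *build_rst(struct rte_mempool *pool,
                                      uint32_t sip, uint32_t dip,        /* network byte order */
                                      uint16_t sport, uint16_t dport,    /* network byte order */
                                      uint32_t seq)
    {
        struct rte_mbuf *m = rte_pktmbuf_alloc(pool);
        if (m == NULL)
            return NULL;

        uint16_t plen = sizeof(struct rte_ipv4_hdr) + sizeof(struct rte_tcp_hdr);
        struct rte_ipv4_hdr *ip = (struct rte_ipv4_hdr *)rte_pktmbuf_append(m, plen);
        struct rte_tcp_hdr *tcp = (struct rte_tcp_hdr *)(ip + 1);
        memset(ip, 0, plen);

        ip->version_ihl   = 0x45;                      /* IPv4, 20-byte header */
        ip->total_length  = rte_cpu_to_be_16(plen);
        ip->time_to_live  = 64;
        ip->next_proto_id = IPPROTO_TCP;
        ip->src_addr      = sip;
        ip->dst_addr      = dip;
        ip->hdr_checksum  = rte_ipv4_cksum(ip);

        tcp->src_port  = sport;
        tcp->dst_port  = dport;
        tcp->sent_seq  = rte_cpu_to_be_32(seq);
        tcp->data_off  = 0x50;                         /* 20-byte TCP header   */
        tcp->tcp_flags = RTE_TCP_RST_FLAG;
        /* Ethernet header and TCP checksum are added before the packet is handed
         * to the dpdk send queue (e.g. via rte_ipv4_udptcp_cksum or TX offload). */
        return m;
    }

Two such packets would be built, one toward the client and one toward the server, and enqueued to the dpdk send queue read by the data forwarding cores.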
(2) Netflow log reporting scheme: on the basis of session management, after a session ends normally or times out, a session-level view of the network traffic is provided in the pre-delivered netflow custom format, the information of each TCP/IP transaction is recorded, the reported field information is organized according to the template information, and the flow log information is reported in TLV-fragmented format; the packets are enqueued (inqueue) to the dpdk send queue, and after reading them the data forwarding core reports them to the monitoring duty station via socket communication. The advantage of the netflow log reporting scheme is that statistics are output based on a template, which makes it convenient to add data fields to be output and supports a variety of new netflow functions, with flexible use and strong extensibility; specifically, the netflow V9 custom format can be used.
By combining DPDK with Hyperscan, the method for identifying and blocking malicious packets in massive traffic provided by this embodiment makes full use of DPDK's packet processing performance and Hyperscan's efficient and flexible regular expression matching capability, provides a fast and accurate rule matching method usable in massive traffic scenarios, realizes comprehensive and rapid monitoring of massive data packets, effectively monitors various network attack behaviors, and meets the goal of comprehensively monitoring and protecting network security and achieving secure and stable network operation.
Embodiment 2:
On the basis of the method for packet identification under massive traffic provided in Embodiment 1, the present invention also provides an apparatus for packet identification under massive traffic that can be used to implement the above method; FIG. 10 is a schematic diagram of the apparatus architecture of an embodiment of the present invention. The apparatus for packet identification under massive traffic of this embodiment includes one or more processors 21 and a memory 22; one processor 21 is taken as an example in FIG. 10.
The processor 21 and the memory 22 may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 10.
As a non-volatile computer-readable storage medium for the method of packet identification under massive traffic, the memory 22 can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the method for packet identification under massive traffic of Embodiment 1. By running the non-volatile software programs, instructions and modules stored in the memory 22, the processor 21 executes the various functional applications and data processing of the apparatus for packet identification under massive traffic, that is, implements the method for packet identification under massive traffic of Embodiment 1.
The memory 22 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device or other non-volatile solid-state storage device. In some embodiments, the memory 22 may optionally include memory located remotely from the processor 21, and such remote memory may be connected to the processor 21 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
The program instructions/modules are stored in the memory 22 and, when executed by the one or more processors 21, perform the method for packet identification under massive traffic of Embodiment 1 above, for example, the steps shown in FIG. 1, FIG. 3, FIG. 4, FIG. 8 and FIG. 9 described above.
A person of ordinary skill in the art can understand that all or part of the steps of the various methods of the embodiments can be completed by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.
The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention.

Claims (10)

  1. A method for packet identification under massive traffic, characterized by:
    calling dpdk and allocating regular service processing cores and special service processing cores of a CPU;
    loading a monitoring policy rule file and a service feature file, and converting the monitoring policy rule file and the service feature file into a rule database and a service feature database in Hyperscan format;
    calling dpdk to complete data access, distinguishing regular services from special services according to the Hyperscan service feature database and the key of each data quintuple's hash value, and putting each data quintuple into the data queue of the corresponding CPU core;
    taking the data quintuples out of each CPU core's data queue one by one and decoding them, generating a data packet corresponding to each data quintuple;
    according to the Hyperscan rule database and service database, using the corresponding CPU cores to scan and match the data packets in the corresponding queues respectively, obtaining malicious packets, and processing the malicious packets.
  2. The method for packet identification under massive traffic according to claim 1, characterized in that said putting each data quintuple into the data queue of the corresponding CPU core specifically comprises:
    generating a hash table according to the service feature database, the keys of the hash table being the hash-value keys corresponding to special services;
    judging whether the hash-value key of each data quintuple exists in the hash table;
    if it exists in the hash table, putting the data quintuple into the corresponding position of the hash table and into the data queue of a special service processing core;
    if it does not exist in the hash table, putting the data quintuple into the processing queue of a regular service processing core.
  3. The method for packet identification under massive traffic according to claim 1, characterized in that said using the corresponding CPU cores to scan and match the data packets in the corresponding queues respectively further comprises:
    judging whether the application ID of each service packet on a regular service core is a special service ID;
    if it is a special service ID, repackaging the quintuple information of the data packet, the application ID and a preset special service identifier, putting the repackaged data into the data queue of a special service processing core, and performing the scanning and matching by the special service processing core;
    if it is not a special service ID, performing the scanning and matching by the regular service processing core.
  4. The method for packet identification under massive traffic according to claim 1, characterized in that said using the corresponding CPU cores to scan and match the data packets in the corresponding queues respectively further comprises:
    if one rule of the rule database or the service feature database contains several parallel quintuple features and string features at the same time, scanning the quintuple features and string features of the data packet in the rule in parallel, and then integrating the scan results;
    or, if a rule of the rule database or the service feature database contains multiple qualifications, scanning the quintuple features and string features of the data packet serially, one after another.
  5. The method for packet identification under massive traffic according to claim 1, characterized in that, after said allocating regular service processing cores and special service processing cores of the CPU, the method further comprises:
    configuring packet receiving ports and sending ports, configuring the CPU's packet receiving cores, regular service processing cores, special service processing cores and data forwarding cores, and configuring the number of memory channels;
    applying, for each CPU core, for a ring used to receive packet data and send packets to the service processing cores, and for one ring used to send data packets, and initializing each ring;
    establishing the mapping relationship among the memory pool, the rings and DMA;
    starting the configured ports and rings.
  6. The method for packet identification under massive traffic according to claim 1, characterized in that said taking the data quintuples out of each CPU core's data queue one by one and decoding them specifically comprises:
    constructing a tree structure by pre-order traversal according to the protocol levels in the data quintuple, wherein the levels of the tree match the protocol levels and each node of the tree is a protocol node;
    decoding each node of the tree while the tree structure is being built.
  7. The method for packet identification under massive traffic according to claim 1, characterized in that, after said taking the data quintuples out of each CPU core's data queue one by one and decoding them, the method further comprises:
    judging whether the data quintuple has a corresponding session;
    if there is no corresponding session, creating the corresponding session, adding it to the session hash table, and submitting a timeout;
    if there is a corresponding session, updating the corresponding session hash node and judging whether the session has ended.
  8. The method for packet identification under massive traffic according to claim 1, characterized in that said using the corresponding CPU cores to scan and match the data packets in the corresponding queues respectively further comprises:
    using block mode if the packet is a single-packet scan of the UDP protocol;
    using stream mode if the packet is a whole-stream scan of TCP data.
  9. The method for packet identification under massive traffic according to claim 1, characterized in that said processing the malicious packets specifically comprises: blocking the malicious packets or reporting netflow logs.
  10. An apparatus for packet identification under massive traffic, characterized in that:
    the apparatus comprises at least one processor and a memory, the at least one processor and the memory are connected by a data bus, and the memory stores instructions executable by the at least one processor; after being executed by the processor, the instructions are used to complete the method for packet identification under massive traffic according to any one of claims 1-9.
PCT/CN2021/130891 2020-12-16 2021-11-16 一种海量流量下报文识别的方法和装置 WO2022134942A1 (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN202011481290 2020-12-16
CN202011516225.4A CN112558948A (zh) 2020-12-16 2020-12-21 一种海量流量下报文识别的方法和装置
CN202011516225.4 2020-12-21

Publications (1)

Publication Number Publication Date
WO2022134942A1 true WO2022134942A1 (zh) 2022-06-30

Family

ID=75031072

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/130891 WO2022134942A1 (zh) 2020-12-16 2021-11-16 一种海量流量下报文识别的方法和装置

Country Status (2)

Country Link
CN (1) CN112558948A (zh)
WO (1) WO2022134942A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022167A (zh) * 2022-07-01 2022-09-06 天翼数字生活科技有限公司 一种用于家庭网关业务流控的方法及系统
CN117729054A (zh) * 2024-02-07 2024-03-19 北京马赫谷科技有限公司 一种基于全流量存储的vpn流量识别方法和系统

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112558948A (zh) * 2020-12-16 2021-03-26 武汉绿色网络信息服务有限责任公司 一种海量流量下报文识别的方法和装置
CN113194000B (zh) * 2021-04-30 2022-11-01 上海金融期货信息技术有限公司 一种业务无关的分布式系统
CN113098911B (zh) * 2021-05-18 2022-10-04 神州灵云(北京)科技有限公司 一种多段链接网络的实时分析方法及旁路抓包系统
CN114125015A (zh) * 2021-11-30 2022-03-01 上海斗象信息科技有限公司 一种数据采集方法及系统
CN115297033B (zh) * 2022-07-20 2023-08-11 上海量讯物联技术有限公司 一种物联网终端流量审计方法及系统
CN116055123B (zh) * 2022-12-21 2023-08-22 长扬科技(北京)股份有限公司 一种mqtt主题匹配方法、装置、计算设备及存储介质
CN115857420B (zh) * 2023-03-03 2023-05-12 深圳市综科智控科技开发有限公司 一种工控设备之间io互控的方法
CN116781422B (zh) * 2023-08-18 2023-10-27 长扬科技(北京)股份有限公司 基于dpdk的网络病毒过滤方法、装置、设备及介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108632202A (zh) * 2017-03-16 2018-10-09 哈尔滨英赛克信息技术有限公司 一种海量数据包场景下的dns欺骗方法
WO2019010702A1 (en) * 2017-07-14 2019-01-17 Zte Corporation MANAGEMENT OF ORIENTATION, SWITCHING AND DIVISION OF ACCESS TRAFFIC
CN110971487A (zh) * 2019-11-26 2020-04-07 武汉虹信通信技术有限责任公司 网络协议识别方法及装置
CN111835729A (zh) * 2020-06-15 2020-10-27 东软集团股份有限公司 报文转发方法、系统、存储介质和电子设备
CN112558948A (zh) * 2020-12-16 2021-03-26 武汉绿色网络信息服务有限责任公司 一种海量流量下报文识别的方法和装置

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6798784B2 (en) * 2001-06-04 2004-09-28 Caux Networks, Inc. Concurrent switching of synchronous and asynchronous traffic
SE0302685D0 (sv) * 2003-10-07 2003-10-07 Ericsson Telefon Ab L M Method and arrangement in a telecommunication system
CN102231126B (zh) * 2011-07-28 2013-09-04 大唐移动通信设备有限公司 一种实现多核处理器中核间备份的方法及系统
CN108259371A (zh) * 2016-12-28 2018-07-06 亿阳信通股份有限公司 一种基于流处理的网络流量数据解析方法和装置
CN109828842A (zh) * 2019-01-29 2019-05-31 上海兴畅网络技术股份有限公司 一种基于dpdk技术开发的高性能数据采集引擎方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108632202A (zh) * 2017-03-16 2018-10-09 哈尔滨英赛克信息技术有限公司 一种海量数据包场景下的dns欺骗方法
WO2019010702A1 (en) * 2017-07-14 2019-01-17 Zte Corporation MANAGEMENT OF ORIENTATION, SWITCHING AND DIVISION OF ACCESS TRAFFIC
CN110971487A (zh) * 2019-11-26 2020-04-07 武汉虹信通信技术有限责任公司 网络协议识别方法及装置
CN111835729A (zh) * 2020-06-15 2020-10-27 东软集团股份有限公司 报文转发方法、系统、存储介质和电子设备
CN112558948A (zh) * 2020-12-16 2021-03-26 武汉绿色网络信息服务有限责任公司 一种海量流量下报文识别的方法和装置

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022167A (zh) * 2022-07-01 2022-09-06 天翼数字生活科技有限公司 一种用于家庭网关业务流控的方法及系统
CN115022167B (zh) * 2022-07-01 2024-03-01 天翼数字生活科技有限公司 一种用于家庭网关业务流控的方法及系统
CN117729054A (zh) * 2024-02-07 2024-03-19 北京马赫谷科技有限公司 一种基于全流量存储的vpn流量识别方法和系统
CN117729054B (zh) * 2024-02-07 2024-04-16 北京马赫谷科技有限公司 一种基于全流量存储的vpn流量识别方法和系统

Also Published As

Publication number Publication date
CN112558948A (zh) 2021-03-26

Similar Documents

Publication Publication Date Title
WO2022134942A1 (zh) 一种海量流量下报文识别的方法和装置
CN111371779B (zh) 一种基于dpdk虚拟化管理系统的防火墙及其实现方法
CN109547580B (zh) 一种处理数据报文的方法和装置
US6771646B1 (en) Associative cache structure for lookups and updates of flow records in a network monitor
US20160171102A1 (en) Runtime adaptable search processor
US10616101B1 (en) Forwarding element with flow learning circuit in its data plane
US20070266370A1 (en) Data Plane Technology Including Packet Processing for Network Processors
US9356844B2 (en) Efficient application recognition in network traffic
US11522773B1 (en) Optimized batched packet processing for deep packet inspection
CN111865996A (zh) 数据检测方法、装置和电子设备
Pacífico et al. Application layer packet classifier in hardware
WO2023019876A1 (zh) 基于智能决策的数据传输方法、装置、设备及存储介质
CN108462715B (zh) 基于mpi的wm串匹配并行算法的网络信息过滤方法
US10944724B2 (en) Accelerating computer network policy search
CN117997833A (zh) 数据转发系统及其控制方法
CN115033407B (zh) 一种适用于云计算的采集识别流量的系统和方法
Chen et al. Empowering network security with programmable switches: A comprehensive survey
Gallo et al. FENXI: Deep-learning Traffic Analytics at the edge
Srinivasan et al. Performance analysis of multi-dimensional packet classification on programmable network processors
Bolla et al. OpenFlow in the small: A flexible and efficient network acceleration framework for multi-core systems
De Sensi et al. Dpi over commodity hardware: implementation of a scalable framework using fastflow
US20230060132A1 (en) Coordinating data packet processing between kernel space and user space
US11876691B2 (en) End-to-end RDMA telemetry system
CN116886422A (zh) 一种基于eBPF的网络高速转发中继方法及系统
Ramesh Network traffic anomaly-detection framework using gpus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21908942

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21908942

Country of ref document: EP

Kind code of ref document: A1