CN117609894B - Partition strategy-based high-performance message classification method, equipment and medium

Publication number
CN117609894B
CN117609894B CN202410094291.9A CN202410094291A CN117609894B CN 117609894 B CN117609894 B CN 117609894B CN 202410094291 A CN202410094291 A CN 202410094291A CN 117609894 B CN117609894 B CN 117609894B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410094291.9A
Other languages
Chinese (zh)
Other versions
CN117609894A (en)
Inventor
钟金诚
陈曙晖
虞万荣
王飞
魏子令
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Application filed by National University of Defense Technology
Priority to CN202410094291.9A
Publication of CN117609894A
Application granted
Publication of CN117609894B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/24765 Rule-based classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Abstract

The application relates to a partition strategy-based high-performance message classification method, device, and medium. The method comprises: defining a rule set according to a plurality of metadata fields and constructing a decision tree, the decision tree comprising partition nodes and dividing nodes. Partition nodes and dividing nodes occupy alternating layers of the decision tree: odd layers hold partition nodes and even layers hold dividing nodes. Messages are classified according to the partition nodes, dividing nodes, and leaf nodes. At a partition node, the network message searches the child nodes of the partition node in order. At a dividing node, the network message generates a locating value into the child node list according to the mask of the dividing node, and only the one child node selected by the locating value is searched for rule matching. At a leaf node, the contained rules are organized in a linked list in priority order, and the linked list is searched in order until a rule is matched. The method improves message classification performance.

Description

Partition strategy-based high-performance message classification method, equipment and medium
Technical Field
The present application relates to the field of message (packet) classification technologies, and in particular to a partition strategy-based high-performance message classification method, device, and medium.
Background
Message classification is a fundamental problem of computer networks, and algorithms for solving the message classification problem are widely used in various network devices and functions such as routers, switches, firewalls, and network intrusion detection systems.
The message classification problem is defined over a rule set, in which each rule consists of three parts: a priority, a matching field, and the action taken after a successful match. The matching field is defined over message header metadata (e.g., IP address, port number) and determines which messages the rule matches. The message classification problem is to search the rule set for rules that match a network message and return the matching rule with the highest priority.
Current methods for solving the message classification problem include decision tree methods, tuple space methods, and hybrid methods combining decision trees and tuple spaces. Decision tree methods can hardly avoid rule replication (that is, one rule being assigned to several child nodes) when dividing child nodes; as a result they struggle to support dynamic updates of the rule set, and memory explosion can occur on large-scale rule sets. Tuple space methods suffer from a large number of multi-field combined tuples and low classification performance. Hybrid methods trade off between decision trees and tuple spaces and thus achieve a better balance across the various performance measures, but their classification performance is still low.
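As a reference point for the problem statement above, the following minimal Python sketch models a rule as priority, matching field, and action, and classifies a header by exhaustive search. The `Rule` fields, the mask/value encoding of the matching field, the 4-bit headers, and the convention that a smaller number means a higher priority are all assumptions made for illustration; the source does not fix any of them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    priority: int  # assumption: smaller number = higher priority
    mask: int      # 1-bits mark the header bits the rule cares about
    value: int     # required values of those bits
    action: str    # what to do after a successful match

    def matches(self, header: int) -> bool:
        return (header & self.mask) == self.value


def classify_linear(rules: List[Rule], header: int) -> Optional[Rule]:
    """Return the highest-priority matching rule, or None if nothing matches."""
    hits = [r for r in rules if r.matches(header)]
    return min(hits, key=lambda r: r.priority) if hits else None


# Hypothetical rules over 4-bit headers, for illustration only.
rules = [
    Rule(priority=2, mask=0b0011, value=0b0010, action="drop"),
    Rule(priority=1, mask=0b1100, value=0b0100, action="forward"),
    Rule(priority=9, mask=0b0000, value=0b0000, action="default"),
]
print(classify_linear(rules, 0b0110).action)  # all three rules match this header
```

Exhaustive search touches every rule for every message, which is the cost that tree-based and tuple-space methods try to avoid.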
Disclosure of Invention
In view of the above, it is necessary to provide a partition strategy-based high-performance message classification method, device, and medium that can improve message classification performance.
A partition strategy-based high-performance message classification method, the method comprising:
acquiring a network message header; the network message header comprises a plurality of metadata fields;
defining a rule set according to the plurality of metadata fields and constructing a decision tree; the decision tree comprises partition nodes and dividing nodes;
the partition nodes and dividing nodes are distributed in alternating layers of the decision tree, wherein odd layers hold partition nodes and even layers hold dividing nodes, and any partition node or dividing node that contains fewer rules than a preset threshold is set as a leaf node;
classifying messages according to the partition nodes, dividing nodes, and leaf nodes: at a partition node, the network message searches the child nodes of the partition node in order, and during the search subsequent child nodes are pruned according to the highest priority among the rules already matched in preceding child nodes; at a dividing node, the network message generates a locating value into the child node list according to the mask of the dividing node, and only the one child node selected by the locating value is searched for rule matching; at a leaf node, the contained rules are organized in a linked list in priority order, and the linked list is searched in order until a rule is hit, whereupon matching ends and message classification is complete;
when the rule set is updated dynamically, the rule to be updated enters the decision tree from the root node to reconstruct the decision tree.
In one embodiment, defining a rule set and constructing a decision tree from a plurality of metadata fields includes:
the root node of the decision tree is initially defined as a partition node; within a partition node, child nodes are constructed from the rule set using a heuristic method: each traversal of the rule set under the heuristic generates one new partition, and each new partition is used to construct a child dividing node, until the number of rules remaining in the rule set is less than or equal to the leaf node threshold, at which point the remaining rules are used to construct a child leaf node.
In one embodiment, constructing child nodes from rule sets using heuristics in partition nodes includes:
first, a minimum mask valid-bit-count threshold B is set; a mask M is initialized to all 1s, the rule set is traversed in priority order, and the mask of each rule R in the rule set is ANDed with M in turn; when the number of valid bits in the resulting mask is greater than or equal to B, rule R is removed from the rule set and placed in the new partition, and M is updated to the result; otherwise rule R stays in the rule set and is assigned to another partition in a later pass; each new partition is used to construct a child dividing node, until the number of rules remaining in the rule set is less than or equal to the leaf node threshold, at which point the remaining rules are used to construct a child leaf node.
In one embodiment, when dynamic rule updating is performed, a rule to be updated enters a decision tree from a root node to perform decision tree reconstruction, including:
when the rule is at a partition node of the decision tree, all child nodes of the partition node are traversed in order until the update is completed;
when the rule is at a dividing node of the decision tree, a locating value into the child node list is generated according to the mask of the dividing node, one child node is located by that value, and the update is attempted in that child node;
when the rule is at a leaf node of the decision tree: for insertion, the rule is inserted at the proper position of the rule linked list; for deletion, the rules are compared in order and the rule that is equal is deleted; when the number of rules in the leaf node is greater than a predefined threshold or equal to 0, part of the structure of the decision tree is reconstructed.
In one embodiment, after a rule deletion, when the number of rules contained in a node equals 0, the node is marked using a pointer low-order-bit marking strategy; this strategy uses an idle bit of each pointer in the parent node's child pointer list to mark whether the corresponding node contains rules, and only when the occupied memory exceeds a certain threshold is the decision tree traversed to select the nodes marked as containing no rules.
A computer device comprising a memory storing a computer program and a processor which when executing the computer program performs the steps of:
acquiring a network message header; the network message header comprises a plurality of metadata fields;
defining a rule set according to the plurality of metadata fields and constructing a decision tree; the decision tree comprises partition nodes and dividing nodes;
the partition nodes and dividing nodes are distributed in alternating layers of the decision tree, wherein odd layers hold partition nodes and even layers hold dividing nodes, and any partition node or dividing node that contains fewer rules than a preset threshold is set as a leaf node;
classifying messages according to the partition nodes, dividing nodes, and leaf nodes: at a partition node, the network message searches the child nodes of the partition node in order, and during the search subsequent child nodes are pruned according to the highest priority among the rules already matched in preceding child nodes; at a dividing node, the network message generates a locating value into the child node list according to the mask of the dividing node, and only the one child node selected by the locating value is searched for rule matching; at a leaf node, the contained rules are organized in a linked list in priority order, and the linked list is searched in order until a rule is hit, whereupon matching ends and message classification is complete;
when the rule set is updated dynamically, the rule to be updated enters the decision tree from the root node to reconstruct the decision tree.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring a network message header; the network message header comprises a plurality of metadata fields;
defining a rule set according to the plurality of metadata fields and constructing a decision tree; the decision tree comprises partition nodes and dividing nodes;
the partition nodes and dividing nodes are distributed in alternating layers of the decision tree, wherein odd layers hold partition nodes and even layers hold dividing nodes, and any partition node or dividing node that contains fewer rules than a preset threshold is set as a leaf node;
classifying messages according to the partition nodes, dividing nodes, and leaf nodes: at a partition node, the network message searches the child nodes of the partition node in order, and during the search subsequent child nodes are pruned according to the highest priority among the rules already matched in preceding child nodes; at a dividing node, the network message generates a locating value into the child node list according to the mask of the dividing node, and only the one child node selected by the locating value is searched for rule matching; at a leaf node, the contained rules are organized in a linked list in priority order, and the linked list is searched in order until a rule is hit, whereupon matching ends and message classification is complete;
when the rule set is updated dynamically, the rule to be updated enters the decision tree from the root node to reconstruct the decision tree.
In the above partition strategy-based high-performance message classification method, device, and medium, a rule set is first defined according to a plurality of metadata fields and a decision tree is constructed. The decision tree comprises partition nodes and dividing nodes distributed in alternating layers, with odd layers holding partition nodes and even layers holding dividing nodes, and any partition node or dividing node that contains fewer rules than a preset threshold is set as a leaf node. Messages are classified according to the partition nodes, dividing nodes, and leaf nodes: at a partition node, the network message searches the child nodes of the partition node in order, and during the search subsequent child nodes are pruned according to the highest priority among the rules already matched in preceding child nodes; at a dividing node, the network message generates a locating value into the child node list according to the mask of the dividing node, and only the one child node selected by the locating value is searched for rule matching; at a leaf node, the contained rules are organized in a linked list in priority order, and the linked list is searched in order until a rule is hit, whereupon matching ends and message classification is complete. When the rule set is updated dynamically, the rule to be updated enters the decision tree from the root node to reconstruct the decision tree. The partition strategy avoids the rule replication problem of existing decision tree algorithms, so that both fast message classification and dynamic rule updates are possible; the method features a small memory footprint, high classification performance, and support for dynamic rule updates.
Drawings
FIG. 1 is a flow chart of a method for classifying high-performance messages based on partition strategy in one embodiment;
FIG. 2 is a diagram of an example message classification rule set in one embodiment;
FIG. 3 is a schematic diagram of a decision tree structure constructed on an example rule set in one embodiment;
FIG. 4 is a diagram illustrating a search process for message classification in another embodiment;
FIG. 5 is a flow diagram of dynamic rule insertion in one embodiment;
FIG. 6 is a flow logic diagram of rule partitioning in one embodiment;
FIG. 7 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, a method for classifying high-performance messages based on partition strategies is provided, which includes the following steps:
Step 102, obtaining a network message header; the network message header comprises a plurality of metadata fields; defining a rule set according to the plurality of metadata fields and constructing a decision tree; the decision tree comprises partition nodes and dividing nodes.
Step 104, the partition nodes and dividing nodes are distributed in alternating layers of the decision tree, wherein odd layers hold partition nodes and even layers hold dividing nodes, and any partition node or dividing node that contains fewer rules than a preset threshold is set as a leaf node. Setting leaf nodes is necessary for the subsequent message classification function to work properly: leaf nodes are the termination nodes of search paths, and every message classification search path ends at a leaf node.
By designing partition nodes and dividing nodes in the decision tree, the rule replication problem of the decision tree method is avoided, and fast message classification and dynamic rule updates become possible.
Step 106, classifying messages according to the partition nodes, dividing nodes, and leaf nodes: at a partition node, the network message searches the child nodes of the partition node in order, and during the search subsequent child nodes are pruned according to the highest priority among the rules already matched in preceding child nodes; at a dividing node, the network message generates a locating value into the child node list according to the mask of the dividing node, and only the one child node selected by the locating value is searched for rule matching; at a leaf node, the contained rules are organized in a linked list in priority order, and the linked list is searched in order until a rule is hit, whereupon matching ends and message classification is complete.
This step is illustrated with an example. Assume a message p whose values in the X field and the Y field are (11, 10); its classification process is shown in fig. 4. Message p first enters root node A. Because A is a partition node, p is searched in A's child nodes B, C, and D in order. Node B is searched first: B is a dividing node whose common determined-bit mask is (11, 00), and ANDing the mask with the value of p gives the lookup key key = (11). This key matches neither child node E (01) nor child node F (10) of B, so p matches no rule in node B. The search continues with node C: C is a dividing node whose common determined-bit mask is (00, 11), and ANDing the mask with the value of p gives the lookup key key = (10), which equals the value (10) of node G, so p matches rule R3 contained in node G. Because the priority of the matched rule R3 is higher than the highest priority of the rules contained in node D, node D is pruned without being searched, and the best matching rule of message p is finally determined to be R3.
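The lookup walked through above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the class names, the 4-bit packing of the X field (high two bits) and Y field (low two bits), the rule priorities (smaller number = higher priority), and the wildcard definition of R5 are assumptions, since the source does not reproduce the rule set of fig. 2. A plain list stands in for the linked list of a leaf node.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int  # assumption: smaller number = higher priority
    mask: int      # 4 bits: X field in the high two bits, Y field in the low two
    value: int

    def matches(self, pkt: int) -> bool:
        return (pkt & self.mask) == self.value


@dataclass
class Leaf:
    rules: List[Rule]            # kept in priority order


@dataclass
class Dividing:
    mask: int                    # common determined bits of the node
    children: Dict[int, "Node"]  # lookup key -> child


@dataclass
class Partition:
    children: List["Node"]       # searched in order


Node = Union[Leaf, Dividing, Partition]


def top_priority(node: Node) -> int:
    """Best (smallest) priority stored below a node; a real tree would cache this."""
    if isinstance(node, Leaf):
        return min((r.priority for r in node.rules), default=10**9)
    kids = node.children.values() if isinstance(node, Dividing) else node.children
    return min((top_priority(k) for k in kids), default=10**9)


def classify(node: Node, pkt: int) -> Optional[Rule]:
    if isinstance(node, Leaf):
        # The first hit in a priority-ordered list is the best hit in this leaf.
        return next((r for r in node.rules if r.matches(pkt)), None)
    if isinstance(node, Dividing):
        child = node.children.get(pkt & node.mask)  # exactly one candidate child
        return classify(child, pkt) if child is not None else None
    best: Optional[Rule] = None
    for child in node.children:
        if best is not None and best.priority <= top_priority(child):
            continue  # prune: nothing below this child can beat the current match
        hit = classify(child, pkt)
        if hit is not None and (best is None or hit.priority < best.priority):
            best = hit
    return best


R1 = Rule("R1", 1, 0b1100, 0b0100)
R2 = Rule("R2", 2, 0b1100, 0b1000)
R3 = Rule("R3", 3, 0b0011, 0b0010)
R4 = Rule("R4", 4, 0b0011, 0b0000)
R5 = Rule("R5", 5, 0b0000, 0b0000)  # assumed wildcard; the source does not define R5
B = Dividing(0b1100, {0b0100: Leaf([R1]), 0b1000: Leaf([R2])})
C = Dividing(0b0011, {0b0010: Leaf([R3]), 0b0000: Leaf([R4])})
A = Partition([B, C, Leaf([R5])])

print(classify(A, 0b1110).name)  # message p = (11, 10); prints R3
```

With these assumed rules, the search visits B (no child for key 11), finds R3 under C, and skips D because R3 already outranks everything D holds.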
Step 108, when the rule set is updated dynamically, the rule to be updated enters the decision tree from the root node to reconstruct the decision tree.
By continuously applying dynamic rule updates to the rule set, the method can adapt to varied message data, and the reconstructed decision tree can match a message to its corresponding rule as early as possible, which improves classification speed.
In the above partition strategy-based high-performance message classification method, a rule set is first defined according to a plurality of metadata fields and a decision tree is constructed. The decision tree comprises partition nodes and dividing nodes distributed in alternating layers, with odd layers holding partition nodes and even layers holding dividing nodes, and any partition node or dividing node that contains fewer rules than a preset threshold is set as a leaf node. Messages are classified according to the partition nodes, dividing nodes, and leaf nodes: at a partition node, the network message searches the child nodes of the partition node in order, and during the search subsequent child nodes are pruned according to the highest priority among the rules already matched in preceding child nodes; at a dividing node, the network message generates a locating value into the child node list according to the mask of the dividing node, and only the one child node selected by the locating value is searched for rule matching; at a leaf node, the contained rules are organized in a linked list in priority order, and the linked list is searched in order until a rule is hit, whereupon matching ends and message classification is complete. When the rule set is updated dynamically, the rule to be updated enters the decision tree from the root node to reconstruct the decision tree. The partition strategy avoids the rule replication problem of existing decision tree algorithms, so that both fast message classification and dynamic rule updates are possible; the method features a small memory footprint, high classification performance, and support for dynamic rule updates.
In one embodiment, defining a rule set and constructing a decision tree from a plurality of metadata fields includes:
the root node of the decision tree is initially defined as a partition node; within a partition node, child nodes are constructed from the rule set using a heuristic method: each traversal of the rule set under the heuristic generates one new partition, and each new partition is used to construct a child dividing node, until the number of rules remaining in the rule set is less than or equal to the leaf node threshold, at which point the remaining rules are used to construct a child leaf node.
In one embodiment, constructing child nodes from rule sets using heuristics in partition nodes includes:
first, a minimum mask valid-bit-count threshold B is set; a mask M is initialized to all 1s, the rule set is traversed in priority order, and the mask of each rule R in the rule set is ANDed with M in turn; when the number of valid bits in the resulting mask is greater than or equal to B, rule R is removed from the rule set and placed in the new partition, and M is updated to the result; otherwise rule R stays in the rule set and is assigned to another partition in a later pass; each new partition is used to construct a child dividing node, until the number of rules remaining in the rule set is less than or equal to the leaf node threshold, at which point the remaining rules are used to construct a child leaf node.
In a specific embodiment, take the two-dimensional rule set given in fig. 2 as an example, and assume that the leaf node threshold is 1 (that is, a node containing 1 rule or fewer is a leaf node). As shown in fig. 3, the rule set is first placed in the decision tree root node A; the root node is a partition node by default, and the rule set partition operation is performed on it: the rule set is divided into three subsets {R1, R2}, {R3, R4}, and {R5}, which are placed in three child nodes B, C, and D respectively;
the child nodes B, C, and D of the partition node are dividing nodes; node D contains only one rule and is therefore a leaf node, so no further child nodes are generated for it; nodes B and C are subdivided according to their common determined-bit masks: for example, node B contains rules R1 (01) and R2 (10), whose common determined bits are the two bits of the X field, giving the mask (11, 00), so node B is divided into child nodes E and F; similarly, node C contains rules R3 (10) and R4 (00) and is divided into child nodes G and H according to the common determined-bit mask (00, 11);
the child nodes E, F, G, and H generated by nodes B and C are partition nodes; because each contains only one rule, which does not exceed the leaf node threshold, each is also a leaf node, so no further child nodes are constructed and the construction of the whole decision tree is complete;
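The subdivision of a dividing node described above can be sketched as follows. This is a hedged illustration: the function name `split_dividing_node`, the `Rule` tuple, and the 4-bit packing (X field in the high two bits, Y field in the low two) are assumptions. The sketch derives the common determined-bit mask by ANDing all rule masks and then groups the rules by their values on those bits.

```python
from collections import defaultdict
from functools import reduce
from typing import Dict, List, NamedTuple, Tuple


class Rule(NamedTuple):
    name: str
    mask: int   # 1-bits are the bits the rule determines
    value: int


def split_dividing_node(rules: List[Rule], width: int) -> Tuple[int, Dict[int, List[Rule]]]:
    """Group rules by their values on the bits that every rule determines."""
    common = reduce(lambda m, r: m & r.mask, rules, (1 << width) - 1)
    children: Dict[int, List[Rule]] = defaultdict(list)
    for r in rules:
        children[r.value & common].append(r)  # each rule lands in exactly one child
    return common, dict(children)


# Node B of the example: R1 = (01, **) and R2 = (10, **).
node_b = [Rule("R1", 0b1100, 0b0100), Rule("R2", 0b1100, 0b1000)]
mask_b, kids_b = split_dividing_node(node_b, width=4)
print(format(mask_b, "04b"), {format(k, "04b"): [r.name for r in v] for k, v in kids_b.items()})
```

Because every rule in the node determines all bits of the common mask, each rule maps to a single key, so no rule has to be copied into several children.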
Specifically, the rule partition flow within one partition node is shown in fig. 6. Take the partitioning of the root node in fig. 3 as an example, and assume that the minimum mask valid-bit-count threshold is B = 2. The mask is initialized to M = (11, 11) and the rule set is traversed in priority order. ANDing the mask (11, 00) of rule R1 with M gives M' = (11, 00); the number of valid bits of M' is 2, which is greater than or equal to the threshold, so rule R1 is placed in the new partition and M is updated to M'. Rule R2 is tried next: ANDing its mask (11, 00) with M still gives (11, 00), so the valid bits of M are unchanged and rule R2 is also placed in the new partition. ANDing the mask (00, 11) of rule R3 with M gives (00, 00); the number of valid bits of the common mask drops to 0, which is less than B, so rule R3 stays in the rule set and is not placed in the new partition. Rules R4 and R5 behave like rule R3: ANDing their masks with M leaves fewer than B valid bits, so they also stay in the rule set. One traversal of the rule set thus generates the first partition, which contains rules {R1, R2}. Because the number of rules remaining in the rule set is still greater than the leaf node threshold, a second traversal is performed in the same way and generates the second partition, which contains rules {R3, R4}. After the second traversal only rule R5 remains in the rule set; this is less than or equal to the leaf node threshold, so the third partition contains only rule R5.
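The partition heuristic traced above can be written as a short Python sketch. Several details are assumptions rather than statements of the source: the function name `partition_rules`, the 4-bit mask packing (X field in the high two bits, Y field in the low two), the numeric priorities (smaller number = higher priority), the mask of R5 (taken here as all wildcards, since the source does not state it), and the guard that stops the loop when a pass places no rule.

```python
from typing import List, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    name: str
    priority: int  # assumption: smaller number = higher priority
    mask: int


def partition_rules(rules: List[Rule], min_bits: int, leaf_threshold: int,
                    width: int) -> List[Tuple[Optional[int], List[Rule]]]:
    """Return (common_mask, rules) pairs; common_mask is None for the leftover leaf."""
    remaining = sorted(rules, key=lambda r: r.priority)
    partitions: List[Tuple[Optional[int], List[Rule]]] = []
    while len(remaining) > leaf_threshold:
        m = (1 << width) - 1               # M starts as all ones
        taken: List[Rule] = []
        kept: List[Rule] = []
        for r in remaining:
            candidate = m & r.mask
            if bin(candidate).count("1") >= min_bits:
                m = candidate              # accept R and tighten M
                taken.append(r)
            else:
                kept.append(r)             # R waits for a later pass
        if not taken:
            break                          # guard (not in the source): no progress possible
        partitions.append((m, taken))
        remaining = kept
    if remaining:
        partitions.append((None, remaining))
    return partitions


example_rules = [
    Rule("R1", 1, 0b1100), Rule("R2", 2, 0b1100),
    Rule("R3", 3, 0b0011), Rule("R4", 4, 0b0011),
    Rule("R5", 5, 0b0000),  # assumed mask; not given in the source
]
for mask, part in partition_rules(example_rules, min_bits=2, leaf_threshold=1, width=4):
    print(None if mask is None else format(mask, "04b"), [r.name for r in part])
```

Under these assumptions the sketch reproduces the three partitions of the example: {R1, R2} with mask (11, 00), {R3, R4} with mask (00, 11), and the leftover {R5}.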
In one embodiment, when dynamic rule updating is performed, a rule to be updated enters a decision tree from a root node to perform decision tree reconstruction, including:
when the rule is at a partition node of the decision tree, all child nodes of the partition node are traversed in order until the update is completed;
when the rule is at a dividing node of the decision tree, a locating value into the child node list is generated according to the mask of the dividing node, one child node is located by that value, and the update is attempted in that child node;
when the rule is at a leaf node of the decision tree: for insertion, the rule is inserted at the proper position of the rule linked list; for deletion, the rules are compared in order and the rule that is equal is deleted; when the number of rules in the leaf node is greater than a predefined threshold or equal to 0, part of the structure of the decision tree is reconstructed.
In a specific embodiment, take the insertion of rule R6 (11) as an example; the insertion process is shown in fig. 5. Rule insertion is similar to message classification. Rule R6 first enters root node A; because A is a partition node, its child nodes are tried in order to find one in which the insertion can be completed. Rule R6 enters node B; B is a dividing node whose common determined-bit mask is (11, 00), which is compatible with the value of R6, and ANDing the mask with the value of R6 gives key = (11). Because this key differs from the values of nodes E and F, node B generates a new child node I with value (11) and inserts rule R6 into the rule linked list of node I, completing the insertion.
In one embodiment, after a rule is deleted, when the number of rules contained in a node equals 0, the node is marked using a pointer low-order-bit marking strategy. This strategy uses the idle low-order bits of the pointers in the parent node's child node pointer list to mark whether each node contains rules; only when the occupied memory exceeds a certain threshold is the decision tree traversed to reclaim the nodes marked as containing no rules.
In a specific embodiment, the marking strategy avoids frequent memory allocation and reclamation during rule updates; the decision tree is traversed to reclaim the nodes marked as containing no rules only when excessive memory usage seriously affects performance.
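The marking idea can be sketched in Python under stated assumptions. Python has no raw pointers, so child pointers are modelled here as integers standing for aligned addresses whose lowest bit is always free. The names EMPTY_FLAG, mark_empty, is_empty, address and sweep, and the memory figures passed to sweep, are hypothetical and do not come from the patent text.

```python
EMPTY_FLAG = 0x1  # lowest bit is idle because addresses are assumed aligned


def mark_empty(ptr: int) -> int:
    """Tag a child pointer to record that the child holds no rules."""
    return ptr | EMPTY_FLAG


def is_empty(ptr: int) -> bool:
    """Check the tag without touching the child node itself."""
    return bool(ptr & EMPTY_FLAG)


def address(ptr: int) -> int:
    """Strip the tag to recover the real address."""
    return ptr & ~EMPTY_FLAG


def sweep(child_ptrs, memory_used, memory_limit):
    """Lazily reclaim empty children only when memory use is too high."""
    if memory_used <= memory_limit:
        return list(child_ptrs)          # cheap path: nothing is freed
    return [p for p in child_ptrs if not is_empty(p)]


# A parent's child pointer list with three 8-byte-aligned addresses.
CHILDREN = [0x1000, 0x1008, 0x1010]
CHILDREN[1] = mark_empty(CHILDREN[1])    # the second child lost its last rule
```

Because deletion only flips a bit in the parent's pointer list, no memory is released on the update path; the sweep runs only when the caller reports that the memory threshold has been exceeded.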
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the sequence indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not strictly limited to this order of execution and may be executed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; nor are these sub-steps or stages necessarily performed sequentially, as they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, a computer device is provided, which may be a terminal, and the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a high-performance message classification method based on a partition strategy. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device may be a touch layer covering the display screen, a key, trackball or touch pad arranged on the housing of the computer device, or an external keyboard, touch pad or mouse.
It will be appreciated by those skilled in the art that the structure shown in fig. 7 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may include the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The above examples represent only a few embodiments of the present application. They are described specifically and in detail, but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and these would fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (5)

1. A high-performance message classification method based on a partition strategy, characterized in that the method comprises:
acquiring a network message header; the network message header comprises a plurality of metadata fields;
defining a rule set according to the plurality of metadata fields and constructing a decision tree; the decision tree comprises partition nodes and dividing nodes;
the partition nodes and dividing nodes in the decision tree are distributed in alternating layers, wherein odd layers are partition nodes and even layers are dividing nodes, and any partition node or dividing node containing fewer rules than a preset threshold is set as a leaf node;
classifying messages according to the partition nodes, dividing nodes and leaf nodes: at a partition node, the network message searches the child nodes of the partition node in sequence, and during the search subsequent child nodes are pruned according to the highest priority of the rules already matched in preceding terminal nodes; at a dividing node, the network message generates a child node list positioning value according to the mask of the dividing node, and only one child node is searched for rule matching according to the positioning value; at a leaf node, the contained rules are organized in a linked list in priority order, and the linked list is searched in order until a rule is matched, whereupon matching ends and message classification is complete;
when the rule set is dynamically updated, the rule to be updated enters the decision tree from the root node and the decision tree is reconstructed;
wherein defining a rule set according to the plurality of metadata fields and constructing a decision tree comprises: initially defining the root node of the decision tree as a partition node; constructing child nodes in the partition node from the rule set using a heuristic method, in which each traversal of the rule set generates a new partition and each new partition constructs a child dividing node, until the number of rules remaining in the rule set is less than or equal to the leaf node threshold, the remaining rules being used to construct a child leaf node;
wherein, when the rule set is dynamically updated, the rule to be updated entering the decision tree from the root node and the decision tree being reconstructed comprises:
when the rule is at a partition node of the decision tree, the rule traverses all child nodes of the partition node in sequence until the update is completed;
when the rule is at a dividing node of the decision tree, a child node list positioning value is generated according to the mask of the dividing node, one child node is located according to the positioning value, and the update is attempted in that child node;
when the rule is at a leaf node of the decision tree: on insertion, the rule is inserted at the proper position in the rule linked list; on deletion, each rule is matched in turn and the rule is deleted when an equal rule is found; when the number of rules in the leaf node is greater than a predefined threshold or equal to 0, part of the decision tree structure is reconstructed.
2. The method of claim 1, wherein constructing child nodes in the partition node from the rule set using a heuristic method comprises:
first setting a minimum mask valid-bit-count threshold B; initializing the mask M to all 1s, traversing the rule set in priority order, and ANDing the mask of each rule R in the rule set with M in turn; when the number of valid bits of the resulting mask is greater than or equal to B, rule R is removed from the rule set and placed in a new partition, and M is updated at the same time; otherwise, rule R is retained in the rule set and is assigned to another partition in a subsequent traversal; each new partition constructs a child dividing node, until the number of rules remaining in the rule set is less than or equal to the leaf node threshold, the remaining rules being used to construct a child leaf node.
3. The method according to claim 1, wherein the method further comprises:
after a rule is deleted, when the number of rules contained in a node equals 0, marking the node using a pointer low-order-bit marking strategy; the pointer low-order-bit marking strategy uses the idle low-order bits of the pointers in the parent node's child node pointer list to mark whether each node contains rules, and only when the occupied memory exceeds a certain threshold is the decision tree traversed to reclaim the nodes marked as containing no rules.
4. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 3 when the computer program is executed.
5. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 3.
CN202410094291.9A 2024-01-23 2024-01-23 Partition strategy-based high-performance message classification method, equipment and medium Active CN117609894B (en)


Publications (2)

Publication Number Publication Date
CN117609894A CN117609894A (en) 2024-02-27
CN117609894B true CN117609894B (en) 2024-04-09

Family

ID=89953888




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant