CN112464047B - Optimization system and method for NIDS device adopting hybrid matching engine - Google Patents

Optimization system and method for NIDS device adopting hybrid matching engine

Info

Publication number
CN112464047B
CN112464047B (application CN202011229281.XA)
Authority
CN
China
Prior art keywords
module
rule base
rule
matching
optimal division
Prior art date
Legal status
Active
Application number
CN202011229281.XA
Other languages
Chinese (zh)
Other versions
CN112464047A (en)
Inventor
刘新闻
陈宗朗
郭云飞
张燕
Current Assignee
Guangzhou Jingyuan Safety Technology Co ltd
Original Assignee
Guangzhou Jingyuan Safety Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Jingyuan Safety Technology Co ltd
Priority to CN202011229281.XA
Publication of CN112464047A
Application granted
Publication of CN112464047B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic

Abstract

The invention provides an optimization system and method for a NIDS device adopting a hybrid matching engine. The system comprises a rule base, a deep self-encoder module, a random ordering module, a rule base optimal division training module, a rule base optimal division module, a matching performance measuring module and a matching engine pool; the rule base optimal division training module is used to train the optimal division of the rule base. The rule base optimal division training module and the rule base optimal division module each comprise a deep reinforcement learning submodule; the submodule of the training module learns a rule division method, and after training on a system-configured number of matching rules its parameters are obtained and output to the rule base optimal division module. The system is suitable both for NIDS devices that match in software and for NIDS devices that match with a dedicated chip or network processor; it has a simple structure, convenient operation and strong adaptability.

Description

Optimization system and method for NIDS device adopting hybrid matching engine
Technical Field
The invention relates to the technical field of secure communication, in particular to an optimization system of NIDS equipment adopting a hybrid matching engine.
Background
A network intrusion detection system (NIDS) device collects network packets through a network interface; after the packets are preprocessed, their content must be inspected to find possible abnormal or attack traffic. A large-scale matching rule base therefore has to be configured on the network intrusion detection device, containing a large number of character string matching rules, often tens of thousands or more. In the prior art, the network intrusion detection device implements the matching operation with a multi-pattern string matching engine: the engine is initialized when the system is initialized, and the essential part of this initialization is that all matching rules are read in and processed according to the matching algorithm into a dedicated data structure stored in the matching engine's memory. When a packet needs to be matched, the byte stream of the part of the packet to be inspected is fed into the matching engine as the string to be matched; the matching engine matches it against the rule database structure built at initialization using its own algorithm, outputs the matched rule string(s) when one or more rule strings of the rule base occur in the string to be matched, and otherwise outputs a no-match result.
Because the matching rule base is large, the matching engine places high performance demands on the NIDS device; with limited device performance, load overflow occurs once the traffic reaches a certain rate and the network intrusion detection function fails. Existing approaches to improving the performance of network intrusion devices are: improving matching performance with a dedicated chip or network processor, improving matching performance with a more optimized pattern matching algorithm, and improving matching performance with a hybrid engine. Each has drawbacks. With a dedicated chip or network processor, the device design must include that chip or processor, which raises design and production costs, so the solution cannot be used where the device price is constrained. As for more optimized pattern matching algorithms, many types exist, such as the AC algorithm, the WM algorithm and their variants, each of which excels only for particular rule string characteristics; since the rule base of a network intrusion detection device is large and its rules vary in character, choosing any single algorithm cannot achieve optimal matching efficiency. In the hybrid engine approach, several matching algorithms are used for rule matching, each algorithm being responsible for the matching operation of only one part of the rule base.
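For illustration only (not part of the patent), the following minimal Python sketch shows the two phases just described, initialization and matching, using the open-source pyahocorasick library as one concrete multi-pattern engine of the AC family; the example rules and payload are invented.

```python
# Illustrative sketch only: one concrete multi-pattern string matching engine
# (pyahocorasick, an Aho-Corasick implementation); rules and payload are invented.
import ahocorasick

# Initialization: read all matching rules into the engine's internal data structure.
rules = ["cmd.exe", "/etc/passwd", "SELECT * FROM"]
automaton = ahocorasick.Automaton()
for idx, rule in enumerate(rules):
    automaton.add_word(rule, (idx, rule))
automaton.make_automaton()        # build the state machine once, at system startup

# Matching: the inspected part of a packet is fed in as the string to be matched.
payload = "GET /index.php?f=../../etc/passwd HTTP/1.1"
hits = [value for _end, value in automaton.iter(payload)]
print(hits if hits else "no match")
```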
However, the key question for the hybrid engine method is how to divide the rule base among the different algorithms so as to reach the optimal performance. The existing method is very simple and decides only from the length of a rule, whereas the actual performance of the various matching algorithms depends on the rule length, the internal structure of the rules such as common prefixes and common suffixes, and the scale of the rule set. Existing hybrid approaches therefore do not reach the goal of optimal matching efficiency.
Therefore, to solve the problems in the prior art, an optimization system for NIDS devices using a hybrid matching engine that improves rule matching efficiency is urgently needed; it is important to raise the upper limit of the traffic rate the device can process without changing the hardware performance of the network intrusion detection device.
Disclosure of Invention
To overcome the problems in the prior art, the invention provides an optimization system for network intrusion devices adopting a hybrid matching engine, which uses a deep self-encoder and a deep reinforcement learning submodule to optimally divide and assign the matching rule base when the hybrid engine is initialized. The system is suitable both for NIDS devices that match in software and for NIDS devices that match with a dedicated chip or network processor.
In order to achieve the purpose, the invention adopts the following technical scheme:
a system for optimizing a NIDS device employing a hybrid matching engine, comprising: the system comprises a rule base, a depth self-encoder module, a random ordering module, a rule base optimal division training module, a rule base optimal division module, a matching performance measuring module and a matching engine pool, wherein: the rule base is configured to store a rule RDB formed by a plurality of original character strings, wherein the rule RDB is { r }1,r2…rMTherein of
Figure RE-GDA0002916495930000021
LiIs the length of the character string of the ith rule, A is an ASCII code character, i.e. the ith rule is formed by LiEach ASCII code is composed of ASCII codes; the lengths of the rules differ, Li∈[1,LMAX],i∈[1,M]
A deep self-encoder module configured to receive the rules from the rule base and first perform a length-alignment operation on all rules: every rule is padded to the length L_MAX using the shortest character string that does not appear in any rule, appended repeatedly at the tail of each rule string until the length L_MAX is reached. The deep self-encoder then maps the original space A^{L_MAX} to the F-dimensional real space R^F, yielding an embedded representation of each rule in the F-dimensional real space; this representation carries the high-dimensional structural features of the original rule string, so that the influence of these features is fully reflected in the subsequent optimal assignment of rules to the engines.
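For illustration only, a minimal Python sketch of the length alignment and embedding step is given below, assuming a simple PyTorch autoencoder; the padding string, the layer sizes and the value of F are illustrative assumptions, not values fixed by the patent.

```python
# Minimal sketch of the length-alignment and deep self-encoder embedding step.
# The pad string, layer sizes and F are illustrative assumptions, not patent values.
import torch
import torch.nn as nn

L_MAX, F = 64, 16
PAD = "\x00"                          # assumed: a string that appears in no rule

def align(rule: str) -> torch.Tensor:
    """Pad the rule at its tail with the pad string up to L_MAX, then encode bytes."""
    padded = (rule + PAD * L_MAX)[:L_MAX]
    return torch.tensor([ord(c) / 255.0 for c in padded])

class AutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(L_MAX, 128), nn.ReLU(), nn.Linear(128, F))
        self.decoder = nn.Sequential(nn.Linear(F, 128), nn.ReLU(), nn.Linear(128, L_MAX))

    def forward(self, x):
        z = self.encoder(x)           # z is the embedded representation in R^F
        return self.decoder(z), z

rules = ["cmd.exe", "/etc/passwd", "SELECT * FROM"]
x = torch.stack([align(r) for r in rules])
model = AutoEncoder()
recon, embeddings = model(x)              # one F-dimensional vector per rule
loss = nn.functional.mse_loss(recon, x)   # reconstruction loss used to train the encoder
```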
A random ordering module configured to randomly order the rules encoded into R^F and then output them to the rule base optimal division training module and the rule base optimal division module;
The rule base optimal division training module is configured to learn a rule division method, obtain parameters and output the parameters to the rule base optimal division module;
Specifically, the deep reinforcement learning submodule contained in the rule base optimal division training module learns the rule division method; after training on a system-configurable number of matching rules, the parameters of the deep reinforcement learning submodule are obtained and copied to the rule base optimal division module for the final optimal division.
The rule base optimal division module is configured to receive parameters from the rule base optimal division training module;
The matching performance measuring module is configured to receive rule assignment instructions sent by the rule base optimal division training module and the rule base optimal division module; after the rule base of each matching engine in the matching engine pool has been initialized according to the instructions, each matching engine is driven to run a matching test on the test character strings with its current rule base, and the action return value r_t(a_t) of the deep reinforcement learning algorithm DDQN in the optimal division training module and the rule base optimal division module is obtained. A rule assignment instruction from the rule base optimal division training module measures the performance after one specific rule is added, in turn, to each of the different matching engines on top of the rules those engines already hold, whereas an instruction from the optimal division module initializes a complete division scheme of the rule base in all engines at once and measures its performance.
The matching engine pool comprises N matching engines, where N ≥ 2; the matching engine pool is configured to exchange data with the matching performance measuring module.
The rule base optimal division training module and the rule base optimal division module each comprise a deep reinforcement learning submodule; the deep reinforcement learning submodule of the rule base optimal division training module is used to learn the rule division method, and after training on a system-configured number of matching rules its parameters are obtained and output to the rule base optimal division module.
As above, the deep reinforcement learning submodule adopts a DDQN (Double Deep Q-learning Network) deep reinforcement learning model, in which the Q_current value network and the Q_target value network used for estimating Q values adopt the same neural network structure.
The Q_target value network is configured to receive the parameters output after the synchronization delay of the Q_current value network, the parameters including at least the state vector and the action delayed from the Q_current value network, and to output the Q value of that state vector and action. After each round of training and learning of the Q_current value network, the parameters of the Q_current value network are synchronized to the Q_target value network. The matching performance measuring module measures each matching engine in the matching engine pool by running a matching test on the test character string set with each engine's current rule base, and from the measurement computes r_t(a_t), which is used to compute the loss value for training the Q_current value network.
The Q_current value network is configured to receive the state vector S_t and the action a_t, where the state vector S_t is obtained by concatenating the vector RE_t output by the random ordering module with the state vectors of the N engines output by the matching engine pool. Here RE_t ∈ R^F is the embedded code, in R^F, of the rule selected from the rule base by the random ordering module at time t, and the state vector of the i-th engine, i ∈ [1, N], at time t in the matching engine pool is obtained by summing the codes of all rules assigned to the i-th engine before time t. The optimization system further comprises a policy module; the Q_current value network is configured to output the Q value used by the policy module to select the action a_t.
The Q_target value network is configured to receive the state vector S_{t+1} and the action a_{t+1}, which are formed from the state vector S_t and the action a_t of the Q_current value network through the synchronization delay.
The policy module adopts the ε-greedy algorithm: with probability 1-ε it selects, as the current policy, the action a_t that maximizes the output of the Q_current value network under the current state vector S_t, where a_t ∈ [1, N] is the number of the selected matching engine; with probability ε it selects, with equal probability, one of the remaining N-1 engines of the matching engine pool as the current policy.
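A minimal sketch of such an ε-greedy selection, assuming a network that outputs one Q value per engine; the value of ε is an illustrative assumption.

```python
# Sketch of the ε-greedy policy module described above; epsilon is an assumed value.
import random
import torch

def select_engine(q_current, s_t: torch.Tensor, n_engines: int, epsilon: float = 0.1) -> int:
    q_values = q_current(s_t)                       # one Q value per matching engine
    greedy = int(torch.argmax(q_values).item())
    if random.random() < 1.0 - epsilon:
        return greedy                               # exploit: engine with maximal Q value
    # Explore: pick uniformly among the remaining N-1 engines.
    others = [i for i in range(n_engines) if i != greedy]
    return random.choice(others)
```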
As above, the loss value for training the Q_current value network is computed from r_t(a_t), Q_target(S_{t+1}, a_{t+1}) and Q_current(S_t, a_t); the formula itself is given as an equation image in the original. Here n is the total number of samples tested in each round, and r_t(a_t) is the parameter obtained by the matching performance measuring module from the matching test of the matching engines.
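Since the loss formula itself is only available as an image, the following sketch assumes the usual DDQN squared temporal-difference error with a discount factor; it is an assumption about the form of the loss, not the patented formula.

```python
# Sketch of one Q_current training step, assuming the standard DDQN squared
# TD-error with discount factor gamma; the patent's exact loss formula is given
# only as an image, so this form is an assumption.
import torch

def q_current_loss(q_current, q_target, batch, gamma: float = 0.9) -> torch.Tensor:
    # batch: list of (s_t, a_t, r_t, s_next) transitions collected in one round.
    losses = []
    for s_t, a_t, r_t, s_next in batch:
        q_sa = q_current(s_t)[a_t]
        with torch.no_grad():
            a_next = int(torch.argmax(q_current(s_next)).item())   # action chosen by Q_current
            target = r_t + gamma * q_target(s_next)[a_next]        # evaluated by Q_target
        losses.append((target - q_sa) ** 2)
    return torch.stack(losses).mean()        # average over the n samples of the round
```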
As above, if the measured matching completion times for the N matching engines to finish the test character string set are t_i, i ∈ [1, N], and the i-th matching engine is the one selected (i.e. a_t = i), then the value of r_t(a_t) is computed from these completion times by the formula given as an equation image in the original.
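For illustration, the sketch below shows the measurement side, timing each engine's matching test over the test string set; the reward at the end (reciprocal of the slowest engine's completion time) is only a plausible assumption, since the patent's r_t(a_t) formula is given as an image.

```python
# Sketch of the matching performance measurement: time each engine's matching test
# over the test string set. The reward below is an assumption; the patent's
# r_t(a_t) formula is given only as an equation image.
import time

def measure_completion_times(engines, test_strings):
    """engines: list of objects with a .match(text) method (assumed interface);
    returns the completion times t_1..t_N in seconds."""
    times = []
    for engine in engines:
        start = time.perf_counter()
        for text in test_strings:
            engine.match(text)
        times.append(time.perf_counter() - start)
    return times

def assumed_reward(times):
    # Assumed reward shape: the slower the slowest engine, the smaller the return.
    return 1.0 / max(times)
```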
The invention also provides an optimization method applied to the above optimization system for a NIDS device adopting a hybrid matching engine, the optimization method comprising the following steps:
Step S1: take the first N_training rules from the randomly ordered rule base and use them to drive the rule base optimal division training module to train the deep reinforcement learning submodule, where N_training is a system-configurable parameter;
Step S2: after training is completed, copy the parameters of the deep reinforcement learning submodule in the rule base optimal division training module into the deep reinforcement learning submodule in the rule base optimal division module;
Step S3: randomly order the rule base and feed all rules in turn into the rule base optimal division module; the Q_target value network is used directly to compute the Q values and decide which matching engine each rule should belong to, so no training of the Q_current value network is needed; this yields a complete division scheme of the rule base, and the performance measured by driving the matching performance measuring module with this division scheme is recorded;
Step S4: repeat step S3 to obtain, for different random orderings, the division schemes produced by the optimal division module and the correspondingly measured performance; step S3 is repeated N_testimony times, where N_testimony is a system-configurable parameter;
Step S5: from the N_testimony division schemes, select the scheme with the best performance, i.e. the shortest matching completion time measured in steps S3 and S4, as the final optimal division scheme; the target division scheme is the division scheme with the best performance.
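For illustration, steps S1 to S5 can be strung together as in the sketch below; every callable passed in stands for one of the modules described above and is an assumption, not an API defined by the patent.

```python
# Sketch of steps S1-S5 as a driver function; the callables passed in stand for the
# modules described above and are assumptions, not an API defined by the patent.
import random

def optimize_rule_division(rules, n_training, n_testimony,
                           train_submodule, divide_with_params, measure_time):
    # S1: train the DDQN submodule on the first N_training rules of a random ordering.
    ordered = random.sample(rules, len(rules))
    params = train_submodule(ordered[:n_training])

    # S2: the learned parameters are handed to the optimal division module.
    best_scheme, best_time = None, float("inf")
    for _ in range(n_testimony):
        # S3: random re-ordering, divide all rules using Q_target only (no further training).
        reordered = random.sample(rules, len(rules))
        scheme = divide_with_params(params, reordered)
        elapsed = measure_time(scheme)          # matching completion time of this scheme

        # S4/S5: over N_testimony repetitions keep the scheme with the shortest time.
        if elapsed < best_time:
            best_scheme, best_time = scheme, elapsed
    return best_scheme
```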
The invention has the beneficial effects that:
the invention provides an optimization system and an optimization method of NIDS equipment adopting a hybrid matching engine, wherein the system improves the performance of a network intrusion detection system by using the hybrid matching engine and obtains the optimal rule division of the hybrid matching engine by using a deep reinforcement learning-based method; secondly, carrying out dimension reduction embedding expression on the rules in the original rule base by using a depth self-encoder, and reflecting high-dimensional structural features of the original rule character string by the similarity of the embedded expression vector while carrying out dimension reduction to optimize depth reinforcement learning, wherein the high-dimensional structural features are key factors influencing the performance of each matching engine. Moreover, the method of deep reinforcement learning and the method of dimension reduction embedding expression of the deep self-encoder are comprehensively used, and the effects of factors such as other important rule character string structural characteristics influencing the performance of the matching engine except the length of the rule character string in the optimal division of the rule base are fully embodied. In addition, when the complete division of the rule base is finally formed, multiple times of optimized division after random sequencing are adopted, and an optimal scheme is selected from the multiple times of optimized division. The method has the advantages that the calculation complexity is considered, and the problem of non-strict optimization caused by progressive optimization of deep reinforcement learning under the limited rule quantity is relieved. The device has the advantages of simple structure, convenience in operation and high adaptability.
Drawings
FIG. 1 is a schematic structural diagram of an optimization system provided by the present invention;
FIG. 2 is a diagram of an algorithm structure of a rule base optimal partition training module provided by the present invention.
Detailed Description
The following further describes embodiments of the present invention with reference to the drawings.
As shown in fig. 1 to 2, the present embodiment provides an optimization system for a NIDS device using a hybrid matching engine, comprising: a rule base, a deep self-encoder module, a random ordering module, a rule base optimal division training module, a rule base optimal division module, a matching performance measuring module and a matching engine pool, wherein: the rule base is configured to store a rule set RDB = {r_1, r_2, ..., r_M} formed from the original character strings, where r_i ∈ A^{L_i}, L_i is the length of the character string of the i-th rule and A is the set of ASCII characters, i.e. the i-th rule consists of L_i ASCII characters; the rules differ in length, with L_i ∈ [1, L_MAX], i ∈ [1, M].
A deep self-encoder module configured to receive the rules from the rule base and first perform a length-alignment operation on all rules: every rule is padded to the length L_MAX using the shortest character string that does not appear in any rule, appended repeatedly at the tail of each rule string until the length L_MAX is reached. The deep self-encoder then maps the original space A^{L_MAX} to the F-dimensional real space R^F, yielding an embedded representation of each rule in the F-dimensional real space; this representation carries the high-dimensional structural features of the original rule string, so that the influence of these features is fully reflected in the subsequent optimal assignment of rules to the engines.
A random ordering module configured to randomly order the rules encoded into R^F and then output them to the rule base optimal division training module and the rule base optimal division module;
The rule base optimal division training module is configured to learn a rule division method, obtain parameters and output the parameters to the rule base optimal division module;
Specifically, the deep reinforcement learning submodule contained in the rule base optimal division training module learns the rule division method; after training on a system-configurable number of matching rules, the parameters of the deep reinforcement learning submodule are obtained and copied to the rule base optimal division module for the final optimal division.
The rule base optimal division module is configured to receive parameters from the rule base optimal division training module;
The matching performance measuring module is configured to receive rule assignment instructions sent by the rule base optimal division training module and the rule base optimal division module; after the rule base of each matching engine in the matching engine pool has been initialized according to the instructions, each matching engine is driven to run a matching test on the test character strings with its current rule base, and the action return value r_t(a_t) of the deep reinforcement learning algorithm DDQN in the optimal division training module and the rule base optimal division module is obtained. A rule assignment instruction from the rule base optimal division training module measures the performance after one specific rule is added, in turn, to each of the different matching engines on top of the rules those engines already hold, whereas an instruction from the optimal division module initializes a complete division scheme of the rule base in all engines at once and measures its performance.
The matching engine pool comprises N matching engines, where N ≥ 2; the matching engine pool is configured to exchange data with the matching performance measuring module.
In this embodiment, the rule base optimal division training module and the rule base optimal division module each comprise a deep reinforcement learning submodule; the deep reinforcement learning submodule of the rule base optimal division training module is used to learn the rule division method, and after training on a system-configured number of matching rules its parameters are obtained and output to the rule base optimal division module.
In this embodiment, the deep reinforcement learning submodule adopts a DDQN (Double Deep Q-learning Network) deep reinforcement learning model, in which the Q_current value network and the Q_target value network used for estimating Q values adopt the same neural network structure.
The Q_target value network is configured to receive the parameters output after the synchronization delay of the Q_current value network, the parameters including at least the state vector and the action delayed from the Q_current value network, and to output the Q value of that state vector and action. After each round of training and learning of the Q_current value network, the parameters of the Q_current value network are synchronized to the Q_target value network. The matching performance measuring module measures each matching engine in the matching engine pool by running a matching test on the test character string set with each engine's current rule base, and from the measurement computes r_t(a_t), which is used to compute the loss value for training the Q_current value network.
In this embodiment, the Q_current value network is configured to receive the state vector S_t and the action a_t, where the state vector S_t is obtained by concatenating the vector RE_t output by the random ordering module with the state vectors of the N engines output by the matching engine pool. Here RE_t ∈ R^F is the embedded code, in R^F, of the rule selected from the rule base by the random ordering module at time t, and the state vector of the i-th engine, i ∈ [1, N], at time t in the matching engine pool is obtained by summing the codes of all rules assigned to the i-th engine before time t; the action a_t is output to the Q_current value network by the policy module.
The Q_target value network is configured to receive the state vector S_{t+1} and the action a_{t+1}, which are formed from the state vector S_t and the action a_t of the Q_current value network through the synchronization delay.
In this embodiment, the policy module adopts the ε-greedy algorithm: with probability 1-ε it selects, as the current policy, the action a_t that maximizes the output of the Q_current value network under the current state vector S_t, where a_t ∈ [1, N] is the number of the selected matching engine; with probability ε it selects, with equal probability, one of the remaining N-1 engines of the matching engine pool as the current policy.
In this embodiment, the loss value for training the Q_current value network is computed from r_t(a_t), Q_target(S_{t+1}, a_{t+1}) and Q_current(S_t, a_t); the formula itself is given as an equation image in the original. Here n is the total number of samples tested in each round, and r_t(a_t) is the parameter in the loss value formula computed by the matching performance measuring module.
In this embodiment, if the measured matching completion times for the N matching engines to finish the test character string set are t_i, i ∈ [1, N], and the i-th matching engine is the one selected, then the value of r_t(a_t) is computed from these completion times by the formula given as an equation image in the original.
This embodiment further provides an optimization method applied to the above NIDS device using the hybrid matching engine, the optimization method comprising the following steps:
Step S1: take the first N_training rules from the randomly ordered rule base and use them to drive the rule base optimal division training module to train the deep reinforcement learning submodule, where N_training is a system-configurable parameter;
Step S2: after training is completed, copy the parameters of the deep reinforcement learning submodule in the rule base optimal division training module into the deep reinforcement learning submodule in the rule base optimal division module;
Step S3: randomly order the rule base and feed all rules in turn into the rule base optimal division module; the Q_target value network is used directly to compute the Q values and decide which matching engine each rule should belong to, so no training of the Q_current value network is needed; this yields a complete division scheme of the rule base, and the performance measured by driving the matching performance measuring module with this division scheme is recorded;
Step S4: repeat step S3 to obtain, for different random orderings, the division schemes produced by the optimal division module and the correspondingly measured performance; step S3 is repeated N_testimony times, where N_testimony is a system-configurable parameter;
Step S5: from the above N_testimony division schemes, select the scheme with the best performance, i.e. the shortest matching completion time, as the final optimal division scheme.
Variations and modifications to the above-described embodiments may occur to those skilled in the art, which fall within the scope and spirit of the above description. Therefore, the present invention is not limited to the specific embodiments disclosed and described above, and some modifications and variations of the present invention should fall within the scope of the claims of the present invention. Furthermore, although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims (7)

1. A system for optimizing a NIDS device using a hybrid matching engine, comprising a rule base, a deep self-encoder module, a random ordering module, a rule base optimal division training module, a rule base optimal division module, a matching performance measuring module and a matching engine pool, wherein:
the rule base is configured to store a rule RDB formed by a plurality of original character strings, wherein the rule RDB is { r }1,r2...rMTherein of
Figure FDA0003055274470000011
LiIs the length of the character string of the ith rule, A is an ASCII code character, i.e. the ith rule is formed by LiEach ASCII code is composed of ASCII codes; the lengths of the rules differ, Li∈[1,LMAX],i∈[1,M];
a deep self-encoder module configured to receive the rules from the rule base, wherein all rules are first length-aligned by padding to the length L_MAX, the padding string being the shortest character string that does not appear in any rule and being repeatedly appended at the tail of each rule string until the length L_MAX is reached; the deep self-encoder module encodes each rule and represents it as a vector of uniform length F, mapping the original space A^{L_MAX} to the F-dimensional real space R^F and obtaining the embedded representation of each rule in the F-dimensional real space;
a random ordering module configured to randomly order the rules encoded into R^F and then output them to the rule base optimal division training module and the rule base optimal division module;
the rule base optimal division training module is configured to learn the rule division method with the deep reinforcement learning algorithm DDQN, obtain parameters and output the parameters to the rule base optimal division module;
the rule base optimal division module is configured to receive the parameters from the rule base optimal division training module and perform the final optimal division of the rule base with the deep reinforcement learning algorithm DDQN;
the matching performance measuring module is configured to receive rule assignment instructions sent by the rule base optimal division training module and the rule base optimal division module; after the rule base of each matching engine in the matching engine pool has been initialized according to the instructions, each matching engine is driven to run a matching test on the test character strings with its current rule base, and the action return value r_t(a_t) of the deep reinforcement learning algorithm DDQN in the optimal division training module and the rule base optimal division module is obtained;
the matching engine pool comprises N matching engines, where N ≥ 2; the matching engine pool is configured to exchange data with the matching performance measuring module;
the rule base optimal division training module and the rule base optimal division module each comprise a deep reinforcement learning submodule; the deep reinforcement learning submodule of the rule base optimal division training module is used to learn the rule division method, and after training on a system-configured number of matching rules its parameters are obtained and output to the rule base optimal division module;
the deep reinforcement learning submodule adopts a DDQN deep reinforcement learning model; wherein Q is used for estimating Q valuecurrentValue network and QtargetThe value network adopts the same neural network structure;
the Q_target value network is configured to receive the parameters output after the synchronization delay of the Q_current value network, the parameters including at least the state vector and the action delayed from the Q_current value network, and to output the Q value of that state vector and action; after each round of training and learning of the Q_current value network, the parameters of the Q_current value network are synchronized to the Q_target value network; the Q_current value network is configured to receive the state vector S_t and the action a_t, wherein the state vector S_t is obtained by concatenating the vector RE_t output by the random ordering module with the state vectors of the N engines output by the matching engine pool; RE_t ∈ R^F is the embedded code, in R^F, of the rule selected from the rule base by the random ordering module at time t, and the state vector of the i-th engine at time t in the matching engine pool is obtained by summing the codes of all rules assigned to the i-th engine before time t; the action a_t is output to the Q_current value network by the policy module;
the Q_target value network is configured to receive the state vector S_{t+1} and the action a_{t+1}, which are formed from the state vector S_t and the action a_t of the Q_current value network through the synchronization delay.
2. The optimization system of claim 1, further comprising a policy module; the Q_current value network is configured to output the Q value for the policy module to select an action.
3. The optimization system of claim 2, wherein the policy module adopts the ε-greedy algorithm: with probability 1-ε it selects, as the current policy, the action a_t that maximizes the output of the Q_current value network under the current state vector S_t, where a_t ∈ [1, N] is the number of the selected matching engine; with probability ε it selects, with equal probability, one of the remaining N-1 engines of the matching engine pool as the current policy.
4. The optimization system of claim 1, wherein the matching performance measuring module runs a matching test through each matching engine in the matching engine pool and, after the measurement, computes the parameter r_t(a_t).
5. The optimization system of claim 4, wherein the loss value for training the Q_current value network is computed from r_t(a_t), Q_target(S_{t+1}, a_{t+1}) and Q_current(S_t, a_t); the formula itself is given as an equation image in the original; n is the total number of samples tested in each round, and r_t(a_t) is the parameter obtained by the matching performance measuring module from the matching test of the matching engines.
6. The optimization system of claim 1, wherein, if the measured matching completion times for the N matching engines to finish the test character string set are t_i, i ∈ [1, N], and the i-th matching engine is the one selected, the value of r_t(a_t) is computed from these completion times by the formula given as an equation image in the original.
7. An optimization method applied to the NIDS device adopting the hybrid matching engine according to any one of claims 1 to 6, the optimization method comprising the following steps:
step S1: take the first N_training rules from the randomly ordered rule base and use them to drive the rule base optimal division training module to train the deep reinforcement learning submodule, wherein N_training is a system-configurable parameter;
step S2: after the training is completed, copy the parameters of the deep reinforcement learning submodule in the rule base optimal division training module into the deep reinforcement learning submodule in the rule base optimal division module;
step S3: randomly order the rule base and feed all rules in turn into the rule base optimal division module; the Q_target value network is used directly to compute the Q values and decide which matching engine each rule should belong to, so that no training of the Q_current value network is needed; a complete division scheme of the rule base is obtained, and the performance measured by driving the matching performance measuring module with this division scheme is recorded;
step S4: repeat step S3 to obtain, for different random orderings, the division schemes produced by the optimal division module and the correspondingly measured performance, step S3 being repeated N_testimony times, wherein N_testimony is a system-configurable parameter;
step S5: from the N_testimony division schemes, select the scheme with the best performance, i.e. the shortest matching completion time measured in steps S3 and S4, as the final optimal division scheme.
CN202011229281.XA 2020-11-06 2020-11-06 Optimization system and method for NIDS device adopting hybrid matching engine Active CN112464047B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011229281.XA CN112464047B (en) 2020-11-06 2020-11-06 Optimization system and method for NIDS device adopting hybrid matching engine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011229281.XA CN112464047B (en) 2020-11-06 2020-11-06 Optimization system and method for NIDS device adopting hybrid matching engine

Publications (2)

Publication Number Publication Date
CN112464047A CN112464047A (en) 2021-03-09
CN112464047B true CN112464047B (en) 2021-07-02

Family

ID=74826263

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011229281.XA Active CN112464047B (en) 2020-11-06 2020-11-06 Optimization system and method for NIDS device adopting hybrid matching engine

Country Status (1)

Country Link
CN (1) CN112464047B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108809926A (en) * 2017-12-25 2018-11-13 北京安天网络安全技术有限公司 Inbreak detection rule optimization method, device, electronic equipment and storage medium
CN111031073A (en) * 2020-01-03 2020-04-17 广东电网有限责任公司电力科学研究院 Network intrusion detection system and method
CN111556018A (en) * 2020-03-25 2020-08-18 中国科学院信息工程研究所 CNN-based network intrusion detection method and electronic device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7478075B2 (en) * 2006-04-11 2009-01-13 Sun Microsystems, Inc. Reducing the size of a training set for classification
CN101296114B (en) * 2007-04-29 2011-04-20 国际商业机器公司 Parallel pattern matching method and system based on stream
CN105376167A (en) * 2009-10-28 2016-03-02 惠普公司 Distributed packet stream inspection and processing
CN106776456B (en) * 2017-01-18 2019-06-18 中国人民解放军国防科学技术大学 High speed regular expression matching hybrid system and method based on FPGA+NPU
CN110365659B (en) * 2019-06-26 2020-08-04 浙江大学 Construction method of network intrusion detection data set in small sample scene

Also Published As

Publication number Publication date
CN112464047A (en) 2021-03-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant