CN110708215A - Deep packet inspection rule base generation method and device, network equipment and storage medium - Google Patents

Publication number
CN110708215A
Authority
CN
China
Prior art keywords
rule, rule base, data packet, data, generated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910957075.1A
Other languages
Chinese (zh)
Inventor
石仟华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Onething Technology Co Ltd
Original Assignee
Shenzhen Onething Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Onething Technology Co Ltd filed Critical Shenzhen Onething Technology Co Ltd
Priority to CN201910957075.1A priority Critical patent/CN110708215A/en
Publication of CN110708215A publication Critical patent/CN110708215A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/028 Capturing of monitoring data by filtering
    • H04L43/026 Capturing of monitoring data using flow identification

Abstract

The invention provides a deep packet inspection rule base generation method, comprising: receiving a data packet and identifying the data stream to which the data packet belongs; determining whether the data packet matches a rule in a rule base; when the data packet does not match any rule in the rule base, extracting a feature code of the data packet; generating a rule according to the feature code; determining whether the generated rule is valid according to the generated rule and other data packets in the data stream; and updating the rule base when the generated rule is determined to be valid. The invention also provides a deep packet inspection rule base generation device, network equipment and a storage medium. The invention can automatically generate rules for the data packets of new applications so as to update the existing rule base in real time, thereby improving the application recognition rate.

Description

Deep packet inspection rule base generation method and device, network equipment and storage medium
Technical Field
The present invention relates to the field of data network technologies, and in particular, to a method and an apparatus for generating a deep packet inspection rule base, a network device, and a storage medium.
Background
Deep Packet Inspection (DPI) is a high-speed inspection method for network data, mainly used to inspect the content of the payload field of network packets. The technology is widely applied in Intrusion Prevention Systems (IPS) and Intrusion Detection Systems (IDS).
At present, a deep packet inspection rule base is generated by manually identifying applications, extracting their feature codes and then compiling the rules. However, as the number of applications grows, new applications may appear at any time, and the existing deep packet inspection rule base cannot accurately identify them.
Therefore, there is a need for a deep packet inspection rule base generation scheme that can update DPI rule features promptly as new applications appear.
Disclosure of Invention
The invention mainly aims to provide a method and a device for generating a deep packet inspection rule base, network equipment and a storage medium, and aims to solve the technical problem that the existing rule base cannot identify new applications in time.
In order to achieve the above object, a first aspect of the present invention provides a method for generating a deep packet inspection rule base, where the method includes:
receiving a data packet and identifying a data stream to which the data packet belongs;
judging whether the data packet is matched with a rule in a rule base;
when the data packet does not match with the rule in the rule base, extracting the feature code of the data packet;
generating a rule according to the feature code;
judging whether the generated rule is effective or not according to the generated rule and other data packets in the data stream;
updating the rule base when the generated rule is determined to be valid.
According to an optional embodiment of the present invention, the determining whether the generated rule is valid according to the generated rule and other data packets in the data stream includes:
in the next round of DPI detection, matching other data packets in the data flow by using the generated rule and calculating a matching passing rate;
judging whether the matching passing rate is greater than or equal to a preset matching rate threshold;
when the matching passing rate is greater than or equal to the preset matching rate threshold, determining that the generated rule is valid;
and when the matching passing rate is smaller than the preset matching rate threshold, determining that the generated rule is invalid.
According to an alternative embodiment of the present invention, the updating the rule base when it is determined that the generated rule is valid includes:
reporting the effective rule and the corresponding matching passing rate to a server;
receiving a final rule base generated by the server according to all the matching passing rates;
and updating the rule base into the final rule base.
According to an optional embodiment of the present invention, extracting the feature code of the data packet comprises:
extracting the header information of the data packet in each layer of protocol;
acquiring a plurality of target information in each piece of header information;
and connecting the target information to obtain the feature code.
According to an optional embodiment of the present invention, the extracting header information of the packet at each layer protocol comprises: extracting the payload of the data packet in an application layer protocol; the obtaining the plurality of target information in each of the header information comprises: acquiring a command code in the payload as target information, or acquiring Type information in the payload as target information.
According to an alternative embodiment of the present invention, the generating the rule according to the feature code includes:
and performing regular-expression matching on all characters in the feature code according to preset basic regular-expression rules to obtain the rule.
According to an optional embodiment of the present invention, when it is determined that the data packet matches a rule in the rule base, the method further comprises:
determining the application of the data packet according to the matched rule;
and distributing the data packets to corresponding links according to the application of the data packets.
To achieve the above object, a second aspect of the present invention provides a deep packet inspection rule base generation apparatus, including:
the receiving module is used for receiving the data packet and identifying the data stream to which the data packet belongs;
the matching module is used for judging whether the data packet is matched with the rule in the rule base;
the extraction module is used for extracting the feature codes of the data packets when the data packets are not matched with the rules in the rule base;
the generating module is used for generating rules according to the feature codes;
the judging module is used for judging whether the generated rule is effective or not according to the generated rule and other data packets in the data stream;
and the updating module is used for updating the rule base when the generated rule is determined to be effective.
In order to achieve the above object, a third aspect of the present invention provides a network device, which includes a memory and a processor, wherein the memory stores a downloaded program of a deep packet inspection rule base generation method executable on the processor, and the downloaded program of the deep packet inspection rule base generation method implements the deep packet inspection rule base generation method when executed by the processor.
To achieve the above object, a fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a downloaded program of a deep packet inspection rule base generation method, the downloaded program of the deep packet inspection rule base generation method being executable by one or more processors to implement the deep packet inspection rule base generation method.
According to the method, the device, the network equipment and the storage medium for generating the deep packet inspection rule base provided by the embodiments of the invention, for each new application a rule is automatically generated from the application's data packets and the existing rule base is updated in real time, so that the identification rate of application traffic can be improved. Rule generation also does not lag behind the appearance of the application. The more network nodes there are, the more rules are generated automatically; the generated rules are further evaluated at the server to determine their validity, which reduces the cost of manual identification.
Drawings
Fig. 1 is a schematic flowchart illustrating a deep packet inspection rule base generation method according to a first embodiment of the present invention;
Fig. 2 is a functional block diagram of a deep packet inspection rule base generating device according to a second embodiment of the present invention;
Fig. 3 is a schematic internal structure diagram of a network device according to a third embodiment of the present invention.
The objects, features and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first" and "second" in the description and claims of the present application and the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
Example one
Fig. 1 is a flowchart illustrating a method for generating a deep packet inspection rule base according to a first embodiment of the present invention.
The deep packet inspection rule base generation method may be applied to a network device, and the network device may include: switches, routers, firewall devices or other network security devices, and the like. The method for generating the deep packet inspection rule base specifically comprises the following steps, and according to different requirements, the sequence of the steps in the flowchart can be changed, and some steps can be omitted.
S11, receiving the data packet and identifying the data stream to which the data packet belongs.
A data packet (Packet) is the transmission unit of the network layer in the TCP/IP protocol suite and is also its minimum unit. Packets that share the same four-tuple (e.g., source address, source port, destination address, destination port) form a data flow, so one data stream contains multiple packets. For example, a TCP data stream may contain 3 client packets and 40 server packets.
In an application scenario, a terminal may send a data packet to a server, and a network device may match the data packet with a rule when receiving the data packet, and release the data packet to the server after the matching is successful.
After receiving an application packet, the network device may identify the data stream to which the packet belongs by obtaining relevant information in the packet, for example, the source IP address, destination IP address, Host name, IP protocol type (TCP/UDP/ICMP), source port number and/or destination port number range. Identifying the data flow to which a packet belongs is prior art and is not described in detail here. For example, in the Linux kernel, data streams may be identified using the netfilter connection tracking functionality.
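As a minimal illustration of step S11, the sketch below groups packets into flows using a direction-independent key, so that client and server packets of one connection land in the same flow. The `Packet` type, its field names and the `flow_key` and `group_by_flow` helpers are hypothetical and are not taken from the patent; the key also includes the protocol type, which the text lists among the identifying information.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical, simplified view of a captured packet."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str
    payload: bytes


def flow_key(pkt: Packet) -> tuple:
    """Return a key shared by both directions of the same flow."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    # Sort the endpoints so client->server and server->client packets agree.
    first, second = sorted([a, b])
    return (pkt.proto, first, second)


def group_by_flow(packets):
    """Collect packets into a dict keyed by flow."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)
```

A production device would normally rely on existing connection tracking rather than a dictionary like this one; the sketch only shows what "same four-tuple" means in practice.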
S12, judging whether the data packet matches a rule in the rule base.
A rule base is usually pre-stored in the network device so that received data packets can be matched against it, thereby implementing the DPI function. The rule base may be an Intrusion Prevention System (IPS) rule base, a Uniform Resource Locator (URL) classification rule base, etc. The rule base currently stored by the network device is called the current rule base. The current rule base is not a database, but a set of rules for matching the data packets of different applications, each rule being a description of a condition. The current rule base comprises at least one rule and a rule tree constructed from the at least one rule, wherein each rule is a character string.
The network device performs security control according to the result of matching the data packet against the rules in the rule base. When the data packet matches a rule in the rule base, the data packet is released to the server. When the data packet does not match any rule in the rule base, S13 may be performed, or the data packet may be discarded.
S13, when the data packet does not match any rule in the rule base, extracting the feature code of the data packet.
When the rule base contains multiple rules, the data packet needs to be matched against each of them. When the data packet matches one rule, the data packet is considered to match the rule base; when the data packet fails to match every rule, it is considered not to match the rule base.
The matching process of the data packet and the rule is prior art, and the present invention will not be described in detail herein.
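Although the matching process is left to the prior art, a small sketch may make the "match any rule" semantics concrete. It assumes the rules are regular-expression strings stored in a dictionary; the `RULE_BASE` contents and the `match_rule_base` helper are hypothetical examples, and a real DPI engine would typically use a compiled rule tree rather than a linear scan.

```python
import re
from typing import Optional

# Hypothetical rule base: application name -> regular-expression rule string.
RULE_BASE = {
    "http": r"^(GET|POST|HEAD) \S+ HTTP/1\.[01]",
    "ssh": r"^SSH-2\.0-",
}


def match_rule_base(payload: bytes, rule_base: dict) -> Optional[str]:
    """Return the name of the first rule the payload matches, or None."""
    # latin-1 maps every byte to exactly one character, so decoding never fails.
    text = payload.decode("latin-1")
    for name, pattern in rule_base.items():
        if re.search(pattern, text):
            return name  # one successful match means the rule base matched
    return None  # every rule failed: the packet does not match the rule base
```

A `None` result corresponds to the branch in which the feature code of the packet is extracted.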
According to an optional embodiment of the present invention, extracting the feature code of the data packet comprises:
extracting the header information of the data packet in each layer of protocol;
acquiring a plurality of target information in each piece of header information;
and connecting the target information to obtain the feature code.
According to an optional embodiment of the present invention, the extracting header information of the packet at each layer protocol comprises: extracting the payload of the data packet in an application layer protocol; the obtaining the plurality of target information in each of the header information comprises: acquiring a command code in the payload as target information, or acquiring Type information in the payload as target information.
Each data packet has header information at each protocol layer, and the following contents can be extracted from that header information: remote IP, source and destination ports, layer-3 protocol number, layer-4 protocol number, domain name, host name, and HTTP-related header information.
Application layer protocols are generally customized by a manufacturer or an application. They generally take a type-length-value (TLV) form, and the feature code can be extracted by analyzing the regularities in the payload information of the application layer protocol. When the payload information consists of displayable characters, the command code is generally the main feature, and extraction is performed word by word. When the payload information is binary data, it is usually the header of a private protocol, and the corresponding Type information is extracted from the data at the corresponding position.
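The extraction described above can be sketched as follows. This is only an illustration under stated assumptions: the first word of a displayable payload is treated as the command code, the first byte of a binary payload is treated as the TLV Type field, and the selected fields are joined with `|`. The function names, the separator and the choice of fields are hypothetical.

```python
def is_displayable(payload: bytes) -> bool:
    """True if every byte is printable ASCII or common whitespace."""
    return all(32 <= b < 127 or b in (9, 10, 13) for b in payload)


def payload_target(payload: bytes) -> str:
    """Pick the target information from an application-layer payload."""
    if not payload:
        return ""
    if is_displayable(payload):
        # Text protocol: take the first word as the command code.
        words = payload.split()
        return words[0].decode("ascii") if words else ""
    # Binary protocol: assume a TLV layout whose first byte is the Type field.
    return "type=0x%02x" % payload[0]


def extract_feature_code(proto: str, dst_port: int, payload: bytes) -> str:
    """Join selected header fields and the payload target into one feature code."""
    parts = [proto, str(dst_port), payload_target(payload)]
    return "|".join(parts)
```

For instance, `extract_feature_code("TCP", 21, b"USER anonymous\r\n")` yields `TCP|21|USER` under these assumptions.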
S14, generating a rule according to the feature code.
The network device generates, from the extracted feature code, a rule that the DPI system can use to match data packets. The rule base of each DPI system has its own format: some match using character strings or regular expressions, and some match using bytecode generated from Berkeley Packet Filter (BPF) rules.
According to an alternative embodiment of the present invention, the generating the rule according to the feature code includes:
and performing regular-expression matching on all characters in the feature code according to preset basic regular-expression rules to obtain the rule.
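The patent does not spell out its "basic rules", so the following sketch assumes one plausible set: literal characters are escaped, runs of ASCII digits are generalized to `\d+`, and the result is anchored at the start of the payload. The `generate_rule` function and these conventions are assumptions made for illustration only.

```python
import re

DIGITS = "0123456789"


def generate_rule(feature_code: str) -> str:
    """Turn a feature code into an anchored regular-expression rule."""
    out = []
    i = 0
    while i < len(feature_code):
        ch = feature_code[i]
        if ch in DIGITS:
            # Collapse a whole run of digits into a single \d+ token.
            while i < len(feature_code) and feature_code[i] in DIGITS:
                i += 1
            out.append(r"\d+")
        else:
            # Escape everything else so it is matched literally.
            out.append(re.escape(ch))
            i += 1
    return "^" + "".join(out)
```

With this convention a feature code such as `USER id42` produces a rule that also matches `USER id7`, which is the kind of generalization a per-application rule needs; a DPI system that consumes BPF bytecode would need a different generator.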
S15, judging whether the generated rule is valid according to the generated rule and other data packets in the data stream.
Because a rule generated from a single packet does not guarantee that the other packets in the same data stream will also match it, it is necessary to determine whether the generated rule is valid.
The determining whether the generated rule is valid according to the generated rule and other data packets in the data stream includes:
in the next round of DPI detection, matching other data packets in the data flow by using the generated rule and calculating a matching passing rate;
judging whether the matching passing rate is greater than or equal to a preset matching rate threshold;
when the matching passing rate is greater than or equal to the preset matching rate threshold, determining that the generated rule is valid;
and when the matching passing rate is smaller than the preset matching rate threshold, determining that the generated rule is invalid.
In this alternative embodiment, the generated rule needs to be written back into the DPI system and used for matching in the next round of DPI detection. If most of the data packets in the data stream correctly match the generated rule, rule generation for the application is considered successful and the generated rule is valid. Otherwise, when most of the data packets in the data stream cannot be correctly matched with the generated rule, rule generation for the application is considered to have failed and the generated rule is invalid.
For example, assuming that there are 20 data packets in a certain data stream, after a rule is generated from the first data packet, the remaining 19 data packets are each matched against the generated rule and the matching pass rate is calculated. When the matching pass rate is greater than or equal to a preset matching rate threshold, the generated rule is considered valid. When the matching pass rate is smaller than the preset matching rate threshold, the generated rule is considered invalid; in that case, the feature code of the first data packet can be extracted again and a rule generated from it, or the feature code of the second data packet can be extracted, a rule generated from it, and that rule then matched against the first data packet.
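The validity check in the example above reduces to a ratio and a comparison. The sketch below assumes the rule is a regular expression and the remaining packets are available as raw payloads; the function names and the default threshold of 0.9 are hypothetical, since the patent only speaks of a "preset" threshold.

```python
import re


def match_pass_rate(rule: str, payloads: list) -> float:
    """Fraction of payloads that match the rule (0.0 when there are none)."""
    if not payloads:
        return 0.0
    hits = sum(1 for p in payloads if re.search(rule, p.decode("latin-1")))
    return hits / len(payloads)


def is_rule_valid(rule: str, payloads: list, threshold: float = 0.9) -> bool:
    """Valid when the pass rate is greater than or equal to the threshold."""
    return match_pass_rate(rule, payloads) >= threshold
```

If 18 of the remaining 19 packets match, the pass rate is 18/19 (about 94.7%), which passes a 90% threshold but fails a 95% one.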
S16, when the generated rule is determined to be valid, updating the rule base.
A received data packet that does not match any rule in the current rule base may belong to a new application. A rule is generated from the new application's data packet and added to the current rule base, so that when data packets of the new application are received again they can be matched directly against the updated rule base.
According to an alternative embodiment of the present invention, the updating the rule base when it is determined that the generated rule is valid includes:
reporting the effective rule and the corresponding matching passing rate to a server;
receiving a final rule base generated by the server according to all the matching passing rates;
and updating the rule base into the final rule base.
Although the rule may be generated based on the relevant information extracted from the data packet, the rule may not be accurate enough and thus needs to be uploaded to a server for further analysis and aggregation.
The server stores a rule list that records the newly generated rules received from all network devices and the matching pass rate corresponding to each rule. The server then performs a calculation and determines the final rule base according to the result. For example, assume that network device 1 sends the server rule 1 with a matching pass rate of 90% and rule 2 with a matching pass rate of 98%; network device 2 sends rule 1 with a matching pass rate of 96% and rule 2 with a matching pass rate of 97%; and network device 3 sends rule 1 with a matching pass rate of 92% and rule 2 with a matching pass rate of 99%. The server then calculates an average matching pass rate of 92.7% for rule 1 and 98% for rule 2. Since the average matching pass rate of rule 1 is less than the preset matching pass rate threshold (e.g., 95%) and that of rule 2 is greater than the threshold, the server adds rule 2 to the rule base and issues the updated final rule base to the network devices.
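The server-side calculation in this example can be reproduced with a few lines. The sketch assumes the server simply averages the reported pass rates per rule, as the example does; the report format, the `aggregate_rules` name and the choice to accept a rule whose average equals the threshold are assumptions.

```python
from collections import defaultdict


def aggregate_rules(reports, threshold=0.95):
    """Average each rule's reported pass rates and keep rules at or above the threshold."""
    rates = defaultdict(list)
    for _device, rule, rate in reports:
        rates[rule].append(rate)
    averages = {rule: sum(v) / len(v) for rule, v in rates.items()}
    accepted = sorted(rule for rule, avg in averages.items() if avg >= threshold)
    return averages, accepted


# The figures from the example: three devices, two rules.
reports = [
    ("device1", "rule1", 0.90), ("device1", "rule2", 0.98),
    ("device2", "rule1", 0.96), ("device2", "rule2", 0.97),
    ("device3", "rule1", 0.92), ("device3", "rule2", 0.99),
]
averages, accepted = aggregate_rules(reports)
```

Here rule 1 averages (90% + 96% + 92%) / 3, about 92.7%, and rule 2 averages 98%, so only rule 2 reaches the 95% threshold.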
According to an optional embodiment of the present invention, when it is determined that the data packet matches a rule in the rule base, the method further comprises:
determining the application of the data packet according to the matched rule;
and distributing the data packets to corresponding links according to the application of the data packets.
In this alternative embodiment, a data stream contains a plurality of data packets, and a data stream corresponds to one application. Therefore, once the application to which a data flow belongs has been determined, the subsequent data packets in that data flow do not need to undergo DPI detection again.
At present, the server and egress bandwidth resources of Internet users are limited, and link stability and real-time performance are not high. Users therefore often rent several higher-quality telecom carrier links for important services that require high real-time performance and stability, and rent ordinary links for unimportant services, so as to improve working efficiency and the utilization of network resources. In this scenario, a traffic steering function is required to steer traffic to an appropriate link according to the application type and the user policy.
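A minimal sketch of this steering idea follows. It caches the application identified for each flow so that later packets of the same flow skip inspection, then looks up the outgoing link in a policy table. The `FlowSteerer` class, the `LINK_POLICY` entries and the link names are hypothetical and only illustrate the mechanism.

```python
# Hypothetical user policy: application name -> outgoing link.
LINK_POLICY = {"voip": "premium", "video-conference": "premium"}
DEFAULT_LINK = "ordinary"


class FlowSteerer:
    def __init__(self, classify):
        self.classify = classify  # callable: payload -> application name or None
        self.flow_app = {}        # flow key -> identified application
        self.inspections = 0      # number of DPI lookups actually performed

    def link_for(self, flow_key, payload: bytes) -> str:
        """Return the link on which a packet of the given flow should be sent."""
        app = self.flow_app.get(flow_key)
        if app is None:
            self.inspections += 1
            app = self.classify(payload)
            if app is not None:
                # Cache the result so later packets of this flow skip DPI.
                self.flow_app[flow_key] = app
        return LINK_POLICY.get(app, DEFAULT_LINK)
```

Unidentified traffic falls back to the ordinary link in this sketch; a real policy might instead hold such packets for rule generation.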
According to the method for generating the deep packet inspection rule base, a data packet is received and matched against the rules in the existing rule base to determine whether the packet belongs to a new application. When the packet does not match any rule in the existing rule base, the packet is taken to belong to a new application, and a rule is generated by extracting the feature code of the data packet so as to update the current rule base. By identifying the data stream to which the data packet belongs, the other data packets in the data stream are matched against the generated rule to obtain a matching pass rate, and whether the generated rule is valid is determined according to that rate. Rules can therefore be generated automatically and added to the rule base for each new application, improving the identification rate of application traffic. Rule generation also does not lag behind the appearance of the application. The more network nodes there are, the more rules are generated automatically; the generated rules are further evaluated at the server to determine their validity, which reduces the cost of manual identification.
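To tie steps S12 to S16 together, the following compact sketch processes the payloads of a single flow on the device side. It is deliberately simplified and rests on several assumptions: rules are regular expressions held in a list, the feature code is just the first word of the first payload, and the rule base is updated locally rather than through the server. The `process_flow` function is hypothetical.

```python
import re


def process_flow(payloads, rule_base, threshold=0.9):
    """Return (rule, updated): the matched or newly added rule and whether the base changed."""
    if not payloads:
        return None, False
    first, others = payloads[0], payloads[1:]
    text = first.decode("latin-1")
    for rule in rule_base:
        if re.search(rule, text):
            return rule, False               # S12: known application
    words = text.split()
    if not words or not others:
        return None, False                   # nothing to extract or to validate against
    candidate = "^" + re.escape(words[0])    # S13 + S14: feature code -> rule
    hits = sum(1 for p in others if re.search(candidate, p.decode("latin-1")))
    if hits / len(others) >= threshold:      # S15: validate on the rest of the flow
        rule_base.append(candidate)          # S16: update the rule base
        return candidate, True
    return None, False
```

In the full scheme the valid rule and its pass rate would be reported to the server, which decides on the final rule base.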
Example two
Fig. 2 is a schematic diagram of functional modules of a deep packet inspection rule base generation apparatus according to a second embodiment of the present invention.
In some embodiments, the deep packet inspection rule base generation device 20 runs in a resource server. The deep packet inspection rule base generation device 20 may include a plurality of functional modules composed of program code segments. The program code of each program segment in the deep packet inspection rule base generation device 20 may be stored in a memory of a network device and executed by at least one processor to perform all or part of the steps of the deep packet inspection rule base generation method (see Fig. 1 for details).
In this embodiment, the deep packet inspection rule base generating device 20 may be divided into a plurality of functional modules according to the functions executed by the device. The functional module may include: the device comprises a receiving module 201, a matching module 202, an extracting module 203, a generating module 204, a judging module 205, an updating module 206, a determining module 207 and a distributing module 208. The module referred to herein is a series of computer program segments capable of being executed by at least one processor and capable of performing a fixed function and is stored in memory. In the present embodiment, the functions of the modules will be described in detail in the following embodiments.
The receiving module 201 is configured to receive a data packet and identify a data stream to which the data packet belongs.
A data packet (Packet) is the transmission unit of the network layer in the TCP/IP protocol suite and is also its minimum unit. Packets that share the same four-tuple (e.g., source address, source port, destination address, destination port) form a data flow, so one data stream contains multiple packets. For example, a TCP data stream may contain 3 client packets and 40 server packets.
In an application scenario, a terminal may send a data packet to a server, and a network device may match the data packet with a rule when receiving the data packet, and release the data packet to the server after the matching is successful.
After receiving an application packet, the network device may identify the data stream to which the packet belongs by obtaining relevant information in the packet, for example, the source IP address, destination IP address, Host name, IP protocol type (TCP/UDP/ICMP), source port number and/or destination port number range. Identifying the data flow to which a packet belongs is prior art and is not described in detail here. For example, in the Linux kernel, data streams may be identified using the netfilter connection tracking functionality.
The matching module 202 is configured to determine whether the data packet matches a rule in a rule base.
A rule base is usually pre-stored in the network device so that received data packets can be matched against it, thereby implementing the DPI function. The rule base may be an Intrusion Prevention System (IPS) rule base, a Uniform Resource Locator (URL) classification rule base, etc. The rule base currently stored by the network device is called the current rule base. The current rule base is not a database, but a set of rules for matching the data packets of different applications, each rule being a description of a condition. The current rule base comprises at least one rule and a rule tree constructed from the at least one rule, wherein each rule is a character string.
The network device performs security control according to the result of matching the data packet against the rules in the rule base. When the data packet matches a rule in the rule base, the data packet is released to the server. When the data packet does not match any rule in the rule base, the extraction module 203 may be executed, or the data packet may be discarded.
The extracting module 203 is configured to extract the feature code of the data packet when the data packet does not match the rule in the rule base.
When the rule base contains multiple rules, the data packet needs to be matched against each of them. When the data packet matches one rule, the data packet is considered to match the rule base; when the data packet fails to match every rule, it is considered not to match the rule base.
The matching process of the data packet and the rule is prior art, and the present invention will not be described in detail herein.
According to an optional embodiment of the present invention, the extraction module 203 extracting the feature code of the data packet includes:
extracting the header information of the data packet at each protocol layer;
acquiring a plurality of pieces of target information from each piece of header information;
and concatenating the target information to obtain the feature code.
According to an optional embodiment of the present invention, the extracting of the header information of the data packet at each protocol layer comprises: extracting the payload of the data packet at the application layer protocol; and the acquiring of the plurality of pieces of target information from each piece of header information comprises: acquiring a command code in the payload as target information, or acquiring Type information in the payload as target information.
Each data packet has header information at each protocol layer, from which the following can be extracted: remote IP, source and destination ports, layer-3 protocol number, layer-4 protocol number, domain name, host name, and HTTP-related header information.
Application layer protocols are generally customized by a vendor or an application. They generally take a TLV (type-length-value) form, and the feature code can be extracted by analyzing the payload of the application layer protocol. When the payload consists of displayable characters, the command code generally serves as the main code and is extracted word by word. When the payload is binary data, it usually begins with the header of a private protocol, and the corresponding Type information is extracted from the data at the corresponding position.
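The extraction steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the header fields are already parsed into a dictionary, the command code is the first whitespace-separated word of a text payload, the Type information is a single byte at a known offset, and the fields are joined with `|`. The names `is_displayable`, `extract_target_info`, and `build_feature_code`, as well as the `cmd=`/`type=` labels, are hypothetical.

```python
from typing import Dict


def is_displayable(payload: bytes) -> bool:
    """True if every byte is printable ASCII or a tab, line feed, or carriage return."""
    return all(32 <= b < 127 or b in (9, 10, 13) for b in payload)


def extract_target_info(payload: bytes, type_offset: int = 0) -> str:
    """Take the command word from a text payload, or the Type byte from a binary one."""
    if is_displayable(payload):
        words = payload.split()
        return f"cmd={words[0].decode('ascii')}" if words else ""
    if type_offset < len(payload):
        return f"type=0x{payload[type_offset]:02x}"
    return ""


def build_feature_code(headers: Dict[str, str], payload: bytes) -> str:
    """Concatenate the header fields and the payload-derived field into one feature code."""
    parts = [f"{name}={value}" for name, value in headers.items()]
    parts.append(extract_target_info(payload))
    return "|".join(parts)


# Hypothetical packet: TCP (protocol number 6) to port 80 with an HTTP-like text payload.
feature = build_feature_code(
    {"l4_proto": "6", "dst_port": "80"},
    b"GET /index.html HTTP/1.1\r\n",
)
binary_info = extract_target_info(b"\x10\x00\x04abcd")
```

Which header fields are used and where the Type byte sits depend on the protocol being analyzed; the values here are placeholders.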
The generating module 204 is configured to generate a rule according to the feature code.
From the extracted feature code, the network device generates rules that the DPI system can use to match data packets. The rule base in each DPI system has a specific format: some match with character strings or regular expressions, and some match with bytecode generated from Berkeley Packet Filter (BPF) rules.
According to an alternative embodiment of the present invention, the generating module 204 generating the rule according to the feature code includes:
performing regular-expression matching on all characters in the feature code according to preset basic regular-expression rules to obtain a rule.
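The text does not spell out the basic regular-expression rules, so the following Python sketch shows only one possible reading: literal characters of the feature code are escaped, and each run of digits is generalized to `\d+`. The function name `feature_code_to_rule` and the generalization policy are assumptions made for illustration.

```python
import re


def feature_code_to_rule(feature_code: str) -> str:
    """Turn a feature code into an anchored regex: escape literals, generalize digit runs."""
    parts = []
    # re.split with a capturing group alternates literal text (even index)
    # and digit runs (odd index).
    for index, token in enumerate(re.split(r"(\d+)", feature_code)):
        if index % 2 == 1:
            parts.append(r"\d+")  # any digit run may appear at this position
        elif token:
            parts.append(re.escape(token))  # literal text must match exactly
    return "^" + "".join(parts)


rule = feature_code_to_rule("cmd=LOGIN|ver=2")
same_app = re.match(rule, "cmd=LOGIN|ver=17") is not None
other_app = re.match(rule, "cmd=LOGOUT|ver=2") is not None
```

A real DPI system would emit the rule in whatever format its rule base expects, which could be a plain string, a regular expression, or BPF bytecode.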
The determining module 205 is configured to determine whether the generated rule is valid according to the generated rule and other data packets in the data stream.
Because a rule generated from a single data packet is not guaranteed to match the other data packets in the data stream to which that packet belongs, it is necessary to determine whether the generated rule is valid.
The determining module 205 determining whether the generated rule is valid according to the generated rule and other data packets in the data stream includes:
in the next round of DPI detection, matching the other data packets in the data stream by using the generated rule and calculating a matching pass rate;
judging whether the matching pass rate is greater than or equal to a preset matching rate threshold;
when the matching pass rate is greater than or equal to the preset matching rate threshold, determining that the generated rule is valid;
and when the matching pass rate is smaller than the preset matching rate threshold, determining that the generated rule is invalid.
In this alternative embodiment, the generated rule needs to be written back into the DPI system and used for matching in the next round of DPI detection. If most of the data packets in the data stream correctly match the generated rule, rule generation for the application is considered successful and the generated rule is valid. Otherwise, when most of the data packets in the data stream cannot be correctly matched with the generated rule, rule generation for the application is considered to have failed and the generated rule is invalid.
For example, assume that a certain data stream contains 20 data packets. After a rule is generated from the first data packet, each of the remaining 19 data packets is matched against the generated rule and the matching pass rate is calculated. When the matching pass rate is greater than or equal to the preset matching rate threshold, the generated rule is considered valid. When the matching pass rate is smaller than the preset matching rate threshold, the generated rule is considered invalid; in that case, the feature code of the first data packet can be extracted again and a rule generated from it, or the feature code of the second data packet can be extracted, a rule generated from that feature code, and the first data packet then matched against the new rule.
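The 20-packet example can be sketched as follows. The sketch assumes that the rule is a regular expression applied to each payload; the function names, the sample payloads, and the threshold values are hypothetical.

```python
import re
from typing import Sequence


def matching_pass_rate(rule: str, packets: Sequence[bytes]) -> float:
    """Fraction of packets whose payload matches the rule (0.0 for an empty stream)."""
    if not packets:
        return 0.0
    hits = sum(1 for p in packets if re.search(rule, p.decode("latin-1")))
    return hits / len(packets)


def is_rule_valid(rule: str, packets: Sequence[bytes], threshold: float) -> bool:
    """A rule is valid when its pass rate is greater than or equal to the threshold."""
    return matching_pass_rate(rule, packets) >= threshold


# Rule generated from the first packet; the remaining 19 packets are used for validation.
rule = r"^LOGIN v\d+"
remaining = [b"LOGIN v2 alice"] * 18 + [b"\x00\x01\x02"]
rate = matching_pass_rate(rule, remaining)
```

With 18 of the 19 remaining packets matching, the rule would pass a threshold of 0.90 but fail a threshold of 0.95, which is why the choice of threshold matters.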
The updating module 206 is configured to update the rule base when the generated rule is determined to be valid.
A received data packet that does not match any rule in the current rule base may belong to a new application. A rule is generated from the data packet of the new application and added to the current rule base, so that when a data packet of the new application is received again it can be matched directly against the updated rule base.
According to an alternative embodiment of the invention, the updating module 206 updating the rule base when the generated rule is determined to be valid comprises:
reporting the valid rule and the corresponding matching pass rate to a server;
receiving a final rule base generated by the server according to all the matching pass rates;
and updating the rule base to the final rule base.
Although a rule can be generated from the relevant information extracted from the data packet, the rule may not be accurate enough and therefore needs to be uploaded to a server for further analysis and aggregation.
The server stores a rule list that records the newly generated rules received from all network devices together with the matching pass rate corresponding to each rule. The server then performs a calculation and determines the final rule base according to the result. For example, assume that network device 1 reports rule 1 with a matching pass rate of 90% and rule 2 with a matching pass rate of 98%; network device 2 reports rule 1 with a matching pass rate of 96% and rule 2 with a matching pass rate of 97%; and network device 3 reports rule 1 with a matching pass rate of 92% and rule 2 with a matching pass rate of 99%. The server then calculates an average matching pass rate of approximately 92.7% for rule 1 and 98% for rule 2. Because the average matching pass rate of rule 1 is below the preset matching pass rate threshold (e.g., 95%) and that of rule 2 is above it, the server adds rule 2 to the rule base and issues the updated final rule base to the network devices.
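The server-side calculation in this example can be reproduced with a short Python sketch. The report format (device, rule, pass rate in percent) and the function name `select_rules` are assumptions; treating an average exactly equal to the threshold as acceptable is also an assumption, since the text only covers the "less than" and "greater than" cases.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# (device, rule, matching pass rate in percent), as in the example above.
reports = [
    ("device1", "rule1", 90), ("device1", "rule2", 98),
    ("device2", "rule1", 96), ("device2", "rule2", 97),
    ("device3", "rule1", 92), ("device3", "rule2", 99),
]


def select_rules(
    reports: List[Tuple[str, str, float]], threshold: float = 95
) -> Tuple[Dict[str, float], List[str]]:
    """Average each rule's pass rate over all devices and keep rules at or above the threshold."""
    rates = defaultdict(list)
    for _device, rule, rate in reports:
        rates[rule].append(rate)
    averages = {rule: mean(values) for rule, values in rates.items()}
    accepted = sorted(rule for rule, avg in averages.items() if avg >= threshold)
    return averages, accepted


averages, accepted = select_rules(reports)
```

Here rule 1 averages about 92.7% and is rejected, while rule 2 averages 98% and is added to the final rule base.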
According to an optional embodiment of the present invention, when determining that the data packet matches a rule in the rule base, the deep packet inspection rule base generating apparatus 20 further includes:
a determining module 207, configured to determine an application of the data packet according to the matched rule;
an allocating module 208, configured to allocate the data packet to a corresponding link according to the application of the data packet.
In this alternative embodiment, a data stream contains a plurality of data packets, and each data stream corresponds to one application. Therefore, once a data stream has been successfully identified, the application to which it belongs is known, and the subsequent data packets in that data stream no longer need to undergo DPI detection.
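A minimal sketch of this per-stream shortcut, combined with link allocation, is given below. It assumes that a data stream is identified by a five-tuple and that a policy table maps each application to a link; the class `FlowSteering`, the rule, the application name, and the link names are all hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Assumed stream key: source IP, source port, destination IP, destination port, L4 protocol.
FlowKey = Tuple[str, int, str, int, int]


class FlowSteering:
    def __init__(self, app_rules: Dict[str, str], link_policy: Dict[str, str],
                 default_link: str = "ordinary"):
        self.app_rules = app_rules      # application name -> regex rule
        self.link_policy = link_policy  # application name -> link name
        self.default_link = default_link
        self.flow_app: Dict[FlowKey, str] = {}  # streams that are already identified
        self.inspections = 0            # number of DPI detections actually performed

    def _inspect(self, payload: bytes) -> Optional[str]:
        """Run DPI detection on one payload and return the matched application, if any."""
        self.inspections += 1
        text = payload.decode("latin-1")
        for app, rule in self.app_rules.items():
            if re.search(rule, text):
                return app
        return None

    def link_for(self, key: FlowKey, payload: bytes) -> str:
        """Choose a link, running DPI only while the stream is still unidentified."""
        app = self.flow_app.get(key)
        if app is None:
            app = self._inspect(payload)
            if app is None:
                return self.default_link
            self.flow_app[key] = app  # later packets of this stream skip DPI
        return self.link_policy.get(app, self.default_link)


steering = FlowSteering({"video_call": r"^RTPX"}, {"video_call": "premium"})
key = ("10.0.0.2", 5000, "203.0.113.9", 443, 17)
first = steering.link_for(key, b"RTPX hello")
second = steering.link_for(key, b"\x00\x01 opaque data")
```

The second packet would not match the rule on its own, yet it is steered to the same link because its stream was identified by the first packet.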
At present, the server and egress bandwidth resources of Internet users are limited, and link stability and real-time performance are often poor. Users therefore commonly rent several higher-quality carrier links (for example, from China Telecom or China Unicom) for important services that require high real-time performance and high stability, and rent ordinary links for unimportant services, so as to improve working efficiency and the utilization of network resources. In this scenario, a traffic steering function is required to steer traffic to the appropriate link according to the application type and the user policy.
The deep packet inspection rule base generation device provided by the embodiment of the invention receives a data packet and matches it against the rules in the current rule base to judge whether the application of the data packet is a new application. When the data packet does not match any rule in the current rule base, the application of the data packet is regarded as a new application, and a rule is generated by extracting the feature code of the data packet so as to update the current rule base. By identifying the data stream to which the data packet belongs, the other data packets in the data stream are matched against the generated rule to obtain a matching pass rate, and whether the generated rule is valid is determined according to the matching pass rate. In this way, a rule can be automatically generated and added to the rule base for each new application, which improves the identification rate of application traffic, and rule generation does not lag behind the appearance of new applications. The more network nodes there are, the more rules are automatically generated; the generated rules are further evaluated at the server to confirm their validity, which reduces the cost of manual identification.
EXAMPLE III
Fig. 3 is a schematic diagram of an internal structure of a network device according to an embodiment of the present invention.
In this embodiment, the network device 3 may be a client, a resource server, or other electronic devices.
The network device 3 may include a memory 31, a processor 32, and a bus 33.
The memory 31 includes at least one type of readable storage medium, which includes a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, and the like. The memory 31 may in some embodiments be an internal storage unit of the network device 3, for example a hard disk of the network device 3. The memory 31 may also be an external storage device of the network device 3 in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the network device 3. Further, the memory 31 may also include both an internal storage unit of the network device 3 and an external storage device. The memory 31 may be used not only to store the application program and various types of data installed in the network device 3, such as the code of the deep packet inspection rule base generation apparatus 20 and the like, and various modules, but also to temporarily store data that has been output or is to be output.
The processor 32 may be, in some embodiments, a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip for executing program code stored in the memory 31 or for processing data.
The bus 33 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 3, but this does not mean only one bus or one type of bus.
Further, the network device 3 may also include a network interface, which may optionally include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is generally used to establish a communication connection between the network device 3 and other network devices.
Optionally, the network device 3 may further include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an Organic Light-Emitting Diode (OLED) touch screen, or the like. The display, which may also be referred to as a display screen or display unit, is used for displaying messages processed in the network device and for displaying a visual user interface.
Fig. 3 only shows the network device 3 with the components 31-33, it being understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the network device 3, and may be a bus-type structure or a star-shaped structure, and that the network device 3 may also comprise fewer or more components than shown, or may combine certain components, or may have a different arrangement of components. Other electronic products, now existing or hereafter developed, that may be adapted to the present invention, are also included within the scope of the present invention and are hereby incorporated by reference.
The above embodiments may be implemented in whole or in part by an application program, hardware, firmware, or any combination thereof. When implemented using an application program, the embodiments may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via a wired connection (e.g., coaxial cable, optical fiber, digital subscriber line) or a wireless connection (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a logical division, and other divisions are possible in practice. A plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of an application program functional unit.
The integrated unit, if implemented in the form of an application functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a computer application program product. The product is stored in a storage medium and includes instructions for causing a computing device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk.
It should be noted that the numbering of the above embodiments of the present invention is merely for description and does not represent the merits of the embodiments. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, article, or method that includes the element.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A deep packet inspection rule base generation method is characterized by comprising the following steps:
receiving a data packet and identifying a data stream to which the data packet belongs;
judging whether the data packet is matched with a rule in a rule base;
when the data packet does not match with the rule in the rule base, extracting the feature code of the data packet;
generating a rule according to the feature code;
judging whether the generated rule is valid according to the generated rule and other data packets in the data stream;
updating the rule base when the generated rule is determined to be valid.
2. The method of claim 1, wherein the determining whether the generated rule is valid based on the generated rule and other packets in the data stream comprises:
in the next round of DPI detection, matching other data packets in the data flow by using the generated rule and calculating a matching passing rate;
judging whether the matching passing rate is greater than or equal to a preset matching rate threshold;
when the matching passing rate is greater than or equal to the preset matching rate threshold, determining that the generated rule is valid; or
when the matching passing rate is smaller than the preset matching rate threshold, determining that the generated rule is invalid.
3. The method of claim 2, wherein updating the rule base when the generated rule is determined to be valid comprises:
reporting the valid rule and the corresponding matching passing rate to a server;
receiving a final rule base generated by the server according to all the matching passing rates;
and updating the rule base into the final rule base.
4. The method of claim 1, wherein extracting the feature code of the packet comprises:
extracting the header information of the data packet in each layer of protocol;
acquiring a plurality of target information in each piece of header information;
and connecting the target information to obtain the feature code.
5. The method of claim 4,
the extracting of the header information of the data packet in each layer of protocol comprises: extracting the payload of the data packet in an application layer protocol;
the acquiring of the plurality of target information in each piece of header information comprises: acquiring a command code in the payload as target information, or acquiring Type information in the payload as target information.
6. The method of any of claims 1 to 5, wherein the generating rules according to the feature codes comprises:
performing regular-expression matching on all characters in the feature code according to preset basic regular-expression rules to obtain a rule.
7. The method of any of claims 1 to 5, wherein when it is determined that the data packet matches a rule in the rule base, the method further comprises:
determining the application of the data packet according to the matched rule;
and distributing the data packets to corresponding links according to the application of the data packets.
8. An apparatus for generating a deep packet inspection rule base, the apparatus comprising:
the receiving module is used for receiving the data packet and identifying the data stream to which the data packet belongs;
the matching module is used for judging whether the data packet is matched with the rule in the rule base;
the extraction module is used for extracting the feature codes of the data packets when the data packets are not matched with the rules in the rule base;
the generating module is used for generating rules according to the feature codes;
the judging module is used for judging whether the generated rule is valid according to the generated rule and other data packets in the data stream;
and the updating module is used for updating the rule base when the generated rule is determined to be valid.
9. A network device comprising a memory and a processor, the memory having stored thereon a downloaded program of a deep packet inspection rule base generation method executable on the processor, the downloaded program of the deep packet inspection rule base generation method implementing the deep packet inspection rule base generation method according to any one of claims 1 to 7 when executed by the processor.
10. A computer-readable storage medium, on which a downloaded program of a deep packet inspection rule base generation method is stored, the downloaded program of the deep packet inspection rule base generation method being executable by one or more processors to implement the deep packet inspection rule base generation method according to any one of claims 1 to 7.
CN201910957075.1A 2019-10-10 2019-10-10 Deep packet inspection rule base generation method and device, network equipment and storage medium Pending CN110708215A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910957075.1A CN110708215A (en) 2019-10-10 2019-10-10 Deep packet inspection rule base generation method and device, network equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910957075.1A CN110708215A (en) 2019-10-10 2019-10-10 Deep packet inspection rule base generation method and device, network equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110708215A true CN110708215A (en) 2020-01-17

Family

ID=69199025

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910957075.1A Pending CN110708215A (en) 2019-10-10 2019-10-10 Deep packet inspection rule base generation method and device, network equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110708215A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination