WO2011060732A1 - Packet detection method and apparatus - Google Patents

Packet detection method and apparatus

Info

Publication number
WO2011060732A1
Authority
WO
WIPO (PCT)
Prior art keywords
matching
protocol
information
message
packet
Prior art date
Application number
PCT/CN2010/078900
Other languages
English (en)
French (fr)
Inventor
董岚君
苏德现
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP10831147.3A (patent EP2434689B1)
Publication of WO2011060732A1
Priority to US13/339,246 (patent US20120099597A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/18 Protocol analysers
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]

Definitions

  • DPI: Deep Packet Inspection.
  • In DPI, the payload carried above layer 4 of an IP packet (the application-layer data) is analyzed further, so that the various applications on the network can be identified more effectively.
  • DPI technology has been widely used in areas such as traffic control, content charging, and network security.
  • Protocol identification is a key technology in packet detection. Subsequent analysis and processing depend on the result of protocol identification, and its speed, breadth, and accuracy largely determine how well a DPI device performs.
  • Flow table matching is performed for packets of an association protocol (a protocol whose control channel is separate from its data channel) or of a non-association protocol.
  • Flow table matching refers to matching the matching information in a received packet against the matching information stored in the flow table.
  • The matching information generally includes a protocol field (Type), a source IP address (SIP), a source port (SPort), a destination IP address (DIP), and a destination port (DPort); these five items are referred to as the quintuple information.
  • The quintuple information in the flow table determines the flow-related information for the packet.
  • The content of this flow-related information is not limited. For example, it may include a user ID, a policy action, a statistics type, and a protocol ID, but it generally includes policy action information.
  • Protocol identification identifies a packet by matching relevant information extracted from the packet against the rules in a rule base; it consists of rule matching and protocol verification.
  • Rule matching means matching the relevant information extracted from the packet against the rules in the rule base.
  • Because the rule base is huge and a matching algorithm has to be run, rule matching is often slow.
  • Protocol verification verifies and analyzes the output of rule matching, thereby identifying the relevant protocol.
  • Once the protocol is identified, the packet can be sent to the CPU for policy management, and the related information in the flow table is updated, including the correspondence between the quintuple information and the policy.
  • After the flow table has been updated, when the next packet arrives its quintuple information indexes the relevant policy in the flow table, and the policy is executed.
  • An embodiment of the present invention provides a packet detection method and apparatus, which are used to improve protocol identification speed, where:
  • A packet detection method includes the following steps:
  • Association identification table matching includes matching the triplet information in the packet, where the triplet information includes a protocol field, a source IP address, and a source port, or includes a protocol field, a destination IP address, and a destination port.
  • A packet detection method includes the following steps:
  • The first matching information in the association identification table is obtained by extracting the content of the control packet used to create the data channel; the first matching information corresponds to a packet protocol, and the packet protocol is obtained by performing protocol identification on the control packet;
  • An association protocol identification device includes:
  • a receiving unit, configured to receive a packet;
  • an association identification table matching unit, configured to match the first matching information in the packet received by the receiving unit against the first matching information in the association identification table, where the first matching information in the association identification table is obtained by extracting the content of the control packet used to create the data channel, the first matching information corresponds to a packet protocol, and the packet protocol is obtained by performing protocol identification on the control packet; when matching succeeds, the protocol information obtained from the successful match in the association identification table is output.
  • In the embodiments of the present invention, matching is performed against the matching information in the association identification table. This matching information is obtained by extracting the content of the control packet used to create the data channel, and it corresponds to the protocol obtained by performing protocol identification on that control packet.
  • The protocol can therefore be identified without inspecting the content of the packet, which improves the speed of protocol identification.
  • FIG. 4A is a flowchart of creating and configuring the flow table and the association identification table according to Embodiment 3 of the present invention.
  • FIG. 5A is a schematic diagram of creating a data channel through a control channel according to Embodiment 3 of the present invention.
  • FIG. 5B is another schematic diagram of creating a data channel through a control channel according to Embodiment 3 of the present invention.
  • FIG. 6 is a schematic diagram of a deletion flow table according to Embodiment 3 of the present invention.
  • FIG. 7 is a schematic diagram of aging a control channel according to Embodiment 3 of the present invention.
  • FIG. 8 is a schematic diagram of aging a data channel according to Embodiment 3 of the present invention.
  • FIG. 10 is a schematic structural diagram of a device according to Embodiment 5 of the present invention.
  • The present invention is described in further detail below with reference to specific embodiments and the accompanying drawings, beginning with Embodiment 1.
  • Embodiment 1 of the present invention provides a packet detection method that includes the following steps:
  • Association identification table matching includes matching the triplet information in the packet, where the triplet information includes a protocol field, a source IP address, and a source port, or includes a protocol field, a destination IP address, and a destination port.
  • The association identification table matching includes: matching the triplet information of the packet against the triplet information in the association identification table, where the triplet information in the association identification table is obtained by extracting the content of the control packet used to create the data channel, the triplet information corresponds to a packet protocol, and the packet protocol is obtained by performing protocol identification on the control packet.
  • If association identification table matching fails, protocol identification is performed on the packet; protocol identification includes rule matching and protocol verification.
  • Flow table matching mainly matches the quintuple information in the packet against the quintuple information in the flow table. The quintuple information may include a protocol field, a source IP, a source port, a destination IP, and a destination port, and is mainly used to determine a flow.
  • The quintuple information is used to index the flow table. If other information can serve the same function, it can replace the quintuple information.
  • The flow table and the association identification table in the embodiments of the present invention may be two different tables or the same table (for example, both located in the flow table). When they are two tables, exact matching can be used to match the flow table and the association identification table separately; when they are in the same table, longest matching can be used to match the flow table and the association identification table in sequence.
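  • The matching order described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `detect`, the dictionary-based tables, the packet layout, and the `slow_identify` callback are all assumptions introduced here, and only the source-side triplet is tried.

```python
def detect(packet, flow_table, association_table, slow_identify):
    """Return (stage, result) for one packet.

    packet            -- dict with "quintuple" and "payload" (hypothetical layout)
    flow_table        -- dict: quintuple -> policy action (exact match)
    association_table -- dict: (Type, IP, port) triplet -> protocol ID (exact match)
    slow_identify     -- callable standing in for rule matching + protocol verification
    """
    quintuple = packet["quintuple"]

    # 1. Flow table matching on the quintuple: a hit means the policy is known.
    policy = flow_table.get(quintuple)
    if policy is not None:
        return ("flow_table", policy)

    # 2. Association identification table matching on the triplet.
    proto, sip, sport, _dip, _dport = quintuple
    protocol_id = association_table.get((proto, sip, sport))
    if protocol_id is not None:
        return ("association_table", protocol_id)

    # 3. Only now fall back to the slow path.
    return ("protocol_identification", slow_identify(packet["payload"]))
```

  • With two separate dictionaries, both lookups are exact matches, which corresponds to the two-table arrangement mentioned above.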
  • Embodiment 2 of the present invention provides a packet detection method for improving protocol identification speed, including the following steps:
  • S101: Match the first matching information in the received packet against the first matching information in the association identification table, where the first matching information in the association identification table is obtained by extracting the content of the control packet used to create the data channel, the first matching information corresponds to a packet protocol, and the packet protocol is obtained by performing protocol identification on the control packet;
  • The first matching information here includes a protocol field, a source IP, and a source port, or includes a protocol field, a destination IP, and a destination port.
  • The first matching information comes from packets of protocols whose control channel is separate from the data channel (such as most P2P protocols). For these protocols, data can be transmitted only after a data channel has been created through the control channel. When the control packet creates the data channel, the first matching information can be extracted and associated with the protocol information obtained by performing protocol identification on the control packet. If association identification table matching then succeeds, the protocol is identified directly, and the rule matching and protocol verification steps are not required.
  • For example, when FTP (File Transfer Protocol, a type of association protocol) transfers data, the client first establishes a control connection to port 21 of the server to create a control channel, and then sends a command for creating the data channel over the control channel.
  • The specific command varies with the protocol.
  • The command includes the triplet information.
  • For example, when FTP establishes a data channel, the client sends a PORT command that contains the port information for the data connection.
  • The triplet information in the relevant control packet (such as the PORT command) can be extracted as the first matching information.
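  • As an illustration of extracting a triplet from a control packet, the sketch below parses an FTP PORT command. The wire format `PORT h1,h2,h3,h4,p1,p2` (port = p1 * 256 + p2) is standard FTP; the function name and the choice to return a `(Type, IP, port)` tuple are assumptions made for this example.

```python
import re

# PORT h1,h2,h3,h4,p1,p2 -- four address bytes followed by two port bytes.
_PORT_RE = re.compile(r"^PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\r?\n?$")

def triplet_from_port_command(line, proto="TCP"):
    """Return (Type, IP, port) announced by a PORT command, or None."""
    m = _PORT_RE.match(line)
    if m is None:
        return None
    nums = [int(g) for g in m.groups()]
    if any(n > 255 for n in nums):
        return None  # each field must fit in one byte
    ip = ".".join(str(n) for n in nums[:4])
    port = nums[4] * 256 + nums[5]
    return (proto, ip, port)
```

  • For a client at 2.2.2.2 announcing port 1174, the command on the wire is `PORT 2,2,2,2,4,150`, because 4 * 256 + 150 = 1174.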
  • Before performing association identification table matching, this embodiment of the present invention first performs flow table matching on the received packet. Flow table matching includes matching the second matching information in the packet against the second matching information in the flow table; the second matching information is used to look up the flow-related information corresponding to the packet, including the execution policy corresponding to the packet.
  • The second matching information includes a protocol field, a source IP, a source port, a destination IP, and a destination port.
  • These five items determine the flow-related information of the packet, which includes the user ID, the policy action, the statistics type (such as traffic statistics or packet size statistics), the flow ID (used to indicate which user the flow belongs to), the protocol ID, and so on.
  • The flow-related information determined by the quintuple is not limited, but it generally includes policy action information, so that when the quintuple is matched successfully, the policy action for the packet can be found through the quintuple and the corresponding policy executed.
  • The second matching information is not limited to the five items of protocol field, source IP, source port, destination IP, and destination port. If more identifying information is needed to determine a flow, or fewer than five items suffice, the content of the second matching information can be increased or decreased accordingly.
  • In the embodiments of the present invention, the flow table and the association identification table may be implemented as a hash table, as a TCAM (Ternary Content Addressable Memory) table, or in other forms.
  • The two may be located in different matching tables, that is, two different tables store the data of the flow table and of the association identification table respectively; or they may be located in the same matching table, that is, one table stores the data of both.
  • When they are located in different tables, exact matching is used for flow table matching and association identification table matching.
  • When they are located in the same table, longest matching is used for flow table matching and association identification table matching.
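  • One way to picture the single-table, longest-match arrangement is shown below. This is a software sketch under assumptions: entries are modelled as five-field keys in which `None` acts as a wildcard, so a flow entry specifies five fields and an association entry only three, and "longest" is taken to mean "most specified fields". A TCAM would do this in hardware; the names `Entry` and `longest_match` and the result strings are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple

class Entry(NamedTuple):
    key: Tuple      # (Type, SIP, SPort, DIP, DPort); None is a wildcard
    result: str     # policy action or protocol ID (illustrative strings)

def _matches(entry: Entry, quintuple: Tuple) -> bool:
    return all(k is None or k == v for k, v in zip(entry.key, quintuple))

def _specificity(entry: Entry) -> int:
    return sum(k is not None for k in entry.key)

def longest_match(table, quintuple) -> Optional[Entry]:
    """Return the matching entry with the most specified fields, or None."""
    candidates = [e for e in table if _matches(e, quintuple)]
    return max(candidates, key=_specificity, default=None)

TABLE = [
    Entry(("TCP", "2.2.2.2", 1173, "1.1.1.1", 21), "flow: policy=pass"),
    Entry(("TCP", "2.2.2.2", 1174, None, None), "association: protocol=FTP"),
]
```

  • Because a full five-field flow entry always outranks a three-field association entry, one lookup gives the same priority as matching the flow table first and the association identification table second.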
  • The flow table or the association identification table is further configured according to the type of the received packet (for example, a packet of an ordinary protocol, a control packet of an association protocol, or a data packet of an association protocol); the configuration includes updating the second matching information in the flow table or the first matching information in the association identification table.
  • When association identification table matching succeeds, the protocol information obtained from the successful match is output.
  • The output protocol information is the protocol information corresponding to the successfully matched entry.
  • Policy management can then be performed according to the protocol, and the flow table is updated.
  • The policy here corresponds to the protocol, for example discarding, passing, or marking the related packets.
  • The steps of performing rule matching and protocol verification on the packet are similar to the prior art. These steps can also identify the protocol, but much more slowly than association identification table matching.
  • If flow table matching succeeds, policy execution is performed directly according to the matching result.
  • The flow table and the association identification table may also be aged: if no packet enters the data channel or the control channel for a period of time, the related information in the flow table and/or in the association identification table is deleted.
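  • Aging can be sketched as a table that records when each entry was last hit and periodically removes idle entries. The class name, the timeout parameter, and the explicit `now` argument (used instead of a real clock so that the behaviour is deterministic) are assumptions for illustration; the source does not specify the aging interval or mechanism.

```python
class AgingTable:
    """Exact-match table whose entries expire after `timeout` idle time units."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._entries = {}  # key -> (value, last_seen)

    def put(self, key, value, now):
        self._entries[key] = (value, now)

    def lookup(self, key, now):
        hit = self._entries.get(key)
        if hit is None:
            return None
        value, _last_seen = hit
        self._entries[key] = (value, now)  # a hit refreshes the entry
        return value

    def age(self, now):
        """Delete entries idle for at least `timeout`; return the deleted keys."""
        expired = [k for k, (_v, seen) in self._entries.items()
                   if now - seen >= self.timeout]
        for k in expired:
            del self._entries[k]
        return expired
```

  • The same structure could back both the flow table and the association identification table, so that an idle data channel loses its quintuple entry and its triplet entry.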
  • In this embodiment, the first matching information in the packet is matched against the first matching information in the association identification table, and the first matching information corresponds to the protocol information obtained by performing protocol identification on the control packet. Therefore, as soon as association identification table matching succeeds, protocol identification is complete, and rule matching, protocol verification, and similar steps are no longer needed, which speeds up protocol identification.
  • This embodiment of the present invention provides a packet detection method for improving protocol identification speed.
  • FIG. 3 is a schematic flowchart of this embodiment of the present invention.
  • Flow table matching is performed first; if it succeeds, policy execution is performed;
  • If flow table matching fails, this embodiment does not proceed directly to rule matching but performs association identification table matching.
  • The association identification table stores triplet information and the protocol ID corresponding to the triplet information.
  • The triplet information is obtained when the data channel is created through the control channel of the association protocol, and the corresponding protocol ID is obtained by performing protocol identification on the control packet. If triplet matching succeeds, policy management is performed directly according to the obtained protocol; otherwise, rule matching, protocol verification, and similar steps are performed to identify the protocol.
  • The foregoing association protocol refers to a protocol whose control channel is separate from its data channel.
  • The triplet information is extracted while the data channel is being created through the control channel of the association protocol, and it corresponds to the protocol obtained by performing protocol identification on the control packet of the association protocol.
  • Therefore only the triplet needs to be matched at this point. If matching succeeds, the protocol is identified and policy management is performed according to it, which skips rule matching, protocol verification, and the other identification steps and can greatly improve protocol identification speed.
  • Flow table matching refers to matching the matching information in the packet against the matching information in the flow table. The matching information in the flow table determines the flow-related information of the packet, including the user ID, policy action, statistics type, flow ID, protocol ID, and so on. The flow-related information determined by the quintuple is not limited, but it generally includes policy action information, so that when the quintuple is matched successfully, the corresponding policy action can be found through the quintuple and the corresponding policy executed.
  • Five items, namely the protocol field (Type), source IP, destination IP, source port, and destination port, are used as the matching information. If a protocol needs additional information for matching, or can be matched with fewer than five items, the matching information can be adjusted to the specific situation.
  • The matching information in the association identification table is likewise not limited.
  • The following describes this embodiment using the example in which the matching information in the flow table is quintuple information and the information in the association identification table is triplet information. Referring to FIG. 4, the method specifically includes the following steps:
  • S201: Receive a packet and obtain its quintuple information, where the quintuple information includes a protocol field (Type), a source IP (SIP), a source port (SPort), a destination IP (DIP), and a destination port (DPort);
  • The packet here is a packet that needs to be identified, including packets of association protocols whose control channel and data channel are separate.
  • Association protocols are widely used. For example, many protocols in the P2P (Peer to Peer) application field are association protocols, and P2P accounts for a large share of network traffic (more than 90% at night-time peaks); the common FTP and SIP (Session Initiation Protocol) protocols are also association protocols. Therefore, if the identification speed for association protocols can be improved, the speed of the whole protocol identification process improves accordingly.
  • For association protocol packets, both control channel packets (control packets) and data channel packets (data packets) need to be received and subsequently identified;
  • The protocol field (Type) in the quintuple information indicates whether the packet is carried over TCP (Transmission Control Protocol) or UDP (User Datagram Protocol).
  • The source IP (SIP) and source port (SPort) indicate the IP address and port number of the user that sent the packet.
  • The destination IP (DIP) and destination port (DPort) indicate the IP address and port number of the user that receives the packet.
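  • To make the five fields concrete, the sketch below pulls a quintuple out of a raw IPv4 packet. It is an illustration only, assuming an unfragmented IPv4 packet that carries TCP or UDP; the function name is hypothetical, and a real DPI device would also handle IPv6, fragments, and tunnels.

```python
import socket
import struct

_PROTO_NAMES = {6: "TCP", 17: "UDP"}  # IANA protocol numbers

def extract_quintuple(ip_packet: bytes):
    """Return (Type, SIP, SPort, DIP, DPort) or None if not IPv4 TCP/UDP."""
    if len(ip_packet) < 20 or ip_packet[0] >> 4 != 4:
        return None
    header_len = (ip_packet[0] & 0x0F) * 4       # IHL is in 32-bit words
    proto = _PROTO_NAMES.get(ip_packet[9])       # byte 9: protocol
    if proto is None or len(ip_packet) < header_len + 4:
        return None
    sip = socket.inet_ntoa(ip_packet[12:16])     # bytes 12-15: source address
    dip = socket.inet_ntoa(ip_packet[16:20])     # bytes 16-19: destination address
    # TCP and UDP both start with 16-bit source and destination ports.
    sport, dport = struct.unpack("!HH", ip_packet[header_len:header_len + 4])
    return (proto, sip, sport, dip, dport)
```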
  • The quintuple information acts as an index into the flow table and can be used to find information in the flow table such as the user ID, policy action, statistics type, flow ID, and protocol ID. The information in the flow table is not fixed, and items may be added or removed according to the actual application, but it generally includes a policy action, so that when matching succeeds the corresponding policy can be found through the quintuple information and executed.
  • S202: Match the quintuple information of the packet against the stored quintuple information and determine whether matching succeeds; if it succeeds, execute the policy; if it fails, perform step S203;
  • The quintuple information is matched in the flow table.
  • In addition to the quintuple information, the information stored in the flow table may include a user ID, a flow ID, a policy action (such as discard, pass, or send to the CPU), a statistics type, and other information.
  • The quintuple information in the flow table acts as an index, through which information such as the user ID, flow ID, and policy action can be looked up.
  • If quintuple matching succeeds, a policy action corresponding to the quintuple information already exists in the flow table; it can be found through the quintuple information and the policy executed. If matching fails, no policy corresponds to the quintuple information, and the subsequent steps need to be performed.
  • S203: Match the triplet information of the packet against the triplet information in the association identification table; if matching succeeds, perform step S206 to carry out policy management for the matched protocol; if matching fails, perform step S204;
  • The triplet information consists of the protocol field, source IP, and source port, or of the protocol field, destination IP, and destination port. Source and destination are relative (for example, when a client and a server exchange packets, either the client or the server can be regarded as the source node), so the triplet information can be expressed either way.
  • Triplet matching is performed in the association identification table, which records triplet information together with the protocol obtained by performing protocol identification on the control channel packet.
  • These protocols are represented by protocol IDs; in another embodiment, other means may be used to represent the protocol.
  • Each triplet in the association identification table corresponds to one protocol. If triplet matching succeeds, the protocol is identified, and step S206 is then performed to carry out policy management for the matched protocol.
  • If the definitions of source node and destination node are known in advance, and it is known whether the stored triplet is (protocol field, source IP, source port) or (protocol field, destination IP, destination port), the corresponding triplet in the packet can be matched directly. If this is not known in advance, the association identification table can first be matched using the protocol field, source IP, and source port of the packet (or the protocol field, destination IP, and destination port); if that fails, matching is retried using the protocol field, destination IP, and destination port (or the protocol field, source IP, and source port).
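  • The two-attempt triplet lookup just described can be sketched as below. The dictionary-based table and the function name are assumptions for illustration; the order of the two attempts (source side first) is one of the two options the text allows.

```python
def match_association(association_table, quintuple):
    """Try the source-side triplet, then the destination-side triplet.

    association_table -- dict: (Type, IP, port) -> protocol ID
    Returns the protocol ID on a hit, otherwise None.
    """
    proto, sip, sport, dip, dport = quintuple
    for triplet in ((proto, sip, sport), (proto, dip, dport)):
        protocol_id = association_table.get(triplet)
        if protocol_id is not None:
            return protocol_id
    return None
```

  • With the entry `("TCP", "2.2.2.2", 1174) -> "FTP"`, a packet from the client's port 1174 hits on the first attempt, and a packet sent to that port hits on the second.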
  • If triplet matching fails, the protocol is identified through rule matching, protocol verification, and similar steps.
  • Rule matching extracts feature information from the received packet and matches it against the expressions in the rule base; if matching succeeds, the matching result is output. Matching is usually done by a DPI acceleration chip, which speeds it up. If matching fails, the packet can be submitted to the CPU for protocol identification in software.
  • Because rule matching compares the feature information in the packet with the expressions in the rule base, the rule base is often very large, and a matching algorithm has to be run, rule matching often takes far more time than the simple quintuple or triplet matching of the flow table or the association identification table. When rule matching is done in software, it is even less efficient than a hardware implementation.
  • Protocol verification verifies and analyzes the result output by rule matching and judges from the verification result whether identification is complete; if so, the protocol ID indicating the protocol is output.
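  • The slow path can be pictured as two stages: rule matching proposes candidate protocols, and protocol verification confirms one of them. The sketch below is a toy under stated assumptions: the two regular expressions and the two verification checks are hypothetical examples invented here, not rules from the source, and a real rule base is far larger.

```python
import re

# Hypothetical rule base: (protocol ID, expression over the payload).
RULE_BASE = [
    ("HTTP", re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]\r\n")),
    ("FTP", re.compile(rb"^(USER|PASS|PORT|PASV)\b")),
]

# Hypothetical verification checks applied to each rule-matching candidate.
VERIFIERS = {
    "HTTP": lambda payload: b"\r\n\r\n" in payload,      # header block is complete
    "FTP": lambda payload: payload.endswith(b"\r\n"),    # command line is terminated
}

def rule_match(payload):
    """Stage 1: return every protocol ID whose expression matches the payload."""
    return [pid for pid, pattern in RULE_BASE if pattern.search(payload)]

def identify_protocol(payload):
    """Stage 2: verify candidates in order; return a protocol ID or None."""
    for candidate in rule_match(payload):
        if VERIFIERS[candidate](payload):
            return candidate
    return None
```

  • Even in this toy, every rule is evaluated against the payload for each packet, which suggests why a large rule base is much slower than a single table lookup.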
  • The triplet information in the association identification table is obtained by extracting the content of the control packet when the control packet creates the data channel, and it corresponds to the protocol obtained by performing protocol identification on the control packet.
  • The following describes how the triplet information is created and how the flow table and the association identification table are configured. Referring to FIG. 4A, the process may include the following steps:
  • S2031: Receive the packet and determine the packet type; if it is a control packet, perform step S2032; otherwise, perform step S2033 and identify the protocol through quintuple matching, triplet matching, and similar processes.
  • The step of determining the packet type is performed separately, before flow table matching. Because the control channel and the data channel of an association protocol are separate, when association protocol packets arrive the control channel must be established first. The control channel does not carry user data; it is used to establish the control connection of the protocol, after which a data channel is created to carry user data.
  • The client connects to a port of the server to create a control channel. For example, when the FTP control channel is created, the client connects to port 21 of the server.
  • Step S2032 is performed for control packets related to the control channel. Otherwise, if the packet is determined to be a data packet of an association protocol or a packet of a non-association protocol, step S2033 is performed, and the packet is processed through flow table matching, association identification table matching, and so on; for details, refer to the process shown in FIG. 4.
  • S2032: Perform protocol identification or content analysis on the control packet.
  • When the control packet is used by the control channel to create a data channel, the relevant triplet information is extracted and the flow table and/or the association identification table is configured.
  • The control channel and the data channel of an association protocol are separate.
  • The control channel does not carry user data; it is used to establish the control connection of the protocol, after which a data channel is created to carry user data.
  • The client connects to a port of the server to create a control channel. For example, when the FTP control channel is created, the client connects to port 21 of the server.
  • Flow table matching is performed first; if it fails, the protocol is identified through rule matching, protocol verification, and similar steps (see steps S204 and S205 for details); policy management is then performed for the identified protocol, and the result of policy management is written to the flow table.
  • The flow table entry records the quintuple information and the corresponding policy action (and possibly other information such as the protocol ID and flow ID). The next time a packet with the same quintuple information arrives, the corresponding policy action can be found by matching the quintuple information in the flow table, and the policy is executed.
  • The packet content may also be analyzed (including extracting the quintuple information for subsequent flow table matching), and the corresponding steps, such as flow table matching, are performed according to the content of the packet.
  • The following uses FTP as an example to describe how the data channel is created and how the flow table and the association identification table are configured.
  • FTP supports two connection modes: Standard mode (also called PORT mode or active mode) and Passive mode (also called PASV mode or passive mode).
  • In Standard mode, the FTP client sends a PORT command to the FTP server, telling the server which port the client will use for the data connection.
  • In Passive mode, the FTP client sends a PASV command to the FTP server.
  • In response to the PASV command, the server tells the client which port the server will use for the data connection, and the client connects to that port.
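  • In Passive mode the triplet comes from the server's reply rather than from a client command. The sketch below parses the standard FTP `227` reply, whose parenthesised field has the same `h1,h2,h3,h4,p1,p2` layout as PORT. The function name is hypothetical, and the address and port in the usage note are made-up illustration values, not values from the source.

```python
import re

# 227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)
_PASV_RE = re.compile(r"^227 .*?\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")

def triplet_from_pasv_reply(line, proto="TCP"):
    """Return (Type, IP, port) announced by a 227 reply, or None."""
    m = _PASV_RE.match(line)
    if m is None:
        return None
    nums = [int(g) for g in m.groups()]
    if any(n > 255 for n in nums):
        return None  # each field must fit in one byte
    ip = ".".join(str(n) for n in nums[:4])
    return (proto, ip, nums[4] * 256 + nums[5])
```

  • For example, the reply `227 Entering Passive Mode (1,1,1,1,8,0).` would yield the triplet `("TCP", "1.1.1.1", 2048)`, which identifies the server side of the upcoming data channel.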
  • 1) The client uses one of its own ports (such as port 1173) to connect over TCP to port 21 of the FTP server, creating the control channel.
  • At this point the control flow table and the data flow table are configured.
  • The control flow table and the data flow table are both part of the flow table.
  • Entries in the control flow table are for control packets, and entries in the data flow table are for data packets.
  • The quintuple information in the control flow table is: source port: 1173; source IP: 2.2.2.2; destination port: 21; destination IP: 1.1.1.1; Type: TCP;
  • The quintuple information in the data flow table is as follows:
  • Source port: X
  • Source IP: X
  • Destination port: X
  • Destination IP: 1.1.1.1
  • Type: TCP
  • The data flow table at this point is a temporary data flow table containing incomplete information (the fields marked X are not yet known).
  • the client sends a PORT command to the server to establish a data connection and create a data channel.
  • the PORT command includes a port (such as port 1174) of the client, and the server can connect to the client according to the port.
  • the association identification table may be configured according to the PORT command, where the triplet information in the association identification table is: source port: 1174; source IP: 2.2.2.2; Type: TCP;
  • the association identification table may further include a protocol ID corresponding to the triplet information (used to represent the protocol); the protocol ID may be obtained by performing protocol identification on the control packet in step S2032 (assuming that the control packet has already been identified and the protocol ID corresponding to the quintuple has been extracted).
  • the source port and source IP here take the client as the source node and the server as the destination node. If the server is regarded as the source node and the client as the destination node, the triplet information is instead: destination port: 1174; destination IP: 2.2.2.2; Type: TCP. Either the source port and source IP or the destination port and destination IP may therefore be used, depending on the specific situation.
  • the quintuple information in the data flow table may be updated at the same time to: Source port: 1174; Source IP: 2.2.2.2; Destination port: X; Destination IP: 1.1.1.1; Type: TCP. The quintuple information in the data flow table is still incomplete at this point (the destination port is missing). 3) The server sends a SYN packet to create the data channel;
  • the server actively opens port 20 and sends a SYN packet to port 1174 of the client for TCP negotiation to complete the establishment of the data channel.
  • the quintuple information in the data flow table can then be determined as: Source port: 1174; Source IP: 2.2.2.2; Destination port: 20; Destination IP: 1.1.1.1; Type: TCP.
  • in a real network, TCP negotiation may succeed while the data channel itself is not successfully established (in which case TCP negotiation must be performed again), so the data channel is considered actually established only when the receiving end receives a transmitted data packet.
  • the quintuple information extracted during the SYN negotiation therefore cannot yet be placed in the flow table, but the triplet information obtained while establishing the data channel is already determined (TCP negotiation takes place after the PORT command and does not affect the extracted triplet information) and can be saved in the association identification table.
  • the above is the process of establishing an FTP connection in Standard mode.
  • Passive mode and other protocols work in a similar way.
  • in each case, the IP, port and Type information required by the association identification table can be extracted from the relevant commands issued when the data channel is created.
  • in other implementations, the flow table and the association identification table may be configured according to the protocol and the actual application, and are not limited to the situations listed in the foregoing embodiment.
  • when the association identification table is configured, an association flag bit may also be set to indicate whether the data flow table has an association identification table associated with it: if so, the flag may be set to 1; if not, it may be set to 0.
  • with the association flag bit, it is easy to check whether the data flow table has an associated association identification table, which simplifies subsequent operations (such as deleting the association identification table).
  • the control channel may be deleted, and the related flow table and the association identification table are also deleted.
  • the method may include the following steps: S301: receive a delete-control-channel message;
  • step S304: determine whether the data flow table is associated with an association identification table. If yes, go to step S305; if no, end;
  • whether the data flow table is associated with an association identification table may be determined by checking the association flag bit introduced above. The flag is created when the association identification table is created: its value is 1 if the data flow table has an associated association identification table, and 0 otherwise;
  • the foregoing steps describe a logical process.
  • in actual operation, the association identification table may be deleted first, then the data flow table, and finally the control flow table;
  • a similar approach can be used when the control channel or data channel is aging.
  • the aging process may be performed on the flow table and the association identification table, where the aging of the control channel and the aging of the data channel are included.
  • aging of the control channel means that when no packet has entered a control channel for a period of time, the control flow table corresponding to that control flow is deleted, together with the data flow table corresponding to the control channel.
  • if the data flow table is associated with an association identification table, the association identification table is also deleted.
  • the aging of the control channel can include the following steps:
  • step S401: periodically determine whether the control channel needs to be aged; if yes, go to step S402; if no, continue to determine whether aging is required;
  • whether the control channel needs to be aged is checked at regular intervals; if no packet has entered a control flow of the control channel within a certain period of time, aging is required.
  • step S404: determine whether the data flow table is associated with an association identification table; if yes, execute step S405; if not, execute step S406;
  • the aging of the data channel can include the following steps:
  • S501: periodically determine whether the data channel needs to be aged; if yes, go to step S502; if no, continue to determine whether aging is required;
  • the matching of the flow table and the association identification table can be performed by dedicated hardware processing units such as an NP (Network Processor), an FPGA (Field Programmable Gate Array) or an ASIC (Application-Specific Integrated Circuit); processing is relatively fast, and the exact matching method is used.
  • if the quintuple information in the flow table is not matched, the triplet information in the association identification table is matched; if that match succeeds, policy management can be performed directly, without rule matching and protocol verification to identify the protocol.
  • the matching process of the flow table and the associated identification table is processed by hardware, which can greatly speed up the processing and improve the processing efficiency.
  • Another advantage of the embodiment of the present invention is that if the protocol can be identified through the association identification table, the rule matching and protocol verification modules are not needed. Rule matching requires many rules and therefore a lot of space to store them; since rule matching is skipped in this case, the embodiment can save rule storage space while also speeding up matching.
  • Embodiment 3 of the present invention provides an association protocol identification method for improving protocol identification speed.
  • the information in the association identification table is not stored separately but together with the flow table, and is matched in the form of a single matching table; in this case the longest matching method is used.
  • when a packet is received, the quintuple information is first matched in the flow table; if the match succeeds, the policy is executed. If the quintuple match fails, the triplet information is matched in the flow table; if that succeeds, policy management is performed. If neither match succeeds, the protocol is identified through rule matching, protocol verification and similar steps.
  • the flow table includes the triplet information and the corresponding protocol ID; for how they are obtained and how the flow table is configured, see the related steps in Embodiment 3, which are not repeated here.
  • the triplet information corresponding to the protocol is created while the control channel creates the data channel of the data flow. During matching, the quintuple information is matched first and, if that fails, the triplet information; when the triplet match succeeds, policy management is performed, so rule matching and protocol verification are not required and protocol identification is faster.
  • when the flow table matching is performed by hardware such as an NP, FPGA or ASIC, processing efficiency can be improved even further.
  • Another advantage of the embodiment of the present invention is that since the rule matching is not required when the triplet matching is successful, the space for storing the rules can be saved.
  • the embodiment of the invention provides an association protocol identification device, which is used for improving the speed of protocol identification processing, see the figure
  • the receiving unit 901 is configured to receive a message
  • the association identification table matching unit 902 is configured to match the first matching information in the packet received by the receiving unit against the first matching information in the association identification table. The first matching information in the association identification table is obtained by extracting the content of the control packet used to create the data channel; it corresponds to the packet protocol, which is obtained by performing protocol identification on the control packet. When the association identification table matching succeeds, the unit outputs the protocol information of the successful match.
  • the flow table matching unit 903 is configured to perform flow table matching on the received packet before the association identification table matching; the flow table matching includes matching the second matching information in the packet against the second matching information in the flow table.
  • the second matching information is used to look up related information of the data flow corresponding to the packet, including the execution policy corresponding to the packet.
  • the first matching information includes a protocol field, a source IP and a source port; or the first matching information includes a protocol field, a destination IP and a destination port. The second matching information includes a protocol field, a source IP, a source port, a destination IP and a destination port. It should be noted that this information is not fixed and may be extended or reduced according to what the actual protocol matching requires.
  • a matching table storage unit 904 configured to store the association identification table and the flow table
  • the flow table and the association identification table may be located in different matching tables, that is, in the form of two independent tables; or the association identification table and the flow table are in the same matching table, and the data of the two tables are in the same table.
  • the matching table storage unit 904 can be implemented with one physical storage medium (such as memory, a hard disk or another type of storage) or with multiple physical storage media, and the storage location of the flow table and the association identification table on these media is not limited.
  • in a software implementation, a single storage structure (such as a flow table) may be defined to store the data of both the flow table and the association identification table, or multiple storage structures may be defined separately (such as a flow table plus an association identification table).
  • the flow table and association identification table can be implemented by using a hash table or a TCAM table or other tables having similar functions.
  • the flow table storage unit is not strictly distinguished from the association identification table storage unit: they may be two units or may be merged into one (for example, the flow table storage unit may be regarded as including the association identification table storage unit, that is, the flow table storage unit implements its function), depending on the implementation.
  • when the association identification table and the flow table are in different matching tables, flow table matching and association identification table matching use the exact matching method; when they are in the same matching table, both use the longest matching method.
  • the information processing unit 905 is configured to extract the first matching information from the control packet into the matching table storage unit when the control packet creates the data channel; it is further configured to configure the flow table or the association identification table according to the type of the received packet, including updating the second matching information in the flow table or the first matching information in the association identification table.
  • the policy management unit 906 is configured to perform policy management according to the output protocol information when the association identification table is successfully matched, and update the flow table or the association identification table according to the result of the policy management.
  • the policy execution unit 907 is configured to perform a corresponding policy on the successfully matched packet after the matching of the flow table is successful.
  • the rule matching unit 908 is configured to perform rule matching on the received packet when the association identification table matching is unsuccessful;
  • the protocol verification unit 909 is configured to perform protocol verification on the packet processed by the rule matching unit, and send the result of the protocol verification to the policy management unit.
  • the aging unit 910 is configured to delete the related information in the flow table and/or the association identification table stored in the matching table storage unit when it detects that no packet has entered the data channel or the control channel for a period of time.
  • each unit may be implemented by a general-purpose processor, or by a dedicated processor or hardware having processing functions; all units may be implemented by the same processor, or several units may share one processor, and the specific implementation is not limited.
  • for example, processors such as an NP (Network Processor), an FPGA (Field Programmable Gate Array) or an ASIC (Application-Specific Integrated Circuit) can be used.
  • Another advantage of the embodiment of the present invention is that if the protocol can be identified through the association identification table, the rule matching and protocol verification modules are not needed. Rule matching requires many rules and therefore a lot of space to store them; since rule matching is skipped in this case, the embodiment can save rule storage space while also speeding up matching.
  • the storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Description

Packet Detection Method and Apparatus
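This application claims priority to Chinese Patent Application No. 200910109820.3, filed with the Chinese Patent Office on November 19, 2009 and entitled "A packet detection method and apparatus", which is incorporated herein by reference in its entirety.

Technical Field

The present invention relates to the field of communications technologies, and in particular to a packet detection method and apparatus.

Background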
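DPI (Deep Packet Inspection) is a packet inspection technology. In addition to inspecting and analyzing the source IP address, destination IP address, source port, destination port, session information and other information in an IP packet (data at layer 4 and below), it also analyzes the payload of the IP packet (application-layer data) in depth, so that the various applications on a network can be identified more effectively.
DPI is now widely used in traffic control, content-based charging, network security and other fields. In DPI applications, protocol identification during packet inspection is a key technology: all subsequent analysis depends on its result, and the speed, breadth and accuracy of protocol identification largely determine the performance of a DPI device.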
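Figure 1 is a schematic flowchart of protocol identification in existing DPI technology, which includes the following steps:
1) Perform flow table matching on the received packet. If the match succeeds, execute the policy; if not, go to step 2.
The prior art performs flow table matching for both associated protocols (protocols whose control channel and data channel are separate) and non-associated protocols. Flow table matching here means matching the matching information in a received packet against the matching information stored in the flow table. The matching information generally includes the protocol field (Type), source IP address (SIP), source port (SPort), destination IP address (DIP) and destination port (DPort), together referred to as the quintuple information. The quintuple information in the flow table identifies the related information of the flow to which the packet belongs. The content of this related information is not limited and may include, for example, a user ID, policy action, statistics type and protocol ID, but it generally includes the policy action. When flow table matching succeeds, the related policy action can be looked up using the quintuple information as an index, and the policy is executed.
2) Perform rule matching on the packet and output the matching result.
If flow table matching fails, there is no corresponding policy yet, and steps such as protocol identification and policy management are needed to establish the correspondence between the quintuple information and a policy. Protocol identification first performs rule matching on the packet, that is, it extracts relevant information from the packet and matches it against the rules in a rule base in order to identify the packet. Because the rule base is large and a matching algorithm has to be run, rule matching tends to be slow.
3) Perform protocol verification according to the matching result and identify the protocol.
Protocol verification verifies and analyzes the output of rule matching in order to identify the protocol.
4) Perform policy management according to the identified protocol and update the flow table.
After the protocol is identified, the packet can be sent up to the CPU for policy management, and the related information in the flow table, including the correspondence between the quintuple information and the policy, is updated. With the flow table updated, the next packet that arrives can be indexed to the relevant policy through the quintuple information in the flow table, and the policy is executed.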
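In implementing the present invention, the inventors found that the prior art has at least the following disadvantage:
When flow table matching fails, protocol identification and flow table updating have to go through rule matching, protocol verification, policy management and other steps. Rule matching and protocol verification tend to be slow, so the speed of protocol identification cannot be improved further.

Summary of the Invention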
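Embodiments of the present invention provide a packet detection method and apparatus for improving the speed of protocol identification, as follows.
A packet detection method includes the following steps:
receiving a packet and performing flow table matching on the packet;
when the flow table matching fails, performing association identification table matching on the packet, where the association identification table matching includes matching the triplet information in the packet, and the triplet information includes a protocol field, a source IP and a source port, or includes a protocol field, a destination IP and a destination port;
when the association identification table matching succeeds, outputting the protocol information successfully matched in the association identification table.
Also provided is a packet detection method, including the following steps:
performing association identification table matching between first matching information in a received packet and first matching information in an association identification table, where the first matching information in the association identification table is obtained by extracting the content of a control packet used to create a data channel, the first matching information corresponds to a packet protocol, and the packet protocol is obtained by performing protocol identification on the control packet;
when the association identification table matching succeeds, outputting the protocol information of the successful association identification table match.
Also provided is an associated protocol identification apparatus, including:
a receiving unit, configured to receive a packet;
an association identification table matching unit, configured to perform association identification table matching between the first matching information in the packet received by the receiving unit and the first matching information in the association identification table, where the first matching information in the association identification table is obtained by extracting the content of the control packet used to create the data channel, the first matching information corresponds to the packet protocol, and the packet protocol is obtained by performing protocol identification on the control packet; and, when the association identification table matching succeeds, to output the protocol information of the successful match.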
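When the matching information in the association identification table is matched, because that information is obtained by extracting the content of the control packet used to create the data channel and corresponds to the protocol obtained by performing protocol identification on the control packet, the protocol can be identified without inspecting the packet content, which improves the speed of protocol identification.

Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.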
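Figure 1 is a schematic diagram of protocol identification in the prior art;
Figure 2 is a flowchart of the method of Embodiment 2 of the present invention;
Figure 3 is a schematic diagram of protocol identification in Embodiment 3 of the present invention;
Figure 4 is a flowchart of the method of Embodiment 3 of the present invention;
Figure 4A shows the flow for creating and configuring the flow table and the association identification table in Embodiment 3 of the present invention;
Figure 5A is a schematic diagram of creating a data channel through the control channel in Embodiment 3 of the present invention;
Figure 5B is another schematic diagram of creating a data channel through the control channel in Embodiment 3 of the present invention;
Figure 6 is a schematic diagram of deleting flow tables in Embodiment 3 of the present invention;
Figure 7 is a schematic diagram of aging the control channel in Embodiment 3 of the present invention;
Figure 8 is a schematic diagram of aging the data channel in Embodiment 3 of the present invention;
Figure 9 is a schematic diagram of protocol identification in Embodiment 4 of the present invention;
Figure 10 is a schematic structural diagram of the apparatus of Embodiment 5 of the present invention.

Detailed Description

To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below through specific embodiments and the related drawings.

Embodiment 1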
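Embodiment 1 of the present invention provides a packet detection method, including the following steps:
receiving a packet and performing flow table matching on the packet;
when the flow table matching fails, performing association identification table matching on the packet, where the association identification table matching includes matching the triplet information in the packet, and the triplet information includes a protocol field, a source IP and a source port, or includes a protocol field, a destination IP and a destination port;
when the association identification table matching succeeds, outputting the protocol information successfully matched in the association identification table. The association identification table matching includes: matching the triplet information of the packet against the triplet information in the association identification table, where the triplet information in the table is obtained by extracting the content of the control packet used to create the data channel and corresponds to the packet protocol, which is obtained by performing protocol identification on the control packet.
In this embodiment, when the association identification table matching of the packet fails, protocol identification is performed on the packet; protocol identification includes rule matching and protocol verification.
In addition, when flow table matching succeeds, the corresponding policy is executed on the matched packet.
In this embodiment, flow table matching mainly matches the quintuple information in the packet against the quintuple information in the flow table. The quintuple information may include the protocol field, source IP, source port, destination IP and destination port. It is mainly used to identify a flow and to index into the flow table; if other information can serve the same purpose, the quintuple information may be replaced with that information.
The flow table and the association identification table in this embodiment may be two different tables or the same table (for example, both located in the flow table). When they are two tables, the exact matching method can be used to match each of them; when they are one table, the longest matching method can be used to match the flow table and the association identification table in turn.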
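The order of lookups described in this embodiment can be sketched in Python as follows. This is only a minimal illustration, not the patent's implementation: the dictionary-based tables, the `detect` function, the stage labels and the `identify_by_rules` callback (standing in for rule matching plus protocol verification) are all assumptions introduced here, and only the source-side triplet is tried.

```python
def detect(packet, flow_table, assoc_table, identify_by_rules):
    """Return (stage, result) for one packet (hypothetical helper).

    flow_table maps a quintuple to a policy action; assoc_table maps a
    triplet to a protocol ID; identify_by_rules is the slow fallback.
    """
    quintuple = (packet["proto"], packet["sip"], packet["sport"],
                 packet["dip"], packet["dport"])

    # 1) Flow table matching: a hit means the policy is already known.
    if quintuple in flow_table:
        return ("flow_table", flow_table[quintuple])

    # 2) Association identification table matching on the triplet
    #    (protocol field, source IP, source port).
    triplet = quintuple[:3]
    if triplet in assoc_table:
        return ("association_table", assoc_table[triplet])

    # 3) Only now fall back to rule matching and protocol verification.
    return ("rule_matching", identify_by_rules(packet))
```

In this sketch the slow path is reached only when both table lookups miss, which is the property the embodiment relies on for its speed-up.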
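In this embodiment, when flow table matching fails, association identification table matching is performed, which can identify the protocol directly without going through the protocol identification process again, so the speed of protocol identification can be improved.

Embodiment 2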
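As shown in Figure 2, method Embodiment 2 of the present invention provides a packet detection method for improving the speed of protocol identification, including the following steps:
S101: Perform association identification table matching between the first matching information in a received packet and the first matching information in the association identification table. The first matching information in the association identification table is obtained by extracting the content of the control packet used to create the data channel; it corresponds to the packet protocol, which is obtained by performing protocol identification on the control packet.
The first matching information here includes the protocol field, source IP and source port, or the protocol field, destination IP and destination port. It should be noted that the first matching information comes from packets of protocols whose control channel and data channel are separate (such as the vast majority of P2P protocols). For these protocols, data can be transferred only after a data channel has been created through the control channel, and the first matching information can be extracted when the control packet creates the data channel. The first matching information also corresponds to the protocol information obtained by performing protocol identification on the control packet. If association identification table matching succeeds, the protocol can therefore be identified directly, without rule matching, protocol verification or similar steps.
For example, when FTP (File Transfer Protocol, one kind of associated protocol) is used, the client first establishes a control connection with port 21 of the server to create a control channel, and then sends a command over the control channel to establish the data channel. The specific command differs from protocol to protocol, but it generally contains the triplet information. For example, when FTP establishes a data channel, the PORT command sent by the client contains the port to be used for the data connection. The triplet information in the relevant control packet (such as the PORT command) can then be extracted and used as the first matching information.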
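As an illustration of extracting the triplet from a control packet, the sketch below parses an FTP PORT command. The argument format (four address bytes followed by two port bytes, with port = p1 × 256 + p2) comes from the FTP specification (RFC 959); the function name and the layout of the returned tuple are hypothetical choices made for this example.

```python
import re

# "PORT h1,h2,h3,h4,p1,p2" as defined by the FTP specification.
_PORT_RE = re.compile(
    r"^PORT\s+(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)$", re.IGNORECASE
)

def triplet_from_port_command(line, proto="TCP"):
    """Return (protocol field, IP, port) announced by a PORT command,
    or None if the line is not a well-formed PORT command."""
    match = _PORT_RE.match(line.strip())
    if match is None:
        return None
    numbers = [int(group) for group in match.groups()]
    if any(n > 255 for n in numbers):
        return None  # every field must fit in one byte
    ip = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]
    return (proto, ip, port)
```

With the addresses used later in this description, a client at 2.2.2.2 announcing port 1174 would send `PORT 2,2,2,2,4,150`, since 4 × 256 + 150 = 1174.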
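Before association identification table matching, this embodiment first performs flow table matching on the received packet. Flow table matching includes matching the second matching information in the packet against the second matching information in the flow table. The second matching information is used to look up related information of the flow to which the packet belongs, including the execution policy for the packet. In this embodiment the second matching information is the protocol field, source IP, source port, destination IP and destination port. These five items are generally enough to determine the related information of the flow, including the user ID, policy action, statistics type (such as traffic statistics or packet size statistics), flow ID (indicating which user the flow corresponds to), protocol ID and other information.
It should be noted that the flow information determined by the quintuple is not limited, but it generally includes the policy action, so that after a quintuple match the policy action for the packet can be found through the quintuple and the policy executed. Likewise, the second matching information is not limited to the five items protocol field, source IP, source port, destination IP and destination port: if a flow needs more identifying information, or can be determined with fewer than five items, the content of the second matching information can be extended or reduced accordingly.
In this embodiment, the flow table and the association identification table may be implemented as a hash table, a TCAM (Ternary Content Addressable Memory) table or a table of another form. The two may be located in different matching tables, that is, two different tables store the data of the flow table and of the association identification table respectively; or they may be located in the same matching table, that is, one table stores the data of both. When they are in different matching tables, the exact matching method is used for flow table matching and association identification table matching; when they are in the same matching table, the longest matching method is used for both.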
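The single-table variant can be pictured with the sketch below. The description does not say how longest matching is implemented, so this is an assumption: unknown fields are stored as a `WILDCARD` value, and the lookup tries the most specific key (all five fields) before the triplet keys. A TCAM would achieve the same effect with masks and entry priority. All names are hypothetical.

```python
WILDCARD = None  # marks a field that is not part of the stored key

def longest_match(table, proto, sip, sport, dip, dport):
    """Look up one combined table, most specific key first.

    Returns (matched_key, value), or None if nothing matches.
    """
    candidates = [
        (proto, sip, sport, dip, dport),          # full quintuple
        (proto, sip, sport, WILDCARD, WILDCARD),  # source-side triplet
        (proto, WILDCARD, WILDCARD, dip, dport),  # destination-side triplet
    ]
    for key in candidates:
        if key in table:
            return key, table[key]
    return None
```

A quintuple entry therefore always wins over a triplet entry for the same packet, which reproduces the order "flow table first, association identification table second" inside one table.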
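In this embodiment, the flow table or the association identification table is also configured according to the type of the received packet (such as a packet of an ordinary protocol, a control packet of an associated protocol, or a data packet of an associated protocol), including updating the second matching information in the flow table or the first matching information in the association identification table.
S102: When the association identification table matching succeeds, output the protocol information of the successful match. The output protocol information is the information obtained from the successful match. Policy management can then be performed for the protocol and the flow table updated. The policy here corresponds to the protocol, for example dropping, passing or marking the relevant packets.
If association identification table matching fails, rule matching, protocol verification and similar steps are performed on the packet, as in the prior art. These steps can identify the protocol, but much more slowly than association identification table matching.
In this embodiment, when flow table matching succeeds, the policy is executed directly according to the matching result.
In this embodiment, the flow table and the association identification table can also be aged: if no packet is detected entering the data channel or control channel for a period of time, the related information in the flow table and/or the related information in the association identification table is deleted.
In this embodiment, when the control packet of an associated protocol creates a data channel through the control channel, the first matching information in the packet is extracted, and this first matching information corresponds to the protocol information obtained by performing protocol identification on the control packet. As long as association identification table matching succeeds, protocol identification is therefore complete, and rule matching, protocol verification and similar steps are no longer needed, which speeds up protocol identification.

Embodiment 3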
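This embodiment of the present invention provides a packet detection method for improving the speed of protocol identification. Figure 3 is a schematic flowchart of this embodiment. When a packet is received, flow table matching is performed first; if the match succeeds, the policy is executed. If flow table matching fails, this embodiment does not perform rule matching but instead performs association identification table matching. The association identification table stores triplet information and the protocol ID corresponding to each triplet. The triplet information is obtained when the control channel of an associated protocol creates the data channel, and the corresponding protocol ID is obtained by performing protocol identification on the control packet. If the triplet match succeeds, policy management is performed directly according to the protocol obtained; if not, rule matching, protocol verification and similar steps are performed to identify the protocol.
An associated protocol is a protocol whose control channel and data channel are separate. This embodiment extracts the triplet information while the control channel of the associated protocol creates the data channel, and this triplet information corresponds to the protocol obtained by identifying the control packet of the associated protocol. Only the triplet therefore needs to be matched: if the match succeeds, the protocol is identified and policy management can be performed accordingly. This skips the rule matching and protocol verification stages and can greatly increase the speed of protocol identification.
It should be noted that flow table matching means matching the matching information in the packet against the matching information in the flow table. The matching information in the flow table determines the related information of the flow to which the packet belongs, including the user ID, policy action, statistics type, flow ID, protocol ID and other information. It should also be noted that the flow information determined by the quintuple is not limited, but it generally includes the policy action, so that after a quintuple match the policy action for the packet can be found through the quintuple and the policy executed.
In this embodiment, the five items protocol field (Type), source IP, destination IP, source port and destination port (the quintuple) are used as the matching information. If a protocol needs additional information to be matched, or can be matched with fewer than five items, the matching information can be adjusted accordingly. Likewise, the matching information in the association identification table is not limited.
The embodiment is described in detail below, taking quintuple information as the matching information in the flow table and triplet information as the information in the association identification table. Referring to Figure 4, the method includes the following steps: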
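S201: Receive a packet and obtain its quintuple information, which includes the protocol field (Type), source IP (SIP), source port (SPort), destination IP (DIP) and destination port (DPort).
The packet here is a packet that needs to be identified, including packets of associated protocols whose control channel and data channel are separate. Associated protocols are widely used. In P2P (Peer to Peer) applications, for example, many protocols are associated protocols, and P2P accounts for a significant share of network traffic (more than 90% at evening peak times). Common protocols such as FTP and SIP (Session Initiation Protocol) are also associated protocols. If the identification of associated protocols can be made faster, the whole protocol identification process becomes faster as well. For associated protocol packets, both the packets on the control channel (control packets) and the packets on the data channel (data packets) need to be received and subsequently identified.
The protocol field (Type) in the quintuple information indicates whether the packet is carried over TCP (Transmission Control Protocol) or UDP (User Datagram Protocol). The source IP (SIP) and source port (SPort) are the IP address and port number of the user sending the packet; the destination IP (DIP) and destination port (DPort) are the IP address and port number of the user receiving the packet.
In the flow table the quintuple information acts as an index, through which the user ID, policy action, statistics type, flow ID, protocol ID and other information in the flow table can be looked up. It should be noted that this information is not fixed and can be extended or reduced according to the actual application, but it generally includes the policy action, so that when the match succeeds the corresponding policy can be found through the quintuple information and executed.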
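The role of the quintuple as an index can be made concrete with a small data-structure sketch. The field names follow the abbreviations used above (Type, SIP, SPort, DIP, DPort); the `FlowEntry` layout, the action strings and the use of a Python dictionary in place of a hash or TCAM table are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple, Optional

class Quintuple(NamedTuple):
    proto: str   # protocol field (Type): "TCP" or "UDP"
    sip: str     # source IP
    sport: int   # source port
    dip: str     # destination IP
    dport: int   # destination port

@dataclass
class FlowEntry:
    policy_action: str                 # e.g. "drop", "pass", "to_cpu"
    protocol_id: Optional[int] = None
    flow_id: Optional[int] = None
    user_id: Optional[int] = None

# The flow table: quintuple -> flow information.
flow_table: Dict[Quintuple, FlowEntry] = {}

def lookup_policy(key: Quintuple) -> Optional[str]:
    """Exact-match lookup; None means no policy exists for this flow yet."""
    entry = flow_table.get(key)
    return entry.policy_action if entry is not None else None

# Example entry for the FTP control flow used later in this description.
flow_table[Quintuple("TCP", "2.2.2.2", 1173, "1.1.1.1", 21)] = FlowEntry(
    policy_action="pass", protocol_id=1
)
```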
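S202: Match the quintuple information of the packet against the stored quintuple information and determine whether the match succeeds. If it succeeds, execute the policy; if not, go to step S203.
Quintuple matching is performed in the flow table. Besides the quintuple information, the flow table may store the user ID, flow ID, policy action (such as drop, pass or send to the CPU), statistics type and other information. The quintuple information in the flow table acts as an index through which the user ID, flow ID, policy action and other information can be found. If the quintuple match succeeds, a policy action for this quintuple already exists in the flow table; it can be looked up through the quintuple information and then executed. If the match fails, there is no policy for this quintuple yet and the subsequent steps are needed.
S203: Match the triplet information of the packet against the stored triplet information. If the match succeeds, go to step S206 and perform policy management for the matched protocol; if it fails, go to step S204.
The triplet information consists of the protocol field, source IP and source port, or of the protocol field, destination IP and destination port. Because source and destination are relative (for the same action of a client sending a message to a server, either the client or the server may be regarded as the source node), the triplet information can be expressed either way.
In this embodiment, triplet matching is performed in the association identification table, which records the triplet information and the protocol identified by performing protocol identification on control channel packets. This embodiment uses a protocol ID to represent the protocol; other embodiments may represent it differently. Each triplet in the association identification table corresponds to one protocol. A successful triplet match means the protocol has been identified, and step S206 then performs policy management for the matched protocol. During matching, if the definition of source and destination nodes is known in advance, together with whether the stored triplet is (protocol field, source IP, source port) or (protocol field, destination IP, destination port), the corresponding triplet in the packet can be matched directly. If this is not known in advance, the packet's protocol field, source IP and source port (or protocol field, destination IP and destination port) can be matched against the association identification table first and, if that fails, the protocol field, destination IP and destination port (or protocol field, source IP and source port) can be tried.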
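When it is not known which side of the connection the stored triplet describes, the two-attempt lookup of step S203 can be sketched as follows. The function name and the dictionary representation of the association identification table are hypothetical; the order of the two attempts is arbitrary, as the text allows either.

```python
def match_association(assoc_table, proto, sip, sport, dip, dport):
    """Return the protocol ID for a packet, or None if no triplet matches.

    assoc_table maps (protocol field, IP, port) to a protocol ID.
    """
    # First attempt: source-side triplet; second: destination-side triplet.
    for triplet in ((proto, sip, sport), (proto, dip, dport)):
        protocol_id = assoc_table.get(triplet)
        if protocol_id is not None:
            return protocol_id
    return None
```

With a stored triplet of (TCP, 2.2.2.2, 1174), both a packet sent from client port 1174 and a packet sent to client port 1174 resolve to the same protocol ID.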
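S204: Perform rule matching on the packet and output the matching result.
When flow table matching fails, the protocol is identified through rule matching, protocol verification and similar steps. Rule matching extracts feature information from the received packet and matches it against the expressions in the rule base; if the match succeeds, the matching result is output. Matching is generally done by a DPI acceleration chip, which speeds it up. If the match fails, the packet can be passed to the CPU for protocol identification in software. Because rule matching compares the feature information in the packet with the expressions in the rule base, and the rule base is often very large and a matching algorithm has to be run, rule matching usually takes far longer than the simple quintuple or triplet matching in the flow table or association identification table. When rule matching is done in software, it is even less efficient than a hardware implementation.
S205: Perform protocol verification according to the matching result.
The output of rule matching is verified and analyzed to determine whether identification is complete; if so, a protocol ID representing the protocol is output.
S206: Perform policy management according to the protocol identified through rule matching and protocol verification, and update the flow table. After protocol verification, the policy corresponding to the protocol is determined, policy management is performed, and the protocol and its policy are written into the flow table. With the flow table updated, the next packet that matches the quintuple can be resolved directly to its protocol and the corresponding execution policy.
In this embodiment, the triplet information used for triplet matching in the association identification table is obtained by extracting the content of the control packet when the control packet creates the data channel, and it corresponds to the protocol identified by performing protocol identification on the control packet. To make the solution clearer and more complete, the creation of the triplet information and the configuration of the flow table and the association identification table are described in detail below. Referring to Figure 4A, the process may include the following steps: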
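S2031: Receive a packet and determine its type. If it is a control packet, go to step S2032; otherwise go to step S2033 and identify the protocol through quintuple matching, triplet matching and the related flow. The packet type is determined before flow table matching. Because the control channel and data channel of an associated protocol are separate, a control channel must be established first when associated protocol packets arrive. The control channel does not carry user data; it is used to establish the protocol's control connection, and a data channel is created afterwards to carry user data. The control channel is generally created by the client connecting to a particular port on the server. For example, when an FTP control channel is created, the client connects to port 21 of the server.
For control packets related to the control channel, step S2032 is performed. Otherwise, if the packet is a data packet of an associated protocol or a packet of a non-associated protocol, step S2033 is performed and the packet is processed through flow table matching, association identification table matching and so on, as shown in Figure 4.
S2032: Perform protocol identification or content analysis on the control packet. When the control packet is used by the control channel to create a data channel, extract the relevant triplet information and configure the flow table and/or the association identification table.
The control channel and data channel of an associated protocol are separate. The control channel does not carry user data; it is used to establish the protocol's control connection, and a data channel is created afterwards to carry user data. The control channel is generally created by the client connecting to a particular port on the server. For example, when an FTP control channel is created, the client connects to port 21 of the server.
When the received packet is determined to be a control packet, flow table matching is performed first. If the match fails, the protocol is identified through rule matching, protocol verification and similar steps (see steps S204 and S205). After identification, policy management is performed for the identified protocol and the result is written to the flow table. The flow table now has one more entry, which records the quintuple information and the corresponding policy action (and possibly other information such as the protocol ID and flow ID). The next time a packet with the same quintuple information arrives, the corresponding policy action can be found by matching the quintuple information in the flow table, and the policy is executed.
Because the first packet of a protocol is being received for the first time, it generally has to go through rule matching, protocol verification and similar steps for protocol identification. When a protocol control packet is not the first packet, its content can be analyzed (including extracting the quintuple information for subsequent flow table matching), and the corresponding steps, such as flow table matching, are then performed according to the content of the packet.
Taking FTP as an example, the creation of the data channel and the configuration of the flow table and the association identification table are described in detail below.
FTP supports two connection modes: Standard mode (also called PORT mode or active mode) and Passive mode (also called PASV mode). In Standard mode, the FTP client sends a PORT command to the FTP server, telling the server which port the client uses for the data connection. In Passive mode, the FTP client sends a PASV command to the FTP server; in response, the server tells the client which port the server uses for the data connection, and the client connects to that port.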
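The data channel creation flow is described below using a PORT mode connection as an example. Referring to Figures 5A and 5B, and assuming the client IP is 2.2.2.2 and the server IP is 1.1.1.1, the flow includes the following steps:
1) Establish the control channel and configure the control flow table and the data flow table.
Referring to Figure 5A, the client first uses one of its own ports (such as port 1173) to establish a TCP connection with port 21 of the FTP server, creating the control channel. At the same time the control flow table and the data flow table are configured. Both are part of the flow table: entries in the control flow table are for control packets, and entries in the data flow table are for data packets.
At this point, the quintuple information in the control flow table is:
Source port: 1173; Source IP: 2.2.2.2; Destination port: 21; Destination IP: 1.1.1.1; Type: TCP.
The quintuple information in the data flow table is:
Source port: X; Source IP: X; Destination port: X; Destination IP: 1.1.1.1; Type: TCP (the letter X means the value is unknown, here and below). The data flow table at this point is a temporary data flow table containing incomplete information.
2) Obtain the triplet information from the PORT command sent by the client and configure the association identification table.
After the control channel is established, the client sends a PORT command to the server to establish the data connection and create the data channel. The PORT command contains a port of the client (such as port 1174), which the server can use to connect to the client.
The association identification table can now be configured according to the PORT command. The triplet information in the association identification table is: Source port: 1174; Source IP: 2.2.2.2; Type: TCP.
Besides the triplet information, the association identification table may include the protocol ID corresponding to the triplet (used to represent the protocol). The protocol ID can be obtained by performing protocol identification on the control packet in step S2032 (assuming the control packet has already been identified and the protocol ID corresponding to the quintuple has been extracted). It should be noted that the source port and source IP here take the client as the source node and the server as the destination node. If the server is regarded as the source node and the client as the destination node, the triplet information is instead: Destination port: 1174; Destination IP: 2.2.2.2; Type: TCP. Either the source port and source IP or the destination port and destination IP can therefore be used, depending on the specific situation.
At the same time, the quintuple information in the data flow table can be updated to:
Source port: 1174; Source IP: 2.2.2.2; Destination port: X; Destination IP: 1.1.1.1; Type: TCP. The quintuple information in the data flow table is still incomplete (the destination port is missing).
3) The server sends a SYN packet to create the data channel.
Referring to Figure 5B, after receiving the PORT command, the server actively opens port 20 and sends a SYN packet to port 1174 of the client for TCP negotiation to complete the establishment of the data channel. The quintuple information in the data flow table can now be determined as: Source port: 1174; Source IP: 2.2.2.2; Destination port: 20; Destination IP: 1.1.1.1; Type: TCP. It should be noted that in a real network TCP negotiation may succeed while the data channel itself is not successfully established (in which case TCP negotiation must be performed again), so the data channel is considered actually established only when the receiving end receives a transmitted data packet. The quintuple information extracted during SYN negotiation therefore cannot yet be placed in the flow table, but the triplet information obtained while establishing the data channel is already determined (TCP negotiation takes place after the PORT command and does not affect the extracted triplet information) and can be saved in the association identification table.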
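The above is the process of establishing an FTP connection in Standard mode. Passive mode and other protocols work in a similar way: in each case the IP, port and Type information required by the association identification table can be extracted from the relevant commands issued when the data channel is created.
It should also be noted that the above example is only one specific embodiment of the present invention. In other implementations, the flow table and the association identification table may be configured according to the protocol and the actual application, and are not limited to the situations listed in the above embodiment.
Optionally, when the association identification table is configured, an association flag bit may also be set to indicate whether the data flow table has an association identification table associated with it: if so, the flag may be set to 1; if not, it may be set to 0. With the association flag bit, it is easy to check whether the data flow table has an associated association identification table, which simplifies subsequent operations (such as deleting the association identification table).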
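In this embodiment, when associated protocols no longer need to be identified, the control channel can be deleted together with the related flow tables and association identification table. Referring to Figure 6, this may include the following steps:
S301: Receive a delete-control-channel message.
S302: Delete the control flow table.
S303: Delete the data flow table.
S304: Determine whether the data flow table is associated with an association identification table. If yes, go to step S305; if no, end.
Whether the data flow table is associated with an association identification table can be determined by checking the association flag bit introduced above. The flag is created when the association identification table is created: its value is 1 if the data flow table has an associated association identification table, and 0 otherwise.
S305: Delete the association identification table. This completes the deletion of the association identification table.
It should be noted that the above steps describe the logical process of deleting the flow tables. In actual operation, the association identification table may be deleted first, then the data flow table, and finally the control flow table. A similar approach can be used when aging the control channel or the data channel.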
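The practical deletion order just described (association identification table first, then data flow table, then control flow table) can be sketched as follows. The dictionary layout, including the `data_keys`, `assoc_flag` and `assoc_key` fields, is an assumption made for this example; the description only specifies the association flag bit and the order of deletion.

```python
def delete_control_channel(control_key, control_flows, data_flows, assoc_table):
    """Delete a control channel and everything that hangs off it.

    Returns True if the control flow existed, False otherwise.
    """
    ctrl = control_flows.get(control_key)
    if ctrl is None:
        return False
    for data_key in ctrl["data_keys"]:
        data = data_flows.get(data_key)
        if data is None:
            continue
        # Association flag bit: 1 means an association entry exists.
        if data["assoc_flag"] == 1:
            assoc_table.pop(data["assoc_key"], None)
        del data_flows[data_key]
    del control_flows[control_key]
    return True
```

Deleting in this order means a stale association entry is never left behind pointing at a data flow that no longer exists.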
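In this embodiment, the flow table and the association identification table can also be aged. This includes aging of the control channel and aging of the data channel.
Aging of the control channel means that when no packet is detected entering a control channel for a period of time, the control flow table corresponding to that control flow is deleted, together with the data flow table corresponding to the control channel; if the data flow table is associated with an association identification table, the association identification table is deleted as well.
Referring to Figure 7, aging of the control channel may include the following steps:
S401: Periodically determine whether the control channel needs to be aged. If yes, go to step S402; if no, continue checking whether aging is needed.
Whether the control channel needs to be aged is checked at regular intervals. If no packet has entered a control flow of the control channel within a certain period of time, aging is needed.
S402: Delete the control flow table.
S403: Delete the data flow table.
S404: Determine whether the data flow table is associated with an association identification table. If yes, go to step S405; if no, go to step S406.
S405: Delete the association identification table.
S406: Determine whether all data flow tables corresponding to the control flow table have been deleted. If yes, end the flow; if no, go to step S403.
This completes the control channel aging flow.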
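Steps S401 to S406 can be sketched as a periodic sweep. This is an illustration under stated assumptions: each control flow entry is assumed to carry a `last_seen` timestamp and a list of its data flow keys, and the caller supplies the current time and the idle timeout; none of these names come from the patent.

```python
def age_control_channels(now, timeout, control_flows, data_flows, assoc_table):
    """Remove control channels idle for longer than `timeout`.

    Returns the list of control flow keys that were aged out.
    """
    # S401: decide which control channels need aging.
    expired = [key for key, ctrl in control_flows.items()
               if now - ctrl["last_seen"] > timeout]
    for key in expired:
        ctrl = control_flows.pop(key)                     # S402
        for data_key in ctrl["data_keys"]:                # S406 loop
            data = data_flows.pop(data_key, None)         # S403
            if data is not None and data["assoc_flag"] == 1:   # S404
                assoc_table.pop(data["assoc_key"], None)  # S405
    return expired
```

A caller would invoke this from a timer, passing a monotonic clock reading as `now`.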
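Referring to Figure 8, aging of the data channel may include the following steps:
S501: Periodically determine whether the data channel needs to be aged. If yes, go to step S502; if no, continue checking whether aging is needed.
S502: Delete the control flow table.
S503: Delete the data flow table.
S504: Determine whether the data flow table is associated with an association identification table. If yes, go to step S505; if no, end the flow.
S505: Delete the association identification table. This completes the data channel aging flow.
It should be noted that in both the control channel and the data channel aging flows, whether the data flow table is associated with an association identification table can be determined by checking the association flag bit, as described above; the details are not repeated here.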
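In this embodiment, the matching of the flow table and the association identification table (the matching of quintuple and triplet information) can be performed by dedicated hardware processing units such as an NP (Network Processor), an FPGA (Field Programmable Gate Array) or an ASIC (Application-Specific Integrated Circuit); processing is relatively fast, and the exact matching method is used. With the existing method, when a data packet of an associated protocol needs to be identified and the quintuple information in the flow table is not matched, the two steps of rule matching and protocol verification are required, and the size of the rule base and the matching algorithm slow processing down. With the method of this embodiment, if the quintuple match in the flow table fails, the triplet information in the association identification table is matched; if that match succeeds, policy management can be performed directly, without rule matching and protocol verification to identify the protocol. When a large amount of data has to be matched and identified, handling both the flow table and the association identification table matching in hardware can greatly speed up processing and improve efficiency.
Another advantage of this embodiment is that if the protocol can be identified through the association identification table, the rule matching and protocol verification modules are not needed. Rule matching requires many rules and therefore a lot of space to store them; since rule matching is skipped in this case, the embodiment can save rule storage space while also speeding up matching.
In addition, when a data packet with encrypted content (such as the layer 7 information of an IP packet) arrives, its triplet information can still be detected. Because the triplet information and protocol ID for the data packet were already established when the data channel was created, the corresponding protocol ID can be found from the triplet information and the protocol identified. The prior art has no association identification table and can only perform protocol identification on the whole packet; because the packet content is encrypted, it cannot complete protocol identification.

Embodiment 4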
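Embodiment 3 of the present invention provides an associated protocol identification method for improving the speed of protocol identification.
Referring to Figure 9, in this embodiment the information in the association identification table is not stored separately but together with the flow table, and is matched in the form of a single matching table; in this case the longest matching method is used.
When a packet is received, the quintuple information is first matched in the flow table; if the match succeeds, the policy is executed. If the quintuple match fails, the triplet information is matched in the flow table; if that succeeds, policy management is performed. If neither match succeeds, the protocol is identified through rule matching, protocol verification and similar steps.
In this embodiment, the flow table includes the triplet information and the corresponding protocol ID. For how they are obtained and how the flow table is configured, see the related steps in Embodiment 3; the details are not repeated here.
In this embodiment, the triplet information corresponding to the protocol is created while the control channel creates the data channel of the data flow. During matching, the quintuple information is matched first and, if that fails, the triplet information; when the triplet match succeeds, policy management is performed, so rule matching, protocol verification and similar processing are not needed and protocol identification is faster. When flow table matching is performed by hardware such as an NP, FPGA or ASIC, processing efficiency can be improved even further.
Another advantage of this embodiment is that, because rule matching is not needed when the triplet match succeeds, rule storage space can also be saved.
In addition, because the triplet information is established while the data channel is created and the packet content does not have to be inspected, even packets with encrypted content can be identified.

Embodiment 5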
本发明实施例提供了一种关联协议识别装置, 用于提高协议识别处理速度, 参见图
10, 包括如下单元:
接收单元 901, 用于接收报文;
关联识别表匹配 902, 用于将接收单元接到的报文中的第一匹配信息与关联识别表 中的第一匹配信息进行关联识别表匹配; 所述关联识别表中的第一匹配信息通过提取用 于创建数据通道的控制报文内容得到, 所述第一匹配信息与报文协议相对应, 所述报文 协议由所述控制报文经协议识别后得到; 当所述关联识别表匹配成功时, 输出所述关联 识别表匹配成功后的协议信息。
本发明实施例还包括:
流表匹配单元 903, 用于在所述关联识别表匹配之前, 对接收到的所述报文进行流 表匹配,所述流表匹配包括对所述报文中的第二匹配信息与流表中的第二匹配信息进行 匹配, 所述第二匹配信息用于查找所述报文对应的数据流中的相关信息, 包括所述报文 对应的执行策略。
其中, 上述第一匹配信息包括: 协议域、 源 IP和源端口; 或者所述第一匹配信息包 括: 协议域、 目的 IP和目的端口; 上述第二匹配信息包括: 协议域、 源 IP、 源端口、 目 的 IP和目的端口。 需要说明的是, 上述相关信息并不唯一确定, 可以根据实际协议匹配 需要的信息来对这些信息进行增加或删除。
本发明实施例还包括:
匹配表存储单元 904, 用于存储所述关联识别表和所述流表;
流表与关联识别表可以位于不同的匹配表, 即作为两个独立的表的形式存在; 或者 所述关联识别表与所述流表位于同一匹配表, 这两个表的数据在同一表中一起存在。 在 实际应用中, 匹配表存储单元 904可以使用一个物理存储介质 (如内存, 硬盘及各种存 储器) 实现, 也可以使用多个物理存储介质实现, 而流表和关联识别表在这些存储介质 上的存储位置也并不限定。 同时, 在软件实现上, 既可以将存储信息定义成一个存储结 构 (如流表)来存储流表和关联识别表中的数据, 也可以分别定义多个存储结构 (如流 表 +关联识别表的形式) 进行存储, 具体的, 流表和关联识别表可用 Hash表或 TCAM表或 其他具有类似功能的表来实现。 本发明实施例中, 流表存储单元跟关联识别表存储单元 并不严格区分, 可以是两个单元, 也可以将其合并成一个单元 (如可看成流表存储单元 包括了关联识别表存储单元, 即流表存储单元可以实现关联识别表存储单元的功能) , 具体可以根据实现形式不同而不同。
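When the association identification table and the flow table are located in different matching tables, the exact match method is used for both the flow table matching and the association identification table matching. When the association identification table and the flow table are located in the same matching table, the longest match method is used for both.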
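This embodiment further includes:
an information processing unit 905, configured to extract the first matching information from the control packet into the matching table storage unit when the control packet creates the data channel. The information processing unit is further configured to configure the flow table or the association identification table according to the type of the received packet, including updating the second matching information in the flow table or the first matching information in the association identification table. For the specific extraction and configuration process, see the relevant steps in the foregoing method embodiments; details are not repeated here.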
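The extraction step can be illustrated with a familiar control/data channel protocol. The sketch below assumes FTP passive mode, where the server's `227` reply on the control channel announces the address and port of the upcoming data channel; the choice of FTP, the function names `triplet_from_pasv` and `learn`, and the protocol ID `FTP-DATA` are assumptions for illustration and are not taken from this section of the patent.

```python
import re
from typing import Dict, Optional, Tuple

Triplet = Tuple[str, str, int]

# An FTP "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)" reply carries the
# data-channel address as four address bytes and two port bytes.
_PASV = re.compile(r"^227 .*\((\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3})\)")


def triplet_from_pasv(line: str) -> Optional[Triplet]:
    """Return (protocol, dst IP, dst port) announced by a 227 reply, else None."""
    m = _PASV.search(line)
    if m is None:
        return None
    h1, h2, h3, h4, p1, p2 = (int(x) for x in m.groups())
    if max(h1, h2, h3, h4, p1, p2) > 255:
        return None
    return "TCP", f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


def learn(assoc_table: Dict[Triplet, str], control_line: str,
          protocol_id: str) -> Optional[Triplet]:
    """Record the announced triplet with the already identified protocol ID."""
    triplet = triplet_from_pasv(control_line)
    if triplet is not None:
        assoc_table[triplet] = protocol_id
    return triplet
```

Once the entry is recorded, the first packet on the data channel can be identified from its triplet alone.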
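This embodiment further includes:
a policy management unit 906, configured to perform policy management according to the output protocol information when the association identification table matching succeeds, and to update the flow table or the association identification table according to the result of the policy management.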
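a policy execution unit 907, configured to execute the corresponding policy on a packet after flow table matching succeeds for that packet.
a rule matching unit 908, configured to perform rule matching on the received packet when the association identification table matching fails;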
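a protocol verification unit 909, configured to perform protocol verification on the packet processed by the rule matching unit and to send the verification result to the policy management unit.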
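an aging unit 910, configured to delete the related information in the flow table and/or the related information in the association identification table stored in the matching table storage unit when it detects that no packet has entered the data channel or the control channel for a period of time.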
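The aging behaviour can be sketched as an idle timer per table entry. In the sketch below, the class `AgingTable`, its method names, and the timeout value are hypothetical; the patent only says that entries are deleted after "a period of time" without packets. The current time is passed in explicitly so that the behaviour is easy to follow.

```python
from typing import Dict, Hashable, List, Optional


class AgingTable:
    """Table whose entries are removed after `timeout` seconds without a hit."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self._entries: Dict[Hashable, list] = {}  # key -> [value, last_seen]

    def put(self, key: Hashable, value: str, now: float) -> None:
        self._entries[key] = [value, now]

    def get(self, key: Hashable, now: float) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        entry[1] = now  # a matching packet refreshes the idle timer
        return entry[0]

    def expire(self, now: float) -> List[Hashable]:
        """Delete entries idle for longer than the timeout; return their keys."""
        stale = [k for k, (_, seen) in self._entries.items()
                 if now - seen > self.timeout]
        for k in stale:
            del self._entries[k]
        return stale
```

The same structure could back either the flow table or the association identification table, with `expire` called periodically.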
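For the specific processing performed by each of the above units, see the relevant steps in the foregoing method embodiments; details are not repeated here.
In this embodiment, each unit may be implemented by a general-purpose processor, by a dedicated processor, or by hardware with processing capability. The units may all be implemented by the same processor, or several of them may share one processor; the specific implementation is not limited. For example, in this embodiment, processors such as an NP (Network Processor), an FPGA (Field Programmable Gate Array), or an ASIC (Application-Specific Integrated Circuit) may be used to implement functions such as flow table matching and association identification table matching.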
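When an NP, FPGA, ASIC, or similar device performs the matching, processing is fast because it is done in hardware. When the 5-tuple match for the data channel fails, the triplet information is matched; if that succeeds, policy management is performed. This skips the rule matching and protocol verification steps of the prior art and greatly improves processing efficiency.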
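The policy management step that follows a successful triplet match, including the flow table update performed by the policy management unit, might look like the sketch below. The function name `policy_management`, the policy names, and the mapping from protocol ID to policy are hypothetical.

```python
from typing import Dict, Tuple

FiveTuple = Tuple[str, str, int, str, int]


def policy_management(flow_table: Dict[FiveTuple, str], five_tuple: FiveTuple,
                      protocol_id: str, policies: Dict[str, str],
                      default: str = "forward") -> str:
    """Choose a policy for the identified protocol and record it in the flow
    table under the packet's 5-tuple."""
    policy = policies.get(protocol_id, default)
    # After this update, later packets of the same flow match the 5-tuple
    # entry directly and go straight to policy execution.
    flow_table[five_tuple] = policy
    return policy
```

Under these assumptions, only the first packet of a data flow needs the triplet lookup; the rest of the flow is handled by the 5-tuple entry.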
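Another advantage of this embodiment is that, if a protocol can be identified through the association identification table, the packet does not need to be processed by the rule matching and protocol verification modules. Rule matching relies on a large number of rules, which in turn require considerable storage space. Because rule matching is not needed in this case, the embodiment saves rule storage space while also speeding up matching.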
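In addition, when a data packet with encrypted content (for example, the layer-7 information of an IP packet) arrives, its triplet information can still be detected. Because the triplet information of the data packet and the corresponding protocol ID were already recorded when the data channel was set up, the protocol ID can be found from the triplet information, which completes identification of the protocol. The prior art has no association identification table and can only perform protocol identification on the whole packet; because the packet content is encrypted, it cannot complete the identification.
A person of ordinary skill in the art will understand that all or part of the procedures of the methods in the foregoing embodiments can be carried out by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the procedures of the foregoing method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.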
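The preferred embodiments above further describe the objectives, technical solutions, and advantages of the present invention in detail. It should be understood that the foregoing are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.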

Claims

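1. A packet detection method, comprising the following steps:
receiving a packet, and performing flow table matching on the packet;
when the flow table matching fails, performing association identification table matching on the packet, wherein the association identification table matching comprises matching triplet information in the packet, and the triplet information comprises a protocol field, a source IP address, and a source port, or the triplet information comprises a protocol field, a destination IP address, and a destination port; and
when the association identification table matching succeeds, outputting the protocol information for which the association identification table matching succeeded.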
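2. The packet detection method according to claim 1, wherein:
the association identification table matching comprises: matching the triplet information of the packet against triplet information in the association identification table, wherein the triplet information in the association identification table is obtained by extracting the content of a control packet used to create a data channel, the triplet information in the association identification table corresponds to a packet protocol, and the packet protocol is obtained by performing protocol identification on the control packet.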
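3. The detection method according to claim 1, wherein:
when the association identification table matching fails for the packet, protocol identification is performed on the packet, the protocol identification comprising rule matching and protocol verification.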
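4. A packet detection method, comprising the following steps:
performing association identification table matching between first matching information in a received packet and first matching information in an association identification table, wherein the first matching information in the association identification table is obtained by extracting the content of a control packet used to create a data channel, the first matching information corresponds to a packet protocol, and the packet protocol is obtained by performing protocol identification on the control packet; and
when the association identification table matching succeeds, outputting the protocol information obtained from the successful association identification table matching.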
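5. The packet detection method according to claim 4, further comprising:
before the association identification table matching, performing flow table matching on the received packet, wherein the flow table matching comprises matching second matching information in the packet against second matching information in a flow table, and the second matching information is used to look up related information in the flow to which the packet belongs, including the execution policy corresponding to the packet.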
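6. The packet detection method according to claim 5, wherein:
the first matching information comprises a protocol field, a source IP address, and a source port; or the first matching information comprises a protocol field, a destination IP address, and a destination port; and
the second matching information comprises a protocol field, a source IP address, a source port, a destination IP address, and a destination port.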
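7. The packet detection method according to claim 5, wherein:
the association identification table and the flow table are located in different matching tables;
or the association identification table and the flow table are located in the same matching table.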
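8. The packet detection method according to claim 7, wherein:
the flow table or the association identification table is configured according to the type of the received packet, including updating the second matching information in the flow table or the first matching information in the association identification table.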
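9. The packet detection method according to claim 4, further comprising:
when the flow table matching succeeds for the packet, executing the corresponding policy on the matched packet; and
when the association identification table matching succeeds, performing policy management on the packet according to the output protocol information obtained from the successful association identification table matching, and updating the flow table according to the result of the policy management.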
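10. The packet detection method according to claim 4, wherein:
when the association identification table matching fails for the packet, protocol identification is performed on the packet, the protocol identification comprising rule matching and protocol verification.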
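11. The packet detection method according to claim 7, wherein:
if it is detected that no packet has entered the data channel or the control channel for a period of time, the related information in the flow table and/or the related information in the association identification table is deleted.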
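12. A packet detection apparatus, comprising:
a receiving unit, configured to receive a packet; and
an association identification table matching unit, configured to perform association identification table matching between first matching information in the packet received by the receiving unit and first matching information in an association identification table, wherein the first matching information in the association identification table is obtained by extracting the content of a control packet used to create a data channel, the first matching information corresponds to a packet protocol, and the packet protocol is obtained by performing protocol identification on the control packet; and, when the association identification table matching succeeds, to output the protocol information obtained from the successful association identification table matching.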
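13. The packet detection apparatus according to claim 12, further comprising:
a flow table matching unit, configured to perform flow table matching on the received packet before the association identification table matching, wherein the flow table matching comprises matching second matching information in the packet against second matching information in a flow table, and the second matching information is used to look up related information in the flow to which the packet belongs, including the execution policy corresponding to the packet.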
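14. The packet detection apparatus according to claim 13, wherein:
the first matching information comprises a protocol field, a source IP address, and a source port; or the first matching information comprises a protocol field, a destination IP address, and a destination port; and
the second matching information comprises a protocol field, a source IP address, a source port, a destination IP address, and a destination port.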
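15. The packet detection apparatus according to claim 13, further comprising:
a matching table storage unit, configured to store the association identification table and the flow table;
wherein the association identification table and the flow table are located in different matching tables, or the association identification table and the flow table are located in the same matching table.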
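16. The packet detection apparatus according to claim 15, further comprising:
an information processing unit, configured to extract the first matching information from the control packet into the matching table storage unit when the control packet creates the data channel;
wherein the information processing unit is further configured to configure the flow table or the association identification table according to the type of the received packet, including updating the second matching information in the flow table or the first matching information in the association identification table.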
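17. The packet detection apparatus according to claim 12, further comprising:
a policy management unit, configured to perform policy management according to the output protocol information when the association identification table matching succeeds, and to update the flow table or the association identification table according to the result of the policy management.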
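18. The packet detection apparatus according to claim 13, further comprising:
a policy execution unit, configured to execute the corresponding policy on a packet after flow table matching succeeds for that packet.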
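19. The packet detection apparatus according to claim 17, further comprising:
a rule matching unit, configured to perform rule matching on the received packet when the association identification table matching fails; and
a protocol verification unit, configured to perform protocol verification on the packet processed by the rule matching unit and to send the verification result to the policy management unit.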
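20. The packet detection apparatus according to claim 15, further comprising:
an aging unit, configured to delete the related information in the flow table and/or the related information in the association identification table stored in the matching table storage unit when it detects that no packet has entered the data channel or the control channel for a period of time.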
PCT/CN2010/078900 2009-11-19 2010-11-19 Packet detection method and apparatus WO2011060732A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP10831147.3A EP2434689B1 (en) 2009-11-19 2010-11-19 Method and apparatus for detecting message
US13/339,246 US20120099597A1 (en) 2009-11-19 2011-12-28 Method and device for detecting a packet

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2009101098203A CN102075404A (zh) 2009-11-19 2009-11-19 A packet detection method and apparatus
CN200910109820.3 2009-11-19

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/339,246 Continuation US20120099597A1 (en) 2009-11-19 2011-12-28 Method and device for detecting a packet

Publications (1)

Publication Number Publication Date
WO2011060732A1 true WO2011060732A1 (zh) 2011-05-26

Family

ID=44033756

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2010/078900 WO2011060732A1 (zh) 2009-11-19 2010-11-19 Packet detection method and apparatus

Country Status (4)

Country Link
US (1) US20120099597A1 (zh)
EP (1) EP2434689B1 (zh)
CN (1) CN102075404A (zh)
WO (1) WO2011060732A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
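CN1835500A (zh) * 2005-03-15 2006-09-20 Huawei Technologies Co., Ltd. Method for mobile IPv6 data to traverse a stateful firewall
CN1941716A (zh) * 2005-09-30 2007-04-04 Hangzhou Huawei-3Com Technology Co., Ltd. Application traffic statistics method and apparatus, and application traffic statistics system
CN101202652A (zh) * 2006-12-15 2008-06-18 Peking University Apparatus and method for classifying and identifying network application traffic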

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6701432B1 (en) * 1999-04-01 2004-03-02 Netscreen Technologies, Inc. Firewall including local bus
US20060045014A1 (en) * 2002-09-30 2006-03-02 Siemens Aktiengesellschaft Method for partially maintaining packet sequences in connectionless packet switching with alternative routing
US7724660B2 (en) * 2005-12-13 2010-05-25 Alcatel Lucent Communication traffic congestion management systems and methods
EP2560338B1 (en) * 2011-06-13 2016-01-13 Huawei Technologies Co., Ltd. Method and apparatus for protocol parsing

Also Published As

Publication number Publication date
US20120099597A1 (en) 2012-04-26
EP2434689B1 (en) 2016-02-17
EP2434689A4 (en) 2012-05-16
CN102075404A (zh) 2011-05-25
EP2434689A1 (en) 2012-03-28

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 2010831147

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE