WO2015165296A1 - Protocol type identification method and apparatus - Google Patents

Protocol type identification method and apparatus

Info

Publication number
WO2015165296A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
user terminal
information
data packet
dimensional information
Prior art date
Application number
PCT/CN2015/072529
Other languages
English (en)
French (fr)
Inventor
潘能毅
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CA2947325A (patent CA2947325C)
Publication of WO2015165296A1
Priority to US15/338,105 (patent US10084713B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2483 Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/2866 Architectures; Arrangements
    • H04L67/30 Profiles
    • H04L67/303 Terminal profiles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols

Definitions

  • The present invention relates to network traffic management technologies, and in particular to a protocol type identification method and apparatus.
  • DPI: Deep Packet Inspection
  • L3: network layer
  • L4: transport layer
  • L7: application layer
  • DPI usually identifies packets on a per-data-stream basis: it treats the traffic as individual data streams and, after looking up a data stream in the flow table, scans the packets in that stream using identification methods such as feature recognition, port classification, and statistical methods to identify and classify the stream.
  • The identification of each stream is an independent process, and the results are also saved per stream.
  • The disadvantage of stream-based identification is that it identifies and classifies protocols by scanning packet content within the scope of each stream, without using the correlation between data streams. As a result, data stream identification performance is low, and accurate per-user control cannot be achieved.
  • Embodiments of the present invention provide a method and apparatus for identifying a protocol type to improve the efficiency of data stream protocol identification.
  • An embodiment of the present invention provides a method for identifying a protocol type, where the method includes:
  • if the user multi-dimensional information corresponding to the user terminal is found, performing protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the acquired user multi-dimensional information;
  • if the user multi-dimensional information corresponding to the user terminal is not found, performing data-stream-based protocol type identification on the connection where the data packet is located according to the packet features of the data packet.
  • Searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal includes: searching, according to the user terminal address information in the data packet, whether user multi-dimensional information corresponding to that user terminal address information exists in the user multi-dimensional information table.
  • The method further includes: if the user multi-dimensional information corresponding to the user terminal does not exist in the user multi-dimensional information table, adding the user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
  • The user multi-dimensional information corresponding to the user terminal includes at least one of the following: the source IP address information and destination IP address information corresponding to the existing connections of the user terminal, the address information of user terminals with which the user terminal has established connections, the address information of servers that the user terminal has accessed, a protocol list of the user terminal, and the behavior characteristic information of the existing connections of the user terminal;
  • the user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
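The table described above can be pictured as one record per user terminal, keyed by the terminal's address, in which each kind of stored information maps to a protocol type. The following is a minimal Python sketch under that reading; the names `UserMultiDimInfo`, `UserTable`, `get_or_add` and all field names are illustrative assumptions, not names used in the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class UserMultiDimInfo:
    """Hypothetical per-terminal record; every mapping ends in a protocol type."""
    # (source IP, destination IP) of an existing connection -> protocol type
    address_pairs: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # address of a peer user terminal already connected to -> protocol type
    peers: Dict[str, str] = field(default_factory=dict)
    # address of a server already accessed -> protocol type
    servers: Dict[str, str] = field(default_factory=dict)
    # protocols observed so far for this terminal
    protocol_list: List[str] = field(default_factory=list)
    # behavior characteristic (an opaque label in this sketch) -> protocol type
    behaviors: Dict[str, str] = field(default_factory=dict)


# The table itself: user terminal address -> that terminal's record.
UserTable = Dict[str, UserMultiDimInfo]


def get_or_add(table: UserTable, terminal_addr: str) -> UserMultiDimInfo:
    """Return the terminal's record, adding an empty one if none exists yet."""
    if terminal_addr not in table:
        table[terminal_addr] = UserMultiDimInfo()
    return table[terminal_addr]
```

Keying the table by terminal address mirrors the lookup by user terminal address information; how behavior characteristics are actually encoded is not specified in the text, so a plain string label stands in for them here.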
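  • Performing protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the acquired user multi-dimensional information, includes: determining whether the server address information in the quintuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
  • determining whether the source IP address information and destination IP address information in the quintuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, corresponding to the existing connections of the user terminal, and if yes, further determining whether the feature information of the data packet is included in the behavior characteristic information of the existing connections of the user terminal in the user multi-dimensional information table, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose source IP address information, destination IP address information, and behavior characteristic information stored in the user multi-dimensional information table are consistent with those of the data packet; or
  • determining whether the user terminal address information in the quintuple of the data packet is included in the address information, stored in the user multi-dimensional information table, of user terminals with which the user terminal has established connections, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
  • determining whether the behavior statistics of the data packet and historical data packets are included in the behavior characteristic information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose behavior characteristic information stored in the user multi-dimensional information table is consistent with that of the data packet.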
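The alternative checks above (server address, address pair plus feature, peer terminal address, behavior statistics) can be sketched as a sequence of lookups. This is an illustration only: the text joins the checks with "or" and does not state a priority, so the order used below is an assumption, as are the names `UserInfo`, `Packet`, and `identify_by_user_info` and the use of string labels for features and behaviors.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class UserInfo:
    """Hypothetical slice of one terminal's multi-dimensional information."""
    servers: Dict[str, str] = field(default_factory=dict)    # server address -> protocol
    # (source IP, destination IP, feature label) -> protocol
    pair_features: Dict[Tuple[str, str, Optional[str]], str] = field(default_factory=dict)
    peers: Dict[str, str] = field(default_factory=dict)      # peer terminal address -> protocol
    behaviors: Dict[str, str] = field(default_factory=dict)  # behavior label -> protocol


@dataclass
class Packet:
    """Hypothetical, already-parsed view of a data packet."""
    src_ip: str
    dst_ip: str
    server_ip: Optional[str] = None
    peer_ip: Optional[str] = None
    feature: Optional[str] = None
    behavior: Optional[str] = None


def identify_by_user_info(info: UserInfo, pkt: Packet) -> Optional[str]:
    """Return the protocol type of a matching existing connection, else None."""
    # Check 1: the server has already been accessed by this terminal.
    if pkt.server_ip in info.servers:
        return info.servers[pkt.server_ip]
    # Check 2: same source/destination pair and a matching feature.
    key = (pkt.src_ip, pkt.dst_ip, pkt.feature)
    if key in info.pair_features:
        return info.pair_features[key]
    # Check 3: the peer user terminal has already been connected to.
    if pkt.peer_ip in info.peers:
        return info.peers[pkt.peer_ip]
    # Check 4: the behavior statistics match a stored behavior characteristic.
    if pkt.behavior in info.behaviors:
        return info.behaviors[pkt.behavior]
    return None
```

A return value of `None` stands for "identification based on user multi-dimensional information did not succeed"; what happens next depends on which of the two described method variants is in use.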
  • After performing protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the acquired user multi-dimensional information, the method further includes: if the identification succeeds, updating the identification result data in the user multi-dimensional information table and outputting the identification result,
  • where the identification result data is the protocol type of the connection where the identified data packet is located.
  • After performing protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, the method further includes: if the identification succeeds, further determining whether the data packet is a packet that cannot be successfully identified by feature, and if yes, performing user-based behavior identification statistics and updating the behavior characteristic information of the existing connections of the user terminal in the user multi-dimensional information table.
  • An embodiment of the present invention further provides a method for identifying a protocol type, where the method includes:
  • searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information indicates information about all connections currently established by the user terminal;
  • if the user multi-dimensional information corresponding to the user terminal is found, performing protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the acquired user multi-dimensional information.
  • Searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal includes: searching, according to the user terminal address information in the data packet, whether user multi-dimensional information corresponding to that user terminal address information exists in the user multi-dimensional information table.
  • The method further includes: if the user multi-dimensional information corresponding to the user terminal does not exist in the user multi-dimensional information table, adding the user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
  • The user multi-dimensional information corresponding to the user terminal includes at least one of the following: the source IP address information and destination IP address information corresponding to the existing connections of the user terminal, the address information of user terminals with which the user terminal has established connections, the address information of servers that the user terminal has accessed, a protocol list of the user terminal, and the behavior characteristic information of the existing connections of the user terminal;
  • the user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
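  • Performing protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the acquired user multi-dimensional information, includes: determining whether the server address information in the quintuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
  • determining whether the source IP address information and destination IP address information in the quintuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, corresponding to the existing connections of the user terminal, and if yes, further determining whether the feature information of the data packet is included in the behavior characteristic information of the existing connections of the user terminal in the user multi-dimensional information table, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose source IP address information, destination IP address information, and behavior characteristic information stored in the user multi-dimensional information table are consistent with those of the data packet; or
  • determining whether the user terminal address information in the quintuple of the data packet is included in the address information, stored in the user multi-dimensional information table, of user terminals with which the user terminal has established connections, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
  • determining whether the behavior statistics of the data packet and historical data packets are included in the behavior characteristic information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose behavior characteristic information stored in the user multi-dimensional information table is consistent with that of the data packet.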
  • After performing protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the acquired user multi-dimensional information, the method further includes: if the identification succeeds, updating the identification result data in the user multi-dimensional information table and outputting the identification result, where the identification result data is the protocol type of the connection where the identified data packet is located.
  • After performing data-stream-based protocol type identification on the connection where the data packet is located, the method further includes: if the data-stream-based identification succeeds, performing corresponding service processing on the data packet.
  • After performing protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, the method further includes: if the identification succeeds, further determining whether the data packet is a packet that cannot be successfully identified by feature, and if yes, performing user-based behavior identification statistics and updating the behavior characteristic information of the existing connections of the user terminal in the user connection data table.
  • An embodiment of the present invention provides a protocol type identification apparatus, where the apparatus includes:
  • an obtaining unit, configured to acquire a data packet transmitted on a connection established between the user terminal and the server;
  • a searching unit, configured to search the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information indicates information about all connections currently established by the user terminal;
  • a first processing unit, configured to: if the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the acquired user multi-dimensional information;
  • a second processing unit, configured to: if the user multi-dimensional information corresponding to the user terminal is not found, perform data-stream-based protocol type identification on the connection where the data packet is located according to the packet features of the data packet.
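The division of work between the searching unit and the two processing units can be sketched as a small dispatch function: look the terminal up first, then choose the identification path. This is a hedged illustration; `identify_user_first` and its parameters are hypothetical names, and the two identification routines are passed in as callables because the text does not fix their implementation.

```python
from typing import Any, Callable, Dict, Optional


def identify_user_first(
    terminal_addr: str,
    packet: Any,
    user_table: Dict[str, Any],
    by_user_info: Callable[[Any, Any], Optional[str]],
    by_data_stream: Callable[[Any], Optional[str]],
) -> Optional[str]:
    """Identify the protocol type, preferring user multi-dimensional information."""
    # Searching unit: look up the terminal's record by its address.
    info = user_table.get(terminal_addr)
    if info is not None:
        # First processing unit: identification based on the stored record.
        return by_user_info(info, packet)
    # Second processing unit: fall back to data-stream-based identification.
    return by_data_stream(packet)
```

For example, with a table that only knows terminal `10.0.0.1`, a packet from `10.0.0.2` goes straight to the data-stream path.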
  • The searching unit is specifically configured to: search, according to the user terminal address information in the data packet, whether user multi-dimensional information corresponding to that user terminal address information exists in the user multi-dimensional information table.
  • In a second possible implementation manner, if there is no user multi-dimensional information corresponding to the user terminal in the user multi-dimensional information table, user multi-dimensional information corresponding to the user terminal is added to the user multi-dimensional information table.
  • The user multi-dimensional information corresponding to the user terminal includes at least one of the following: the source IP address information and destination IP address information corresponding to the existing connections of the user terminal, the address information of user terminals with which the user terminal has established connections, the address information of servers that the user terminal has accessed, a protocol list of the user terminal, and the behavior characteristic information of the existing connections of the user terminal;
  • the user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
  • The first processing unit is specifically configured to: determine whether the server address information in the quintuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
  • determine whether the source IP address information and destination IP address information in the quintuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, corresponding to the existing connections of the user terminal, and if yes, further determine whether the feature information of the data packet is included in the behavior characteristic information of the existing connections of the user terminal in the user multi-dimensional information table, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose source IP address information, destination IP address information, and behavior characteristic information stored in the user multi-dimensional information table are consistent with those of the data packet; or
  • determine whether the user terminal address information in the quintuple of the data packet is included in the address information, stored in the user multi-dimensional information table, of user terminals with which the user terminal has established connections, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
  • determine whether the behavior statistics of the data packet and historical data packets are included in the behavior characteristic information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose behavior characteristic information stored in the user multi-dimensional information table is consistent with that of the data packet.
  • The first processing unit is further configured to: if the identification succeeds, update the identification result data in the user multi-dimensional information table and output the identification result, where the identification result data is the protocol type of the connection where the identified data packet is located.
  • The first processing unit is further configured to: if the identification succeeds, further determine whether the data packet is a packet that cannot be successfully identified by feature, and if yes, perform user-based behavior identification statistics and update the behavior characteristic information of the existing connections of the user terminal in the user multi-dimensional information table.
  • An embodiment of the present invention provides a protocol type identification apparatus, where the apparatus includes:
  • an obtaining unit, configured to acquire a data packet transmitted on a connection established between the user terminal and the server;
  • a first processing unit, configured to perform data-stream-based protocol type identification on the connection where the data packet is located according to the packet features of the data packet;
  • a searching unit, configured to: if the data-stream-based identification is unsuccessful, search the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information indicates information about all connections currently established by the user terminal;
  • a second processing unit, configured to: if the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the acquired user multi-dimensional information.
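In this second apparatus the order is reversed: data-stream-based identification runs first, and the user multi-dimensional information table is consulted only when that fails. A minimal sketch of that ordering follows; `identify_stream_first` and its parameters are hypothetical names, and `None` is assumed to represent an unsuccessful identification.

```python
from typing import Any, Callable, Dict, Optional


def identify_stream_first(
    terminal_addr: str,
    packet: Any,
    user_table: Dict[str, Any],
    by_data_stream: Callable[[Any], Optional[str]],
    by_user_info: Callable[[Any, Any], Optional[str]],
) -> Optional[str]:
    """Identify the protocol type, trying data-stream-based identification first."""
    # First processing unit: data-stream-based identification on packet features.
    protocol = by_data_stream(packet)
    if protocol is not None:
        return protocol
    # Searching unit: only consulted when the stream-based attempt failed.
    info = user_table.get(terminal_addr)
    if info is None:
        return None
    # Second processing unit: identification based on the stored record.
    return by_user_info(info, packet)
```

Passing the two identification routines as callables keeps the sketch independent of how either one is implemented.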
  • The searching unit is configured to: search, according to the user terminal address information in the data packet, whether user multi-dimensional information corresponding to that user terminal address information exists in the user multi-dimensional information table.
  • The second processing unit is further configured to: if there is no user multi-dimensional information corresponding to the user terminal in the user multi-dimensional information table, add user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
  • The user multi-dimensional information corresponding to the user terminal includes at least one of the following: the source IP address information and destination IP address information corresponding to the existing connections of the user terminal, the address information of user terminals with which the user terminal has established connections, the address information of servers that the user terminal has accessed, a protocol list of the user terminal, and the behavior characteristic information of the existing connections of the user terminal;
  • the user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
  • The second processing unit is specifically configured to: determine whether the server address information in the quintuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
  • determine whether the source IP address information and destination IP address information in the quintuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, corresponding to the existing connections of the user terminal, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose source IP address information, destination IP address information, and behavior characteristic information stored in the user multi-dimensional information table are consistent with those of the data packet; or
  • determine whether the user terminal address information in the quintuple of the data packet is included in the address information, stored in the user multi-dimensional information table, of user terminals with which the user terminal has established connections, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
  • determine whether the behavior statistics of the data packet and historical data packets are included in the behavior characteristic information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if yes, the protocol type of the connection where the data packet is located is the protocol type corresponding to the existing connection whose behavior characteristic information stored in the user multi-dimensional information table is consistent with that of the data packet.
  • The second processing unit is further configured to: if the identification based on the user multi-dimensional information succeeds, update the identification result data in the user multi-dimensional information table and output the identification result.
  • The first processing unit is further configured to: if the data-stream-based identification succeeds, perform corresponding service processing on the data packet.
  • The second processing unit is further configured to: if the identification based on the user multi-dimensional information succeeds, further determine whether the data packet is a packet that cannot be identified by feature, and if yes, perform user-based behavior identification statistics and update the behavior characteristic information of the existing connections of the user terminal in the user connection data table.
  • The method and apparatus for identifying a protocol type provided by the embodiments of the present invention perform protocol type identification based on user multi-dimensional information, according to the protocol types of the connections that the user terminal has already established, which enables per-user service control. In addition, combining protocol type identification based on user multi-dimensional information with data-stream-based protocol type identification can improve the identification accuracy of the DPI system and the performance of protocol identification.
  • FIG. 1 is a flowchart of a method for identifying a protocol type according to an embodiment of the present invention
  • FIG. 2 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention.
  • FIG. 3 is a block diagram of a DPI system according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a protocol type identification apparatus according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of another protocol type identification apparatus according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a network device according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic diagram of another network device according to an embodiment of the present invention.
  • The method for identifying a protocol type provided by the embodiments of the present invention is a new protocol identification method that can be applied to service scenarios such as network optimization and application flow control.
  • When a network device, such as a service gateway or a router, receives a data packet of a newly created connection, it can analyze the protocol type of the service data packet based on the user multi-dimensional information table, so that the embodiments of the present invention can implement per-user service control. Combining protocol type identification based on user multi-dimensional information with data-stream-based protocol type identification can improve the identification accuracy of the DPI system and the performance of protocol identification.
  • The server address information in the quintuple mentioned in this application may be either the source address information or the destination address information in the quintuple of the data packet: for a data packet sent by the user terminal to the server, the destination address information of the quintuple is the server address information; for a data packet sent by the server to the user terminal, the source address information of the quintuple is the server address information.
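The direction rule above can be written as a one-line selection. The sketch below assumes the caller already knows the packet's direction (the `sent_by_terminal` flag); how a device determines that direction is not described here, and the names `FiveTuple` and `server_address` are illustrative.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Quintuple of a data packet (field names are illustrative)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: int  # transport layer protocol number, e.g. 6 for TCP, 17 for UDP


def server_address(quintuple: FiveTuple, sent_by_terminal: bool) -> str:
    """Pick the server address according to the packet's direction."""
    # Terminal -> server: the destination is the server.
    # Server -> terminal: the source is the server.
    return quintuple.dst_ip if sent_by_terminal else quintuple.src_ip
```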
  • The user terminal may specifically be a client or an application running on the user terminal.
  • FIG. 1 is a flowchart of a method for identifying a protocol type according to an embodiment of the present invention.
  • The execution body of this embodiment is a network device, such as a service gateway or a router. This embodiment describes in detail how the network device performs user-based protocol type identification on a received data packet. As shown in the figure, the embodiment includes the following steps:
  • Step 101: Acquire a data packet transmitted on a connection established between the user terminal and the server.
  • After receiving a data packet of a data stream, the network device parses the packet and obtains the corresponding quintuple information from the packet header information.
  • The quintuple information includes the destination IP address, the destination port number, the source IP address, the source port number, and the transport layer protocol number (such as the Transmission Control Protocol (TCP) number or the User Datagram Protocol (UDP) number). The network device then determines, based on the quintuple information, whether the connection corresponding to the data stream is a newly created connection.
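As an illustration of obtaining the quintuple from packet header information, the following sketch reads the fields from a raw IPv4 packet carrying TCP or UDP. It is a simplified example under stated assumptions: IPv4 only, no fragmentation or option handling beyond the header length, and the names `FiveTuple` and `parse_five_tuple` are hypothetical.

```python
import socket
import struct
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: int  # IP protocol number: 6 = TCP, 17 = UDP


def parse_five_tuple(packet: bytes) -> FiveTuple:
    """Extract the quintuple from a raw IPv4 packet carrying TCP or UDP."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        raise ValueError("only IPv4 packets are handled in this sketch")
    header_len = (packet[0] & 0x0F) * 4  # IHL field, in 32-bit words
    protocol = packet[9]
    if protocol not in (6, 17):
        raise ValueError("only TCP and UDP are handled in this sketch")
    if len(packet) < header_len + 4:
        raise ValueError("packet too short to contain port numbers")
    src_ip = socket.inet_ntoa(packet[12:16])
    dst_ip = socket.inet_ntoa(packet[16:20])
    # TCP and UDP both start with source port, then destination port.
    src_port, dst_port = struct.unpack("!HH", packet[header_len:header_len + 4])
    return FiveTuple(src_ip, src_port, dst_ip, dst_port, protocol)
```

The returned tuple is hashable, so it can serve directly as a lookup key for connection records.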
  • The network device may query the flow table to determine whether connection record information corresponding to the quintuple information of the service data packet exists in the flow table. If it exists, the connection corresponding to the data stream is an existing connection; if not, the connection corresponding to the data stream is a newly created connection.
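The flow table query described above amounts to a keyed lookup on the quintuple. The sketch below models the flow table as a dictionary; the class name `FlowTable`, its methods, and the choice not to normalize packet direction in the key are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# (source IP, source port, destination IP, destination port, protocol number)
Quintuple = Tuple[str, int, str, int, int]


class FlowTable:
    """Hypothetical flow table: quintuple -> stored protocol type (or None)."""

    def __init__(self) -> None:
        self._records: Dict[Quintuple, Optional[str]] = {}

    def is_new_connection(self, key: Quintuple) -> bool:
        """A connection is new when no record exists for its quintuple."""
        return key not in self._records

    def record(self, key: Quintuple, protocol_type: Optional[str] = None) -> None:
        """Create or update the connection record for this quintuple."""
        self._records[key] = protocol_type

    def protocol_of(self, key: Quintuple) -> Optional[str]:
        """Return the stored identification result, if any."""
        return self._records.get(key)
```

A record whose protocol type is still `None` represents a known connection that has not been identified yet.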
  • If, after the flow table is queried, the connection of the data packet is determined to be an existing connection, the data packet is processed directly (for example, flow control) according to the protocol type identification result and the service processing method that are stored in the flow table for the quintuple information of the data packet. It should be noted that even if the connection of the data packet is an existing connection, step 102 may still be performed, that is, protocol identification may be performed on the connection where the data packet is located.
  • Step 102 Search for the user multi-dimensional information corresponding to the user terminal in the multi-dimensional information table of the user, where the multi-dimensional information of the user is used to indicate information about all connections currently established by the user terminal.
  • Searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal includes: searching, according to the user terminal address information in the data packet, whether user multi-dimensional information corresponding to that user terminal address information exists in the user multi-dimensional information table.
  • The user multi-dimensional information includes one or a combination of the following: address pair information corresponding to the existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information that the user terminal has visited, protocol list information of the user, and behavior characteristic information of the existing connections of the user terminal. The user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
  • Specifically, the user multi-dimensional information table may include one or more of the following correspondences: between the server address information that the user terminal has visited and the protocol types of the existing connections; between the source IP address information and destination IP address information of the existing connections and the protocol types of the existing connections; between the user terminal address information of the existing connections and the protocol types of the existing connections; and between the behavior characteristic information of the existing connections and the protocol types of the existing connections.
  • The address pair information corresponding to an existing connection of the user terminal is the address pair formed by the source IP address and the destination IP address of that existing connection. The user terminal address information of an existing connection consists of the IP address and port number of the user terminal for that connection.
  • the server address information that the user terminal has visited is composed of the IP address and port number of the server that the user terminal has visited.
  • the protocol list of the user stores the protocol record information commonly used by the user.
  • The behavior characteristic information of the existing connections of the user terminal includes the protocol characteristics corresponding to the protocol types commonly used by the user and the behavior statistics of the user.
  • After the network device determines that user multi-dimensional information of the user terminal corresponding to the connection where the data packet is located exists in the user multi-dimensional information table, step 103 may be performed.
  • If the network device does not find user multi-dimensional information corresponding to the user terminal in the user multi-dimensional information table, it adds user multi-dimensional information corresponding to the user terminal to the table.
  • The user multi-dimensional information of the user may be added to the user multi-dimensional information table after the protocol type of the connection where the data packet is located has been successfully identified.
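  • One possible in-memory layout for the user multi-dimensional information table is sketched below, assuming the table is keyed by the user terminal IP address. The class `UserRecord`, its field names, and the helper `lookup_or_add` are hypothetical names chosen for illustration; the source does not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class UserRecord:
    """Per-user multi-dimensional information (illustrative layout)."""
    # (source IP, destination IP) of an existing connection -> protocol type
    address_pairs: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # (terminal IP, terminal port) of an existing connection -> protocol type
    terminal_addrs: Dict[Tuple[str, int], str] = field(default_factory=dict)
    # (server IP, server port) the terminal has visited -> protocol type
    server_addrs: Dict[Tuple[str, int], str] = field(default_factory=dict)
    # protocols commonly used by this user
    protocol_list: List[str] = field(default_factory=list)
    # protocol type -> behavior characteristics (left free-form here)
    behavior_features: Dict[str, dict] = field(default_factory=dict)


# Assumption: the table is keyed by the user terminal IP address.
user_table: Dict[str, UserRecord] = {}


def lookup_or_add(terminal_ip: str) -> Tuple[UserRecord, bool]:
    """Return (record, found); add an empty record when none exists."""
    found = terminal_ip in user_table
    record = user_table.setdefault(terminal_ip, UserRecord())
    return record, found


record, found = lookup_or_add("10.0.0.5")              # first lookup adds a record
record.server_addrs[("1.2.3.4", 80)] = "HTTP"          # store an identification result
same_record, found_again = lookup_or_add("10.0.0.5")   # second lookup finds it
```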
  • Step 103 If the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the user multi-dimensional information.
  • The process of identifying the protocol type based on the user multi-dimensional information in this embodiment of the present invention is a process of identifying the protocol type of a received data packet based on information about the existing connections of the user terminal.
  • The protocol identification method based on user multi-dimensional information includes a plurality of independent identification methods, and no fixed order is required among them.
  • Each independent identification method identifies the protocol based on one dimension of the user multi-dimensional information, for example: a protocol identification method based on server address information, a protocol identification method based on address pair information of the existing connections of the user terminal, a protocol identification method based on user terminal address information of the existing connections of the user terminal, a feature recognition method based on the existing connections of the user terminal, and a behavior recognition method based on the existing connections of the user terminal.
  • Performing protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the user multi-dimensional information, includes:
  • determining whether the server address information in the quintuple of the data packet is included in the server address information that the user terminal has visited, as stored in the user multi-dimensional information table, and if so, determining that the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose stored server address information is consistent with that of the data packet; or
  • determining whether the source IP address information and destination IP address information in the quintuple of the data packet are included in the source IP address information and destination IP address information of the existing connections of the user terminal, as stored in the user multi-dimensional information table; if so, continuing to determine whether the feature information of the data packet is included in the stored behavior characteristic information of the existing connections of the user terminal; and if so, determining that the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose stored source IP address information, destination IP address information, and behavior characteristic information are consistent with those of the data packet; or
  • determining whether the user terminal address information in the quintuple of the data packet is included in the user terminal address information of the existing connections of the user terminal, as stored in the user multi-dimensional information table, and if so, determining that the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose stored user terminal address information is consistent with that of the data packet; or
  • determining whether the behavior statistics of the data packet match the behavior characteristic information stored in the user multi-dimensional information table, and if so, determining that the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose stored behavior characteristic information is consistent with that of the data packet.
  • After the identification succeeds, the identification result of the connection in the user multi-dimensional information table is updated.
  • Step 104 If the user multi-dimensional information corresponding to the user terminal is not found, the data stream-based protocol type identification is performed on the connection where the data packet is located according to the packet feature of the data packet.
  • Data-stream-based protocol identification refers to identifying the protocol type used by a data stream or connection by inspecting one or more data packets in the data stream. If the network device fails to identify the protocol type of the data stream based on the user multi-dimensional information, or if no user multi-dimensional information of the user exists in the user multi-dimensional information table, protocol identification is performed based on the data stream. Stream-based protocol identification methods include association identification, port identification, feature recognition, behavior recognition, and the like. After identification succeeds, the identification result of the connection in the user multi-dimensional information table is also updated; if identification fails, an unsuccessful identification result is output.
  • The method and device for identifying a protocol type provided by the embodiments of the present invention perform protocol type identification based on user multi-dimensional information according to the protocol types of the existing connections of the user terminal, and can thereby implement service control in units of users. By combining protocol type identification based on user multi-dimensional information with protocol type identification based on the data stream, they can also improve the identification accuracy of the DPI system and the protocol identification performance.
  • the network device may arrange a DPI system in the device.
  • the DPI system may perform corresponding message protocol identification.
  • FIG. 3 is a block diagram of a DPI system according to an embodiment of the present invention.
  • the DPI system includes a flow table 301, a user connection management module 303, and a user.
  • the protocol identification module includes a protocol identification sub-module 305 based on user multi-dimensional information and a data stream-based protocol identification sub-module 306.
  • The protocol identification sub-module based on user multi-dimensional information can use various independent identification methods to identify the protocol type of a packet, such as identification based on server address information, identification based on address pair information of existing connections, identification based on user terminal address information of existing connections, feature recognition based on the user terminal, and behavior recognition based on the user terminal; these independent identification methods may also be used in combination. The data-stream-based protocol identification sub-module can likewise use multiple independent identification methods to identify the protocol type of a packet, such as association identification, port identification, feature recognition, and behavior recognition.
  • When the DPI system runs, it first looks up the flow table to determine whether the connection is a newly created connection, and then enters the user connection management module, which searches the user multi-dimensional information table for a user record to which the newly created connection belongs. If such a record exists, the protocol is identified based on the user multi-dimensional information in the user multi-dimensional information table. If protocol identification based on user multi-dimensional information succeeds, the user multi-dimensional information table is updated and the identification result is output to the service processing module; otherwise, identification continues in the stream-based protocol identification sub-module. If stream-based protocol identification succeeds, the user multi-dimensional information table is updated and the identification result is output to service processing.
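  • The control flow just described (user-based identification first, stream-based identification as fallback, table update on success) can be sketched as follows. All names are hypothetical, the packet is modeled as a plain dictionary, and the stream-based stage is reduced to a port check purely as a stand-in.

```python
from typing import Callable, Dict, List, Optional


def identify(packet: dict,
             user_table: Dict[str, dict],
             by_user_info: Callable[[dict, dict], Optional[str]],
             by_stream: Callable[[dict], Optional[str]]) -> Optional[str]:
    """User-based identification first, stream-based identification as fallback."""
    user = user_table.get(packet["terminal_ip"])
    result = None
    if user is not None:
        result = by_user_info(packet, user)
    else:
        # No record for this user yet: add an empty one.
        user = user_table.setdefault(packet["terminal_ip"], {"servers": {}})
    if result is None:
        result = by_stream(packet)
    if result is not None:
        # On success, update the user's record with the identification result.
        user["servers"][(packet["dst_ip"], packet["dst_port"])] = result
    return result


def by_user_info(packet: dict, user: dict) -> Optional[str]:
    # Stand-in for the user-based methods: server address lookup only.
    return user["servers"].get((packet["dst_ip"], packet["dst_port"]))


stream_calls: List[int] = []


def by_stream(packet: dict) -> Optional[str]:
    # Stand-in for stream-based identification: port identification only.
    stream_calls.append(packet["dst_port"])
    return "HTTP" if packet["dst_port"] == 80 else None


table: Dict[str, dict] = {}
pkt = {"terminal_ip": "10.0.0.5", "dst_ip": "1.2.3.4", "dst_port": 80}
r1 = identify(pkt, table, by_user_info, by_stream)  # falls back to stream-based
r2 = identify(pkt, table, by_user_info, by_stream)  # answered from the user record
```

  • In this sketch the second call never reaches the stream-based stage, which is the performance benefit the embodiment attributes to the user multi-dimensional information table.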
  • the above embodiment briefly describes the process of protocol identification by the DPI system.
  • the protocol identification process is described below by a detailed embodiment.
  • The execution body of this embodiment is a network device, such as a service gateway or a router. This embodiment describes in detail the process by which the network device performs protocol identification on a received packet. As shown in the figure, this embodiment includes the following steps:
  • Step 201 Receive a data packet.
  • Step 202 Determine whether the connection where the data packet is located is a newly established connection.
  • the network device parses the received data packet, and obtains corresponding quintuple information according to the packet header information.
  • the quintuple information includes the destination IP address, destination port number, source IP address, source port number, and transport layer protocol number of the packet.
  • Specifically, the flow table may be searched to determine whether connection record information corresponding to the quintuple information exists.
  • the flow table stores the record information of the connection that the DPI system has detected.
  • the flow table may include the quintuple information, the identification result of the corresponding connection, and the corresponding service control policy.
  • If such connection record information exists, the connection corresponding to the data packet is an existing connection; otherwise, the connection is a newly established connection. If the connection is determined to be newly created, step 203 is performed.
  • Step 203 Determine whether there is user multi-dimensional information corresponding to the newly created connection in the multi-dimensional information table of the user.
  • Specifically, the user multi-dimensional information table may be queried to determine whether it contains user multi-dimensional information corresponding to the user terminal address information in the quintuple information. If so, it is determined that user multi-dimensional information of the user terminal corresponding to the newly created connection exists in the user multi-dimensional information table; if not, it is determined that no such user multi-dimensional information exists.
  • The address information of the user terminal is the IP address information of the user terminal device, or its IP address information together with its port information.
  • The user multi-dimensional information includes one or a combination of the following: address pair information corresponding to the existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information that the user terminal has visited, protocol list information of the user, and behavior characteristic information of the existing connections of the user terminal.
  • In addition to the user multi-dimensional information, the user multi-dimensional information table includes the correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
  • If the user multi-dimensional information exists, step 204 is performed; otherwise, step 205 is performed.
  • Step 204 Perform protocol identification based on user multi-dimensional information.
  • the protocol identification method based on user multi-dimensional information includes a plurality of independent identification methods, and a fixed sequence is not required between the various identification methods.
  • Each independent identification method identifies the protocol based on one dimension of the user multi-dimensional information, for example: a protocol identification method based on server address information, a protocol identification method based on address pair information of the existing connections of the user terminal, a protocol identification method based on user terminal address information of the existing connections of the user terminal, a feature recognition method based on the existing connections of the user terminal, and a behavior recognition method based on the existing connections of the user terminal.
  • The protocol identification method based on server address information is specifically as follows: if a user has initiated a connection to a server port, the protocol type of any later connection initiated by the user to the same server port is necessarily the same as that of the first connection. For example, if a user accesses a server (for example, 1.2.3.4:80) using HTTP, then the protocol type of all subsequent connections from that user to the same server address (1.2.3.4:80) is also HTTP.
  • The protocol identification method based on address pair information of the existing connections of the user terminal is specifically as follows: if a user has initiated a connection to a server, the protocol type of a later connection initiated by the user to the same server IP address may be the same as that of the first connection.
  • This identification method finds, in the user's historical connections, a connection with the same IP address pair (destination IP address, source IP address) as the newly created connection, and then confirms by a simple check (for example, simple feature word confirmation) whether the protocol type of the newly created connection is the same as that of the historical connection.
  • The protocol identification method based on user terminal address information of the existing connections of the user terminal is specifically as follows: if a user initiates multiple connections to one or more destination addresses from the same user terminal address (IP:Port), the protocol types of these connections with the same user terminal address (IP:Port) are the same.
  • This identification method finds, in the user's historical connections, a connection with the same user terminal address (IP:Port) as the newly created connection, and can then confirm that the protocol type of the newly created connection is the same as that of the historical connection.
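  • The three address-based methods can be sketched side by side as lookups into a per-user history. The class `UserHistory`, the function names, the keyword table in `confirm_keyword`, and the protocol labels are all assumptions made for illustration. Only the address pair method includes the extra confirmation step, mirroring the "may be the same" wording of that method.

```python
from typing import Callable, Dict, Optional, Tuple


class UserHistory:
    """Per-user history of identified connections (illustrative layout)."""

    def __init__(self) -> None:
        self.by_server: Dict[Tuple[str, int], str] = {}    # (server IP, port)
        self.by_pair: Dict[Tuple[str, str], str] = {}      # (source IP, destination IP)
        self.by_terminal: Dict[Tuple[str, int], str] = {}  # (terminal IP, port)


def by_server_address(h: UserHistory, ip: str, port: int) -> Optional[str]:
    """Same server IP and port: reuse the earlier result directly."""
    return h.by_server.get((ip, port))


def by_address_pair(h: UserHistory, src_ip: str, dst_ip: str, payload: bytes,
                    confirm: Callable[[str, bytes], bool]) -> Optional[str]:
    """Same IP pair: the earlier result is only a candidate until confirmed."""
    candidate = h.by_pair.get((src_ip, dst_ip))
    if candidate is not None and confirm(candidate, payload):
        return candidate
    return None


def by_terminal_address(h: UserHistory, ip: str, port: int) -> Optional[str]:
    """Same terminal IP and port: reuse the earlier result directly."""
    return h.by_terminal.get((ip, port))


def confirm_keyword(protocol: str, payload: bytes) -> bool:
    """Hypothetical 'simple feature word' check on the start of the payload."""
    keywords = {"HTTP": (b"GET ", b"POST ", b"HTTP/")}
    return payload.startswith(keywords.get(protocol, ()))


h = UserHistory()
h.by_server[("1.2.3.4", 80)] = "HTTP"
h.by_pair[("10.0.0.5", "1.2.3.4")] = "HTTP"
h.by_terminal[("10.0.0.5", 40000)] = "BitTorrent"

server_hit = by_server_address(h, "1.2.3.4", 80)
pair_hit = by_address_pair(h, "10.0.0.5", "1.2.3.4",
                           b"GET / HTTP/1.1\r\n", confirm_keyword)
pair_miss = by_address_pair(h, "10.0.0.5", "1.2.3.4",
                            b"\x16\x03\x01", confirm_keyword)
terminal_hit = by_terminal_address(h, "10.0.0.5", 40000)
```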
  • The user-based feature recognition method is specifically as follows: a list of protocols commonly used by the user is recorded for each user. The sources of the protocol list include the protocols the user has used and a pre-configured protocol list (for example, protocols and applications popular in the user's region).
  • During identification, the user-based feature recognition method scans for the protocol features of the protocols in the user's common protocol list in order to identify the protocol.
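  • A minimal sketch of user-based feature recognition follows, assuming that a protocol feature is a byte pattern at the start of the payload. The `SIGNATURES` table and the function name are hypothetical; the point illustrated is only that the scan is restricted to the user's common protocol list instead of every known signature.

```python
from typing import Dict, List, Optional

# Hypothetical signature table: protocol -> byte patterns at the payload start.
SIGNATURES: Dict[str, List[bytes]] = {
    "HTTP": [b"GET ", b"POST ", b"HTTP/"],
    "SSH": [b"SSH-"],
    "SMTP": [b"220 ", b"EHLO ", b"HELO "],
}


def identify_by_user_features(payload: bytes,
                              common_protocols: List[str]) -> Optional[str]:
    """Scan only the signatures of protocols in the user's common list."""
    for proto in common_protocols:
        patterns = SIGNATURES.get(proto, [])
        if any(payload.startswith(p) for p in patterns):
            return proto
    return None


banner = b"SSH-2.0-OpenSSH_9.0\r\n"
in_list = identify_by_user_features(banner, ["HTTP", "SSH"])
not_in_list = identify_by_user_features(banner, ["HTTP"])
```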
  • The user-based behavior recognition method is specifically as follows: the behavior statistics of the user's packets are compared with the user behavior feature set, and if they match, the protocol to which the current packet belongs is confirmed.
  • The user behavior statistics include the statistical distribution of binary values in the packets, the port range, packet length statistics (packet length range, packet length sequence, packet length set, average packet length, and total packet length of uplink and downlink interaction), the packet transmission frequency, the ratio of packets sent to packets received, and the degree of dispersion of destination addresses.
  • the user behavior feature set is saved in the user record, and the initial content of the user behavior feature set is a pre-configured user behavior feature, and is enriched and updated according to the behavior statistics of the user's historical connection in the identification process.
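  • The comparison of behavior statistics against a user behavior feature set can be sketched with a single statistic, the average packet length. This is a deliberate simplification: the feature set here maps a protocol label to an assumed range of average lengths, and the labels, ranges, and function names are all hypothetical. The `update_features` helper illustrates the enrichment from historical connections described above.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

# Hypothetical feature set: protocol -> (min, max) of the average packet length.
FeatureSet = Dict[str, Tuple[float, float]]


def identify_by_behavior(lengths: List[int],
                         features: FeatureSet) -> Optional[str]:
    """Return the protocol whose length range contains the observed average."""
    avg = mean(lengths)
    for proto, (low, high) in features.items():
        if low <= avg <= high:
            return proto
    return None


def update_features(features: FeatureSet, proto: str,
                    lengths: List[int]) -> None:
    """Widen the stored range so that it covers a newly observed sample."""
    avg = mean(lengths)
    low, high = features.get(proto, (avg, avg))
    features[proto] = (min(low, avg), max(high, avg))


# Pre-configured initial content of the feature set (made-up values).
features: FeatureSet = {"voice-like": (100.0, 250.0),
                        "bulk-like": (1000.0, 1500.0)}

match = identify_by_behavior([160, 172, 168], features)  # average falls in 100..250
no_match = identify_by_behavior([600, 620], features)    # average in neither range

update_features(features, "voice-like", [300, 300])      # enrich from history
after_update = identify_by_behavior([280, 290], features)
```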
  • If the identification is successful, step 206 is performed; otherwise, step 209 is performed.
  • Step 205 Add user multi-dimensional information corresponding to the new user terminal in the user multi-dimensional information table.
  • Step 209 is then performed, that is, protocol identification based on the data stream is performed on the data packet.
  • Step 206 If the identification is successful, determine whether the successfully identified traffic contains traffic that cannot be identified by the feature recognition method.
  • After the packet protocol type is successfully identified based on the user multi-dimensional information, it is further determined whether the successfully identified packets contain traffic that cannot be identified by the feature recognition method. For example, if the first connection established by the user is an encrypted connection, it cannot be identified by the feature recognition method, but the behavior recognition method of data-stream-based protocol identification can identify it and record the IP address, port, and so on of the encrypted connection. When the user establishes a second identical encrypted connection, the DPI system can identify the second encrypted connection using one of the five methods of the present invention, which triggers the DPI system to collect behavior statistics on the second encrypted connection in order to update the behavior characteristics of the corresponding protocol.
  • If yes, step 207 is performed; otherwise, step 208 is performed.
  • Step 207 Perform behavior recognition statistics based on the user terminal, and update user behavior characteristic information that the user terminal in the user multi-dimensional information table has connected.
  • Because such a connection was originally identified by behavior characteristics and cannot be identified by the feature recognition method, it can serve as a data sample for the behavior characteristics of the corresponding protocol, helping to refine those behavior characteristics.
  • Step 208 Update the recognition result data corresponding to the connection in the user multidimensional information table.
  • the protocol identification result corresponding to the data flow in the flow table, the service control policy, and the like may also be updated.
  • Step 209 If the identification is unsuccessful, perform protocol identification based on the data stream.
  • If the data-stream-based identification is successful, step 208 is performed; otherwise, step 210 is performed.
  • Step 210 outputting the recognition result.
  • the recognition result may be output regardless of whether the recognition is successful, so as to perform corresponding business control according to the recognition result.
  • The method and device for identifying a protocol type provided by the embodiments of the present invention perform protocol type identification based on user multi-dimensional information according to the protocol types of the existing connections of the user terminal, and can thereby implement service control in units of users. By combining protocol type identification based on user multi-dimensional information with protocol type identification based on the data stream, they can also improve the identification accuracy of the DPI system and the protocol identification performance.
  • FIG. 4 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention.
  • the execution body of the embodiment is a network device, such as a service gateway or a router.
  • This embodiment describes in detail a method for the network device to perform user-based protocol type identification on the received data packet. As shown in the figure, the embodiment includes the following steps:
  • Step 401 Acquire a data packet transmitted on a connection established between the user terminal and the server.
  • After receiving a data packet of a data stream, the network device parses the packet and obtains the corresponding quintuple information from the packet header.
  • the quintuple information includes the destination IP address, destination port number, source IP address, source port number, and TCP number of the packet, and then determines whether the connection corresponding to the data flow is a newly established connection according to the quintuple information.
  • Specifically, the network device may query the flow table to determine whether connection record information corresponding to the quintuple information of the data packet exists in the flow table. If it exists, the connection corresponding to the data stream is an existing connection; if not, the connection is a newly established connection.
  • If, after the flow table is queried, the connection corresponding to the data stream of the data packet is determined to be an existing connection, the data packet is processed directly (for example, flow control) according to the protocol type identification result and the service processing method that are stored in the flow table for the quintuple information of the data packet. It should be noted that even if the connection of the data packet is an existing connection, step 402 may still be performed, that is, protocol identification may be performed on the connection where the data packet is located.
  • Step 402 Perform a data stream based protocol type identification on the connection where the data packet is located according to the packet feature of the data packet.
  • Data-stream-based protocol identification methods include association identification, port identification, feature recognition, behavior recognition, and the like. After identification succeeds, the identification result of the connection in the user multi-dimensional information table is also updated; if identification fails, an unsuccessful identification result is output.
  • Step 403 If the data-stream-based identification is unsuccessful, search the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information is used to indicate information about all connections currently established by the user terminal.
  • Searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal includes: searching, according to the user terminal address information in the data packet, whether user multi-dimensional information corresponding to that user terminal address information exists in the user multi-dimensional information table.
  • The user multi-dimensional information includes one or a combination of the following: address pair information corresponding to the existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information that the user terminal has visited, protocol list information of the user, and behavior characteristic information of the existing connections of the user terminal. The user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
  • The address pair information corresponding to an existing connection of the user terminal is the address pair formed by the source IP address and the destination IP address of that existing connection. The user terminal address information of an existing connection consists of the IP address and port number of the user terminal for that connection.
  • the server address information that the user terminal has visited is composed of the IP address and port number of the server that the user terminal has visited.
  • the protocol list of the user stores the protocol record information commonly used by the user.
  • The behavior characteristic information of the existing connections of the user terminal includes the protocol characteristics corresponding to the protocol types commonly used by the user and the behavior statistics of the user.
  • After the network device determines that user multi-dimensional information of the user terminal corresponding to the connection where the data packet is located exists in the user multi-dimensional information table, step 404 is performed.
  • If the network device does not find user multi-dimensional information corresponding to the user terminal in the user multi-dimensional information table, it adds user multi-dimensional information corresponding to the user terminal to the table.
  • The user multi-dimensional information of the user may be added to the user multi-dimensional information table after the protocol type of the connection where the data packet is located has been successfully identified.
  • Step 404 If the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the user multi-dimensional information.
  • the protocol identification method based on user multi-dimensional information includes a plurality of independent identification methods, and a fixed sequence is not required between the various identification methods.
  • Each independent identification method identifies the protocol based on one dimension of the user multi-dimensional information, for example: a protocol identification method based on server address information, a protocol identification method based on address pair information of the existing connections of the user terminal, a protocol identification method based on user terminal address information of the existing connections of the user terminal, a feature recognition method based on the existing connections of the user terminal, and a behavior recognition method based on the existing connections of the user terminal.
  • Performing protocol type identification based on the user multi-dimensional information on the connection where the data packet is located, according to the information about all connections currently established by the user terminal that is indicated by the user multi-dimensional information, includes:
  • determining whether the server address information in the quintuple of the data packet is included in the server address information that the user terminal has visited, as stored in the user multi-dimensional information table, and if so, determining that the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose stored server address information is consistent with that of the data packet; or
  • determining whether the source IP address information and destination IP address information in the quintuple of the data packet are included in the source IP address information and destination IP address information of the existing connections of the user terminal, as stored in the user multi-dimensional information table; if so, continuing to determine whether the feature information of the data packet is included in the stored behavior characteristic information of the existing connections of the user terminal; and if so, determining that the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose stored source IP address information, destination IP address information, and behavior characteristic information are consistent with those of the data packet; or
  • determining whether the user terminal address information in the quintuple information of the data packet is included in the user terminal address information of the existing connections of the user terminal, as stored in the user multi-dimensional information table, and if so, determining that the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose stored user terminal address information is consistent with that of the data packet; or
  • determining whether the behavior statistics of the data packet match the behavior characteristic information stored in the user multi-dimensional information table, and if so, determining that the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose stored behavior characteristic information is consistent with that of the data packet.
  • After the identification succeeds, the identification result of the connection in the user multi-dimensional information table is updated.
  • the embodiment of the present invention implements protocol identification based on user multi-dimensional information, thereby implementing user-based service control, and can improve the recognition accuracy of the DPI system and improve protocol identification performance.
  • the above embodiment corresponding to FIG. 4 briefly describes the process of protocol identification by the DPI system.
  • the protocol identification process is described below by a detailed embodiment.
  • FIG. 5 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention.
  • The execution body of this embodiment is a network device, such as a service gateway or a router. The embodiment describes in detail the process by which the network device performs protocol identification on a received packet. As shown in the figure, this embodiment includes the following steps:
  • Step 501 Receive a data packet.
  • Step 502 Determine whether the connection where the data packet is located is a newly established connection.
  • The network device parses the received data packet and obtains the corresponding quintuple information from the packet header.
  • The quintuple information includes the destination IP address, destination port number, source IP address, source port number, and transport-layer protocol number (for example, that of TCP) of the packet.
  • Specifically, the flow table may be searched to determine whether it contains connection record information corresponding to the quintuple information.
  • The flow table stores the record information of the connections of the data flows detected by the DPI system; each record may include the quintuple information, the recognition result of the corresponding connection, and the corresponding service control policy.
  • If such connection record information exists, the connection corresponding to the data packet is an existing connection; otherwise, it is a newly established connection. If the connection is determined to be newly established, step 503 is performed.
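  • The flow table lookup of step 502 can be sketched as follows. This is a minimal illustration, not the implementation described by the embodiment: the names `FiveTuple`, `FlowRecord`, `flow_table`, and `is_new_connection` are hypothetical, and the flow table is assumed to be an in-memory dictionary keyed by the quintuple.

```python
from typing import Dict, NamedTuple, Optional


class FiveTuple(NamedTuple):
    """Quintuple parsed from the packet header (hypothetical layout)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: int  # transport-layer protocol number, e.g. 6 for TCP


class FlowRecord(NamedTuple):
    """One flow table entry: recognition result and service control policy."""
    protocol_type: Optional[str]  # None while the connection is unidentified
    policy: Optional[str]


# Assumption: the flow table is a plain dictionary keyed by the quintuple.
flow_table: Dict[FiveTuple, FlowRecord] = {}


def is_new_connection(key: FiveTuple) -> bool:
    """A connection is new when the flow table has no record for its quintuple."""
    return key not in flow_table
```

  • A packet whose quintuple is absent from `flow_table` would proceed to step 503; a packet whose quintuple is present can be handled according to the stored recognition result and policy.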
  • Step 503 Perform data stream based protocol type identification on the connection where the data packet is located.
  • The data stream-based protocol identification methods include association identification, port identification, feature recognition, behavior recognition, and the like. If the identification succeeds, the recognition result of the connection in the user multi-dimensional information table is updated; if it fails, a recognition result indicating that the identification was unsuccessful is output.
  • Step 504 Perform service processing on the service packet if the data stream is successfully identified.
  • Step 505 If the data stream identification is unsuccessful, determine whether the user multi-dimensional information corresponding to the connection where the data message is located exists in the user multi-dimensional information table.
  • Determining whether the user multi-dimensional information corresponding to the connection where the data packet is located exists in the user multi-dimensional information table includes: determining, according to the user terminal address information in the quintuple information of the data packet, whether the user multi-dimensional information table contains user multi-dimensional information corresponding to that user terminal address information.
  • The user multi-dimensional information includes one or a combination of the following: the address pair information corresponding to the existing connections of the user terminal, the user terminal address information of the existing connections of the user terminal, the address information of the servers that the user terminal has visited, the protocol list information of the user, and the behavior characteristic information of the existing connections of the user terminal.
  • The address pair information corresponding to an existing connection of the user terminal is the address pair formed by the source IP address and the destination IP address of that connection.
  • The user terminal address information of an existing connection consists of the IP address and port number of the user terminal for that connection.
  • The server address information that the user terminal has visited consists of the IP address and port number of each server that the user terminal has visited.
  • The protocol list of the user stores record information about the protocols commonly used by the user.
  • The behavior characteristic information of the existing connections of the user terminal includes the protocol features corresponding to the protocol types commonly used by the user and the behavior statistics of the user.
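  • One possible in-memory layout for a record of the user multi-dimensional information table is sketched below. The field names, the choice of the user IP address as the table key, and the helper `record_connection` are assumptions made for illustration; the embodiment only lists which kinds of information a record holds.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Endpoint = Tuple[str, int]     # (IP address, port number)
AddressPair = Tuple[str, str]  # (source IP address, destination IP address)


@dataclass
class UserRecord:
    """Hypothetical record holding one user's multi-dimensional information."""
    # Each mapping associates a dimension value with a known protocol type.
    address_pairs: Dict[AddressPair, str] = field(default_factory=dict)
    terminal_addresses: Dict[Endpoint, str] = field(default_factory=dict)
    visited_servers: Dict[Endpoint, str] = field(default_factory=dict)
    protocol_list: List[str] = field(default_factory=list)
    behavior_features: Dict[str, dict] = field(default_factory=dict)


# Assumption: the table is keyed by the user terminal's IP address.
user_table: Dict[str, UserRecord] = {}


def record_connection(user_ip: str, user_port: int,
                      server_ip: str, server_port: int,
                      protocol: str) -> UserRecord:
    """Store the dimensions of an identified connection under its user."""
    rec = user_table.setdefault(user_ip, UserRecord())
    rec.address_pairs[(user_ip, server_ip)] = protocol
    rec.terminal_addresses[(user_ip, user_port)] = protocol
    rec.visited_servers[(server_ip, server_port)] = protocol
    if protocol not in rec.protocol_list:
        rec.protocol_list.append(protocol)
    return rec
```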
  • After the network device determines that the user multi-dimensional information of the user terminal corresponding to the connection where the data packet is located does not exist in the user multi-dimensional information table, it may add the user multi-dimensional information of that user to the table.
  • If the user multi-dimensional information table contains user multi-dimensional information of the user terminal corresponding to the newly established connection, step 506 is performed; otherwise, step 507 is performed.
  • Step 506 performing protocol identification based on user multi-dimensional information.
  • the protocol identification method based on user multi-dimensional information includes a plurality of independent identification methods, and a fixed sequence is not required between the various identification methods.
  • Each independent identification method identifies the protocol from one dimension of the user multi-dimensional information. Examples are: a protocol identification method based on server address information, a protocol identification method based on the address pair of the existing connections of the user terminal, a protocol identification method based on the user terminal address information of the existing connections of the user terminal, a feature recognition method based on the existing connections of the user terminal, and a behavior recognition method based on the existing connections of the user terminal.
  • The protocol identification method based on server address information is specifically as follows: if a user initiates a connection to a server port, any connection that the user subsequently initiates to the same server port has the same protocol type as the first connection. For example, if a user accesses a server (for example, 1.2.3.4:80) using the HTTP protocol, the protocol type of all subsequent connections from that user to the same server (1.2.3.4:80) is also HTTP.
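  • The server-address method reduces to a lookup on the (server IP, server port) pair. The sketch below assumes the visited-server information is held as a dictionary from that pair to a protocol type; the function name `identify_by_server_address` is hypothetical.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (server IP address, server port number)


def identify_by_server_address(visited_servers: Dict[Endpoint, str],
                               server_ip: str,
                               server_port: int) -> Optional[str]:
    """Return the protocol type recorded for this server address, if any."""
    return visited_servers.get((server_ip, server_port))
```

  • With `{("1.2.3.4", 80): "HTTP"}` recorded for a user, a new connection from that user to 1.2.3.4:80 is reported as HTTP without inspecting the payload, while a connection to 1.2.3.4:443 is left for another method.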
  • The protocol identification method based on the address pair of the existing connections of the user terminal is specifically as follows: if a user initiates a connection to a server, a connection that the user subsequently initiates to the same server IP address may have the same protocol type as the first connection.
  • This method searches the user's historical connections for a connection with the same IP address pair (destination IP address, source IP address) as the newly established connection, and then confirms by a simple check (for example, a simple feature-word confirmation) whether the protocol type of the newly established connection is the same as that of the historical connection.
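  • The address-pair method differs from the server-address method in that a match only yields a candidate, which still needs a light confirmation. The sketch below assumes the confirmation is a payload prefix test; the `FEATURE_WORDS` table and the function name are hypothetical and are not taken from the embodiment.

```python
from typing import Dict, Optional, Tuple

AddressPair = Tuple[str, str]  # (source IP address, destination IP address)

# Hypothetical feature words used only for the simple confirmation step.
FEATURE_WORDS: Dict[str, Tuple[bytes, ...]] = {
    "HTTP": (b"GET ", b"POST ", b"HTTP/"),
    "FTP": (b"USER ", b"220 "),
}


def identify_by_address_pair(address_pairs: Dict[AddressPair, str],
                             src_ip: str, dst_ip: str,
                             payload: bytes) -> Optional[str]:
    """Look up a candidate by IP pair, then confirm it with a feature word."""
    candidate = address_pairs.get((src_ip, dst_ip))
    if candidate is None:
        return None
    words = FEATURE_WORDS.get(candidate, ())
    if any(payload.startswith(word) for word in words):
        return candidate
    return None  # same IP pair, but the payload does not confirm the protocol
```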
  • The protocol identification method based on the user terminal address information of the existing connections of the user terminal is specifically as follows: if a user initiates multiple connections to one or more destination addresses from the same user terminal address (IP:Port), the connections from that user terminal address (IP:Port) have the same protocol type.
  • This method searches the user's historical connections for a connection with the same user terminal address (IP:Port) as the newly established connection; if one is found, the protocol type of the newly established connection can be confirmed to be the same as that of the historical connection.
  • The user-based feature identification method is specifically as follows: a list of the protocols commonly used by the user is recorded per user. The list is drawn from the protocols that the user has used and from a pre-configured protocol list (for example, protocol applications that are popular in the user's region).
  • The user-based behavior recognition method is specifically as follows: the behavior statistics of the user's packets are compared with the user behavior feature set, and if they match, the protocol to which the current packet belongs is confirmed.
  • The user behavior statistics include the statistical distribution of binary values in the packets, the port range, packet length statistics (the packet length range, packet length sequence, packet length set, average packet length, and sum of the packet lengths of uplink and downlink interactions), the packet transmission frequency, the ratio of packets sent to packets received, and the degree of dispersion of the destination addresses.
  • The user behavior feature set is saved in the user record. Its initial content is the pre-configured user behavior features, and it is enriched and updated during identification according to the behavior statistics of the user's historical connections.
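  • A simplified sketch of the comparison between behavior statistics and a behavior feature set follows. It uses only three of the statistics listed above (port range, packet length range, and average packet length), and the `BehaviorFeature` structure and the matching rule are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional, Tuple


@dataclass
class BehaviorFeature:
    """Hypothetical behavior feature of one protocol type."""
    port_range: Tuple[int, int]
    length_range: Tuple[int, int]          # bounds for every packet length
    mean_length_range: Tuple[float, float]  # bounds for the average length


def identify_by_behavior(feature_set: Dict[str, BehaviorFeature],
                         lengths: List[int], port: int) -> Optional[str]:
    """Return the first protocol whose behavior feature matches the statistics."""
    if not lengths:
        return None  # no packets observed yet, nothing to compare
    average = mean(lengths)
    for protocol, feat in feature_set.items():
        port_ok = feat.port_range[0] <= port <= feat.port_range[1]
        lengths_ok = all(feat.length_range[0] <= n <= feat.length_range[1]
                         for n in lengths)
        mean_ok = feat.mean_length_range[0] <= average <= feat.mean_length_range[1]
        if port_ok and lengths_ok and mean_ok:
            return protocol
    return None
```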
  • If the identification succeeds, step 508 is performed; otherwise, step 511 is performed and the recognition result is output.
  • Step 507: Add user multi-dimensional information corresponding to the new user terminal to the user multi-dimensional information table.
  • Step 509 is then performed, that is, protocol identification based on the data stream is performed on the data packet.
  • Step 508: If the identification succeeds, determine whether the successfully identified packet belongs to traffic that cannot be identified by the feature recognition method.
  • After the protocol type of the packet is successfully identified based on the user multi-dimensional information, it is further determined whether the identified packet belongs to traffic that cannot be identified by the feature recognition method. For example, if the first connection established by the user is an encrypted connection, it cannot be identified by the feature recognition method but is identified by the "behavior recognition" method of flow-based protocol identification; the DPI system then records the IP address, port, and other information of that encrypted connection. When the user establishes a second, identical encrypted connection, the DPI system can identify it by one of the five methods of the present invention. This triggers the determination, and the DPI system collects behavior statistics on the second encrypted connection in order to update the behavior features of the corresponding protocol.
  • If yes, step 509 is performed; otherwise, step 510 is performed.
  • Step 509 Perform user-based behavior recognition statistics, and update user behavior characteristic information that the user terminal in the user multi-dimensional information table has connected.
  • Because a connection that cannot be identified by feature recognition was originally identified by its behavior features, it can serve as a data sample for the behavior features of the corresponding protocol, helping to refine those behavior features.
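  • How a newly observed connection could enrich a stored behavior feature is sketched below for a single statistic, the packet length range. The widening rule and the name `update_length_range` are assumptions; the embodiment states only that the behavior features are updated from such samples, not how.

```python
from typing import List, Tuple


def update_length_range(current: Tuple[int, int],
                        sample_lengths: List[int]) -> Tuple[int, int]:
    """Widen the stored packet length range so that it covers the new sample."""
    if not sample_lengths:
        return current  # nothing observed, keep the stored feature unchanged
    low = min(current[0], min(sample_lengths))
    high = max(current[1], max(sample_lengths))
    return (low, high)
```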
  • Step 510 Update the recognition result data corresponding to the connection in the user multidimensional information table.
  • the protocol identification result corresponding to the data flow in the flow table, the service control policy, and the like may also be updated.
  • Step 511: Output the recognition result.
  • the recognition result may be output regardless of whether the recognition is successful, so as to perform corresponding business control according to the recognition result.
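  • The overall decision flow of steps 503 to 511 can be summarized in a short sketch. It is deliberately simplified: the behavior-statistics branch (steps 508 and 509) is omitted, the two identification stages are passed in as callables, and the table is a dictionary keyed by a user identifier. All names are hypothetical.

```python
from typing import Callable, Dict, Optional


def identify_connection(user_key: str,
                        flow_identify: Callable[[], Optional[str]],
                        user_identify: Callable[[dict], Optional[str]],
                        user_table: Dict[str, dict]) -> Optional[str]:
    """Return the protocol type, or None when identification is unsuccessful."""
    result = flow_identify()                        # step 503
    if result is None:
        record = user_table.get(user_key)           # step 505
        if record is None:
            user_table[user_key] = {"results": []}  # step 507
            return None                             # step 511, unsuccessful
        result = user_identify(record)              # step 506
        if result is None:
            return None                             # step 511, unsuccessful
    # Step 510: store the recognition result under the user, then output it.
    user_table.setdefault(user_key, {"results": []})["results"].append(result)
    return result
```

  • In this sketch the first unidentified connection of an unknown user only creates the user record; a later connection from the same user can then be resolved by the user-based stage.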
  • the embodiment of the present invention implements user-based protocol identification, and the user-based service control can also be implemented based on the user's protocol identification.
  • the protocol identification based on the user's multi-dimensional information can be based only on the IP address and port of the packet, and does not perform deep content scanning on the packet, so the performance of the protocol identification can be significantly improved.
  • FIG. 6 is a schematic diagram of a protocol type identification apparatus according to an embodiment of the present invention. As shown in the figure, the embodiment includes the following functional units:
  • the obtaining unit 601 is configured to acquire a data packet transmitted on a connection established between the user terminal and the server.
  • After receiving a data packet of the data stream, the network device parses the packet and obtains the corresponding quintuple information from the packet header.
  • The quintuple information includes the destination IP address, destination port number, source IP address, source port number, and transport-layer protocol number (for example, that of TCP) of the packet.
  • The network device may query the flow table to determine whether it contains connection record information corresponding to the quintuple information of the service data packet. If so, the connection corresponding to the data flow is determined to be an existing connection; if not, it is determined to be a newly established connection.
  • If, after the flow table is queried, the connection corresponding to the data flow of the data packet is determined to be an existing connection, the data packet is processed directly (for example, flow control is applied) according to the protocol type identification result and the service processing method stored in the flow table for the quintuple information of the data packet. It should be noted that even if the connection of the data packet is an existing connection, the searching unit 602 may still perform the following determining operation, that is, perform the corresponding protocol identification on the connection where the data packet is located.
  • the searching unit 602 is configured to search for the user multi-dimensional information corresponding to the user terminal in the multi-dimensional information table of the user, where the multi-dimensional information of the user is used to indicate information about all connections currently established by the user terminal.
  • the searching unit 602 is specifically configured to: according to the user terminal address information in the data packet, search for whether the user multi-dimensional information corresponding to the user terminal address information exists in the user multi-dimensional information table.
  • The user multi-dimensional information corresponding to the user terminal includes at least one of the following: the source IP address information and destination IP address information corresponding to the existing connections of the user terminal, and the user terminal address information of the existing connections of the user terminal.
  • The user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
  • The address pair information corresponding to an existing connection of the user terminal is the address pair formed by the source IP address and the destination IP address of that connection. The user terminal address information of an existing connection consists of the IP address and port number of the user terminal for that connection.
  • The server address information that the user terminal has visited consists of the IP address and port number of each server that the user terminal has visited.
  • The protocol list of the user stores record information about the protocols commonly used by the user.
  • The behavior characteristic information of the existing connections of the user terminal includes the protocol features corresponding to the protocol types commonly used by the user and the behavior statistics of the user.
  • The first processing unit 603 is configured to: if the user multi-dimensional information corresponding to the user terminal is found, perform, according to the information about all connections currently established by the user terminal that is indicated by the obtained user multi-dimensional information, protocol type identification based on the user multi-dimensional information on the connection where the data packet is located.
  • the protocol identification method based on user multi-dimensional information includes a plurality of independent identification methods, and a fixed sequence is not required between the various identification methods.
  • Each independent identification method identifies the protocol from one dimension of the user multi-dimensional information. Examples are: a protocol identification method based on server address information, a protocol identification method based on the address pair of the existing connections of the user terminal, a protocol identification method based on the user terminal address information of the existing connections of the user terminal, a feature recognition method based on the existing connections of the user terminal, a behavior recognition method based on the existing connections of the user terminal, and the like.
  • The first processing unit 603 is specifically configured to perform any one of the following:
  • (a) determine whether the server address information in the quintuple of the data packet is included in the server address information that the user terminal has visited, as stored in the user multi-dimensional information table; if so, the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table matches that of the data packet; or
  • (b) determine whether the source IP address information and destination IP address information in the quintuple of the data packet are included in the source IP address information and destination IP address information corresponding to the existing connections of the user terminal stored in the user multi-dimensional information table; if so, further determine whether the feature information of the data packet is included in the behavior characteristic information of the existing connections of the user terminal stored in the user multi-dimensional information table; if so, the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior characteristic information stored in the user multi-dimensional information table all match the data packet; or
  • (c) determine whether the user terminal address information in the quintuple of the data packet is included in the user terminal address information of the existing connections of the user terminal stored in the user multi-dimensional information table; if so, the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table matches that of the data packet; or
  • (d) determine whether the behavior statistics of the data packet and of historical data packets are included in the behavior characteristic information of the existing connections of the user terminal stored in the user multi-dimensional information table; if so, the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose behavior characteristic information stored in the user multi-dimensional information table matches the data packet.
  • the first processing unit 603 is further configured to: if the identification is successful, update the recognition result data in the user multi-dimensional information table, and output a recognition result, where the identification result data is the identified data message The type of protocol being connected.
  • The first processing unit 603 is further configured to: if the identification is successful, further determine whether the data packet is a packet that cannot be identified by feature recognition, and if so, perform user-based behavior recognition statistics and update the behavior characteristic information of the existing connections of the user terminal in the user connection data table.
  • The second processing unit 604 is configured to: if the user multi-dimensional information corresponding to the user terminal is not found, perform data stream-based protocol type identification on the connection where the data packet is located according to the packet features of the data packet.
  • the second processing unit 604 is further configured to: if the user multi-dimensional information corresponding to the user terminal does not exist in the user multi-dimensional information table, add user multi-dimensional information corresponding to the user terminal in the user multi-dimensional information table.
  • the second processing unit 604 is further configured to: if the identification is successful, update the recognition result data in the user multi-dimensional information table, and otherwise output the recognition result.
  • The protocol identification is performed based on the data flow. The flow-based protocol identification methods include association identification, port identification, feature recognition, behavior recognition, and the like. If the identification succeeds, the recognition result of the connection in the user multi-dimensional information table is also updated; if it fails, a recognition result indicating that the identification was unsuccessful is output.
  • The method and device for identifying a protocol type provided by the embodiments of the present invention perform protocol type identification based on user multi-dimensional information according to the protocol types of the existing connections of the user terminal, and can thereby implement service control on a per-user basis. In addition, by combining protocol type identification based on user multi-dimensional information with protocol type identification based on the data stream, they can improve the recognition accuracy of the DPI system and improve protocol identification performance.
  • FIG. 7 is a schematic diagram of another protocol type identification device according to an embodiment of the present invention.
  • the device includes:
  • the obtaining unit 701 is configured to acquire a data packet transmitted on a connection established between the user terminal and the server.
  • After receiving a data packet of the data stream, the network device parses the packet and obtains the corresponding quintuple information from the packet header. The quintuple information includes the destination IP address, destination port number, source IP address, source port number, and transport-layer protocol number (for example, that of TCP) of the packet. The network device then determines, according to the quintuple information, whether the connection corresponding to the data flow is a newly established connection.
  • The network device may query the flow table to determine whether it contains connection record information corresponding to the quintuple information of the service data packet. If so, the connection corresponding to the data flow is determined to be an existing connection; if not, it is determined to be a newly established connection.
  • If, after the flow table is queried, the connection corresponding to the data flow of the data packet is determined to be an existing connection, the data packet is processed directly (for example, flow control is applied) according to the protocol type identification result and the service processing method stored in the flow table for the quintuple information of the data packet. It should be noted that even if the connection of the data packet is an existing connection, the first processing unit 702 may still perform the related operation, that is, perform data stream-based protocol identification on the connection where the data packet is located.
  • the first processing unit 702 is configured to perform, according to the packet feature of the data packet, a protocol type identification based on the data stream in the connection where the data packet is located.
  • The data stream-based protocol identification methods include association identification, port identification, feature recognition, behavior recognition, and the like. If the identification succeeds, the recognition result of the connection in the user multi-dimensional information table is also updated; if it fails, a recognition result indicating that the identification was unsuccessful is output.
  • the first processing unit 702 is further configured to perform corresponding service processing on the data packet if the data stream is successfully identified.
  • The searching unit 703 is configured to: if the data stream-based identification is unsuccessful, search the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information indicates information about all connections currently established by the user terminal.
  • The user multi-dimensional information corresponding to the user terminal includes at least one of the following: the source IP address information and destination IP address information corresponding to the existing connections of the user terminal, and the user terminal address information of the existing connections of the user terminal.
  • The user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
  • the searching unit 703 is specifically configured to: according to the user terminal address information in the data packet, search for whether the user multi-dimensional information corresponding to the user terminal address information exists in the user multi-dimensional information table.
  • The second processing unit 704 is configured to: if the user multi-dimensional information corresponding to the user terminal is found, perform, according to the information about all connections currently established by the user terminal that is indicated by the obtained user multi-dimensional information, protocol type identification based on the user multi-dimensional information on the connection where the data packet is located.
  • The second processing unit 704 is specifically configured to perform any one of the following:
  • (a) determine whether the server address information in the quintuple of the data packet is included in the server address information that the user terminal has visited, as stored in the user multi-dimensional information table; if so, the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table matches that of the data packet; or
  • (b) determine whether the source IP address information and destination IP address information in the quintuple of the data packet are included in the source IP address information and destination IP address information corresponding to the existing connections of the user terminal stored in the user multi-dimensional information table; if so, further determine whether the feature information of the data packet is included in the behavior characteristic information of the existing connections of the user terminal stored in the user multi-dimensional information table; if so, the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior characteristic information stored in the user multi-dimensional information table all match the data packet; or
  • (c) determine whether the user terminal address information in the quintuple of the data packet is included in the user terminal address information of the existing connections of the user terminal stored in the user multi-dimensional information table; if so, the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table matches that of the data packet; or
  • (d) determine whether the behavior statistics of the data packet and of historical data packets are included in the behavior characteristic information of the existing connections of the user terminal stored in the user multi-dimensional information table; if so, the protocol type of the connection where the data packet is located is the protocol type of the existing connection whose behavior characteristic information stored in the user multi-dimensional information table matches the data packet.
  • the second processing unit 704 is further configured to: if the user multi-dimensional information corresponding to the user terminal does not exist in the user multi-dimensional information table, add user multi-dimensional information corresponding to the user terminal in the user multi-dimensional information table.
  • The second processing unit 704 is further configured to: if the identification based on the user multi-dimensional information is successful, update the recognition result data in the user multi-dimensional information table and output the recognition result.
  • The second processing unit 704 is further configured to: if the identification based on the user multi-dimensional information is successful, further determine whether the data packet is a packet that cannot be identified by feature recognition, and if so, perform user-based behavior recognition statistics and update the behavior characteristic information of the existing connections of the user terminal in the user connection data table.
  • the embodiment of the present invention implements protocol identification based on user multi-dimensional information, thereby implementing user-based service control, and can improve the recognition accuracy of the DPI system and improve protocol identification performance.
  • FIG. 8 is a schematic diagram of a network device according to an embodiment of the present invention.
  • the network device includes a network interface 801, a processor 802, and a memory 803.
  • System bus 804 is used to connect network interface 801, processor 802, and memory 803.
  • the network interface 801 is used to connect with user terminal devices, server side devices, and other network devices.
  • the memory 803 may be a persistent storage such as a hard disk drive and a flash memory having a software module and a device driver therein.
  • the software modules are capable of performing the various functional modules of the above described methods of the present invention; the device drivers can be network and interface drivers.
  • if the user multi-dimensional information corresponding to the user terminal is found, performing, according to the information about all connections currently established by the user terminal that is indicated by the obtained user multi-dimensional information, protocol type identification based on the user multi-dimensional information on the connection where the data packet is located;
  • if the user multi-dimensional information corresponding to the user terminal is not found, performing data stream-based protocol type identification on the connection where the data packet is located according to the packet features of the data packet.
  • the user multi-dimensional information corresponding to the user terminal includes at least one of the following: the source IP address information and destination IP address information of the user terminal's existing connections, and the user terminal address information of the user terminal's existing connections.
  • the user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the user terminal's existing connections.
  • the process by which the processor 802 searches the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal includes: searching the user multi-dimensional information table, according to the user terminal address information in the data packet, for user multi-dimensional information corresponding to that user terminal address information.
  • after the processor 802 searches the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal, it further executes the following instruction: if no user multi-dimensional information corresponding to the user terminal exists in the user multi-dimensional information table, add the user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
  • the process by which the processor 802 performs protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information, is specifically: determining whether the server address information in the five-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has previously accessed; if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table matches that of the data packet; or
  • the protocol type is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table all match those of the data packet; or
  • the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table matches that of the data packet; or
  • the protocol type is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table matches that of the data packet.
  • after the processor 802 performs protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, it further executes the following instruction: if identification succeeds, update the identification result data in the user multi-dimensional information table and output the identification result, where the identification result data is the identified protocol type of the connection carrying the data packet.
  • after the processor 802 performs protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information, it accesses the memory 803 and executes an instruction: if identification succeeds, update the identification result data in the user multi-dimensional information table and output the identification result.
  • after the processor 802 performs protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the protocol types of the user terminal's existing connections as identified by the acquired user multi-dimensional information, it accesses the memory 803 and executes an instruction: if identification succeeds, further determine whether the data packet is one that cannot be identified by feature identification; if so, collect user-based behavior identification statistics and update the behavior feature information of the user terminal's existing connections in the user connection data table.
  • the network device provided by this embodiment of the present invention performs protocol type identification based on user multi-dimensional information on received data packets, according to the protocol types of the user terminal's existing connections, which enables service control on a per-user basis; combining protocol type identification based on user multi-dimensional information with protocol type identification based on data streams can improve the identification accuracy of the DPI system and protocol identification performance.
  • FIG. 9 is a schematic diagram of a network device according to an embodiment of the present invention.
  • the network device includes a network interface 901, a processor 902, and a memory 903.
  • System bus 904 is used to connect network interface 901, processor 902, and memory 903.
  • the network interface 901 is used to connect with user terminal devices, server side devices, and other network devices.
  • the memory 903 may be persistent storage, such as a hard disk drive or flash memory, and holds software modules and device drivers.
  • the software modules can carry out the functions of the methods of the present invention described above; the device drivers may be network and interface drivers.
  • the user multi-dimensional information table is searched for the user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal;
  • protocol type identification based on the user multi-dimensional information is performed on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information.
  • the user multi-dimensional information corresponding to the user terminal includes at least one of the following: the source IP address information and destination IP address information of the user terminal's existing connections, and the user terminal address information of the user terminal's existing connections.
  • the user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the user terminal's existing connections.
  • the process by which the processor 902 searches the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal includes: searching the user multi-dimensional information table, according to the user terminal address information in the data packet, for user multi-dimensional information corresponding to that user terminal address information.
  • after the processor 902 searches the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal, it accesses the memory 903 and further executes the following instruction: if no user multi-dimensional information corresponding to the user terminal exists in the user multi-dimensional information table, add the user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
  • the process by which the processor 902 performs protocol type identification based on the user multi-dimensional information, according to the information about all connections currently established by the user terminal as identified by the user multi-dimensional information, is specifically: determining whether the server address information in the five-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has previously accessed; if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table matches that of the data packet; or
  • the protocol type is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table all match those of the data packet; or
  • the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table matches that of the data packet; or
  • the protocol type is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table matches that of the data packet.
  • after the processor 902 performs protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information, it accesses the memory 903 and further executes an instruction: if identification based on the user multi-dimensional information succeeds, update the identification result data in the user multi-dimensional information table and output the identification result.
  • the processor 902 accesses the memory 903 and executes an instruction: if identification based on the data stream succeeds, perform the corresponding service processing on the data packet.
  • after the processor 902 performs protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, it accesses the memory 903 and executes an instruction: if identification based on the user multi-dimensional information succeeds, further determine whether the data packet is one that cannot be identified by feature identification; if so, collect user-based behavior identification statistics and update the behavior feature information of the user terminal's existing connections in the user connection data table.
  • the network device provided by this embodiment of the present invention performs protocol type identification based on user multi-dimensional information on received data packets, according to the protocol types of the user terminal's existing connections, which enables service control on a per-user basis; combining protocol type identification based on user multi-dimensional information with protocol type identification based on data streams can improve the identification accuracy of the DPI system and protocol identification performance.
  • the steps of a method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, a software module executed by a processor, or a combination of both.
  • the software module can reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

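The present invention relates to a method and an apparatus for identifying a protocol type. The method includes: acquiring a data packet transmitted on a connection established between a user terminal and a server; searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal; if the user multi-dimensional information corresponding to the user terminal is found, performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information; and if no user multi-dimensional information corresponding to the user terminal is found, performing data-stream-based protocol type identification on the connection carrying the data packet according to the packet features of the data packet. The present invention implements protocol identification based on user multi-dimensional information and thereby implements user-based service control.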

Description

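Method and Apparatus for Identifying a Protocol Type
Technical Field
The present invention relates to network traffic management technologies, and in particular to a method and an apparatus for identifying a protocol type.
Background
Deep packet inspection (DPI) technology identifies packets by analyzing them in depth. In addition to analyzing the L2 (data link layer), L3 (network layer), and L4 (transport layer) content of a packet, DPI also analyzes the L7 (application layer) content, so it can identify the actual applications and their content, which can then be used in scenarios such as network optimization and traffic control.
In the prior art, DPI generally identifies packets on a per-data-stream basis: DPI processes one data stream at a time. After a flow table lookup, it scans the packets in the data stream using various identification methods, such as feature identification, port classification, and statistical methods, to identify and classify the stream. The identification of each stream is an independent process, and identification results are also stored per stream.
The drawback of stream-based identification is that it identifies and classifies protocols by scanning packet content within the scope of each individual stream and does not exploit the correlation between data streams. As a result, data stream identification performance is low, and precise service control on a per-user basis cannot be achieved.
Summary
Embodiments of the present invention provide a method and an apparatus for identifying a protocol type, to improve the efficiency of data stream protocol identification.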
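In a first aspect, an embodiment of the present invention provides a method for identifying a protocol type, the method including:
acquiring a data packet transmitted on a connection established between a user terminal and a server;
searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal;
if the user multi-dimensional information corresponding to the user terminal is found, performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information; and
if no user multi-dimensional information corresponding to the user terminal is found, performing data-stream-based protocol type identification on the connection carrying the data packet according to the packet features of the data packet.
In a first possible implementation, searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal includes: searching the user multi-dimensional information table, according to the user terminal address information in the data packet, for user multi-dimensional information corresponding to that user terminal address information.
According to the first possible implementation of the first aspect, in a second possible implementation, after searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal, the method further includes: if no user multi-dimensional information corresponding to the user terminal exists in the user multi-dimensional information table, adding the user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.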
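With reference to the first aspect, in a third possible implementation, the user multi-dimensional information corresponding to the user terminal includes at least one of the following: source IP address information and destination IP address information of the user terminal's existing connections, user terminal address information of the user terminal's existing connections, address information of servers that the user terminal has previously accessed, a protocol list of the user terminal, and behavior feature information of the user terminal's existing connections. The user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the user terminal's existing connections.
With reference to the third possible implementation of the first aspect, in a fourth possible implementation, performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information, includes: determining whether the server address information in the five-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has previously accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table matches that of the data packet; or determining whether the source IP address information and destination IP address information in the five-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, further determining whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table all match those of the data packet; or determining whether the user terminal address information in the five-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table matches that of the data packet; or determining whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table matches that of the data packet.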
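With reference to the first aspect or any one of the first to fourth possible implementations of the first aspect, in a fifth possible implementation, after performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information, the method further includes: if identification succeeds, updating the identification result data in the user multi-dimensional information table and outputting the identification result, where the identification result data is the identified protocol type of the connection carrying the data packet.
With reference to the first aspect or any one of the first to fifth possible implementations of the first aspect, in a sixth possible implementation, after performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the protocol types of the user terminal's existing connections as identified by the acquired user multi-dimensional information, the method further includes: if identification succeeds, further determining whether the data packet is one that cannot be identified by feature identification; if so, collecting user-based behavior identification statistics and updating the behavior feature information of the user terminal's existing connections in the user multi-dimensional information table.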
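In a second aspect, an embodiment of the present invention further provides a method for identifying a protocol type, the method including:
acquiring a data packet transmitted on a connection established between a user terminal and a server;
performing data-stream-based protocol type identification on the connection carrying the data packet according to the packet features of the data packet;
if the data-stream-based identification fails, searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal; and
if the user multi-dimensional information corresponding to the user terminal is found, performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information.
According to the second aspect, in a first possible implementation, searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal includes: searching the user multi-dimensional information table, according to the user terminal address information in the data packet, for user multi-dimensional information corresponding to that user terminal address information.
According to the first possible implementation of the second aspect, in a second possible implementation, after searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal, the method further includes: if no user multi-dimensional information corresponding to the user terminal exists in the user multi-dimensional information table, adding the user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
With reference to the second aspect, in a third possible implementation, the user multi-dimensional information corresponding to the user terminal includes at least one of the following: source IP address information and destination IP address information of the user terminal's existing connections, user terminal address information of the user terminal's existing connections, address information of servers that the user terminal has previously accessed, a protocol list of the user terminal, and behavior feature information of the user terminal's existing connections. The user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the user terminal's existing connections.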
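With reference to the third possible implementation of the second aspect, in a fourth possible implementation, performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information, includes: determining whether the server address information in the five-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has previously accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table matches that of the data packet; or determining whether the source IP address information and destination IP address information in the five-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, further determining whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table all match those of the data packet; or determining whether the user terminal address information in the five-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table matches that of the data packet; or determining whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table matches that of the data packet.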
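With reference to the second aspect or any one of the first to fourth possible implementations of the second aspect, in a fifth possible implementation, after performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information, the method further includes: if identification succeeds, updating the identification result data in the user multi-dimensional information table and outputting the identification result, where the identification result data is the identified protocol type of the connection carrying the data packet.
With reference to the second aspect or any one of the first to fifth possible implementations of the second aspect, in a sixth possible implementation, after performing data-stream-based protocol type identification on the connection carrying the data packet, the method further includes: if the data-stream-based identification succeeds, performing the corresponding service processing on the data packet.
With reference to the second aspect or any one of the first to sixth possible implementations of the second aspect, in a seventh possible implementation, after performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, the method further includes: if identification succeeds, further determining whether the data packet is one that cannot be identified by feature identification; if so, collecting user-based behavior identification statistics and updating the behavior feature information of the user terminal's existing connections in the user connection data table.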
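In a third aspect, an embodiment of the present invention provides an apparatus for identifying a protocol type, the apparatus including:
an acquiring unit, configured to acquire a data packet transmitted on a connection established between a user terminal and a server;
a lookup unit, configured to search a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal;
a first processing unit, configured to, if the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information; and
a second processing unit, configured to, if no user multi-dimensional information corresponding to the user terminal is found, perform data-stream-based protocol type identification on the connection carrying the data packet according to the packet features of the data packet.
In a first possible implementation, the lookup unit is specifically configured to: search the user multi-dimensional information table, according to the user terminal address information in the data packet, for user multi-dimensional information corresponding to that user terminal address information.
According to the first possible implementation of the third aspect, in a second possible implementation, if no user multi-dimensional information corresponding to the user terminal exists in the user multi-dimensional information table, the user multi-dimensional information corresponding to the user terminal is added to the user multi-dimensional information table.
With reference to the third aspect or the first or second possible implementation of the third aspect, in a third possible implementation, the user multi-dimensional information corresponding to the user terminal includes at least one of the following: source IP address information and destination IP address information of the user terminal's existing connections, user terminal address information of the user terminal's existing connections, address information of servers that the user terminal has previously accessed, a protocol list of the user terminal, and behavior feature information of the user terminal's existing connections. The user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the user terminal's existing connections.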
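With reference to the third possible implementation of the third aspect, in a fourth possible implementation, the first processing unit is specifically configured to: determine whether the server address information in the five-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has previously accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table matches that of the data packet; or determine whether the source IP address information and destination IP address information in the five-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, further determine whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table all match those of the data packet; or determine whether the user terminal address information in the five-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table matches that of the data packet; or determine whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table matches that of the data packet.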
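With reference to the third aspect or any one of the first to fourth possible implementations of the third aspect, in a fifth possible implementation, the first processing unit is further configured to: if identification succeeds, update the identification result data in the user multi-dimensional information table and output the identification result, where the identification result data is the identified protocol type of the connection carrying the data packet.
With reference to the third aspect or any one of the first to fifth possible implementations of the third aspect, in a sixth possible implementation, the first processing unit is further configured to: if identification succeeds, further determine whether the data packet is one that cannot be identified by feature identification; if so, collect user-based behavior identification statistics and update the behavior feature information of the user terminal's existing connections in the user multi-dimensional information table.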
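In a fourth aspect, an embodiment of the present invention provides an apparatus for identifying a protocol type, the apparatus including:
an acquiring unit, configured to acquire a data packet transmitted on a connection established between a user terminal and a server;
a first processing unit, configured to perform data-stream-based protocol type identification on the connection carrying the data packet according to the packet features of the data packet;
a lookup unit, configured to, if the data-stream-based identification fails, search a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal; and
a second processing unit, configured to, if the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information.
According to the fourth aspect, in a first possible implementation, the lookup unit is specifically configured to: search the user multi-dimensional information table, according to the user terminal address information in the data packet, for user multi-dimensional information corresponding to that user terminal address information.
According to the first possible implementation of the fourth aspect, in a second possible implementation, the second processing unit is further configured to: if no user multi-dimensional information corresponding to the user terminal exists in the user multi-dimensional information table, add the user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
With reference to the fourth aspect, in a third possible implementation, the user multi-dimensional information corresponding to the user terminal includes at least one of the following: source IP address information and destination IP address information of the user terminal's existing connections, user terminal address information of the user terminal's existing connections, address information of servers that the user terminal has previously accessed, a protocol list of the user terminal, and behavior feature information of the user terminal's existing connections. The user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the user terminal's existing connections.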
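With reference to the third possible implementation of the fourth aspect, in a fourth possible implementation, the second processing unit is specifically configured to: determine whether the server address information in the five-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has previously accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table matches that of the data packet; or determine whether the source IP address information and destination IP address information in the five-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, further determine whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table all match those of the data packet; or determine whether the user terminal address information in the five-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table matches that of the data packet; or determine whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table matches that of the data packet.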
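With reference to the fourth aspect or any one of the first to fourth possible implementations of the fourth aspect, in a fifth possible implementation, the second processing unit is further configured to: if identification based on the user multi-dimensional information succeeds, update the identification result data in the user multi-dimensional information table and output the identification result.
With reference to the fourth aspect or any one of the first to fifth possible implementations of the fourth aspect, in a sixth possible implementation, the first processing unit is further configured to: if the data-stream-based identification succeeds, perform the corresponding service processing on the data packet.
With reference to the fourth aspect or any one of the first to seventh possible implementations of the fourth aspect, in an eighth possible implementation, the second processing unit is further configured to: if identification based on the user multi-dimensional information succeeds, further determine whether the data packet is one that cannot be identified by feature identification; if so, collect user-based behavior identification statistics and update the behavior feature information of the user terminal's existing connections in the user connection data table.
The method and apparatus for identifying a protocol type provided by the embodiments of the present invention perform protocol type identification based on user multi-dimensional information on received data packets, according to the protocol types of the user terminal's existing connections, which enables service control on a per-user basis. Moreover, combining protocol type identification based on user multi-dimensional information with protocol type identification based on data streams can improve the identification accuracy of the DPI system and protocol identification performance.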
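Brief Description of the Drawings
FIG. 1 is a flowchart of a method for identifying a protocol type according to an embodiment of the present invention;
FIG. 2 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention;
FIG. 3 is a block diagram of a DPI system according to an embodiment of the present invention;
FIG. 4 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention;
FIG. 5 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an apparatus for identifying a protocol type according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of another apparatus for identifying a protocol type according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a network device according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of another network device according to an embodiment of the present invention.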
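Detailed Description
The technical solutions of the present invention are described in further detail below with reference to the accompanying drawings and embodiments.
In practice, the method for identifying a protocol type provided by the embodiments of the present invention is a new protocol identification method that can be applied to service scenarios such as network optimization and application traffic control. When a network device, such as a service gateway or a router, receives a data packet on a newly established connection, it can analyze the protocol type of that service data packet based on the user multi-dimensional information table. The embodiments of the present invention can therefore implement service control on a per-user basis and, by combining protocol type identification based on user multi-dimensional information with protocol type identification based on data streams, can improve the identification accuracy of the DPI system and protocol identification performance.
In this application, the server address information in the five-tuple may be either the source address information or the destination address information of the data packet's five-tuple: for a data packet sent by the user terminal to the server, the destination address information of the five-tuple is the server address information; for a data packet sent by the server to the user terminal, the source address information of the five-tuple is the server address information. In addition, the user terminal may specifically be a client or an application running on the user terminal.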
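The direction-dependent reading of the five-tuple can be illustrated with a minimal Python sketch. The `FiveTuple` type, the `from_user` flag, and the helper names are assumptions introduced for illustration only; the description does not say how a device learns the packet direction.

```python
from typing import NamedTuple, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class FiveTuple(NamedTuple):
    """Hypothetical container for the five-tuple parsed from a packet header."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: int  # transport-layer protocol number (6 = TCP, 17 = UDP)


def server_endpoint(ft: FiveTuple, from_user: bool) -> Endpoint:
    # Uplink packet (user -> server): the destination is the server.
    # Downlink packet (server -> user): the source is the server.
    return (ft.dst_ip, ft.dst_port) if from_user else (ft.src_ip, ft.src_port)


def user_endpoint(ft: FiveTuple, from_user: bool) -> Endpoint:
    # The user terminal is always the opposite end from the server.
    return (ft.src_ip, ft.src_port) if from_user else (ft.dst_ip, ft.dst_port)


uplink = FiveTuple("10.0.0.5", 51000, "1.2.3.4", 80, 6)
downlink = FiveTuple("1.2.3.4", 80, "10.0.0.5", 51000, 6)
```

Both packets above belong to the same connection, so both helpers return the same server and user endpoints regardless of direction.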
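FIG. 1 is a flowchart of a method for identifying a protocol type according to an embodiment of the present invention. This embodiment is executed by a network device, such as a service gateway or a router, and describes in detail how the network device performs user-based protocol type identification on a received data packet. As shown in the figure, this embodiment includes the following steps:
Step 101: Acquire a data packet transmitted on a connection established between a user terminal and a server.
After receiving a data packet of a data stream, the network device parses the packet and obtains the corresponding five-tuple information from the packet header. The five-tuple information includes the packet's destination IP address, destination port number, source IP address, source port number, and transport-layer protocol number (for example, the protocol number of the Transmission Control Protocol (TCP) or of the User Datagram Protocol (UDP)). The network device then determines, according to the five-tuple information, whether the connection corresponding to the data stream is a newly established connection.
Preferably, after receiving the data packet, the network device may query the flow table to determine whether it contains connection record information corresponding to the five-tuple information of the service data packet. If so, the connection corresponding to the data stream is an existing connection; if not, it is a newly established connection.
If the flow table lookup shows that the connection carrying the data packet is an existing connection, the data packet can be processed directly, for example by applying traffic control, according to the protocol type identification result and service processing method that the flow table stores for the packet's five-tuple information. Note that even if the connection carrying the data packet is an existing connection, step 102 may still be performed, that is, protocol identification may still be performed on the connection carrying the data packet.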
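The flow table check in step 101 can be sketched as a dictionary lookup keyed by the five-tuple. This is a minimal sketch under stated assumptions: `FlowRecord`, `flow_table`, and `handle_packet` are hypothetical names, and the key is used as-is, whereas a real device would likely normalize both directions of a connection to one key.

```python
from typing import Dict, NamedTuple, Optional, Tuple

# (src IP, src port, dst IP, dst port, transport-layer protocol number)
FlowKey = Tuple[str, int, str, int, int]


class FlowRecord(NamedTuple):
    protocol: Optional[str]  # stored identification result, if any
    policy: str              # stored service processing method, e.g. "rate-limit"


# Hypothetical in-memory flow table.
flow_table: Dict[FlowKey, FlowRecord] = {}


def handle_packet(key: FlowKey) -> str:
    """Return the stored policy for an existing connection, or "identify"
    when the five-tuple is unknown and the connection is therefore new."""
    record = flow_table.get(key)
    if record is None:
        return "identify"  # new connection: continue with protocol identification
    return record.policy   # existing connection: reuse the stored handling
```

A packet on an unknown five-tuple yields `"identify"`; once a record is stored for that key, later packets receive the stored policy directly.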
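Step 102: Search the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal.
Searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal includes: searching the user multi-dimensional information table, according to the user terminal address information in the data packet, for user multi-dimensional information corresponding to that user terminal address information.
The user multi-dimensional information includes one or any combination of the following: address pair information of the user terminal's existing connections, user terminal address information of the user terminal's existing connections, address information of servers that the user terminal has previously accessed, the user's protocol list information, and behavior feature information of the user terminal's existing connections. The user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the user terminal's existing connections.
Specifically, the user multi-dimensional information table may include the correspondence between the address information of servers that the user terminal has previously accessed and the protocol types of the user terminal's existing connections, and/or the correspondence between the source IP address information and destination IP address information of the user terminal's existing connections and the protocol types of those connections, and/or the correspondence between the user terminal address information of the existing connections and the protocol types of those connections, and/or the correspondence between the behavior feature information of the existing connections and the protocol types of those connections.
Specifically, in the embodiments of the present invention, the address pair information of an existing connection of the user terminal is the address pair formed by the source IP address and destination IP address of that connection; the user terminal address information of an existing connection consists of the IP address and port number of the user terminal for that connection; the address information of a server that the user terminal has previously accessed consists of the IP address and port number of that server; the user's protocol list stores records of the protocols the user commonly uses; and the behavior feature information of the user terminal's existing connections includes the protocol features corresponding to the user's common protocol types and the user's behavior statistics.
After determining that the user multi-dimensional information table contains user multi-dimensional information for the user terminal of the connection carrying the data packet, the network device performs step 103.
After searching the user multi-dimensional information table, if the network device finds no user multi-dimensional information corresponding to the user terminal, it adds the user multi-dimensional information corresponding to the user terminal to the table. Preferably, the user's user multi-dimensional information may be added to the table after the protocol type of the connection carrying the data packet has been successfully identified.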
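One possible in-memory layout for the user multi-dimensional information table is sketched below. The field names, the choice of the user IP address as the table key, and the `lookup_or_add` helper are assumptions made for illustration; the description only lists which kinds of information the table holds.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class UserRecord:
    """Hypothetical per-user record; each mapping stores a dimension together
    with the protocol type of the existing connection it was learned from."""
    addr_pairs: Dict[Tuple[str, str], str] = field(default_factory=dict)  # (src IP, dst IP) -> protocol
    user_endpoints: Dict[Endpoint, str] = field(default_factory=dict)     # user IP:port -> protocol
    server_endpoints: Dict[Endpoint, str] = field(default_factory=dict)   # server IP:port -> protocol
    protocol_list: List[str] = field(default_factory=list)                # commonly used protocols
    behavior_features: Dict[str, dict] = field(default_factory=dict)      # protocol -> behavior features


# Hypothetical table keyed by the user terminal's IP address.
user_table: Dict[str, UserRecord] = {}


def lookup_or_add(user_ip: str) -> Tuple[UserRecord, bool]:
    """Return (record, found). When no record exists, an empty one is added,
    mirroring the 'add the user multi-dimensional information' branch."""
    record = user_table.get(user_ip)
    if record is not None:
        return record, True
    record = UserRecord()
    user_table[user_ip] = record
    return record, False
```

The first lookup for a user returns `found == False` and creates an empty record; later lookups return the same record, so results stored in it are visible to subsequent connections of that user.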
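Step 103: If the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information.
For example, if a packet is inspected and found to be an HTTP message, the connection/data stream to which the packet belongs is considered to use HTTP, and all data packets in that connection/data stream are HTTP packets. Therefore, in the embodiments of the present invention, identifying a protocol type based on user multi-dimensional information means identifying the protocol type of the received data packet based on information about the user terminal's existing connections.
Protocol identification based on user multi-dimensional information comprises several independent identification methods, which need not be applied in any fixed order. Each independent method identifies the protocol based on one dimension of the user multi-dimensional information, for example: protocol identification based on server address information, protocol identification based on address pair information of the user terminal's existing connections, protocol identification based on user terminal address information of the user terminal's existing connections, feature identification based on the user terminal's existing connections, and behavior identification based on the user terminal's existing connections.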
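Specifically, performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information, includes: determining whether the server address information in the five-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has previously accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table matches that of the data packet; or determining whether the source IP address information and destination IP address information in the five-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, further determining whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table all match those of the data packet; or determining whether the user terminal address information in the five-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table matches that of the data packet; or determining whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table matches that of the data packet.
If the network device identifies the protocol type of the data stream based on the user multi-dimensional information, it updates the identification result for the connection in the user multi-dimensional information table.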
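Step 104: If no user multi-dimensional information corresponding to the user terminal is found, perform data-stream-based protocol type identification on the connection carrying the data packet according to the packet features of the data packet.
Data-stream-based protocol identification means identifying the protocol type used by a data stream/connection by inspecting one or more data packets within that data stream. If the network device fails to identify the protocol type of the data stream based on the user multi-dimensional information, or if the user multi-dimensional information table contains no multi-dimensional information for the user, protocol identification is performed based on the data stream. Stream-based protocol identification methods include association identification, port identification, feature identification, and behavior identification. After successful identification, the identification result for the connection in the user multi-dimensional information table is likewise updated; if identification fails, a result indicating unsuccessful identification is output.
Thus, the method and apparatus for identifying a protocol type provided by the embodiments of the present invention perform protocol type identification based on user multi-dimensional information on received data packets, according to the protocol types of the user terminal's existing connections, which enables service control on a per-user basis. Moreover, combining protocol type identification based on user multi-dimensional information with protocol type identification based on data streams can improve the identification accuracy of the DPI system and protocol identification performance.
Note that, to implement the protocol identification function, a DPI system may be deployed in the network device; when the network device receives a data packet, the DPI system performs the corresponding packet protocol identification.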
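Specifically, FIG. 3 is a block diagram of a DPI system according to an embodiment of the present invention. As shown in the figure, the DPI system includes a flow table 301, a user connection management module 303, a user multi-dimensional information table 302, a protocol identification module 304, and a service processing module 307. The protocol identification module includes a protocol identification submodule 305 based on user multi-dimensional information and a protocol identification submodule 306 based on data streams. The submodule based on user multi-dimensional information can use several independent identification methods to identify the packet protocol type, such as identification based on server address information, identification based on address pair information of existing connections, identification based on user terminal address information of existing connections, feature identification based on the user terminal, and behavior identification based on the user terminal; these independent methods can also be used in combination. The data-stream-based submodule can likewise use several independent methods to identify the packet's protocol type, such as association identification, port identification, feature identification, and behavior identification. When the DPI system runs, it first searches the flow table to determine whether the connection is newly established and then enters the user connection management module, which searches the user multi-dimensional information table for a user record to which the new connection belongs. If such a record exists, protocol identification is performed based on the user multi-dimensional information in the table. If that identification succeeds, the user multi-dimensional information table is updated and the identification result is output to the service processing module; otherwise, the stream-based protocol identification module continues the identification. If stream-based identification succeeds, the user multi-dimensional information table is updated and the identification result is output to service processing.
The foregoing embodiment briefly describes how the DPI system performs protocol identification. The protocol identification process is described below through a detailed embodiment.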
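FIG. 2 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention. This embodiment is executed by a network device, such as a service gateway or a router, and describes in detail how the network device performs protocol identification on a received packet. As shown in the figure, this embodiment includes the following steps:
Step 201: Receive a data packet.
Step 202: Determine whether the connection carrying the data packet is a newly established connection.
The network device parses the received data packet and obtains the corresponding five-tuple information from the packet header. The five-tuple information includes the packet's destination IP address, destination port number, source IP address, source port number, and transport-layer protocol number.
Specifically, the flow table may be searched for connection record information corresponding to the five-tuple information. The flow table stores record information for connections that the DPI system has inspected and may include the five-tuple information, the identification result for the corresponding connection, the corresponding service control policy, and so on.
If the flow table stores the five-tuple information corresponding to the received data packet, the corresponding connection is an existing connection; otherwise, it is a newly established connection. If the connection is determined to be newly established, step 203 is performed.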
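Step 203: Determine whether the user multi-dimensional information table contains user multi-dimensional information corresponding to the new connection.
Specifically, the user multi-dimensional information table may be queried to determine whether it contains user multi-dimensional information corresponding to the user's address information in the five-tuple information. If so, the table contains user multi-dimensional information for the user terminal of the new connection; if not, it does not. The address information of the user terminal is the IP address information of the user terminal device, or its IP address information and port information.
The user multi-dimensional information includes one or any combination of the following: address pair information of the user terminal's existing connections, user terminal address information of the user terminal's existing connections, address information of servers that the user terminal has previously accessed, the user's protocol list information, and behavior feature information of the user terminal's existing connections. In addition to the user multi-dimensional information, the user multi-dimensional information table includes the correspondence between the user multi-dimensional information and the protocol types of the user terminal's existing connections.
If the user multi-dimensional information table contains user multi-dimensional information for the user terminal of the new connection, step 204 is performed; otherwise, step 205 is performed.
Step 204: Perform protocol identification based on the user multi-dimensional information.
Protocol identification based on user multi-dimensional information comprises several independent identification methods, which need not be applied in any fixed order. Each independent method identifies the protocol based on one dimension of the user multi-dimensional information, for example: protocol identification based on server address information, protocol identification based on address pair information of the user terminal's existing connections, protocol identification based on user terminal address information of the user terminal's existing connections, feature identification based on the user terminal's existing connections, and behavior identification based on the user terminal's existing connections.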
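Specifically, protocol identification based on server address information works as follows: if a user initiates a connection to a server port, then connections that the user subsequently initiates to the same server port are certain to have the same protocol type as the first connection. For example, if the user accesses a server (for example, 1.2.3.4:80) over HTTP, all of that user's subsequent connections to that server (1.2.3.4:80) also use HTTP.
Specifically, protocol identification based on address pair information of the user terminal's existing connections works as follows: if a user initiates a connection to a server, then connections that the user subsequently initiates to the same server IP address may have the same protocol type as the first connection. This method finds, among the user's historical connections, a connection with the same IP address pair (destination IP address, source IP address) as the new connection, and then uses a simple check (for example, confirming a simple signature string) to confirm whether the new connection has the same protocol type as the historical connection.
Specifically, protocol identification based on user terminal address information of the user terminal's existing connections works as follows: if a user initiates multiple connections from the same (IP:Port) to one or more destination addresses, the connections that share the same user terminal (IP:Port) have the same protocol type. This method finds, among the user's historical connections, a connection with the same user terminal address (IP:Port) as the new connection, which confirms that the new connection has the same protocol type as the historical connection.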
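The three address-based methods can be sketched as lookups into per-user history mappings. This is an illustrative sketch only: the history dictionaries, the `SIGNATURES` table, and `confirm_signature` are hypothetical, and the prefix check stands in for whatever "simple signature confirmation" a real DPI system would apply.

```python
from typing import Callable, Dict, Optional, Tuple

Endpoint = Tuple[str, int]    # (IP address, port)
AddrPair = Tuple[str, str]    # (source IP, destination IP)


def by_server_endpoint(history: Dict[Endpoint, str], server: Endpoint) -> Optional[str]:
    # Same server IP:port as an earlier connection -> same protocol type.
    return history.get(server)


def by_addr_pair(history: Dict[AddrPair, str], pair: AddrPair, payload: bytes,
                 confirm: Callable[[str, bytes], bool]) -> Optional[str]:
    # Same IP address pair only suggests the same protocol, so the candidate
    # is accepted only after a lightweight confirmation on the payload.
    candidate = history.get(pair)
    if candidate is not None and confirm(candidate, payload):
        return candidate
    return None


def by_user_endpoint(history: Dict[Endpoint, str], user: Endpoint) -> Optional[str]:
    # Same user terminal IP:port as an earlier connection -> same protocol type.
    return history.get(user)


# Hypothetical signature prefixes used only for the confirmation step.
SIGNATURES = {"HTTP": (b"GET ", b"POST ", b"HTTP/")}


def confirm_signature(protocol: str, payload: bytes) -> bool:
    return payload.startswith(SIGNATURES.get(protocol, ()))
```

The server endpoint and user endpoint lookups need only header fields, whereas the address pair lookup also inspects the start of the payload, which reflects the weaker guarantee the description gives for that method.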
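Specifically, user-based feature identification works as follows: a list of the protocols commonly used by each user is recorded per user. The sources of the protocol list include the protocols the user has previously used and a preconfigured protocol list (for example, popular protocols and applications in the user's region). During identification, this method identifies the protocol by scanning for the protocol features of the protocols in the user's common protocol list.
Specifically, user-based behavior identification works as follows: the user behavior statistics of the user's packets are compared with the user behavior feature set, and a match confirms the protocol to which the current packet belongs. The user behavior statistics cover dimensions such as the statistical distribution of binary values in packets, port range, packet length statistics (packet length range, packet length sequence, packet length set, average packet length, and the sum of packet lengths for each uplink/downlink exchange), packet sending frequency, the ratio of packets sent to packets received, and the dispersion of destination addresses. The user behavior feature set is stored in the user record; its initial content is preconfigured user behavior features, and during identification it is enriched and updated with the behavior statistics of the user's historical connections.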
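Behavior identification can be sketched as computing a few statistics and checking them against per-protocol ranges. This is a deliberately reduced sketch: only packet length and send/receive ratio are modeled, and the range-based feature set, the statistic names, and the `voip-like` label are assumptions, not the feature format used by the invention.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

Range = Tuple[float, float]  # inclusive (low, high) bounds


def behavior_stats(uplink_lengths: List[int], downlink_lengths: List[int]) -> Dict[str, float]:
    """Compute a small subset of the behavior statistics listed above."""
    all_lengths = uplink_lengths + downlink_lengths
    return {
        "mean_len": mean(all_lengths),
        "min_len": min(all_lengths),
        "max_len": max(all_lengths),
        # Ratio of packets sent to packets received (guarding against zero).
        "up_down_ratio": len(uplink_lengths) / max(len(downlink_lengths), 1),
    }


def match_behavior(stats: Dict[str, float],
                   feature_set: Dict[str, Dict[str, Range]]) -> Optional[str]:
    """Return the first protocol whose every range contains the observed value."""
    for protocol, ranges in feature_set.items():
        if all(low <= stats[name] <= high for name, (low, high) in ranges.items()):
            return protocol
    return None


# Hypothetical preconfigured behavior feature set.
features = {"voip-like": {"mean_len": (100, 250), "up_down_ratio": (0.8, 1.25)}}
```

A stream of similarly sized packets in both directions falls inside the hypothetical `voip-like` ranges, while a stream of a few large uplink packets and many small downlink packets matches nothing and returns `None`.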
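If identification succeeds, step 206 is performed; otherwise, step 209 is performed.
Step 205: Add user multi-dimensional information corresponding to the new user terminal to the user multi-dimensional information table.
If the user multi-dimensional information table contains no user multi-dimensional information for the new user, a corresponding record is added to the table. After the record is added, step 209 is performed, that is, protocol identification is performed on the data packet based on the data stream.
Step 206: If identification succeeds, determine whether the successfully identified protocol packet contains traffic that cannot be identified by feature identification.
If the packet protocol type is successfully identified based on the user multi-dimensional information, it is further determined whether the successfully identified packet contains traffic that cannot be identified by feature identification. For example, suppose the first connection established by the user is an encrypted connection that cannot be identified by feature identification and is instead identified by the "behavior identification" method of "data-stream-based protocol identification"; the DPI then records information such as the IP address and port of that encrypted connection. When the user establishes a second, identical encrypted connection, the DPI can identify it using one of the five methods of the invention. This triggers the determination, and the DPI uses the behavior statistics of the second encrypted connection to update the behavior features of the corresponding protocol.
If so, step 207 is performed; otherwise, step 208 is performed.
Step 207: Collect behavior identification statistics based on the user terminal, and update the user behavior feature information of the user terminal's existing connections in the user multi-dimensional information table.
Because such a connection would originally have to be identified by its behavior features, if it cannot be identified by behavior features, the connection can serve as sample data for the behavior features of the corresponding protocol, helping to improve and refine those features.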
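Step 208: Update the identification result data for the connection in the user multi-dimensional information table.
Whether the stream-based or the user-based protocol identification method is used, if the protocol type of the connection is successfully identified, the corresponding identification result data in the user multi-dimensional information table must be updated. Optionally, the protocol identification result and service control policy for the data stream in the flow table may also be updated.
Step 209: If identification fails, perform protocol identification based on the data stream.
If user-based protocol identification fails, identification is performed based on the data stream. Stream-based protocol identification methods include association identification, port identification, feature identification, and behavior identification. If stream-based protocol identification succeeds, step 208 is performed; otherwise, step 210 is performed.
Step 210: Output the identification result.
Regardless of whether identification succeeds, the identification result can be output so that the corresponding service control can be performed according to it.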
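The control flow of steps 203 to 210 can be sketched as one function with pluggable identifiers. This is a simplified sketch that assumes the packet is already known to be on a new connection (steps 201 and 202); the dictionary-based packet and record layout, the callback signatures, and the `behavior_samples` and `results` keys are hypothetical.

```python
from typing import Callable, Dict, Optional

UserIdentifier = Callable[[dict, dict], Optional[str]]   # (packet, user record) -> protocol
StreamIdentifier = Callable[[dict], Optional[str]]       # packet -> protocol


def identify_fig2(packet: dict, user_table: Dict[str, dict],
                  by_user_info: UserIdentifier, by_stream: StreamIdentifier,
                  needs_behavior_stats: Callable[[dict], bool]) -> Optional[str]:
    user_ip = packet["user_ip"]
    record = user_table.get(user_ip)                       # step 203: look up the user
    protocol = None
    if record is not None:
        protocol = by_user_info(packet, record)            # step 204: user-based identification
        if protocol is not None and needs_behavior_stats(packet):                # step 206
            record.setdefault("behavior_samples", []).append(packet["length"])   # step 207
    else:
        record = user_table.setdefault(user_ip, {})        # step 205: add the new user
    if protocol is None:
        protocol = by_stream(packet)                       # step 209: stream-based fallback
    if protocol is not None:
        record.setdefault("results", {})[packet["server"]] = protocol            # step 208
    return protocol                                        # step 210: output (None = unidentified)
```

With this arrangement, the first connection of a user is identified by the stream-based callback and stored; a later connection of the same user to the same server can then be resolved from the stored result without scanning its payload.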
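Thus, the method and apparatus for identifying a protocol type provided by the embodiments of the present invention perform protocol type identification based on user multi-dimensional information on received data packets, according to the protocol types of the user terminal's existing connections, which enables service control on a per-user basis. Moreover, combining protocol type identification based on user multi-dimensional information with protocol type identification based on data streams can improve the identification accuracy of the DPI system and protocol identification performance.
An embodiment of the present invention further provides a method for identifying a protocol type. FIG. 4 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention. This embodiment is executed by a network device, such as a service gateway or a router, and describes in detail how the network device performs user-based protocol type identification on a received data packet. As shown in the figure, this embodiment includes the following steps:
Step 401: Acquire a data packet transmitted on a connection established between a user terminal and a server.
After receiving a data packet of a data stream, the network device parses the packet and obtains the corresponding five-tuple information from the packet header. The five-tuple information includes the packet's destination IP address, destination port number, source IP address, source port number, and transport-layer protocol number (for example, the TCP protocol number). The network device then determines, according to the five-tuple information, whether the connection corresponding to the data stream is a newly established connection.
Preferably, after receiving the data packet, the network device may query the flow table to determine whether it contains connection record information corresponding to the five-tuple information of the service data packet. If so, the connection corresponding to the data stream is an existing connection; if not, it is a newly established connection.
If the flow table lookup shows that the connection corresponding to the data stream carrying the data packet is an existing connection, the data packet is processed directly, for example by applying traffic control, according to the protocol type identification result and service processing method that the flow table stores for the packet's five-tuple information. Note that even if the connection carrying the data packet is an existing connection, step 402 may still be performed, that is, protocol identification may still be performed on the connection carrying the data packet.
Step 402: Perform data-stream-based protocol type identification on the connection carrying the data packet according to the packet features of the data packet.
Data-stream-based protocol identification methods include association identification, port identification, feature identification, and behavior identification. After successful identification, the identification result for the connection in the user multi-dimensional information table is likewise updated; if identification fails, a result indicating unsuccessful identification is output.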
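Step 403: If the data-stream-based identification fails, search the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal.
Searching the user multi-dimensional information table for the user multi-dimensional information corresponding to the user terminal includes: searching the user multi-dimensional information table, according to the user terminal address information in the data packet, for user multi-dimensional information corresponding to that user terminal address information.
The user multi-dimensional information includes one or any combination of the following: address pair information of the user terminal's existing connections, user terminal address information of the user terminal's existing connections, address information of servers that the user terminal has previously accessed, the user's protocol list information, and behavior feature information of the user terminal's existing connections. The user multi-dimensional information table includes the user multi-dimensional information and the correspondence between the user multi-dimensional information and the protocol types of the user terminal's existing connections.
Specifically, in the embodiments of the present invention, the address pair information of an existing connection of the user terminal is the address pair formed by the source IP address and destination IP address of that connection; the user terminal address information of an existing connection consists of the IP address and port number of the user terminal for that connection; the address information of a server that the user terminal has previously accessed consists of the IP address and port number of that server; the user's protocol list stores records of the protocols the user commonly uses; and the behavior feature information of the user terminal's existing connections includes the protocol features corresponding to the user's common protocol types and the user's behavior statistics.
After determining that the user multi-dimensional information table contains user multi-dimensional information for the user terminal of the connection carrying the data packet, the network device performs step 404.
After searching the user multi-dimensional information table, if the network device finds no user multi-dimensional information corresponding to the user terminal, it adds the user multi-dimensional information corresponding to the user terminal to the table. Preferably, the user's user multi-dimensional information may be added to the table after the protocol type of the connection carrying the data packet has been successfully identified.
Step 404: If the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information.
Protocol identification based on user multi-dimensional information comprises several independent identification methods, which need not be applied in any fixed order. Each independent method identifies the protocol based on one dimension of the user multi-dimensional information, for example: protocol identification based on server address information, protocol identification based on address pair information of the user terminal's existing connections, protocol identification based on user terminal address information of the user terminal's existing connections, feature identification based on the user terminal's existing connections, and behavior identification based on the user terminal's existing connections.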
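Specifically, performing protocol type identification based on the user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal as identified by the acquired user multi-dimensional information, includes: determining whether the server address information in the five-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has previously accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table matches that of the data packet; or determining whether the source IP address information and destination IP address information in the five-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, further determining whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table all match those of the data packet; or determining whether the user terminal address information in the five-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table matches that of the data packet; or determining whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the user terminal's existing connections, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table matches that of the data packet.
If the network device identifies the protocol type of the data stream based on the user multi-dimensional information, it updates the identification result for the connection in the user multi-dimensional information table.
Thus, the embodiments of the present invention implement protocol identification based on user multi-dimensional information, which in turn enables service control on a per-user basis and can improve the identification accuracy of the DPI system and protocol identification performance.
The foregoing embodiment corresponding to FIG. 4 briefly describes how the DPI system performs protocol identification. The protocol identification process is described below through a detailed embodiment.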
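FIG. 5 is a flowchart of another method for identifying a protocol type according to an embodiment of the present invention. This embodiment is executed by a network device, such as a service gateway or a router, and describes in detail how the network device performs protocol identification on a received packet. As shown in the figure, this embodiment includes the following steps:
Step 501: Receive a data packet.
Step 502: Determine whether the connection carrying the data packet is a newly established connection.
The network device parses the received data packet and obtains the corresponding five-tuple information from the packet header. The five-tuple information includes the packet's destination IP address, destination port number, source IP address, source port number, and transport-layer protocol number (for example, the TCP protocol number).
Specifically, the flow table may be searched for connection record information corresponding to the five-tuple information. The flow table stores record information for the connections of data streams that the DPI system has inspected and may include the five-tuple information, the identification result for the corresponding connection, the corresponding service control policy, and so on.
If the flow table stores the five-tuple information corresponding to the received data packet, the corresponding connection is an existing connection; otherwise, it is a newly established connection. If the connection is determined to be newly established, step 503 is performed.
Step 503: Perform data-stream-based protocol type identification on the connection carrying the data packet.
Data-stream-based protocol identification methods include association identification, port identification, feature identification, and behavior identification. After successful identification, the identification result for the connection in the user multi-dimensional information table is likewise updated; if identification fails, a result indicating unsuccessful identification is output.
Step 504: If the data-stream-based identification succeeds, perform the corresponding service processing on the service packet.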
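Step 505: If the data-stream-based identification fails, determine whether the user multi-dimensional information table contains user multi-dimensional information corresponding to the connection carrying the data packet.
Determining whether the user multi-dimensional information table contains user multi-dimensional information corresponding to the connection carrying the data packet includes: determining, according to the user terminal address information in the five-tuple information of the data packet, whether the user multi-dimensional information table contains user multi-dimensional information corresponding to that user terminal address information.
The user multi-dimensional information includes one or any combination of the following: address pair information of the user terminal's existing connections, user terminal address information of the user terminal's existing connections, address information of servers that the user terminal has previously accessed, the user's protocol list information, and behavior feature information of the user terminal's existing connections.
Specifically, in the embodiments of the present invention, the address pair information of an existing connection of the user terminal is the address pair formed by the source IP address and destination IP address of that connection; the user terminal address information of an existing connection consists of the IP address and port number of the user terminal for that connection; the address information of a server that the user terminal has previously accessed consists of the IP address and port number of that server; the user's protocol list stores records of the protocols the user commonly uses; and the behavior feature information of the user terminal's existing connections includes the protocol features corresponding to the user's common protocol types and the user's behavior statistics.
After determining that the user multi-dimensional information table contains no user multi-dimensional information for the user terminal of the connection carrying the data packet, the network device may add the user's user multi-dimensional information to the table.
If the user multi-dimensional information table contains user multi-dimensional information for the user terminal of the new connection, step 506 is performed; otherwise, step 507 is performed.
Step 506: Perform protocol identification based on the user multi-dimensional information.
Protocol identification based on user multi-dimensional information comprises several independent identification methods, which need not be applied in any fixed order. Each independent method identifies the protocol based on one dimension of the user multi-dimensional information, for example: protocol identification based on server address information, protocol identification based on address pair information of the user terminal's existing connections, protocol identification based on user terminal address information of the user terminal's existing connections, feature identification based on the user terminal's existing connections, and behavior identification based on the user terminal's existing connections.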
具体地,基于服务器地址信息进行的协议识别方法具体为:如果一个用户向一个服务器端口发起连接,那么该用户后续向相同的服务器端口发起的连接的协议类型和第一个连接的协议类型肯定是相同的。例如用户用HTTP协议访问了某服务器(如:1.2.3.4:80),那么该用户后续访问该服务器(1.2.3.4:80)的所有连接的协议类型也都是HTTP的。
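Specifically, the protocol identification method based on address pair information of the existing connections of the user terminal works as follows: if a user initiates a connection to a server, the protocol type of a connection that the user subsequently initiates to the same server IP address is likely to be the same as that of the first connection. This method finds, among the user's historical connections, a connection with the same IP address pair (destination IP address, source IP address) as the newly established connection, and then uses a simple check (for example, confirming a simple signature keyword) to confirm whether the protocol type of the new connection is the same as that of the historical connection.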
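Because an address-pair match is only a likely hit, it has to be followed by a cheap confirmation. The sketch below assumes the confirmation is a payload-prefix check; the `SIGNATURES` table, the candidate-list layout of `pair_history`, and the function name are illustrative assumptions, not part of the embodiment.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative signature keywords: HTTP request/response starts, SSH banner start.
SIGNATURES: Dict[str, Tuple[bytes, ...]] = {
    "HTTP": (b"GET ", b"POST ", b"HEAD ", b"HTTP/1."),
    "SSH": (b"SSH-",),
}


def identify_by_address_pair(pair_history: Dict[Tuple[str, str], List[str]],
                             src_ip: str, dst_ip: str,
                             payload: bytes) -> Optional[str]:
    """Confirm a candidate protocol taken from the user's history for this IP pair."""
    for candidate in pair_history.get((src_ip, dst_ip), []):
        prefixes = SIGNATURES.get(candidate, ())
        # Only the candidate's own keywords are checked, not every known protocol.
        if prefixes and payload.startswith(prefixes):
            return candidate
    return None
```

The design point is that the history narrows the check to one or two candidate protocols, so the confirmation stays much cheaper than a full signature scan.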
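Specifically, the protocol identification method based on user terminal address information of the existing connections of the user terminal works as follows: if a user initiates multiple connections from the same (IP:Port) to one or more destination addresses, these connections sharing the same user terminal (IP:Port) have the same protocol type. This method finds, among the user's historical connections, a connection with the same user terminal address (IP:Port) as the newly established connection, which is sufficient to confirm that the new connection has the same protocol type as the historical connection.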
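A short sketch of the terminal-address method follows. The scenario, a peer-to-peer client that reuses one local port toward many peers, and the names `terminal_history` and `identify_by_terminal_endpoint` are assumptions chosen for illustration.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (terminal IP, terminal port)


def identify_by_terminal_endpoint(terminal_history: Dict[Endpoint, str],
                                  src_ip: str, src_port: int) -> Optional[str]:
    """Return the protocol already associated with this terminal (IP:Port)."""
    return terminal_history.get((src_ip, src_port))


# Hypothetical history: the first connection from 10.0.0.5:6881 was identified
# as "P2P" by some other method; later connections from the same local endpoint
# inherit that result regardless of their destination.
terminal_history = {("10.0.0.5", 6881): "P2P"}
result = identify_by_terminal_endpoint(terminal_history, "10.0.0.5", 6881)
```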
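Specifically, the user-based feature identification method works as follows: a list of commonly used protocols is recorded per user. The sources of the protocol list include protocols that the user has previously used and a preconfigured protocol list (for example, popular protocol applications in the user's region). During identification, the method identifies the protocols in the user's common protocol list by scanning for their protocol features.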
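The per-user feature scan can be illustrated as below. How the two sources of the list are merged and what a "protocol feature" looks like are not specified in the text, so the ordering rule (previously used protocols first) and the byte-pattern signatures are assumptions of this sketch.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def build_protocol_list(used: Iterable[str], preconfigured: Iterable[str]) -> List[str]:
    """Merge previously used and preconfigured protocols, keeping first-seen order."""
    ordered: List[str] = []
    for name in list(used) + list(preconfigured):
        if name not in ordered:
            ordered.append(name)
    return ordered


def identify_by_user_signatures(protocol_list: List[str],
                                signatures: Dict[str, Tuple[bytes, ...]],
                                payload: bytes) -> Optional[str]:
    """Scan the payload only for the features of protocols in the user's list."""
    for name in protocol_list:
        if any(pattern in payload for pattern in signatures.get(name, ())):
            return name
    return None
```

Protocols that are absent from the user's list are never scanned, which is where the saving over a full signature library comes from.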
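Specifically, the user-based behavior identification method works as follows: the user behavior statistics of the user's packets are compared against a user behavior feature set, and a match confirms the protocol to which the current packet belongs. The user behavior statistics cover dimensions such as the statistical distribution of binary values in packets, port range, packet length statistics (packet length range, packet length sequence, packet length set, average packet length, and the sum of packet lengths for each uplink/downlink interaction), packet sending frequency, packet send/receive ratio, and the degree of dispersion of destination addresses. The user behavior feature set is stored in the user record; its initial content is preconfigured user behavior features, and it is enriched and updated during identification according to the behavior statistics of the user's historical connections.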
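A few of the listed statistics and a range-based match can be sketched as follows. The metric names, the packet representation, and the idea of encoding a behavior feature as a (low, high) range per metric are all assumptions of this example; the text does not define the matching rule.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

Packet = Tuple[int, bool, str]           # (length in bytes, is_uplink, destination IP)
Ranges = Dict[str, Tuple[float, float]]  # metric name -> (low, high)


def behavior_stats(packets: List[Packet]) -> Dict[str, float]:
    """Compute a small subset of the behavior statistics (packets must be non-empty)."""
    lengths = [length for length, _, _ in packets]
    uplink = sum(1 for _, is_up, _ in packets if is_up)
    return {
        "mean_len": mean(lengths),
        "min_len": min(lengths),
        "max_len": max(lengths),
        "uplink_ratio": uplink / len(packets),
        "dst_spread": len({dst for _, _, dst in packets}),
    }


def match_behavior(stats: Dict[str, float],
                   feature_set: Dict[str, Ranges]) -> Optional[str]:
    """Return the first protocol whose every range contains the observed value."""
    for protocol, ranges in feature_set.items():
        if ranges and all(metric in stats and low <= stats[metric] <= high
                          for metric, (low, high) in ranges.items()):
            return protocol
    return None
```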
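If identification succeeds, step 508 is performed; otherwise, step 511 is performed to output the identification result.
Step 507: Add user multi-dimensional information corresponding to the new user terminal to the user multi-dimensional information table.
If the user connection data table contains no user multi-dimensional information for the new user, a corresponding record is added to the user connection data table. After the record is added, step 509 is performed, that is, protocol identification is performed on the data packet based on the data flow.
Step 508: If identification succeeds, determine whether the successfully identified protocol packet contains traffic that cannot be identified by the feature identification method.
If the protocol type of the packet is successfully identified based on the user multi-dimensional information, it is further determined whether the successfully identified packet contains traffic that cannot be identified by the feature identification method. For example, suppose the first connection established by a user is an encrypted connection that cannot be identified by feature identification and is instead identified by the "behavior identification" method of "flow-based protocol identification"; the DPI then records the IP address, port, and other information of that encrypted connection. When the user establishes a second, identical encrypted connection, the DPI can identify it using one of the five methods of the invention. This triggers the determination, and the DPI uses the behavior statistics of the second encrypted connection to update the behavior features of the corresponding protocol.
If so, step 509 is performed; otherwise, step 510 is performed.
Step 509: Collect user-based behavior identification statistics, and update the user behavior feature information of the existing connections of the user terminal in the user multi-dimensional information table.
Such a connection would originally have had to be identified through behavior features. If it cannot be identified through behavior features, the connection can serve as a sample for the behavior features of the corresponding protocol, helping to improve and refine those behavior features.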
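The text does not say how a sample refines the stored behavior features. One simple possibility, shown purely as an assumption, is to keep a (low, high) range per metric and widen it so that it covers each newly confirmed sample; `refine_feature_set` and the metric names are hypothetical.

```python
from typing import Dict, Tuple

Ranges = Dict[str, Tuple[float, float]]  # metric name -> (low, high)


def refine_feature_set(feature_set: Dict[str, Ranges], protocol: str,
                       sample: Dict[str, float]) -> Dict[str, Ranges]:
    """Widen the protocol's per-metric ranges so that they cover the new sample."""
    ranges = feature_set.setdefault(protocol, {})
    for metric, value in sample.items():
        low, high = ranges.get(metric, (value, value))
        ranges[metric] = (min(low, value), max(high, value))
    return feature_set
```

A real system would likely need outlier handling as well, since blindly widening ranges makes the features progressively less selective.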
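Step 510: Update the identification result data for the connection in the user multi-dimensional information table.
Whether the flow-based or the user-based protocol identification method is used, if the protocol type of the connection is successfully identified, the corresponding identification result data in the user multi-dimensional information table needs to be updated. Optionally, the protocol identification result and service control policy of the data flow in the flow table may also be updated.
Step 511: Output the identification result.
The identification result may be output regardless of whether identification succeeds, so that corresponding service control can be performed according to the result.
In this way, this embodiment of the present invention implements user-based protocol identification, which also enables service control on a per-user basis. Because protocol identification based on user multi-dimensional information can rely solely on the IP address and port of a packet, without deep scanning of packet content, it can significantly improve protocol identification performance.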
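The control flow of steps 501 to 511 can be summarized in a compact sketch. Several simplifications are assumptions of this example: the terminal is taken to be the packet source, both tables are plain dictionaries, the flow-based and user-based identifiers are passed in as callables, an existing connection simply reuses its stored result, the behavior-statistics branch (steps 508 and 509) is omitted, and a user that is newly added in step 507 simply yields an unidentified result.

```python
from typing import Callable, Dict, List, Optional, Tuple

Key = Tuple[str, int, str, int, int]  # (src_ip, src_port, dst_ip, dst_port, protocol number)
FlowIdentify = Callable[[Key, bytes], Optional[str]]
UserIdentify = Callable[[dict, Key, bytes], Optional[str]]


def process_packet(key: Key, payload: bytes,
                   flow_table: Dict[Key, Optional[str]],
                   user_table: Dict[str, dict],
                   flow_identify: FlowIdentify,
                   user_identifiers: List[UserIdentify]) -> Optional[str]:
    # Step 502: an existing connection reuses the result stored in the flow table.
    if key in flow_table:
        return flow_table[key]
    flow_table[key] = None
    terminal_ip = key[0]  # assumption: the packet travels from terminal to server

    # Step 503: data-flow-based identification comes first.
    result = flow_identify(key, payload)
    if result is None:
        # Step 505: look up the user multi-dimensional information.
        user = user_table.get(terminal_ip)
        if user is None:
            # Step 507: add a record for the new user; nothing to match against yet.
            user_table[terminal_ip] = {"servers": {}}
            return None  # Step 511: output an unidentified result
        # Step 506: try the independent user-based methods in turn.
        for identify in user_identifiers:
            result = identify(user, key, payload)
            if result is not None:
                break

    if result is not None:
        # Step 510: record the result for the user and for the flow.
        user = user_table.setdefault(terminal_ip, {"servers": {}})
        user["servers"][(key[2], key[3])] = result
        flow_table[key] = result
    return result  # Step 511
```

For example, with a flow-based identifier that only recognizes an HTTP request line and a single user-based identifier that looks up the server endpoint, a second connection to the same server is classified even when its first packet carries no recognizable signature.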
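Correspondingly, an embodiment of the present invention further provides a protocol type identification apparatus. Figure 6 is a schematic diagram of a protocol type identification apparatus according to an embodiment of the present invention. As shown in the figure, this embodiment includes the following functional units:
An obtaining unit 601, configured to obtain a data packet transmitted over a connection established between a user terminal and a server.
After receiving a data packet of a data flow, the network device parses the packet and obtains the corresponding 5-tuple information from the packet header. The 5-tuple information includes the destination IP address, destination port number, source IP address, source port number, and protocol number (for example, TCP) of the packet.
Preferably, after receiving the data packet, the network device may query the flow table to determine whether it contains connection record information corresponding to the 5-tuple information of the data packet. If so, the connection of the data flow is determined to be an existing connection; if not, it is determined to be a newly established connection.
If, after the flow table is queried, the connection of the data flow to which the data packet belongs is determined to be an existing connection, the data packet is processed directly (for example, by flow control) according to the protocol type identification result and service processing method stored in the flow table for the 5-tuple information of the data packet. It should be noted that even if the connection carrying the data packet is an existing connection, the searching unit 602 may still go on to perform the determination below, that is, perform the corresponding protocol identification on the connection carrying the data packet.
A searching unit 602, configured to search the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal.
The searching unit 602 is specifically configured to: search, according to the user terminal address information in the data packet, the user multi-dimensional information table for user multi-dimensional information corresponding to that user terminal address information.
The user multi-dimensional information corresponding to the user terminal includes at least one of the following: source IP address information and destination IP address information of the current existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information of servers that the user terminal has accessed, a protocol list of the user terminal, and behavior feature information of the existing connections of the user terminal. The user multi-dimensional information table includes the user multi-dimensional information and a correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
Specifically, the address pair information of an existing connection of the user terminal is the address pair formed by the source IP address and destination IP address of that connection; the user terminal address information of an existing connection consists of the IP address and port number of the user terminal for that connection; the server address information consists of the IP address and port number of a server that the user terminal has accessed; the protocol list of the user stores records of the protocols that the user commonly uses; and the behavior feature information of the existing connections of the user terminal includes the protocol features of the user's common protocol types and the user's behavior statistics.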
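A first processing unit 603, configured to: if the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information.
Protocol identification based on user multi-dimensional information comprises several independent identification methods, which need not be applied in any fixed order. Each independent method identifies the protocol based on one dimension of the user multi-dimensional information, for example: a protocol identification method based on server address information, a protocol identification method based on address pair information of the existing connections of the user terminal, a protocol identification method based on user terminal address information of the existing connections of the user terminal, a feature identification method based on the existing connections of the user terminal, and a behavior identification method based on the existing connections of the user terminal.
The first processing unit 603 is specifically configured to: determine whether the server address information in the 5-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or determine whether the source IP address information and destination IP address information in the 5-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, if so, further determine whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table are all consistent with those of the data packet; or determine whether the user terminal address information in the 5-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or determine whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table is consistent with that of the data packet.
The first processing unit 603 is further configured to: if identification succeeds, update the identification result data in the user multi-dimensional information table and output the identification result, where the identification result data is the identified protocol type of the connection carrying the data packet.
The first processing unit 603 is further configured to: if identification succeeds, further determine whether the data packet is a packet that cannot be identified by features; and if so, collect user-based behavior identification statistics and update the behavior feature information of the existing connections of the user terminal in the user connection data table.
A second processing unit 604, configured to: if no user multi-dimensional information corresponding to the user terminal is found, perform data-flow-based protocol type identification on the connection carrying the data packet according to packet features of the data packet.
The second processing unit 604 is further configured to: if the user multi-dimensional information table contains no user multi-dimensional information corresponding to the user terminal, add user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
The second processing unit 604 is further configured to: if identification succeeds, update the identification result data in the user multi-dimensional information table; otherwise, output the identification result.
If the network device fails to identify the protocol type of the data flow based on the user multi-dimensional information, or if the user multi-dimensional information table contains no multi-dimensional information for the user, protocol identification is performed based on the data flow. Flow-based protocol identification methods include association identification, port identification, feature identification, behavior identification, and other methods. After successful identification, the identification result of the connection in the user multi-dimensional information table is likewise updated; if identification fails, an identification result indicating failure is output.
In this way, the protocol type identification method and apparatus provided in the embodiments of the present invention perform protocol type identification based on user multi-dimensional information on a received data packet according to the protocol types of the existing connections of the user terminal, which enables service control on a per-user basis. By combining protocol type identification based on user multi-dimensional information with data-flow-based protocol type identification, they can also improve the identification accuracy of the DPI system and improve protocol identification performance.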
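Correspondingly, an embodiment of the present invention further provides a protocol type identification apparatus. Figure 7 is a schematic diagram of another protocol type identification apparatus according to an embodiment of the present invention. The apparatus includes:
An obtaining unit 701, configured to obtain a data packet transmitted over a connection established between a user terminal and a server.
After a data packet of a data flow is received, the packet is parsed and the corresponding 5-tuple information is obtained from the packet header. The 5-tuple information includes the destination IP address, destination port number, source IP address, source port number, and protocol number (for example, TCP) of the packet. Whether the connection of the data flow is a newly established connection is then determined according to the 5-tuple information.
Preferably, after the data packet is received, the network device may query the flow table to determine whether it contains connection record information corresponding to the 5-tuple information of the data packet. If so, the connection of the data flow is determined to be an existing connection; if not, it is determined to be a newly established connection.
If, after the flow table is queried, the connection of the data flow to which the data packet belongs is determined to be an existing connection, the data packet is processed directly (for example, by flow control) according to the protocol type identification result and service processing method stored in the flow table for the 5-tuple information of the data packet. It should be noted that even if the connection carrying the data packet is an existing connection, the first processing unit 702 may still go on to perform the related operation, that is, perform data-flow-based protocol identification on the connection carrying the data packet.
A first processing unit 702, configured to perform data-flow-based protocol type identification on the connection carrying the data packet according to packet features of the data packet.
Data-flow-based protocol identification methods include association identification, port identification, feature identification, behavior identification, and other methods. After successful identification, the identification result of the connection in the user multi-dimensional information table is likewise updated; if identification fails, an identification result indicating failure is output.
The first processing unit 702 is further configured to: if data-flow-based identification succeeds, perform the corresponding service processing on the data packet.
A searching unit 703, configured to: if data-flow-based identification fails, search the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal.
The user multi-dimensional information corresponding to the user terminal includes at least one of the following: source IP address information and destination IP address information of the current existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information of servers that the user terminal has accessed, a protocol list of the user terminal, and behavior feature information of the existing connections of the user terminal. The user multi-dimensional information table includes the user multi-dimensional information and a correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
The searching unit 703 is specifically configured to: search, according to the user terminal address information in the data packet, the user multi-dimensional information table for user multi-dimensional information corresponding to that user terminal address information.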
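A second processing unit 704, configured to: if the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information.
The second processing unit 704 is specifically configured to: determine whether the server address information in the 5-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or determine whether the source IP address information and destination IP address information in the 5-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, if so, further determine whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table are all consistent with those of the data packet; or determine whether the user terminal address information in the 5-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or determine whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table is consistent with that of the data packet.
The second processing unit 704 is further configured to: if the user multi-dimensional information table contains no user multi-dimensional information corresponding to the user terminal, add user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
The second processing unit 704 is further configured to: if identification based on the user multi-dimensional information succeeds, update the identification result data in the user multi-dimensional information table and output the identification result.
The second processing unit 704 is further configured to: if identification based on the user multi-dimensional information succeeds, further determine whether the data packet is a packet that cannot be identified by features; and if so, collect user-based behavior identification statistics and update the behavior feature information of the existing connections of the user terminal in the user connection data table.
In this way, this embodiment of the present invention implements protocol identification based on user multi-dimensional information, which in turn enables service control on a per-user basis, improves the identification accuracy of the DPI system, and improves protocol identification performance.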
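Correspondingly, an embodiment of the present invention further provides a network device. Figure 8 is a schematic diagram of a network device according to an embodiment of the present invention. As shown in the figure, the network device includes a network interface 801, a processor 802, and a memory 803. A system bus 804 connects the network interface 801, the processor 802, and the memory 803.
The network interface 801 is configured to connect to user terminal devices, server-side devices, and other network devices.
The memory 803 may be persistent storage, such as a hard disk drive or flash memory, and holds software modules and device drivers. The software modules are functional modules capable of executing the foregoing methods of the present invention; the device drivers may be network and interface drivers.
At startup, these software modules are loaded into the memory 803 and are then accessed by the processor 802, which executes the following instructions:
obtaining a data packet transmitted over a connection established between a user terminal and a server;
searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal;
if the user multi-dimensional information corresponding to the user terminal is found, performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information;
if no user multi-dimensional information corresponding to the user terminal is found, performing data-flow-based protocol type identification on the connection carrying the data packet according to packet features of the data packet.
The user multi-dimensional information corresponding to the user terminal includes at least one of the following: source IP address information and destination IP address information of the current existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information of servers that the user terminal has accessed, a protocol list of the user terminal, and behavior feature information of the existing connections of the user terminal. The user multi-dimensional information table includes the user multi-dimensional information and a correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
Further, the process in which the processor 802 searches the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal specifically includes: searching, according to the user terminal address information in the data packet, the user multi-dimensional information table for user multi-dimensional information corresponding to that user terminal address information.
Further, after searching the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, the processor 802 also executes the following instruction: if the user multi-dimensional information table contains no user multi-dimensional information corresponding to the user terminal, adding user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.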
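Further, the process in which the processor 802 performs protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information, is specifically: determining whether the server address information in the 5-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
determining whether the source IP address information and destination IP address information in the 5-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; if so, further determining whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table are all consistent with those of the data packet; or
determining whether the user terminal address information in the 5-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
determining whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table is consistent with that of the data packet.
Further, after performing protocol type identification on the connection carrying the data packet based on the user multi-dimensional information, the processor 802 accesses the memory 803 and also executes the following instruction: if identification succeeds, updating the identification result data in the user multi-dimensional information table and outputting the identification result, where the identification result data is the identified protocol type of the connection carrying the data packet.
Further, after performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information, the processor 802 accesses the memory 803 and executes the instruction: if identification succeeds, updating the identification result data in the user multi-dimensional information table; otherwise, outputting the identification result.
Further, after performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the protocol types of the existing connections of the user terminal that are identified by the obtained user multi-dimensional information, the processor 802 accesses the memory 803 and executes the instruction: if identification succeeds, further determining whether the data packet is a packet that cannot be identified by features; and if so, collecting user-based behavior identification statistics and updating the behavior feature information of the existing connections of the user terminal in the user connection data table.
In this way, the network device provided in this embodiment of the present invention performs protocol type identification based on user multi-dimensional information on a received data packet according to the protocol types of the existing connections of the user terminal, which enables service control on a per-user basis. By combining protocol type identification based on user multi-dimensional information with data-flow-based protocol type identification, it can also improve the identification accuracy of the DPI system and improve protocol identification performance.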
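Correspondingly, an embodiment of the present invention further provides a network device. Figure 9 is a schematic diagram of a network device according to an embodiment of the present invention. As shown in the figure, the network device includes a network interface 901, a processor 902, and a memory 903. A system bus 904 connects the network interface 901, the processor 902, and the memory 903.
The network interface 901 is configured to connect to user terminal devices, server-side devices, and other network devices.
The memory 903 may be persistent storage, such as a hard disk drive or flash memory, and holds software modules and device drivers. The software modules are functional modules capable of executing the foregoing methods of the present invention; the device drivers may be network and interface drivers.
At startup, these software modules are loaded into the memory 903 and are then accessed by the processor 902, which executes the following instructions:
obtaining a data packet transmitted over a connection established between a user terminal and a server;
performing data-flow-based protocol type identification on the connection carrying the data packet according to packet features of the data packet;
if data-flow-based identification fails, searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, where the user multi-dimensional information represents information about all connections currently established by the user terminal;
if the user multi-dimensional information corresponding to the user terminal is found, performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information.
The user multi-dimensional information corresponding to the user terminal includes at least one of the following: source IP address information and destination IP address information of the current existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information of servers that the user terminal has accessed, a protocol list of the user terminal, and behavior feature information of the existing connections of the user terminal. The user multi-dimensional information table includes the user multi-dimensional information and a correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.
Further, the process in which the processor 902 searches the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal specifically includes: searching, according to the user terminal address information in the data packet, the user multi-dimensional information table for user multi-dimensional information corresponding to that user terminal address information.
Further, after searching the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, the processor 902 accesses the memory 903 and also executes the following instruction: if the user multi-dimensional information table contains no user multi-dimensional information corresponding to the user terminal, adding user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.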
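Further, the process in which the processor 902 performs protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information, is specifically: determining whether the server address information in the 5-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
determining whether the source IP address information and destination IP address information in the 5-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; if so, further determining whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table are all consistent with those of the data packet; or
determining whether the user terminal address information in the 5-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
determining whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table is consistent with that of the data packet.
Further, after performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information, the processor 902 accesses the memory 903 and also executes the following instruction: if identification based on the user multi-dimensional information succeeds, updating the identification result data in the user multi-dimensional information table and outputting the identification result.
Further, after performing data-flow-based protocol type identification on the connection carrying the data packet, the processor 902 accesses the memory 903 and executes the instruction: if data-flow-based identification succeeds, performing the corresponding service processing on the data packet.
Further, after performing protocol type identification on the connection carrying the data packet based on the user multi-dimensional information, the processor 902 accesses the memory 903 and executes the instruction: if identification based on the user multi-dimensional information succeeds, further determining whether the data packet is a packet that cannot be identified by features; and if so, collecting user-based behavior identification statistics and updating the behavior feature information of the existing connections of the user terminal in the user connection data table.
In this way, the network device provided in this embodiment of the present invention performs protocol type identification based on user multi-dimensional information on a received data packet according to the protocol types of the existing connections of the user terminal, which enables service control on a per-user basis. By combining protocol type identification based on user multi-dimensional information with data-flow-based protocol type identification, it can also improve the identification accuracy of the DPI system and improve protocol identification performance.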
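A person skilled in the art should further appreciate that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present invention.
The steps of the methods or algorithms described with reference to the embodiments disclosed herein may be implemented by hardware, by a software module executed by a processor, or by a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The foregoing specific implementations further describe the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the foregoing are merely specific implementations of the present invention and are not intended to limit its protection scope. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.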

Claims (30)

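  1. A protocol type identification method, wherein the method comprises:
    obtaining a data packet transmitted over a connection established between a user terminal and a server;
    searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, wherein the user multi-dimensional information represents information about all connections currently established by the user terminal;
    if the user multi-dimensional information corresponding to the user terminal is found, performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information; and
    if no user multi-dimensional information corresponding to the user terminal is found, performing data-flow-based protocol type identification on the connection carrying the data packet according to packet features of the data packet.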
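  2. The protocol type identification method according to claim 1, wherein the searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal comprises:
    searching, according to user terminal address information in the data packet, the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal address information.
  3. The protocol type identification method according to claim 2, wherein after the searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, the method further comprises:
    if the user multi-dimensional information table contains no user multi-dimensional information corresponding to the user terminal, adding user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
  4. The protocol type identification method according to claim 1, wherein the user multi-dimensional information corresponding to the user terminal comprises at least one of the following: source IP address information and destination IP address information of the current existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information of servers that the user terminal has accessed, a protocol list of the user terminal, and behavior feature information of the existing connections of the user terminal; and
    the user multi-dimensional information table comprises the user multi-dimensional information and a correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.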
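  5. The protocol type identification method according to claim 4, wherein the performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information, comprises:
    determining whether the server address information in the 5-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
    determining whether the source IP address information and destination IP address information in the 5-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; if so, further determining whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table are all consistent with those of the data packet; or
    determining whether the user terminal address information in the 5-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
    determining whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table is consistent with that of the data packet.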
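  6. The protocol type identification method according to any one of claims 1 to 5, wherein after the performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information, the method further comprises: if identification succeeds, updating the identification result data in the user multi-dimensional information table and outputting the identification result, wherein the identification result data is the identified protocol type of the connection carrying the data packet.
  7. The protocol type identification method according to any one of claims 1 to 6, wherein after the performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the protocol types of the existing connections of the user terminal that are identified by the obtained user multi-dimensional information, the method further comprises: if identification succeeds, further determining whether the data packet is a packet that cannot be successfully identified by features; and if so, collecting user-based behavior identification statistics and updating the behavior feature information of the existing connections of the user terminal in the user multi-dimensional information table.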
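  8. A protocol type identification method, wherein the method comprises:
    obtaining a data packet transmitted over a connection established between a user terminal and a server;
    performing data-flow-based protocol type identification on the connection carrying the data packet according to packet features of the data packet;
    if data-flow-based identification fails, searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, wherein the user multi-dimensional information represents information about all connections currently established by the user terminal; and
    if the user multi-dimensional information corresponding to the user terminal is found, performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information.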
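  9. The protocol type identification method according to claim 8, wherein the searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal comprises:
    searching, according to user terminal address information in the data packet, the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal address information.
  10. The protocol type identification method according to claim 9, wherein after the searching a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, the method further comprises:
    if the user multi-dimensional information table contains no user multi-dimensional information corresponding to the user terminal, adding user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
  11. The protocol type identification method according to claim 8, wherein the user multi-dimensional information corresponding to the user terminal comprises at least one of the following: source IP address information and destination IP address information of the current existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information of servers that the user terminal has accessed, a protocol list of the user terminal, and behavior feature information of the existing connections of the user terminal; and
    the user multi-dimensional information table comprises the user multi-dimensional information and a correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.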
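  12. The protocol type identification method according to claim 11, wherein the performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information, comprises:
    determining whether the server address information in the 5-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
    determining whether the source IP address information and destination IP address information in the 5-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; if so, further determining whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table are all consistent with those of the data packet; or
    determining whether the user terminal address information in the 5-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
    determining whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table is consistent with that of the data packet.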
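  13. The protocol type identification method according to any one of claims 9 to 12, wherein after the performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information, the method further comprises: if identification succeeds, updating the identification result data in the user multi-dimensional information table and outputting the identification result, wherein the identification result data is the identified protocol type of the connection carrying the data packet.
  14. The protocol type identification method according to any one of claims 8 to 13, wherein after the performing data-flow-based protocol type identification on the connection carrying the data packet, the method further comprises: if data-flow-based identification succeeds, performing the corresponding service processing on the data packet.
  15. The protocol type identification method according to any one of claims 8 to 14, wherein after the performing protocol type identification based on user multi-dimensional information on the connection carrying the data packet, the method further comprises: if identification succeeds, further determining whether the data packet is a packet that cannot be successfully identified by features; and if so, collecting user-based behavior identification statistics and updating the behavior feature information of the existing connections of the user terminal in the user connection data table.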
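  16. A protocol type identification apparatus, wherein the apparatus comprises:
    an obtaining unit, configured to obtain a data packet transmitted over a connection established between a user terminal and a server;
    a searching unit, configured to search a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, wherein the user multi-dimensional information represents information about all connections currently established by the user terminal;
    a first processing unit, configured to: if the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information; and
    a second processing unit, configured to: if no user multi-dimensional information corresponding to the user terminal is found, perform data-flow-based protocol type identification on the connection carrying the data packet according to packet features of the data packet.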
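  17. The protocol type identification apparatus according to claim 16, wherein the searching unit is specifically configured to: search, according to user terminal address information in the data packet, the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal address information.
  18. The protocol type identification apparatus according to claim 17, wherein the second processing unit is further configured to:
    if the user multi-dimensional information table contains no user multi-dimensional information corresponding to the user terminal, add user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
  19. The protocol type identification apparatus according to claim 16, wherein the user multi-dimensional information corresponding to the user terminal comprises at least one of the following: source IP address information and destination IP address information of the current existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information of servers that the user terminal has accessed, a protocol list of the user terminal, and behavior feature information of the existing connections of the user terminal; and
    the user multi-dimensional information table comprises the user multi-dimensional information and a correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.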
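  20. The protocol type identification apparatus according to claim 19, wherein the first processing unit is specifically configured to:
    determine whether the server address information in the 5-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
    determine whether the source IP address information and destination IP address information in the 5-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; if so, further determine whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table are all consistent with those of the data packet; or
    determine whether the user terminal address information in the 5-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
    determine whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table is consistent with that of the data packet.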
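  21. The protocol type identification apparatus according to any one of claims 16 to 20, wherein the first processing unit is further configured to: if identification succeeds, update the identification result data in the user multi-dimensional information table and output the identification result, wherein the identification result data is the identified protocol type of the connection carrying the data packet.
  22. The protocol type identification apparatus according to any one of claims 16 to 21, wherein the first processing unit is further configured to: if identification succeeds, further determine whether the data packet is a packet that cannot be successfully identified by features; and if so, collect user-based behavior identification statistics and update the behavior feature information of the existing connections of the user terminal in the user multi-dimensional information table.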
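  23. A protocol type identification apparatus, wherein the apparatus comprises:
    an obtaining unit, configured to obtain a data packet transmitted over a connection established between a user terminal and a server;
    a first processing unit, configured to perform data-flow-based protocol type identification on the connection carrying the data packet according to packet features of the data packet;
    a searching unit, configured to: if data-flow-based identification fails, search a user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal, wherein the user multi-dimensional information represents information about all connections currently established by the user terminal; and
    a second processing unit, configured to: if the user multi-dimensional information corresponding to the user terminal is found, perform protocol type identification based on user multi-dimensional information on the connection carrying the data packet, according to the information about all connections currently established by the user terminal that is identified by the obtained user multi-dimensional information.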
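  24. The protocol type identification apparatus according to claim 23, wherein the searching unit is specifically configured to:
    search, according to user terminal address information in the data packet, the user multi-dimensional information table for user multi-dimensional information corresponding to the user terminal address information.
  25. The protocol type identification apparatus according to claim 24, wherein the second processing unit is further configured to:
    if the user multi-dimensional information table contains no user multi-dimensional information corresponding to the user terminal, add user multi-dimensional information corresponding to the user terminal to the user multi-dimensional information table.
  26. The protocol type identification apparatus according to claim 23, wherein the user multi-dimensional information corresponding to the user terminal comprises at least one of the following: source IP address information and destination IP address information of the current existing connections of the user terminal, user terminal address information of the existing connections of the user terminal, server address information of servers that the user terminal has accessed, a protocol list of the user terminal, and behavior feature information of the existing connections of the user terminal; and
    the user multi-dimensional information table comprises the user multi-dimensional information and a correspondence between the user multi-dimensional information and the protocol types of the existing connections of the user terminal.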
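  27. The protocol type identification apparatus according to claim 26, wherein the second processing unit is specifically configured to:
    determine whether the server address information in the 5-tuple of the data packet is included in the server address information, stored in the user multi-dimensional information table, of servers that the user terminal has accessed, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose server address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
    determine whether the source IP address information and destination IP address information in the 5-tuple of the data packet are included in the source IP address information and destination IP address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; if so, further determine whether the feature information of the data packet is included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal; and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose source IP address information, destination IP address information, and behavior feature information stored in the user multi-dimensional information table are all consistent with those of the data packet; or
    determine whether the user terminal address information in the 5-tuple information of the data packet is included in the user terminal address information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose user terminal address information stored in the user multi-dimensional information table is consistent with that of the data packet; or
    determine whether the behavior statistics of the data packet and of historical data packets are included in the behavior feature information, stored in the user multi-dimensional information table, of the existing connections of the user terminal, and if so, the protocol type of the connection carrying the data packet is the protocol type of the existing connection whose behavior feature information stored in the user multi-dimensional information table is consistent with that of the data packet.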
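  28. The protocol type identification apparatus according to any one of claims 23 to 27, wherein the second processing unit is further configured to: if identification succeeds, update the identification result data in the user multi-dimensional information table and output the identification result, wherein the identification result data is the identified protocol type of the connection carrying the data packet.
  29. The protocol type identification apparatus according to any one of claims 23 to 28, wherein the first processing unit is further configured to: if data-flow-based identification succeeds, perform the corresponding service processing on the data packet.
  30. The protocol type identification apparatus according to any one of claims 23 to 29, wherein the second processing unit is further configured to: if identification succeeds, further determine whether the data packet is a packet that cannot be successfully identified by features; and if so, collect user-based behavior identification statistics and update the behavior feature information of the existing connections of the user terminal in the user connection data table.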
PCT/CN2015/072529 2014-04-29 2015-02-09 协议类型的识别方法和装置 WO2015165296A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CA2947325A CA2947325C (en) 2014-04-29 2015-02-09 Protocol type identification method and apparatus
US15/338,105 US10084713B2 (en) 2014-04-29 2016-10-28 Protocol type identification method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410177705.0 2014-04-29
CN201410177705.0A CN103916294B (zh) 2014-04-29 2014-04-29 协议类型的识别方法和装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/338,105 Continuation US10084713B2 (en) 2014-04-29 2016-10-28 Protocol type identification method and apparatus

Publications (1)

Publication Number Publication Date
WO2015165296A1 true WO2015165296A1 (zh) 2015-11-05

Family

ID=51041712

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/072529 WO2015165296A1 (zh) 2014-04-29 2015-02-09 协议类型的识别方法和装置

Country Status (4)

Country Link
US (1) US10084713B2 (zh)
CN (1) CN103916294B (zh)
CA (1) CA2947325C (zh)
WO (1) WO2015165296A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106921637A (zh) * 2015-12-28 2017-07-04 华为技术有限公司 网络流量中的应用信息的识别方法和装置
CN109936512A (zh) * 2017-12-15 2019-06-25 华为技术有限公司 流量分析方法、公共服务流量归属方法及相应的计算机系统
WO2021018252A1 (zh) * 2019-07-31 2021-02-04 华为技术有限公司 一种数据处理方法及装置

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103916294B (zh) * 2014-04-29 2018-05-04 华为技术有限公司 协议类型的识别方法和装置
CN107547437B (zh) * 2017-05-11 2020-09-08 新华三信息安全技术有限公司 应用识别方法及装置
CN109428774B (zh) * 2017-08-22 2020-12-22 网宿科技股份有限公司 一种dpi设备的数据处理方法及相关的dpi设备
CN109951430B (zh) * 2017-12-21 2021-04-30 中移(杭州)信息技术有限公司 一种数据处理方法及装置
CN108900374B (zh) * 2018-06-22 2021-05-25 网宿科技股份有限公司 一种应用于dpi设备的数据处理方法和装置
CN111867146B (zh) * 2019-04-30 2022-07-22 大唐移动通信设备有限公司 一种标识信息发送、接收方法、设备及装置
CN111953552B (zh) * 2019-05-14 2022-12-13 华为技术有限公司 数据流的分类方法和报文转发设备
CN110320853A (zh) * 2019-07-30 2019-10-11 翼石电子股份有限公司 一种plc数据采集分析方法及系统
CN110581780B (zh) * 2019-08-27 2022-10-21 杭州安恒信息技术股份有限公司 针对web服务器资产的自动识别方法
CN112565106B (zh) * 2019-09-26 2023-04-28 中国移动通信集团河北有限公司 流量业务识别方法、装置、设备及计算机存储介质
CN110808879B (zh) * 2019-11-01 2021-11-02 杭州安恒信息技术股份有限公司 一种协议识别方法、装置、设备及可读存储介质
CN112653740A (zh) * 2020-12-11 2021-04-13 北京金山云网络技术有限公司 支持quic连接迁移的负载均衡方法、装置及计算机产品
CN112866289B (zh) * 2021-03-02 2022-09-30 恒为科技(上海)股份有限公司 一种提取特征规则的方法及系统
CN112994984B (zh) * 2021-04-15 2021-07-30 紫光恒越技术有限公司 识别协议及内容的方法、存储设备、安全网关、服务器

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070044147A1 (en) * 2005-08-17 2007-02-22 Korea University Industry And Academy Collaboration Foundation Apparatus and method for monitoring network using the parallel coordinate system
US20070263548A1 (en) * 2006-05-15 2007-11-15 Fujitsu Limited Communication control system
CN102523196A (zh) * 2011-11-21 2012-06-27 北京神州绿盟信息安全科技股份有限公司 一种信息识别方法、装置及系统
CN103023670A (zh) * 2011-09-20 2013-04-03 中兴通讯股份有限公司 基于dpi的报文业务类型识别方法及装置
CN103297270A (zh) * 2013-05-24 2013-09-11 华为技术有限公司 应用类型识别方法及网络设备
CN103916294A (zh) * 2014-04-29 2014-07-09 华为技术有限公司 协议类型的识别方法和装置

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101202652B (zh) * 2006-12-15 2011-05-04 北京大学 网络应用流量分类识别装置及其方法
EP2258084B1 (en) 2008-03-10 2012-06-06 Telefonaktiebolaget L M Ericsson (PUBL) Technique for classifying network traffic and for validating a mechanism for calassifying network traffic
CN101645806B (zh) * 2009-09-04 2011-09-07 东南大学 Dpi和dfi相结合的网络流量分类系统及分类方法
CN102075404A (zh) * 2009-11-19 2011-05-25 华为技术有限公司 一种报文检测方法及装置
CN101873259B (zh) * 2010-06-01 2013-01-09 华为技术有限公司 Sctp报文识别方法和装置
WO2012171166A1 (zh) * 2011-06-13 2012-12-20 华为技术有限公司 协议解析方法及装置

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106921637A (zh) * 2015-12-28 2017-07-04 华为技术有限公司 网络流量中的应用信息的识别方法和装置
EP3297213A4 (en) * 2015-12-28 2018-05-30 Huawei Technologies Co., Ltd. Method and apparatus for identifying application information in network traffic
EP3496338A1 (en) * 2015-12-28 2019-06-12 Huawei Technologies Co., Ltd. Method for identifying application information in network traffic, and apparatus
CN106921637B (zh) * 2015-12-28 2020-02-14 华为技术有限公司 网络流量中的应用信息的识别方法和装置
US11582188B2 (en) 2015-12-28 2023-02-14 Huawei Technologies Co., Ltd. Method for identifying application information in network traffic, and apparatus
US11855967B2 (en) 2015-12-28 2023-12-26 Huawei Technologies Co., Ltd. Method for identifying application information in network traffic, and apparatus
CN109936512A (zh) * 2017-12-15 2019-06-25 华为技术有限公司 流量分析方法、公共服务流量归属方法及相应的计算机系统
CN109936512B (zh) * 2017-12-15 2021-10-01 华为技术有限公司 流量分析方法、公共服务流量归属方法及相应的计算机系统
US11425047B2 (en) 2017-12-15 2022-08-23 Huawei Technologies Co., Ltd. Traffic analysis method, common service traffic attribution method, and corresponding computer system
WO2021018252A1 (zh) * 2019-07-31 2021-02-04 华为技术有限公司 一种数据处理方法及装置

Also Published As

Publication number Publication date
US10084713B2 (en) 2018-09-25
US20170048155A1 (en) 2017-02-16
CA2947325A1 (en) 2015-11-05
CN103916294B (zh) 2018-05-04
CN103916294A (zh) 2014-07-09
CA2947325C (en) 2020-11-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15785481

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2947325

Country of ref document: CA

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15785481

Country of ref document: EP

Kind code of ref document: A1