WO2017050038A1 - Packet identification method, apparatus, and computer storage medium - Google Patents

Packet identification method, apparatus, and computer storage medium

Info

Publication number
WO2017050038A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
custom
rule
packet
identified
Prior art date
Application number
PCT/CN2016/094459
Other languages
English (en)
French (fr)
Inventor
李乐村
Original Assignee
深圳市中兴微电子技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市中兴微电子技术有限公司
Publication of WO2017050038A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks

Definitions

  • The present invention relates to the field of communications, and in particular, to a packet identification method, apparatus, and computer storage medium.
  • Ethernet is the most widely used LAN technology in current applications.
  • The most common Ethernet frame formats are Ethernet II, Ethernet 802.3 SAP, and Ethernet 802.3 SNAP.
  • Each of these three formats carries a large number of specific protocol packets.
  • In the prior art, data packets are mainly identified serially: the next packet is identified only after the previous packet has been completely analyzed. Identification efficiency is therefore low and cannot meet the needs of high-speed broadband services.
  • The embodiments of the present invention are intended to provide a packet identification method, apparatus, and computer storage medium that at least partially solve the problem of low packet identification efficiency, or the problem that some packets cannot be identified.
  • a first aspect of the embodiments of the present invention provides a packet identification method, where the method includes:
  • The field to be identified is analyzed by using a pipeline of at least two stages.
  • Analyzing the to-be-identified field by using a pipeline of at least two stages includes:
  • the first-stage pipeline is used to analyze the destination address (DA), the source address (SA), and the number of virtual local area network (VLAN) layers of the packet;
  • the second-stage pipeline is used to analyze the encapsulation format and Ethernet type of the packet;
  • the third-stage pipeline is used to analyze whether the packet carries a Point-to-Point Protocol over Ethernet (PPPoE) encapsulation header and an Internet Protocol (IP) header;
  • the fourth-stage pipeline is used to analyze the protocol type of the packet;
  • the fifth-stage pipeline is used to encode the data packet according to the analysis results of the first-stage to fourth-stage pipelines.
  • the analyzing the to-be-identified field by using at least two levels of pipelines includes:
  • the position offset of the identified number of bytes and the field to be identified are output;
  • n is an integer not less than 1; the nth stage pipeline is a previous stage pipeline of the n+1th stage pipeline.
  • the method further includes:
  • the field to be identified in the extracted data packet includes:
  • the first N bytes of the data packet are extracted as the field to be identified; and the N is equal to the number of bytes corresponding to the specified length.
  • the field to be identified in the extracted data packet further includes:
  • When the length of the data packet is not greater than the specified length, the entire data packet is regarded as the field to be identified.
  • the method further includes:
  • the obtaining a custom identification rule includes:
  • the identifying the packet according to the custom identification rule to form a recognition result includes:
  • the M1 and the M2 are integers not less than 1;
  • Determining, according to the identification result, whether the data packet is a custom message that satisfies the custom rule includes:
  • the data packet is determined to be a customized message that satisfies the custom rule.
  • the obtaining a custom identification rule further includes
  • the identifying the packet according to the custom identification rule to form a recognition result further comprising:
  • M1 data extraction is performed from the start position of the extracted data and M2 bytes are extracted each time.
  • a second aspect of the embodiments of the present invention provides another packet identification method, where the method further includes:
  • the obtaining a custom identification rule includes:
  • the identifying the packet according to the custom identification rule to form a recognition result includes:
  • the M1 and the M2 are integers not less than 1;
  • Determining, according to the identification result, whether the data packet is a custom message that satisfies the custom rule includes:
  • the data packet is determined to be a customized message that satisfies the custom rule.
  • the obtaining a custom identification rule further includes
  • the identifying the packet according to the custom identification rule to form a recognition result further comprising:
  • M1 data extraction is performed from the start position of the extracted data and M2 bytes are extracted each time.
  • a third aspect of the embodiments of the present invention provides a message identifying apparatus, where the apparatus includes:
  • a receiving unit configured to receive a data packet
  • An extracting unit configured to extract a field to be identified in the data packet
  • the first identification unit is configured to analyze the field to be identified by using at least two stages of pipelines.
  • The first identification unit is configured to: analyze the destination address (DA), the source address (SA), and the number of virtual local area network (VLAN) layers of the packet by using a first-stage pipeline; analyze the encapsulation format and Ethernet type of the packet by using a second-stage pipeline; analyze, by using a third-stage pipeline, whether the packet carries a Point-to-Point Protocol over Ethernet (PPPoE) encapsulation header and an Internet Protocol (IP) header; analyze the protocol type of the packet by using a fourth-stage pipeline; and encode the data packet by using a fifth-stage pipeline according to the analysis results of the first-stage to fourth-stage pipelines.
  • The first identifying unit is configured so that, after the nth-stage pipeline completes identification, it outputs the position offset of the number of identified bytes together with the field to be identified; the (n+1)th-stage pipeline receives the position offset and the field to be identified, and identifies the field starting from the offset position corresponding to the position offset output by the nth-stage pipeline;
  • n is an integer not less than 1; the nth stage pipeline is a previous stage pipeline of the n+1th stage pipeline.
  • the device further includes:
  • a storage unit configured to store the data message in a first-in first-out queue after receiving the data message
  • the extracting unit is configured to take out the to-be-identified packet from the first-in-first-out queue, and extract the to-be-identified field of the to-be-identified packet.
  • The extraction unit is configured to determine the packet length of the data packet and, when the packet length is greater than a specified length, extract the first N bytes of the data packet as the field to be identified; N is equal to the number of bytes corresponding to the specified length.
  • the extracting unit is further configured to treat the data packet as the field to be identified when the length of the data packet is not greater than the specified length.
  • the device further includes:
  • a second identifying unit configured to identify the data packet according to the custom identification rule, to form a recognition result
  • the determining unit is configured to determine, according to the identification result, whether the data packet is a custom message that satisfies the custom rule.
  • The acquiring unit is configured to obtain a mask, a matching manner, and a start position for data extraction from the custom identification rule.
  • The second identifying unit is configured to: perform M1 data extractions starting from the start position for data extraction, extracting M2 bytes each time, where M1 and M2 are integers not less than 1; operate on the extracted bytes with the mask to obtain data to be matched; and match the data to be matched against the matching data table according to the matching manner to form a recognition result;
  • the determining unit is configured to determine, when the identification result indicates that the data to be matched matches the data in the matching data table, that the data packet is a custom message that satisfies the custom rule.
  • the acquiring unit is further configured to obtain a rule valid enable bit from the custom identification rule
  • The second identifying unit is further configured to determine, according to the information of the rule valid enable bit, whether the custom identification rule is valid; and, when the custom identification rule is determined to be valid, to perform M1 data extractions starting from the start position for data extraction, extracting M2 bytes each time.
  • a fourth aspect of the embodiments of the present invention provides a message identifying apparatus, where the apparatus further includes:
  • a second identifying unit configured to identify a data packet according to the custom identification rule, to form a recognition result
  • the determining unit is configured to determine, according to the identification result, whether the data packet is a custom message that satisfies the custom rule.
  • The acquiring unit is specifically configured to obtain a mask, a matching manner, and a start position for data extraction from the custom identification rule;
  • the second identifying unit is configured to: perform M1 data extractions starting from the start position for data extraction, extracting M2 bytes each time, where M1 and M2 are integers not less than 1; operate on the extracted bytes with the mask to obtain data to be matched; and match the data to be matched against the matching data table according to the matching manner to form the identification result;
  • the determining unit is configured to determine, when the identification result indicates that the data to be matched matches the data in the matching data table, that the data packet is a custom message that satisfies the custom rule.
  • the acquiring unit is further configured to obtain a rule valid enable bit from the custom identification rule
  • The second identifying unit is further configured to determine, according to the content of the rule valid enable bit, whether the custom identification rule is valid; and, when the custom identification rule is determined to be valid, to perform M1 data extractions starting from the start position for data extraction, extracting M2 bytes each time.
  • a fifth aspect of the embodiments of the present invention provides a computer storage medium, where the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute any one of the foregoing message identification methods.
  • The first packet identification method, apparatus, and computer storage medium provided by the embodiments of the present invention use a pipeline of at least two stages to identify data packets. At least two data packets can therefore be in the pipeline at the same time, so that when the identification device needs to identify multiple data packets, the overall efficiency and response speed of packet identification improve significantly and high-speed bandwidth services can be satisfied.
  • The second packet identification method and apparatus provided by the embodiments of the present invention identify packets by using custom rules and can therefore identify custom packets. This solves the problem that existing identification methods and devices cannot identify custom packets, and in particular the difficulty of identifying irregular custom packets.
  • FIG. 1 is a schematic flowchart diagram of a first packet identification method according to an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a data packet according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a pipeline of a 5-stage pipeline according to an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a second packet identification method according to an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart of a third packet identification method according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of forming a recognition result according to a custom identification rule according to an embodiment of the present invention.
  • FIG. 7 is a schematic flowchart diagram of a fourth packet identification method according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic flowchart diagram of a fifth packet identification method according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of a first packet identification apparatus according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a second packet identification apparatus according to an embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of a third packet identification apparatus according to an embodiment of the present invention.
  • this embodiment provides a packet identification method, where the method includes:
  • Step S110 Receive a data message
  • Step S120 Extract a field to be identified in the data packet
  • Step S130 analyzing the field to be identified by using at least a 2-stage pipeline.
  • the packet identification method described in this embodiment is generally applied to a data packet receiving end.
  • the fields to be identified in the data packet are extracted.
  • the fields to be identified include at least a part of the bytes in the header of the data packet.
  • A pipeline of at least two stages is used to identify the field to be identified in step S130.
  • With such a pipeline, at least two packets can be under identification in one pipeline at a time, whereas the prior art uses a single processing module that can identify only one data packet at a time. This greatly improves the identification rate and reduces the overall delay of the identification device when it identifies multiple packets.
  • the at least two-stage pipeline may be a 2-stage pipeline, a 3-stage pipeline, a 4-stage pipeline or a 5-stage pipeline, a 6-stage pipeline, or even a pipeline of 6 or more stages.
  • the number of stages of the pipeline can be segmented according to the identification requirements. The following provides a method for identifying a 5-stage pipeline:
  • the step S130 may include:
  • the first-stage pipeline is used to analyze the destination address (DA), the source address (SA), and the number of virtual local area network (VLAN) layers of the packet;
  • the second-stage pipeline is used to analyze the encapsulation format and the Ethernet type of the packet;
  • the third-stage pipeline is used to analyze whether the packet carries a Point-to-Point Protocol over Ethernet (PPPoE) encapsulation header and an Internet Protocol (IP) header;
  • the fourth-stage pipeline is used to analyze the protocol type of the packet;
  • the fifth-stage pipeline is used to encode the data packet according to the analysis results of the first-stage to fourth-stage pipelines.
  • the first stage pipeline is configured to analyze a first byte of the to-be-identified field to a byte corresponding to the number of VLAN layers.
  • The byte corresponding to the number of VLAN layers may be the VLAN tag (TAG).
  • The second-stage pipeline starts from the first byte after the byte corresponding to the number of VLAN layers and identifies the Ethernet type field in the field to be identified.
  • the Ethernet network type field may characterize what type of Ethernet the data message is from.
  • the field identified by the level 2 pipeline may also include the encapsulation type field in the field to be identified.
  • the encapsulation type may be Ethernet 802.3 SAP, Ethernet 802.2 LLC, Ethernet 802.3 SNAP or Ethernet II.
  • Ethernet 802.3 SAP, Ethernet 802.2 LLC, Ethernet 802.3 SNAP, and Ethernet II are each an Ethernet standard frame format. The characteristics of these Ethernet standard frame formats can be found in the prior art and are not repeated here.
  • the field of the level 3 pipeline analysis may include a PPPoE format header and an IP header in the field to be identified. If the PPPoE format header is included in the to-be-identified field, it indicates that the PPPoE format header is carried in the data packet. Analyzing the IP header includes analyzing the IP type, the IP type combination format, and whether there is an IP extension header. Specifically, the IP type may include IPv4 or IPv6. The IP type combination may include a combination of IPv4 plus IPv6, or a combination of IPv6 plus IPv4. The IP extension header may include an extended header carried by IPv6 or the like.
  • The fourth-stage pipeline analyzes the specific protocol type that the data packet follows, for example, whether the data packet is a TCP packet conforming to the Transmission Control Protocol (TCP) or a UDP packet conforming to the User Datagram Protocol (UDP).
  • The fields identified by the fourth-stage pipeline may further include the destination port (DPORT) field and/or the source port (SPORT) field in the field to be identified.
  • the fifth stage pipeline analyzes the remaining fields in the to-be-identified field that have not been analyzed, and outputs the data message code according to the recognition result of the pipelines of the first to fourth levels.
  • The data packet code here is a field string or code string formed by combining the recognition results of each pipeline stage. Matching this code against the pre-configured identification codes determines which kind of packet the data packet is, which completes the identification. In a specific implementation, the fifth-stage pipeline can also be used to match the action configured for the data packet.
  • the actions configured herein may include operations such as forwarding or receiving storage, and after the identification of the data message is completed, the action of configuring the data message is performed.
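  • As a rough illustration of the encoding and matching described above, the following Python sketch combines per-stage results into a code string and looks it up in a pre-configured table. The code format, the table contents, and the names `encode`, `identify`, and `CODE_TABLE` are assumptions made for this example; the source does not specify them.

```python
from typing import Dict, Optional, Tuple

# Hypothetical pre-configured table: packet code -> (packet type, configured action).
CODE_TABLE: Dict[str, Tuple[str, str]] = {
    "V1-P0-IP4-UDP": ("vlan_ipv4_udp", "forward"),
    "V0-P1-IP4-TCP": ("pppoe_ipv4_tcp", "receive_and_store"),
}


def encode(results: dict) -> str:
    """Combine the per-stage results into one code string (assumed format)."""
    return "V{vlan_layers}-P{pppoe}-IP{ip_version}-{l4}".format(**results)


def identify(results: dict) -> Optional[Tuple[str, str]]:
    """Return (packet type, action) if the code is configured, else None."""
    return CODE_TABLE.get(encode(results))


known = identify({"vlan_layers": 1, "pppoe": 0, "ip_version": 4, "l4": "UDP"})
unknown = identify({"vlan_layers": 2, "pppoe": 0, "ip_version": 6, "l4": "OTHER"})
```

  • A dictionary lookup is used here only for brevity; a hardware implementation would more likely compare against a configured code table.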
  • the above operation is a standard 5-stage pipeline identification method. If the current pipeline is a 3-stage pipeline, the adjacent 2-stage or 3-stage pipelines in the 5-stage pipeline can be combined to realize the 3-stage pipeline identification.
  • The identification time of each pipeline stage is roughly equal, so it is rare for one slow stage to become the bottleneck of identification; the scheme is simple to implement and efficient.
  • Figure 2 shows one possible composition of a data packet; the contents are arranged in the order shown in Figure 2 to form the header of the data packet.
  • The tag TAG adjacent to the SA may be a VLAN tag.
  • LLC indicates Ethernet 802.2 LLC.
  • SNAP indicates Ethernet 802.3 SNAP.
  • TYPE, adjacent to LLC and SNAP, indicates the Ethernet type field.
  • IPv6 and IPv4 represent the combination of IPv6 plus IPv4.
  • IPv4 and IPv6 in Figure 2 represent the combination of IPv4 plus IPv6.
  • DATA in the figure may represent the body or service content of the data packet.
  • Figure 3 shows a 5-stage pipeline that analyzes the processed fields separately.
  • The first 256 bytes of the data packet are either input to all pipeline stages in parallel or passed from stage to stage through the pipeline.
  • the first 256 bytes include the header of the data message.
  • The first-stage pipeline analyzes the 256 bytes up to the tag TAG1, such as the VLAN tag; after the analysis, the result is input to the second-stage pipeline.
  • The second-stage pipeline begins from the byte at which the first-stage analysis stopped and analyzes the tag TAG2 and the Ethernet type field.
  • TAG2 can include a frame format tag.
  • The third-stage pipeline receives the analysis result from the second-stage pipeline, begins from the byte at which the second-stage analysis stopped, and analyzes the IP header.
  • The fourth-stage pipeline receives the analysis result of the third-stage pipeline and analyzes the communication protocol used by the data packet, for example, whether the data packet is a TCP packet or a UDP packet.
  • The fifth-stage pipeline receives the analysis results of the first four stages from the fourth stage, ends the analysis, and outputs the data packet code.
  • The data packet code here is the code composed of the analysis results.
  • The pipeline may identify the bytes of the field to be identified in sequence.
  • Step S130 may include: after the nth-stage pipeline completes identification, outputting the position offset of the number of identified bytes together with the field to be identified; the (n+1)th-stage pipeline receives the position offset and the field to be identified, and identifies the field starting from the offset position corresponding to the position offset output by the nth-stage pipeline. Here n is an integer not less than 1, and the nth-stage pipeline is the stage immediately preceding the (n+1)th-stage pipeline. For example, if the nth-stage pipeline is the second-stage pipeline, the (n+1)th-stage pipeline is the third-stage pipeline.
  • The identification result of each stage is also passed to the next stage, and so on in sequence until the last stage, which makes it easy for the last-stage pipeline to form the data packet code and finally determine the type of the data packet.
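  • The offset hand-off between stages can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: it assumes well-formed Ethernet II frames, ignores LLC/SNAP encapsulation and IPv6 extension headers, and the names `StageOut`, `stage1` to `stage5`, and the output code format are hypothetical.

```python
import struct
from typing import NamedTuple, Tuple


class StageOut(NamedTuple):
    offset: int                                # number of bytes identified so far
    results: Tuple[Tuple[str, object], ...]    # analysis results accumulated so far


VLAN_TPIDS = {0x8100, 0x88A8}  # 802.1Q and 802.1ad tag protocol identifiers


def stage1(field: bytes) -> StageOut:
    """Stage 1: DA, SA, and number of VLAN layers."""
    da, sa = field[0:6], field[6:12]
    offset, vlan_layers = 12, 0
    while struct.unpack_from("!H", field, offset)[0] in VLAN_TPIDS:
        vlan_layers += 1
        offset += 4  # each VLAN tag is 4 bytes
    return StageOut(offset, (("da", da.hex()), ("sa", sa.hex()),
                             ("vlan_layers", vlan_layers)))


def stage2(field: bytes, prev: StageOut) -> StageOut:
    """Stage 2: Ethernet type, read at the offset output by stage 1."""
    ethertype = struct.unpack_from("!H", field, prev.offset)[0]
    return StageOut(prev.offset + 2, prev.results + (("ethertype", ethertype),))


def stage3(field: bytes, prev: StageOut) -> StageOut:
    """Stage 3: PPPoE header presence and IP version."""
    offset = prev.offset
    ethertype = dict(prev.results)["ethertype"]
    pppoe = ethertype == 0x8864  # PPPoE session stage
    if pppoe:
        # 6-byte PPPoE header followed by a 2-byte PPP protocol field
        ppp_proto = struct.unpack_from("!H", field, offset + 6)[0]
        offset += 8
        ip_version = {0x0021: 4, 0x0057: 6}.get(ppp_proto, 0)
    else:
        ip_version = {0x0800: 4, 0x86DD: 6}.get(ethertype, 0)
    return StageOut(offset, prev.results + (("pppoe", pppoe),
                                            ("ip_version", ip_version)))


def stage4(field: bytes, prev: StageOut) -> StageOut:
    """Stage 4: transport protocol type (TCP, UDP, or other)."""
    offset = prev.offset
    ip_version = dict(prev.results)["ip_version"]
    proto = 0
    if ip_version == 4:
        proto = field[offset + 9]                # IPv4 Protocol field
        offset += (field[offset] & 0x0F) * 4     # IHL is counted in 32-bit words
    elif ip_version == 6:
        proto = field[offset + 6]                # IPv6 Next Header field
        offset += 40                             # fixed IPv6 header length
    l4 = {6: "TCP", 17: "UDP"}.get(proto, "OTHER")
    return StageOut(offset, prev.results + (("l4", l4),))


def stage5(prev: StageOut) -> str:
    """Stage 5: combine the earlier results into one code string."""
    r = dict(prev.results)
    return f"V{r['vlan_layers']}-P{int(r['pppoe'])}-IP{r['ip_version']}-{r['l4']}"


# One VLAN tag, IPv4 (IHL = 5), UDP.
frame = (bytes(6) + bytes(6) + b"\x81\x00\x00\x64" + b"\x08\x00"
         + b"\x45" + bytes(8) + b"\x11" + bytes(10) + bytes(8))
s1 = stage1(frame)
s2 = stage2(frame, s1)
s3 = stage3(frame, s2)
s4 = stage4(frame, s3)
code = stage5(s4)  # "V1-P0-IP4-UDP"
```

  • Each stage reads only from the offset handed to it, so no stage re-parses bytes that an earlier stage has already identified.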
  • the method further includes:
  • Step S111 After receiving the data packet, storing the data packet in a first-in first-out queue;
  • the step S120 may include: extracting a packet to be identified from the first in first out queue, and extracting a to-be-identified field of the to-be-identified packet.
  • the first in first out queue may also be referred to as a FIFO queue, and the FIFO is an abbreviation of First Input First Output.
  • the first-in first-out queue has the following characteristics: the data packet that enters the queue first is first taken out. In this way, the sequence of extracting data packets in the step S120 is the same as the order in which the data packets enter the FIFO queue. In this way, the phenomenon that a certain data message is not recognized for a long time can be avoided.
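  • A minimal Python sketch of this queueing behaviour, using `collections.deque`; the function names `receive` and `next_packet` are hypothetical and chosen only for this example.

```python
from collections import deque
from typing import Deque, Optional

fifo: Deque[bytes] = deque()


def receive(packet: bytes) -> None:
    """Step S111: store a received data packet at the tail of the queue."""
    fifo.append(packet)


def next_packet() -> Optional[bytes]:
    """Step S120: take the oldest packet from the head, or None if empty."""
    return fifo.popleft() if fifo else None


for p in (b"first", b"second", b"third"):
    receive(p)
order = [next_packet(), next_packet(), next_packet()]
```

  • Packets leave in arrival order, so no packet waits indefinitely behind later arrivals.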
  • the step S120 may include:
  • the first N bytes of the data packet are extracted as the field to be identified; and the N is equal to the number of bytes corresponding to the specified length.
  • the specified length may be previously configured in the identification device, or may be determined based on a communication protocol.
  • the specified length may be 256 bytes.
  • The length of a data packet is generally between 64 bytes and 1518 bytes, and the header of a data packet usually does not exceed 256 bytes.
  • N can therefore be set to 256 in this embodiment.
  • When the packet length of the data packet exceeds 256 bytes, the first 256 bytes of the data packet are extracted.
  • This is equivalent to extracting the header of the data packet. Because the header generally carries the various fields used to identify the packet, the identification device can complete identification of the data packet easily and quickly.
  • step S120 further includes: when the length of the data packet is not greater than the specified length, the data packet is regarded as the field to be identified.
  • For example, if N equals 256 and the packet length of a data packet is less than 256 bytes, the entire data packet can be regarded as the field to be identified, which necessarily includes the header of the data packet, that is, the information within the specified length.
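  • The two cases of step S120 can be sketched as a single Python function. The name `extract_field` and the constant `SPECIFIED_LENGTH` are hypothetical; the value 256 follows the example in the text.

```python
SPECIFIED_LENGTH = 256  # N, as suggested in this embodiment


def extract_field(packet: bytes, n: int = SPECIFIED_LENGTH) -> bytes:
    """Return the field to be identified for one data packet."""
    if len(packet) > n:
        return packet[:n]   # long packet: keep only the first N bytes
    return packet           # short packet: the whole packet is the field


long_field = extract_field(bytes(1518))
short_field = extract_field(bytes(64))
```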
  • The method of extracting the field to be identified may further include: determining the packet length of the data packet; determining the header length of the data packet by querying, according to the packet length, a mapping relationship between packet length and header length; and extracting the header of the data packet according to the header length.
  • the method in this embodiment further includes:
  • Step S140 Acquire a custom identification rule.
  • Step S150 Identify the data packet according to the custom identification rule to form a recognition result
  • Step S160 Determine, according to the identification result, whether the data packet is a custom message that satisfies the custom rule.
  • the added technical solution is mainly used to identify custom messages.
  • the custom recognition rule will be extracted in step S140.
  • the message is identified in accordance with the identification rule in step S150.
  • the step S150 may specifically include extracting a field specified in the custom identification rule in the data packet, and matching the extracted field with a preset field to form a recognition result. If the result of the recognition indicates that the extracted field matches the preset field in step S160, it can be confirmed that the data packet is a custom message that satisfies the custom rule, otherwise it is not the custom message.
  • The step S140 may include: obtaining a mask, a matching manner, and a start position for data extraction from the custom identification rule.
  • the step S150 may include:
  • Step S151 starting from the start position of the extracted data, performing M1 data extraction and extracting M2 bytes each time; the M1 and the M2 are integers not less than 1;
  • Step S152 Perform an operation on the extracted byte and the mask to obtain data to be matched
  • Step S153 Match the data to be matched according to the matching manner with the matching data table to form the identification result
  • The step S160 may include: when the identification result indicates that the data to be matched matches the data in the matching data table, determining that the data packet is a custom message that satisfies the custom rule.
  • Starting from the start position for data extraction, M1 data extractions are performed and M2 bytes are extracted each time; the values of both M1 and M2 may be defined in the custom identification rule.
  • For example, if M1 equals 10 and M2 equals 2, 20 bytes are extracted in step S151.
  • The 20 bytes are ordered according to the custom identification rule to form the field string to be matched; for example, the 20 bytes are arranged in the order in which they appear in the data packet.
  • a mask is introduced in this embodiment.
  • Operating on the extracted bytes with the mask may include an AND operation. For example, the mask sets the bits that correspond to the data to be matched to 1 and the bits that do not correspond to the data to be matched to 0.
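  • Steps S151 and S152 might look like the following Python sketch. It assumes that the M1 extractions read consecutive M2-byte chunks and that the mask has the same length as the extracted data; the source states neither explicitly. The function name `extract_and_mask` is hypothetical.

```python
def extract_and_mask(packet: bytes, start: int, mask: bytes,
                     m1: int = 10, m2: int = 2) -> bytes:
    """Extract m1 chunks of m2 bytes from `start`, then AND them with the mask."""
    # Assumption: the chunks are consecutive and kept in packet order.
    chunks = [packet[start + i * m2: start + (i + 1) * m2] for i in range(m1)]
    data = b"".join(chunks)
    if len(data) != m1 * m2:
        raise ValueError("packet too short for the requested extraction")
    if len(mask) != len(data):
        raise ValueError("mask length must equal m1 * m2")
    # Bits set to 1 in the mask are kept; bits set to 0 are cleared.
    return bytes(b & m for b, m in zip(data, mask))


packet = bytes(range(64))
mask = b"\xff" * 4 + b"\x00" * 16   # compare only the first 4 extracted bytes
to_match = extract_and_mask(packet, start=12, mask=mask)
```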
  • step S140 may further include
  • the step S150 further includes:
  • the step S151 may specifically include:
  • M1 data extraction is performed from the start position of the extracted data and M2 bytes are extracted each time.
  • the method further includes: extracting the valid enable bit of the custom identification rule.
  • If the content of the valid enable bit indicates that the custom rule is valid, the identification device needs to determine whether the data packet is a custom message. If it indicates that the custom rule is invalid, the identification device does not need to make that determination, so step S151 need not be executed and the current identification can be stopped.
  • By setting the valid enable bit, it is possible to control whether the identification device subsequently determines whether a packet is a custom message.
  • This embodiment thus provides a method that can easily identify custom messages.
  • A custom message can be a non-standard message; the method satisfies the user's need to identify custom messages and improves the intelligence of identification and user satisfaction.
  • The rule table is used to store the custom identification rules.
  • The rule table entry of each custom identification rule occupies 261 storage locations; with a total of eight custom identification rules, the rule table occupies 8*261 storage locations.
  • the first step is to provide the first 256 bytes of the header of the data message when the data message is stored.
  • data extraction is started from the offset start position, and 2 bytes are extracted each time, and a total of 10 times are extracted, and a total of 20 bytes are extracted.
  • the extracted 20 bytes are operated with the mask.
  • the configured data is read from the data table. Match the calculated data with the configured data in a matching manner.
  • the capacity of the storage space occupied by the data table is 8*160 storage locations.
  • the data table for each custom identification rule occupies 160 storage locations.
  • the matching methods include equal to, greater than and less than.
  • For example, when the matching field is an address, the equal-to manner can be used to determine whether the destination address equals the destination address in the configuration table.
  • When the matching field is a port, such as port number 80 or 1000, the greater-than or less-than manner may be adopted, in which case the port number is compared by value with the port number in the data table.
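  • The three matching manners can be sketched in Python as a small dispatch table. Treating the compared fields as big-endian unsigned integers is an assumption of this example, as are the names `MATCHERS` and `matches`.

```python
import operator

# Hypothetical names for the three matching manners described in the text.
MATCHERS = {"eq": operator.eq, "gt": operator.gt, "lt": operator.lt}


def matches(candidate: bytes, configured: bytes, manner: str) -> bool:
    """Compare masked packet data with configured data by numeric value."""
    a = int.from_bytes(candidate, "big")
    b = int.from_bytes(configured, "big")
    return MATCHERS[manner](a, b)


port_80 = (80).to_bytes(2, "big")
port_1000 = (1000).to_bytes(2, "big")
same_port = matches(port_80, port_80, "eq")
above_80 = matches(port_1000, port_80, "gt")
```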
  • Fifth step: when the configured data of several custom rules can be matched, the highest-ranked custom rule is preferentially selected as the match.
  • The highest-ranked rule may be the one stored earliest in the storage locations of the identification device, or the custom rule with the highest priority.
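  • One way to picture this selection, together with the rule valid enable bit, is the Python sketch below. It assumes rules are checked in storage order and that the first enabled rule that matches wins; the `Rule` class and `select_rule` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Rule:
    name: str
    enabled: bool                       # rule valid enable bit
    predicate: Callable[[bytes], bool]  # stands in for mask + match steps


def select_rule(packet: bytes, rules: Sequence[Rule]) -> Optional[Rule]:
    """Return the first enabled rule, in storage order, that matches."""
    for rule in rules:
        if not rule.enabled:
            continue                    # invalid rule: skip identification
        if rule.predicate(packet):
            return rule
    return None


rules = [
    Rule("disabled", False, lambda p: True),
    Rule("starts_with_ab", True, lambda p: p.startswith(b"\xab")),
    Rule("any", True, lambda p: True),
]
chosen = select_rule(b"\xab\xcd", rules)
fallback = select_rule(b"\x00\x01", rules)
```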
  • The packet identification method of this embodiment may be used together with the packet identification method provided in the second method embodiment; any technical solution of the second method embodiment may be combined with any technical solution of this embodiment.
  • Steps S140 to S160 in this embodiment have no fixed order relative to steps S110 to S130.
  • the step S140 and the step S110 may be started synchronously.
  • the step S140 may also be started after the step S130.
  • the step S110 may also be performed after the step S140.
  • the embodiment provides a packet identification method, including:
  • Step A: Receive a data packet and extract its header for analysis, for example 256 bytes; go to step B.
  • Step B: Store the header and start the first-stage pipeline analysis, which parses the key fields DA, SA, and the number of VLAN layers, outputs the position offset of the header bytes already identified, and passes the header to the next stage. This stage ends; go to step C.
  • Step C: Start the second-stage pipeline analysis. This stage begins identification at the offset position supplied by the first stage, parses the data encapsulation protocol format and the Ethernet type field, outputs the position offset of the header bytes already identified, and passes the header to the next stage. This stage ends; go to step D.
  • When the packet under analysis enters step C, the next data packet can already be received and a new first-stage analysis started in step B. A new data packet can thus be accepted at every step, which is how the pipeline identifies data packets at high speed.
  • Step D: Start the third-stage pipeline analysis. This stage begins identification at the offset position supplied by the second stage, parses the PPPoE header and the IP header type, outputs the position offset of the header bytes already identified, and passes the header to the next stage. This stage ends; go to step E.
  • Step E: Start the fourth-stage pipeline analysis. This stage begins identification at the offset position supplied by the third stage, parses the DPORT field, the SPORT field, and the TCP or UDP header, and passes the header to the next stage. This stage ends; at this point the specific protocol type of the data packet has been identified.
  • Step F: Start the fifth-stage pipeline analysis, obtain the specific protocol packet type given by the preceding stages, output the data packet code, and go to step J.
  • After step F, it is also determined whether the custom identification mode is enabled. If it is, the flow continues to step G; if not, the custom identification flow ends.
  • Step G: Obtain the 8 custom rule configuration tables one by one and, according to the user-configured extraction offset, take the corresponding 20 bytes from the 256-byte header; go to step H.
  • Step H: Obtain the 8 matching data tables one by one; each rule configuration table corresponds to exactly one matching data table. Operate on the 20 bytes extracted from the header with the mask value from the rule configuration table, compare the result with the data in the matching data table according to the matching manner in the rule configuration table, and go to step I.
  • Step I: When all 20 bytes match, the custom identification rule has matched and the data packet is output with the custom packet code. If several of the eight custom rules hit, the earlier one is preferred. If any custom rule matches, go to step K; otherwise the custom identification flow ends.
  • Step K: Determine whether the input data packet must match both the standard communication-protocol format and the custom format. If so, go to step J; if not, go to step L.
  • Step J: If the custom rule matching of step I succeeded, so that both the data packet code of step F and that of step I are available, output the code of step F and/or step I according to the pre-configuration.
  • Step L: Select one data packet code to output.
  • the embodiment provides a message identification apparatus, where the apparatus includes:
  • the receiving unit 110 is configured to receive a data packet
  • the extracting unit 120 is configured to extract a field to be identified in the data packet
  • the first identifying unit 130 is configured to analyze the field to be identified by using at least two stages of pipelines.
  • the packet identification apparatus in this embodiment may be applied to a network node capable of receiving or forwarding data packets.
  • the receiving unit 110 can include various types of receiving interfaces, such as fiber optic cable interfaces or cable interfaces.
  • The physical structures of the extracting unit 120 and the first identifying unit 130 may both correspond to a processor or a processing circuit.
  • the processor can include an application processor, a central processing unit, a microprocessor, or a digital signal processor.
  • the processing circuit can include an application specific integrated circuit.
  • the extracting unit 120 and the first identifying unit 130 may be integrated to correspond to the same processor or processing circuit, or may respectively correspond to different processors or processing circuits.
  • the central processing unit, microprocessor, or similar structure can implement the functions of the extracting unit 120 and the first identifying unit 130 by executing executable code.
  • the message identification device in this embodiment uses a multi-stage pipeline to identify data messages; the stages of one pipeline can work on different data packets at the same time, which improves the overall efficiency and response speed of message identification.
  • the pipeline identifying the data message may have at least 2 stages, for example 3, 4, 5, or 6 stages, or even more than 6.
  • the specific structure of a first identifying unit 130 that uses 5-stage pipeline identification is described in detail below.
  • the first identifying unit 130 is configured to: use the first-stage pipeline to analyze the destination address DA, the source address SA, and the number of virtual local area network (VLAN) layers of the packet; use the second-stage pipeline to analyze the encapsulation format and Ethernet type of the packet; use the third-stage pipeline to analyze whether the packet carries a Point-to-Point Protocol over Ethernet (PPPoE) encapsulation header and an Internet Protocol (IP) header; use the fourth-stage pipeline to analyze the protocol type the packet follows; and use the fifth-stage pipeline to output the data message code according to the analysis results of the first to fourth stages.
  • When a multi-stage pipeline is used, the first identifying unit 130 is specifically configured so that, after the n-th stage has finished, it outputs the position offset of the bytes already identified together with the field to be identified; the (n+1)-th stage receives both and starts identifying the field at the offset position given by the n-th stage. Here n is an integer not less than 1, and the n-th stage is the stage immediately preceding the (n+1)-th stage.
  • Data transfer between two adjacent stages therefore follows a fixed hand-over order: the stages identify the field to be identified in sequence, and each stage receives the identification result of the stage before it.
  • the device also includes:
  • a storage unit configured to store the data message in a first-in first-out queue after receiving the data message
  • the extracting unit 120 is configured to take out the to-be-identified packet from the first-in-first-out queue, and extract the to-be-identified field of the to-be-identified packet.
  • the storage unit in this embodiment may include various types of storage media capable of storing the data message; in this embodiment the storage medium uses a FIFO queue so that data messages are identified in order, avoiding the low user satisfaction that results when individual data messages stay backlogged without being identified.
  • the extracting unit 120 may be an information reading structure that reads the corresponding data from a specified location of the data packet.
  • the extracting unit 120 is specifically configured to determine the packet length of the data packet and, when that length is greater than the specified length, to extract the first N bytes of the data packet as the field to be identified, N being the number of bytes corresponding to the specified length.
  • the extracting unit 120 is further configured to treat the data packet as the field to be identified when the length of the data packet is not greater than the specified length.
  • the extracting unit 120 in this embodiment extracts the bytes to be identified from the data packet quickly and simply, so the subsequent pipeline stages only need to process those bytes.
  • the device further includes:
  • the obtaining unit 140 is configured to obtain a custom identification rule.
  • the second identifying unit 150 is configured to identify the data packet according to the custom identification rule to form the identification result
  • the determining unit 160 is configured to determine, according to the recognition result, whether the data packet is a custom message that satisfies the custom rule.
  • the obtaining unit 140 in this embodiment may include a processor or a processing circuit that reads the custom rule from the storage space of the device in which the custom identification rule is stored.
  • the obtaining unit 140 may also include a communication interface that receives the custom identification rule from a peripheral device.
  • the specific structures of the second identifying unit 150 and the determining unit 160 may each include a processor or a processing circuit.
  • the structure of the processor or the processing circuit may be referred to the foregoing part, and is not repeated here.
  • In short, through the obtaining unit 140, the second identifying unit 150, and the determining unit 160, the device in this embodiment can also identify whether a data packet is a custom packet. This avoids the situation, in some application scenarios, where irregular custom messages cannot be recognized, and broadens the range of messages the device can identify.
  • the obtaining unit 140 is configured to acquire, from the custom identification rule, a mask, a matching manner, and a starting position for data extraction;
  • the second identifying unit 150 is configured to perform M1 data extractions of M2 bytes each, starting from the starting position for data extraction, where M1 and M2 are integers not less than 1; to operate on the extracted bytes with the mask to obtain the data to be matched; and to match the data to be matched against the matching data table according to the matching manner to form a recognition result;
  • the determining unit 160 is specifically configured to determine, when the recognition result indicates that the data to be matched matches the data in the matching data table, that the data packet is a custom message satisfying the custom rule.
  • in this embodiment the obtaining unit 140 is mainly configured to read or receive the mask, the matching manner, and the starting position for data extraction.
  • the second identifying unit 150 may include a data reading structure, a logical computing unit, and a comparison structure. The data reading structure extracts the M1*M2 bytes, and the logical computing unit operates on the extracted bytes with the mask.
  • the comparison structure may include a comparator, a comparison circuit, or a processor with a comparison function; it compares the data to be matched with the data in the matching data table to form the recognition result.
  • the determining unit 160 may be the processor or the processing circuit, and determine, according to the identification result, whether the currently identified data message is the customized message that satisfies the custom rule. In this way, the identification of the custom message is easily realized, in particular, the identification of the custom message that is not defined according to the existing communication protocol, and the range of the data message that the identification device can recognize is broadened.
  • the message identification apparatus may include only the acquisition unit 140, the second identification unit 150, and the determination unit 160.
  • the specific structures of the obtaining unit 140, the second identifying unit 150, and the determining unit 160 can be found in the foregoing part and are not repeated here.
  • An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions for executing the message identification method provided by any of the foregoing method embodiments, for example the method of any of FIG. 1, FIG. 4, FIG. 5, FIG. 6, FIG. 7, and FIG. 8.
  • the computer storage medium may be any of various media that can store program code, such as a mobile storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division into units is only a division by logical function; in an actual implementation there may be other divisions, for example: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
  • the units described above as separate components may or may not be physically separated, and the components displayed as the unit may or may not be physical units, that is, may be located in one place or distributed to multiple network units; Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in the embodiments of the present invention may be integrated into one processing module, or each unit may serve separately as one unit, or two or more units may be integrated into one unit; the integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Communication Control (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

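Embodiments of the present invention disclose a message identification method and apparatus. The method includes: receiving a data message; extracting the field to be identified from the data message; and analyzing the field to be identified using a pipeline of at least two stages. Embodiments of the present invention further disclose a computer storage medium.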

Description

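Message identification method, apparatus, and computer storage medium
Technical Field
The present invention relates to the field of communications, and in particular to a message identification method, an apparatus, and a computer storage medium.
Background
As household demand for data bandwidth grows (IPTV, live streaming, high-definition video, wireless smart-device access, and so on), terminal access devices with 10G of bandwidth in each of the uplink and downlink directions are gradually being developed. Against this background of heavy traffic and high bandwidth, identifying data messages at high speed becomes especially important. Ethernet is currently the most widely used LAN technology, and the most common Ethernet frame formats are Ethernet II, Ethernet 802.3 SAP, and Ethernet 802.3 SNAP. These three formats encapsulate many specific protocol message types. To identify the specific type of a data message, the packet length must be considered and the packet content analyzed to extract key fields. Identifying packet content quickly and accurately has therefore become an important technique for raising network transmission rates, subdividing bandwidth services, and meeting the broadband needs of different users. At present, data messages are mainly identified serially: identification of the next message starts only after the previous message has been completely analyzed, so identification efficiency is low and cannot satisfy high-speed broadband services.
In addition, in some cases certain messages cannot be identified at all during message identification.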
发明内容
有鉴于此,本发明实施例期望提供的报文识别方法、装置和计算机存储介质,能够至少部分解决报文识别效率低的问题或部分解决某些报文无法被识别的问题。
本发明实施例的技术方案是这样实现的:
本发明实施例第一方面提供一种报文识别方法,所述方法包括:
接收数据报文;
提取数据报文中待识别的字段;
采用至少2级流水线分析所述待识别的字段。
基于上述方案,所述采用至少2级流水线识别所述待识别的字段,包括:
采用第1级流水线分析所述报文的目的地址DA、源地址SA及虚拟局域网VLAN层数;
采用第2级流水线分析所述报文的封装格式以太网类型;
采用第3级流水线分析所述报文是否有携带通过以太网传输点对点协议PPPoE封装格式头及网络协议IP头;
采用第4级流水线分析所述报文的遵循的协议类型;
采用第5级流水线根据所述第1级流水线至第4级流水线的分析结果,输出数据报文编码。
基于上述方案,所述采用至少2级流水线分析所述待识别的字段,包括:
第n级流水线识别后,输出已识别字节数的位置偏移量和所述待识别的字段;
第n+1级流水线接收所述位置偏移量和所述待识别的字段,从所述第n级流水线输出的已识别字节数的位置偏移量对应的偏移位置开始识别所述待识别的字段;
所述n为不小于1的整数;所述第n级流水线是所述第n+1级流水线的前一级流水线。
基于上述方案,所述方法还包括:
在接收所述数据报文之后,将所述数据报文存储先入先出队列;
所述提取数据报文中待识别的字段,包括:
从所述先入先出队列中取出待识别的报文,并提取所述待识别报文的待识别字段。
基于上述方案,所述提取数据报文中待识别的字段,包括:
判断所述数据报文的报文长度;
当所述数据报文长度大于指定长度时,提取所述数据报文前N个字节作为所述待识别的字段;所述N等于所述指定长度对应的字节数。
基于上述方案,所述提取数据报文中待识别的字段,还包括:
当所述数据报文的长度不大于所述指定长度时,将所述数据报文整个视为所述待识别的字段。
基于上述方案,所述方法还包括:
获取自定义识别规则;
根据所述自定义识别规则识别所述数据报文,形成识别结果;
根据所述识别结果,判断所述数据报文是否为满足所述自定义规则的自定义报文。
基于上述方案,所述获取自定义识别规则,包括:
从所述自定义识别规则获取掩码、匹配方式提取数据的起始位置;
所述根据所述自定义识别规则识别所述报文,形成识别结果,包括:
从所述提取数据的起始位置开始,进行M1次数据提取且每次提取M2个字节;所述M1和所述M2均为不小于1的整数;
将提取的字节与所述掩码进行运算,获得待匹配的数据;
将所述待匹配的数据按照所述匹配方式与匹配数据表进行匹配,形成识别结果;
所述根据所述识别结果,判断所述数据报文是否为满足所述自定义规则的自定义报文,包括:
当所述识别结果表明所述待匹配的数据与所述匹配数据表中的数据相匹配时,确定所述数据报文为满足所述自定义规则的自定义报文。
基于上述方案,所述获取自定义识别规则,还包括,
从所述自定义识别规则中获取规则有效使能位;
所述根据所述自定义识别规则识别所述报文,形成识别结果,还包括:
根据所述规则有效使能位的信息,确定所述自定义识别规则是否有效;
所述从所述提取数据的起始位置开始,进行M1次数据提取且每次提取M2个字节,包括:
当判断出所述自定义识别规则有效时,从所述提取数据的起始位置开始,进行M1次数据提取且每次提取M2个字节。
本发明实施例第二方面提供另一种报文识别方法,所述方法还包括:
获取自定义识别规则;
根据所述自定义识别规则识别数据报文,形成识别结果;
根据所述识别结果,判断所述数据报文是否为满足所述自定义规则的自定义报文。
基于上述方案,所述获取自定义识别规则,包括:
从所述自定义识别规则获取掩码、匹配方式提取数据的起始位置;
所述根据所述自定义识别规则识别所述报文,形成识别结果,包括:
从所述提取数据的起始位置开始,进行M1次数据提取且每次提取M2个字节;所述M1和所述M2均为不小于1的整数;
将提取的字节与所述掩码进行运算,获得待匹配的数据;
将所述待匹配的数据按照所述匹配方式与匹配数据表进行匹配,形成识别结果;
所述根据所述识别结果,判断所述数据报文是否为满足所述自定义规则的自定义报文,包括:
当所述识别结果表明所述待匹配的数据与所述匹配数据表中的数据相匹配时,确定所述数据报文为满足所述自定义规则的自定义报文。
基于上述方案,所述获取自定义识别规则,还包括,
从所述自定义识别规则中获取规则有效使能位;
所述根据所述自定义识别规则识别所述报文,形成识别结果,还包括:
根据所述规则有效使能位的内容,确定所述自定义识别规则是否有效;
所述从所述提取数据的起始位置开始,进行M1次数据提取且每次提取M2个字节,包括:
当判断出所述自定义识别规则有效时,从所述提取数据的起始位置开始,进行M1次数据提取且每次提取M2个字节。
本发明实施例第三方面提供一种报文识别装置,所述装置包括:
接收单元,配置为接收数据报文;
提取单元,配置为提取数据报文中待识别的字段;
第一识别单元,配置为采用至少2级流水线分析所述待识别的字段。
基于上述方案,所述第一识别单元,配置为采用第1级流水线分析所述报文的目的地址DA、源地址SA及虚拟局域网VLAN层数;采用第2级流水线分析所述报文的封装格式以太网类型;采用第3级流水线分析所述报文是否有携带通过以太网传输点对点协议PPPoE封装格式头及网络协议IP头;采用第4级流水线分析所述报文的遵循的协议类型;采用第5级流水线根据所述第1级流水线至第4级流水线的分析结果,输出数据报文编码。
基于上述方案,所述第一识别单元,配置为第n级流水线识别后,输出已识别字节数的位置偏移量和所述待识别的字段;及第n+1级流水线接收所述位置偏移量和所述待识别的字段,从所述第n级流水线输出的已识别字节数的位置偏移量对应的偏移位置开始识别所述待识别的字段;
所述n为不小于1的整数;所述第n级流水线是所述第n+1级流水线的前一级流水线。
基于上述方案,所述装置还包括:
存储单元,配置为在接收所述数据报文之后,将所述数据报文存储先入先出队列;
所述提取单元,配置为从所述先入先出队列中取出待识别的报文,并提取所述待识别报文的待识别字段。
基于上述方案,所述提取单元,配置为判断所述数据报文的报文长度;当所述数据报文长度大于指定长度时,提取所述数据报文前N个字节作为所述待识别的字段;所述N等于所述指定长度对应的字节数。
基于上述方案,所述提取单元,还配置为当所述数据报文的长度不大于所述指定长度时,将所述数据报文整个视为所述待识别的字段。
基于上述方案,所述装置还包括:
获取单元,配置为获取自定义识别规则;
第二识别单元,配置为根据所述自定义识别规则识别所述数据报文,形成识别结果;
判断单元,配置为根据所述识别结果,判断所述数据报文是否为满足所述自定义规则的自定义报文。
基于上述方案,所述获取单元,配置为从所述自定义识别规则获取掩码、匹配方式提取数据的起始位置;
所述第二识别单元,配置为从所述提取数据的起始位置开始,进行M1次数据提取且每次提取M2个字节;所述M1和所述M2均为不小于1的整数;将提取的字节与所述掩码进行运算,获得待匹配的数据;将所述待匹配的数据按照所述匹配方式与匹配数据表进行匹配,形成识别结果;
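The determining unit is configured to determine, when the identification result indicates that the data to be matched matches the data in the matching data table, that the data message is a custom message satisfying the custom rule.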
基于上述方案,所述获取单元,还配置为从所述自定义识别规则中获取规则有效使能位;
所述第二识别单元,还配置为根据所述规则有效使能位的信息,确定所述自定义识别规则是否有效;当判断出所述自定义识别规则有效时,从所述提取数据的起始位置开始,进行M1次数据提取且每次提取M2个字节。
本发明实施例第四方面提供一种报文识别装置,所述装置还包括:
获取单元,配置为获取自定义识别规则;
第二识别单元,配置为根据所述自定义识别规则识别数据报文,形成识别结果;
判断单元,配置为根据所述识别结果,判断所述数据报文是否为满足所述自定义规则的自定义报文。
基于上述方案,所述获取单元,具体用于从所述自定义识别规则获取掩码、匹配方式提取数据的起始位置;
所述第二识别单元,配置为从所述提取数据的起始位置开始,进行M1次数据提取且每次提取M2个字节;所述M1和所述M2均为不小于1的整数;将提取的字节与所述掩码进行运算,获得待匹配的数据;将所述待匹配的数据按照所述匹配方式与匹配数据表进行匹配,形成所述识别结果;
所述判断单元,配置为当所述识别结果表明所述待匹配的数据与所述匹配数据表中的数据相匹配时,确定所述数据报文为满足所述自定义规则的自定义报文。
基于上述方案,所述获取单元,还配置为从所述自定义识别规则中获取规则有效使能位;
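The second identifying unit is further configured to determine, according to the content of the rule-valid enable bit, whether the custom identification rule is valid and, when the rule is determined to be valid, to perform M1 data extractions of M2 bytes each from the starting position for data extraction.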
本发明实施例第五方面提供一种计算机存储介质,所述计算机存储介质中存储有计算机可执行指令,所述计算机可执行指令用于执行前述任意一个所述报文识别方法。
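The first message identification method, apparatus, and computer storage medium provided by the embodiments of the present invention identify data messages using a pipeline of at least two stages, so at least two data messages can be in the pipeline at once. When the identification device has many data messages to identify, this markedly improves the overall efficiency and response speed of identification and satisfies high-bandwidth services. The second message identification method and apparatus provided by the embodiments identify messages using custom rules and can therefore recognize custom messages, solving the problem that existing identification methods and apparatuses cannot identify custom messages, irregular custom messages in particular.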
附图说明
图1为本发明实施例提供的第一种报文识别方法的流程示意图;
图2为本发明实施例提供的一种数据报文的结构示意图;
图3为本发明实施例提供的一种5级流水线的流水线结构示意图;
图4为本发明实施例提供的第二种报文识别方法的流程示意图;
图5为本发明实施例提供的第三种报文识别方法的流程示意图;
图6为本发明实施例提供的根据自定义识别规则形成识别结果的示意图;
图7为本发明实施例提供的第四种报文识别方法的流程示意图;
图8为本发明实施例提供的第五种报文识别方法的流程示意图;
图9为本发明实施例提供的第一种报文识别装置的结构示意图;
图10为本发明实施例提供的第二种报文识别装置的结构示意图;
图11为本发明实施例提供的第三种报文识别装置的结构示意图。
具体实施方式
以下结合说明书附图及具体实施例对本发明的技术方案做进一步的详细阐述,应当理解,以下所说明的优选实施例仅用于说明和解释本发明,并不用于限定本发明。
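Method Embodiment 1:
As shown in FIG. 1, this embodiment provides a message identification method, the method including:
Step S110: receiving a data message;
Step S120: extracting the field to be identified from the data message;
Step S130: analyzing the field to be identified using a pipeline of at least two stages.
The message identification method of this embodiment is typically applied at the receiving end of data messages. Step S120 extracts the field to be identified from the data message; this field usually includes at least some bytes of the message header.
In step S130, a pipeline of at least two stages identifies the field, so at least two messages can be under identification in one pipeline at the same time. Compared with the prior art, in which a single processing module identifies only one data message at a time, this greatly raises the identification rate and reduces the overall delay of identifying multiple messages.
The pipeline may have 2, 3, 4, 5, or 6 stages, or even more than 6. The number of stages can be chosen according to identification requirements. A 5-stage pipeline identification method is given below:
Step S130 may include:
using the first-stage pipeline to analyze the destination address DA, the source address SA, and the number of virtual local area network (VLAN) layers of the message;
using the second-stage pipeline to analyze the encapsulation format and the Ethernet type of the message;
using the third-stage pipeline to analyze whether the message carries a Point-to-Point Protocol over Ethernet (PPPoE) encapsulation header and an Internet Protocol (IP) header;
using the fourth-stage pipeline to analyze the protocol type the message follows;
using the fifth-stage pipeline to output the data message code according to the analysis results of the first to fourth stages.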
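The first-stage pipeline analyzes the field to be identified from its first byte through the bytes corresponding to the number of VLAN layers. The bytes corresponding to the number of VLAN layers may be the VLAN tag (TAG).
The second-stage pipeline starts at the first byte after the bytes corresponding to the number of VLAN layers and continues through the Ethernet type field. The Ethernet type field indicates what type of Ethernet the data message comes from. The fields identified by the second stage may also include the encapsulation type field. The encapsulation type may be Ethernet 802.3 SAP, Ethernet 802.2 LLC, Ethernet 802.3 SNAP, or Ethernet II. All of these are standard Ethernet frame formats; their characteristics are described in the prior art and are not repeated here.
The fields analyzed by the third-stage pipeline may include the PPPoE header and the IP header. If the field to be identified contains a PPPoE header, the data message carries a PPPoE header. Analyzing the IP header includes analyzing the IP type, the IP type combination, and whether there is an IP extension header. For example, the IP type may be IPv4 or IPv6; the IP type combination may be IPv4 followed by IPv6 or IPv6 followed by IPv4; and the IP extension header may be an extension header carried by IPv6.
The fourth-stage pipeline analyzes the specific protocol type the data message follows, for example whether it is a TCP message following the Transmission Control Protocol (TCP) or a UDP message following the User Datagram Protocol (UDP). The fields identified by the fourth stage may also include the destination port (DPORT) field and/or the source port (SPORT) field.
The fifth-stage pipeline analyzes any remaining fields not yet analyzed and can output the data message code according to the identification results of stages 1 to 4. The data message code is the field string or code string formed by combining the identification results of each stage. Matching the data message code against preconfigured identification codes then determines which kind of data message it is, which completes identification. In a specific implementation, the fifth stage can also be used to match the action configured for the data message. The configured action may include operations such as forwarding or receiving and storing; once identification is complete, the configured action is executed.
The above is the standard 5-stage identification scheme. If the pipeline in use has 3 stages, adjacent groups of 2 or 3 stages of the 5-stage pipeline can be merged to obtain 3-stage identification. In the 5-stage scheme of this embodiment, each stage takes a comparable amount of time, so a single slow stage rarely becomes the identification bottleneck; the scheme is simple to implement and efficient.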
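The combine-and-look-up role of the fifth stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoding used in the patent: the tuple layout, the contents of `CODE_TABLE`, and the action names are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical layout of the combined code:
# (vlan_layers, encapsulation, ip_type, l4_protocol), one item per earlier stage.
StageResults = Tuple[int, str, str, str]

# Hypothetical preconfigured table: message code -> (message type, configured action).
CODE_TABLE: Dict[StageResults, Tuple[str, str]] = {
    (0, "EthernetII", "IPv4", "TCP"): ("tcp-over-ipv4", "forward"),
    (1, "EthernetII", "IPv4", "UDP"): ("vlan-udp-over-ipv4", "store"),
}


def stage5_lookup(results: StageResults) -> Optional[Tuple[str, str]]:
    """Treat the stage 1-4 results as one code and match it against the table.

    Returns (message type, action), or None when no preconfigured code matches.
    """
    return CODE_TABLE.get(results)
```

A message whose combined code is absent from the table simply yields `None`, which a caller could treat as "unidentified".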
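FIG. 2 shows one possible composition of a data message; arranged in the order shown in FIG. 2, these elements form the message header. The TAG adjacent to SA may be a VLAN tag. In FIG. 2, LLC denotes Ethernet 802.2 LLC, SNAP denotes Ethernet 802.3 SNAP, and the TYPE adjacent to LLC and SNAP denotes the Ethernet type field. "IPv6, IPv4" in FIG. 2 denotes the combination of IPv6 followed by IPv4, and "IPv4, IPv6" denotes the combination of IPv4 followed by IPv6. "Data" in FIG. 2 denotes the body or service content of the packet.
FIG. 3 shows a 5-stage pipeline and the fields each stage processes. The first 256 bytes of the data message are fed to all stages in parallel, or passed stage by stage through the pipeline. These 256 bytes usually contain the message header.
The first stage analyzes tag TAG1 in these 256 bytes, for example the VLAN tag, and passes its result to the second stage. The second stage starts at the byte where the first stage stopped and analyzes tag TAG2 and the Ethernet type field; TAG2 may include the frame format tag. The third stage receives the analysis result from the second stage, starts at the byte where the second stage stopped, and analyzes the IP header. The fourth stage receives the result of the third stage and analyzes the communication protocol used by the data message, that is, whether it is a TCP or UDP message. The fifth stage receives the results of the first four stages from the fourth stage, ends the analysis, and outputs the data message code, which is the code formed from the analysis results.
In this embodiment, pipeline identification may proceed sequentially through the bytes to be identified. Specifically, step S130 may include: after the n-th stage has finished, it outputs the position offset of the bytes already identified together with the field to be identified; the (n+1)-th stage receives both and starts identifying the field at the offset position given by the n-th stage. Here n is an integer not less than 1, and the n-th stage is the stage immediately preceding the (n+1)-th stage. For example, if the n-th stage is the second stage, the (n+1)-th stage is the third stage.
In a specific implementation, the identification result of each stage is also passed to the next stage or to the last stage, so that results flow in order to the last stage, which forms the data message code and finally determines the type of the data message.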
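The offset hand-over between stage n and stage n+1 can be sketched in software as below. This is only an illustration of the idea: the patent describes hardware stages, the function names are hypothetical, and the VLAN tag identifiers 0x8100 and 0x88A8 are an assumption taken from common IEEE 802.1Q/802.1ad practice rather than from the text.

```python
from typing import Callable, Dict, List, Tuple

StageResult = Tuple[Dict[str, object], int]      # (analysis result, new offset)
Stage = Callable[[bytes, int], StageResult]

VLAN_TPIDS = (0x8100, 0x88A8)  # assumed VLAN tag identifiers


def stage1(header: bytes, offset: int) -> StageResult:
    """Read DA and SA, then count VLAN tags; return where the next stage starts."""
    da = header[offset:offset + 6]
    sa = header[offset + 6:offset + 12]
    offset += 12
    vlan_layers = 0
    while int.from_bytes(header[offset:offset + 2], "big") in VLAN_TPIDS:
        vlan_layers += 1
        offset += 4  # 2-byte tag identifier plus 2-byte tag control field
    return {"da": da, "sa": sa, "vlan_layers": vlan_layers}, offset


def stage2(header: bytes, offset: int) -> StageResult:
    """Read the Ethernet type field at the offset handed over by stage 1."""
    ethertype = int.from_bytes(header[offset:offset + 2], "big")
    return {"ethertype": ethertype}, offset + 2


def run_pipeline(header: bytes, stages: List[Stage]):
    """Run the stages in order, passing each stage's output offset to the next."""
    offset = 0
    results = []
    for stage in stages:
        result, offset = stage(header, offset)
        results.append(result)
    return results, offset
```

For a header with one VLAN tag followed by an IPv4 Ethernet type, `run_pipeline(header, [stage1, stage2])` reports one VLAN layer and leaves the offset at byte 18, where a third stage would begin.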
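In this embodiment, as shown in FIG. 4, the method further includes:
Step S111: after the data message is received, storing it in a first-in first-out queue;
Step S120 may include: taking the message to be identified out of the first-in first-out queue and extracting its field to be identified.
The first-in first-out queue is also called a FIFO queue, FIFO being short for First Input First Output. Its defining property is that the data message that enters the queue first is taken out first. Step S120 therefore extracts data messages in the same order in which they entered the FIFO queue, which prevents any one data message from going unidentified for a long time.
Step S120 may include:
determining the message length of the data message;
when the length of the data message is greater than a specified length, extracting the first N bytes of the data message as the field to be identified, N being the number of bytes corresponding to the specified length.
For example, the specified length may be preconfigured in the identification device or determined from the communication protocol. The specified length may be 256 bytes. A data message is typically between 64 and 1518 bytes long, and its header usually does not exceed 256 bytes, so in this embodiment N may be set to 256. When the message is longer than 256 bytes, its first 256 bytes are extracted. This amounts to extracting the header, which normally carries the fields needed for identification, so the identification device can complete identification quickly.
In addition, step S120 further includes: when the length of the data message is not greater than the specified length, treating the entire data message as the field to be identified.
For example, with N equal to 256, a data message shorter than 256 bytes is treated in its entirety as the field to be identified, which necessarily includes its header.
Extracting the bytes to be identified may alternatively include: determining the message length of the data message; looking up the header length for that message length in a mapping between message length and header length; and extracting the header of the data message based on that header length.
Both extraction methods yield, in a simple way, the field needed to identify the data message.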
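The FIFO queue of step S111 and the length check of step S120 can be sketched together. This is a minimal software analogue under stated assumptions: a `collections.deque` stands in for the hardware FIFO, and the function names are hypothetical.

```python
from collections import deque

SPECIFIED_LENGTH = 256  # N in the text; 256 is the example value it gives

fifo = deque()  # stand-in for the first-in first-out queue of step S111


def receive(packet: bytes) -> None:
    """Step S111: store a received data message at the tail of the FIFO."""
    fifo.append(packet)


def extract_field_to_identify(packet: bytes, n: int = SPECIFIED_LENGTH) -> bytes:
    """Step S120: first n bytes if the message is longer than n, else all of it."""
    if len(packet) > n:
        return packet[:n]
    return packet


def next_field_to_identify() -> bytes:
    """Take the oldest message from the FIFO and extract its field to identify."""
    return extract_field_to_identify(fifo.popleft())
```

Because messages leave the queue in arrival order, a 1518-byte message received first is truncated to 256 bytes and handed on before a 64-byte message received second, which is passed through whole.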
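Method Embodiment 2:
As shown in FIG. 5, the method of this embodiment further includes:
Step S140: obtaining a custom identification rule;
Step S150: identifying the data message according to the custom identification rule to form an identification result;
Step S160: determining, according to the identification result, whether the data message is a custom message satisfying the custom rule.
The technical solution added here is mainly used to identify custom messages.
First, step S140 obtains the custom identification rule. Step S150 then identifies the message according to the rule. Specifically, step S150 may include extracting from the data message the fields that the custom identification rule designates for extraction and matching them against preset fields to form the identification result. In step S160, if the identification result shows that all extracted fields match the preset fields, the data message is confirmed to be a custom message satisfying the custom rule; otherwise it is not.
Step S140 may include: obtaining, from the custom identification rule, a mask, a matching manner, and a starting position for data extraction.
As shown in FIG. 6, step S150 may include:
Step S151: starting from the starting position for data extraction, performing M1 data extractions of M2 bytes each, M1 and M2 being integers not less than 1;
Step S152: operating on the extracted bytes with the mask to obtain the data to be matched;
Step S153: matching the data to be matched against the matching data table according to the matching manner to form the identification result;
Step S160 may include: when the identification result indicates that the data to be matched matches the data in the matching data table, determining that the data message is a custom message satisfying the custom rule.
In this embodiment, M1 data extractions of M2 bytes each are performed from the starting position for data extraction; M1 and M2 may both be values defined in the custom identification rule. For example, with M1 equal to 10 and M2 equal to 2, step S151 extracts 20 bytes. In step S152 these 20 bytes are ordered as the custom identification rule specifies to form the string to be matched, for example in the order in which they appear in the data message. Not every byte, or every bit of every byte, needs to be matched against the matching data table. To improve matching efficiency, this embodiment introduces a mask that is combined with the bytes to be matched, for example by an AND operation: the mask bits that correspond to data to be matched are set to 1 and all other bits are set to 0, so ANDing the mask with the bytes clears every bit that need not be matched and allows fast matching afterwards.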
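Steps S151 and S152 can be sketched as below, using the example values M1 = 10 and M2 = 2 and the AND operation the text mentions. The function name is hypothetical, and reading the M1 groups from consecutive positions is an assumption; the text only says that extraction starts at the configured position.

```python
def extract_and_mask(header: bytes, start: int, mask: bytes,
                     m1: int = 10, m2: int = 2) -> bytes:
    """Extract m1 groups of m2 bytes from `start`, then AND them with the mask.

    Mask bits set to 1 keep the corresponding data bit; bits set to 0 clear it,
    so bits that need no matching never influence the later comparison.
    """
    groups = [header[start + i * m2:start + (i + 1) * m2] for i in range(m1)]
    extracted = b"".join(groups)
    if len(extracted) != m1 * m2 or len(mask) != m1 * m2:
        raise ValueError("header too short or mask length does not equal m1 * m2")
    return bytes(b & m for b, m in zip(extracted, mask))
```

With a mask of `b"\xff\x00" * 10`, for instance, only the first byte of each 2-byte group survives; the second byte of every group is cleared to zero.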
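Optionally, step S140 may further include:
obtaining a rule-valid enable bit from the custom identification rule;
Step S150 further includes:
determining, according to the content of the rule-valid enable bit, whether the custom identification rule is valid;
Step S151 may specifically include:
when the custom identification rule is determined to be valid, performing M1 data extractions of M2 bytes each from the starting position for data extraction.
This embodiment thus also reads the valid enable bit of the custom identification rule. If its content indicates that the custom rule is valid, the identification device needs to determine whether the data message is a custom message. If it indicates that the rule is invalid, no such determination is needed, so step S151 is not executed and this round of identification stops.
The identification device can therefore decide whether to check for custom messages simply by setting the valid enable bit. This embodiment provides a method that identifies custom messages simply; a custom message here may be a non-standard message. The method meets users' needs for identifying custom messages and improves the intelligence of identification and user satisfaction.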
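An example based on the method of this embodiment follows:
As shown in FIG. 7, this example uses 8 groups of registers to build a parallel index for 8 custom identification rules; the custom identification rules can be looked up through this index. Rule tables store the custom identification rules. In this example the rule table of each custom identification rule occupies 261 storage locations; with 8 custom identification rules, the rule tables occupy 8*261 storage locations.
Step 1: once the data message has been stored, the first 256 bytes of its header are provided.
Step 2: data extraction starts from the offset starting position; 2 bytes are extracted each time, 10 times in total, giving 20 bytes.
Step 3: when the rule enable bit indicates that the custom identification rule is valid, the extracted 20 bytes are operated on with the mask. The configured data is read from the data table, and the masked data is matched against the configured data according to the matching manner. In this example the data tables occupy 8*160 storage locations; the data table of each custom identification rule occupies 160 storage locations.
Step 4: when all 10 groups of masked data match the configured data, the data message has been identified successfully and the custom message code corresponding to the match is output. The matching manners include equal to, greater than, and less than. For example, when the matched field is an address, the equal-to manner can be used to determine whether the destination address is the one in the configuration table. When the matched field is a port, a greater-than or less-than manner can be used, for example against port number 80 or 1000; the port number is then treated as a numeric value and compared in magnitude with the port number in the data table.
Step 5: when the data matches the configured data of several custom rules, the rule that ranks first is preferably selected. Ranking first here means being stored at an earlier storage location in the identification device; it may alternatively mean the custom rule with the higher identification priority.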
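Steps 3 to 5 of this example (enable bit, per-group comparison, earliest-rule priority) can be sketched as follows. Several details are assumptions made for illustration: the `CustomRule` layout, one matching manner per 2-byte group, big-endian numeric comparison, and a single masked value shared by all rules (in the example each rule has its own offset and mask).

```python
import operator
from typing import List, NamedTuple, Optional, Sequence

MATCH_OPS = {"eq": operator.eq, "gt": operator.gt, "lt": operator.lt}


class CustomRule(NamedTuple):
    enabled: bool          # rule-valid enable bit
    modes: Sequence[str]   # assumed: one matching manner per group
    configured: bytes      # configured data read from the data table


def rule_hits(rule: CustomRule, masked: bytes, group_size: int = 2) -> bool:
    """A rule hits only if it is enabled and every group satisfies its manner."""
    if not rule.enabled:
        return False
    for i, mode in enumerate(rule.modes):
        lo, hi = i * group_size, (i + 1) * group_size
        value = int.from_bytes(masked[lo:hi], "big")
        target = int.from_bytes(rule.configured[lo:hi], "big")
        if not MATCH_OPS[mode](value, target):
            return False
    return True


def first_hit(rules: List[CustomRule], masked: bytes) -> Optional[int]:
    """Return the index of the earliest stored rule that hits, or None."""
    for index, rule in enumerate(rules):
        if rule_hits(rule, masked):
            return index
    return None
```

Scanning the rules in storage order and returning on the first hit is one simple way to realize the "earlier rule wins" preference; a priority field sorted beforehand would serve the alternative reading.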
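In a specific implementation, the message identification methods of Method Embodiment 1 and Method Embodiment 2 can be used together. Any technical solution of Method Embodiment 2 can be combined with any technical solution of Method Embodiment 1. When they are combined, steps S140 to S160 of this embodiment have no fixed order relative to steps S110 to S130: step S140 may start at the same time as step S110, step S140 may start after step S130 has completed, or step S110 may be executed after step S140.
A specific example combining the methods of Method Embodiment 1 and Method Embodiment 2 follows.
As shown in FIG. 8, this example provides a message identification method including:
Step A: Receive a data message and extract its header for analysis, for example 256 bytes; go to step B.
Step B: Store the header and start the first-stage pipeline analysis, which parses the key fields DA, SA, and the number of VLAN layers, outputs the position offset of the header bytes already identified, and passes the header to the next stage. This stage ends; go to step C.
Step C: Start the second-stage pipeline analysis. This stage begins identification at the offset position supplied by the first stage, parses the data encapsulation protocol format and the Ethernet type field, outputs the position offset of the header bytes already identified, and passes the header to the next stage. This stage ends; go to step D. As shown in FIG. 2, when the message under analysis enters step C, the next data message can already be received and a new first-stage analysis started in step B. A new data message can thus be accepted at every step, which is how the pipeline identifies data messages at high speed.
Step D: Start the third-stage pipeline analysis. This stage begins identification at the offset position supplied by the second stage, parses the PPPoE header and the IP header type, outputs the position offset of the header bytes already identified, and passes the header to the next stage. This stage ends; go to step E.
Step E: Start the fourth-stage pipeline analysis. This stage begins identification at the offset position supplied by the third stage, parses the DPORT field, the SPORT field, and the TCP or UDP header, and passes the header to the next stage. This stage ends; at this point the specific protocol type of the data message has been identified. It is then determined whether the input message matches both a standard communication-protocol format and a custom format; if so, the flow continues.
Step F: Start the fifth-stage pipeline analysis, obtain the specific protocol message type given by the preceding stages, output the data message code, and go to step J. After step F, it is also determined whether the custom identification mode is enabled. If it is, the flow continues to step G; if not, the custom identification flow ends.
Step G: Obtain the 8 custom rule configuration tables one by one and, according to the user-configured extraction offset, take the corresponding 20 bytes from the 256-byte header; go to step H.
Step H: Obtain the 8 matching data tables one by one; each rule configuration table corresponds to exactly one matching data table. Operate on the 20 bytes extracted from the header with the mask value from the rule configuration table, compare the result with the data in the matching data table according to the matching manner in the rule configuration table, and go to step I.
Step I: When all 20 bytes match, the custom identification rule has matched and the data message is output with the custom message code. If several of the 8 custom rules hit, the earlier one is preferred. If any custom rule matches, go to step K; otherwise the custom identification flow ends.
Step K: Determine whether the input data message must match both the standard communication-protocol format and the custom format. If so, go to step J; if not, go to step L.
Step J: If the custom rule matching of step I succeeded, so that both the data message code of step F and that of step I are available, output the code of step F and/or step I according to the pre-configuration.
Step L: Select one data message code to output.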
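The overlap noted in step C, where a new message enters the first stage while earlier messages occupy later stages, can be made concrete with a small schedule table. This sketch assumes every stage takes exactly one time step, which idealizes the text's remark that the stages take comparable time; the function name is hypothetical.

```python
from typing import List, Optional


def pipeline_schedule(num_packets: int, num_stages: int) -> List[List[Optional[int]]]:
    """Return, for each time step, which packet index occupies each stage.

    Packet k enters stage 0 at step k and moves one stage per step, so at
    step t stage s holds packet t - s (or None if no such packet exists).
    """
    schedule = []
    for step in range(num_packets + num_stages - 1):
        row = []
        for stage in range(num_stages):
            packet = step - stage
            row.append(packet if 0 <= packet < num_packets else None)
        schedule.append(row)
    return schedule
```

Under this assumption, P messages pass through S stages in P + S - 1 steps, whereas strictly serial identification would need P * S steps; for 3 messages and 5 stages that is 7 steps instead of 15.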
设备实施例:
如图9所示,本实施例提供一种报文识别装置,所述装置包括:
接收单元110,配置为接收数据报文;
提取单元120,配置为提取数据报文中待识别的字段;
第一识别单元130,用于采用至少2级流水线分析所述待识别的字段。
本实施例所述的报文识别装置可为应用于能够接收或转发数据报文网络节点中。
所述接收单元110可包括各种类型的接收接口,例如光缆接口或电缆接口等。
所述提取单元120和所述第一识别单元130的物理结构都可对应于处理器或处理电路。所述处理器可包括应用处理器、中央处理器、微处理器或数字信号处理器等结构。所述处理电路可包括专用集成电路。所述提取单元120和所述第一识别单元130可集成对应于相同的处理器或处理电路,也可以分别对应于不同的处理器或处理电路。所述中央处理器或微处理器等结构可以通过执行可执行代码实现上述提取单元和第一识别单元130的功能。
本实施例所述的报文识别装置,将采用多级流水线来识别所述数据报文,一条流水线的各级流水线可以同时对不同的数据报文进行识别,从而整体上提高了报文识别的效率及响应速度。
识别所述数据报文的流水线可包括至少2级,例如3级流水线识别、4级流水线识别、5级流水线识别或6级流水线识别,甚至6级以上的流水线识别。以下详细接收一种采用5级流水线识别的第一识别单元130的具体结构。所述第一识别单元130,具体用于采用第1级流水线分析所述报文的目的地址DA、源地址SA及虚拟局域网VLAN层数;采用第2级流水线分析所述报文的封装格式以太网类型;采用第3级流水线分析所述报文是否有携带通过以太网传输点对点协议PPPoE封装格式头及网络协议IP头;采用第4级流水线分析所述报文的遵循的协议类型;采用第5级流水线根据所述第1级流水线至第4级流水线的分析结果,输出数据报文编码。
当然,在采用多级流水线识别所述数据报文时,所述第一识别单元130,具体用于第n级流水线识别后,输出已识别字节数的位置偏移量和所述待识别的字段;及第n+1级流水线接收所述位置偏移量和所述待识别的字段,从所述第n级流水线输出的已识别字节数的位置偏移量对应的偏移位置开始识别所述待识别的字段;其中,所述n为不小于1的整数;所述第n级流水线是所述第n+1级流水线的前一级流水线。显然相邻2级流水线之间的数据传输有一定的先后承接关系;各级流水线顺序识别所述待识别的字段,下一级流水线接收上一级流水线的识别结果等。
所述装置还包括:
存储单元,配置为在接收所述数据报文之后,将所述数据报文存储先入先出队列;
所述提取单元120,配置为从所述先入先出队列中取出待识别的报文,并提取所述待识别报文的待识别字段。
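The storage unit of this embodiment may include various types of storage media capable of storing the data message. In this embodiment the storage medium stores data messages in a FIFO queue, which keeps identification in order and avoids the low user satisfaction that results when individual data messages stay backlogged without being identified.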
所述提取单元120可为从数据报文指定位置读取对应数据的信息读取结构,在本实施例中所述提取单元120,具体用于判断所述数据报文的报文长度;当所述数据报文长度大于指定长度时,提取所述数据报文前N个字节作为所述待识别的字段;所述N等于所述指定长度对应的字节数。当然,所述提取单元120,还用于当所述数据报文的长度不大于所述指定长度时,将所述数据报文整个视为所述待识别的字段。本实施例所述提取单元120可快速简便的从数据报文中提取出待识别的字节,从而方便后续各级流水线快速仅需要对待识别的字节进行处理即可。
此外,如图10所示,所述装置还包括:
获取单元140,配置为获取自定义识别规则;
第二识别单元150,配置为根据所述自定义识别规则识别数据报文,形成所述识别结果;
判断单元160,配置为根据所述识别结果,判断所述数据报文是否为满足所述自定义规则的自定义报文。
本实施例所述的获取单元140可包括处理器或处理电路,所述处理器或处理电路到所述装置的存储有所述自定义识别规则的存储空间,读取所述自定义规则。所述获取单元140也可包括通信接口,从外设接收所述自定义识别规则。
所述第二识别单元150和所述判断单元160的具体结构可均包括处理器或处理电路,所述处理器或处理电路的结构可参见前述部分,在此就不重复了。
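In short, through the obtaining unit 140, the second identifying unit 150, and the determining unit 160, the apparatus of this embodiment can also identify whether a data message is a custom message. This avoids the situation, in some application scenarios, where irregular custom messages cannot be identified, and broadens the range of messages the apparatus can identify.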
所述获取单元140,配置为从所述自定义识别规则获取掩码、匹配方式提取数据的起始位置;
所述第二识别单元150,配置为从所述提取数据的起始位置开始,进行M1次数据提取且每次提取M2个字节;所述M1和所述M2均为不小于1的整数;将提取的字节与所述掩码进行运算,获得待匹配的数据;将所述待匹配的数据按照所述匹配方式与匹配数据表进行匹配,形成识别结果;
所述判断单元160,具体用于当所述识别结果表明所述待匹配的数据与所述匹配数据表中的数据相匹配时,确定所述数据报文为满足所述自定义规则的自定义报文。
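In this embodiment the obtaining unit 140 is mainly configured to read or receive the mask, the matching manner, and the starting position for data extraction. The second identifying unit 150 may include a data reading structure, a logical computing unit, and a comparison structure. The data reading structure extracts the M1*M2 bytes, and the logical computing unit operates on the extracted bytes with the mask. The comparison structure may include a comparator, a comparison circuit, or a processor with a comparison function; it compares the data to be matched with the data in the matching data table to form the identification result.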
所述判断单元160可为所述处理器或处理电路,将根据所述识别结果确定当前被识别的数据报文是否为满足所述自定义规则的所述自定义报文。这样的话,就简便的实现了对自定义报文进行识别,尤其对一些并没有按照现有通信协议定义的自定义报文进行识别,拓宽了所述识别装置能够识别的数据报文的范围。
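As shown in FIG. 11, the message identification apparatus may also include only the obtaining unit 140, the second identifying unit 150, and the determining unit 160. For the specific structures of these units, see the foregoing description; they are not repeated here.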
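An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions for executing the message identification method provided by any of the foregoing method embodiments, for example the method of any of FIG. 1, FIG. 4, FIG. 5, FIG. 6, FIG. 7, and FIG. 8.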
所述计算机存储介质可为各种类型的存储介质,例如,移动存储设备、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。
在本申请所提供的几个实施例中,应该理解到,所揭露的设备和方法,可以通过其它的方式实现。以上所描述的设备实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,如:多个单元或组件可以结合,或可以集成到另一个系统,或一些特征可以忽略,或不执行。另外,所显示或讨论的各组成部分相互之间的耦合、或直接耦合、或通信连接可以是通过一些接口,设备或单元的间接耦合或通信连接,可以是电性的、机械的或其它形式的。
上述作为分离部件说明的单元可以是、或也可以不是物理上分开的,作为单元显示的部件可以是、或也可以不是物理单元,即可以位于一个地方,也可以分布到多个网络单元上;可以根据实际的需要选择其中的部分或全部单元来实现本实施例方案的目的。
另外,在本发明各实施例中的各功能单元可以全部集成在一个处理模块中,也可以是各单元分别单独作为一个单元,也可以两个或两个以上单元集成在一个单元中;上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。
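Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments.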
以上所述,仅为本发明的具体实施方式,但本发明的保护范围并不局限于此,凡按照本发明原理所作的修改,都应当理解为落入本发明的保护范围。

Claims (25)

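  1. A packet identification method, the method comprising:
    receiving a data packet;
    extracting a field to be identified from the data packet;
    analyzing the field to be identified using a pipeline of at least 2 stages.
  2. The method according to claim 1, wherein
    analyzing the field to be identified using a pipeline of at least 2 stages comprises:
    using a stage-1 pipeline to analyze the destination address (DA), the source address (SA) and the number of virtual local area network (VLAN) layers of the packet;
    using a stage-2 pipeline to analyze the Ethernet encapsulation format type of the packet;
    using a stage-3 pipeline to analyze whether the packet carries a Point-to-Point Protocol over Ethernet (PPPoE) encapsulation header and an Internet Protocol (IP) header;
    using a stage-4 pipeline to analyze the protocol type that the packet conforms to;
    using a stage-5 pipeline to output a data packet code according to the analysis results of the stage-1 to stage-4 pipelines.
  3. The method according to claim 1 or 2, wherein
    analyzing the field to be identified using a pipeline of at least 2 stages comprises:
    after identification by the stage-n pipeline, outputting a position offset of the number of identified bytes together with the field to be identified;
    receiving, by the stage-(n+1) pipeline, the position offset and the field to be identified, and starting to identify the field to be identified from the offset position corresponding to the position offset of the number of identified bytes output by the stage-n pipeline;
    wherein n is an integer not less than 1, and the stage-n pipeline is the stage immediately preceding the stage-(n+1) pipeline.
  4. The method according to claim 1, wherein
    the method further comprises:
    after receiving the data packet, storing the data packet in a first-in first-out queue;
    and extracting a field to be identified from the data packet comprises:
    taking a packet to be identified out of the first-in first-out queue, and extracting the field to be identified of the packet to be identified.
  5. The method according to claim 1 or 2, wherein
    extracting a field to be identified from the data packet comprises:
    determining the packet length of the data packet;
    when the length of the data packet is greater than a specified length, extracting the first N bytes of the data packet as the field to be identified, where N equals the number of bytes corresponding to the specified length.
  6. The method according to claim 5, wherein
    extracting a field to be identified from the data packet further comprises:
    when the length of the data packet is not greater than the specified length, treating the entire data packet as the field to be identified.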
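  7. The method according to claim 1 or 2, wherein
    the method further comprises:
    obtaining a custom identification rule;
    identifying the data packet according to the custom identification rule to form an identification result;
    determining, according to the identification result, whether the data packet is a custom packet satisfying the custom rule.
  8. The method according to claim 7, wherein
    obtaining a custom identification rule comprises:
    obtaining, from the custom identification rule, a mask, a matching mode and a starting position for data extraction;
    identifying the data packet according to the custom identification rule to form an identification result comprises:
    performing M1 data extractions of M2 bytes each, beginning at the starting position for data extraction, where M1 and M2 are both integers not less than 1;
    performing an operation on the extracted bytes with the mask to obtain data to be matched;
    matching the data to be matched against a matching data table according to the matching mode to form the identification result;
    and determining, according to the identification result, whether the data packet is a custom packet satisfying the custom rule comprises:
    when the identification result indicates that the data to be matched matches data in the matching data table, determining that the data packet is a custom packet satisfying the custom rule.
  9. The method according to claim 8, wherein
    obtaining a custom identification rule further comprises:
    obtaining a rule-valid enable bit from the custom identification rule;
    identifying the data packet according to the custom identification rule to form an identification result further comprises:
    determining, according to the information of the rule-valid enable bit, whether the custom identification rule is valid;
    and performing M1 data extractions of M2 bytes each, beginning at the starting position for data extraction, comprises:
    when the custom identification rule is determined to be valid, performing M1 data extractions of M2 bytes each, beginning at the starting position for data extraction.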
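  10. A packet identification method, the method comprising:
    obtaining a custom identification rule;
    identifying a data packet according to the custom identification rule to form an identification result;
    determining, according to the identification result, whether the data packet is a custom packet satisfying the custom rule.
  11. The method according to claim 10, wherein
    obtaining a custom identification rule comprises:
    obtaining, from the custom identification rule, a mask, a matching mode and a starting position for data extraction;
    identifying the data packet according to the custom identification rule to form an identification result comprises:
    performing M1 data extractions of M2 bytes each, beginning at the starting position for data extraction, where M1 and M2 are both integers not less than 1;
    performing an operation on the extracted bytes with the mask to obtain data to be matched;
    matching the data to be matched against a matching data table according to the matching mode to form the identification result;
    and determining, according to the identification result, whether the data packet is a custom packet satisfying the custom rule comprises:
    when the identification result indicates that the data to be matched matches data in the matching data table, determining that the data packet is a custom packet satisfying the custom rule.
  12. The method according to claim 11, wherein
    obtaining a custom identification rule further comprises:
    obtaining a rule-valid enable bit from the custom identification rule;
    identifying the data packet according to the custom identification rule to form an identification result further comprises:
    determining, according to the content of the rule-valid enable bit, whether the custom identification rule is valid;
    and performing M1 data extractions of M2 bytes each, beginning at the starting position for data extraction, comprises:
    when the custom identification rule is determined to be valid, performing M1 data extractions of M2 bytes each, beginning at the starting position for data extraction.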
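  13. A packet identification apparatus, wherein the apparatus comprises:
    a receiving unit configured to receive a data packet;
    an extraction unit configured to extract a field to be identified from the data packet;
    a first identification unit configured to analyze the field to be identified using a pipeline of at least 2 stages.
  14. The apparatus according to claim 13, wherein
    the first identification unit is configured to use a stage-1 pipeline to analyze the destination address (DA), the source address (SA) and the number of virtual local area network (VLAN) layers of the packet; to use a stage-2 pipeline to analyze the Ethernet encapsulation format type of the packet; to use a stage-3 pipeline to analyze whether the packet carries a Point-to-Point Protocol over Ethernet (PPPoE) encapsulation header and an Internet Protocol (IP) header; to use a stage-4 pipeline to analyze the protocol type that the packet conforms to; and to use a stage-5 pipeline to output a data packet code according to the analysis results of the stage-1 to stage-4 pipelines.
  15. The apparatus according to claim 13 or 14, wherein
    the first identification unit is configured such that, after identification, the stage-n pipeline outputs a position offset of the number of identified bytes together with the field to be identified; and the stage-(n+1) pipeline receives the position offset and the field to be identified, and starts to identify the field to be identified from the offset position corresponding to the position offset of the number of identified bytes output by the stage-n pipeline;
    wherein n is an integer not less than 1, and the stage-n pipeline is the stage immediately preceding the stage-(n+1) pipeline.
  16. The apparatus according to claim 13, wherein
    the apparatus further comprises:
    a storage unit configured to store the data packet in a first-in first-out queue after the data packet is received;
    and the extraction unit is configured to take a packet to be identified out of the first-in first-out queue and to extract the field to be identified of the packet to be identified.
  17. The apparatus according to claim 13 or 14, wherein
    the extraction unit is configured to determine the packet length of the data packet and, when the length of the data packet is greater than a specified length, to extract the first N bytes of the data packet as the field to be identified, where N equals the number of bytes corresponding to the specified length.
  18. The apparatus according to claim 17, wherein
    the extraction unit is further configured to treat the entire data packet as the field to be identified when the length of the data packet is not greater than the specified length.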
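  19. The apparatus according to claim 13 or 14, wherein
    the apparatus further comprises:
    an acquisition unit configured to obtain a custom identification rule;
    a second identification unit configured to identify the data packet according to the custom identification rule to form an identification result;
    a judging unit configured to determine, according to the identification result, whether the data packet is a custom packet satisfying the custom rule.
  20. The apparatus according to claim 19, wherein
    the acquisition unit is configured to obtain, from the custom identification rule, a mask, a matching mode and a starting position for data extraction;
    the second identification unit is configured to perform M1 data extractions of M2 bytes each, beginning at the starting position for data extraction, where M1 and M2 are both integers not less than 1; to perform an operation on the extracted bytes with the mask to obtain data to be matched; and to match the data to be matched against a matching data table according to the matching mode to form the identification result;
    the judging unit is configured to determine, when the identification result indicates that the data to be matched matches data in the matching data table, that the data packet is a custom packet satisfying the custom rule.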
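  21. The apparatus according to claim 20, wherein
    the acquisition unit is further configured to obtain a rule-valid enable bit from the custom identification rule;
    the second identification unit is further configured to determine, according to the information of the rule-valid enable bit, whether the custom identification rule is valid and, when the custom identification rule is determined to be valid, to perform M1 data extractions of M2 bytes each, beginning at the starting position for data extraction.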
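  22. A packet identification apparatus, the apparatus comprising:
    an acquisition unit configured to obtain a custom identification rule;
    a second identification unit configured to identify a data packet according to the custom identification rule to form an identification result;
    a judging unit configured to determine, according to the identification result, whether the data packet is a custom packet satisfying the custom rule.
  23. The apparatus according to claim 22, wherein
    the acquisition unit is configured to obtain, from the custom identification rule, a mask, a matching mode and a starting position for data extraction;
    the second identification unit is configured to perform M1 data extractions of M2 bytes each, beginning at the starting position for data extraction, where M1 and M2 are both integers not less than 1; to perform an operation on the extracted bytes with the mask to obtain data to be matched; and to match the data to be matched against a matching data table according to the matching mode to form the identification result;
    the judging unit is configured to determine, when the identification result indicates that the data to be matched matches data in the matching data table, that the data packet is a custom packet satisfying the custom rule.
  24. The apparatus according to claim 23, wherein
    the acquisition unit is further configured to obtain a rule-valid enable bit from the custom identification rule;
    the second identification unit is further configured to determine, according to the content of the rule-valid enable bit, whether the custom identification rule is valid and, when the custom identification rule is determined to be valid, to perform M1 data extractions of M2 bytes each, beginning at the starting position for data extraction.
  25. A computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the method according to any one of claims 1 to 2.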
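As an illustration of the staged analysis recited in claims 1 to 6, the following Python sketch passes a byte offset from each stage to the next over the extracted field. It is a software model under stated assumptions, not the claimed hardware: the stages run one after another here, whereas a hardware pipeline would work on different packets concurrently. The value of `N`, the function names, the `State` dictionary and the layout of the output code are all hypothetical; the header sizes and EtherType values are the standard Ethernet, 802.1Q, PPPoE and IP ones.

```python
from typing import Dict

N = 128                      # assumed "specified length"; the source gives no value
ETH_II, SAP, SNAP = 1, 2, 3  # hypothetical codes for the encapsulation formats
State = Dict[str, int]


def extract_field(packet: bytes, n: int = N) -> bytes:
    # First N bytes of a long packet, otherwise the whole packet.
    return packet[:n] if len(packet) > n else packet


def stage1(field: bytes, off: int, st: State) -> int:
    # Skip DA (6 bytes) and SA (6 bytes), then count VLAN tags (4 bytes each).
    off += 12
    st["vlans"] = 0
    while int.from_bytes(field[off:off + 2], "big") in (0x8100, 0x88A8):
        st["vlans"] += 1
        off += 4
    return off


def stage2(field: bytes, off: int, st: State) -> int:
    # A type/length value >= 0x0600 means Ethernet II, else 802.3 SNAP or SAP.
    tl = int.from_bytes(field[off:off + 2], "big")
    off += 2
    if tl >= 0x0600:
        st["encap"], st["ethertype"] = ETH_II, tl
    elif field[off:off + 3] == b"\xaa\xaa\x03":
        st["encap"] = SNAP
        st["ethertype"] = int.from_bytes(field[off + 6:off + 8], "big")
        off += 8             # LLC (3) + OUI (3) + type (2)
    else:
        st["encap"], st["ethertype"] = SAP, 0
        off += 3             # LLC header only
    return off


def stage3(field: bytes, off: int, st: State) -> int:
    # Detect a PPPoE session header and whether an IP header follows.
    st["pppoe"] = int(st["ethertype"] == 0x8864)
    if st["pppoe"]:
        ppp = int.from_bytes(field[off + 6:off + 8], "big")
        off += 8             # PPPoE header (6) + PPP protocol field (2)
        st["ip"] = {0x0021: 4, 0x0057: 6}.get(ppp, 0)
    else:
        st["ip"] = {0x0800: 4, 0x86DD: 6}.get(st["ethertype"], 0)
    return off


def stage4(field: bytes, off: int, st: State) -> int:
    # Protocol carried by IP: IPv4 protocol byte (+9) or IPv6 next header (+6).
    if st["ip"] == 4 and len(field) > off + 9:
        st["proto"] = field[off + 9]
        off += (field[off] & 0x0F) * 4   # IPv4 header length from the IHL field
    elif st["ip"] == 6 and len(field) > off + 6:
        st["proto"] = field[off + 6]
        off += 40                        # fixed IPv6 header length
    else:
        st["proto"] = 0
    return off


def stage5(field: bytes, off: int, st: State) -> int:
    # Pack the earlier results into one code (this bit layout is an assumption).
    st["code"] = (st["vlans"] << 24 | st["encap"] << 16 | st["pppoe"] << 12
                  | st["ip"] << 8 | st["proto"])
    return off


def classify(packet: bytes) -> State:
    field, st, off = extract_field(packet), {}, 0
    for stage in (stage1, stage2, stage3, stage4, stage5):
        off = stage(field, off, st)      # each stage starts where the last stopped
    st["offset"] = off
    return st
```

Because every stage receives the offset left by its predecessor, no stage has to re-parse bytes that an earlier stage already identified, which is the property claim 3 describes.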
PCT/CN2016/094459 2015-09-21 2016-08-10 Packet identification method, apparatus and computer storage medium WO2017050038A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510604461.4 2015-09-21
CN201510604461.4A CN106549817A (zh) 2015-09-21 2015-09-21 Packet identification method and apparatus

Publications (1)

Publication Number Publication Date
WO2017050038A1 true WO2017050038A1 (zh) 2017-03-30

Family

ID=58365419

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/094459 WO2017050038A1 (zh) 2015-09-21 2016-08-10 Packet identification method, apparatus and computer storage medium

Country Status (2)

Country Link
CN (1) CN106549817A (zh)
WO (1) WO2017050038A1 (zh)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
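CN110765195A (zh) * 2019-10-23 2020-02-07 北京锐安科技有限公司 Data parsing method and apparatus, storage medium and electronic device
CN110808915A (zh) * 2019-10-21 2020-02-18 新华三信息安全技术有限公司 Method and apparatus for identifying the application to which a data flow belongs, and data processing device
CN111143743A (zh) * 2019-12-26 2020-05-12 杭州迪普科技股份有限公司 Method and apparatus for automatically expanding an application identification library
CN111897644A (zh) * 2020-08-06 2020-11-06 成都九洲电子信息系统股份有限公司 Multi-dimension-based network data fusion and matching method
CN112491828A (zh) * 2020-11-13 2021-03-12 北京金山云网络技术有限公司 Packet analysis method and apparatus, server and storage medium
CN112688884A (zh) * 2020-12-30 2021-04-20 北京安博通科技股份有限公司 Custom application identification method, system and apparatus for encrypted traffic, and storage medium
CN113824724A (zh) * 2021-09-24 2021-12-21 山东能士信息科技有限公司 Method and apparatus for determining tampering of sensor data in a smart substation, and storage medium
CN114143385A (zh) * 2021-11-24 2022-03-04 广东电网有限责任公司 Method, apparatus, device and medium for identifying network traffic data
CN114697273A (zh) * 2022-03-29 2022-07-01 杭州安恒信息技术股份有限公司 Traffic identification method and apparatus, computer device and storage medium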

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
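CN108234455B (zh) * 2017-12-14 2021-03-19 北京东土科技股份有限公司 Packet forwarding control method and apparatus, computer apparatus and storage medium
CN109005174A (zh) * 2018-08-03 2018-12-14 京信通信系统(中国)有限公司 Data communication method and apparatus, computer storage medium and device
CN111835591B (zh) * 2020-07-10 2022-05-03 芯河半导体科技(无锡)有限公司 Method for fast protocol identification of Ethernet packets
CN112202670B (zh) * 2020-09-04 2022-08-30 烽火通信科技股份有限公司 SRv6 segment routing forwarding method and apparatus
CN116033044A (zh) * 2021-10-25 2023-04-28 中移(苏州)软件技术有限公司 Segmented packet parsing method, apparatus, device and storage medium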

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102195977A (zh) * 2011-04-13 2011-09-21 北京恒光创新科技股份有限公司 Network protocol identification method and apparatus
CN102685008A (zh) * 2012-05-07 2012-09-19 西安电子科技大学 Pipeline-based fast flow identification method and device
CN102739553A (zh) * 2012-07-20 2012-10-17 烽火通信科技股份有限公司 Apparatus for identifying and processing Ethernet data packets
CN103401777A (zh) * 2013-08-21 2013-11-20 中国人民解放军国防科学技术大学 Parallel lookup method and system for Openflow
CN104168203A (zh) * 2014-09-03 2014-11-26 上海斐讯数据通信技术有限公司 Flow-table-based processing method and system
CN104580202A (zh) * 2014-12-31 2015-04-29 曙光信息产业(北京)有限公司 Packet matching method and apparatus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1331336C (zh) * 2004-05-25 2007-08-08 华中科技大学 Fast parsing method for data packets

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
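CN110808915A (zh) * 2019-10-21 2020-02-18 新华三信息安全技术有限公司 Method and apparatus for identifying the application to which a data flow belongs, and data processing device
CN110808915B (zh) * 2019-10-21 2022-03-08 新华三信息安全技术有限公司 Method and apparatus for identifying the application to which a data flow belongs, and data processing device
CN110765195A (zh) * 2019-10-23 2020-02-07 北京锐安科技有限公司 Data parsing method and apparatus, storage medium and electronic device
CN111143743B (zh) * 2019-12-26 2023-09-26 杭州迪普科技股份有限公司 Method and apparatus for automatically expanding an application identification library
CN111143743A (zh) * 2019-12-26 2020-05-12 杭州迪普科技股份有限公司 Method and apparatus for automatically expanding an application identification library
CN111897644A (zh) * 2020-08-06 2020-11-06 成都九洲电子信息系统股份有限公司 Multi-dimension-based network data fusion and matching method
CN111897644B (zh) * 2020-08-06 2024-01-30 成都九洲电子信息系统股份有限公司 Multi-dimension-based network data fusion and matching method
CN112491828A (zh) * 2020-11-13 2021-03-12 北京金山云网络技术有限公司 Packet analysis method and apparatus, server and storage medium
CN112688884A (zh) * 2020-12-30 2021-04-20 北京安博通科技股份有限公司 Custom application identification method, system and apparatus for encrypted traffic, and storage medium
CN113824724A (zh) * 2021-09-24 2021-12-21 山东能士信息科技有限公司 Method and apparatus for determining tampering of sensor data in a smart substation, and storage medium
CN113824724B (zh) * 2021-09-24 2023-09-22 山东能士信息科技有限公司 Method and apparatus for determining tampering of sensor data in a smart substation, and storage medium
CN114143385B (zh) * 2021-11-24 2024-01-05 广东电网有限责任公司 Method, apparatus, device and medium for identifying network traffic data
CN114143385A (zh) * 2021-11-24 2022-03-04 广东电网有限责任公司 Method, apparatus, device and medium for identifying network traffic data
CN114697273A (zh) * 2022-03-29 2022-07-01 杭州安恒信息技术股份有限公司 Traffic identification method and apparatus, computer device and storage medium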

Also Published As

Publication number Publication date
CN106549817A (zh) 2017-03-29

Similar Documents

Publication Publication Date Title
WO2017050038A1 (zh) Packet identification method, apparatus and computer storage medium
US10547523B2 (en) Systems and methods for extracting media from network traffic having unknown protocols
US9590910B1 (en) Methods and apparatus for handling multicast packets in an audio video bridging (AVB) network
JP6369532B2 (ja) Network control method, network system, apparatus and program
US20170300595A1 (en) Data packet extraction method and apparatus
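CN103281213A (zh) Method for network traffic content extraction, analysis and retrieval
TW201543846A (zh) Engine, method and software-defined network for generating lookups and making decisions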
US9660903B2 (en) Method and system for inserting an openflow flow entry into a flow table using openflow protocol
US7616662B2 (en) Parser for parsing data packets
US8009673B2 (en) Method and device for processing frames
US8081634B2 (en) Method and apparatus for processing downstream packets of cable modem in hybrid fiber coaxial networks
US11258886B2 (en) Method of handling large protocol layers for configurable extraction of layer information and an apparatus thereof
US20050002332A1 (en) Method, apparatus and computer program for performing a frame flow control, and method, apparatus and computer program for transmitting a frame
WO2018171115A1 (zh) Quality-of-service assurance method for fragments, and field-programmable gate array
US20220158954A1 (en) Virtual network device
US20200053017A1 (en) Apparatus and method for configuring mmt payload header
US20220303157A1 (en) Virtual network
US20140016486A1 (en) Fabric Cell Packing in a Switch Device
JP2013141140A (ja) Communication apparatus and program for a communication apparatus
WO2016082331A1 (zh) Frame positioning method and apparatus
US9729680B2 (en) Methods and systems to embed valid-field (VF) bits in classification keys for network packet frames
JP2023530269A (ja) Method for forming a virtual network
AU2021268768B2 (en) Virtual network device
CN106134118B (zh) Optical transceiver and data mapping method using the optical transceiver
CN115552860B (zh) Virtual network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16847920

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16847920

Country of ref document: EP

Kind code of ref document: A1