CN114896207A - Method and device for retrieving pcapng data packet file - Google Patents

Method and device for retrieving pcapng data packet file

Publication number
CN114896207A
CN114896207A CN202210309317.8A CN202210309317A CN114896207A CN 114896207 A CN114896207 A CN 114896207A CN 202210309317 A CN202210309317 A CN 202210309317A CN 114896207 A CN114896207 A CN 114896207A
Authority
CN
China
Prior art keywords
block
index
field
file
data packet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210309317.8A
Other languages
Chinese (zh)
Inventor
汪庆权
李志�
林俊龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou DPTech Technologies Co Ltd
Original Assignee
Hangzhou DPTech Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou DPTech Technologies Co Ltd
Priority to CN202210309317.8A
Publication of CN114896207A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/14: Details of searching files based on file metadata
    • G06F16/148: File search processing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/16: File or folder operations, e.g. details of user interfaces specifically adapted to file systems

Abstract

The present disclosure relates to a method and an apparatus for retrieving pcapng packet files. The method comprises: adding a custom index block to the tail of a pcapng packet file containing an enhanced packet block or a simple packet block to form a pcapng extended packet file, wherein the custom index block comprises an index for a packet contained in the enhanced packet block; and, after receiving a user retrieval condition, searching the custom index block according to the user retrieval condition, obtaining, when an index satisfying the user retrieval condition is found in the custom index block, the packet in the enhanced packet block or the simple packet block according to the file offset contained in the index, and sending all packets satisfying the user retrieval condition to the user after the retrieval is finished. Disk I/O overhead and CPU utilization can be greatly reduced, and the retrieval time is short. In addition, a file to which the custom index block has been added can still be opened by packet tools, so the user's ability to view the packets is not affected.

Description

Method and device for retrieving pcapng data packet file
Technical Field
The disclosure relates to the technical field of data retrieval, in particular to a method and a device for retrieving a pcapng data packet file.
Background
When a network traceability system performs retrospective analysis, network data must first be stored through packet capture. The network behavior, application data, and host data that were generated are then traced back and analyzed from the original packets in the stored network data, for example for virus troubleshooting, attack investigation, data-leak investigation, and network behavior analysis.
The pcap file and the pcapng file are common packet storage formats, and pcapng is the next-generation packet storage format after pcap. Compared with a pcap file, a pcapng file may contain more information, such as interface statistics and interface description information. A pcapng file can be opened with packet capture software such as Ethereal or Wireshark to view the network datagrams it contains and to search and analyze the original packets.
Specifically, during network tracing analysis, the original packets can be queried by user-supplied conditions such as source ip address, source port, destination ip address, destination port, protocol, and time. The original packets can also be exported to packet software such as Wireshark for further analysis. This provides a basis for quickly locating the cause of a problem and a strong data analysis foundation for network security.
When network data is searched and analyzed, the packets themselves generally have to be searched. The corresponding packet file is opened, the data in it is traversed, and matching is performed against the information entered by the user; the data is returned to the user only after the whole file has been traversed. With the development of the Internet, network bandwidth keeps increasing. Taking an ordinary 100 Mbit/s network as an example and assuming that data is stored at 12.5 MB per second, this amounts to 750 MB per minute, 45 GB per hour, and about 1,080 GB per day. When the bandwidth reaches gigabit or even 10-gigabit rates, the space occupied by the network data becomes enormous. Because disk I/O performance is limited, I/O reads and writes are a significant bottleneck, and queries are very slow. If one day of data is queried, about 1,080 GB has to be scanned, which takes about 180 minutes at a read rate of 100 MB per second, so the user experience is poor and user needs are difficult to meet.
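The volume and scan-time estimate above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: it assumes decimal units (1 GB = 1000 MB), and the function and constant names are hypothetical and do not come from the disclosure.

```python
MB_PER_SECOND = 12.5        # storage rate for a 100 Mbit/s link (100 / 8)
SCAN_MB_PER_SECOND = 100.0  # sequential read rate assumed for the scan


def stored_gb(seconds: float, rate_mb_s: float = MB_PER_SECOND) -> float:
    """Data volume in GB accumulated over the given number of seconds."""
    return rate_mb_s * seconds / 1000.0


def scan_minutes(volume_gb: float, rate_mb_s: float = SCAN_MB_PER_SECOND) -> float:
    """Minutes needed to read volume_gb sequentially at rate_mb_s."""
    return volume_gb * 1000.0 / rate_mb_s / 60.0


per_hour_gb = stored_gb(3600)
per_day_gb = stored_gb(24 * 3600)
full_scan_min = scan_minutes(per_day_gb)
print(per_hour_gb, per_day_gb, full_scan_min)
```

Changing MB_PER_SECOND to 125 or 1250 gives the corresponding estimate for gigabit and 10-gigabit links.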
Therefore, a method and an apparatus for retrieving pcapng packet files are needed to overcome this low retrieval efficiency.
Disclosure of Invention
In view of the above, the present disclosure provides a method and an apparatus for retrieving a pcapng packet file. According to an aspect of the present disclosure, a method of retrieving a pcapng packet file is provided, comprising: adding a custom index block to the tail of a pcapng packet file containing an enhanced packet block or a simple packet block to form a pcapng extended packet file, wherein the custom index block comprises an index for a packet contained in the enhanced packet block; and, after receiving a user retrieval condition, searching the custom index block according to the user retrieval condition, obtaining, when an index satisfying the user retrieval condition is found in the custom index block, the packet in the enhanced packet block or the simple packet block according to the file offset contained in the index, and sending all packets satisfying the user retrieval condition to the user after the retrieval is finished.
According to the method for retrieving a pcapng packet file of the present disclosure, the custom index block comprises: a block type field, a first block total length field, a magic number field, an index area, and a second block total length field. The block type field identifies the block as a custom index block. The first block total length field records the total length of the custom index block. The magic number field stores a preset fixed magic number used to identify the start position of the index area in the file. The index area stores the indexes for the packets contained in the enhanced packet block or the simple packet block. The value of the second block total length field is the same as that of the first block total length field, and the second block total length field is used to verify the custom index block.
According to the method for retrieving a pcapng packet file of the present disclosure, each index contained in the index area comprises, in order: an index type field, a file offset field, and a five-tuple, wherein the five-tuple comprises: a source ip field, a destination ip field, a source port field, a destination port field, and a transport layer protocol type field.
According to the method for retrieving a pcapng packet file of the present disclosure, searching the custom index block according to the user retrieval condition comprises: obtaining the value of the second block total length field; calculating the start position of the custom index block in the file from the value of the second block total length field; obtaining the block type field value and the magic number field value based on the start position of the custom index block, and verifying the obtained block type field value and magic number field value; and, when the block type field value and the magic number field value pass verification, parsing each index contained in the index area and comparing the five-tuple contained in each parsed index with the user retrieval condition, and, when the five-tuple contained in an index matches the user retrieval condition, reading the packet directly through the file offset field contained in that index, so that all packets whose index five-tuple matches the user retrieval condition are sent to the user after the retrieval is finished.
According to the method for retrieving a pcapng packet file of the present disclosure, the values of the index type field include: ipv4, ipv6, arp, or other types.
Another aspect of the present disclosure provides an apparatus for retrieving a pcapng packet file, comprising: a pcapng extended packet file forming component, configured to form a pcapng extended packet file by adding a custom index block to the tail of a pcapng packet file containing an enhanced packet block or a simple packet block, wherein the custom index block comprises an index for a packet contained in the enhanced packet block; a retrieval component, configured to, after receiving a user retrieval condition, first search the custom index block according to the user retrieval condition and, when an index satisfying the user retrieval condition is found in the custom index block, obtain the packet in the enhanced packet block or the simple packet block according to the file offset contained in the index; and a sending component, configured to send all packets satisfying the user retrieval condition to the user after the retrieval is finished.
According to the apparatus for retrieving a pcapng packet file of the present disclosure, the custom index block comprises: a block type field, a first block total length field, a magic number field, an index area, and a second block total length field. The block type field identifies the block as a custom index block. The first block total length field records the total length of the custom index block. The magic number field stores a preset fixed magic number used to identify the start position of the index area in the file. The index area stores the indexes for the packets contained in the enhanced packet block or the simple packet block. The value of the second block total length field is the same as that of the first block total length field, and the second block total length field is used to verify the custom index block.
According to the apparatus for retrieving a pcapng packet file of the present disclosure, each index contained in the index area comprises, in order: an index type field, a file offset field, and a five-tuple, wherein the five-tuple comprises: a source ip field, a destination ip field, a source port field, a destination port field, and a transport layer protocol type field.
According to the apparatus for retrieving a pcapng packet file of the present disclosure, the retrieval component, for searching the custom index block according to the user retrieval condition, comprises: a custom index block locating component, configured to obtain the value of the second block total length field and calculate the start position of the custom index block in the file from that value; a verification component, configured to obtain the block type field value and the magic number field value based on the start position of the custom index block and to verify the obtained block type field value and magic number field value; and a parsing and matching component, configured to, when the block type field value and the magic number field value pass verification, parse each index contained in the index area, compare the five-tuple contained in each parsed index with the user retrieval condition, and, when the five-tuple contained in an index matches the user retrieval condition, read the packet directly through the file offset field contained in that index, so that the sending component sends all packets whose index five-tuple matches the user retrieval condition to the user after the retrieval is finished.
According to the apparatus for retrieving a pcapng packet file of the present disclosure, the values of the index type field include: ipv4, ipv6, arp, or other types.
In summary, with the method and apparatus for retrieving a pcapng packet file of the present disclosure, the pcapng packet file is extended: a custom index block is added at the tail of the pcapng packet file, and the packet index data and the file offset of each packet are stored according to packet type. During retrieval, the reader first seeks backward from the end of the file to the custom index block, using the length stored in the last 4 bytes of the pcapng packet file, instead of parsing the file sequentially from front to back until the custom index block is reached, so the custom index block can be located quickly. The index is then searched according to the retrieval condition, and when a matching entry is found, the specified packet is read directly through its file offset. This greatly improves file read I/O performance, reduces disk I/O overhead and CPU utilization, and shortens the retrieval time. In addition, a file to which the custom index block has been added can still be opened by other packet tools (such as Wireshark), so the user's ability to view the packets is not affected.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings. The drawings described below are merely some embodiments of the present disclosure, and other drawings may be derived from those drawings by those of ordinary skill in the art without inventive effort.
FIG. 1 is a flow chart illustrating a method of retrieving a pcapng packet file according to an embodiment of the present disclosure.
Fig. 2 is a schematic diagram illustrating a structure of a custom index block in a pcapng extended packet file according to an embodiment of the present disclosure.
Fig. 3 is a schematic diagram illustrating a structure of an index included in an index area in a custom index block according to an embodiment of the present disclosure.
Fig. 4 is a schematic flow chart illustrating the process of creating an index by using an enhanced packet block EPB or a simple packet block SPB in the method for retrieving a pcapng packet file according to the embodiment of the present disclosure.
Fig. 5 is a schematic diagram of an apparatus for retrieving a pcapng packet file according to an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals denote the same or similar parts in the drawings, and thus, a repetitive description thereof will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, systems, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
It is to be understood by those skilled in the art that the drawings are merely schematic representations of exemplary embodiments, and that the blocks or processes shown in the drawings are not necessarily required to practice the present disclosure and are, therefore, not intended to limit the scope of the present disclosure.
The pcap file is a common datagram storage format. Because it has a specific file format, a pcap file must be opened with a packet capture tool such as Ethereal or Wireshark in order to view the network datagrams it contains and to search and analyze them. The pcapng file is the next-generation datagram storage format after pcap and, compared with a pcap file, can contain more information, such as interface statistics and interface description information.
A pcapng file contains a plurality of blocks, each containing a different type of information, for example: the Section Header Block (SHB), the Interface Statistics Block (ISB), the Interface Description Block (IDB), the Name Resolution Block (NRB), the Enhanced Packet Block (EPB), the Simple Packet Block (SPB), and the Custom Block (CB). The EPB is the standard container for storing packets from the network, and the SPB is a lightweight container, so datagrams in a pcapng file are generally stored in EPBs or SPBs.
When the network tracing analysis is carried out, the messages in the data packet are retrieved according to specific conditions, so that the efficiency of the network tracing analysis can be improved.
FIG. 1 is a flow chart illustrating a method of retrieving a pcapng packet file according to an embodiment of the present disclosure.
As shown in fig. 1, in step S102, a pcapng extended packet file is formed, and the pcapng extended packet file contains the custom index block. Specifically, a pcapng extended packet file is formed by adding a custom index block to the tail of a pcapng packet file containing an enhanced packet block EPB or a simple packet block SPB, where the custom index block includes an index for a packet contained in the enhanced packet block.
When network tracing analysis is performed, network data is first stored in a pcapng packet file through packet capture; specifically, the network data is usually stored in an enhanced packet block EPB or a simple packet block SPB of the pcapng packet file. According to the method for retrieving a pcapng packet file in the embodiment of the present disclosure, after the pcapng packet file containing the packets is obtained, a custom block, namely the custom index block of the embodiment of the present disclosure, is added at the tail of the pcapng packet file. The custom index block stores the indexes of the packets contained in the enhanced packet blocks EPB or simple packet blocks SPB of the pcapng packet file.
In step S104, retrieving is performed in the custom index block according to the user retrieval condition and a message is obtained based on the retrieved index. Specifically, after receiving a user retrieval condition, retrieving in the custom index block according to the user retrieval condition, and when an index meeting the user retrieval condition is retrieved in the custom index block, obtaining a message in the enhanced data packet block EPB or the simple data packet block SPB according to a file offset contained in the index.
In step S106, all messages meeting the user search condition are sent to the user after the search is completed.
Fig. 2 is a schematic diagram illustrating a structure of a custom index block in a pcapng extended packet file according to an embodiment of the present disclosure.
As shown in fig. 2, the custom index block includes: a block type field, a first block total length field, a magic number field, an index area, and a second block total length field.
The block type field identifies the block as a custom index block. More specifically, the block type field is stored in the block header and identifies the block type as a custom index block; its value is a 32-bit integer that conforms to the pcapng packet file specification and uniquely identifies the custom index block.
The first block total length field records the total length of the custom index block.
The magic number field stores a preset fixed magic number used to identify the start position of the index area in the file. In the method for retrieving a pcapng packet file in the embodiment of the present disclosure, the value of the magic number field is "0xacebeded".
The index area stores the indexes for the packets contained in the enhanced packet block or the simple packet block.
The value of the second block total length field is the same as that of the first block total length field, and the second block total length field is used to verify the custom index block.
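As a non-limiting illustration, the five-field layout of fig. 2 can be serialized as sketched below. The block type value is a hypothetical placeholder chosen from the range that the pcapng specification reserves for local use, since the description does not state the value; little-endian byte order and the function name are likewise assumptions.

```python
import struct

INDEX_BLOCK_TYPE = 0x80000001  # hypothetical placeholder, not given in the description
INDEX_MAGIC = 0xACEBEDED       # magic number value stated in the description


def build_index_block(index_area: bytes) -> bytes:
    """Wrap an already 4-byte-aligned index area in the five-field layout."""
    if len(index_area) % 4:
        raise ValueError("index area must be padded to a multiple of 4 bytes")
    # block type + first total length + magic number + index area + second total length
    total_length = 4 + 4 + 4 + len(index_area) + 4
    header = struct.pack("<III", INDEX_BLOCK_TYPE, total_length, INDEX_MAGIC)
    trailer = struct.pack("<I", total_length)
    return header + index_area + trailer


block = build_index_block(b"\x00" * 8)
print(len(block), block[:4].hex(), block[-4:].hex())
```

Writing the total length both before and after the body is what allows a reader to step over the block from either direction.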
Fig. 3 is a schematic diagram illustrating a structure of an index included in an index area of a custom index block according to an embodiment of the disclosure.
As shown in fig. 3, the index includes an index type field, a file offset field, a source ip field, a destination ip field, a source port field, a destination port field, and a transport layer protocol type field.
The value of the index type field is the packet type of the packet. According to the method for retrieving a pcapng packet file, the value of the index type field can be 1, 2, 3, or 4. Specifically, a value of 1 indicates that the packet is an ipv4 packet; a value of 2 indicates an ipv6 packet; a value of 3 indicates an arp packet; and a value of 4 indicates a packet of a type other than ipv4, ipv6, and arp.
The value of the file offset field is the offset of the packet in the file, from which the physical position of the packet in the file is determined.
The source ip, the destination ip, the source port, the destination port and the transport layer protocol type (TCP/UDP) are quintuple information of the message.
Fig. 4 is a schematic flow chart illustrating the process of creating an index by using an enhanced packet block EPB or a simple packet block SPB in the method for retrieving a pcapng packet file according to the embodiment of the present disclosure.
As shown in fig. 4, in step S402, the pcapng packet file is read.
In step S404, it is determined whether it is an enhanced packet block EPB or a simple packet block SPB.
If the determination in step S404 is yes, that is, the block is an enhanced packet block EPB or a simple packet block SPB, the flow proceeds to step S406. In step S406, the packet contained in the EPB or SPB is parsed, the index data and the file offset are extracted, and the index is established. More specifically, the packet is parsed according to its packet format and packet type, and the file offset of the packet and its five-tuple information (source ip, destination ip, source port, destination port, and transport layer protocol type) are extracted as index information.
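Steps S402 to S404 amount to walking the file block by block and keeping the EPB and SPB positions. The following is a minimal sketch under stated assumptions: the section is little-endian, the whole file is already in memory, and the recorded file offset is the start of the block (the description does not say whether the offset points at the block or at the packet data). The block type codes are those of the pcapng specification; the function name is hypothetical.

```python
import struct

EPB_TYPE = 0x00000006  # Enhanced Packet Block
SPB_TYPE = 0x00000003  # Simple Packet Block


def packet_block_offsets(data: bytes):
    """Yield (file_offset, block_type) for every EPB or SPB in the byte string."""
    pos = 0
    while pos + 12 <= len(data):
        block_type, total_length = struct.unpack_from("<II", data, pos)
        # A valid block is at least 12 bytes, 4-byte aligned, and fits in the file.
        if total_length < 12 or total_length % 4 or pos + total_length > len(data):
            break
        if block_type in (EPB_TYPE, SPB_TYPE):
            yield pos, block_type
        pos += total_length
```

Each yielded offset would then be stored in the file offset field of the index entry built for that packet.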
The specific packet parsing process is as follows. Parsing starts from the MAC header of the packet, and the protocol type of the upper layer protocol is determined from the MAC header. For example, if the upper layer protocol field value in the MAC header is "0x0800", the protocol type of the upper layer protocol of the packet is ipv4; if the value is "0x86dd", the protocol type of the upper layer protocol is ipv6; if the value is "0x0806", the protocol type of the upper layer protocol is arp; and if the value is "0x8100", the protocol type of the upper layer protocol is vlan.
Then, the four-tuple information of the packet is extracted according to the packet format corresponding to the protocol type, namely: source ip, destination ip, source port, and destination port. For example, if the upper layer protocol is ipv4, further parsing is performed according to the protocol field in the ipv4 header; if the upper layer protocol is ipv6, further parsing is performed according to the next header field in the ipv6 header. If the protocol field in the ipv4 header or the next header field in the ipv6 header is 6, the protocol type is TCP; if it is 17, the protocol type is UDP. The layer-4 port information is then extracted from the header of the TCP or UDP packet. If the protocol type of the upper layer protocol is vlan, the vlan header must be stripped before further parsing. If the packet is none of the three types above (ipv4, ipv6, and arp), it is of another type, and the index record of such a packet is "0xff".
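The parsing steps above can be sketched as follows. This is an illustration only, under simplifying assumptions: the frame is an Ethernet frame, ipv6 extension headers are not followed, and the function name and the returned tuple shape are hypothetical rather than taken from the disclosure.

```python
import ipaddress
import struct

ETH_IPV4, ETH_IPV6, ETH_ARP, ETH_VLAN = 0x0800, 0x86DD, 0x0806, 0x8100
TCP, UDP = 6, 17
OTHER = ("other", None, None, 0, 0, 0)


def parse_frame(frame: bytes):
    """Return (kind, src_ip, dst_ip, src_port, dst_port, protocol) for one frame."""
    if len(frame) < 14:
        return OTHER
    ethertype = struct.unpack_from("!H", frame, 12)[0]
    offset = 14
    # Strip any vlan tags: each tag adds 4 bytes and carries the real type after it.
    while ethertype == ETH_VLAN and len(frame) >= offset + 4:
        ethertype = struct.unpack_from("!H", frame, offset + 2)[0]
        offset += 4
    if ethertype == ETH_ARP:
        return ("arp", None, None, 0, 0, 0)
    if ethertype == ETH_IPV4 and len(frame) >= offset + 20:
        kind = "ipv4"
        header_len = (frame[offset] & 0x0F) * 4   # IHL counts 32-bit words
        protocol = frame[offset + 9]              # protocol field
        src, dst = frame[offset + 12:offset + 16], frame[offset + 16:offset + 20]
    elif ethertype == ETH_IPV6 and len(frame) >= offset + 40:
        kind = "ipv6"
        header_len = 40                           # fixed header only
        protocol = frame[offset + 6]              # next header field
        src, dst = frame[offset + 8:offset + 24], frame[offset + 24:offset + 40]
    else:
        return OTHER
    src_port = dst_port = 0
    l4 = offset + header_len
    if protocol in (TCP, UDP) and len(frame) >= l4 + 4:
        # TCP and UDP both start with source port then destination port.
        src_port, dst_port = struct.unpack_from("!HH", frame, l4)
    return (kind, str(ipaddress.ip_address(src)), str(ipaddress.ip_address(dst)),
            src_port, dst_port, protocol)
```

Packets that are neither TCP nor UDP keep both ports at 0, consistent with the convention that absent optional fields may be set to 0.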
According to the method for retrieving the pcapng data packet file in the embodiment of the present disclosure, the index information for the packet extracted in step S406 is written into the index area in the custom index block according to the index format shown in fig. 3, that is, for each packet, the extracted packet index information is uniformly stored in the index block according to the following format: index type, file offset, source ip, destination ip, source port, destination port, transport layer protocol type.
The index type and the file offset are mandatory. The source ip, destination ip, source port, destination port, and transport layer protocol type are optional; when an optional item is not present in the index information, the value of the corresponding field may be 0.
According to different index types, indexes with different lengths can be respectively stored. For example, the index type of ipv4 type packet occupies 1 byte, the file offset occupies 4 bytes, the source ip occupies 4 bytes, the destination ip occupies 4 bytes, the source port occupies 2 bytes, the destination port occupies 2 bytes, and the transport layer protocol type occupies 1 byte. For another example, the index type of ipv6 type packet occupies 1 byte, the file offset occupies 4 bytes, the source ip occupies 8 bytes, the destination ip occupies 8 bytes, the source port occupies 2 bytes, the destination port occupies 2 bytes, and the transport layer protocol type occupies 1 byte.
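The ipv4 entry layout described above totals 18 bytes and can be expressed with a fixed struct format. This is a sketch under assumptions: little-endian byte order, ip addresses stored as their 4 raw bytes, and helper names that are hypothetical.

```python
import socket
import struct

INDEX_TYPE_IPV4 = 1
# index type (1) | file offset (4) | source ip (4) | destination ip (4)
# | source port (2) | destination port (2) | transport layer protocol type (1)
IPV4_ENTRY = struct.Struct("<BI4s4sHHB")


def pack_ipv4_entry(file_offset, src_ip, dst_ip, src_port, dst_port, protocol):
    """Serialize one ipv4 index entry; ip addresses are dotted-quad strings."""
    return IPV4_ENTRY.pack(INDEX_TYPE_IPV4, file_offset,
                           socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                           src_port, dst_port, protocol)


entry = pack_ipv4_entry(4096, "10.0.0.1", "10.0.0.2", 1234, 80, 6)
print(len(entry), IPV4_ENTRY.unpack(entry))
```

A 4-byte file offset can address at most 4 GiB, so capture files indexed this way would need to stay below that size.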
In step S408, it is determined whether traversal is completed.
When the determination in step S408 is yes, that is, the traversal is complete, the flow proceeds to step S410. In step S410, the total length of the indexes contained in the index area and the total length of the custom index block are calculated, and the total length of the custom index block is written to the designated file location. More specifically, after the packets contained in the enhanced packet blocks EPB or simple packet blocks SPB of the pcapng packet file have been traversed and all indexes have been generated, the total length of the indexes is calculated. When the total length of the indexes does not fall on a 4-byte boundary, the index content is padded with 0 so that it is 4-byte aligned. Finally, the total length of the custom index block is calculated and stored in the block total length fields of the custom index block, as required by the pcapng file format.
When the determination in step S408 is no, the flow returns to step S404 until the traversal is completed.
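The alignment and length calculation of step S410 can be sketched as follows, assuming the three 4-byte header fields and the 4-byte trailing length of fig. 2; the function names are illustrative only.

```python
def pad_index_area(index_area: bytes) -> bytes:
    """Append 0 bytes until the index area length is a multiple of 4."""
    return index_area + b"\x00" * (-len(index_area) % 4)


def index_block_total_length(index_area_length: int) -> int:
    """Total block length: type + length + magic + padded index area + length."""
    padded = index_area_length + (-index_area_length % 4)
    return 4 + 4 + 4 + padded + 4


# One 18-byte ipv4 entry needs 2 padding bytes, giving a 36-byte block.
print(len(pad_index_area(b"\x01" * 18)), index_block_total_length(18))
```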
According to the method for retrieving a pcapng packet file, packet retrieval according to the user retrieval condition comprises: obtaining the value of the second block total length field; calculating the start position of the custom index block in the file from the value of the second block total length field; obtaining the block type field value and the magic number field value based on the start position of the custom index block, and verifying the obtained block type field value and magic number field value; and, when the block type field value and the magic number field value pass verification, parsing each index contained in the index area and comparing the five-tuple contained in each parsed index with the user retrieval condition, and, when the five-tuple contained in an index matches the user retrieval condition, reading the packet directly through the file offset field contained in that index, so that all packets whose index five-tuple matches the user retrieval condition are sent to the user after the retrieval is finished.
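The locating and verification steps can be sketched as below. The sketch assumes that the second block total length field occupies the final 4 bytes of the file, that values are little-endian, and that the block type is a hypothetical placeholder (the description does not give its value); the function name is also an assumption.

```python
import os
import struct

INDEX_BLOCK_TYPE = 0x80000001  # hypothetical placeholder, not given in the description
INDEX_MAGIC = 0xACEBEDED       # magic number value stated in the description


def locate_index_area(f):
    """Return (block_start, index_area_bytes), or None if no valid index block."""
    f.seek(0, os.SEEK_END)
    file_size = f.tell()
    if file_size < 16:
        return None
    # Second block total length field: the last 4 bytes of the file.
    f.seek(file_size - 4)
    (total_length,) = struct.unpack("<I", f.read(4))
    if total_length < 16 or total_length % 4 or total_length > file_size:
        return None
    # Seek backward by the block length to the start of the candidate block.
    block_start = file_size - total_length
    f.seek(block_start)
    block_type, first_length, magic = struct.unpack("<III", f.read(12))
    if (block_type, first_length, magic) != (INDEX_BLOCK_TYPE, total_length, INDEX_MAGIC):
        return None  # ordinary pcapng file: fall back to a full traversal
    return block_start, f.read(total_length - 16)
```

Only the 4-byte tail, the 12-byte header, and the index area are read, regardless of how large the capture file is.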
More specifically, when retrieval is performed according to the user retrieval condition, the pcapng packet file to be retrieved is first opened and the 4 bytes at its tail are located; these represent the block total length of the custom index block. The file position is then moved backward from the end of the file by that length to the start of the custom index block, and the block type and the magic number are verified. If verification fails, the file is an ordinary pcapng file without an index, and the whole file must be traversed for the retrieval. If the pcapng file contains the custom index block, the indexes are parsed according to the index format, and the five-tuple information is extracted and compared with the retrieval condition entered by the user. When a message meeting the retrieval condition is found, its content is read directly through the file offset. Finally, the messages meeting the retrieval condition are returned together to the user for display.
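Locating and verifying the custom index block from the tail of the file can be sketched as below. The function name `locate_index_area`, the little-endian byte order and the block type value are assumptions made for illustration; the magic value is copied as printed in this description. A return value of `None` corresponds to the fallback case in which the whole file has to be traversed.

```python
import io
import struct

# Assumptions for illustration only; see the lead-in above.
BLOCK_TYPE = 0x00000BAD
MAGIC = 0xCEBEDED


def locate_index_area(f):
    """Return (start, end) offsets of the index area, or None if there is no index block."""
    f.seek(0, io.SEEK_END)
    size = f.tell()
    if size < 16:                      # too small to hold even an empty index block
        return None
    f.seek(size - 4)                   # second block total length field
    (total_len,) = struct.unpack("<I", f.read(4))
    if total_len < 16 or total_len > size:
        return None
    start = size - total_len           # start file position of the custom index block
    f.seek(start)
    block_type, first_len, magic = struct.unpack("<III", f.read(12))
    if block_type != BLOCK_TYPE or magic != MAGIC or first_len != total_len:
        return None                    # ordinary pcapng file without an index
    return start + 12, size - 4        # index area lies between magic and trailer
```

The function accepts any seekable binary file object, so it can be tried on an in-memory `io.BytesIO` buffer as well as on a file opened with `open(path, "rb")`.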
Fig. 5 is a schematic diagram of an apparatus for retrieving a pcapng packet file according to an embodiment of the present disclosure. As shown in fig. 5, an apparatus for retrieving a pcapng packet file according to an embodiment of the present disclosure includes: pcapng extended packet file forming component 502, retrieving component 504, and sending component 506.
The pcapng extended packet file forming component 502 is configured to form a pcapng extended packet file by adding a custom index block at the tail of a pcapng packet file that contains an enhanced packet block or a simple packet block, where the custom index block includes an index for a message contained in the enhanced packet block. The retrieval component 504 is configured to, after receiving a user retrieval condition, first perform retrieval in the custom index block according to the user retrieval condition and, when an index meeting the user retrieval condition is found in the custom index block, obtain the message in the enhanced packet block or the simple packet block according to the file offset included in the index. The sending component 506 is configured to send all messages meeting the user retrieval condition to the user after the retrieval is completed.
In the apparatus for retrieving a pcapng packet file according to the embodiment of the present disclosure, the custom index block comprises: a block type field, a first block total length field, a magic number field, an index area and a second block total length field. The block type field is used to identify the custom index block; the first block total length field records the total length of the custom index block; the magic number field stores a preset fixed magic number used to identify the start file position of the index area; the index area stores the indexes for the messages contained in the enhanced data packet blocks or simple data packet blocks; and the second block total length field has the same value as the first block total length field and is used to verify the custom index block.
According to the device for retrieving the pcapng data packet file in the embodiment of the disclosure, the indexes contained in the index area sequentially comprise: an index type field, a file offset field, and a five-tuple, wherein the five-tuple comprises: source ip field, destination ip field, source port field, destination port field, transport layer protocol type field.
An apparatus for retrieving a pcapng packet file according to an embodiment of the present disclosure, wherein the retrieving component 504 comprises: a custom index block locating component 504a, a verification component 504b, and a parsing and matching component 504c.
The custom index block locating component 504a is configured to obtain the second block total length field value and calculate the start file position of the custom index block from it. The verification component 504b is configured to obtain the block type field value and the magic number field value based on the start file position of the custom index block, and to verify them. The parsing and matching component 504c is configured to parse each index contained in the index area when the block type field value and the magic number field value pass verification, compare the five-tuple field content of each parsed index with the user retrieval condition and, when the five-tuple field content of an index matches the user retrieval condition, read the message directly through the file offset field content of that index, so that the sending component 506 sends all messages whose five-tuple field content matches the user retrieval condition to the user after the retrieval is completed.
In the apparatus for retrieving a pcapng packet file according to the embodiment of the present disclosure, the value of the index type field includes: ipv4, ipv6, arp or other types.
In summary, the method and apparatus for retrieving a pcapng packet file of the present disclosure extend the pcapng packet file: a custom index block is added at the tail of the pcapng packet file, and message index data and the file offset information of each data packet are stored according to message type. During retrieval, the custom index block is reached by moving backward from the end of the file by the length given in the last 4 bytes of the pcapng packet file, instead of parsing the file data sequentially from front to back until the custom index block is reached, so the custom index block can be located quickly. Data packets are looked up in the index according to the retrieval condition, and when a data packet is found it is read directly through its file offset. This greatly improves file-read I/O performance, reduces disk I/O overhead and CPU utilization, and shortens retrieval time. In addition, a file with the added custom index block can still be opened by other packet tools (such as Wireshark), so the user's viewing of the data packets is not affected.
In general, the pcapng file of the present disclosure contains multiple data blocks, each containing a different type of information, for example: the Section Header Block (SHB), the Interface Statistics Block (ISB), the Interface Description Block (IDB), the Name Resolution Block (NRB), the Enhanced Packet Block (EPB), the Simple Packet Block (SPB) and the Custom Block (CB). The scheme provided by the present disclosure extends the pcapng packet file format by adding a custom block at the tail of the pcapng file for message indexing. The format of the custom block is: block type (custom block) + block total length + magic + index area + block total length. The magic is 4 bytes, its content is 0xcebeded, and it is used to identify the custom index block. The index area stores, for each data packet, the extracted five-tuple information and the file offset of the data packet. When retrieval is performed on a five-tuple condition, the custom index block is read first and the retrieval is carried out within it. The process of adding the index block is as follows: the saved pcapng packet file is read (a pcapng file generally saves data packets in enhanced packet blocks (EPB) and may also save them in simple packet blocks (SPB)); the packet file is parsed according to the pcapng format, and whenever an EPB or SPB is parsed the message data is extracted. Messages are currently classified into four types: IPv4, IPv6, ARP and other.
The specific message parsing process is as follows. The message data starts from the layer-2 MAC header, and the upper-layer protocol is determined from the MAC header: if the upper-layer protocol field of the MAC header is 0x0800, the upper layer is IPv4, and if it is 0x86dd, the upper layer is IPv6. For IPv4, parsing continues according to the protocol field of the IPv4 header; for IPv6, parsing continues according to the next header field of the IPv6 header. A value of 6 in the IPv4 protocol field or the IPv6 next header field indicates TCP, and 17 indicates UDP; the layer-4 port information can then be extracted from the TCP or UDP header. In addition, an upper-layer protocol of 0x0806 in the MAC header indicates an ARP message, and if the upper-layer protocol is 0x8100 the VLAN header must be stripped before parsing continues. In short, the message is parsed according to its format, and the message type, the five-tuple information (source and destination IP, source and destination port, and layer-4 protocol) and the file offset position are extracted. A message that is none of the three types IPv4, IPv6 and ARP is of the other type, and the index type of such a message is recorded as 0xff. For each message, the extracted index information is stored in the index block in the following format: index type (1 byte, optional) + file offset position (4 bytes, optional) + source IP (optional) + destination IP (optional) + source port (optional) + destination port (optional) + layer-4 protocol (optional).
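The parsing steps above can be sketched as a small classifier that walks an Ethernet frame and returns the index type and five-tuple. This is a simplified illustration under stated assumptions: the function name `parse_frame` is hypothetical, the index type values 1 and 2 follow the example values used in this description, the value 3 for ARP is an assumption (the description gives none), and IPv6 extension headers are not followed.

```python
import struct

IDX_IPV4, IDX_IPV6, IDX_OTHER = 1, 2, 0xFF
IDX_ARP = 3  # assumption: the description does not state the ARP index type value


def parse_frame(frame: bytes):
    """Return (index_type, src_ip, dst_ip, src_port, dst_port, l4_proto)."""
    empty = (b"", b"", 0, 0, 0)
    if len(frame) < 14:
        return (IDX_OTHER,) + empty
    off = 12                                   # skip destination and source MAC
    (ethertype,) = struct.unpack("!H", frame[off:off + 2])
    off += 2
    while ethertype == 0x8100 and len(frame) >= off + 4:
        # Strip a VLAN tag: 2 bytes of tag control, then the inner EtherType.
        (ethertype,) = struct.unpack("!H", frame[off + 2:off + 4])
        off += 4
    if ethertype == 0x0806:
        return (IDX_ARP,) + empty
    if ethertype == 0x0800 and len(frame) >= off + 20:
        ihl = (frame[off] & 0x0F) * 4          # IPv4 header length in bytes
        proto = frame[off + 9]                 # IPv4 protocol field
        src, dst = frame[off + 12:off + 16], frame[off + 16:off + 20]
        l4, idx = off + ihl, IDX_IPV4
    elif ethertype == 0x86DD and len(frame) >= off + 40:
        proto = frame[off + 6]                 # IPv6 next header field
        src, dst = frame[off + 8:off + 24], frame[off + 24:off + 40]
        l4, idx = off + 40, IDX_IPV6
    else:
        return (IDX_OTHER,) + empty
    sport = dport = 0
    if proto in (6, 17) and len(frame) >= l4 + 4:
        # TCP and UDP both begin with source port and destination port.
        sport, dport = struct.unpack("!HH", frame[l4:l4 + 4])
    return (idx, src, dst, sport, dport, proto)
```

Ports are left at 0 for protocols other than TCP and UDP, which matches the rule that fields not present are filled with 0.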
For example, if the IPv4 index type is 1 and the IPv6 index type is 2, the index corresponding to IPv4 is: 1-byte index type with value 1 + 4-byte file offset + 4-byte source IP + 4-byte destination IP + 2-byte source port + 2-byte destination port + layer-4 protocol, with any field that is not present filled with 0. The index corresponding to IPv6 is: 1-byte index type with value 2 + 4-byte file offset + 8-byte source IP + 8-byte destination IP + 2-byte source port + 2-byte destination port + layer-4 protocol, with any field that is not present filled with 0. Indexes of different lengths are thus stored according to index type. Each time a data packet is parsed, its index information is written into the custom index block at the tail of the pcapng file. When all data packets have been traversed and all indexes generated, the total length of the indexes is calculated; if the total length falls short of a 4-byte boundary, the index content is padded with 0 to reach 4-byte alignment. Finally, the total length of the custom block is calculated and stored as the pcapng file format requires.
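As an illustration of the IPv4 index layout, the record can be packed with a fixed `struct` format. The sketch assumes little-endian fields and a 1-byte layer-4 protocol field, since the description does not state the byte order or the width of that field; the names `IPV4_INDEX` and `pack_ipv4_index` are hypothetical.

```python
import socket
import struct

# type (1) + file offset (4) + src IP (4) + dst IP (4) + src port (2) + dst port (2)
# + layer-4 protocol (1 byte, assumed). "<" disables alignment padding.
IPV4_INDEX = struct.Struct("<BI4s4sHHB")


def pack_ipv4_index(file_offset, src_ip, dst_ip, src_port, dst_port, l4_proto):
    """Pack one IPv4 index record; the index type value 1 follows the example above."""
    return IPV4_INDEX.pack(
        1,
        file_offset,
        socket.inet_aton(src_ip),   # dotted-quad string to 4 raw bytes
        socket.inet_aton(dst_ip),
        src_port,
        dst_port,
        l4_proto,
    )
```

Under these assumptions one IPv4 record occupies 18 bytes. Because the file offset field is 4 bytes wide, it can only address packets located within the first 4 GiB of the file.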
The retrieval process mainly comprises the following steps. First, the data packet file to be retrieved is opened and the 4 bytes at the tail of the file, which indicate the length of the custom block, are located. The file position is then moved backward from the end of the file by the length of the custom block, and the block type and the magic are checked. If they do not identify a custom index block, the file is an ordinary pcapng file without an index and the whole file must be traversed for the retrieval. If the file does contain the custom index block, the indexes are parsed according to the custom index format, and the five-tuple information is extracted and compared with the condition entered by the user. When a message meeting the condition is found, the content of the data packet is read directly through the file offset. Finally, the data packet information meeting the condition is returned together to the user for display.
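The index parsing and comparison part of the retrieval can be sketched as a scan over the index area that collects the file offsets of matching records. Several details are assumptions made for illustration: little-endian fields, a 1-byte layer-4 protocol field, the value 3 for the ARP index type, and ARP and other-type records that carry only the index type and file offset. The function name `search_index` and its keyword names are hypothetical.

```python
import struct

IPV4 = struct.Struct("<BI4s4sHHB")
# The description gives 8-byte IP fields for IPv6; a full IPv6 address is
# 16 bytes, so a real implementation would have to settle this width.
IPV6 = struct.Struct("<BI8s8sHHB")
SHORT = struct.Struct("<BI")  # assumed layout for ARP and other-type records
LAYOUTS = {1: IPV4, 2: IPV6, 3: SHORT, 0xFF: SHORT}
NAMES = ("src_ip", "dst_ip", "src_port", "dst_port", "proto")


def search_index(area: bytes, **cond):
    """Return file offsets of records whose five-tuple equals every given condition."""
    hits, pos = [], 0
    while pos < len(area):
        layout = LAYOUTS.get(area[pos])
        if layout is None or pos + layout.size > len(area):
            break                       # zero padding or an unknown index type
        fields = layout.unpack_from(area, pos)
        pos += layout.size
        if len(fields) > 2:             # record carries a five-tuple
            five_tuple = dict(zip(NAMES, fields[2:]))
            if all(five_tuple[key] == value for key, value in cond.items()):
                hits.append(fields[1])  # file offset of the matching packet
    return hits
```

Each returned offset can then be passed to `seek` on the opened packet file to read the matching packet directly, without parsing the blocks that precede it.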
The basic principles of the present disclosure have been described in connection with specific embodiments, but it should be noted that it will be understood by those skilled in the art that all or any of the steps or components of the method and apparatus of the present disclosure may be implemented in any computing device (including processors, storage media, etc.) or network of computing devices, in hardware, firmware, software, or a combination thereof, which can be implemented by those skilled in the art using their basic programming skills after reading the description of the present disclosure.
Thus, the objects of the present disclosure may also be achieved by running a program or a set of programs on any computing device. The computing device may be a general purpose device as is well known. Thus, the object of the present disclosure can also be achieved merely by providing a program product containing program code for implementing the method or apparatus. That is, such a program product also constitutes the present disclosure, and a storage medium storing such a program product also constitutes the present disclosure. It is to be understood that the storage medium may be any known storage medium or any storage medium developed in the future.
It is also noted that in the apparatus and methods of the present disclosure, it is apparent that the components or steps may be broken down and/or re-combined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure. Also, the steps of executing the series of processes described above may naturally be executed chronologically in the order described, but need not necessarily be executed chronologically. Some steps may be performed in parallel or independently of each other.
The above detailed description should not be construed as limiting the scope of the disclosure. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (10)

1. A method of retrieving a pcapng packet file, comprising:
adding a self-defined index block at the tail part of a pcapng data packet file containing an enhanced data packet block or a simple data packet block to form a pcapng extended data packet file, wherein the self-defined index block comprises an index aiming at a message contained in the enhanced data packet block;
after receiving a user retrieval condition, first retrieving in the custom index block according to the user retrieval condition, and when an index meeting the user retrieval condition is retrieved in the custom index block, acquiring the message in the enhanced data packet block or the simple data packet block according to a file offset contained in the index; and
sending all messages meeting the user retrieval condition to the user after the retrieval is finished.
2. The method of retrieving a pcapng packet file as recited in claim 1,
the custom index chunk comprises: a block type field, a first block total length field, a magic field, an index area and a second block total length field;
the block type field is used for identifying a custom index block;
the first block total length field is used for recording the total length of the user-defined index block;
the magic number field stores a magic number preset fixed value used for identifying the initial file position of the index area;
the index area is used for storing an index aiming at a message contained in an enhanced data packet block or a simple data packet block;
and the value of the second block total length field is the same as that of the first block total length field, and the second block total length field is used for verifying the self-defined index block.
3. The method of retrieving a pcapng packet file of claim 2,
the indexes contained in the index area sequentially comprise: an index type field, a file offset field, and a five-tuple, wherein the five-tuple comprises: source ip field, destination ip field, source port field, destination port field, transport layer protocol type field.
4. The method of retrieving a pcapng packet file of claim 3, wherein retrieving in the custom index block according to the user retrieval condition comprises:
acquiring a second block total length field value;
calculating the initial file position of the self-defined index block according to the field value of the total length of the second block;
acquiring a block type field value and a magic number field value based on the initial file position of the user-defined index block, and verifying the acquired block type field value and the magic number field value;
when the block type field value and the magic number field value pass verification, analyzing each index contained in the index area, comparing a quintuple contained in the index obtained after analysis with a user retrieval condition, and when the quintuple contained in the index is matched with the user retrieval condition, directly reading a message through the content of a file offset field contained in the index obtained after analysis so as to send all messages of which the quintuple contained in the index is matched with the user retrieval condition to a user after retrieval is finished.
5. The method of retrieving a pcapng packet file as recited in claim 3, wherein the value of the index type field comprises: ipv4, ipv6, arp or other types.
6. An apparatus for retrieving a pcapng packet file, comprising:
the pcapng extended data packet file forming component is used for forming a pcapng extended data packet file by adding a self-defined index block at the tail part of the pcapng extended data packet file containing an enhanced data packet block or a simple data packet block, wherein the self-defined index block comprises an index aiming at a message contained in the enhanced data packet block;
a retrieval component, configured to, after receiving a user retrieval condition, first perform retrieval in the custom index block according to the user retrieval condition, and when an index meeting the user retrieval condition is retrieved from the custom index block, obtain the message in the enhanced data packet block or the simple data packet block according to a file offset included in the index; and
a sending component, configured to send all messages meeting the user retrieval condition to the user after the retrieval is finished.
7. The apparatus for retrieving a pcapng packet file as recited in claim 6,
the custom index block comprises: a block type field, a first block total length field, a magic field, an index area and a second block total length field;
the block type field is used for identifying a custom index block;
the first block total length field is used for recording the total length of the custom index block;
the magic number field stores a magic number preset fixed value used for identifying the initial file position of the index area;
the index area is used for storing an index aiming at a message contained in an enhanced data packet block or a simple data packet block;
and the value of the second block total length field is the same as that of the first block total length field, and the second block total length field is used for verifying the self-defined index block.
8. The apparatus for retrieving a pcapng packet file as recited in claim 7,
the indexes contained in the index area sequentially comprise: an index type field, a file offset field, and a five-tuple, wherein the five-tuple comprises: source ip field, destination ip field, source port field, destination port field, transport layer protocol type field.
9. The apparatus for retrieving a pcapng packet file of claim 8, wherein the retrieval component, for retrieving in the custom index block according to the user retrieval condition, comprises:
the user-defined index block positioning component is used for acquiring the field value of the total length of the second block and calculating the position of the initial file of the user-defined index block according to the field value of the total length of the second block;
the verification component is used for acquiring a block type field value and a magic number field value based on the initial file position of the user-defined index block and verifying the acquired block type field value and the magic number field value;
and the analysis and matching component is used for analyzing each index contained in the index area when the block type field value and the magic number field value pass verification, comparing quintuple field content contained in the index obtained after analysis with the user retrieval condition, and directly reading messages through file offset field content contained in the index obtained after analysis when the quintuple field content contained in the index is matched with the user retrieval condition so as to send all messages of which the quintuple field content contained in the index is matched with the user retrieval condition to the user by the sending component after the retrieval is finished.
10. The apparatus for retrieving a pcapng packet file as recited in claim 8, wherein said index type field values comprise: ipv4, ipv6, arp or other types.
CN202210309317.8A 2022-03-27 2022-03-27 Method and device for retrieving pcapng data packet file Pending CN114896207A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210309317.8A CN114896207A (en) 2022-03-27 2022-03-27 Method and device for retrieving pcapng data packet file

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210309317.8A CN114896207A (en) 2022-03-27 2022-03-27 Method and device for retrieving pcapng data packet file

Publications (1)

Publication Number Publication Date
CN114896207A true CN114896207A (en) 2022-08-12

Family

ID=82716225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210309317.8A Pending CN114896207A (en) 2022-03-27 2022-03-27 Method and device for retrieving pcapng data packet file

Country Status (1)

Country Link
CN (1) CN114896207A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116302971A (en) * 2023-02-07 2023-06-23 北京大学 Extensible test generation method for programmable data plane

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination