CN110049023B - Unknown protocol reverse identification method and system based on machine learning - Google Patents

Unknown protocol reverse identification method and system based on machine learning

Info

Publication number
CN110049023B
Authority
CN
China
Prior art keywords
protocol
measurement
control
unknown
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910251538.2A
Other languages
Chinese (zh)
Other versions
CN110049023A (en)
Inventor
邱乐德
覃落雨
周钠
齐维孔
李明
王灏宇
李健珂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Academy of Space Technology CAST
Original Assignee
China Academy of Space Technology CAST
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Academy of Space Technology CAST
Priority to CN201910251538.2A
Publication of CN110049023A
Application granted
Publication of CN110049023B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0078 Avoidance of errors by organising the transmitted data in a format specifically designed to deal with errors, e.g. location
    • H04L1/0083 Formatting with frames or packets; Protocol or part of protocol for error control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/06 Notations for structuring of protocol data, e.g. abstract syntax notation one [ASN.1]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/26 Special purpose or proprietary protocols or architectures

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Communication Control (AREA)

Abstract

The invention relates to a machine-learning-based method and system for reverse identification of unknown protocols. The method first reads the input bit stream, then performs guide sequence analysis and frame sequence analysis to segment the guide sequences and frame sequences of the unknown protocol measurement and control bit stream. It next constructs the measurement and control protocol frame sequence format models, namely a PCM-type format model and a CCSDS-type format model, and uses them to identify the protocol type of the unknown protocol bit stream from all of the segmented frame sequences. The format content of every frame sequence of the bit stream is then extracted. Finally, the unknown protocol analysis result is output: information such as the filling sequence, frame sequence structure, frame structure and instruction content of the unknown protocol bit stream is extracted, achieving reverse identification of the unknown protocol.

Description

Unknown protocol reverse identification method and system based on machine learning
Technical Field
The invention relates to a machine-learning-based method and system for reverse identification of unknown protocols, and in particular to a method that improves the accuracy of unknown protocol reverse identification by constructing a CCSDS-type measurement and control protocol format model and a PCM-type measurement and control protocol format model and applying a character string statistical method from machine learning. It belongs to the technical field of unknown protocol reverse identification.
Background
Most traditional research on protocol reverse analysis targets general-purpose standard communication protocols. Because these protocols have open, standardized formats, techniques such as matching and identification against a general protocol feature library achieve good reverse-analysis results. However, the CCSDS protocol and the PCM protocol only provide a specification framework, and users' concrete implementations differ considerably, so such protocols generally fall into the category of unknown protocols. For an unknown protocol, traditional techniques such as matching and identification against an existing protocol feature library cannot restore the protocol format or infer the semantics of its fields, and there has been little research on the reverse analysis of unknown protocols.
Disclosure of Invention
The technical problem solved by the invention is as follows. To address the problem above, the machine-learning-based unknown protocol reverse identification method and system start from message sequence analysis and take unknown protocol bit stream data as the research object. With reference to the standard PCM and CCSDS protocols, a general PCM-type measurement and control protocol format model and a CCSDS-type measurement and control protocol format model are constructed and used as prior information. With only a small amount of prior information, a red-black-tree-based character string statistical algorithm and a KMP (Knuth-Morris-Pratt) fast string matching algorithm from machine learning are used to extract information such as the filling sequence, frame sequence structure, frame structure and instruction content of the unknown protocol bit stream, thereby achieving reverse analysis of the unknown protocol. The main technical problems solved include:
(1) Construction of the PCM-type and CCSDS-type measurement and control protocol format models: with reference to the standard PCM and CCSDS protocols, a PCM-type measurement and control protocol format model and a CCSDS-type measurement and control protocol format model are constructed and used as the prior information input to the subsequent protocol reverse analysis.
(2) Frame sequence and frame structure analysis: based on the KMP fast string matching algorithm and the red-black-tree statistical algorithm from machine learning, protocol information such as the filling sequence, frame sequence structure, frame structure and instruction content of the unknown protocol bit stream is restored according to the PCM-type and CCSDS-type measurement and control protocol format models constructed herein.
The technical solution of the invention is as follows: an unknown protocol reverse identification method based on machine learning, comprising the following steps:
Step 1: read the unknown protocol measurement and control bit stream in binary form to realize bit stream input.
Step 2: use a red-black-tree-based character string statistical method and a KMP algorithm from machine learning theory to segment the guide sequences and frame sequences of the unknown protocol measurement and control bit stream read in step 1, realizing guide sequence analysis and frame sequence analysis.
Step 3: construct a PCM-type measurement and control protocol format model and a CCSDS-type measurement and control protocol format model, realizing the construction of the measurement and control protocol frame sequence format models.
Step 4: from all of the frame sequences segmented in step 2, identify the protocol type of the unknown protocol measurement and control bit stream using the PCM-type and CCSDS-type measurement and control protocol format models constructed in step 3, realizing protocol type identification.
Step 5: according to the protocol type identified in step 4, use the corresponding PCM-type or CCSDS-type measurement and control protocol format model to extract the content of each field of all frame sequences of the unknown protocol measurement and control bit stream, realizing extraction of the frame sequence format content.
Step 6: output the protocol type identified in step 4 and the frame sequence format content extracted in step 5 as the unknown protocol analysis result.
In step 2, the guide sequences and frame sequences of the unknown protocol measurement and control bit stream read in step 1 are segmented using a red-black-tree-based character string statistical method and a fuzzy-matching KMP algorithm from machine learning theory, preferably as follows:
Step 21: determine the correct guide code in the unknown protocol measurement and control bit stream using the red-black-tree-based character string statistical method.
Step 22: use the fuzzy-matching KMP algorithm to obtain all positions in the unknown protocol measurement and control data of the guide code determined in step 21.
Step 23: because a guide sequence consists of consecutive guide codes, use the positions obtained in step 22 to find the start and end of each run of consecutive guide codes, segment out all guide sequences of the unknown protocol measurement and control bit stream, and then segment out all frame sequences according to the positions of the guide sequences.
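Steps 22 and 23 can be illustrated with a minimal sketch. The text does not spell out the fuzzy-matching KMP variant, so the sketch swaps in a plain sliding-window scan with a Hamming-distance tolerance; this is a stand-in, not KMP. The function names, the example guide code `10101010`, and the `min_codes` filter are all hypothetical choices made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def fuzzy_positions(bits: str, code: str, max_errors: int = 0) -> list[int]:
    """Every offset where `code` matches with at most `max_errors` bit errors."""
    w = len(code)
    return [i for i in range(len(bits) - w + 1)
            if hamming(bits[i:i + w], code) <= max_errors]


def guide_runs(positions: list[int], width: int,
               min_codes: int = 2) -> list[tuple[int, int]]:
    """Merge touching or overlapping matches into (start, end) runs, end exclusive.

    Runs shorter than `min_codes` codes are dropped as chance matches
    (an assumption of this sketch, not a rule from the text).
    """
    runs: list[list[int]] = []
    for p in positions:  # positions are in ascending order
        if runs and p <= runs[-1][1]:
            runs[-1][1] = max(runs[-1][1], p + width)
        else:
            runs.append([p, p + width])
    return [(s, e) for s, e in runs if e - s >= min_codes * width]


def split_frames(bits: str, runs: list[tuple[int, int]]) -> list[str]:
    """Frame sequences are the stretches that follow each guide sequence."""
    frames = []
    for k, (_, end) in enumerate(runs):
        nxt = runs[k + 1][0] if k + 1 < len(runs) else len(bits)
        if nxt > end:
            frames.append(bits[end:nxt])
    return frames


# Synthetic stream: noise, guide sequence, frame, guide sequence, frame.
stream = "0011" + "10" * 8 + "11100100" + "10" * 8 + "11000111"
runs = guide_runs(fuzzy_positions(stream, "10101010"), width=8)
frames = split_frames(stream, runs)
```

With `max_errors` above zero the scan tolerates bit errors inside a guide sequence, but windows that straddle a boundary can also match, so run boundaries may drift into the neighbouring frame data and would need refinement in practice.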
In step 3, the PCM-type and CCSDS-type measurement and control protocol format models are constructed, preferably as follows:
Step 31: construct the PCM-type measurement and control protocol format model, that is, determine it according to the PCM standard specification. The model comprises, in order, a start word, frames and an end word; each frame comprises, in order, a satellite synchronization word, a mode word, a data field and an end word; the bit length of each part is set.
Step 32: construct the CCSDS-type measurement and control protocol format model, that is, determine it according to the CCSDS standard specification. The model comprises, in order, a start word, frames and an end word; each frame comprises, in order, a PCM frame header, a CCSDS remote control frame header (the TC header) and a data field; the TC header comprises, in order, a version number, a type, a secondary header flag, an application process identifier and sequence flags; the bit length of each part is set.
In step 4, the protocol type of the unknown protocol measurement and control bit stream is identified from all of the frame sequences segmented in step 2, using the PCM-type and CCSDS-type measurement and control protocol format models constructed in step 3. The specific method is as follows:
Step 41: set up 5 arrays, denoted BBH[], LX[], FDT[], WXSB[] and XL[], corresponding in order to the version number, type, secondary header flag, application process identifier and sequence flags fields of the CCSDS-type measurement and control protocol.
Step 42: for every frame sequence of the unknown protocol measurement and control bit stream, store bits 1 to 3 in the BBH[] array, bit 4 in the LX[] array, bit 5 in the FDT[] array, bits 6 to 16 in the WXSB[] array, and bits 17 to 18 in the XL[] array.
Step 43: compare the data stored within each of the arrays BBH[], LX[], FDT[], WXSB[] and XL[]; if the data stored in every array are the same, the protocol type of the unknown protocol measurement and control bit stream is judged to be the CCSDS-type measurement and control protocol; otherwise it is judged to be the PCM-type measurement and control protocol.
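Steps 41 to 43 can be sketched as follows. The sketch assumes one reading of step 43: the stream is judged CCSDS-type when, within each array, the value is identical across all frame sequences. The function names and the return strings are hypothetical; the array names and bit ranges come from the steps above (converted to 0-based slices).

```python
# 0-based, end-exclusive slices for the 1-based bit ranges in step 42.
FIELDS = {
    "BBH": (0, 3),     # version number, bits 1-3
    "LX": (3, 4),      # type, bit 4
    "FDT": (4, 5),     # secondary header flag, bit 5
    "WXSB": (5, 16),   # application process identifier, bits 6-16
    "XL": (16, 18),    # sequence flags, bits 17-18
}


def collect_fields(frames: list[str]) -> dict[str, list[str]]:
    """Slice the first 18 bits of every frame sequence into the five arrays."""
    arrays: dict[str, list[str]] = {name: [] for name in FIELDS}
    for frame in frames:
        if len(frame) < 18:
            raise ValueError("frame sequence shorter than 18 bits")
        for name, (lo, hi) in FIELDS.items():
            arrays[name].append(frame[lo:hi])
    return arrays


def identify_protocol(frames: list[str]) -> str:
    """Return "CCSDS" if every array holds a single repeated value, else "PCM"."""
    if not frames:
        raise ValueError("no frame sequences to classify")
    arrays = collect_fields(frames)
    consistent = all(len(set(values)) == 1 for values in arrays.values())
    return "CCSDS" if consistent else "PCM"


header = "000" + "1" + "0" + "00000000101" + "11"   # 18 header bits
same = [header + "10101111", header + "00001111"]
mixed = [header + "10101111", "001" + header[3:] + "00001111"]
print(identify_protocol(same), identify_protocol(mixed))
```

A single frame sequence trivially passes the consistency check, so the decision only carries weight when several frame sequences are available.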
In step 5, according to the protocol type identified in step 4, the fields of all frame sequences of the unknown protocol measurement and control bit stream are extracted using the corresponding PCM-type or CCSDS-type measurement and control protocol format model. The specific method is as follows:
Step 51: select the measurement and control protocol format model. If step 4 identified the unknown protocol measurement and control bit stream as the CCSDS-type measurement and control protocol, select the CCSDS-type format model; otherwise select the PCM-type format model.
Step 52: using the format model selected in step 51, cut out the contents of all fields of all frame sequences of the unknown protocol measurement and control bit stream according to the bit length of each field specified by the model.
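Step 52 amounts to slicing a bit string by a table of field lengths. The sketch below is one hypothetical way to express a format model: an ordered list of (field name, bit length) pairs, where a length of `None` marks the single variable-length field. The PCM frame lengths (16, 8, multiple of 8, 16) are taken from the standard PCM description elsewhere in this document; the field names and the `extract` helper are assumptions.

```python
from typing import Optional

Model = list[tuple[str, Optional[int]]]

# Hypothetical model tables; None marks the one variable-length field.
PCM_FRAME: Model = [
    ("sync_word", 16), ("mode_word", 8), ("data", None), ("end_word", 16),
]
CCSDS_TC_HEADER: Model = [
    ("version", 3), ("type", 1), ("secondary_header_flag", 1),
    ("apid", 11), ("sequence_flags", 2), ("rest", None),
]


def extract(bits: str, model: Model) -> dict[str, str]:
    """Cut `bits` into named fields according to the bit lengths in `model`."""
    if sum(1 for _, n in model if n is None) > 1:
        raise ValueError("at most one variable-length field is supported")
    fixed = sum(n for _, n in model if n is not None)
    flexible = len(bits) - fixed   # bits left over for the variable field
    if flexible < 0:
        raise ValueError("bit string shorter than the fixed fields")
    fields, pos = {}, 0
    for name, n in model:
        size = flexible if n is None else n
        fields[name] = bits[pos:pos + size]
        pos += size
    return fields


frame = "1" * 16 + "00000001" + "10100101" * 2 + "0" * 16
parsed = extract(frame, PCM_FRAME)
print(parsed["mode_word"], parsed["data"])
```

Keeping the model as data rather than code means the same `extract` routine serves whichever model step 51 selects.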
As shown in fig. 3, the unknown protocol measurement and control bit stream is longer than 18 bits and consists, in order, of noise bit data, a guide sequence and a frame sequence. The guide sequence data consists of guide codes, and the frame sequence data comprises, in order, a frame sequence header, a frame header and instruction data.
An unknown protocol reverse identification system based on machine learning comprises a bit stream input module, a guide sequence analysis module, a construction module, a protocol type identification module, a frame sequence format content extraction module and an unknown protocol analysis result output module.
The bit stream input module reads the unknown protocol measurement and control bit stream in binary form to realize bit stream input.
The guide sequence analysis module segments the guide sequences and frame sequences of the unknown protocol measurement and control bit stream read by the bit stream input module, using a red-black-tree-based character string statistical method and a KMP algorithm from machine learning theory.
The construction module constructs a PCM-type measurement and control protocol format model and a CCSDS-type measurement and control protocol format model.
The protocol type identification module identifies the protocol type of the unknown protocol measurement and control bit stream from all of the frame sequences segmented by the guide sequence analysis module, using the PCM-type and CCSDS-type measurement and control protocol format models constructed by the construction module.
The frame sequence format content extraction module, according to the protocol type identified by the protocol type identification module, uses the corresponding PCM-type or CCSDS-type measurement and control protocol format model to extract the content of each field of all frame sequences of the unknown protocol measurement and control bit stream.
The unknown protocol analysis result output module outputs the protocol type identified by the protocol type identification module and the frame sequence format content extracted by the frame sequence format content extraction module.
The guide sequence analysis module segments the guide sequences and frame sequences of the unknown protocol measurement and control bit stream read by the bit stream input module, using a red-black-tree-based character string statistical method and a fuzzy-matching KMP algorithm from machine learning theory. The specific method is as follows:
First, the correct guide code in the unknown protocol measurement and control bit stream is determined using the red-black-tree-based character string statistical method. Then all positions of the guide code in the unknown protocol measurement and control data are obtained using the fuzzy-matching KMP algorithm. Because a guide sequence consists of consecutive guide codes, the start and end of each run of consecutive guide codes are found from these positions, all guide sequences of the unknown protocol measurement and control bit stream are segmented out, and all frame sequences are then segmented out according to the positions of the guide sequences.
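The statistical part of this module, counting how often each candidate bit pattern occurs, can be sketched as below. Python's standard library has no red-black tree, so a `Counter` (a hash map) stands in for the tree-backed ordered map, and sorting by key imitates the in-order traversal such a map would give. The window width, the function names and the ranking rule are assumptions; the text does not say how the "correct" guide code is chosen among the candidates.

```python
from collections import Counter


def substring_counts(bits: str, width: int) -> Counter:
    """Count every width-bit window of the stream, overlapping windows included."""
    return Counter(bits[i:i + width] for i in range(len(bits) - width + 1))


def ranked_candidates(bits: str, width: int, top: int = 3) -> list[tuple[str, int]]:
    """Most frequent windows first; ties are broken by ascending key order."""
    counts = substring_counts(bits, width)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Noise, then a guide sequence of repeated "10", then a few frame bits.
stream = "0011" + "10" * 8 + "1100"
counts = substring_counts(stream, 4)
best = ranked_candidates(stream, 4, top=2)
print(best)
```

In this example the patterns `1010` and `0101` tie at 7 occurrences each, because a repeating guide sequence looks equally frequent at either bit phase. Frequency alone therefore does not settle which phase is the guide code, and some further criterion is needed.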
The construction module constructs the PCM-type and CCSDS-type measurement and control protocol format models, preferably as follows:
The PCM-type measurement and control protocol format model is determined according to the PCM standard specification. It preferably comprises, in order, a start word, frames and an end word; each frame comprises, in order, a satellite synchronization word, a mode word, a data field and an end word; the bit length of each part is set.
The CCSDS-type measurement and control protocol format model is determined according to the CCSDS standard specification. It preferably comprises, in order, a start word, frames and an end word; each frame comprises, in order, a PCM frame header, a CCSDS remote control frame header (the TC header) and a data field; the TC header comprises, in order, a version number, a type, a secondary header flag, an application process identifier and sequence flags; the bit length of each part is set.
The unknown protocol measurement and control bit stream is longer than 18 bits and consists, in order, of noise bit data, a guide sequence and a frame sequence. The guide sequence data preferably consists of guide codes, and the frame sequence data comprises, in order, a frame sequence header, a frame header and instruction data.
The protocol type identification module identifies the protocol type of the unknown protocol measurement and control bit stream from all of the frame sequences segmented by the guide sequence analysis module, using the PCM-type and CCSDS-type measurement and control protocol format models constructed by the construction module. The specific method is as follows:
The unknown protocol measurement and control bit stream is longer than 18 bits. Preferably, 5 arrays are set up, denoted BBH[], LX[], FDT[], WXSB[] and XL[], whose names correspond in order to the version number, type, secondary header flag, application process identifier and sequence flags fields of the CCSDS-type measurement and control protocol. Then, preferably, for every frame sequence of the unknown protocol measurement and control bit stream, bits 1 to 3 are stored in the BBH[] array, bit 4 in the LX[] array, bit 5 in the FDT[] array, bits 6 to 16 in the WXSB[] array, and bits 17 to 18 in the XL[] array. Finally, the data stored within each of the arrays BBH[], LX[], FDT[], WXSB[] and XL[] are compared; if the data stored in every array are the same, the protocol type of the unknown protocol measurement and control bit stream is judged to be the CCSDS-type measurement and control protocol; otherwise it is judged to be the PCM-type measurement and control protocol.
The frame sequence format content extraction module, according to the protocol type identified by the protocol type identification module, uses the corresponding PCM-type or CCSDS-type measurement and control protocol format model to extract the fields of all frame sequences of the unknown protocol measurement and control bit stream, preferably as follows. A measurement and control protocol format model is selected: if the protocol type identification module identified the bit stream as the CCSDS-type measurement and control protocol, the CCSDS-type format model is selected; otherwise the PCM-type format model is selected. Using the selected model, the contents of all fields of all frame sequences of the unknown protocol measurement and control bit stream are cut out according to the bit length of each field specified by the model.
Compared with the prior art, the invention has the following advantages:
(1) The invention provides an unknown protocol reverse identification method based on machine learning which, with reference to the standard PCM and CCSDS protocols, constructs a PCM-type measurement and control protocol format model and a CCSDS-type measurement and control protocol format model and uses them as prior information.
(2) Using a red-black-tree-based character string statistical algorithm and a KMP fast string matching algorithm from machine learning, a machine-learning-based model for reverse analysis of unknown protocols is constructed; information such as the guide sequence, frame sequence structure, frame structure and instruction content of the unknown protocol bit stream is extracted, achieving reverse analysis of the unknown protocol.
(3) Traditional protocol analysis methods, such as matching and identification against a general protocol feature library built for general-purpose standard protocols, are no longer suitable for private unknown protocols, and their frame sequence segmentation accuracy is low. The machine-learning-based unknown protocol reverse identification method of the invention improves the accuracy of frame sequence data segmentation by constructing the PCM-type and CCSDS-type measurement and control protocol format models and applying character string matching and statistical methods from machine learning.
(4) Comparative tests show that, with prior information of no more than 30% and a system bit error rate of no more than 1%, the method achieves a protocol frame sequence segmentation accuracy of no less than 80% and an instruction extraction accuracy of no less than 75%, which is a good result and fills a technical gap in the field of unknown protocol reverse analysis.
(5) The invention can be applied to identifying the type of, and extracting the frame format from, unknown spacecraft measurement and control protocols in the presence of wireless channel bit errors.
Drawings
FIG. 1 is the PCM-type measurement and control protocol format model
FIG. 2 is the CCSDS-type measurement and control protocol format model
FIG. 3 is the composition of the unknown protocol measurement and control bit stream
FIG. 4 is a diagram of the unknown protocol reverse analysis system
Detailed Description
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
The invention relates to an unknown protocol reverse identification method based on machine learning, which comprises the following steps:
(1) Bit stream input. Read the unknown protocol measurement and control bit stream file to be analyzed in binary form.
(2) Guide sequence analysis and frame sequence analysis. Segment the guide sequences and frame sequences of the unknown protocol measurement and control bit stream read in step (1), using a red-black-tree-based character string statistical method and a fuzzy-matching KMP algorithm from machine learning theory.
(3) Construction of the measurement and control protocol frame sequence format models. Construct a PCM-type measurement and control protocol format model and a CCSDS-type measurement and control protocol format model.
(4) Protocol type identification. From all of the segmented unknown protocol frame sequences, identify the protocol type of the unknown protocol bit stream using the constructed PCM-type and CCSDS-type measurement and control protocol format models.
(5) Frame sequence format content extraction. According to the protocol type identified in step (4), extract the format content of all frame sequences of the unknown protocol measurement and control bit stream using the PCM-type or CCSDS-type measurement and control protocol format model.
(6) Output of the unknown protocol analysis result. Output the identified protocol type and the frame sequence format content of the unknown protocol measurement and control bit stream.
1. Protocol reverse analysis technique
Protocol reverse analysis means mining the relationships among data, and the protocol information implicit in them, from a large captured data stream, so as to restore the protocol format and reconstruct the content of protocol instructions.
In the field of information security, protocol reverse analysis has long been a focus of research. From the perspective of attack, the format and content of a target's communication can be deduced by reconnaissance and reverse analysis of its protocol. From the perspective of defense, reverse analysis of one's own protocol can further improve its security design while preserving normal communication functions, preventing malicious access by external unauthorized equipment.
2. Privacy of protocols
At present, the CCSDS (Consultative Committee for Space Data Systems) protocol and the PCM protocol are the mainstream international space-ground link communication standards. They were established by the National Aeronautics and Space Administration (NASA) in 1982 to solve the interconnection, intercommunication and interoperation of heterogeneous equipment in space-ground communication and internationally networked data, and have gradually become the mainstream international standards for space information interaction models, covering data processing, data classification and coded transmission, communication entity architecture, communication protocol frameworks, communication services and so on. Because the CCSDS protocol and the PCM protocol only provide a specification framework, users' concrete implementations differ considerably, so such protocols are generally called proprietary protocols and fall into the category of unknown protocols. Proprietary protocols are widely used in the communication of civil and military equipment, but unlike general-purpose standard protocols with a unified protocol structure, they have no unified standard protocol document to serve as prior knowledge, so there has been relatively little research on the reverse analysis of proprietary unknown protocols. From the perspective of network security defense, however, research on proprietary protocols is becoming increasingly important for testing security against network attacks and robustness in complex application environments.
3. Standard PCM protocol
The message format of the standard PCM protocol is specified in the national military standard "satellite measurement and control and data management PCM remote control" and is shown in fig. 1.
The remote transmission unit of the standard PCM protocol consists of several frame sequences, which are filled and synchronized by idle sequences. The idle sequence and the guide sequence each consist of 8n "10" patterns and are collectively referred to as the "filling sequence" in the rest of this scheme. Each frame sequence consists of a 16-bit start word, one or more frames, and a 16-bit end word. Each frame consists of a 16-bit satellite synchronization word, an 8-bit mode word, a data field whose length is an integer multiple of 8 bits, an optional CRC check code, and a 16-bit data frame end word.
In the PCM national military standard, the length of each field in the PCM message format is strictly limited. The content of some fields, such as the frame sequence start word and the frame sequence end word, is chosen by the user; the content of other fields, such as the satellite synchronization word and the check code, is strictly constrained.
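The field lengths above are enough to assemble a synthetic PCM-type stream for testing a parser. The sketch below is a hypothetical builder: it reads '8n "10"' as 8n repetitions of the two-bit pattern (an assumption about an ambiguous phrase), omits the optional CRC check code, and uses made-up function names and field values.

```python
def filling_sequence(n: int) -> str:
    """Filling sequence, read here as 8n repetitions of "10" (an assumption)."""
    return "10" * (8 * n)


def pcm_frame(sync: str, mode: str, data: str, end: str) -> str:
    """Assemble one frame: 16-bit sync word, 8-bit mode word, data, 16-bit end word.

    The optional CRC check code is left out of this sketch.
    """
    if len(sync) != 16 or len(mode) != 8 or len(end) != 16:
        raise ValueError("sync/mode/end must be 16/8/16 bits")
    if len(data) % 8:
        raise ValueError("data field must be a multiple of 8 bits")
    return sync + mode + data + end


def pcm_frame_sequence(start: str, frames: list[str], end: str) -> str:
    """Wrap one or more frames between a 16-bit start word and a 16-bit end word."""
    if len(start) != 16 or len(end) != 16:
        raise ValueError("start and end words must be 16 bits")
    if not frames:
        raise ValueError("a frame sequence needs at least one frame")
    return start + "".join(frames) + end


frame = pcm_frame("1" * 16, "00000001", "10100101", "0" * 16)
sequence = pcm_frame_sequence("1100" * 4, [frame, frame], "0011" * 4)
stream = filling_sequence(1) + sequence + filling_sequence(1)
print(len(frame), len(sequence), len(stream))
```

A stream built this way has known field boundaries, which makes it convenient for checking segmentation and extraction code against ground truth.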
4. CCSDS protocol framework
In the national military standard "satellite measurement and control and data management packetized remote control", the format of the standard packetized remote control protocol is shown in fig. 2.
The data stream of the standard packetized remote control protocol consists of several frame sequences, with padding sequences used for filling and synchronization between them. The padding sequence consists of 8n "10" bit pairs; each frame sequence consists of a 16-bit start word, one or more frames, and a 16-bit end word; each frame consists of a frame header, a segment header and a packet header, whose formats are completely fixed and whose lengths are 32 bits, 8 bits and 48 bits respectively, followed by a data field whose length is an integer multiple of 8 bits and an optional error control code.
5. Character string statistical algorithm based on red-black trees
The red-black tree is a self-balancing binary search tree, a data structure from computer science that this scheme uses for character string statistics; it can be regarded as an improved dictionary tree. It was invented by Rudolf Bayer in 1972, who called it the symmetric binary B-tree. In 1978 it was revised by Leo J. Guibas and Robert Sedgewick into the present-day red-black tree.
A red-black tree is a binary search tree in which each node has a color attribute, either red or black. Beyond the general requirements of a binary search tree, any valid red-black tree must satisfy the following additional properties. Property 1: every node is red or black. Property 2: the root node is black. Property 3: every leaf node (NIL node, empty node) is black. Property 4: both children of every red node are black (so no path from a leaf to the root contains two consecutive red nodes). Property 5: all paths from any node to each of its leaves contain the same number of black nodes.
These constraints enforce the key property of the red-black tree: the longest possible path from the root to a leaf is no more than twice as long as the shortest possible path. As a result, the tree is roughly balanced. Because the worst-case time of operations such as inserting, deleting and looking up a value is proportional to the height of the tree, this theoretical upper bound on height keeps red-black trees efficient even in the worst case, unlike ordinary dictionary trees. The red-black tree, like the AVL tree, provides the best possible worst-case guarantees on insertion time, deletion time and lookup time.
6. KMP character string fast matching algorithm
In the process of segmenting the frame sequences, an exact string matching algorithm is used to find all positions at which the padding sequence unit appears in the data stream. The KMP algorithm is an efficient string matching algorithm discovered by D. E. Knuth, J. H. Morris and V. R. Pratt, and is therefore called the Knuth-Morris-Pratt algorithm (KMP algorithm for short). Its improvement over the naive matching algorithm is the introduction of a jump table next[]. Using this table, the algorithm completes the matching search in linear time for any pattern and target sequence, without degradation, which makes it an excellent pattern matching algorithm.
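The jump-table idea can be illustrated with a minimal Python sketch of textbook KMP. The function names `build_next` and `kmp_find_all` are chosen here for illustration and do not come from the patent; bit streams are assumed to be represented as strings of "0" and "1" characters.

```python
def build_next(pattern: str) -> list[int]:
    """Jump table: next_[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    next_ = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = next_[k - 1]  # fall back to a shorter border
        if pattern[i] == pattern[k]:
            k += 1
        next_[i] = k
    return next_


def kmp_find_all(text: str, pattern: str) -> list[int]:
    """Return every start index at which pattern occurs in text (overlaps included)."""
    if not pattern:
        return []
    next_ = build_next(pattern)
    hits, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = next_[k - 1]  # reuse the jump table instead of rescanning text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = next_[k - 1]  # keep going to find overlapping matches
    return hits
```

Each character of the text is examined once, and the fallback steps are bounded by the number of earlier advances, which is where the linear running time comes from.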
1. How to efficiently and accurately identify the protocol type of an unknown measurement and control protocol from bit stream data and extract its frame format content is an important subject in spacecraft measurement and control protocol research. Traditional protocol analysis methods, such as matching against a feature library of general standard protocols, are no longer suitable for private unknown protocols, and their frame sequence segmentation accuracy is low. The unknown protocol reverse identification method based on machine learning improves the accuracy of frame sequence data segmentation by constructing a PCM type measurement and control protocol format model and a CCSDS type measurement and control protocol format model and by adopting the character string matching and statistical methods of machine learning.
2. The specific implementation mode of the invention comprises the following steps:
Step 1: input the bit stream. Read the unknown protocol measurement and control bit stream file to be analyzed in binary form.
Step 2: perform guide sequence analysis and frame sequence analysis. Using the red-black-tree-based character string statistical method and the KMP algorithm from machine learning theory, cut out the guide sequences and frame sequences of the unknown protocol measurement and control bit stream read in step 1.
Step 3: construct the measurement and control protocol frame sequence format models, namely a PCM type measurement and control protocol format model and a CCSDS type measurement and control protocol format model.
Step 4: identify the protocol type. From all the unknown protocol frame sequences that were cut out, identify the protocol type of the unknown protocol bit stream using the constructed PCM type and CCSDS type measurement and control protocol format models.
Step 5: extract the frame sequence format content. According to the protocol type identified in step 4, extract the format content of all frame sequences of the unknown protocol measurement and control bit stream using the PCM or CCSDS measurement and control protocol format model.
Step 6: output the unknown protocol analysis result, that is, the analyzed protocol type and frame sequence format content of the unknown protocol measurement and control bit stream.
The specific method in the step 2 comprises the following steps:
Step 21: determine the correct guide code in the unknown protocol measurement and control bit stream using the red-black-tree-based character string statistical method. (1) Set the total number of probes N according to the total length of the unknown protocol measurement and control data:
Figure BDA0002012540510000121
Here L is the total length of the unknown protocol measurement and control data; a and b are the probe-number parameters, where a is a real number greater than 0 and less than 1, and b is an integer greater than 1 and less than exp(10, 1-a). (2) Generate N different probe positions from random numbers, and initialize a global hash table for recording candidate guide codes and the number of times each occurs. (3) Starting from each probe position, extract a fixed-length guide code, treat it as a candidate guide code, and record the candidate and its occurrence count in the initialized hash table. (4) From all candidate guide codes recorded in the hash table, take the one that occurs most often as the correct guide code.
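Sub-steps (2) to (4) of step 21 can be sketched in Python as follows. This is only an illustration under stated assumptions: the probe count `n_probes` is passed in directly because the formula for N is not reproduced here, the fixed guide code length `code_len` defaults to a hypothetical 8 bits, `collections.Counter` stands in for the global hash table, and the function name `find_guide_code` is invented for this example.

```python
import random
from collections import Counter


def find_guide_code(bits: str, n_probes: int, code_len: int = 8, seed: int = 0) -> str:
    """Sample fixed-length candidates at random probe positions and return
    the candidate that occurs most often."""
    last_start = len(bits) - code_len
    if last_start < 0:
        raise ValueError("bit stream is shorter than the guide code length")
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    n_probes = min(n_probes, last_start + 1)
    # (2) N different probe positions
    positions = rng.sample(range(last_start + 1), n_probes)
    # (3) hash table: candidate guide code -> number of occurrences
    counts = Counter(bits[p:p + code_len] for p in positions)
    # (4) the most frequent candidate is taken as the guide code
    return counts.most_common(1)[0][0]
```

Because probes land at arbitrary bit offsets, the winning candidate for an alternating "10" padding pattern may come out in either phase ("1010..." or "0101..."); a real implementation would need to resolve that ambiguity.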
Step 22: using the KMP algorithm based on fuzzy matching, obtain all positions in the unknown protocol measurement and control data of the guide code determined in step 21. (1) Set a fuzzy-matching edit distance threshold, and set the guide code probe position to the start of the unknown protocol measurement and control data. (2) Calculate the edit distance between the guide code S1 and the unknown protocol measurement and control data S2 using the KMP fast string matching algorithm. The edit distance calculation is based on dynamic programming: using a recurrence equation, a large problem is divided into several small problems that are solved separately, and the answers to the small problems are finally combined into the answer to the large problem. Define an edit distance function edit(i, j), which denotes the edit distance from the length-i prefix of the first string to the length-j prefix of the second string. According to the following dynamic programming formula:
Figure BDA0002012540510000122
For two character strings S1 and S2 of lengths l1 and l2, the edit distance is obtained by solving edit(l1, l2) with the dynamic programming algorithm. (3) When the edit distance between the two strings is smaller than the set threshold, the guide code is considered successfully matched, with any differences attributed to bit errors caused by channel noise, and the position of the guide code in the source data is recorded; otherwise, the match is considered to have failed, and the procedure goes to the next sub-step. (4) Move the guide code matching probe toward the end of the stream and judge whether it has reached the end of the unknown protocol bit stream string. If so, this step ends; otherwise, return to (2).
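A minimal Python sketch of sub-steps (1) to (4) of step 22 follows. It assumes the dynamic programming formula is the standard Levenshtein recurrence (the formula itself is not reproduced here), compares the guide code against a window of the same length at the probe position, and advances the probe by one bit after a miss and by one code length after a hit. The window size and the step sizes are assumptions of this sketch, as are the names `edit_distance` and `fuzzy_positions`.

```python
def edit_distance(s1: str, s2: str) -> int:
    """Levenshtein distance by dynamic programming, keeping one row at a time.
    After processing row i, prev[j] holds edit(i, j)."""
    prev = list(range(len(s2) + 1))  # edit(0, j) = j
    for i, c1 in enumerate(s1, start=1):
        curr = [i]  # edit(i, 0) = i
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            curr.append(min(prev[j] + 1,          # delete c1
                            curr[j - 1] + 1,      # insert c2
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def fuzzy_positions(bits: str, code: str, threshold: int) -> list[int]:
    """Slide a probe over bits and record every start position whose window
    is within `threshold` edits (exclusive) of the guide code."""
    width = len(code)
    hits, pos = [], 0
    while pos + width <= len(bits):
        if edit_distance(code, bits[pos:pos + width]) < threshold:
            hits.append(pos)
            pos += width  # assumed: skip past the matched guide code
        else:
            pos += 1      # assumed: move the probe by one bit
    return hits
```

With a threshold of 2, a guide code carrying a single flipped bit still matches, which is the tolerance to channel noise that the step describes.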
Step 23: because a guide sequence consists of consecutive guide codes, use the guide code positions obtained in step 22 to find the start and end positions of each run of consecutive guide codes, cut out all guide sequences of the unknown protocol measurement and control bit stream accordingly, and then cut out all frame sequences according to the positions of the guide sequences.
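Step 23 can be sketched as grouping adjacent guide code positions into runs and taking the data between runs as frame sequences. In this illustrative Python sketch, bits before the first guide sequence are treated as noise and bits after the last guide sequence are treated as a final frame sequence; both choices, and the name `split_stream`, are assumptions made for the example.

```python
def split_stream(bits: str, positions: list[int], code_len: int):
    """Merge back-to-back guide codes into guide sequences and return
    (guide_runs, frame_sequences); guide_runs are (start, end) pairs, end exclusive."""
    runs: list[list[int]] = []
    for p in sorted(positions):
        if runs and p <= runs[-1][1]:
            # This guide code touches or overlaps the previous run: extend it.
            runs[-1][1] = max(runs[-1][1], p + code_len)
        else:
            runs.append([p, p + code_len])

    frames = []
    for k in range(len(runs) - 1):
        frames.append(bits[runs[k][1]:runs[k + 1][0]])  # data between two guide sequences
    if runs and runs[-1][1] < len(bits):
        frames.append(bits[runs[-1][1]:])               # assumed: trailing data is a frame sequence
    return [tuple(r) for r in runs], frames
```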
The specific method in the step 3 comprises the following steps:
Step 31: construct the PCM type measurement and control protocol format model. According to the PCM standard specification, the PCM measurement and control protocol format model is summarized as shown in fig. 1. It comprises three parts, namely a start word, frames and an end word, where each frame comprises a satellite synchronization word, a mode word, a data field, an end word and so on; the bit length of each part is shown in fig. 1.
Step 32: construct the CCSDS type measurement and control protocol format model. According to the CCSDS standard specification, the CCSDS measurement and control protocol format model is summarized as shown in fig. 2. It comprises three parts, namely a start word, frames and an end word, where each frame comprises a PCM header, a TC header, a data field and so on; the bit length of each part is shown in fig. 2.
The specific method in the step 4 comprises the following steps:
Step 41: set up 5 arrays, denoted BBH[], LX[], FDT[], WXSB[] and XL[].
Step 42: from every frame sequence of the unknown protocol measurement and control bit stream, take bits 1 to 3 and store them in the BBH[] array, bit 4 in the LX[] array, bit 5 in the FDT[] array, bits 6 to 16 in the WXSB[] array, and bits 17 to 18 in the XL[] array.
Step 43: compare the data stored in each of the arrays BBH[], LX[], FDT[], WXSB[] and XL[]. If the data stored within each array are all the same, the unknown protocol measurement and control bit stream is judged to be a CCSDS type measurement and control protocol; otherwise, it is judged to be a PCM type measurement and control protocol.
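Steps 41 to 43 can be sketched in Python as below. The sketch reads "the data stored in each array are the same" as meaning that, within each array, every frame sequence contributed the same value; that reading, the dictionary `FIELDS` and the function name `identify_protocol` are assumptions of this example. Bit positions are 1-based and counted from the start of each frame sequence, as in the text.

```python
# 1-based inclusive bit ranges for the five arrays named in step 41.
FIELDS = {
    "BBH": (1, 3),
    "LX": (4, 4),
    "FDT": (5, 5),
    "WXSB": (6, 16),
    "XL": (17, 18),
}


def identify_protocol(frame_sequences: list[str]) -> str:
    """Return "CCSDS" if every field is constant across all frame sequences,
    otherwise "PCM"."""
    if not frame_sequences:
        raise ValueError("no frame sequences to classify")
    arrays: dict[str, list[str]] = {name: [] for name in FIELDS}
    for seq in frame_sequences:
        for name, (first, last) in FIELDS.items():
            arrays[name].append(seq[first - 1:last])  # step 42: slice out the field
    # Step 43: each array must hold a single distinct value.
    consistent = all(len(set(values)) == 1 for values in arrays.values())
    return "CCSDS" if consistent else "PCM"
```

The test relies on header fields that stay fixed from frame to frame, so it needs more than one frame sequence to be informative: a single sequence is trivially consistent.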
The concrete method of the step 5 comprises the following steps:
Step 51: select the measurement and control protocol format model. If the unknown protocol measurement and control bit stream was identified in step 4 as a CCSDS type measurement and control protocol, select the CCSDS type measurement and control protocol format model; otherwise, select the PCM type measurement and control protocol format model.
Step 52: using the measurement and control protocol format model determined in step 51, cut out the content of every field of all frame sequences of the unknown protocol measurement and control bit stream according to the field lengths specified by the model.
As shown in fig. 4, the unknown protocol reverse recognition system based on machine learning of the present invention includes a bitstream input module, a guide sequence analysis module, a construction module, a protocol type recognition module, a frame sequence format content extraction module, and an unknown protocol analysis result output module;
The bit stream input module reads the unknown protocol measurement and control bit stream in binary form to realize bit stream input.
The guide sequence analysis module cuts out the guide sequences and frame sequences of the unknown protocol measurement and control bit stream read by the bit stream input module, using the red-black-tree-based character string statistical method and the KMP algorithm from machine learning theory.
The construction module constructs a PCM type measurement and control protocol format model and a CCSDS type measurement and control protocol format model.
The protocol type identification module identifies the protocol type of the unknown protocol measurement and control bit stream from all the frame sequences cut out by the guide sequence analysis module, using the PCM type and CCSDS type measurement and control protocol format models constructed by the construction module.
The frame sequence format content extraction module, according to the protocol type identified by the protocol type identification module, extracts the content of each field of all frame sequences of the unknown protocol measurement and control bit stream using the PCM type or CCSDS type measurement and control protocol format model corresponding to the identified protocol type. The unknown protocol analysis result output module outputs the protocol type of the unknown protocol measurement and control bit stream identified by the protocol type identification module and the frame sequence format content extracted by the frame sequence format content extraction module.
The guide sequence analysis module adopts a character string statistical method based on a red-black tree and a KMP algorithm based on fuzzy matching in a machine learning theory to segment a guide sequence and a frame sequence of an unknown protocol measurement and control bit stream read by a bit stream input module, and the specific method comprises the following steps:
First, the correct guide code in the unknown protocol measurement and control bit stream is determined using the red-black-tree-based character string statistical method. Next, all positions of the guide code in the unknown protocol measurement and control data are obtained using the KMP algorithm based on fuzzy matching. Because a guide sequence consists of consecutive guide codes, the start and end positions of each run of consecutive guide codes are derived from these positions, all guide sequences of the unknown protocol measurement and control bit stream are cut out accordingly, and all frame sequences are then cut out according to the positions of the guide sequences.
The construction module constructs a PCM type measurement and control protocol format model and a CCSDS type measurement and control protocol format model, preferably:
The PCM type measurement and control protocol format model is constructed by determining it according to the PCM standard specification. It preferably comprises, in order, a start word, frames and an end word, where each frame comprises, in order, a satellite synchronization word, a mode word, a data field and an end word; the bit length of each part is set.
The CCSDS type measurement and control protocol format model is constructed by determining it according to the CCSDS standard specification. It preferably comprises, in order, a start word, frames and an end word, where each frame comprises, in order, a PCM frame header, a CCSDS remote control frame header (the TC header) and a data field; the TC header comprises, in order, the version number, type, secondary header flag, application process identifier and sequence flags; the bit length of each part is set.
The unknown protocol measurement and control bit stream is longer than 18 bits and consists, in order, of noise bit data, guide sequences and frame sequences, where the guide sequence data preferably consists of guide codes and the frame sequence data comprises, in order, a frame sequence header, a frame header and instruction data.
The protocol type identification module identifies the protocol type of the unknown protocol measurement and control bit stream by utilizing a PCM measurement and control protocol format model and a CCSDS measurement and control protocol format model which are constructed by the construction module according to all frame sequences segmented by the guide sequence analysis module, and the specific method comprises the following steps:
The unknown protocol measurement and control bit stream is longer than 18 bits. Preferably, 5 arrays are set up, denoted BBH[], LX[], FDT[], WXSB[] and XL[]; the 5 array names correspond, in order, to the version number, type, secondary header flag, application process identifier and sequence flags fields of the CCSDS measurement and control protocol. Then, preferably, bits 1 to 3 of every frame sequence of the unknown protocol measurement and control bit stream are stored in the BBH[] array, bit 4 in the LX[] array, bit 5 in the FDT[] array, bits 6 to 16 in the WXSB[] array, and bits 17 to 18 in the XL[] array. Finally, the data stored in each of the arrays BBH[], LX[], FDT[], WXSB[] and XL[] are compared: if the data stored within each array are all the same, the protocol type of the unknown protocol measurement and control bit stream is judged to be a CCSDS type measurement and control protocol; otherwise, it is judged to be a PCM type measurement and control protocol.
The frame sequence format content extraction module extracts all frame sequence fields of the unknown protocol measurement and control bit stream by utilizing a PCM type measurement and control protocol format model or a CCSDS type measurement and control protocol format model corresponding to the identified protocol type according to the protocol type identified by the protocol type identification module, and the preferred method is as follows: selecting a measurement and control protocol format model, and if the protocol type identification module identifies that the unknown protocol measurement and control bit stream is a CCSDS type measurement and control protocol, selecting the CCSDS type measurement and control protocol format model; otherwise, selecting a PCM type measurement and control protocol format model; and intercepting the contents of all fields of all frame sequences of the unknown protocol measurement and control bit stream by utilizing the determined measurement and control protocol format model according to the bit length of each field specified by the measurement and control protocol format model.
The preferred scheme of the guide sequence analysis module is as follows:
The correct guide code in the unknown protocol measurement and control bit stream is determined using the red-black-tree-based character string statistical method. First, the total number of probes N is set according to the total length of the unknown protocol measurement and control data:
Figure BDA0002012540510000161
Here L is the total length of the unknown protocol measurement and control data; a and b are the probe-number parameters, where a is a real number greater than 0 and less than 1, and b is an integer greater than 1 and less than exp(10, 1-a). Then N different probe positions are generated from random numbers, and a global hash table is initialized for recording candidate guide codes and the number of times each occurs. Next, starting from each probe position, a fixed-length guide code is extracted, treated as a candidate guide code, and recorded together with its occurrence count in the initialized hash table. Finally, from all candidate guide codes recorded in the hash table, the one that occurs most often is taken as the correct guide code.
All positions of the guide code in the unknown protocol measurement and control data are obtained using the KMP algorithm based on fuzzy matching. First, a fuzzy-matching edit distance threshold is set, and the guide code probe position is set to the start of the unknown protocol measurement and control data. Then the edit distance between the guide code S1 and the unknown protocol measurement and control data S2 is calculated using the KMP fast string matching algorithm. The edit distance calculation is based on dynamic programming: using a recurrence equation, a large problem is divided into several small problems that are solved separately, and the answers to the small problems are finally combined into the answer to the large problem. Define an edit distance function edit(i, j), which denotes the edit distance from the length-i prefix of the first string to the length-j prefix of the second string. According to the following dynamic programming formula:
Figure BDA0002012540510000171
For two character strings S1 and S2 of lengths l1 and l2, the edit distance of the two strings is obtained by solving edit(l1, l2) with the dynamic programming algorithm. When the edit distance between the two strings is smaller than the set threshold, the guide code is considered successfully matched, with any differences attributed to bit errors caused by channel noise, and the position of the guide code in the source data is recorded; otherwise, the match is considered to have failed, and the procedure goes to the next step. The guide code matching probe is then moved toward the end of the stream, and it is judged whether the probe has reached the end of the unknown protocol bit stream string. If so, this step ends. Otherwise, the KMP fast string matching algorithm is used again to calculate the edit distance between the guide code S1 and the unknown protocol measurement and control data S2.
According to the obtained position of the guide code, because the guide sequence is composed of continuous guide codes, according to the starting position and the ending position of each section of the guide code sequence, the guide sequences of all unknown protocol measurement and control bit streams are cut out, and then all frame sequences are cut out according to the position of the guide sequences.
The invention provides an unknown protocol reverse identification method and system based on machine learning. With reference to the standard PCM and CCSDS protocols, it constructs a PCM type measurement and control protocol format model and a CCSDS type measurement and control protocol format model and uses them as prior information. Using the red-black-tree-based character string statistical algorithm and the KMP fast string matching algorithm from machine learning, it builds a machine-learning-based model for reversing and analyzing unknown protocols, extracts information such as the guide sequences, frame sequence structure, frame structure and instruction content of the unknown protocol bit stream, and thereby realizes reverse analysis of the unknown protocol. By constructing the two format models and adopting the character string matching and statistical methods of machine learning, the method improves the accuracy of frame sequence data segmentation.
Comparative tests show that, with prior information of no more than 30% and a system bit error rate of no more than 1%, the method achieves a protocol frame sequence segmentation accuracy of no less than 80% and an instruction extraction accuracy of no less than 75%, which is a good result and fills a technical gap in the field of unknown protocol reversal and analysis. The invention can be applied to identifying the type of, and extracting the frame format from, unknown spacecraft measurement and control protocols subject to wireless channel bit errors.

Claims (3)

1. An unknown protocol reverse identification method based on machine learning is characterized by comprising the following steps:
step 1, reading unknown protocol measurement and control bit streams in a binary form to realize bit stream input;
step 2, adopting a character string statistical method based on a red-black tree and a KMP algorithm based on fuzzy matching in a machine learning theory to segment the guide sequence and the frame sequence of the unknown protocol measurement and control bit stream read in the step 1, wherein the specific method is as follows:
step 21, determining a correct guide code in the unknown protocol measurement and control bit stream by adopting a character string statistical method based on the red-black tree;
step 22, acquiring all positions of the guide codes determined in the step 21 in the unknown protocol measurement and control data by adopting a KMP algorithm based on fuzzy matching;
step 23, the guide sequence consists of continuous guide codes, and according to all positions of the guide codes obtained in the step 22 in the unknown protocol measurement and control data, the guide sequences of all unknown protocol measurement and control bit streams are cut out according to the starting and ending positions of each run of continuous guide codes, and then all frame sequences are cut out according to the positions of the guide sequences;
step 3, constructing a PCM type measurement and control protocol format model and a CCSDS type measurement and control protocol format model, comprising the following steps:
step 31, constructing a PCM type measurement and control protocol format model, namely determining the PCM type measurement and control protocol format model according to the PCM standard specification;
step 32, constructing a CCSDS type measurement and control protocol format model, namely determining the CCSDS type measurement and control protocol format model according to the CCSDS standard specification;
step 4, according to all frame sequences cut out in the step 2, utilizing the PCM type measurement and control protocol format model and the CCSDS type measurement and control protocol format model constructed in the step 3 to identify the protocol type of the unknown protocol measurement and control bit stream, wherein the specific method comprises the following steps:
step 41, setting a plurality of arrays;
step 42, intercepting all frame sequences of the unknown protocol measurement and control bit stream and respectively storing the frame sequences into an array;
step 43, comparing the data stored in each of the arrays, if the data stored within each array are the same, determining that the protocol type of the unknown protocol measurement and control bit stream is a CCSDS measurement and control protocol, otherwise, determining that the protocol type is a PCM measurement and control protocol;
step 5, according to the protocol type identified in the step 4, extracting the content of each field of all frame sequences of the unknown protocol measurement and control bit stream by utilizing a PCM type measurement and control protocol format model or a CCSDS type measurement and control protocol format model corresponding to the identified protocol type;
and 6, outputting the protocol type of the unknown protocol measurement and control bit stream identified in the step 4 and the content of the frame sequence format extracted in the step 5.
2. The unknown protocol reverse identification method based on machine learning according to claim 1, characterized in that: in step 5, according to the protocol type identified in step 4, all the fields of the frame sequences of the unknown protocol measurement and control bit stream are extracted by using the PCM type measurement and control protocol format model or the CCSDS type measurement and control protocol format model corresponding to the identified protocol type, wherein the specific method comprises the following steps:
step 51, selecting a measurement and control protocol format model;
and step 52, intercepting the contents of all fields of all frame sequences of the unknown protocol measurement and control bit stream by using the measurement and control protocol format model determined in the step 51.
3. The unknown protocol reverse identification method based on machine learning according to claim 1, characterized in that: the unknown protocol measurement and control bit stream is composed of noise bit data, a guide sequence and a frame sequence in sequence, wherein the guide sequence data is composed of a guide code, and the frame sequence data sequentially comprises a frame sequence header, a frame header and instruction data.
CN201910251538.2A 2019-03-29 2019-03-29 Unknown protocol reverse identification method and system based on machine learning Active CN110049023B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910251538.2A CN110049023B (en) 2019-03-29 2019-03-29 Unknown protocol reverse identification method and system based on machine learning

Publications (2)

Publication Number Publication Date
CN110049023A (en) 2019-07-23
CN110049023B (en) 2021-11-16

Family

ID=67275651

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910251538.2A Active CN110049023B (en) 2019-03-29 2019-03-29 Unknown protocol reverse identification method and system based on machine learning

Country Status (1)

Country Link
CN (1) CN110049023B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112134896B (en) * 2020-09-27 2022-09-16 中国科学院国家天文台 Data processing method and device for neutral atom detector and storage medium
CN115334179B (en) * 2022-07-19 2023-09-01 四川大学 Unknown protocol reverse analysis method based on named entity recognition

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103297427A (en) * 2013-05-21 2013-09-11 中国科学院信息工程研究所 Unknown network protocol identification method and system
CN103873317A (en) * 2012-12-18 2014-06-18 中国科学院空间科学与应用研究中心 Method and system for detecting CCSDS (consultative committee for space data system) space link protocol
CN104270392A (en) * 2014-10-24 2015-01-07 中国科学院信息工程研究所 Method and system for network protocol recognition based on tri-classifier cooperative training learning
CN104506484A (en) * 2014-11-11 2015-04-08 中国电子科技集团公司第三十研究所 Proprietary protocol analysis and identification method
WO2015085102A1 (en) * 2013-12-05 2015-06-11 Huawei Technologies Co., Ltd. System and method for non-invasive application recognition
CN106878307A (en) * 2017-02-21 2017-06-20 电子科技大学 A kind of unknown communication protocol recognition method based on bit error rate model
CN108632252A (en) * 2018-04-03 2018-10-09 中国人民解放军战略支援部队信息工程大学 A kind of private network agreement iteration conversed analysis method, apparatus and server

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IL206240A0 (en) * 2010-06-08 2011-02-28 Verint Systems Ltd Systems and methods for extracting media from network traffic having unknown protocols

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Research on identification methods for the CCSDS-TC protocol;郑杰;《中国优秀硕士学位论文全文数据库信息科技辑(月刊)》;20160215;pp. I138-1522 *
Towards completion of the CCSDS space data link security protocol;I. A. Sanchez et al.;《2012 IEEE Aerospace Conference》;20120310;pp. 1-18 *
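Unknown network protocol identification method based on semi-supervised clustering ensemble;林荣强 et al.;《小型微型计算机系统》;20161025;pp. 1234-1239 *
Research on radio resource management technology for broadband satellite communication systems;覃落雨 et al.;《空间电子技术》;20170131;pp. 25-30 *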

Also Published As

Publication number Publication date
CN110049023A (en) 2019-07-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant